Query your EC2 instances using AWS CLI

While the AWS CLI is well documented, I was a bit surprised that I could not find many samples of querying and filtering by tag. Here is a small snippet I created last weekend. It:

  • selects from one region
  • queries several fields, including two custom tags (in my case “Name” and “instance_role”)
  • applies a wildcard filter on a custom tag (in my case, Name like *postgres*)
  • outputs an easy-to-read sorted table
aws ec2 describe-instances \
    --output   table \
    --region   us-east-1 \
    --query   'Reservations[].Instances[].[ Tags[?Key==`Name`].Value | [0], Tags[?Key==`instance_role`].Value | [0], PublicIpAddress, PrivateIpAddress, State.Name, Placement.AvailabilityZone, InstanceId, InstanceType, LaunchTime ]' \
    --filters 'Name=tag:Name,Values=*postgres*' \
   | sort -n -k 2

It cannot get easier than that.
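
If the --query expression looks cryptic, here is a rough Ruby sketch of what it picks out. It runs against hand-made sample data shaped like the describe-instances output, not against AWS. The sample values, `tag_value` and `instance_rows` are made up for illustration, and `File.fnmatch` only approximates the server-side wildcard filter.

```ruby
# Hand-made sample shaped like describe-instances output (all values invented).
SAMPLE = {
  'Reservations' => [
    { 'Instances' => [
      { 'InstanceId' => 'i-0001', 'InstanceType' => 't2.micro',
        'State' => { 'Name' => 'running' },
        'Tags' => [{ 'Key' => 'Name', 'Value' => 'prod-postgres-1' },
                   { 'Key' => 'instance_role', 'Value' => 'db' }] },
      { 'InstanceId' => 'i-0002', 'InstanceType' => 't2.small',
        'State' => { 'Name' => 'stopped' },
        'Tags' => [{ 'Key' => 'Name', 'Value' => 'web-1' }] }
    ] }
  ]
}.freeze

# Rough equivalent of Tags[?Key==`<key>`].Value | [0]:
# the value of the first tag with that key, or nil when there is none.
def tag_value(instance, key)
  tag = (instance['Tags'] || []).find { |t| t['Key'] == key }
  tag && tag['Value']
end

# Rough equivalent of Reservations[].Instances[].[ ... ] combined with the
# tag:Name wildcard filter, sorted on the second column.
def instance_rows(data, name_pattern)
  data['Reservations']
    .flat_map { |r| r['Instances'] }
    .select { |i| File.fnmatch(name_pattern, tag_value(i, 'Name').to_s) }
    .map do |i|
      [tag_value(i, 'Name'), tag_value(i, 'instance_role'),
       i.dig('State', 'Name'), i['InstanceId'], i['InstanceType']]
    end
    .sort_by { |row| row[1].to_s }
end

instance_rows(SAMPLE, '*postgres*').each { |row| puts row.join("\t") }
```

An instance without the tag simply gets an empty cell, which is the same effect the `| [0]` has in the CLI query.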


SHA1 with BASE64 Hadoop Hive UDF

This is a simple UDF for applying SHA1 + BASE64 on a string in Hive. Works like a charm in Hadoop Hive (tested with CDH 4.2.1)

package io.jackass.hadoop.hive.udf.crypto;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

import java.security.*;
import org.apache.commons.codec.binary.Base64;

public final class sha1 extends UDF {

    public Text evaluate(final Text s) {
        if (s == null) {
            return null;
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA1");
            // feed the input bytes; Text's backing array is only valid up to getLength()
            md.update(s.getBytes(), 0, s.getLength());
            byte[] hash = md.digest();
            Base64 encoder = new Base64();

            return new Text(encoder.encodeToString(hash));
        } catch (NoSuchAlgorithmException nsae) {
            throw new IllegalArgumentException("SHA1 is not setup");
        }
    }
}

It’s really simple to use it in Hive

    ADD JAR hive-crypto-udfs-1.0.jar;
    CREATE TEMPORARY FUNCTION sha1 as 'io.jackass.hadoop.hive.udf.crypto.sha1';
    select sha1('1111') from your_table;
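
To cross-check what the UDF returns, the same SHA1 + BASE64 value can be computed outside Hive. Below is a minimal Ruby sketch using only the standard library. `sha1_base64` is a name I made up. The result should match the Hive output as long as the UDF hashes the UTF-8 bytes of the string and the Base64 encoder does not append a line separator, which can depend on the Commons Codec version.

```ruby
require 'digest/sha1'

# Hypothetical helper: Base64 of the raw 20-byte SHA-1 digest of a UTF-8 string.
def sha1_base64(str)
  Digest::SHA1.base64digest(str.encode('UTF-8'))
end

# Same input as the Hive example above.
puts sha1_base64('1111')
```

The output is always 28 characters long, because 20 digest bytes encode to 27 Base64 characters plus one "=" of padding.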

If you need some help building the JAR file, here is the old-school javac way (tested with CDH 4.2.1):

  • place the code above in io/jackass/hadoop/hive/udf/crypto/sha1.java
  • run the commands below
        CP=$(find "/opt/cloudera/parcels/CDH/lib" -name '*.jar' -printf '%p:' | sed 's/:$//')
        javac -classpath "$CP" io/jackass/hadoop/hive/udf/crypto/sha1.java
        jar -cf hive-crypto-udfs-1.0.jar  -C . .

    Dashing with Oracle database

    I have been using Dashing (http://dashing.io/) with great success. It is an amazingly simple framework, even for non-Ruby developers. Here is a simple recipe for creating jobs that pull data from an Oracle database.


    One Time Setup

    I am running CentOS 6.5 64-bit, but in general this should work on any platform with a few twists.


    install Oracle Instant Client and ruby-oci8

    • get the 3 zips (or rpms or …) from the Oracle web site http://www.oracle.com/technetwork/topics/linuxx86-64soft-092277.html
    • run the install; in my case there are 3 ZIPs
      cd ~
      mkdir oracle
      cd oracle
      # download the files below manually into this directory (Oracle requires a login)
      unzip instantclient-basic-linux.x64-
      unzip instantclient-sdk-linux.x64-
      unzip instantclient-sqlplus-linux.x64-
      export ORACLE_HOME=~/oracle
      export LD_LIBRARY_PATH=$ORACLE_HOME/instantclient_12_1
      cd ~/oracle/instantclient_12_1
      ln -s libclntsh.so.12.1 libclntsh.so
      gem install ruby-oci8
    • in case you need more help, check this http://rubydoc.info/gems/ruby-oci8/frames/file/README.md


    Let’s Play


    install dashing

    The steps are on http://dashing.io/; you don’t really expect me to retype them 😉


    edit the Gemfile in your dashing project, add the line below, and run bundle install

    gem 'ruby-oci8'


    setup username/password/oracle server

    For simplicity I will set the user ID, password and connect string as shell variables

    export ORACLE_USER=myid
    export ORACLE_PASSWORD=mypassword
    export ORACLE_TNS=myoracleserver/myService
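
Since a job that runs every few seconds will happily keep failing with a cryptic login error when one of these variables is not exported, it may be worth checking them up front. Here is a minimal sketch. `oracle_settings` and `ORACLE_ENV_KEYS` are names I made up, and the only assumption is the three variable names from the exports above.

```ruby
# Hypothetical helper: read the Oracle settings from the environment and
# fail early with a readable message when any of them is missing or empty.
ORACLE_ENV_KEYS = %w[ORACLE_USER ORACLE_PASSWORD ORACLE_TNS].freeze

def oracle_settings(env = ENV)
  missing = ORACLE_ENV_KEYS.reject { |key| env[key] && !env[key].empty? }
  unless missing.empty?
    raise ArgumentError, "missing environment variables: #{missing.join(', ')}"
  end
  ORACLE_ENV_KEYS.map { |key| env[key] }
end
```

In a job this could be used as `user, password, tns = oracle_settings`, with the three values then passed to `OCI8.new(user, password, tns)`.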



    Sample Job sending data to Rickshawgraph

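    require 'oci8'
    points_field1 = []
    points_field2 = []
    last_x = 1
    elements_total = 100
    SCHEDULER.every '10s', :first_in => 0 do |job|
      begin
        # connect using the shell variables exported above
        conn = OCI8.new(ENV['ORACLE_USER'], ENV['ORACLE_PASSWORD'], ENV['ORACLE_TNS'])
        conn.exec("
              select sum(field1)  field1,
                     sum(field2)  field2
                from mytable
              ") do |r|
          last_x += 1
          points_field1 << { x: last_x, y: r[0].to_i }
          points_field2 << { x: last_x, y: r[1].to_i }
        end
        # send only the newest elements_total points of each series
        series = [
          { name: "FIELD1",  data: points_field1.last(elements_total) },
          { name: "FIELD2",  data: points_field2.last(elements_total) }
        ]
        send_event('oracle-graph', series: series)
      rescue StandardError => e
        puts e.message
      ensure
        # close the connection so one is not leaked every 10 seconds
        conn.logoff if conn
      end
    end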
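
The bookkeeping in the job above (append one point per poll, send only the newest points) can be tried without Oracle or Dashing. Here is a small sketch with made-up numbers. `append_points` is a hypothetical helper, not part of Dashing.

```ruby
# Hypothetical helper mirroring the job's bookkeeping: append one point per
# series, then build a payload limited to the newest `window` points.
def append_points(history, x, values, window)
  values.each do |name, y|
    (history[name] ||= []) << { x: x, y: y.to_i }
  end
  history.map { |name, points| { name: name, data: points.last(window) } }
end

history = {}
series = nil
# Simulate five polls with invented sums and a window of three points.
(1..5).each do |x|
  series = append_points(history, x, { 'FIELD1' => x * 10, 'FIELD2' => x * 2 }, 3)
end
p series
```

Note that, just like in the job, the full history keeps growing and only the payload is limited. For a long-running dashboard it may be worth dropping old points as well.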



    Sample Job sending data to List


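    require 'oci8'
    SCHEDULER.every '10s', :first_in => 0 do |job|
      begin
        mylist = Hash.new
        # connect using the shell variables exported above
        conn = OCI8.new(ENV['ORACLE_USER'], ENV['ORACLE_PASSWORD'], ENV['ORACLE_TNS'])
        conn.exec("
                  select field1, count(*) from mytable group by field1
                ") do |r|
          mylist[r[0]] = { label: r[0], value: r[1].to_i.to_s }
        end
        send_event('oracle-list', { items: mylist.values })
      rescue StandardError => e
        puts e.message
      ensure
        conn.logoff if conn
      end
    end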
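
To see the shape of the items that end up in the List widget, here is a tiny sketch that builds them from made-up rows instead of a real query. `list_items` is a hypothetical helper.

```ruby
# Hypothetical helper: turn [label, count] rows into List widget items.
# A later row with the same label overwrites the earlier one.
def list_items(rows)
  rows.each_with_object({}) do |(label, count), items|
    items[label] = { label: label, value: count.to_i.to_s }
  end.values
end

# Made-up rows standing in for the result of the group-by query.
p list_items([['A', 3], ['B', 12.0], ['A', 4]])
```

The `to_i.to_s` step is there because the driver may hand back counts as non-integer numeric types, and the widget only needs a plain string.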