SHA1 with BASE64 Hadoop Hive UDF

This is a simple UDF for applying SHA1 + BASE64 on a string in Hive. Works like a charm in Hadoop Hive (tested with CDH 4.2.1)

package io.jackass.hadoop.hive.udf.crypto;

import org.apache.hadoop.hive.ql.exec.UDF;

import org.apache.commons.codec.binary.Base64;

public final class sha1 extends UDF {

	public Text evaluate(final Text s) {
	    if (s == null) {
                return null;
	    try {
	    	MessageDigest md = MessageDigest.getInstance("SHA1");
	    	byte[] hash = md.digest();
              Base64 encoder = new Base64();

		return new Text(encoder.encodeToString(hash));
	    } catch (NoSuchAlgorithmException nsae) {
	    	throw new IllegalArgumentException("SHA1 is not setup");

It’s really simple to use it in Hive

    ADD JAR hive-crypto-udfs-1.0.jar;
    CREATE TEMPORARY FUNCTION sha1 as 'io.jackass.hadoop.hive.udf.crypto.sha1';
    select sha1('1111') from your_table;

if you need some help building JAR file, here is old school javac (tested with CDH 4.2.1)

  • place code above to subdirectory io/jackass/hadoop/hive/udf/crypto/
  • run code below
  •     CP=$(find "/opt/cloudera/parcels/CDH/lib" -name '*.jar' -printf '%p:' | sed 's/:$//')
        javac -classpath $CP io/jackass/hadoop/hive/udf/crypto/
        jar -cf hive-crypto-udfs-1.0.jar  -C . .

    Leave a Reply

    Fill in your details below or click an icon to log in: Logo

    You are commenting using your account. Log Out /  Change )

    Google+ photo

    You are commenting using your Google+ account. Log Out /  Change )

    Twitter picture

    You are commenting using your Twitter account. Log Out /  Change )

    Facebook photo

    You are commenting using your Facebook account. Log Out /  Change )


    Connecting to %s

    %d bloggers like this: