How to hash data with Java MessageDigest

Let’s look at the java.security.MessageDigest class, which provides one-way hash functions, and at how to obtain MessageDigest instances without scattering algorithm-name strings throughout our code.

Existing solutions

If you need hash functions such as the popular MD5 or one of the Secure Hash Algorithms (SHA) in your project, I suggest having a look at an existing library first, for example Apache Commons Codec or Google Guava.

There is a class called DigestUtils in Apache Commons Codec that helps you to work with hash functions. It even has helper methods that return the digest for the data as a hex string.

Google Guava on the other hand has a class called Hashing with various helper methods to get the hash function you want. With the HashFunction you create a new Hasher, feed data into it and finally call the hash method. Have a look at the JavaDoc or the Wiki for very detailed information.

Rolling your own

If you want to work with the standard Java API without any other libraries you can use the java.security.MessageDigest class. When creating an instance of the MessageDigest class it’s important to know that the algorithm names are defined in the Java Cryptography Architecture Standard Algorithm Name Documentation.

According to the MessageDigest JavaDoc, every implementation of the Java platform is required to support at least MD5, SHA-1 and SHA-256. To find out which algorithms your current installation supports, you can use the following code:

import java.security.Security;
import java.util.Set;

public void showAvailable() {
  final Set<String> algorithms = Security.getAlgorithms("MessageDigest");
  System.out.println(algorithms);
}

This should print something like “[SHA-384, SHA-224, SHA-256, MD2, SHA, SHA-512, MD5]”; the exact set depends on the Java version and the installed security providers. Instead of passing these strings to MessageDigest.getInstance directly, we can define a simple enum that holds them for us:

public enum MessageDigestName {
  MD2,
  MD5,
  SHA_1, // maps to the standard name "SHA-1"
  SHA_224,
  SHA_256,
  SHA_384,
  SHA_512;

  public String getMessageDigestName() {
    return this.name().replace('_', '-');
  }
}

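The helper method getMessageDigestName() turns a constant name into the algorithm name used in the Standard Algorithm Name Documentation by replacing underscores with hyphens. Names that contain other characters, such as SHA-512/256, would need an explicit mapping.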
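Going one step further, the enum could also create the instance itself, so that callers never have to handle the checked NoSuchAlgorithmException. The following is only a minimal sketch and not part of the enum shown above: DigestAlgorithm, standardName() and newDigest() are hypothetical names, and the constants are limited to the three algorithms that every Java platform must support.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical variant of the enum; the type and method names are illustrative only.
enum DigestAlgorithm {
  MD5,
  SHA_1,
  SHA_256;

  // Maps the constant name to the standard algorithm name, e.g. SHA_256 -> "SHA-256".
  String standardName() {
    return name().replace('_', '-');
  }

  // Returns a fresh MessageDigest on every call, because instances are stateful.
  MessageDigest newDigest() {
    try {
      return MessageDigest.getInstance(standardName());
    } catch (NoSuchAlgorithmException ex) {
      // Not expected for the three algorithms listed here.
      throw new IllegalStateException("Missing digest algorithm: " + standardName(), ex);
    }
  }
}
```

A caller would then write DigestAlgorithm.SHA_256.newDigest() and get a compile error instead of a runtime exception when mistyping an algorithm.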

In practice you can add a utility method that, for example, returns the hash of a given string as a hex string:

try {
  final MessageDigest md = MessageDigest.getInstance(MessageDigestName.SHA_256.getMessageDigestName());

  md.update("Test".getBytes(StandardCharsets.UTF_8));
  byte[] digest = md.digest();

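  // Pad to two hex characters per byte; BigInteger.toString(16) would drop leading zeros
  final BigInteger number = new BigInteger(1, digest);
  final String hexHash = String.format("%0" + (digest.length * 2) + "x", number);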

  // Result: 532eaabd9574880dbf76b9b8cc00832c20a6ec113d682299550d7a6e0f345e25
  return hexHash;
} catch (NoSuchAlgorithmException ex) {
  throw new RuntimeException(ex);
}

The code obtains the MessageDigest instance, feeds data into it with update, computes the hash by calling digest and finally converts the resulting bytes to a hex string.
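Because update can be called repeatedly before digest, the same steps also work for data that is too large to hold in memory at once. Here is a minimal sketch under a few assumptions: StreamHasher and sha256Hex are hypothetical names, the 8 KB buffer size is an arbitrary choice, and the hex string is built byte by byte so that leading zeros are preserved.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, shown for illustration only.
final class StreamHasher {

  private StreamHasher() {
  }

  // Reads the stream in chunks and returns its SHA-256 hash as lowercase hex.
  static String sha256Hex(InputStream in) throws IOException {
    final MessageDigest md;
    try {
      md = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException ex) {
      // SHA-256 must be supported by every Java platform.
      throw new IllegalStateException(ex);
    }

    final byte[] buffer = new byte[8192]; // buffer size is an arbitrary assumption
    int read;
    while ((read = in.read(buffer)) != -1) {
      // Only the bytes actually read in this round are added to the digest.
      md.update(buffer, 0, read);
    }

    final byte[] digest = md.digest();
    final StringBuilder hex = new StringBuilder(digest.length * 2);
    for (byte b : digest) {
      // Two hex characters per byte: high nibble first, then low nibble.
      hex.append(Character.forDigit((b >> 4) & 0xF, 16));
      hex.append(Character.forDigit(b & 0xF, 16));
    }
    return hex.toString();
  }
}
```

To hash a file, you could pass the stream returned by Files.newInputStream(path) inside a try-with-resources block.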

As you can see, it is not hard to set this up yourself without any other libraries, and it is a good exercise for understanding what is going on. For production, however, I would recommend an established, well-reviewed library such as Apache Commons Codec or Google Guava.