Java: How to get UTF-8 charset constant

Let’s look at how to refer to character sets like “UTF-8” in your code without scattering the name as a string literal everywhere.

History

Back in 2012 I still used a little helper enum called Charsets that had values for the standard character sets I might need. It looked like this:

public enum Charsets {

	ISO_8859_1("ISO-8859-1"),
	US_ASCII("US-ASCII"),
	UTF16("UTF-16"),
	UTF16BE("UTF-16BE"),
	UTF16LE("UTF-16LE"),
	UTF8("UTF-8");

	private final String charset;

	private Charsets(final String charset) {
		this.charset = charset;
	}

	public String getCharset() {
		return this.charset;
	}
}

You can find the same idea in Apache Commons Lang3, whose class CharEncoding holds the character encoding names as static Strings. Google Guava has a class Charsets with similar static fields, but instead of Strings they are instances of the class Charset from java.nio.charset.

JDK 7 and newer

Java 7 introduced the class StandardCharsets in java.nio.charset, which is basically the Charsets class from Google Guava. It provides constants for the six character sets that every Java implementation must support: US_ASCII, ISO_8859_1, UTF_8, UTF_16BE, UTF_16LE and UTF_16. There is no need for a helper class, an enum or a third-party library since this is part of the standard API now. In fact the classes from Apache Commons Lang3 and Google Guava are marked as deprecated or at least carry a hint that you should use StandardCharsets instead.
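To make the difference concrete, here is a minimal sketch that decodes the same bytes once with a charset name and once with the constant. The class and method names are made up for this post and are not part of any library. The String constructor that takes a charset name declares the checked UnsupportedEncodingException, while the one that takes a Charset does not.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, only for illustration.
class CharsetConstantDemo {

  // Old style: the charset name is a magic string, and the compiler forces
  // us to handle a checked exception that cannot occur for UTF-8.
  static String decodeWithName(final byte[] bytes) {
    try {
      return new String(bytes, "UTF-8");
    } catch (final UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is supported by every JVM", e);
    }
  }

  // Java 7 and newer: the constant already is a Charset instance,
  // so there is no lookup by name and no checked exception.
  static String decodeWithConstant(final byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(final String[] args) {
    final byte[] bytes = { 74, 97, 118, 97 }; // "Java"
    System.out.println(decodeWithName(bytes));     // prints Java
    System.out.println(decodeWithConstant(bytes)); // prints Java
    // The constant equals the charset you would get from a lookup by name.
    System.out.println(StandardCharsets.UTF_8.equals(Charset.forName("UTF-8"))); // prints true
  }
}
```

Both methods return the same result. The second one is simply shorter, and a typo in the charset name can no longer slip through to runtime.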

Examples

Let’s have a look at two examples. Suppose that you have read some bytes from a file or a web service and you want to construct a String. Instead of passing the character set name as a string like “UTF-8”, you can pass the Charset provided by the StandardCharsets class.

@Test
public void bytes() {
  // Byte values of the ASCII characters in "Christian"; ASCII is a subset of UTF-8.
  final byte[] bytes = { 67, 104, 114, 105, 115, 116, 105, 97, 110 };
  final String test = new String(bytes, StandardCharsets.UTF_8);
  assertEquals("Christian", test);
}
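The same constants work in the other direction, when you turn a String into bytes. The following is only a small sketch with a made-up class name. It encodes three umlauts with two different charsets to show that the chosen constant really matters. The umlauts are written as Unicode escapes so that the example does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, only for illustration.
class EncodeDemo {

  // Encodes the text as UTF-8; characters outside ASCII take more than one byte.
  static byte[] toUtf8(final String text) {
    return text.getBytes(StandardCharsets.UTF_8);
  }

  // Encodes the text as ISO-8859-1, which uses exactly one byte per character.
  static byte[] toLatin1(final String text) {
    return text.getBytes(StandardCharsets.ISO_8859_1);
  }

  public static void main(final String[] args) {
    final String umlauts = "\u00f6\u00e4\u00fc"; // the three characters ö, ä and ü
    System.out.println(toUtf8(umlauts).length);   // prints 6
    System.out.println(toLatin1(umlauts).length); // prints 3
    // Decoding with the same charset that was used for encoding restores the text.
    final String restored = new String(toUtf8(umlauts), StandardCharsets.UTF_8);
    System.out.println(umlauts.equals(restored)); // prints true
  }
}
```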

Other parts of the standard API accept a Charset as well. In this example we read all the lines of a file and we can use the StandardCharsets class here too. There should be no need to handle any magic strings to specify character set names in your code.

private static final String TEST_FILE = "src/test/resources/test-file.txt";

@Test
public void readFile() throws IOException {
  final List<String> testFileLines = Files.readAllLines(Paths.get(TEST_FILE), StandardCharsets.UTF_8);
  assertEquals(3, testFileLines.size());
  assertEquals("Third line öäü.", testFileLines.get(2));
}
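If you want to try this without a prepared test file, here is a self-contained sketch that writes a few lines to a temporary file and reads them back, using StandardCharsets.UTF_8 on both sides. The class name, the method name and the temp file prefix are made up for this post. I/O errors are wrapped in UncheckedIOException, which needs Java 8; on Java 7 you would declare IOException instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, only for illustration.
class FileCharsetDemo {

  // Writes the lines to a temporary file as UTF-8 and reads them back as UTF-8.
  static List<String> writeAndRead(final List<String> lines) {
    try {
      final Path file = Files.createTempFile("charset-demo", ".txt");
      try {
        Files.write(file, lines, StandardCharsets.UTF_8);
        final List<String> result = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
          String line;
          while ((line = reader.readLine()) != null) {
            result.add(line);
          }
        }
        return result;
      } finally {
        // Clean up the temporary file in any case.
        Files.deleteIfExists(file);
      }
    } catch (final IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(final String[] args) {
    final List<String> lines = Arrays.asList("First line", "Second line", "Third line \u00f6\u00e4\u00fc.");
    System.out.println(lines.equals(writeAndRead(lines))); // prints true
  }
}
```

Files.write and Files.newBufferedReader take the Charset directly, just like Files.readAllLines in the test above.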