3 Ways to Build a Java String

As programmers, we generate a lot of text. I’m not just talking about the sheer quantity of code written, but the text generated by our programs. Whether it’s writing messages to log files, generating error messages for users, or building messages sent between systems, our applications work with text a lot. But have you really thought about the various ways in which your code generates these strings of text?

Java, as with most languages, has several ways in which to construct a bit of text into a String object. I’ve used and continue to use all of them, depending on what I’m doing. Let’s look at a few of the more common ways to build a Java String and when you’d choose to use one over another. Hopefully you learn something new or find it a good refresher for things you may already know.

String Concatenation

The most straightforward way is to just “add” the strings together, as follows:

String result = "Processing record " + recordId + " for user " + userId;

This is straightforward and is reasonably efficient. It’s a syntax that’s used in many languages and it’s pretty clear what’s going on.

In the early days of Java, this was an inefficient way to construct a string since each “+” operator created a new String object. Today, however, the compiler optimizes a single-expression concatenation like this one: traditionally by translating it into StringBuilder calls and, since Java 9, through an invokedynamic-based strategy. Which brings us to the next way to create a string…
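If you want to try this yourself, here’s a minimal sketch that wraps the line above in a method. The class name ConcatDemo, the method name describe, and the int parameter types are my assumptions for illustration; the original snippet doesn’t say what types recordId and userId have.

```java
class ConcatDemo {
    // Hypothetical helper: the int parameter types are an assumption.
    static String describe(int recordId, int userId) {
        // The + operator converts the int operands to text automatically.
        return "Processing record " + recordId + " for user " + userId;
    }

    public static void main(String[] args) {
        System.out.println(describe(42, 7));
    }
}
```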

StringBuilder

While the previous example works well in simple cases, if you start needing to include conditional or looping logic, the concern about intermediate String construction comes into play.

For example:

String result = "Processed records: ";
for (int recordId : recordIds) {
    result += recordId + " ";
}
result += "in " + elapsedMillis + "ms";

Here, the compiler can only optimize each statement on its own, so every pass through the loop still creates a new intermediate String. The alternative is to use a StringBuilder.

StringBuilder buf = new StringBuilder("Processed records: ");
for (int recordId : recordIds) {
    buf.append(recordId).append(" ");
}
buf.append("in ").append(elapsedMillis).append("ms");
String result = buf.toString();

In this example, no intermediate String objects are created. The StringBuilder maintains an internal array of characters and only creates the final String object when toString() is called.
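Because the builder is backed by a growable character array, you can also give it a size hint up front so it has less resizing to do. Treat this as a sketch under assumptions: the class and method names are made up, the int[] parameter type is a guess, and the size estimate is a rough figure rather than a tuned value.

```java
class BuilderBufferDemo {
    // Hypothetical helper: builds the summary line in a pre-sized buffer.
    static String summarize(int[] recordIds, long elapsedMillis) {
        // Rough capacity guess (an assumption): fixed text plus ~8 chars per record.
        StringBuilder buf = new StringBuilder(32 + recordIds.length * 8);
        buf.append("Processed records: ");
        for (int recordId : recordIds) {
            // append(int) writes the digits into the buffer directly.
            buf.append(recordId).append(' ');
        }
        buf.append("in ").append(elapsedMillis).append("ms");
        // The result String is created here, from the buffer's contents.
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(summarize(new int[] {1, 2, 3}, 15L));
    }
}
```

If the guess is too small, the builder simply grows as needed, so the hint affects only performance, never the result.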

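With Java 8, this can be reduced even further, provided recordIds holds strings (String.join only accepts CharSequence elements):

// recordIds is assumed to be a collection or array of String here
String result = new StringBuilder("Processed records: ")
    .append(String.join(" ", recordIds))
    .append(" in ").append(elapsedMillis).append("ms")
    .toString();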

This still has an intermediate String creation by the String.join method, but it’s certainly much better than the original version. If you want to eliminate that String creation, or you’re not using Java 8, the Guava Joiner class can help.

StringBuilder buf = new StringBuilder("Processed records: ");
Joiner.on(" ").appendTo(buf, recordIds);
buf.append(" in ").append(elapsedMillis).append("ms");
String result = buf.toString();
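If recordIds holds numbers rather than strings and you’d rather stay within the JDK, one option (my suggestion, not the only way) is to let a Java 8 stream do the conversion and the joining. The class name, the method name, and the List<Integer> type here are assumptions for the sake of a runnable example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoinDemo {
    // Hypothetical helper: List<Integer> is an assumed type for recordIds.
    static String summarize(List<Integer> recordIds, long elapsedMillis) {
        // Convert each id to text, then join with single spaces.
        String ids = recordIds.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(" "));
        return "Processed records: " + ids + " in " + elapsedMillis + "ms";
    }

    public static void main(String[] args) {
        System.out.println(summarize(Arrays.asList(1, 2, 3), 15L));
    }
}
```

Like String.join, this still creates an intermediate String for the joined ids, so it trades a little allocation for brevity.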

StringBuffer

There is an older class, StringBuffer, which does the same thing as StringBuilder. The only difference is that StringBuffer’s methods are all synchronized. This means that you could, in theory, have multiple threads all safely updating the same StringBuffer. In practice, nobody ever does this. OK, somebody probably does this, but I’ve certainly never found the need to do so.

What’s the big deal about synchronization? Every time Java enters a synchronized block or method, it has to acquire and release the object’s monitor, and that bookkeeping typically adds overhead even when only one thread ever touches the object.

All this to say, unless you’re stuck on a version of Java older than 5 (which introduced StringBuilder) or you genuinely need to share the buffer between threads, you’re usually better off using StringBuilder.
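To make the comparison concrete, here’s a small sketch showing that the two classes are used the same way and produce the same text. The class and method names are hypothetical, chosen only for this illustration.

```java
class BufferVsBuilderDemo {
    // Unsynchronized: fine when the buffer never leaves the current thread.
    static String withBuilder(int recordId) {
        StringBuilder buf = new StringBuilder();
        buf.append("record ").append(recordId);
        return buf.toString();
    }

    // Same calls, but every append and toString is a synchronized method.
    static String withBuffer(int recordId) {
        StringBuffer buf = new StringBuffer();
        buf.append("record ").append(recordId);
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(withBuilder(42).equals(withBuffer(42)));
    }
}
```

Since the buffer in each method is a local variable that no other thread can see, the locking in the second version buys nothing here.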

Placeholder Replacement

If, like me, you came to Java from a C and C++ background, all this “adding” or appending of strings seems quite verbose when you’re used to something like printf or its cousin, sprintf. There are a variety of placeholder replacement strategies in the Java world, but they all revolve around one idea: create a string with some special characters representing the dynamic bits, and then specify parameters to inject the values you want.

Java has provided a version of this idea since nearly the beginning with the MessageFormat class. While there are many ways to use this class, the simplest is as follows:

String result = MessageFormat.format(
    "Processed record {0} for user {1} in {2}ms",
    recordId, userId, elapsedTime
);

Each value in curly braces corresponds to one of the remaining arguments, in order. If you were planning on printing this message a lot, you’d actually create an instance of the MessageFormat object and reuse it, saving the time necessary to parse the pattern each time.
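Here’s roughly what reusing an instance could look like. This is a sketch with assumptions: the class, field, and method names are mine, the parameter types are guesses, and I pass Locale.ROOT only to keep the output predictable. MessageFormat instances are not synchronized, so the shared instance is guarded by making the method synchronized.

```java
import java.text.MessageFormat;
import java.util.Locale;

class MessageFormatDemo {
    // The pattern is parsed once, when the class is initialized.
    private static final MessageFormat PROCESSED = new MessageFormat(
            "Processed record {0} for user {1} in {2}ms", Locale.ROOT);

    // synchronized because MessageFormat is not safe for concurrent use.
    static synchronized String describe(int recordId, int userId, long elapsedMillis) {
        return PROCESSED.format(new Object[] {recordId, userId, elapsedMillis});
    }

    public static void main(String[] args) {
        System.out.println(describe(42, 7, 15L));
    }
}
```

One thing to watch for: MessageFormat runs numeric arguments through a locale-aware number format, so large values may come out with grouping separators. If that’s not what you want for an ID, convert it to a String before passing it in.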

In a nod to the C/C++ community, Java 1.5 introduced String.format and the java.util.Formatter class behind it. These let you specify message patterns using much the same syntax as the printf function. For example:

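// %.3f, not %0.3f: in Java the 0 flag requires a width.
// Dividing by 1000.0 yields a double, which %f requires.
String result = String.format(
    "Processed record %d for user %d in %.3f seconds",
    recordId, userId, elapsedTime / 1000.0
);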

Unlike MessageFormat, though, there is no compiled pattern to hold on to here: String.format and java.util.Formatter parse the format string on every call. If you print this message a lot, measure before assuming the formatting is free.
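As a runnable sketch of the printf-style version: the class and method names are hypothetical, the parameter types are assumptions, and Locale.ROOT is passed only so the decimal separator doesn’t vary with the machine’s default locale.

```java
import java.util.Locale;

class FormatDemo {
    // Hypothetical helper: elapsedMillis is assumed to be a long in milliseconds.
    static String describe(int recordId, int userId, long elapsedMillis) {
        return String.format(Locale.ROOT,
                "Processed record %d for user %d in %.3f seconds",
                recordId, userId,
                elapsedMillis / 1000.0); // floating-point division, so %f applies
    }

    public static void main(String[] args) {
        System.out.println(describe(42, 7, 1500L));
    }
}
```

Two details matter here: %f only accepts floating-point values, so an integer division would fail at run time, and the 0 flag is only legal together with a width (for example %08.3f).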

These are just two examples from the Java standard library. There are plenty of other libraries that use this idea, from Spring’s PlaceholderResolver to the SLF4J log message format.

Placeholder replacement mechanisms make the code easier to read, but they are usually more expensive at run time, since the pattern has to be parsed and the arguments formatted.

Which to Use?

With all these different ways of combining and constructing Strings, which one is the best? As with most things, the answer is, “it depends.” But all else being equal, pick the version that is the most readable. Which one is the easiest to understand and maintain, given your particular use case?

That said, there are a few situations that are known to cause application performance issues:

  • Loops. If you’re looping over a list of things, make sure you’re using StringBuilder or letting a library, such as Guava, handle it for you. Don’t use string concatenation.
  • Repeated placeholder patterns. If you’re planning on using a particular string pattern a lot in the course of your application and the API supports it (MessageFormat does), build the object once, let it compile your pattern, and reuse that object.
  • Log messages. Pay attention to expensive string construction for the benefit of logging, especially at DEBUG levels. I’ve seen applications that performed poorly simply because there were thousands of debug-level logging statements creating strings that were never printed. Many modern logging frameworks, such as SLF4J and Log4j 2, have pattern-based (and lambda-based!) mechanisms that are only invoked if the message will actually be written.
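To illustrate the logging point without pulling in a third-party library, here’s a sketch using the JDK’s own java.util.logging, whose Logger accepts a Supplier<String> since Java 8. This is a stand-in for the SLF4J and Log4j 2 mechanisms mentioned above, not their API. The class name, the counter, and the helper methods are all hypothetical and exist only to show when the message is actually built.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

class LazyLogDemo {
    private static final Logger LOG = Logger.getLogger("LazyLogDemo");
    // Counts how many times the expensive message was really built.
    static final AtomicInteger BUILT = new AtomicInteger();

    static String expensiveMessage(int[] recordIds) {
        BUILT.incrementAndGet();
        StringBuilder buf = new StringBuilder("Record ids: ");
        for (int id : recordIds) {
            buf.append(id).append(' ');
        }
        return buf.toString();
    }

    // Logs one FINE message lazily and reports how often the message was built.
    static int buildsWhenLoggerAt(Level loggerLevel, int[] recordIds) {
        BUILT.set(0);
        LOG.setLevel(loggerLevel);
        // The lambda runs only if FINE is enabled for this logger.
        LOG.fine(() -> expensiveMessage(recordIds));
        return BUILT.get();
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        System.out.println("INFO level builds: " + buildsWhenLoggerAt(Level.INFO, ids));
        System.out.println("FINE level builds: " + buildsWhenLoggerAt(Level.FINE, ids));
    }
}
```

With the logger at INFO the supplier is never called, so the string is never built; with the logger at FINE it is built once.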

These are well-known patterns that you should pay attention to as good programming practice. Otherwise, follow Donald Knuth’s admonition that “premature optimization is the root of all evil.” Write something clear and easy to maintain. Optimize only when you have proved that a particular bit of code is a performance problem.

Conclusion

As you can see, there are a lot of ways to build a Java String object. If you’re new to programming, it’s hard to go wrong using either string concatenation or the StringBuilder class. As you get more experienced, you can start using the additional methods. I didn’t even touch on some of the template languages such as Velocity or Freemarker!

Whether you’re a novice or an experienced programmer, it’s good to review the basics from time to time. Take a moment to review how you’re building strings in your application. Are they readable? Are they using any known anti-patterns?

Keep learning, keep growing.

Question: What is your favorite method to build a Java String value?