Have you thought about your threads lately? No, I’m not referring to how stylish your clothing looks. Have you thought about how you execute the asynchronous operations in your application? Splitting off tasks into separate threads is often a great way to improve performance or reduce response times. But if you don’t give any thought to how those tasks are organized and managed, things can blow up quickly.
Limited Resources
Threads are a finite resource. On Linux/Unix systems, threads count against the per-user process limit, across all of a user’s applications. Running “ulimit -u” will tell you how many processes/threads you have available. In addition, each thread consumes memory and CPU just to exist, even when it’s doing nothing.
To account for this limited resource, developers typically create a pool of threads that are shared among the various tasks. This limits the number of threads created and allows them to be reused, avoiding the cost of spinning up a new one. Often, this is a general purpose pool of threads for performing any background task you may want to run.
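As a sketch, a shared general-purpose pool in Java might look like this (the pool size and task count are arbitrary; the point is that 10 tasks reuse at most 4 threads):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedPool {
    // Runs 10 small tasks on a shared pool of 4 threads and reports
    // which threads did the work; at most 4 distinct threads appear,
    // because threads are reused rather than created per task.
    static Set<String> runTasks() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < 10; i++) {
            pool.submit(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return threadNames;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("threads used: " + runTasks());
    }
}
```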
Threads Behaving Badly
The problem with a general purpose thread pool is that a badly behaving operation can chew up all the threads, starving out all the other operations. It doesn’t have to be a tricky deadlock scenario either. There are any number of reasons a thread might freeze, or otherwise tie up all the threads in the pool:
- Long-running database query
- Infinite loop
- Deadlock
- A sudden influx of requests
- Network latency
Notice that some of these items don’t actually involve locking up a thread entirely. They’re just long-running operations which prevent or delay other work from getting done by the thread pool.
Queue Up
One common way to handle thread pool exhaustion is to put a queue in front of it. If the thread pool is fully utilized, operations are queued up to run on the next available thread. This can help in some situations, but it has a few drawbacks:
- Operations are delayed. This might not be acceptable, especially if you have a user request waiting for an operation to complete.
- Failure is still an option. You still have to handle the case where the operation doesn’t run at all. Unless you happen to like memory leaks, the queue must have a limit to the number of operations it will accept.
Many times, you’re better off limiting or eliminating the queue entirely. If the thread pool is that saturated, something is probably wrong, and the application should react accordingly.
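A pool that fails fast when saturated can be sketched with a bounded queue and a rejection policy (the sizes here are arbitrary):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedQueue {
    // 2 threads with a queue of 4: once both threads are busy and the
    // queue is full, further submissions are rejected immediately
    // instead of piling up without bound.
    static boolean saturate() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2,                       // core and max pool size
                0L, TimeUnit.MILLISECONDS,  // no keep-alive needed here
                new ArrayBlockingQueue<>(4),
                new ThreadPoolExecutor.AbortPolicy()); // throw when saturated
        Runnable slow = () -> {
            try { Thread.sleep(1000); } catch (InterruptedException ignored) {}
        };
        boolean rejected = false;
        try {
            for (int i = 0; i < 10; i++) {
                pool.submit(slow); // 2 run, 4 queue; the 7th is rejected
            }
        } catch (RejectedExecutionException e) {
            rejected = true; // the pool is saturated -- shed load, fail fast
        } finally {
            pool.shutdownNow();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("saturated pool rejected work: " + saturate());
    }
}
```

AbortPolicy is the default handler; CallerRunsPolicy is a common alternative that applies back-pressure by running the task on the submitting thread instead of throwing.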
Targeted Thread Pools
A better way to handle things is to create specialized, targeted thread pools for each of the types of operations you want to execute. For example, you might have thread pools for:
- Particular database queries
- Individual API endpoints
- Sending messages to a broker (e.g. RabbitMQ or Kafka)
- Writing log messages
Creating a thread pool for a specific purpose isolates it from other pools. If the operations running in one thread pool start to misbehave, they don’t interfere with or prevent the execution of operations on the other thread pools. It also makes debugging easier. If you see a thread pool with the name “kafka-send-thread-1”, you have a pretty good idea what that thread is supposed to be doing.
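The isolation can be sketched directly. In this hypothetical setup, a “database” pool is deliberately wedged with a stuck task, and a send on a separate “broker” pool still completes promptly (the pool names and sizes are illustrative, not a prescription):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class TargetedPools {
    // Saturate a single-thread db pool with a "stuck" query, then show
    // that work on a separate broker pool is unaffected.
    static boolean brokerUnaffected() throws Exception {
        CountDownLatch dbBusy = new CountDownLatch(1);
        ExecutorService dbPool = Executors.newFixedThreadPool(1);
        dbPool.submit(() -> {                 // a "stuck" database query
            try { dbBusy.await(); } catch (InterruptedException ignored) {}
        });

        ExecutorService brokerPool = Executors.newFixedThreadPool(1);
        Future<String> send = brokerPool.submit(() -> "sent");
        String result = send.get(1, TimeUnit.SECONDS); // completes promptly

        dbBusy.countDown();                   // release the stuck query
        dbPool.shutdown();
        brokerPool.shutdown();
        return "sent".equals(result);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("broker pool unaffected: " + brokerUnaffected());
    }
}
```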
The Nuts and Bolts
How you create these thread pools is entirely up to you and the programming language you’re using. In Java, there are a few options:
- Manually. Using Executors.newFixedThreadPool() is the quickest way to get started. If you want to control the size of the queue, or you want the thread pool to grow and shrink over time, create a ThreadPoolExecutor instance. Just keep in mind that the pool doesn’t grow beyond its core size until the queue is full.
- Spring Task Executor. Specifically, the ThreadPoolTaskExecutor can be used to create a thread pool with a configurable queue size and a pool of threads that can grow and shrink. In newer versions of Spring, you can simply annotate the methods you want executed asynchronously with the @Async annotation. This annotation takes the name of a task executor, allowing you to control which thread pool the operation executes on.
- Hystrix. This library from Netflix provides a whole framework around service isolation, including thread pooling. Each “command” runs in a separate thread pool, with no queueing by default. In addition, it implements the Circuit Breaker pattern to avoid sending requests to a service that is known to have problems.
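That ThreadPoolExecutor quirk about core size can be observed directly. This sketch (sizes arbitrary) submits enough tasks to fill the queue and watches the pool size: no thread beyond the core is created until the queue overflows.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class GrowthQuirk {
    // core 1, max 4, queue capacity 2: returns the pool size after the
    // queue has just filled, and after one more submission overflows it.
    static int[] poolSizes() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 4, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2));
        Runnable sleeper = () -> {
            try { Thread.sleep(500); } catch (InterruptedException ignored) {}
        };
        pool.submit(sleeper);            // taken by the single core thread
        pool.submit(sleeper);            // queued
        pool.submit(sleeper);            // queued; queue is now full
        int before = pool.getPoolSize(); // still 1: only the core thread
        pool.submit(sleeper);            // queue full, so a 2nd thread is created
        int after = pool.getPoolSize();  // now 2
        pool.shutdownNow();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] sizes = poolSizes();
        System.out.println("pool size before overflow: " + sizes[0]); // 1
        System.out.println("pool size after overflow: " + sizes[1]);  // 2
    }
}
```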
One final note. Whatever mechanism you use, make sure to name your threads. The default Java Executors will give each thread rather unimaginative names like “pool-3-thread-2”. This gives you very little to go on when it comes time to troubleshoot your application. Various implementations of ThreadFactory exist (Guava has one, for example) that allow you to provide more descriptive names to your threads.
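A minimal hand-rolled ThreadFactory illustrates the idea; Guava’s ThreadFactoryBuilder offers the same with more options (daemon flag, priority, and so on). The kafka-send prefix is just an example name:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedThreads {
    // A simple factory that gives each thread a descriptive name,
    // e.g. "kafka-send-thread-1", "kafka-send-thread-2", ...
    static ThreadFactory named(String prefix) {
        AtomicInteger counter = new AtomicInteger(1);
        return runnable -> {
            Thread t = new Thread(runnable);
            t.setName(prefix + "-thread-" + counter.getAndIncrement());
            return t;
        };
    }

    static String firstThreadName() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2, named("kafka-send"));
        String name = pool.submit(() -> Thread.currentThread().getName()).get();
        pool.shutdown();
        return name; // "kafka-send-thread-1"
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstThreadName());
    }
}
```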
Tame Your Threads
Multi-threaded applications are a way of life these days. Pay attention to how you’re managing your threads or you’ll find they grow into a jumbled mess. Create separate thread pools for each type of operation you need to run asynchronously in your application, limit the size of any queue you put in front of the pools, and make sure your threads have reasonable names.
Managing multiple threads can be tricky to get right. A little organization can make it easier to keep track of what’s going on.