Resource utilization in reactive services

Let me start this post with a question. Imagine a service returning a value that is fetched from another service (e.g. a database), and the fetch takes 1 second (for the sake of the example, let’s assume it always takes exactly 1 second):

import java.util.concurrent.TimeUnit;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class WebApplication {

    public static void main(String[] args) {
        SpringApplication.run(WebApplication.class, args);
    }

    @GetMapping("/value")
    String fetchValue() throws InterruptedException {
        // simulate a fetch from another service that always takes 1 second
        TimeUnit.SECONDS.sleep(1);
        return "42";
    }
}

How many transactions per second can we get when we hit this service with 10 concurrent users?

I know you already have an answer, but let’s be good software engineers and measure instead of guessing. Let’s run siege in benchmark mode with 10 concurrent users, each issuing 10 consecutive requests:

$ siege -b -c 10 -r 10 http://localhost:8080/value
Transactions:                100 hits
Availability:             100.00 %
Elapsed time:              10.05 secs
Data transferred:           0.00 MB
Response time:              1.00 secs
Transaction rate:           9.95 trans/sec
Throughput:                 0.00 MB/sec
Concurrency:                9.99
Successful transactions:     100
Failed transactions:           0
Longest transaction:        1.01
Shortest transaction:       1.00

We got 9.95 transactions per second, which is close to the theoretical maximum of 10 tps (10 concurrent users, each completing one 1-second request at a time, can produce at most 10 requests per second).

That was easy. Let’s make it a bit more interesting: how many tps will we get if we increase the number of concurrent users to 50?

$ siege -b -c 50 -r 10 http://localhost:8080/value
Transactions:                500 hits
Availability:             100.00 %
Elapsed time:              25.05 secs
Data transferred:           0.00 MB
Response time:              2.40 secs
Transaction rate:          19.96 trans/sec
Throughput:                 0.00 MB/sec
Concurrency:               47.87
Successful transactions:     500
Failed transactions:           0
Longest transaction:        3.01
Shortest transaction:       1.00

WAT? It’s not even near 50 tps. How is that even possible? Well, let me share a secret with you: I have set the maximum number of Tomcat worker threads to 20.

server.tomcat.max-threads=20

Twenty worker threads, each serving one 1-second request at a time, cap the service at 20 tps, and 500 requests at 20 tps take 25 seconds, which is exactly what siege measured. Now that we know the limiting factor, let’s get rid of this custom worker thread limit and repeat the test with the default settings (200 worker threads in the case of Tomcat 8.5):

$ siege -b -c 50 -r 10 http://localhost:8080/value
Transactions:                500 hits
Availability:             100.00 %
Elapsed time:              10.06 secs
Data transferred:           0.00 MB
Response time:              1.00 secs
Transaction rate:          49.70 trans/sec
Throughput:                 0.00 MB/sec
Concurrency:               49.95
Successful transactions:     500
Failed transactions:           0
Longest transaction:        1.01
Shortest transaction:       1.00

The actual numbers (yes, we got close to 50 tps) are not as interesting as the view of the thread usage:

[Chart: live thread count over time for the Servlet-based service]

We start with close to 30 live threads in our service; when the users’ requests hit it, we quickly reach 70 live threads, which are kept alive for some time after the traffic is gone in case they can be reused.

Keeping in mind that we’re limited by the number of worker threads, we can easily tell that once the number of concurrent requests exceeds that limit, requests start queuing (can you tell what your service would do if such an excessive hit rate lasted for a longer period of time?).
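Part of the answer lies in the connector configuration. In Spring Boot, for instance, the queue of connections waiting for a free worker thread is capped by the property sketched below (100 is Tomcat’s default); once that queue fills up too, new connections get refused:

server.tomcat.accept-count=100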

With long-running tasks handled that way, we can easily make our service unresponsive. That’s not reactive at all, not resilient at all, not any-other-buzzword-capable at all, and no service in 2017 should be so dull. So let’s sprinkle our service with some reactive magic by replacing the old handler method with a reactive one:

import java.time.Duration;
import reactor.core.publisher.Mono;

@GetMapping("/value")
Mono<String> fetchValue() {
    return Mono.just("42")
        // non-blocking 1-second delay instead of Thread.sleep()
        .delayElement(Duration.ofSeconds(1));
}

and using the org.springframework.boot:spring-boot-starter-webflux¹ dependency instead of org.springframework.boot:spring-boot-starter-web.
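For reference, in a Gradle build the swap could look like the sketch below (assuming the Spring Boot Gradle plugin manages the dependency versions; adjust to your build tool of choice):

dependencies {
    // replace the Servlet-stack starter...
    // compile 'org.springframework.boot:spring-boot-starter-web'
    // ...with the reactive-stack one
    compile 'org.springframework.boot:spring-boot-starter-webflux'
}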

With all these reactive bits and pieces in place, let’s see what happens when we hit our service with 50 concurrent users:

$ siege -b -c 50 -r 10 http://localhost:8080/value
Transactions:                500 hits
Availability:             100.00 %
Elapsed time:              10.06 secs
Data transferred:           0.00 MB
Response time:              1.01 secs
Transaction rate:          49.70 trans/sec
Throughput:                 0.00 MB/sec
Concurrency:               49.99
Successful transactions:     500
Failed transactions:           0
Longest transaction:        1.02
Shortest transaction:       1.00

Almost 50 tps, the same result as we got with the non-reactive, plain old Servlet-based example running on Tomcat with the default 200 worker threads. It doesn’t look impressive until you take a look at the thread usage:

[Chart: live thread count over time for the reactive service under 50 concurrent users]

We start with around 20 live threads and then go up to 30 of them (the number of threads used depends on the number of CPU cores you have). Not bad for handling 50 concurrent requests.
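If you want to see for yourself which threads do the work, you can log the thread name as each value is emitted; here is a minimal sketch (note that delayElement publishes on Reactor’s shared parallel scheduler by default, so that is the pool you will observe):

@GetMapping("/value")
Mono<String> fetchValue() {
    return Mono.just("42")
        .delayElement(Duration.ofSeconds(1))
        // prints the name of the scheduler thread emitting the value; no matter
        // how many concurrent users hit the service, only a small fixed set of
        // thread names shows up
        .doOnNext(value -> System.out.println(Thread.currentThread().getName()));
}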

Can you guess how many threads will be used to handle 250 concurrent users?

[Chart: live thread count over time for the reactive service under 250 concurrent users]

It’s still the same number of threads, and we were able to get close to 250 tps:

$ siege -b -c 250 -r 10 http://localhost:8080/value
Transactions:               2500 hits
Availability:             100.00 %
Elapsed time:              10.08 secs
Data transferred:           0.00 MB
Response time:              1.01 secs
Transaction rate:         248.02 trans/sec
Throughput:                 0.00 MB/sec
Concurrency:              249.44
Successful transactions:    2500
Failed transactions:           0
Longest transaction:        1.03
Shortest transaction:       1.00

Of course, achieving such good results was only possible because we were reactive all the way down the stack. Had we blocked on the execution thread, the results would have been worse than those from the classic Servlet-based service. Moreover, we wouldn’t achieve such high tps rates with computation-intensive tasks; typical services, however, spend a lot of time blocked on I/O.

Having said that, let me stress it once more: we were able to handle 250 concurrent requests with just a handful of threads. As the old Unix saying goes, “less is more”.

As a side note: if you prefer functional-style programming, you can replace the Spring MVC-style handler method mapping (and get rid of the @RestController annotation as well) with a RouterFunction definition that routes incoming requests to handler functions:

import static org.springframework.web.reactive.function.server.RequestPredicates.GET;
import static org.springframework.web.reactive.function.server.RouterFunctions.route;

@Bean
RouterFunction<ServerResponse> routerFunction() {
    return route(GET("/value"), request -> fetchValueHandler());
}

Mono<ServerResponse> fetchValueHandler() {
    return ServerResponse.ok()
            .body(fetchValue(), String.class);
}

Mono<String> fetchValue() {
    return Mono.just("42")
            .delayElement(Duration.ofSeconds(1));
}

but it makes your service neither run faster nor use fewer threads.

Do you feel distracted by this side note about functional-style routing? I hope so, because I’ll ask one more question about the achievable tps rate for the following example:

@GetMapping("/value")
Mono<String> fetchValue() throws InterruptedException {
    TimeUnit.SECONDS.sleep(1);
    return Mono.just("42");
}

How many tps can we get when we hit this service with 100 concurrent users?

In this case the tps rate is really scary, way worse than the one from the non-reactive, plain-old Servlet-based example. I hope you spotted the problematic part.

The problem is that we blocked on the execution thread, and that means really low tps rates (how many [hyper-threaded] CPU cores do you have?) and lots of queued (or timed-out) requests.
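If you genuinely have to call blocking code, the usual remedy is to shift it off the event loop onto a scheduler meant for blocking work. Here is a minimal sketch using Reactor’s elastic scheduler (it keeps the event-loop threads free at the cost of a dedicated, on-demand thread pool, so it limits the damage rather than making the call truly non-blocking):

@GetMapping("/value")
Mono<String> fetchValue() {
    return Mono.fromCallable(() -> {
        // stand-in for a genuinely blocking fetch
        TimeUnit.SECONDS.sleep(1);
        return "42";
    }).subscribeOn(Schedulers.elastic()); // run the blocking work off the event loop
}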

As you can see, reactive programming can yield great resource usage optimizations and allow you to achieve better throughput, but you have to understand at least the basics of this approach so that you don’t bring your services down to their knees. And that’s only part of the reactive services story; there are other important pieces, like request and response body serialization or backpressure, to name just a few.

  ¹ At the time of writing, WebFlux (which is part of Spring Framework 5.0) is in RC1 and Spring Boot 2.0 is still available only as snapshot releases.
