
What a load of $@#!

I decided to do a quick load test against the blog with Apache bench (ab) from my local machine to get a baseline for performance. To my dismay, CPU on the EC2 micro instance was the bottleneck.

Requests per second maxed out at approximately 65. Horrible. Okay, it's not that bad, but it could be better.

Apache bench results

Baseline

Here's what Apache bench returned with concurrency set to 1 and number of requests set to 10:

$ ab -n10 -c1 http://superuser.do/
This is ApacheBench, Version 2.3 <$Revision: 1373084 $>
Benchmarking superuser.do (be patient).....done

Server Software:        nginx/1.4.3
Server Hostname:        superuser.do
Server Port:            80
Document Path:          /
Document Length:        3372 bytes

Concurrency Level:      1
Time taken for tests:   2.510 seconds
Complete requests:      10
Failed requests:        0
Write errors:           0
Total transferred:      35740 bytes
HTML transferred:       33720 bytes
Requests per second:    3.98 [#/sec] (mean)
Time per request:       250.970 [ms] (mean)
Time per request:       250.970 [ms] (mean, across all concurrent requests)
Transfer rate:          13.91 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      111  116   3.2    117     121
Processing:   127  135   4.8    134     143
Waiting:      126  134   4.9    133     141
Total:        242  251   5.9    253     258

Percentage of the requests served within a certain time (ms)
  50%    253
  66%    256
  75%    256
  80%    257
  90%    258
  95%    258
  98%    258
  99%    258
 100%    258 (longest request)

The raw data

The statistics for concurrency levels of 5, 10, 25, 50, and 100 are below:

Concurrency Level:      5
Requests per second:    18.64 [#/sec] (mean)
Time per request:       268.174 [ms] (mean)
              min  mean[+/-sd] median   max
Connect:      112  118   4.7    116     131
Processing:   127  148  13.7    146     180
Waiting:      126  147  13.8    145     179
Total:        242  265  16.3    262     309


Concurrency Level:      10
Requests per second:    35.40 [#/sec] (mean)
Time per request:       282.503 [ms] (mean)
              min  mean[+/-sd] median   max
Connect:      111  117   3.4    117     132
Processing:   126  155  30.8    144     266
Waiting:      125  154  30.8    143     265
Total:        241  272  31.6    261     388


Concurrency Level:      25
Requests per second:    62.44 [#/sec] (mean)
Time per request:       400.409 [ms] (mean)
              min  mean[+/-sd] median   max
Connect:      111  117   3.3    116     133
Processing:   128  270  89.3    257     470
Waiting:      126  269  89.2    256     469
Total:        241  387  89.6    373     588


Concurrency Level:      50
Requests per second:    65.95 [#/sec] (mean)
Time per request:       758.153 [ms] (mean)
              min  mean[+/-sd] median   max
Connect:      111  118   4.9    117     135
Processing:   321  622 109.1    610     889
Waiting:      320  621 109.1    609     889
Total:        436  741 111.0    726    1023


Concurrency Level:      100
Requests per second:    66.98 [#/sec] (mean)
Time per request:       1493.048 [ms] (mean)
              min  mean[+/-sd] median   max
Connect:      111  118   3.7    117     138
Processing:   577 1297 119.6   1321    1651
Waiting:      576 1295 120.9   1320    1650
Total:        696 1414 120.2   1437    1780
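
Incidentally, a sweep like this is easy to script rather than run one level at a time. Here's a rough sketch, assuming bash and the same URL as above (the request count and output filenames are just placeholders):

$ for c in 5 10 25 50 100; do
>     ab -n 1000 -c "$c" http://superuser.do/ > "ab-c${c}.txt"    # save each full report
> done
$ grep "Requests per second" ab-c*.txt    # pull the headline numbers back out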

Cumulative request completion

I ran a longer test with the concurrency level set to 100 and the number of requests set to 5000. The following plot shows a sort of sideways CDF of requests by time to complete:

ab-plot

It makes more sense reflected, with the x-axis as time and the y-axis as the number of requests completed within x milliseconds.
ab-plot-reflected
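
For anyone who wants to reproduce this kind of plot: ab can dump per-request timings to a tab-separated file with -g, which gnuplot can then draw. A minimal sketch (the filenames are placeholders, and you may need to sort the rows by total time first):

$ ab -n 5000 -c 100 -g timings.tsv http://superuser.do/
$ gnuplot <<'EOF'
set datafile separator "\t"
set terminal png
set output "ab-plot.png"
set ylabel "total time (ms)"
plot "timings.tsv" every ::1 using 5 with lines title "ttime"   # column 5 = ttime; skip the header row
EOF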

Resource utilization

Using pm2's termcap monitoring interface (pm2 monit), I pulled up the following CPU and memory utilization during the tests. Memory stayed roughly constant at around 100 MB, but CPU utilization was maxing out once concurrency exceeded 25 (probably somewhere around 20 or so). Not very scientific, I know, but I wasn't actually planning on doing a real load test.

pm2-monit
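
For the record, the "monitoring" amounted to little more than the following (assuming the blog runs under pm2):

$ pm2 monit    # termcap dashboard with per-process CPU and memory
$ top          # overall CPU, to confirm the instance was pegged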

Flaws in methodology

There are a number of flaws in my ad-hoc load testing. Here are a few:

  • I didn't have proper monitoring when I performed the testing. I was just using a combination of top and monit.
  • I didn't have a plan going into the testing - it was something I chose to do on a whim. I should have determined whether I wanted to measure performance (requests per second), scalability (concurrent connections), or stability.
  • All the load tests were run from my laptop, introducing additional variables such as network latency, client resource limitations, etc.
    • I should have set up another EC2 instance to perform the load testing from.
    • Or, I should have used a distributed load testing mechanism to generate more realistic load. For example, blitz.io offers a nice service.
  • I should have performed multiple trials at each concurrency level (see the sketch after this list).
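
As a sketch of what that last point could look like - say, five trials at one concurrency level, averaged with awk (the trial count, request count, and URL are arbitrary):

$ for i in $(seq 5); do ab -n 1000 -c 50 http://superuser.do/ 2>/dev/null; done \
>     | awk '/Requests per second/ { sum += $4; n++ } END { print sum / n, "rps (mean of", n, "trials)" }'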

So what now?

Well, I think I'll be okay with what I've got for the time being. I don't expect I'll be getting much traffic anytime soon; however, it's pretty trivial to change instance sizes. If I do see a sudden bump in traffic, then I'll probably move up to a small instance.
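
Resizing really is only a few steps. A rough sketch with the aws CLI (the instance ID and target size are placeholders, and the instance has to be stopped first):

$ aws ec2 stop-instances --instance-ids i-0123abcd
$ aws ec2 modify-instance-attribute --instance-id i-0123abcd --instance-type Value=m1.small
$ aws ec2 start-instances --instance-ids i-0123abcd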

Things to investigate

I may try self-throttling CPU utilization in order to avoid Amazon's severe CPU throttling on micro instances. My hypothesis is that net throughput would increase and the blog would be more stable under sudden load.
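
One cheap way to experiment with that would be something like cpulimit, which caps a process at a fixed CPU percentage from the outside. Just a sketch - the process match and the percentage are made up, and I haven't tried this here:

$ sudo apt-get install cpulimit
$ cpulimit -l 40 -p "$(pgrep -f 'node.*app.js')"    # cap the node process at roughly 40% CPU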

In addition to making do with what I have, I'll probably take a day or so trying out the different instance sizes. However, I need to get some better metrics and monitoring in place before committing my wallet to that.

Aside from Apache bench, I'll probably also try to do some testing using blitz.io.

Testing on a small instance

Just for kicks, I reran the Apache bench load test using a small instance, and these were the results:

ab-plot-small

As you can see, the results were much better this time around - much more consistent and stable.