I decided to do a quick load test using ApacheBench (ab) on the blog from my local machine to get a baseline for performance. To my dismay, CPU on the EC2 micro instance was the bottleneck.
Requests per second maxed out at approximately 65 rps. Horrible! Okay, it's not that bad, but it could be better.
ApacheBench results
Here's what ApacheBench returned with the concurrency set to 1 and the number of requests set to 10:
```
$ ab -n10 -c1 notsuperuser.do/
This is ApacheBench, Version 2.3 <$Revision: 1373084 $>

Benchmarking superuser.do (be patient).....done

Server Software:        nginx/1.4.3
Server Hostname:        superuser.do
Server Port:            80

Document Path:          /
Document Length:        3372 bytes

Concurrency Level:      1
Time taken for tests:   2.510 seconds
Complete requests:      10
Failed requests:        0
Write errors:           0
Total transferred:      35740 bytes
HTML transferred:       33720 bytes
Requests per second:    3.98 [#/sec] (mean)
Time per request:       250.970 [ms] (mean)
Time per request:       250.970 [ms] (mean, across all concurrent requests)
Transfer rate:          13.91 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      111  116   3.2    117     121
Processing:   127  135   4.8    134     143
Waiting:      126  134   4.9    133     141
Total:        242  251   5.9    253     258

Percentage of the requests served within a certain time (ms)
  50%    253
  66%    256
  75%    256
  80%    257
  90%    258
  95%    258
  98%    258
  99%    258
 100%    258 (longest request)
```
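The interesting headline numbers are buried in that wall of output, but they're easy to pull out with awk. To keep this concrete without re-running the benchmark, the sketch below seeds a file with lines from the run above; in practice you'd just redirect the real run (`ab ... > report.txt`). The filename `report.txt` is my own placeholder:

```shell
# Seed a stand-in report with lines from the actual ab run above.
cat > report.txt <<'EOF'
Requests per second:    3.98 [#/sec] (mean)
Time per request:       250.970 [ms] (mean)
Transfer rate:          13.91 [Kbytes/sec] received
EOF

# Extract just the mean requests-per-second figure (4th field).
awk '/Requests per second/ {print $4}' report.txt
```

The same one-liner works on any saved ab report, which makes comparing runs much less tedious than scrolling.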
The raw data
The statistics for concurrency levels of 5, 10, 25, 50, and 100 are below:
```
Concurrency Level:      5
Requests per second:    18.64 [#/sec] (mean)
Time per request:       268.174 [ms] (mean)

              min  mean[+/-sd] median   max
Connect:      112  118   4.7    116     131
Processing:   127  148  13.7    146     180
Waiting:      126  147  13.8    145     179
Total:        242  265  16.3    262     309

Concurrency Level:      10
Requests per second:    35.40 [#/sec] (mean)
Time per request:       282.503 [ms] (mean)

              min  mean[+/-sd] median   max
Connect:      111  117   3.4    117     132
Processing:   126  155  30.8    144     266
Waiting:      125  154  30.8    143     265
Total:        241  272  31.6    261     388

Concurrency Level:      25
Requests per second:    62.44 [#/sec] (mean)
Time per request:       400.409 [ms] (mean)

              min  mean[+/-sd] median   max
Connect:      111  117   3.3    116     133
Processing:   128  270  89.3    257     470
Waiting:      126  269  89.2    256     469
Total:        241  387  89.6    373     588

Concurrency Level:      50
Requests per second:    65.95 [#/sec] (mean)
Time per request:       758.153 [ms] (mean)

              min  mean[+/-sd] median   max
Connect:      111  118   4.9    117     135
Processing:   321  622 109.1    610     889
Waiting:      320  621 109.1    609     889
Total:        436  741 111.0    726    1023

Concurrency Level:      100
Requests per second:    66.98 [#/sec] (mean)
Time per request:       1493.048 [ms] (mean)

              min  mean[+/-sd] median   max
Connect:      111  118   3.7    117     138
Processing:   577 1297 119.6   1321    1651
Waiting:      576 1295 120.9   1320    1650
Total:        696 1414 120.2   1437    1780
```
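Each of those blocks came from a separate manual ab invocation. The sweep is trivial to script; this sketch just prints the commands it would run (the request count of 1000 and the `report-c*.txt` filenames are illustrative), so drop the `echo` to actually fire the runs:

```shell
# Sweep the same concurrency levels tested above, one ab run per level,
# saving each report to its own file. Echo-only so it's safe to preview.
for c in 5 10 25 50 100; do
  echo "ab -n 1000 -c $c superuser.do/ > report-c$c.txt"
done
```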
Cumulative request completion
I ran a longer test with the concurrency level set to 100 and the number of requests set to 5000. The following plot shows a sort of sideways CDF of requests by time to complete:
It makes more sense reflected, with time on the x-axis and the number of requests completed within x milliseconds on the y-axis.
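ab can actually record the raw data behind this plot itself: `-g` writes a tab-separated file with one row per request (columns: starttime, seconds, ctime, dtime, ttime, wait, where ttime is the total time in ms). The real run would be `ab -n 5000 -c 100 -g timings.tsv superuser.do/`; the sketch below uses a tiny hand-made stand-in file (the three rows are invented for illustration) to show how sorting by total time yields the CDF, since the row number then reads as "requests completed within this many milliseconds":

```shell
# Stand-in for ab's -g output: header plus three fabricated rows.
printf 'starttime\tseconds\tctime\tdtime\tttime\twait\n' >  timings.tsv
printf 'req\t1\t117\t134\t251\t133\n'                    >> timings.tsv
printf 'req\t1\t116\t140\t256\t139\n'                    >> timings.tsv
printf 'req\t1\t118\t130\t248\t129\n'                    >> timings.tsv

# Skip the header, sort numerically by total time (column 5), and pair
# each request's rank with its completion time: a ready-to-plot CDF.
tail -n +2 timings.tsv | sort -k5,5n | awk '{print NR "\t" $5}'
```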
Using monit's terminal interface, I watched CPU and memory utilization during the tests. Memory stayed roughly constant at around 100 MB, but CPU utilization was maxing out once concurrency exceeded 25 (probably somewhere around 20 or so). Not very scientific, I know, but I wasn't actually planning on doing a real load test.
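Monit's summary view is pretty coarse for this. A lighter-weight option (assuming a Linux box with procps installed) is to let vmstat sample the counters once a second during a run; on EC2 micro instances, a rising "st" (steal) column is the tell-tale sign of the hypervisor clamping the CPU:

```shell
# Sample CPU counters once per second, twice. During a load test, watch
# the us/sy columns for saturation and the st column for hypervisor steal.
vmstat 1 2
```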
Flaws in methodology
There are a number of flaws in my ad-hoc load testing. Here are a few:
- I didn't have proper monitoring when I performed the testing - I was just eyeballing monit while the runs were in progress.
- I didn't have a plan going into the testing - it was something I chose to do on a whim. I should have determined whether I wanted to measure performance (requests per second), scalability (concurrent connections), or stability.
- All the load tests were run from my laptop, introducing additional variables such as network latency, client resource limitations, etc.
- I should have set up another EC2 instance to perform the load testing from.
- Or, I should have used a distributed load testing mechanism to generate more realistic load. For example, blitz.io offers a nice service.
- I should have performed multiple trials at each concurrency level.
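That last flaw is the easiest one to fix: run several trials per concurrency level and average the requests-per-second figures instead of trusting a single run. Assuming each trial's report was saved to its own file (e.g. `report-c25-t1.txt`, a naming scheme of my own invention), the per-trial figures could be grepped out with `awk '/Requests per second/ {print $4}' report-c25-t*.txt`. The three sample readings below are made up for illustration:

```shell
# Average a set of per-trial requests-per-second readings with awk.
# The three values here are hypothetical stand-ins for real trials.
printf '62.44\n60.10\n64.02\n' | awk '{sum += $1} END {printf "%.2f\n", sum/NR}'
```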
So what now?
Well, I think I'll be okay with what I've got for the time being. I don't suspect I'll be getting that much traffic anytime soon; however, it's pretty trivial to change instance sizes. If I do see a sudden bump in traffic, then I'll probably switch up to a small instance.
Things to investigate
I may try self-throttling CPU utilization in order to avoid Amazon's severe CPU throttling on micro instances. My hypothesis is that net throughput would increase and the blog would be more stable under sudden load.
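One way to approximate self-throttling (an assumption on my part, not something tested here) is to cap the request rate in nginx, which is already fronting the blog per the ab output, so CPU never climbs to the point where Amazon's hard throttle engages. The zone name and the 30 r/s figure below are illustrative; `limit_req_zone` belongs in the `http` context:

```nginx
# Illustrative sketch: shed excess load at nginx instead of letting the
# hypervisor throttle the whole instance. Tune rate/burst to taste.
limit_req_zone $binary_remote_addr zone=blog:10m rate=30r/s;

server {
    listen 80;
    location / {
        limit_req zone=blog burst=20;
    }
}
```

Requests beyond the burst get a quick error rather than queueing up behind a starved CPU, which is arguably the more stable failure mode under sudden load.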
In addition to making do with what I have, I'll probably take a day or so to try out the different instance sizes. However, I need to get some better metrics and monitoring before committing my wallet to that.
Aside from ApacheBench, I'll probably also try to do some testing using blitz.io.
Testing on a small instance
Just for kicks, I reran the ApacheBench load test on a small instance, and these were the results:
As you can see, the results were much better this time around - much more consistent and stable.