CherryPy 60x as slow in benchmark with 8 requesting threads compared to 7
I'm curious why, when benchmarking the Python web server CherryPy using ab with -c 7 (7 concurrent threads), it can serve 1500 requests/s (about what I expect), but when I change to -c 8 it drops way down to 25 requests/s. I'm running CherryPy with numthreads=10 (but it doesn't make a difference if I use numthreads=8 or 20) on a 64-bit Windows machine with four cores running Python 2.6.
I half suspect the Python GIL is part of the issue, but I don't know why it only happens when I get up to 8 concurrently requesting threads. On a four-core machine I'd expect it might change at -c 4, but this is not the case.
I'm using the one-file CherryPy web server that comes with web.py, and here's the WSGI app that I'm testing against:
from web.wsgiserver import CherryPyWSGIServer

def application(environ, start_response):
    # Minimal WSGI app: always answer with a fixed plain-text body.
    start_response("200 OK", [("Content-type", "text/plain")])
    return ["Hello World!",]

# Listen on all interfaces, port 80, with a pool of 10 worker threads.
server = CherryPyWSGIServer(('0.0.0.0', 80), application, numthreads=10)
try:
    server.start()
except KeyboardInterrupt:
    server.stop()
The ab output for 7 and 8 concurrent threads is:
C:\> ab -n 1000 -c 7 http://localhost/
...
Concurrency Level: 7
Time taken for tests: 0.670 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 130000 bytes
HTML transferred: 12000 bytes
Requests per second: 1492.39 [#/sec] (mean)
Time per request: 4.690 [ms] (mean)
Time per request: 0.670 [ms] (mean, across all concurrent requests)
Transfer rate: 189.46 [Kbytes/sec] received
C:\> ab -n 1000 -c 8 http://localhost/
...
Concurrency Level: 8
Time taken for tests: 7.169 seconds
Complete requests: 158
Failed requests: 0
Write errors: 0
Total transferred: 20540 bytes
HTML transferred: 1896 bytes
Requests per second: 22.04 [#/sec] (mean)
Time per request: 362.973 [ms] (mean)
Time per request: 45.372 [ms] (mean, across all concurrent requests)
Transfer rate: 2.80 [Kbytes/sec] received
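For reference, the derived figures in the two ab reports above follow from the raw counters. The sketch below recomputes them; the helper name ab_summary is made up for illustration, and the formulas are inferred from the printed numbers rather than taken from ab's source, so treat them as an assumption. Note that the -c 8 run reports only 158 completed requests out of the 1000 requested.

```python
def ab_summary(concurrency, complete, seconds):
    """Recompute ab's derived figures from its raw counters (inferred formulas)."""
    # Throughput: completed requests divided by wall-clock time.
    requests_per_second = complete / seconds
    # Mean time per request as seen by one concurrent client, in ms.
    per_request_ms = concurrency * seconds / complete * 1000.0
    # Mean time per request across all concurrent clients, in ms.
    per_request_all_ms = seconds / complete * 1000.0
    return requests_per_second, per_request_ms, per_request_all_ms


# (concurrency, completed requests, seconds) copied from the two reports.
for c, n, t in [(7, 1000, 0.670), (8, 158, 7.169)]:
    rps, mean_ms, across_ms = ab_summary(c, n, t)
    print("-c %d: %.2f req/s, %.3f ms per request, %.3f ms across all"
          % (c, rps, mean_ms, across_ms))
```

The recomputed values differ slightly from ab's in the last digits because the report rounds the elapsed time to three decimal places.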
On my Linux box, it's due to the retransmission of a TCP packet from ab, although I'm not exactly sure why:
No. Time Source Destination Protocol Info Delta
10682 21.218156 127.0.0.1 127.0.0.1 TCP http-alt > 57246 [SYN, ACK] Seq=0 Ack=0 Win=32768 Len=0 MSS=16396 TSV=17307504 TSER=17306704 WS=6 21.218156
10683 21.218205 127.0.0.1 127.0.0.1 TCP 57246 > http-alt [ACK] Seq=82 Ack=1 Win=513 Len=0 TSV=17307504 TSER=17307504 SLE=0 SRE=1 0.000049
10701 29.306438 127.0.0.1 127.0.0.1 HTTP [TCP Retransmission] GET / HTTP/1.0 8.088233
10703 29.306536 127.0.0.1 127.0.0.1 TCP http-alt > 57246 [ACK] Seq=1 Ack=82 Win=512 Len=0 TSV=17309526 TSER=17309526 0.000098
10704 29.308555 127.0.0.1 127.0.0.1 TCP [TCP segment of a reassembled PDU] 0.002019
10705 29.308628 127.0.0.1 127.0.0.1 TCP 57246 > http-alt [ACK] Seq=82 Ack=107 Win=513 Len=0 TSV=17309526 TSER=17309526 0.000073
10707 29.309718 127.0.0.1 127.0.0.1 TCP [TCP segment of a reassembled PDU] 0.001090
10708 29.309754 127.0.0.1 127.0.0.1 TCP 57246 > http-alt [ACK] Seq=82 Ack=119 Win=513 Len=0 TSV=17309526 TSER=17309526 0.000036
10710 29.309992 127.0.0.1 127.0.0.1 HTTP HTTP/1.1 200 OK (text/plain) 0.000238
10711 29.310572 127.0.0.1 127.0.0.1 TCP 57246 > http-alt [FIN, ACK] Seq=82 Ack=120 Win=513 Len=0 TSV=17309527 TSER=17309526 0.000580
10712 29.310661 127.0.0.1 127.0.0.1 TCP http-alt > 57246 [ACK] Seq=120 Ack=83 Win=512 Len=0 TSV=17309527 TSER=17309527 0.000089
The original "GET" packet wasn't picked up by Wireshark either. For some reason, ab tries to send a request and fails, even though the TCP connection was double-ACK'd just fine. The client's TCP stack then waits for an ACK of a packet that was never sent (about 8 seconds in the capture above), and when it sees no ACK, it retransmits and succeeds.
Personally, I wouldn't worry about it. If there's a problem, it's not one with CherryPy. It could be related to the internals of ab, the use of HTTP/1.0 instead of 1.1, the lack of keepalive, the use of localhost instead of a real socket (which simulates some realities of network traffic and ignores others), the use of Windows (wink), other traffic on the same interface, load on the CPU... the list goes on and on.
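One way to check whether ab itself is involved is to repeat the test with a different client and look for individual requests that stall for seconds. The following is only a rough sketch, assuming Python 3 rather than the Python 2.6 used in the question; run_load, timed_get, and stalled are names invented here, and the 1-second stall threshold is arbitrary. Unlike ab, urllib sends HTTP/1.1 requests with "Connection: close", so it does not reproduce ab's traffic exactly.

```python
import queue
import threading
import time
import urllib.request

# Opener with an empty proxy map so localhost is always contacted directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def timed_get(url, timeout=30):
    """Fetch url once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with _opener.open(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def run_load(url, concurrency, total):
    """Issue `total` GETs from `concurrency` threads.

    Returns (latencies, failures): a list of per-request times in
    seconds and the number of requests that raised an error.
    """
    work = queue.Queue()
    for _ in range(total):
        work.put(None)
    latencies = []
    failures = [0]
    lock = threading.Lock()

    def worker():
        while True:
            try:
                work.get_nowait()
            except queue.Empty:
                return  # no requests left to send
            try:
                elapsed = timed_get(url)
            except OSError:  # URLError and socket errors derive from OSError
                with lock:
                    failures[0] += 1
            else:
                with lock:
                    latencies.append(elapsed)

    threads = [threading.Thread(target=worker) for _ in range(concurrency)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies, failures[0]


def stalled(latencies, threshold=1.0):
    """Return the latencies at or above `threshold` seconds."""
    return [t for t in latencies if t >= threshold]


# Example use against the server from the question:
#   latencies, failures = run_load("http://localhost/", 8, 1000)
#   print(len(latencies), failures, stalled(latencies))
```

If the multi-second stalls show up only with ab and not with a client like this, that would point toward ab or its HTTP/1.0 behavior rather than the server; if they show up with both, the TCP stack or the server's accept path is a more likely place to look.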