Socket read() hangs for a while when there is no data to read
Hi' I'm writing a simple http port forwarder. I read data from port 80, and pass the data to my lighttpd server, on port 8080.
As long as I write() data on the socket on port 8080 (forwarding the request) there's no problem, but when I read() data from that socket (forwarding the response), the last read() hangs a lot (about 1 or 2 seconds) before realizing there's no more data and returning 0.
I tried to set the socket to non-blocking, but this doesn't work, as sometimes it returns EWOULDBLOCKING even if there's some data left (lighttpd + cgi can be quite slow). I tr开发者_运维知识库ied to set a timeout with select(), but, as above, a slow cgi could timeout the socket when there's actually some data to transmit.
Update: SOLVED. It was the keepalive after all. After I disabled it in my lighttpd configuration file, the whole thing runs flawlessly.
Well, for the sake of completion, and as per my comment:
It is likely that the HTTP server itself (lighttpd in your case) is maintaining a persistent connection to your proxy because your proxy relayed a header containing “Connection: keep-alive
”. This header aids when the client wants to make multiple requests over the same connection. So, because lighttpd received this header, it assumed it was going to receive further requests and kept the socket open, causing read
to block in your proxy.
Disabling keep-alive in your lighttpd configuration is one way to fix it, but also you could also strip the “Connection: keep-alive
“ from the header before you relay it to your web server.
Using both non-blocking sockets and select
is the right way to go. Returning EWLOULDBLOCK doesn't mean that the entire stream of data is finished being received, it means that, instantaneously, there is nothing to read right now. That's exactly what you want, because it means that read
won't wait even half a second for more data to show up. If the data isn't immediately available it will return.
Now, obviously, this means you will need to call read
multiple times to get the complete data. The general format for doing this is a select loop. In pseudocode:
do
select ( my_sockets )
if ( select error )
handle_error
else
for each ( socket in my_sockets ) do
if ( socket is ready ) then
nonblocking read from socket
if ( no data was read ) then
close socket
remove socket from my_sockets
endif
endif
loop
endif
loop
The idea is that select
will tell you which sockets have data available for reading right now. If you read one of those sockets, you are guaranteed either to get data or to get a return value of 0, indicating that the remote end closed the socket.
If you use this method, you will never be stuck in a read
call that is not reading data, for any length of time. The blocking operation is the select
call, and you can also select over writeable sockets if you need to write, and set a timeout if you need to do things periodically.
Don't do that!
Keepalives boost performance from other clients. Instead, fix your client. Send a Connection: close
header in your client and make sure your request doesn't claim HTTP/1.1
compliance. (If for no other reason than that you probably don't handle chunked encoding either.)
I guess that I would use non-blocking I/O to full extend. Instead of setting timeouts I'd rather wait for event's:
while(select(...)) {
switch(...) {
case ...: // Handle accepting new connection
case ...: // Handle reading from socket
...
}
}
Sinle-thread, blocking forwarder will cause problems anyway with multiple clients.
Sorry - I don't remember exact calls. Also it can be strange in some cases (IIRC - you need to handle write), but there are libraries which simplify the task.
精彩评论