开发者

Socket read() hangs for a while when there is no data to read

Hi' I'm writing a simple http port forwarder. I read data from port 80, and pass the data to my lighttpd server, on port 8080.

As long as I write() data on the socket on port 8080 (forwarding the request) there's no problem, but when I read() data from that socket (forwarding the response), the last read() hangs a lot (about 1 or 2 seconds) before realizing there's no more data and returning 0.

I tried to set the socket to non-blocking, but this doesn't work, as sometimes it returns EWOULDBLOCKING even if there's some data left (lighttpd + cgi can be quite slow). I tr开发者_运维知识库ied to set a timeout with select(), but, as above, a slow cgi could timeout the socket when there's actually some data to transmit.


Update: SOLVED. It was the keepalive after all. After I disabled it in my lighttpd configuration file, the whole thing runs flawlessly.


Well, for the sake of completion, and as per my comment:

It is likely that the HTTP server itself (lighttpd in your case) is maintaining a persistent connection to your proxy because your proxy relayed a header containing “Connection: keep-alive”. This header aids when the client wants to make multiple requests over the same connection. So, because lighttpd received this header, it assumed it was going to receive further requests and kept the socket open, causing read to block in your proxy.

Disabling keep-alive in your lighttpd configuration is one way to fix it, but also you could also strip the “Connection: keep-alive“ from the header before you relay it to your web server.


Using both non-blocking sockets and select is the right way to go. Returning EWLOULDBLOCK doesn't mean that the entire stream of data is finished being received, it means that, instantaneously, there is nothing to read right now. That's exactly what you want, because it means that read won't wait even half a second for more data to show up. If the data isn't immediately available it will return.

Now, obviously, this means you will need to call read multiple times to get the complete data. The general format for doing this is a select loop. In pseudocode:

do
  select ( my_sockets )

  if ( select error ) 
    handle_error
  else
    for each ( socket in my_sockets ) do
      if ( socket is ready ) then
        nonblocking read from socket
        if ( no data was read ) then
          close socket
          remove socket from my_sockets
        endif
      endif
    loop
  endif
loop

The idea is that select will tell you which sockets have data available for reading right now. If you read one of those sockets, you are guaranteed either to get data or to get a return value of 0, indicating that the remote end closed the socket.

If you use this method, you will never be stuck in a read call that is not reading data, for any length of time. The blocking operation is the select call, and you can also select over writeable sockets if you need to write, and set a timeout if you need to do things periodically.


Don't do that!

Keepalives boost performance from other clients. Instead, fix your client. Send a Connection: close header in your client and make sure your request doesn't claim HTTP/1.1 compliance. (If for no other reason than that you probably don't handle chunked encoding either.)


I guess that I would use non-blocking I/O to full extend. Instead of setting timeouts I'd rather wait for event's:

while(select(...)) {
    switch(...) {
    case ...: // Handle accepting new connection
    case ...: // Handle reading from socket
    ...
    }
}

Sinle-thread, blocking forwarder will cause problems anyway with multiple clients.

Sorry - I don't remember exact calls. Also it can be strange in some cases (IIRC - you need to handle write), but there are libraries which simplify the task.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜