How to determine if an HTTP response is complete
I am building a simple proxy that logs certain requests passing through it. The proxy does not need to interfere with the traffic (at this point in the project), so I am trying to do as little parsing of the raw request/response as possible during the process (the request and response are pushed onto a queue to be logged outside of the proxy).
My sample works fine, except that I cannot reliably tell when the "response" is complete, so connections are left open for longer than needed. The relevant code is below:
    var request = getRequest(url);
    byte[] buffer;
    int bytesRead = 1;
    var dataSent = false;
    var timeoutTicks = DateTime.Now.AddMinutes(1).Ticks;
    Console.WriteLine(" Sending data to address: {0}", url);
    Console.WriteLine(" Waiting for response from host...");
    using (var outboundStream = request.GetStream()) {
        // Relay until the remote host disconnects or the one-minute timeout expires
        while (request.Connected && (DateTime.Now.Ticks < timeoutTicks)) {
            while (outboundStream.DataAvailable) {
                dataSent = true;
                buffer = new byte[OUTPUT_BUFFER_SIZE];
                bytesRead = outboundStream.Read(buffer, 0, OUTPUT_BUFFER_SIZE);
                if (bytesRead > 0) { _clientSocket.Send(buffer, bytesRead, SocketFlags.None); }
                // Log the number of bytes actually relayed in this pass
                Console.WriteLine(" pushed {0} bytes to requesting host...", bytesRead);
            }
            if (request.Connected) { Thread.Sleep(0); }
        }
    }
    Console.WriteLine(" Finished with response from host...");
    Console.WriteLine(" Disconnecting socket");
    _clientSocket.Shutdown(SocketShutdown.Both);
My question is whether there is an easy way to tell that the response is complete without parsing headers. Given that this response could be anything (encoded, encrypted, gzipped, etc.), I don't want to have to decode the actual response to get the length and determine whether I can disconnect my socket.
As David pointed out, connections should remain open for a period of time. You should not close connections unless the client side does so (or the keep-alive interval expires).
Changing to HTTP/1.0 will not work, since you are a server and it is the client that specifies HTTP/1.1 in the request. Sure, you can send an error message with HTTP/1.0 as the version and hope that the client switches to 1.0, but that seems inefficient.
HTTP messages look like this:
REQUEST LINE
HEADERS
(empty line)
BODY
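Apart from waiting for the connection to close, the only way to know when a response is done is to look at its headers. Search for "Content-Length:" in the buffered headers and extract everything up to the line feed (trim the value before converting it to an int). Note that a response sent with "Transfer-Encoding: chunked" normally has no Content-Length; it ends with a zero-length chunk instead.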
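The Content-Length search described above can be sketched roughly as follows. This is a minimal sketch, not a full HTTP parser: the class and method names (HttpFraming, FindBodyStart, TryGetContentLength, IsComplete) are made up for illustration, and it assumes the whole message so far sits in one byte buffer. It deliberately ignores chunked responses and responses that have no body.

```csharp
using System;
using System.Text;

// Hypothetical helper; none of these names come from the original post.
static class HttpFraming
{
    // Returns the index of the first body byte, or -1 if the empty line
    // that ends the headers has not arrived yet.
    public static int FindBodyStart(byte[] data, int count)
    {
        for (int i = 0; i + 3 < count; i++)
        {
            if (data[i] == '\r' && data[i + 1] == '\n' &&
                data[i + 2] == '\r' && data[i + 3] == '\n')
            {
                return i + 4;
            }
        }
        return -1;
    }

    // Looks for a Content-Length header in the header block and parses its value.
    public static bool TryGetContentLength(byte[] data, int bodyStart, out long length)
    {
        length = 0;
        // Only the header block is decoded; the body is never touched.
        string headers = Encoding.ASCII.GetString(data, 0, bodyStart);
        const string name = "Content-Length:";
        foreach (string line in headers.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries))
        {
            if (line.StartsWith(name, StringComparison.OrdinalIgnoreCase))
            {
                // Trim the value before converting it, as suggested above.
                return long.TryParse(line.Substring(name.Length).Trim(), out length);
            }
        }
        return false;
    }

    // True once the headers plus Content-Length bytes of body have been received.
    // Assumption: the message is not chunked and does declare a Content-Length.
    public static bool IsComplete(byte[] data, int count)
    {
        int bodyStart = FindBodyStart(data, count);
        if (bodyStart < 0) return false;
        long length;
        if (!TryGetContentLength(data, bodyStart, out length)) return false;
        return count - bodyStart >= length;
    }
}
```

Because only the header block is decoded, the body can stay gzipped or otherwise encoded; the proxy just counts bytes after the empty line.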
The other alternative is to use the parser in my webserver to get all headers. It should be quite easy to use just the parser and nothing more from the library.
Update: There is a better parser here: HttpParser.cs
If you make an HTTP/1.0 request instead of 1.1, the server should close the connection as soon as it has finished sending, since it doesn't need to keep the connection open for another request.
Other than that, you really need to parse the content length header in the response to get the best value.
Using blocking IO and multiple threads might be your answer. Specifically:
    string data;
    using (var response = request.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var reader = new StreamReader(stream))
    {
        // ReadToEnd blocks until the end of the response stream is reached
        data = reader.ReadToEnd();
    }
This is for textual data; binary handling is similar.
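For the binary case, one possible sketch is to copy the stream into a MemoryStream instead of going through a StreamReader, so no text decoding is applied. The ResponseReader class and ReadAllBytes method are hypothetical names used only for illustration; the stream passed in is assumed to be whatever GetResponseStream() returned.

```csharp
using System.IO;

// Hypothetical helper; the name is not from the original post.
static class ResponseReader
{
    // Copies bytes until the source stream reports end-of-stream,
    // without interpreting them as text.
    public static byte[] ReadAllBytes(Stream stream)
    {
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer); // blocks until the source is exhausted
            return buffer.ToArray();
        }
    }
}
```

As with ReadToEnd, the call only returns once the stream ends, so it fits the blocking, thread-per-connection approach rather than a polling loop.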