Remove HTTP headers from a raw response
Let's say we make a request to a URL and get back the raw response, like this:
HTTP/1.1 200 OK
Date: Wed, 28 Apr 2010 14:39:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=e2bca72563dfffcc:TM=1272465553:LM=1272465553:S=ZN2zv8oxlFPT1BJG; expires=Fri, 27-Apr-2012 14:39:13 GMT; path=/; domain=.google.co.uk
Server: gws
X-XSS-Protection: 1; mode=block
Connection: close

<!doctype html><html><head>...</head><body>...</body></html>
What would be the best way to remove the HTTP headers from the response in C#? With regexes? Parsing it into some kind of HTTPResponse object and using only the body?
EDIT:
I'm using SOCKS to make the request; that's why I get the raw response.
Headers and body are separated by an empty line. It is really easier to do this without a regex: just search for the first empty line (in a raw HTTP response, the sequence "\r\n\r\n") and take everything after it.
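As a rough sketch of that idea, the helper below splits a raw response at the first empty line and also collects the header lines into a dictionary, in case you want something closer to a response object than a bare string. The names RawHttpSplitter and TrySplit are made up for this illustration and do not come from any library. It assumes the whole response has already been decoded into a string and that lines end with "\r\n"; it does not handle chunked transfer encoding or binary bodies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of any framework.
public static class RawHttpSplitter
{
    // Splits a raw HTTP response at the first empty line (CRLF CRLF).
    // Returns false when no empty line is present, i.e. the header
    // block is incomplete or the input is not an HTTP response.
    public static bool TrySplit(string raw, out string statusLine,
        out Dictionary<string, string> headers, out string body)
    {
        statusLine = string.Empty;
        body = string.Empty;
        // Header names are case-insensitive, so compare them that way.
        headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        const string delimiter = "\r\n\r\n";
        // Ordinal comparison searches for the exact characters.
        int headerEnd = raw.IndexOf(delimiter, StringComparison.Ordinal);
        if (headerEnd < 0)
            return false;

        string head = raw.Substring(0, headerEnd);
        // Skip the delimiter itself so the body has no leading CR/LF.
        body = raw.Substring(headerEnd + delimiter.Length);

        string[] lines = head.Split(new[] { "\r\n" }, StringSplitOptions.None);
        statusLine = lines[0];
        for (int i = 1; i < lines.Length; i++)
        {
            int colon = lines[i].IndexOf(':');
            if (colon <= 0)
                continue; // skip lines that are not "Name: value"

            string name = lines[i].Substring(0, colon).Trim();
            string value = lines[i].Substring(colon + 1).Trim();
            headers[name] = value; // a repeated header overwrites the earlier value
        }
        return true;
    }
}
```

Called on the response from the question, TrySplit would put "HTTP/1.1 200 OK" in statusLine, make headers["Server"] equal to "gws", and leave only the HTML document in body. If you only need the body, the dictionary part can be dropped and the two Substring calls are enough.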
If you use the HttpWebRequest class, you get back an HttpWebResponse object, which exposes the headers as a separate collection (its Headers property). You can then remove them, parse them, or do whatever you wish with them.
Note that taking a Substring that starts at the empty line itself will leave you with a leading carriage return in the body, so skip past the whole delimiter. I used this:
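// Headers end at the first empty line (CRLF CRLF).
const string HTTPHeaderDelimiter = "\r\n\r\n";
int headerEnd = RawHTTPResponse.IndexOf(HTTPHeaderDelimiter, StringComparison.Ordinal);
// Accept only a 200 response whose header block is complete.
if (RawHTTPResponse.StartsWith("HTTP/1.1 200 OK", StringComparison.Ordinal) && headerEnd > -1)
{
    // Skip the delimiter so the payload has no leading CR/LF.
    HTTPPayload = RawHTTPResponse.Substring(headerEnd + HTTPHeaderDelimiter.Length);
}
else
{
    return;
}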