Help With .NET CookieContainer
I've recently run into some problems with the CookieContainer. Either I'm doing something seriously wrong or there is some kind of bug in the CookieContainer object. It doesn't seem to update the cookie collection with certain Set-Cookie headers.
This might be a lengthy post, and I apologize, but I want to be as thorough as possible, so I'm going to list my HTTP sniffing logs as well as my actual implementation code.
public bool SendRequest(HttpWebRequest request, IDictionary<string, string> data, int retries)
{
    // copy request in case request instance already failed
    HttpWebRequest newRequest = (HttpWebRequest)HttpWebRequest.Create(request.RequestUri);
    newRequest.Method = request.Method;
    // if POST data was provided, write it to the stream
    if (data != null && data.Count != 0)
    {
        StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
        writer.Write(createPostString(data));
        writer.Close();
    }
    // set request with global cookie container
    newRequest.CookieContainer = this.cookieJar;
    try
    {
        using (HttpWebResponse resp = (HttpWebResponse)newRequest.GetResponse())
        {
            //CookieCollection newCooks = getCookies(resp.Headers);
            //updateCookies(newCooks);
            this.cookieJar = newRequest.CookieContainer;
            this.Html = getResponseString(resp);
            /* remainder snipped */
So there is the code; here are the two requests and responses I sniffed in Fiddler:
Request 1
POST /login/ HTTP/1.1
Host: www.site.com
Content-Length: 47
Expect: 100-continue
Connection: Keep-Alive
Response 1
HTTP/1.1 200 OK
Date: Wed, 02 Dec 2009 17:03:35 GMT
Server: Apache
Set-Cookie: tcc=one; path=/
Set-Cookie: cust_id=2702585226; domain=.site.com; path=/; expires=Mon, 01-Jan-2011 00:00:00 GMT
Set-Cookie: cust_session=12%2F2%2F2009%20%2012%3A3%3A35; domain=.site.com; path=/; expires=Wed 2-Dec-2009 17:33:35
Set-Cookie: refer_id_persistent=0000; domain=.site.com; path=/; expires=Fri 2-Dec-2011 17:3:35
Set-Cookie: refer_id=0000; domain=.site.com; path=/
Set-Cookie: private_browsing_mode=off; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Set-Cookie: member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Request 2
GET /test?search=jjkjf HTTP/1.1
Host: www.site.com
Cookie: tcc=one; cust_id=2702585226; private_browsing_mode=off; member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D
So as you can see, the CookieContainer (this.cookieJar), which is used for every request, is not picking up the Set-Cookie headers for refer_id, cust_session, and refer_id_persistent. However, it does pick up cust_id, private_browsing_mode, tcc, and member_session. Any ideas why this might be?
Just wanted to update this post in case someone else comes across this. The issue is that .NET complies with the RFC specification for cookies, but not all sites do. So, ultimately, the problem is not Microsoft, or .NET for that matter (although IE manages these cookies fine, so it would be better if the .NET cookie parsing used the same parsing methods). The problem is the sites that do not follow the RFC specification.
Nonetheless, an issue I've often encountered is that sites use commas in the expiration dates of their cookies. .NET interprets these commas as separators between cookies and strips the ending, and everything thereafter, off the cookie.
RFC spec: "Cookie:, followed by a comma-separated list of one or more cookies." An easy solution to this problem would be for the web server to enclose values with commas in quotation marks, per the RFC document. However, there is no RFC police, so we can only hope that people follow the rules.
MSDN SetCookies: SetCookies pulls all the HTTP cookies out of the HTTP cookie header, builds a Cookie for each one, and then adds each Cookie to the internal CookieCollection that is associated with the URI. The HTTP cookies in the cookieHeader string must be delimited by commas.
MSDN GetCookieHeader: GetCookieHeader returns a string that holds the HTTP cookie header for the Cookie instances specified by uri. The HTTP header is built by adding a string representation of each Cookie associated with uri. Note that the exact format of the string depends on the RFC that the Cookie conforms to. The strings for all the Cookie instances that are associated with uri are combined and delimited by semicolons.
This string is not in the correct format for use as the second parameter of the SetCookies method.
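To make the delimiter difference in the two MSDN excerpts concrete, here is a minimal sketch. The class name DelimiterDemo, the method name, and the two short cookies are made up for illustration; only CookieContainer.SetCookies, GetCookies, and Count are real API. It shows the comma-delimited form that SetCookies expects, with semicolons separating the attributes inside each cookie.

```csharp
using System;
using System.Net;

public static class DelimiterDemo
{
    // Hypothetical helper: loads two cookies from ONE header string and
    // returns the container so the caller can inspect what was stored.
    public static CookieContainer LoadTwoCookies(Uri uri)
    {
        var jar = new CookieContainer();

        // SetCookies expects commas BETWEEN cookies; semicolons separate
        // the attributes that belong to a single cookie.
        jar.SetCookies(uri, "tcc=one; path=/, refer_id=0000; path=/");

        return jar;
    }
}
```

Calling jar.GetCookieHeader(uri) on the result gives back a semicolon-delimited string, which is why the output of GetCookieHeader cannot simply be fed back into SetCookies.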
This is only a quick scan through your code, but it seems that you are sending the POST data before you attach the cookies to the request.
if (data != null && data.Count != 0)
{
    StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
    writer.Write(createPostString(data));
    writer.Close();
}
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
What might be happening is that when you write your POST data to the stream, it is sent to the remote server. However, for cookies to be set they must be sent to the server before any POST data. The simple solution is to swap the two steps around like so:
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
if (data != null && data.Count != 0)
{
    StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
    writer.Write(createPostString(data));
    writer.Close();
}
CookieContainer has two major issues that I have come across; whether they are by design or bugs, I don't know.
1) Cookies set on a 302 post are not picked up. Example:
Post to site
302 redirect response
Load new page which sets cookie
Solution: set AllowAutoRedirect to false, follow the redirects manually, and set the cookies yourself.
2) .NET is VERY fussy about incorrectly formed cookie strings that have a comma in the string. This is actually correct behaviour, but occasionally cookies have a date set that includes a comma, which stops all cookies from being set.
Solution: manually parse the cookie strings and add the cookies yourself. A horrible task. I have a sprawling mess of a hack function that loops and ifs, but the end result is that it works for all cases I have thrown at it so far. It isn't pretty, but it gets the job done.
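For the second issue, a manual parser might look roughly like the sketch below. This is not the hack function mentioned above; LenientCookieParser and its Parse method are hypothetical names, and the sketch assumes you already have each Set-Cookie header value as its own string (how you obtain them is left out). Because it handles one header value at a time and splits only on semicolons, a comma inside a date can never be mistaken for a cookie separator. An expires value that cannot be parsed is simply ignored, which turns the cookie into a session cookie; that is a design choice of this sketch, not documented .NET behaviour.

```csharp
using System;
using System.Globalization;
using System.Net;

public static class LenientCookieParser
{
    // Hypothetical helper: parses ONE Set-Cookie header value into a Cookie.
    // defaultHost is used as the domain when the header has no domain attribute.
    public static Cookie Parse(string setCookieValue, string defaultHost)
    {
        string[] parts = setCookieValue.Split(';');

        // The first segment must be name=value; split on the first '=' only.
        string[] nameValue = parts[0].Split(new[] { '=' }, 2);
        if (nameValue.Length != 2 || nameValue[0].Trim().Length == 0)
            throw new FormatException("Expected name=value at the start of the header.");

        var cookie = new Cookie(nameValue[0].Trim(), nameValue[1].Trim(), "/", defaultHost);
        DateTime when;

        // Remaining segments are attributes such as domain, path, expires.
        for (int i = 1; i < parts.Length; i++)
        {
            string[] attr = parts[i].Split(new[] { '=' }, 2);
            string key = attr[0].Trim().ToLowerInvariant();
            string val = attr.Length == 2 ? attr[1].Trim() : string.Empty;

            switch (key)
            {
                case "domain": cookie.Domain = val; break;
                case "path": cookie.Path = val; break;
                case "secure": cookie.Secure = true; break;
                case "httponly": cookie.HttpOnly = true; break;
                case "expires":
                    // Best effort only: an unparseable date is skipped.
                    if (DateTime.TryParse(val, CultureInfo.InvariantCulture,
                            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
                            out when))
                        cookie.Expires = when;
                    break;
            }
        }
        return cookie;
    }
}
```

The resulting Cookie can then be added to the shared container with CookieContainer.Add(Uri, Cookie), using the URI of the request that produced the header.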
I'm not sure either of the above is your issue, but it might be. If not, it's some food for thought anyway.
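For the first issue (the 302 case), following redirects by hand could be sketched as below. RedirectFollower, ResolveLocation, and GetFollowingRedirects are hypothetical names, the sketch only issues GET requests, and the hop limit is an arbitrary safety net. With AllowAutoRedirect set to false, each intermediate response is handed back to your code, and because the same CookieContainer is attached to every hop it gets a chance to record the cookies from each one.

```csharp
using System;
using System.Net;

public static class RedirectFollower
{
    // Resolves a Location header (absolute or relative) against the URI
    // of the request that returned it.
    public static Uri ResolveLocation(Uri current, string location)
    {
        return new Uri(current, location);
    }

    // Hypothetical helper: issues GET requests and follows 3xx responses
    // manually, sharing one cookie container across all hops.
    public static HttpWebResponse GetFollowingRedirects(Uri start, CookieContainer jar, int maxHops)
    {
        Uri current = start;
        for (int hop = 0; hop <= maxHops; hop++)
        {
            var request = (HttpWebRequest)WebRequest.Create(current);
            request.CookieContainer = jar;
            request.AllowAutoRedirect = false; // hand 3xx responses back to us

            var response = (HttpWebResponse)request.GetResponse();
            int status = (int)response.StatusCode;
            string location = response.Headers["Location"];

            // Anything that is not a redirect with a target is the final answer.
            if (status < 300 || status >= 400 || string.IsNullOrEmpty(location))
                return response;

            current = ResolveLocation(current, location);
            response.Close();
        }
        throw new WebException("Too many redirects.");
    }
}
```

The caller is responsible for disposing the returned response. A first hop that is a POST would need its own request setup before handing over to a loop like this.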
My solution: replace " UTC" with " GMT" in the cookie header before handing it to the CookieContainer.
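As a minimal sketch of that replacement, assuming you have the raw header as a string before it reaches the container (CookieHeaderFixups and NormalizeExpiresZone are hypothetical names):

```csharp
using System;

public static class CookieHeaderFixups
{
    // Hypothetical helper: rewrites a trailing " UTC" zone marker in a
    // Set-Cookie header string to " GMT" before the header is parsed.
    public static string NormalizeExpiresZone(string setCookieHeader)
    {
        return setCookieHeader.Replace(" UTC", " GMT");
    }
}
```

The result can then be passed to CookieContainer.SetCookies in place of the original header string. Note that a plain Replace also touches any cookie value that happens to contain " UTC", so a stricter version would restrict the change to the expires attribute.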
Try using CookieContainer.GetCookieHeader and CookieContainer.SetCookies:
YourCookieContainer.GetCookieHeader(new Uri("your url"));
YourCookieContainer.SetCookies(new Uri("your url"), "string from GetCookieHeader");