Regex - match HTTP Via tag

I'm having trouble parsing the HTTP "Via" header from a client's browser request. This is an example of the HTTP headers I received:

GET / HTTP/1.0
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, */*
Accept-Language: sr-Latn-RS
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; MATM; AskTbGOM2/5.8.0.12304)
Accept-Encoding: gzip, deflate
Host: 10.0.1.7
Via: 1.1 smtp.local:3128 (squid/2.6.STABLE21)
X-Forwarded-For: 10.0.0.75
Cache-Control: max-age=259200
Connection: keep-alive

Now, I need to get the smtp.local:3128 part from this header, but the regex I wrote does not work.

Example pattern, written in C# (doesn't work):

string matchHttpVia = @"Via: 1.1 (\.+:\d+)";

Note that there could also be an IP instead of a hostname.


To parse Via: x.x host:port you can use the regex:

Via: \d+\.\d+ (.*:\d+) (\(.*\))?

Actually, the trailing comment group is not needed to capture the host and port, so this should be sufficient:

Via: \d+\.\d+ (.*:\d+)

That should do the trick for any 'version', host and port, as long as the header contains a single entry with an explicit port.
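
For illustration, here is a minimal C# sketch of how the shorter pattern could be applied to the raw header text. The class and method names (ViaParser, ExtractHostPort) are made up for this example, and it assumes the headers are available as one string with one header per line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ViaParser
{
    // Hypothetical helper: returns the "host:port" part of the first Via line,
    // or null when no Via line with a port is found.
    public static string ExtractHostPort(string rawHeaders)
    {
        // RegexOptions.Multiline lets ^ match at the start of every line,
        // so the pattern only fires on a line that begins with "Via:".
        Match m = Regex.Match(
            rawHeaders,
            @"^Via: \d+\.\d+ (.*:\d+)",
            RegexOptions.Multiline);

        // Group 1 is the capture in parentheses: host (or IP), colon, port.
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Calling ViaParser.ExtractHostPort on the headers from the question should yield the smtp.local:3128 part. Note that .* is greedy, so if the trailing comment ever contained a colon followed by digits, the capture would extend into it.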


As Konerak commented, removing the backslash before the dot, giving Via: 1.1 (.*:\d+), should fix your problem. \. matches only a literal dot character, whereas . matches any character, so your original \.+ could only match a run of dots and never a hostname or IP.

Note though, that this will only work if "1.1" is the only thing that can appear between the "Via:" and the hostname/IP. I don't know enough about HTTP headers to know if that's a safe assumption, but it seems like it might not be.
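
On that point: as far as I can tell, the HTTP specification's grammar for Via does allow more than a bare "1.1" there. The version may be prefixed with a protocol name (for example HTTP/1.1), the port is optional, and several proxies may be listed separated by commas. A slightly more tolerant sketch in C# might look like the following. It is an assumption-laden illustration, not a full parser: it only looks at the first entry, it still requires an explicit port, and the class and method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TolerantViaParser
{
    // Optional protocol name ("HTTP/"), any major.minor version, then a
    // whitespace-free host:port. \S+ stops at the first space, so a colon
    // inside the trailing comment cannot be captured by accident.
    private static readonly Regex ViaPattern = new Regex(
        @"^Via:\s*(?:[A-Za-z]+/)?\d+\.\d+\s+(\S+:\d+)",
        RegexOptions.Multiline | RegexOptions.IgnoreCase);

    // Hypothetical helper: returns "host:port" from the first Via entry,
    // or null when there is no Via line or the first entry has no port.
    public static string ExtractHostPort(string rawHeaders)
    {
        Match m = ViaPattern.Match(rawHeaders);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

RegexOptions.IgnoreCase is there because header field names are case-insensitive. If you need every proxy in a multi-entry Via header, splitting the header value on commas before matching each entry is probably simpler than one large pattern.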
