
Regex in C# to extract page.com/users/(this)/xxxxx/

I've been trying to solve this for the last two hours, but it just doesn't work :(

I have downloaded the HTML code of a web page and then removed all double whitespace and all newlines, so the whole code is a one-line string.

And then I have to extract one piece of data from it:

page.com/users/(this)/xxxxx/.....

Match match = Regex.Match(htmlCode, "page.com/users/(.*)/xxxxx/");
string user = match.Groups[1].ToString();

But it doesn't work: I always get (this)/xxxxx/ plus the rest of the HTML code.

Does anyone know why this doesn't work?


Instead of the greedy (.*), use ([^/]*).


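Your .* is greedy, so it matches as much as it can: it runs to the end of the string and backtracks only as far as the last /xxxxx/ it can find. The capture therefore includes the first /xxxxx/ and everything after it, up to that last occurrence.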
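To see the difference side by side, here is a minimal sketch. The sample string, the class name GreedyDemo and the helper Capture are made up for illustration (the real htmlCode isn't shown in the question); only Regex.Match and Match.Groups are actual .NET API.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Hypothetical one-line input containing /xxxxx/ twice.
    public const string Html =
        "<a href=\"page.com/users/alice/xxxxx/\">a</a> <a href=\"page.com/other/xxxxx/\">b</a>";

    // Returns capture group 1 for the given pattern, or "" when there is no match.
    public static string Capture(string pattern)
    {
        Match m = Regex.Match(Html, pattern);
        return m.Success ? m.Groups[1].Value : "";
    }

    public static void Main()
    {
        // Greedy: the capture runs on to the LAST /xxxxx/ in the string.
        // prints: alice/xxxxx/">a</a> <a href="page.com/other
        Console.WriteLine(Capture("page.com/users/(.*)/xxxxx/"));

        // Negated character class: the capture cannot cross a slash.
        // prints: alice
        Console.WriteLine(Capture("page.com/users/([^/]*)/xxxxx/"));

        // Lazy quantifier: stops at the first /xxxxx/ it can reach.
        // prints: alice
        Console.WriteLine(Capture("page.com/users/(.*?)/xxxxx/"));
    }
}
```

The lazy (.*?) form also gives the expected result here, but unlike [^/]* it is still allowed to cross slashes if that is the only way to reach /xxxxx/, so the negated class is the stricter choice for a single path segment.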


Replace .* with something more specific, such as [^/]+, which means there has to be at least one character there and it can be anything but a /.


Try:

Match match = Regex.Match(htmlCode, "page.com/users/([^/]*)/xxxxx/");
string user = match.Groups[1].ToString();
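A slightly more defensive variant might look like the sketch below. It is not from the thread: the class UserExtractor, the method ExtractUser and the sample inputs are hypothetical names chosen for illustration. It assumes you also want the dot in page.com to match only a literal dot, that the user segment must be non-empty ([^/]+), and that a failed match should be detectable through Match.Success instead of silently yielding an empty string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UserExtractor
{
    // Hypothetical helper: returns the user segment, or null when the pattern is not found.
    public static string ExtractUser(string htmlCode)
    {
        // Verbatim string (@"...") so that \. reaches the regex engine as an escaped dot.
        // [^/]+ requires at least one character and never crosses a slash.
        Match match = Regex.Match(htmlCode, @"page\.com/users/([^/]+)/xxxxx/");
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Made-up sample inputs.
        Console.WriteLine(ExtractUser("<a href=\"page.com/users/bob/xxxxx/index\">x</a>")); // prints: bob
        Console.WriteLine(ExtractUser("no link here") == null);                             // prints: True
    }
}
```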


Try page.com/users/([^/]*)/xxxxx/
