C# regex - matching optionals after a named group

I'm sure this has been asked numerous times, but although I've checked all the similar questions, I couldn't come up with a solution.

The problem is that I have input URLs similar to:

  1. http://www.justin.tv/peacefuljay
  2. http://www.justin.tv/peacefuljay#/w/778713616/3
  3. http://de.justin.tv/peacefuljay#/w/778713616/3

I want to match the slug part of each URL (in the examples above, it's peacefuljay).

The regexes I've tried so far are:

 http://.*\.justin\.tv/(?<Slug>.*)(?:#.)?
 http://.*\.justin\.tv/(?<Slug>.*)(?:#.)

But I can't come up with a solution. Each one fails either on the first URL or on the others.

Help appreciated.


The easiest way of parsing a Uri is by using the Uri class:

string justin = "http://www.justin.tv/peacefuljay#/w/778713616/3";
Uri uri = new Uri(justin);
string s1 = uri.LocalPath; // "/peacefuljay"
string s2 = uri.Segments[1]; // "peacefuljay"
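
As a minimal sketch, the same approach can be applied to all three URLs from the question. The SlugOf helper and the SlugDemo class are names made up for illustration; the length check and TrimEnd are assumptions added to cope with URLs that have no path or that end in a slash.

```csharp
using System;

public static class SlugDemo
{
    // Hypothetical helper: returns the first path segment, or "" if the URL has no path.
    public static string SlugOf(string url)
    {
        var uri = new Uri(url);
        // Segments[0] is "/"; the fragment (#...) is not part of the path segments.
        return uri.Segments.Length > 1 ? uri.Segments[1].TrimEnd('/') : string.Empty;
    }

    public static void Main()
    {
        string[] urls =
        {
            "http://www.justin.tv/peacefuljay",
            "http://www.justin.tv/peacefuljay#/w/778713616/3",
            "http://de.justin.tv/peacefuljay#/w/778713616/3",
        };

        foreach (string url in urls)
            Console.WriteLine(SlugOf(url)); // prints "peacefuljay" for each URL
    }
}
```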

If you insist on a regex, you can try something a bit more specific:

Match mate = Regex.Match(str, @"http://(\w+\.)*justin\.tv(?:/(?<Slug>[^#]*))?");
  • (\w+\.)* - Ensures you match the domain and not some other part of the string (e.g., the hash or query string).
  • (?:/(?<Slug>[^#]*))? - Optional group containing the string you need. [^#] limits the characters allowed in the slug, so it should eliminate the need for the extra group after it.
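
A rough sketch of how this pattern might be wrapped and checked against the URLs from the question. The RegexSlugDemo class and SlugOf method are hypothetical names, and returning an empty string when the optional group does not participate is an assumption, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSlugDemo
{
    // Pattern copied from the answer above.
    private static readonly Regex Pattern =
        new Regex(@"http://(\w+\.)*justin\.tv(?:/(?<Slug>[^#]*))?");

    // Hypothetical helper: returns the captured slug, or "" if nothing was captured.
    public static string SlugOf(string url)
    {
        Match m = Pattern.Match(url);
        // Groups["Slug"].Success is false when the optional group was skipped.
        return m.Success && m.Groups["Slug"].Success
            ? m.Groups["Slug"].Value
            : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(SlugOf("http://www.justin.tv/peacefuljay"));               // peacefuljay
        Console.WriteLine(SlugOf("http://de.justin.tv/peacefuljay#/w/778713616/3")); // peacefuljay
    }
}
```

Note that [^#]* stops only at a hash, so a URL with further path segments, such as http://www.justin.tv/peacefuljay/videos, would yield peacefuljay/videos.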


As I see it, there's no reason to deal with the parts after the "slug".

Therefore you only need to match all characters after the host that aren't "/" or "#".

http://.*\.justin\.tv/(?<Slug>[^/#]+)


http://.*\.justin\.tv/(?<Slug>[^#]*)

or

http://.*\.justin\.tv/(?<Slug>.*?)(#|$)
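
For context, the question's first attempt fails because the .* inside the Slug group is greedy: it runs to the end of the input, and the optional trailing group then matches nothing. The second attempt makes the trailing group mandatory, so it cannot match a URL without a hash. The sketch below compares the greedy capture with a lazy one; the class, constant, and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // First attempt from the question: the greedy .* in Slug swallows the fragment too.
    public const string Greedy = @"http://.*\.justin\.tv/(?<Slug>.*)(?:#.)?";

    // Lazy variant: .*? grows only until the first '#' or the end of the input.
    public const string Lazy = @"http://.*\.justin\.tv/(?<Slug>.*?)(?:#|$)";

    // Hypothetical helper: returns the Slug group, or "" when the pattern does not match.
    public static string Slug(string pattern, string url) =>
        Regex.Match(url, pattern).Groups["Slug"].Value;

    public static void Main()
    {
        const string url = "http://www.justin.tv/peacefuljay#/w/778713616/3";
        Console.WriteLine(Slug(Greedy, url)); // peacefuljay#/w/778713616/3
        Console.WriteLine(Slug(Lazy, url));   // peacefuljay
    }
}
```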