C# regex to extract link after =

Couldn't find a better title, but I need a regex to extract the link from the sample below.

snip...  flashvars.image_url = 'http://domain.com/test.jpg' ..snip

Assuming regex is the best way.

Thanks


Consider the following sample code. It shows how you might extract the link from the string you supplied, though I have expanded the test strings somewhat. In general, .* is too inclusive: it is greedy, so it runs to the last quote on the line rather than the next one (as the second test string below demonstrates).

The main point is that there are several ways to do what you are asking: the first answer given uses "look-around", while the second suggests the "Groups" approach. The choice mainly depends on your actual data.

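        // Needs: using System; using System.Text.RegularExpressions;
        string[] tests = {
                @"snip...  flashvars.image_url = 'http://domain.com/test.jpg' ..snip",
                @"snip...  flashvars.image_url = 'http://domain.com/test.jpg' flashvars2.image_url = 'http://someother.domain.com/test.jpg'",
        };
        string[] patterns = {
                @"(?<==\s*')[^']*(?=')",   // look-around: the match itself is the link
                @"=\s*'(.*)'",             // group, greedy: overshoots on the second test
                @"=\s*'([^']*)'",          // group, stops at the next quote
        };
        foreach (string pattern in patterns)
        {
            Console.WriteLine();
            foreach (string test in tests)
                foreach (Match m in Regex.Matches(test, pattern))
                {
                    // Groups[0] is always the whole match, so Count > 1
                    // means the pattern has an explicit capture group.
                    if (m.Groups.Count > 1)
                        Console.WriteLine("{0}", m.Groups[1].Value);
                    else
                        Console.WriteLine("{0}", m.Value);
                }
        }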


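A simple regex for this would be @"=\s*'(.*)'". The link ends up in capture group 1. Note that .* is greedy, so this only works when the line contains a single quoted value.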
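
To make the difference concrete, here is a minimal sketch comparing the greedy group with a negated character class on a line that holds two quoted values. The class name GreedyDemo, the helper FirstGroup, and the input string are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class GreedyDemo
{
    // Returns capture group 1 of the first match, or null if nothing matches.
    public static string FirstGroup(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        // Hypothetical input with two assignments on one line.
        string input = "a.image_url = 'http://domain.com/test.jpg' b.image_url = 'http://someother.domain.com/test.jpg'";

        // Greedy: .* runs on to the LAST quote, so both URLs and the text
        // between them end up in the group.
        Console.WriteLine(FirstGroup(input, @"=\s*'(.*)'"));

        // Negated class: [^']* cannot cross a quote, so only the first URL
        // is captured.
        Console.WriteLine(FirstGroup(input, @"=\s*'([^']*)'"));
    }
}
```

If your real data never has more than one quoted value per line, the greedy version is fine; otherwise the negated class is the safer choice.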


Edit: New regex matching your edited question:

You need to match what's between quotes, after a =, right?

@"(?<==\s*')[^']*(?=')"

should do.

(?<==\s*') asserts that there is a =, optionally followed by whitespace, followed by a ', just before our current position (positive lookbehind).

[^']* matches any number of non-' characters.

(?=') asserts that the match stops before the next '.
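
Put together, the three pieces can be used roughly like this. It is only a sketch: the class name QuotedValues and the method Extract are invented for the example, and it relies on .NET allowing a variable-length lookbehind such as \s* (many other regex engines do not).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class QuotedValues
{
    // Lookbehind: "=", optional whitespace, opening quote.
    // Body:       any run of characters that are not a quote.
    // Lookahead:  the closing quote.
    const string Pattern = @"(?<==\s*')[^']*(?=')";

    // Returns every quoted value that directly follows an "=".
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
            results.Add(m.Value); // the quotes are not part of the match
        return results;
    }

    static void Main()
    {
        string sample = "snip...  flashvars.image_url = 'http://domain.com/test.jpg' ..snip";
        foreach (string value in Extract(sample))
            Console.WriteLine(value); // prints http://domain.com/test.jpg
    }
}
```

Because the quotes and the = sit inside look-arounds, m.Value is already the bare link and no group index is needed.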

This regex doesn't check if there is indeed a URL inside those quotes. If you want to do that, use

@"(?<==\s*')(?=(?:https?|ftp|mailto)\b)[^']*(?=')"
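
As a rough illustration of what the extra lookahead changes, the sketch below runs the pattern against one value that starts with a known scheme and one that does not. UrlValues and Extract are hypothetical names, and the second input is made up.

```csharp
using System;
using System.Text.RegularExpressions;

static class UrlValues
{
    // Same pattern as above, plus a lookahead requiring the quoted value
    // to start with http, https, ftp or mailto as a whole word.
    const string Pattern = @"(?<==\s*')(?=(?:https?|ftp|mailto)\b)[^']*(?=')";

    // Returns the first quoted value that starts with one of the schemes,
    // or null if there is none.
    public static string Extract(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        // Starts with "http", so it is returned.
        Console.WriteLine(Extract("flashvars.image_url = 'http://domain.com/test.jpg'"));

        // Quoted, but not a URL: Extract returns null here.
        Console.WriteLine(Extract("flashvars.title = 'hello'") ?? "(no match)");
    }
}
```

This only checks the scheme prefix; it does not validate the rest of the URL.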