
Regex - Find JavaScript methods and their variables in text

The best solution I have come up with so far: given a block of text, it finds the methods that have parameters, but it also matches a function with the parameter key in a case like this: "get: function(key)".

    using System.Collections.Generic;
    using System.Dynamic;
    using System.Text.RegularExpressions;

    public class JavaScriptMethodFinder
    {
        // Group 1 is the parameter and group 2 its opening quote (if any).
        // "Begin" is the method name (set only on the first parameter of a call);
        // "IsEnd" is set when the parameter is followed by the closing parenthesis.
        static readonly string pattern = @"(?<=\s(?<Begin>[a-zA-Z_][a-zA-Z0-9_]*?)\(|\G)\s*((['""]).+?(?<!\\)\2|\{[^}]+\}|[^,;'""(){}\)]+)\s*(?:,|(?<IsEnd>\)))";
        private static readonly Regex RegEx = new Regex(pattern, RegexOptions.Compiled);

        public IEnumerable<dynamic> Find(string text)
        {
            dynamic current = null;
            foreach (Match item in RegEx.Matches(text))
            {
                // A non-empty "Begin" group marks the first parameter of a new call.
                if (item.Groups["Begin"].Value != string.Empty)
                {
                    current = new ExpandoObject();
                    current.MethodName = item.Groups["Begin"].Value;
                    current.Parameters = new List<string>();
                }

                // \G can match before any method name has been seen; skip those matches.
                if (current == null)
                    continue;

                // Add each parameter exactly once (the old code added the last one twice).
                current.Parameters.Add(item.Groups[1].Value);

                if (item.Groups["IsEnd"].Value != string.Empty)
                {
                    yield return current;
                    current = null; // the call is complete; wait for the next "Begin"
                }
            }
        }
    }

I want to find methods and their variables. Here are two examples.

First Example

function loadMarkers(markers)
{
     markers.push(
            new Marker(
              "Hdsf", 
              40.261330438503,
              10.4877055287361,
              "some text"
            ) 
      );
}

Second Example

var block = new AnotherMethod('literal', 'literal', {"key":0,"key":14962,"key":false,"key":2});

So far I have the following, tested here: http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx

(?<=Marker\(|\G)\s*((?<name>['""]).+?(?<!\\)\2|\{[^}]+\}|[^,;'""(){}\)]+)\s*(?:,|\))

Found 5 matches:

  • "Hdsf", has 2 groups: "Hdsf" "
  • 40.261330438503, has 2 groups: 40.261330438503
  • 10.4877055287361, has 2 groups: 10.4877055287361
  • "some text" ) has 2 groups: "some text" "
  • ) has 2 groups:

(?<=AnotherMethod\(|\G)\s*((?<name>['""]).+?(?<!\\)\2|\{[^}]+\}|[^,;'""(){}\)]+)\s*(?:,|\))

Found 3 matches:

  • 'literal', has 2 groups: 'literal' ' (name)
  • 'literal', has 2 groups: 'literal' ' (name)
  • {"key":0,"key":14962,"key":false,"key":2}) has 2 groups: {"key":0,"key":14962,"key":false,"key":2} (name)

I would like to combine these so that I have one expression:

  • Match<(methodname)>
    • Group : parameter
    • Group : parameter
    • Group : parameter
  • Match<(methodname)>
    • Group : parameter
    • Group : parameter
    • Group : parameter

So when I scan a page which contains both cases, I will get two matches, with the first capture in each being the method name and the following ones being the parameters.

I have been trying to modify what I already have, but the lookbehind part is too complex for me to understand.
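
One possible way to get the "one match per method, one group entry per parameter" shape without the lookbehind and \G is to lean on the fact that a .NET group keeps every capture it makes inside a repetition. The sketch below is only an illustration under assumptions: the class name CombinedCallFinder and the group names Name, Param and Quote are hypothetical, bare parameters are assumed to contain no whitespace, and calls with no arguments or with nested calls as arguments are simply not matched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CombinedCallFinder
{
    // Name  : identifier directly before "("
    // Param : captured once per argument (quoted string, flat {...} literal, or bare token)
    // Quote : the opening quote of a string argument, reused to find its closing quote
    private static readonly Regex Call = new Regex(
        @"\b(?<Name>[a-zA-Z_][a-zA-Z0-9_]*)\(\s*" +
        @"(?:(?<Param>(?<Quote>['""]).*?(?<!\\)\k<Quote>|\{[^}]+\}|[^,;'""(){}\s]+)" +
        @"\s*(?:,\s*|(?=\))))+\)",
        RegexOptions.Compiled);

    // Returns one entry per matched call: the method name plus its parameters in order.
    public static IEnumerable<KeyValuePair<string, List<string>>> Find(string text)
    {
        foreach (Match m in Call.Matches(text))
        {
            // Captures holds every value the repeated "Param" group matched, not just the last.
            var parameters = m.Groups["Param"].Captures
                .Cast<Capture>()
                .Select(c => c.Value)
                .ToList();
            yield return new KeyValuePair<string, List<string>>(m.Groups["Name"].Value, parameters);
        }
    }
}
```

With this sketch, the second example should come back as a single entry named AnotherMethod with three parameters, and in the first example new Marker(...) is reported while the surrounding markers.push(...) is skipped because its argument is itself a call.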


Regexes are a very problematic approach for this type of project. Have you looked at using a genuine JavaScript parser/compiler like Rhino? That will give you full awareness of JavaScript syntax "for free" and the ability to walk your source code meaningfully.
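
To illustrate why nesting is where a regex struggles, here is a minimal hand-written scanner that splits an argument list at top-level commas while tracking quotes and bracket depth. It is not Rhino and not a real parser, only a stand-in sketch: the name ArgumentSplitterSketch and its signature are hypothetical, and it ignores comments, regex literals and template strings, and does not check that bracket types pair up correctly.

```csharp
using System;
using System.Collections.Generic;

public static class ArgumentSplitterSketch
{
    // Splits the text between a call's parentheses at top-level commas.
    // 'open' is the index of the "(" that starts the argument list.
    // Returns null when the closing parenthesis is never found.
    public static List<string> SplitArguments(string text, int open)
    {
        var args = new List<string>();
        int depth = 0;
        int start = open + 1;
        char quote = '\0'; // '\0' means "not inside a string literal"

        for (int i = open; i < text.Length; i++)
        {
            char c = text[i];
            if (quote != '\0')
            {
                if (c == '\\') i++;                 // skip the escaped character
                else if (c == quote) quote = '\0';  // string literal ends
                continue;
            }

            if (c == '"' || c == '\'') quote = c;
            else if (c == '(' || c == '{' || c == '[') depth++;
            else if (c == ')' || c == '}' || c == ']')
            {
                depth--;
                if (depth == 0)
                {
                    // Closing parenthesis of the call itself: emit the last argument.
                    string last = text.Substring(start, i - start).Trim();
                    if (last.Length > 0) args.Add(last);
                    return args;
                }
            }
            else if (c == ',' && depth == 1)
            {
                // A comma directly inside the call separates two arguments.
                args.Add(text.Substring(start, i - start).Trim());
                start = i + 1;
            }
        }
        return null;
    }
}
```

Called on the "(" after push in the first example, this should return the whole new Marker(...) expression as one argument, and it can then be called again on the inner "(" to get the four Marker parameters. A real parser gives the same tree without these blind spots.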
