
Regex - Find JavaScript methods and their parameters in text

The best solution I have come up with so far: given a block of text, it finds the methods that have parameters, but it also picks up anonymous functions, so "get: function(key)" is reported as the method "function" with the parameter "key".

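    using System.Collections.Generic;
    using System.Dynamic;
    using System.Text.RegularExpressions;

    public class JavaScriptMethodFinder
    {
        // Unnamed groups: 1 = the parameter text, 2 = the opening quote of a string literal.
        static readonly string pattern = @"(?<=\s(?<Begin>[a-zA-Z_][a-zA-Z0-9_]*?)\(|\G)\s*((['""]).+?(?<!\\)\2|\{[^}]+\}|[^,;'""(){}\)]+)\s*(?:,|(?<IsEnd>\)))";
        private static readonly Regex RegEx = new Regex(pattern, RegexOptions.Compiled);

        public IEnumerable<dynamic> Find(string text)
        {
            dynamic current = null;
            foreach (Match item in RegEx.Matches(text))
            {
                // A non-empty "Begin" group marks the first parameter of a new call.
                if (item.Groups["Begin"].Value != string.Empty)
                {
                    current = new ExpandoObject();
                    current.MethodName = item.Groups["Begin"].Value;
                    current.Parameters = new List<string>();
                }
                if (current == null)
                    continue; // \G can match at the start of the text before any call is seen
                // Assumption: the line that filled Parameters is missing from the post; group 1 holds the parameter.
                current.Parameters.Add(item.Groups[1].Value);
                // "IsEnd" captures the closing parenthesis, so the call is complete.
                if (item.Groups["IsEnd"].Value != string.Empty)
                {
                    yield return current;
                    current = null;
                }
            }
        }
    }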




I want to find methods and their parameters, given the following two examples.

First Example

    function loadMarkers(markers)
        new Marker(
            "some text"

Second Example

    var block = new AnotherMethod('literal', 'literal', {"key":0,"key":14962,"key":false,"key":2});

This is what I have so far, tested here: http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx


Found 5 matches:

  • "Hdsf", has 2 groups: "Hdsf" "
  • 40.261330438503, has 2 groups: 40.261330438503
  • 10.4877055287361, has 2 groups: 10.4877055287361
  • "some text" ) has 2 groups: "some text" "
  • ) has 2 groups:


Found 3 matches:

  • 'literal', has 2 groups: 'literal' ' (name)
  • 'literal', has 2 groups: 'literal' ' (name)
  • {"key":0,"key":14962,"key":false,"key":2}) has 2 groups: {"key":0,"key":14962,"key":false,"key":2} (name)

I would like to combine these into one expression that gives me:

  • Match<(methodname)>
    • Group : parameter
    • Group : parameter
    • Group : parameter
  • Match<(methodname)>
    • Group : parameter
    • Group : parameter
    • Group : parameter

So when I scan a page that contains both cases, I get two matches, each with the method name as the first capture and the parameters as the captures that follow.

I have been trying to modify what I already have, but the lookbehind part is too complex for me to understand.

Regular expressions are a very problematic approach for this type of project. Have you looked at using a genuine JavaScript parser/compiler like Rhino? That will give you full awareness of JavaScript syntax "for free" and the ability to walk your source code meaningfully.
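If a regex is still wanted for flat calls like the two examples, the "one match per method, one capture per parameter" shape from the question can be approximated with .NET's capture collections, since a group that repeats inside a single match keeps every capture in Group.Captures. The following is only a sketch under assumptions: the class name CallScanner, the group names "name" and "param", and the pattern itself are hypothetical and not taken from the question, and the pattern deliberately handles only string literals, flat {...} objects and bare tokens as parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original post.
public static class CallScanner
{
    // identifier, "(", one or more parameters separated by commas, ")".
    // Each parameter is a double-quoted string, a single-quoted string,
    // a flat {...} object, or a bare token such as a number or identifier.
    private static readonly Regex CallPattern = new Regex(
        @"(?<name>[A-Za-z_$][A-Za-z0-9_$]*)\s*\(\s*" +
        @"(?:(?<param>""(?:\\.|[^""\\])*""|'(?:\\.|[^'\\])*'|\{[^{}]*\}|[^\s,;'""(){}]+)\s*(?:,\s*|(?=\))))+" +
        @"\)",
        RegexOptions.Compiled);

    // Returns one entry per matched call: Key = method name, Value = its parameters.
    public static IEnumerable<KeyValuePair<string, List<string>>> Find(string text)
    {
        foreach (Match m in CallPattern.Matches(text))
        {
            // The "param" group matched repeatedly; Captures holds every repetition.
            var parameters = m.Groups["param"].Captures
                .Cast<Capture>()
                .Select(c => c.Value)
                .ToList();
            yield return new KeyValuePair<string, List<string>>(m.Groups["name"].Value, parameters);
        }
    }
}
```

Calling CallScanner.Find on the second example should give a single entry with Key "AnotherMethod" and three parameters. The limitation the answer warns about shows up quickly: for a nested call such as outer(inner(1), 2), this sketch reports only inner, because no alternative of the param group can span the inner parentheses. Comments, regex literals and nested objects are likewise not handled, which is where a real parser pays off.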





