
JavaScript lexer / tokenizer (in Python?)

Does anyone know of a JavaScript lexical analyzer or tokenizer (preferably in Python)?

Basically, given an arbitrary Javascript file, I want to grab the tokens.

e.g.

foo = 1

becomes something like:

  1. variable name : "foo"
  2. whitespace
  3. operator : equals
  4. whitespace
  5. integer : 1


http://code.google.com/p/pynarcissus/ has one.

I also wrote one, but it doesn't support automatic semicolon insertion, so it is pretty useless for JavaScript that you have no control over (almost all real-life JavaScript programs are missing at least one semicolon) :) Here is mine:

http://bitbucket.org/santagada/jaspyon/src/tip/jaspyon/

The grammar is in jsgrammar.txt. It is parsed by the PyPy parsing library (which you will have to download and extract from the PyPy source) and builds a parse tree, which I walk in astbuilder.py.

But if you don't have licensing problems, I would go with pynarcissus. Here's a direct link to the code (ported from Narcissus):

http://code.google.com/p/pynarcissus/source/browse/trunk/jsparser.py
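
To give a feel for the kind of token stream the question asks for, here is a minimal regex-based sketch using only Python's standard `re` module. It is not taken from pynarcissus or jaspyon; the token names and patterns are my own assumptions and cover only a tiny subset of JavaScript (no strings, comments, regex literals, or automatic semicolon insertion).

```python
import re

# Hypothetical token kinds and patterns, chosen for illustration only.
# Order matters: earlier entries win when several could match.
TOKEN_SPEC = [
    ("whitespace", r"\s+"),
    ("number", r"\d+(?:\.\d+)?"),
    ("name", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("operator", r"[=+\-*/<>!]=?|[(){};,.]"),
]

# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Yield (kind, text) pairs; raise SyntaxError on unrecognised input."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        # lastgroup is the name of the alternative that matched.
        yield match.lastgroup, match.group()
        pos = match.end()


for kind, text in tokenize("foo = 1"):
    print(kind, repr(text))
```

For `foo = 1` this yields a name, whitespace, an operator, whitespace, and a number, in that order. Real JavaScript needs far more than this (in particular, telling a regex literal from a division operator depends on context), which is why a port of an existing parser such as pynarcissus is the safer choice.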
