Static code analysis for new language. Where to start? [closed]
Closed 7 years ago.
I've just been given a new assignment which looks like it's going to be an interesting challenge.
The customer wants a code style checking tool developed for their internal (soon to be open-sourced) programming language, which runs on the JVM. The language syntax is very Java-like.
The customer basically wants me to produce something like Checkstyle.
So my question is this: how would you approach this problem? Given a clean slate, what recommendations would you make to the customer?
I think I have 3 options:
Write something from scratch. I'd prefer not to do this, as it seems like this sort of code analysis problem has been solved so many times that there must be a more "framework" or "platform" orientated approach.
Fork an existing code style checking tool and modify the parsing to fit this new language.
Extend or plug into an existing static code analysis tool (maybe write a plugin for Yasca?).
Such tools basically have to implement a compiler front-end for at least a subset of the language. The easiest starting point is often to adapt an existing compiler front-end, so you should definitely start by looking at your customer's compiler. If you are lucky, it will have a clean separation between the front-end and back-end, and you will be able to use it as-is, running your additional analysis over the AST or whatever IR the front-end produces.
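To make the "reuse the front-end's AST" idea concrete, here is a minimal sketch of a style check written as a visitor over an AST. Everything in it is an assumption for illustration: `Node`, `Visitor`, `ClassDecl`, `MethodDecl` and `NamingCheck` are hypothetical stand-ins for whatever node types the customer's compiler actually exposes, and the naming rules are just an example of the kind of check Checkstyle performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical AST types; a real compiler front-end would supply its own.
interface Node {
    void accept(Visitor v);
}

interface Visitor {
    void visitClass(ClassDecl c);
    void visitMethod(MethodDecl m);
}

final class MethodDecl implements Node {
    final String name;
    final int line;

    MethodDecl(String name, int line) {
        this.name = name;
        this.line = line;
    }

    public void accept(Visitor v) {
        v.visitMethod(this);
    }
}

final class ClassDecl implements Node {
    final String name;
    final int line;
    final List<MethodDecl> methods;

    ClassDecl(String name, int line, List<MethodDecl> methods) {
        this.name = name;
        this.line = line;
        this.methods = methods;
    }

    public void accept(Visitor v) {
        v.visitClass(this);
        // Walk the children so the visitor sees every method.
        for (MethodDecl m : methods) {
            m.accept(v);
        }
    }
}

// An example style rule: classes UpperCamelCase, methods lowerCamelCase.
final class NamingCheck implements Visitor {
    final List<String> violations = new ArrayList<>();

    public void visitClass(ClassDecl c) {
        if (!c.name.matches("[A-Z][A-Za-z0-9]*")) {
            violations.add(c.line + ": class name '" + c.name + "' should be UpperCamelCase");
        }
    }

    public void visitMethod(MethodDecl m) {
        if (!m.name.matches("[a-z][A-Za-z0-9]*")) {
            violations.add(m.line + ": method name '" + m.name + "' should be lowerCamelCase");
        }
    }
}

final class StyleCheckDemo {
    // Runs the naming rule over a tree and returns the collected messages.
    static List<String> run(Node root) {
        NamingCheck check = new NamingCheck();
        root.accept(check);
        return check.violations;
    }

    public static void main(String[] args) {
        // In practice this tree would come from the compiler's parser.
        Node tree = new ClassDecl("orderService", 1,
                Arrays.asList(new MethodDecl("Process", 3), new MethodDecl("save", 9)));
        for (String message : run(tree)) {
            System.out.println(message);
        }
    }
}
```

The point of the shape is that each rule is a small visitor, so adding a rule does not touch the parser; whether the customer's compiler exposes a visitor API or only raw node classes is something you would have to find out.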
You don't want to write all this stuff from scratch.
See the DMS Software Reengineering Toolkit. This has generalized compiler machinery for parsing, building ASTs, constructing symbol tables, constructing/traversing control flow and data flow graphs and call trees.
DMS can be obtained with a full Java front end that builds ASTs, symbol tables and the flow analyses above. DMS handles language dialects with aplomb, so it should be as straightforward as practical to modify this front end to match your customer's Java-variant language and yet acquire all this analysis machinery.
What about PMD? I've used PMD for years but never really drilled down into its inner workings before.
PMD can be extended by writing a custom language parser, which is done by providing implementations of the following within a JAR on the class path.
net.sourceforge.pmd.cpd.Language
net.sourceforge.pmd.cpd.Tokenizer
http://pmd.sourceforge.net/cpd-parser-howto.html
Then by using the PMD rule designer I can define rules from the resulting AST.
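For a feel of what the tokenizer half of that involves, here is a rough sketch of a line-aware tokenizer for a Java-like language. It deliberately does not use PMD's real types: `Token` and `SketchTokenizer` are hypothetical stand-ins, the token pattern is a simplification (no string literals, comments or multi-character operators), and the actual `Language` and `Tokenizer` signatures should be taken from the how-to linked above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; NOT one of PMD's net.sourceforge.pmd.cpd types.
final class Token {
    final String text;
    final int line;

    Token(String text, int line) {
        this.text = text;
        this.line = line;
    }
}

final class SketchTokenizer {
    // Identifier or keyword, integer literal, or any single non-space character.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|\\d+|\\S");

    // Splits the source into tokens, recording the 1-based line of each one.
    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            Matcher m = TOKEN.matcher(lines[i]);
            while (m.find()) {
                tokens.add(new Token(m.group(), i + 1));
            }
        }
        return tokens;
    }
}
```

Calling `SketchTokenizer.tokenize("int x = 42;\nreturn x;")` yields eight tokens, five on line 1 and three on line 2. A real implementation would hand these to PMD through its own token classes rather than returning a list.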
The thing I like about PMD is that it's a broadly recognised code analysis tool in the Java space, so it has lots of third-party support, e.g. an Eclipse plugin, a Hudson CI plugin, etc.
Take a look at FindBugs