Is there a utility which given an ANTLR grammar will produce matching strings?
I have an ANTLR grammar and I would like to fuzz my parser.
Are you looking for generation from a context-free grammar (CFG), i.e. the generation of strings that are accepted by the grammar? This could be a good way to check the grammar for correctness, but keep in mind that the set of accepted strings is most probably infinite. Any really bad bugs should already be apparent in the grammar specification, and hopefully from checking that the grammar is LL.
I don't know of any tool in the ANTLR world, and a quick Google search on (E)BNF generation did not reveal anything useful either.
It is, however, not very difficult to roll your own generator if performance and the like are not an issue. Prolog springs to mind, and there is loads of literature available, but if you do not want to leave Java, I suspect homebrewing is the way to go. It's fun anyway.
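To give an idea of how small a homebrew generator can be, here is a minimal sketch in Python (chosen only for brevity; the same idea ports to Java). Everything in it is an assumption for illustration: the toy `GRAMMAR` dictionary, the function names `min_depths` and `generate`, and the depth bound are all made up, and nothing here reads an actual ANTLR file. Because the set of accepted strings is usually infinite, the sketch caps the derivation depth and only picks alternatives that can still terminate within the remaining budget.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
# each alternative is a list of symbols. Any symbol that is not a key is
# treated as a terminal token.
GRAMMAR = {
    "expr": [["term", "+", "expr"], ["term"]],
    "term": [["factor", "*", "term"], ["factor"]],
    "factor": [["(", "expr", ")"], ["NUM"]],
}


def min_depths(grammar):
    """For each nonterminal, the smallest derivation depth that reaches terminals only."""
    depth = {nt: float("inf") for nt in grammar}
    changed = True
    while changed:  # simple fixed-point iteration
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                d = 1 + max((depth.get(s, 0) for s in alt), default=0)
                if d < depth[nt]:
                    depth[nt] = d
                    changed = True
    return depth


def generate(grammar, symbol, max_depth, rng, depths=None):
    """Return a random list of terminal tokens derived from `symbol`."""
    if depths is None:
        depths = min_depths(grammar)
    if symbol not in grammar:  # terminal: emit as-is
        return [symbol]

    def alt_depth(alt):
        return 1 + max((depths.get(s, 0) for s in alt), default=0)

    # Only consider alternatives that can finish within the remaining depth;
    # if none fits, fall back to the shallowest one so we still terminate.
    fitting = [a for a in grammar[symbol] if alt_depth(a) <= max_depth]
    if not fitting:
        fitting = [min(grammar[symbol], key=alt_depth)]
    out = []
    for s in rng.choice(fitting):
        out.extend(generate(grammar, s, max_depth - 1, rng, depths))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(" ".join(generate(GRAMMAR, "expr", 8, rng)))
```

The output is a token sequence, not character text; turning `NUM` into an actual lexeme is a separate step. Uniformly random choice among alternatives tends to favour short strings, so weighting the alternatives is an obvious tweak if you want deeper nesting.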
Assume you generated sentences (strings of tokens) from your ANTLR grammar. Why do you think your ANTLR-based parser would object to them?
What you really have to do is to produce not-quite-legal strings. So, what you need is a generator that can produce erroneous strings.
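One cheap way to get not-quite-legal strings is to take a legal token sequence and apply a single random edit to it. The sketch below is only an illustration under stated assumptions: the `mutate` function, its `op` parameter, and the example vocabulary are made up here, not part of ANTLR or any existing tool. Note that a mutated sequence is not guaranteed to be illegal (inserting `(` `)` pairs aside, swapping or replacing can occasionally yield another valid sentence), so the thing to check is that the parser either accepts or reports an error cleanly, never crashes or hangs.

```python
import random


def mutate(tokens, vocabulary, rng, op=None):
    """Return a copy of `tokens` with one random edit applied.

    `vocabulary` is a list of all terminal tokens; `op` may be forced to
    "delete", "insert", "replace" or "swap", otherwise it is chosen at random.
    """
    tokens = list(tokens)  # never modify the caller's list
    if op is None:
        op = rng.choice(["delete", "insert", "replace", "swap"])

    if op == "delete" and tokens:
        del tokens[rng.randrange(len(tokens))]
    elif op == "insert":
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(vocabulary))
    elif op == "replace" and tokens:
        i = rng.randrange(len(tokens))
        # Prefer a token that differs from the one being replaced.
        choices = [t for t in vocabulary if t != tokens[i]] or list(vocabulary)
        tokens[i] = rng.choice(choices)
    elif op == "swap" and len(tokens) >= 2:
        i = rng.randrange(len(tokens) - 1)
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return tokens


if __name__ == "__main__":
    rng = random.Random(1)
    vocab = ["NUM", "+", "*", "(", ")"]
    legal = ["(", "NUM", "+", "NUM", ")", "*", "NUM"]
    for _ in range(5):
        print(" ".join(mutate(legal, vocab, rng)))
```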
Given that ANTLR generates a set of procedures from your ANTLR grammar, I think it would be difficult to produce a sentence generator using the generated parser. What you need is an explicit model of the grammar. And this is already available to you: the ANTLR input grammar.
An additional complication I see is generation of legal tokens from the regexes that make up the token definitions. Again, you'd need to process the ANTLR input to do this.
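As a rough sketch of the token side, the following Python assumes the token definitions have already been extracted into a small tree of regex-like nodes. The node kinds (`lit`, `chars`, `seq`, `alt`, `opt`, `star`), the `gen_lexeme` function, and the `NUM` rule are all hypothetical; this is not ANTLR's internal representation, and getting from a real lexer rule to such a tree is the part that still requires processing the ANTLR input.

```python
import random
import re

# Hypothetical token rule for NUM, roughly: DIGIT+ ('.' DIGIT+)?
DIGIT = ("chars", "0123456789")
NUM = (
    "seq",
    DIGIT,
    ("star", DIGIT),
    ("opt", ("seq", ("lit", "."), DIGIT, ("star", DIGIT))),
)


def gen_lexeme(node, rng, max_repeat=3):
    """Generate one random string matching the regex-like tree `node`."""
    kind = node[0]
    if kind == "lit":  # literal text
        return node[1]
    if kind == "chars":  # one character out of a set
        return rng.choice(node[1])
    if kind == "seq":  # concatenation
        return "".join(gen_lexeme(n, rng, max_repeat) for n in node[1:])
    if kind == "alt":  # choice between alternatives
        return gen_lexeme(rng.choice(node[1:]), rng, max_repeat)
    if kind == "opt":  # zero or one occurrence
        return gen_lexeme(node[1], rng, max_repeat) if rng.random() < 0.5 else ""
    if kind == "star":  # zero up to max_repeat occurrences
        count = rng.randint(0, max_repeat)
        return "".join(gen_lexeme(node[1], rng, max_repeat) for _ in range(count))
    raise ValueError(f"unknown node kind: {kind!r}")


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(5):
        text = gen_lexeme(NUM, rng)
        # Sanity check against an equivalent Python regex.
        assert re.fullmatch(r"[0-9]+(\.[0-9]+)?", text)
        print(text)
```

The repetition cap `max_repeat` plays the same role as a depth bound in a sentence generator: without it, `star` could in principle produce arbitrarily long lexemes.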
Processing both of these seems technically straightforward. The best engine to use as a foundation is likely the ANTLR front end, which obviously parses ANTLR specs, and so must hold some representation of the ANTLR input.
I was looking for something similar and found GramTest, which seems suitable, but it takes a BNF grammar as input instead of an ANTLR grammar.
This tool allows you to generate test cases based on arbitrary user defined grammars. The input grammar is given in BNF notation. Potential applications include fuzzing and automated testing.
For more background info they link to the following blogposts:
- How does grammar-based test case generation work?
- Practical tips for implementing grammar-based test case generation