Reverse regular expressions to generate data
In one of the StackOverflow Podcasts (the one where the guys were discussing data generation for testing DBs -- either #11 or #12), Jeff mentioned something like "reverse regular expressions", which are used exactly for that purpose: given a regex, produce a string that matches said regex.
What is the correct term for this whole concept? Is this a well-known concept?
The Perl module String::Random (on CPAN) does this. It takes a subset of regular expressions and does a random walk through it.
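To make the "random walk through a subset of regular expressions" idea concrete, here is a rough sketch in Python. It is not how String::Random is implemented; the function names (`compile_generator`, `random_match`) and the `max_repeat` cap on unbounded quantifiers are my own assumptions for illustration. It only handles literals, `.`, `\d`, simple `[...]` classes with ranges, groups, `|`, and the quantifiers `?`, `*`, `+`, `{m}`, `{m,n}`, `{m,}`. Negated classes, anchors and other escapes are not supported.

```python
import random
import re
import string

ALPHANUMERIC = string.ascii_letters + string.digits


def compile_generator(pattern, rng, max_repeat=5):
    """Parse a small regex subset into a zero-argument function that
    returns a random matching string each time it is called."""
    pos = 0

    def parse_alternation():
        nonlocal pos
        branches = [parse_sequence()]
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            branches.append(parse_sequence())
        return lambda: rng.choice(branches)()  # pick one branch at random

    def parse_sequence():
        items = []
        while pos < len(pattern) and pattern[pos] not in "|)":
            items.append(parse_quantified())
        return lambda: "".join(item() for item in items)

    def parse_quantified():
        nonlocal pos
        atom = parse_atom()
        if pos < len(pattern) and pattern[pos] in "?*+{":
            op = pattern[pos]
            pos += 1
            if op == "{":
                end = pattern.index("}", pos)
                bounds = pattern[pos:end].split(",")
                pos = end + 1
                low = int(bounds[0])
                # "{m,}" has no upper bound, so cap it (assumption).
                high = int(bounds[-1]) if bounds[-1] else low + max_repeat
            else:
                low, high = {"?": (0, 1), "*": (0, max_repeat),
                             "+": (1, max_repeat)}[op]
            return lambda: "".join(atom() for _ in range(rng.randint(low, high)))
        return atom

    def parse_atom():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch == "(":
            inner = parse_alternation()
            pos += 1  # skip the closing ")"
            return inner
        if ch == "[":
            chars = []
            while pattern[pos] != "]":
                if pattern[pos + 1] == "-" and pattern[pos + 2] != "]":
                    first, last = ord(pattern[pos]), ord(pattern[pos + 2])
                    chars += [chr(c) for c in range(first, last + 1)]
                    pos += 3
                else:
                    chars.append(pattern[pos])
                    pos += 1
            pos += 1  # skip the closing "]"
            return lambda: rng.choice(chars)
        if ch == ".":
            return lambda: rng.choice(ALPHANUMERIC)
        if ch == "\\":
            ch = pattern[pos]
            pos += 1
            if ch == "d":
                return lambda: rng.choice(string.digits)
        return lambda: ch  # plain (or escaped) literal character

    generator = parse_alternation()
    if pos != len(pattern):
        raise ValueError("could not parse pattern: %r" % pattern)
    return generator


def random_match(pattern, seed=None):
    """Return one random string intended to match `pattern`."""
    return compile_generator(pattern, random.Random(seed))()


if __name__ == "__main__":
    for pattern in [r"[a-z]{3}\d+", r"(foo|ba[rz])+-\d{2,4}"]:
        word = random_match(pattern)
        print(pattern, "->", word, bool(re.fullmatch(pattern, word)))
```

The parser builds a tree of small closures first and only then walks it, so a quantified group produces a fresh random choice on every repetition instead of repeating one sample.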
In the abstract: a recursive transition network (with the postmodernism generator as an interesting example).
One specialization would be your "reverse regex".
As to the terminology: A regular expression is a form of grammar that describes all the words belonging to a specific regular language (namely all the inputs matched by the expression).
Therefore one could phrase your question as: "How can I create a random word that matches a given regex?" or "How can I obtain a random word belonging to a specified regular language?"
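Seen from the "regular language" side, one way to get such a word is a random walk over a finite automaton for the language. The sketch below is only an illustration under my own assumptions: the automaton is hand-written for the example language `(ab)*c`, the state names and `stop_probability` parameter are made up, and every state is assumed to be able to reach an accepting state (otherwise the walk would get stuck).

```python
import random
import re

# Hypothetical DFA for the language of (ab)*c: state -> {symbol: next state}
TRANSITIONS = {
    "start": {"a": "seen_a", "c": "done"},
    "seen_a": {"b": "start"},
    "done": {},
}
ACCEPTING = {"done"}


def random_word(transitions, start, accepting, rng, stop_probability=0.5):
    """Walk the automaton at random and return the word spelled by the path."""
    state, word = start, []
    while True:
        moves = transitions[state]
        # Stop in an accepting state, always if it is a dead end,
        # otherwise with the given probability.
        if state in accepting and (not moves or rng.random() < stop_probability):
            return "".join(word)
        symbol = rng.choice(sorted(moves))
        word.append(symbol)
        state = moves[symbol]


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(5):
        word = random_word(TRANSITIONS, "start", ACCEPTING, rng)
        print(word, bool(re.fullmatch(r"(ab)*c", word)))
```

Every word produced this way ends in an accepting state, so it belongs to the language by construction. The distribution over words is whatever the walk happens to induce, not a uniform one.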
It is absolutely possible to generate data from regular expressions. Some open source projects are under development in this area.
A tutorial on generating random passwords from a regex explains how it is done. It uses xeger (the reverse of "regex"), an open source project. Please go through the tutorial to learn more.