Regular Expression library in C/C++
I want to write a regular expression library in C/C++. What is a good starting point? Are there any books or articles?
I know there are many libraries available, but I want to write my own version.
A good starting point is to use existing implementations and criticize them.
Pay attention to data structures and design decisions you don't like.
Avoid them when you write your version.
[Edit 16-Jan-2015] I recently came across the book Beautiful Code. I recommend you go through Chapter 1, "A Regular Expression Matcher", by Brian Kernighan.
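To give a feel for how small a first version can be, here is a minimal sketch of a recursive backtracking matcher in the style of the one that chapter discusses. The function names (`re_match`, `match_here`, `match_star`) are my own illustrative choices, and it only handles literal characters, `.`, `*`, `^` and `$`, so treat it as a starting point rather than a reference implementation.

```c
#include <stddef.h>

static int match_here(const char *re, const char *text);
static int match_star(int c, const char *re, const char *text);

/* re_match: return 1 if re matches anywhere in text, 0 otherwise. */
int re_match(const char *re, const char *text)
{
    if (re[0] == '^')                      /* anchored: try only at the start */
        return match_here(re + 1, text);
    do {                                   /* try every start position, including the empty tail */
        if (match_here(re, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* match_here: return 1 if re matches at the beginning of text. */
static int match_here(const char *re, const char *text)
{
    if (re[0] == '\0')                     /* pattern exhausted: success */
        return 1;
    if (re[1] == '*')                      /* c* followed by the rest of the pattern */
        return match_star(re[0], re + 2, text);
    if (re[0] == '$' && re[1] == '\0')     /* $ at the end matches only the end of text */
        return *text == '\0';
    if (*text != '\0' && (re[0] == '.' || re[0] == *text))
        return match_here(re + 1, text + 1);
    return 0;
}

/* match_star: return 1 if c* followed by re matches at the beginning of text. */
static int match_star(int c, const char *re, const char *text)
{
    do {                                   /* zero or more occurrences of c */
        if (match_here(re, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}
```

For example, `re_match("^ab*c$", "abbbc")` returns 1 and `re_match("^abc", "xabc")` returns 0. A backtracking design like this is easy to read, but it can become slow on some patterns, which is one of the design decisions worth criticizing when you study it.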
You can read the classic paper by Ken Thompson, "Regular expression search algorithm": http://portal.acm.org/citation.cfm?doid=363347.363387. This paper should give you a good understanding of how regular expressions are matched using finite automata.
Russ Cox also has a page with detailed information on the topic: http://swtch.com/~rsc/regexp/
Hope these help you get started.
I don't know of a book that will help you with the implementation details, and I'm sure there are tons of details involved in making it efficient. However, the book Languages and Machines, by Thomas A. Sudkamp, will help you understand the ideas behind an implementation.
I think what you'll need to do is compile a regular expression into a finite automaton. If you don't know much about grammars and automata, then Part II of that book, "Grammars, Automata, and Languages", will be of great help.
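To make "compile into a finite automaton" a bit more concrete, here is a rough sketch of what the compiled result could look like for one fixed expression, `(a|b)*abb`. The transition table is written by hand here; producing such a table from an arbitrary expression is the part your library would have to do. The names `transition` and `dfa_matches` are just illustrative, and the function tests whether the whole input matches.

```c
#include <stddef.h>

enum { NUM_STATES = 4, NUM_SYMBOLS = 2, START_STATE = 0, ACCEPT_STATE = 3 };

/* Hand-built DFA for (a|b)*abb.
 * Rows are states, columns are input symbols (0 = 'a', 1 = 'b').
 * State k roughly means "the last k characters seen are a prefix of abb". */
static const int transition[NUM_STATES][NUM_SYMBOLS] = {
    /* state 0 */ {1, 0},
    /* state 1 */ {1, 2},
    /* state 2 */ {1, 3},
    /* state 3 */ {1, 0},
};

/* dfa_matches: return 1 if the entire text is in the language (a|b)*abb. */
int dfa_matches(const char *text)
{
    int state = START_STATE;
    for (; *text != '\0'; text++) {
        int symbol;
        if (*text == 'a')
            symbol = 0;
        else if (*text == 'b')
            symbol = 1;
        else
            return 0;                      /* character outside the alphabet {a, b} */
        state = transition[state][symbol]; /* one table lookup per input character */
    }
    return state == ACCEPT_STATE;
}
```

With this table, `dfa_matches("aabb")` returns 1 and `dfa_matches("abba")` returns 0. The appeal of the approach is that matching takes one table lookup per input character, no matter how complicated the original expression was.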
The book Compilers: Principles, Techniques, and Tools, by Alfred Aho, Monica Lam, Ravi Sethi and Jeffrey Ullman (also referred to as the dragon book), may also be of help. It's oriented towards writing a compiler for a programming language, not for a regular expression language. However, you'll probably find it helpful, especially the part about parsing, as it is more practical in nature (as opposed to Languages and Machines, which is very theoretical).
Anyway, if I were to write a regular expression library, those would be my starting points. I recommend borrowing both from a library you have access to. Other than that, you should take a look at working implementations. I'm just guessing here, but I think there is probably good documentation on Perl's regular expression implementation, seeing that Perl regular expressions are so popular and work so well.
Good luck.