How do you understand the output of re.pm when debug is turned on?
[root@ test]# perl -e 'use re "debug";"a" =~ /.*/';
Compiling REx `.*'
size 3 Got 28 bytes for offset annotations.
first at 2
1: STAR(3)
2: REG_ANY(0)
3: END(0)
anchored(MBOL) implicit minlen 0
Offsets: [3]
2[1] 1[1] 3[0]
Matching REx ".*" against "a"
Setting an EVAL scope, savestack=3
0 <> <a> | 1: STAR
REG_ANY can match 1 times out of 2147483647...
Setting an EVAL scope, savestack=3
1 <a> <> | 3: END
Match successful!
Freeing REx: `".*"'
Can anyone interpret this?
The output has two important parts: pattern compilation and runtime matching.
The first part describes the three nodes of the compiled automaton.
STAR(n)
matches zero or more of the following node and continues through node n.
REG_ANY
matches any character except newline (i.e., /./).
END
marks the end state of the automaton.
MBOL
matches beginning-of-line in multiline match mode, i.e., /^/m. This is there implicitly because of the .* at the beginning of your pattern. (Remember: regex quantifiers are greedy by default.)
The minimum length of a string that can match your pattern is zero, i.e., the empty string. (Remember: the * quantifier always succeeds!)
Offsets are a list of POSITION[LENGTH] entries, one per node in node order, and link nodes to the regex in your program. In your case, the * (node 1) is at position 2 of your pattern, the . (node 2) is at position 1, and the end state (node 3) has length 0 because it is there implicitly. Offsets are handy for regex debuggers, e.g., to highlight which subpattern is currently attempting to match.
Now that it's compiled, it can be matched, and the latter part traces the execution. The Pragmas and Debugging section of the perlretut documentation explains the form of the lines that describe match progress:
Each step is of the form n <x> <y>, with <x> the part of the string matched and <y> the part not yet matched.
The match in your question begins with no text consumed, then .* matches a, and the pattern matches successfully.
The eval scope is machinery related to executable code in regexes, which you don't use.
The Debugging Regular Expressions section of the perldebguts documentation gives more background information, and, as always, use the source, Luke!