Perl split pattern
According to the perldoc, the syntax for split is:
split /PATTERN/,EXPR,LIMIT
But the PATTERN can also be a single- or double-quoted string: split "PATTERN", EXPR. What difference does it make?
Edit: A difference I'm aware of is splitting on backslashes: split /\\/ vs split '\\'. The second form doesn't work.
It looks like it uses that as "an expression to specify patterns":
The pattern /PATTERN/ may be replaced with an expression to specify patterns that vary at runtime. (To do runtime compilation only once, use /$variable/o .)
edit: I tested it with this:
my $foo = 'a:b:c,d,e';
print join(' ', split("[:,]", $foo)), "\n";
print join(' ', split(/[:,]/, $foo)), "\n";
print join(' ', split(/\Q[:,]\E/, $foo)), "\n";
Except for the ' ' special case, it looks just like a regular expression.
PATTERN is always interpreted as... well, a pattern, never as a literal value. It can be either a regex (see note 1) or a string. Strings are compiled to regexes. For the most part the behavior is the same, but there can be subtle differences caused by the double interpretation: a string is first processed by Perl's quoting rules, and only the resulting characters are then compiled as a regex.
The string '\\' only contains a single backslash. When interpreted as a pattern, it's as if you had written /\/, which is invalid:
C:\>perl -e "print join ':', split '\\', 'a\b\c'"
Trailing \ in regex m/\/ at -e line 1.
Oops!
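For comparison, here is a minimal sketch of three ways to split on a backslash that should work. The variable names ($path, $sep, and the result arrays) are made up for illustration.

```perl
use strict;
use warnings;

# Single quotes keep the backslashes, so this string is: a \ b \ c
my $path = 'a\b\c';

# Regex literal: \\ in the pattern matches one literal backslash.
my @from_regex = split /\\/, $path;

# String pattern: '\\\\' holds two backslashes, which compile to the regex \\.
my @from_string = split '\\\\', $path;

# Escaping at runtime: \Q...\E quotes the single backslash held in $sep.
my $sep = '\\';
my @from_quoted = split /\Q$sep\E/, $path;

print join(':', @from_regex),  "\n";   # a:b:c
print join(':', @from_string), "\n";   # a:b:c
print join(':', @from_quoted), "\n";   # a:b:c
```

The string form needs twice as many backslashes as the regex form because it is unescaped once as a string and then again as a pattern.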
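Additionally, there are two special cases:
- The empty pattern //, which splits on the empty string, i.e. between every character.
- A single space ' ', which splits on runs of whitespace after first skipping any leading whitespace. Trailing whitespace leaves no extra field either, because split drops trailing empty fields by default.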
1. Regexes can be supplied either inline as /.../ or as a precompiled qr// quoted string.
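The special cases above, and the qr// form from the note, can be sketched like this. The sample strings and variable names are invented for the example; the results in the comments are what I would expect from a reasonably recent Perl 5.

```perl
use strict;
use warnings;

my $line = '  a  b c ';

# ' ' (a one-space string): skip leading whitespace, split on whitespace runs.
my @awk = split ' ', $line;        # ('a', 'b', 'c')

# / / (a regex): split on every single space, so empty fields appear.
my @single = split / /, $line;     # ('', '', 'a', '', 'b', 'c')

# /\s+/: whitespace runs, but the leading run yields an empty first field.
my @runs = split /\s+/, $line;     # ('', 'a', 'b', 'c')

# //: the empty pattern splits between every character.
my @chars = split //, 'abc';       # ('a', 'b', 'c')

# A precompiled qr// object can be passed in place of /PATTERN/.
my $sep = qr/[:,]/;
my @fields = split $sep, 'a:b:c,d,e';   # ('a', 'b', 'c', 'd', 'e')

# Bracket each field so the empty ones are visible in the output.
print join(',', map { "[$_]" } @$_), "\n"
    for \@awk, \@single, \@runs, \@chars, \@fields;
```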
I believe there's no difference. A string pattern is also interpreted as a regular expression.
perl -e 'print join("-",split("[a-e]","regular"))';
r-gul-r
As you see, the delimiter is interpreted as a regular expression, not a string literal.
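This matters most when the delimiter happens to be a regex metacharacter. A small sketch of two common surprises, with made-up sample data and variable names:

```perl
use strict;
use warnings;

my $version = '1.2.3';

# '.' compiles to /./, which matches every character. All fields come out
# empty, and split drops trailing empty fields, so nothing is left.
my @dot_string = split '.', $version;       # ()

# Escaping the dot makes it a literal delimiter.
my @dot_escaped = split /\./, $version;     # ('1', '2', '3')

# '|' compiles to an empty alternation, which matches the empty string,
# so the input is split between every character.
my @pipe_string = split '|', 'a|b';         # ('a', '|', 'b')

# \Q...\E quotes whatever delimiter is stored in the variable.
my $sep = '|';
my @pipe_quoted = split /\Q$sep\E/, 'a|b';  # ('a', 'b')

print scalar(@dot_string), "\n";            # 0
print join(' ', @dot_escaped), "\n";        # 1 2 3
print join(' ', @pipe_string), "\n";        # a | b
print join(' ', @pipe_quoted), "\n";        # a b
```

When the delimiter comes from user input or a variable, quoting it with \Q...\E (or quotemeta) is the safer default.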
So, it's mostly the same, with one important exception: split(" ",...) and split(/ /,...) are different.
I prefer to use /PATTERN/ to avoid confusion; otherwise it's easy to forget that it's a regexp.
Two observable rules:
- the special case split(" ") behaves like split(/\s+/), except that leading whitespace is skipped instead of producing an empty first field.
- for everything else (it seems, don't nail me), split("something") is equal to split(/something/).