Why does my regular expression fail with certain substitutions?
I am new to Perl and not sure how to achieve the following. I am reading a file and putting each line in a variable called $tline. Next, I am trying to replace some characters in $tline. The substitution fails if $tline contains special characters such as (, ?, or =. How do I escape the special characters in this variable $tline?
if ($tline ne "") {
$tline =~ s/\//\%;
}
EDIT
Sorry for the confusion. Here is what I am trying to do.
$tline =~ s/"\//"\<\%\=request\.getContextPath\(\)\%\>\//;
This works in most cases, but it fails when the input file has a ? in it.
How about:
$tline =~ s/\Q$var\E/$replacement/; # $replacement is a placeholder for your replacement text
That will cause quotemeta to be applied to the contents of $var, which is being used as the pattern.
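A minimal sketch of the \Q...\E idea, assuming made-up sample values for $tline and $var (neither value comes from the question):

```perl
use strict;
use warnings;

# Hypothetical input: a line containing regex metacharacters.
my $tline = 'index.jsp?id=(1)';
# Hypothetical search text, meant to be matched literally.
my $var   = '?id=(1)';

# \Q...\E applies quotemeta to the interpolated $var, so '?', '(' and ')'
# match themselves instead of acting as regex operators.
(my $result = $tline) =~ s/\Q$var\E/%/;

print "$result\n";   # index.jsp%
```

Without the \Q, the contents of $var would be compiled as a pattern, and a leading ? is not valid there.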
This isn't a valid substitution:
$tline =~ s/\//\%;
Perl reads it like this:
$tline =~ s/a/%;
where a stands for the escaped /, so the replacement part is never terminated.
If what you wanted was to replace a forward slash with a percent sign, you probably want:
$tline =~ s/\//%/;
Which is better written like this:
$tline =~ s,/,%,;
You probably also want to replace more than just the first forward slash, so you want the /g flag:
$tline =~ s,/,%,g;
And this is exactly what tr (transliteration) does:
$tline =~ tr,/,%,;
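As a quick check that the two forms agree, here is a small sketch run on a made-up sample string (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $path = 'a/b/c';   # hypothetical sample input

(my $with_s  = $path) =~ s{/}{%}g;   # substitution, all occurrences
(my $with_tr = $path) =~ tr{/}{%};   # transliteration, character for character

print "$with_s\n";    # a%b%c
print "$with_tr\n";   # a%b%c
```

tr only maps single characters to single characters, so it fits this case but not the longer replacement in the edited question.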
UPDATE: I think what you want is a simple quotemeta(), which takes your input and regex-escapes the metacharacters:
$ perl -e'print quotemeta("</foo?>")'
\<\/foo\?\>
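The escaped string can then be interpolated into a pattern. A rough sketch, assuming a made-up input line and a made-up replacement text:

```perl
use strict;
use warnings;

my $literal = '</foo?>';              # text to match literally
my $escaped = quotemeta($literal);    # \<\/foo\?\>

my $tline = 'before </foo?> after';   # hypothetical input line
(my $result = $tline) =~ s/$escaped/[tag]/;

print "$result\n";   # before [tag] after
```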
You could place all your special characters between square brackets (called a "character class"). The following will replace all left parentheses, question marks and equal signs in your string with percent signs:
my $tline = 'fo(?=o';
$tline =~ s/[(?=]/%/g;
print "$tline\n";
Prints:
fo%%%o
quotemeta is a good function for getting an exact literal with special characters into a regex, and \Q and \E are good operators for doing the same thing inside the regex.
However, your search expression is not that complex. In your edit, you're simply looking for a double quote and a slash. In fact, I've simplified your expression so far that it contains not a single backslash. So it's not a problem for quotemeta, nor for that matter \Q and \E.
Once pared down, I don't see anything in your revised substitution that would cause a problem with '?' in $tline.
Key to the simplification is that '.', '(', and ')' mean nothing special to the replacement section of your expression, so this is equivalent:
$tline =~ s/"\//"<%=request.getContextPath()%>\//;
Not to mention easier to read. Of course this is even easier:
$tline =~ s|"/|"<%=request.getContextPath()%>/|;
That works because in Perl you can choose whatever delimiter you wish for the s operator.
But with any of these, this works:
use Test::More tests => 1;
my $tline = '"/?"';
$tline =~ s|"/|"<%=request.getContextPath()%>/|;
ok( $tline =~ /getContextPath/ );
It passes the test. Perhaps you're having a problem with more than one substitution on a line. That can be fixed with:
$tline =~ s|"/|"<%=request.getContextPath()%>/|g;
The g on the end is the global switch; it says to make this substitution as many times as the pattern occurs in the input.
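To see the difference the switch makes, here is a small sketch on a made-up line with two quoted paths (the input and the counting variables are assumptions for illustration):

```perl
use strict;
use warnings;

my $line = '<a href="/a?x=1"><img src="/b.png">';   # hypothetical input

(my $first_only = $line) =~ s|"/|"<%=request.getContextPath()%>/|;
(my $all        = $line) =~ s|"/|"<%=request.getContextPath()%>/|g;

# Count how many times the replacement text now appears in each result.
my $count_first = () = $first_only =~ /getContextPath/g;
my $count_all   = () = $all        =~ /getContextPath/g;

print "$count_first $count_all\n";   # 1 2
```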
However, since I can see what you are doing, I suggest an even tighter specification of what you want to search:
$tline =~ s~\b(href|link|src)="/~$1="<%=request.getContextPath()%>/~g;
And when I run this:
use Test::More tests => 2;
my $tline = '"/?"';
$tline =~ s/"\//"<%=request.getContextPath()%>\//;
ok( $tline =~ /getContextPath/ );
$tline = 'src="/?/?/beer"';
ok( $tline =~ s~\b(href|link|src)="/~$1="<%=request.getContextPath()%>/~g
);
I get two successes.
Your true problem is as yet unspecified.
Well, one way to do it is to put all the characters you want to replace in square brackets. Like so:
$string =~ s/[,?=\/]//; # This will remove the first ',', '?', '=', or '/' from your string.
If you want to remove all the '?' in a string, for example, use a g on the end of it like so:
$string =~ s/[?]//g;
I'm a little rusty, but I believe that inside the brackets you only need a '\' in front of \ or / (and of course for the usual escapes like \n, \t, etc.). Like so:
$string =~ s/[\\]/\//g; # Switch from DOS to Unix delimiters.
$string =~ s/[\n\t]//g; # Remove all newlines and tabs
As others have said, the code you've posted isn't going to work since you forgot the last /. That's another nice reason to keep the "weird" characters in a box.
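A short sketch of the bracket approach on a made-up string, assuming the goal is to strip a few punctuation characters and to convert backslashes to forward slashes:

```perl
use strict;
use warnings;

my $string = 'C:\dir\file?a=1,b=2';   # hypothetical input

# Inside [...] the characters ',', '?' and '=' are literal; no backslashes needed.
(my $stripped = $string) =~ s/[,?=]//g;

# A backslash still has to be escaped, even inside the brackets.
(my $unix = $string) =~ s/[\\]/\//g;

print "$stripped\n";   # C:\dir\filea1b2
print "$unix\n";       # C:/dir/file?a=1,b=2
```

Note that the brackets belong only on the pattern side. The replacement side is an ordinary string, so brackets written there would be inserted literally.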