Either convert a glob to regex or have Perl handle glob patterns
I have a config .ini file where users can specify a file pattern either using a Perl regular expression or as an Ant globbing pattern. The following for instance would prohibit a user from creating a file that's not allowed under Windows:
[BAN Defined using Ant Globbing]
file = **/prn.*
ignorecase = true
[BAN Defined using Regular expressions]
match = /(aux|con|com[0-9]*|lpt[0-9]*|nul|clock$)\.?[a-z]$
ignorecase = true
Right now, I have to convert the glob into a regular expression in order to programmatically handle it. I have a routine that does it, but it's kind of convoluted. I am looking for one of the following:
- An easy way of converting a glob into a regular expression
- Methods of matching glob expressions like you can with regular expressions.
For example:
if ($regex =~ /\/(aux|con|com[0-9]*|lpt[0-9]*|nul|clock$)\.?[a-z]$/) {
if ($glob ?magic? /**/prn.*/) {
I was hoping that there is some magical Perl way of doing this. So, is there an easy, can't-miss way of doing it?
BTW, here's my subroutine in case anyone is interested:
sub glob2regex {
    my $glob = shift;
    my $regex = "";
    my $previousAsterisk = undef;
    foreach my $letter (split(//, $glob)) {
        #
        # ####Check if previous letter was asterisk
        #
        if ($previousAsterisk) {
            if ($letter eq "*") { #Double asterisk
                $regex .= ".*";
                $previousAsterisk = undef;
                next;
            } else { #Single asterisk: Write prev match
                $regex .= "[^/]*";
                $previousAsterisk = undef;
            }
        }
        #
        # ####Quote all Regex characters w/ no meaning in glob
        #
        if ($letter =~ /[\{\}\.\+\(\)\[\]\|\^\$]/) {
            $regex .= "\\$letter";
        #
        # ####Translate "?" to Regular expression equivalent
        #
        } elsif ($letter eq "?") {
            $regex .= ".";
        #
        # ####Don't know how to handle asterisks until the next letter
        #
        } elsif ($letter eq "*") {
            $previousAsterisk = 1;
        #
        # ####Convert backslashes to forward slashes
        #
        } elsif ($letter eq '\\') {
            $regex .= "/";
        #
        # ####Just a letter
        #
        } else {
            $regex .= $letter;
        }
    }
    #
    # ####Handle if last letter was asterisk
    #
    if ($previousAsterisk) {
        $regex .= "[^/]*";
    }
    #
    # ####Globs are anchored to both beginning and ending
    #
    $regex = "^$regex\$";
    return $regex;
}
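As a rough sketch of how the result might be used with the `ignorecase` setting from the config: tracing the routine by hand for `**/prn.*` gives the string `^.*/prn\.[^/]*$`, which is hard-coded below so the snippet runs on its own. The names `$regex_string`, `$ignorecase`, and `$compiled` are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Regex string that glob2regex yields for '**/prn.*' (traced by hand).
my $regex_string = '^.*/prn\.[^/]*$';

# Assumption: stands in for "ignorecase = true" read from the .ini file.
my $ignorecase = 1;

# Compile once; the /i flag is stored inside the compiled pattern.
my $compiled = $ignorecase ? qr/$regex_string/i : qr/$regex_string/;

for my $path ('docs/PRN.txt', 'docs/prn/readme.txt', 'prn.txt') {
    printf "%-22s %s\n", $path, ($path =~ $compiled ? 'banned' : 'allowed');
}
```

Only `docs/PRN.txt` is reported as banned here. Note that a bare `prn.txt` is allowed, because the generated regex requires a literal `/` before `prn`. As far as I know, Ant itself lets `**/` match zero directories, so this may be a case worth handling separately.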
Given that:
- ? matches exactly one character except '/'
- * matches zero or more characters except '/'
- ** matches anything including /
If you don't care about format-checking and some corner cases like '***', the following strategy might work: first convert the special characters to custom-designed escape sequences, then convert the escape sequences to their final regex strings.
my $rgx="^$glob\$";
$rgx=~ s|!|!e|g;
$rgx=~ s|[+]|!p|g;
$rgx=~ s|[*]{2}|!d|g;
$rgx=~ s|[*]|!s|g;
$rgx=~ s|[?]|!q|g;
$rgx=~ s|[.]|\\.|g;
$rgx=~ s|!d|.*|g;
$rgx=~ s|!s|[^/]*|g;
$rgx=~ s|!q|[^/]|g;
$rgx=~ s|!p|\\+|g;
$rgx=~ s|!e|!|g;
if ($path =~ m|$rgx|){
    return 1;
}
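The same three rules can also be sketched as a single pass that tokenizes the glob and escapes every other character with `quotemeta`, which avoids the intermediate escape sequences. This is only an illustration: the name `ant_glob_to_regex` is hypothetical, and treating `**/` as "zero or more directories" is an assumption modeled on how Ant is usually described, not something stated above.

```perl
use strict;
use warnings;

# Hypothetical helper: translate an Ant-style glob into an anchored
# regex string in one pass over the pattern.
sub ant_glob_to_regex {
    my ($glob) = @_;
    my %token = (
        '**/' => '(?:.*/)?',   # assumption: zero or more whole directories
        '**'  => '.*',         # anything, including '/'
        '*'   => '[^/]*',      # zero or more characters except '/'
        '?'   => '[^/]',       # exactly one character except '/'
    );
    # Alternation order matters: longest wildcard first, then any other
    # single character, which is escaped with quotemeta.
    (my $body = $glob) =~ s{(\*\*/|\*\*|\*|\?|.)}
                           { exists $token{$1} ? $token{$1} : quotemeta($1) }gse;
    return "^$body\$";
}

my $ban = ant_glob_to_regex('**/prn.*');
print "banned\n" if 'docs/PRN.txt' =~ /$ban/i;
```

Because every non-wildcard character goes through `quotemeta`, characters such as `(`, `|`, or `$` in a file name need no special cases, at the cost of a slightly noisier regex string.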
Apparently, there's no neat Perl Guru trick for creating a regular expression from a glob. Drats.
The best I can do is find a CPAN module like Text::Glob that does it. However, Text::Glob doesn't do Ant-style expanded globbing, so I'd have to modify it anyway. And the code is no simpler than what I already have.
So, I'm just sticking with what I have.
Thanks anyway.