Find positions of given characters

I want to find the positions of certain characters so that I can process them without resorting to a monstrous, recursive, inefficient regular expression. Here is how I do it:

my @charpos=();
s/(?=([«»\n]))/push @charpos, [$1, 0+($-[0])]; "";/ge;
# sort {$a->[1] <=> $b->[1]} @charpos;

But this solution uses the substitution operator to replace each match with an empty string. Is this normal? Should the commented-out line be uncommented?


For your general problem, you might want to examine sub parse_line in Text::ParseWords.

In the context of the code you gave in your question, I would avoid modifying the source string:

#!/usr/bin/perl

use utf8;
use strict; use warnings;

my $x = q{«...«...»...«...»...»};

my @pos;

while ( $x =~ /([«»\n])/g ) {
    push @pos, $-[1];
}

use YAML;
print Dump \@pos;
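
If the matched character is needed along with its offset, as in the question's @charpos, the same loop can store pairs. The following is a minimal sketch rather than a definitive version; the sample string is made up for illustration. Because //g scans from left to right, the offsets come out in ascending order, so the commented-out sort should not be necessary.

```perl
use utf8;
use strict; use warnings;

my $x = "«a»\n«b»";    # hypothetical sample input

my @charpos;
while ( $x =~ /([«»\n])/g ) {
    # $1 is the matched character, $-[1] the offset where group 1 starts
    push @charpos, [ $1, $-[1] ];
}

# Matches are found left to right, so the offsets are already sorted
my @offsets = map { $_->[1] } @charpos;
print "@offsets\n";    # prints "0 2 3 4 6"
```

The source string is only read here, never rewritten.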


There’s more than one way to skin a cat:

#!/usr/bin/env perl

use 5.010;
use utf8;
use strict;
use warnings qw< FATAL all >;
use autodie;
use open qw< :std OUT :utf8 >;

END { close STDOUT }

my @pos = ();
my $string = q{«...«...»...«...»...»};
($string .= "\n") x= 3;

say "string is:\n$string";

for ($string) {
    push @pos, pos while m{
        (?= [«»\n] )
    }sxg;
}
say "first  test matches \@ @pos";

@pos = ();

## this smokes :)
"ignify" while $string =~ m{
    [«»\n]
    (?{ push @pos, $-[0] })
}gx;
say "second test matches \@ @pos";

__END__
string is:
«...«...»...«...»...»
«...«...»...«...»...»
«...«...»...«...»...»

first  test matches @ 0 4 8 12 16 20 21 22 26 30 34 38 42 43 44 48 52 56 60 64 65
second test matches @ 0 4 8 12 16 20 21 22 26 30 34 38 42 43 44 48 52 56 60 64 65

But please credit Sinan.


In general, to find the positions of characters in a string, you can do it this way:

my $str = ...;
my @pos;
push @pos, pos $str while $str =~ /(?=[...])/g;

And then all the positions where the regex matched will be in @pos. At least with this method you are not constantly rewriting your source string.
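
To make the role of the lookahead concrete, here is a small sketch with a made-up string and character class (both are assumptions for illustration only). Since (?=...) is zero-width, pos reports where the character starts; with a consuming match, pos reports the offset just past it.

```perl
use strict; use warnings;

my $str = "a,b;c,d";    # hypothetical input

# Zero-width lookahead: pos is the offset of the character itself
my @pos;
push @pos, pos $str while $str =~ /(?=[,;])/g;
print "@pos\n";      # prints "1 3 5"

# Consuming match: pos is one past the matched character
my @after;
push @after, pos $str while $str =~ /[,;]/g;
print "@after\n";    # prints "2 4 6"
```

An alternative that avoids the lookahead is to push $-[0], the start offset of the last successful match, inside a consuming match.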


A regular-expression free cat skinimization to add to the manual. Whether it is monstrous is in the eye of the beholder:

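use strict; use warnings;
use utf8;    # « and » appear literally in the source
use List::Util qw(min);
my $Inf = 9**9**9;    # numeric infinity; a bareword Inf is rejected under strict
my @targets = ('«','»',"\n");
my $x = q{«...«...»...«...»...»};
# earliest occurrence of any target character, or $Inf if there is none
my $pos = min map { my $z = index($x,$_); $z<0?$Inf:$z } @targets;
my @pos;
while ($pos < $Inf) {
    push @pos, $pos;
    # search again, starting just past the position already recorded
    $pos = min map { my $z = index($x,$_,$pos+1); $z<0?$Inf:$z } @targets;
}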