How to exclude submatches in Perl?

Posted: 2023-03-05 13:48

I have to split a string into pieces containing words or special characters.

Let's say I have the string 'This is "another problem..."'. What I want to get is an array consisting of these pieces: ('This', 'is', '"', 'another', 'problem', '...', '"').

I have done this in JavaScript with the following RegExp which works fine:

string.match(/([^-\s\w])\1*|[-\w]+/g); // works

The same approach does not work in Perl: the subpattern I use to combine consecutive characters is captured too, so those submatches end up in the result as well:

@matches = $string =~ m/(([^-\s\w])\2*|[-\w]+)/g; # does not work

Is there a way of getting rid of the subpatterns/submatches either in the result or in the regexp itself?

In your "does not work" example, I think you mean \2, not \1.

You'd have to iterate through the matches to do this:

push @matches, "$1" while $string =~ m/(([^-\s\w])\2*|[-\w]+)/g;
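To make the difference concrete, here is a minimal sketch using the string from the question. It assumes nothing beyond core Perl; the variable names @raw and @tokens are only illustrative. In list context a /g match returns every capture group of every match, while the loop keeps only $1:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'This is "another problem..."';

# List context with /g returns ALL capture groups for each match,
# so the inner group ($2) is interleaved with the tokens
# (it is undef whenever the word branch matched).
my @raw = $string =~ /(([^-\s\w])\2*|[-\w]+)/g;

# Scalar context with /g advances one match per iteration,
# which lets us keep only the outer group ($1).
my @tokens;
push @tokens, $1 while $string =~ /(([^-\s\w])\2*|[-\w]+)/g;

print scalar(@raw), " raw elements, ", scalar(@tokens), " tokens\n";
print join('|', @tokens), "\n";
```

For this input the raw list has 14 elements (7 matches times 2 groups), and the loop yields the 7 pieces asked for in the question.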

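# Alternative: with /p (Perl 5.10+), ${^MATCH} holds the whole match,
# so no outer capture group is needed and the backreference is \1
my @matches;
push @matches, ${^MATCH} while $string =~ /([^-\s\w])\1*|[-\w]+/pg;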

# Alternative: wrap everything in an outer group and keep only $1
my @matches;
push @matches, $1 while $string =~ /(([^-\s\w])\2*|[-\w]+)/g;

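# Alternative: match in list context, then keep every other element
# (the $1 values); $i must start at 0 so the first element is kept
my $i = 0;
my @matches = grep ++$i % 2, $string =~ /(([^-\s\w])\2*|[-\w]+)/g;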

In Perl, there's more than one way to do it (TIMTOWTDI):

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $str='Here\'s a (good, bad, ..., ?) example to be used in this "reg-ex" test.';

# NB: grep { length } removes the empty strings that split returns
# between adjacent tokens, without also dropping a token such as "0"

my @matches = grep { length } split(/
  \s*             # discard possible leading whitespace
  (
    \.{3}         # ellipsis (must come before punct)
  |
    \w+\-\w+      # hyphenated words
  |
    \w+\'(?:\w+)? # compound words
  | 
    \w+           # other words
  | 
    [[:punct:]]   # other punctuation chars
  )
/x,$str);

print Dumper(\@matches);

will print:

$VAR1 = [
      'Here\'s',
      'a',
      '(',
      'good',
      ',',
      'bad',
      ',',
      '...',
      ',',
      '?',
      ')',
      'example',
      'to',
      'be',
      'used',
      'in',
      'this',
      '"',
      'reg-ex',
      '"',
      'test',
      '.'
    ];
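As a side note on the original question, the alternation above contains only one capturing group (the inner (?:...) does not capture), so it could probably also be used in a plain list-context match instead of split. The following is a sketch of that idea applied to the string from the question, not part of the answer itself. Unlike the question's \1* pattern, it only combines dots when there are exactly three of them:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = 'This is "another problem..."';

# With a single capturing group, a list-context match with the g flag
# returns exactly one element per match, so no submatches leak in.
my @tokens = $str =~ /
  (
    \.{3}         # ellipsis (must come before punct)
  |
    \w+\-\w+      # hyphenated words
  |
    \w+\'(?:\w+)? # compound words
  |
    \w+           # other words
  |
    [[:punct:]]   # other punctuation chars
  )
/gx;

print join('|', @tokens), "\n";
```

Whitespace is simply skipped between matches, so there are no empty fields to filter out afterwards.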
