awk to perl conversion

2023-02-06 19:02 Q&A author:

I have a directory full of files containing records like:

FAKE ORGANIZATION
799 S FAKE AVE
Northern Blempglorff, RI 99xxx


                                                                      01/26/2011
     These items are being held for you at the location shown below each one.
     IF YOU ASKED THAT MATERIAL BE MAILED TO YOU, PLEASE DISREGARD THIS NOTICE.

     The Waltons. The complete  DAXXXX12118198
     Pickup at:CHUPACABRA LOCATION                                 02/02/2011







                                                  GRIMLY, WILFORD
                                                  29 FAKE LANE
                                                  S. BLEMPGLORFF RI  99XXX

I need to remove all entries with the expression Pickup at:CHUPACABRA LOCATION.

The "record separator" issue: I can't touch the input file's formatting -- it must be retained as is. Records are separated by roughly 40 or more newlines.

Here's some awk ( this works ):

BEGIN { 
    RS="\n\n\n\n\n\n\n\n\n+" 
    FS="\n"
}
!/CHUPACABRA/{print $0}

My stab with perl:

perl -a -F\n -ne '$/ = "\n\n\n\n\n\n\n\n\n+";$\ = "\n";chomp;$regex="CHUPACABRA";print $_ if $_ !~ m/$regex/i;' data/lib51.000

Nothing is returned. I'm not sure how to specify 'field separator' in perl except at the commandline. Tried the a2p utility -- no dice. For the curious, here's what it produces:

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z
            # process any FOO=bar switches

#$FS = ' ';     # set field separator
$, = ' ';       # set output field separator
$\ = "\n";      # set output record separator

$/ = "\n\n\n\n\n\n\n\n\n+";
$FS = "\n";

while (<>) {
    chomp;  # strip record separator
    if (!/CHUPACABRA/) {
    print $_; 
   }   
}

This has to run on someone's Windows box; otherwise I'd stick with awk.

Thanks!

Bubnoff

EDIT ( SOLVED )

Thanks mob! Here's a ( working ) perl script version ( adjusted a2p output ):

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z
            # process any FOO=bar switches

#$FS = ' ';     # set field separator
$, = ' ';       # set output field separator
$\ = "\n";      # set output record separator

$/ = "\n"x10;
$FS = "\n";

while (<>) {
    chomp;  # strip record separator
    if (!/CHUPACABRA/) {
    print $_; 
    }   
}

Feel free to post improvements or CPAN goodies that make this more idiomatic and/or perl-ish. Thanks!
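
One possible tidier version of the script above, offered as a sketch rather than the definitive answer. It drops the a2p scaffolding, turns on strict and warnings, and sets $/ locally. The helper name keep_records is made up for illustration. The separator is assumed to be at least ten newlines, so if a gap is not an exact multiple of ten, the next record keeps a few leading newlines.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): read records from a
# filehandle and return the ones that do NOT match $pattern.
sub keep_records {
    my ($fh, $pattern) = @_;
    local $/ = "\n" x 10;    # a literal string of ten newlines, not a regex
    my @kept;
    while (my $record = <$fh>) {
        chomp $record;                  # strips one trailing copy of $/
        next unless $record =~ /\S/;    # skip chunks that are only whitespace
        next if $record =~ $pattern;    # drop the unwanted records
        push @kept, $record;
    }
    return @kept;
}

# Process every file named on the command line; do nothing if none is given.
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    print "$_\n" for keep_records($fh, qr/CHUPACABRA/);
    close $fh;
}
```

Saved as, say, filter.pl (a hypothetical name), it would be run as perl filter.pl data/lib51.000, which avoids the quoting problems that one-liners tend to hit in the Windows command prompt.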

In Perl, the record separator is a literal string, not a regular expression. As the perlvar doc famously says:

Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)

Still, it looks like you can get away with $/="\n" x 10 or something like that:

perl -a -F\n -ne '$/="\n"x10;$\="\n";chomp;$regex="CHUPACABRA";
       print if /\S/ && !m/$regex/i;' data/lib51.000

Note the extra /\S/ &&, which will skip empty paragraphs from input that has more than 20 consecutive newlines.
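
If you would rather keep the regex behaviour of the awk RS, one workaround is to slurp each file and split it with a real regex instead of relying on $/. The sketch below assumes records are separated by nine or more newlines (the same threshold as the awk RS in the question) and that the files fit in memory. The name filter_records is hypothetical. Because the separator is captured, the surviving records keep their original spacing.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): remove every record
# matching $pattern, together with the separator that follows it.
# Assumption: records are separated by runs of 9 or more newlines.
sub filter_records {
    my ($text, $pattern) = @_;
    # The capturing group makes split return the separators as well, so the
    # pieces alternate: record, separator, record, separator, ...
    my @pieces = split /(\n{9,})/, $text;
    my $out = '';
    while (@pieces) {
        my $record = shift @pieces;
        my $sep    = @pieces ? shift @pieces : '';
        next if $record =~ $pattern;    # skip the record and its separator
        $out .= $record . $sep;
    }
    return $out;
}

# Process every file named on the command line; do nothing if none is given.
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    local $/;                           # undef $/ means slurp the whole file
    my $text = <$fh>;
    close $fh;
    print filter_records($text // '', qr/CHUPACABRA/i);
}
```

This also shows why the original one-liner printed nothing: as a string, the "+" at the end of $/ is a literal plus sign, so the separator never occurs in the file, everything from the second line onward is read as a single record, and that record contains CHUPACABRA.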

Also, have you considered just installing Cygwin and having awk available on your Windows machine?

There is no need for (much) conversion if you can download gawk for Windows.

Did you know that Perl comes with a program called a2p that does exactly what you described you want to do in your title?

And, if you have Perl on your machine, the documentation for this program is already there:

C> perldoc a2p

My own suggestion is to get the Llama book and learn Perl anyway. Despite what the Python people say, Perl is a great and flexible language. If you know shell, awk and grep, you'll understand many of the Perl constructs without any problems.
