How can I correctly process this file containing tab separated values in Perl?

I am fairly new to Perl and know next to nothing about Perl's 'proper' syntax.

I have a tab-delimited text file that I use every day, containing a listing of names and other info for our users. This file changes daily; sometimes it has two rows in it, and other times it has 100+ rows.

The file also varies between 6-9 columns of data in a row. I have put together a Perl script that uses the split function on tabs, but I run into a problem when row a has 5 columns and is followed by a row b that has 6 columns, all populated with data.

I cannot figure out how to get Perl to see that row a only has 5 columns of data and to continue parsing the text file from that point forward. It continues, but the output wraps lines strangely. How can I get around this issue? I hope that made sense.


You will have to post some code and possibly some sample data, but here is a script that parses rows of different lengths without issue.

Script:

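#!/usr/bin/perl
use strict;
use warnings;

# Read tab-separated rows from STDIN and print each row's fields joined by ";".
while (<STDIN>)
{
    chomp;
    my @info = split /\t/;    # one element per column, however many the row has
    print join(";", @info), "\n";
}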

Test File:

  jsmith  101     777-222-5555    Office 1        Building 1      Manager 
  aposse  104     777-222-5556    Office 2        Building 2      Stock Clerk 
  jbraza  105     777-222-5557    Office 3 
  mcuzui  102     777-222-5557    Office 3        Building 3      Cashier 
  ghines  107     777-222-5557    Office 3

Output:

%> test.pl < file.txt
jsmith;101;777-222-5555;Office 1;Building 1;Manager
aposse;104;777-222-5556;Office 2;Building 2;Stock Clerk
jbraza;105;777-222-5557;Office 3
mcuzui;102;777-222-5557;Office 3;Building 3;Cashier
ghines;107;777-222-5557;Office 3
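
To check how many columns Perl actually sees in each row, a small sketch along these lines may help. The helper name fields_of is hypothetical, and the two sample lines are taken from the test file above with Windows-style \r\n endings added. That ending is an assumption: no sample of the asker's file was posted, but a carriage return left behind by chomp is one possible cause of output that appears to wrap strangely.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: strip the line ending and split one line on tabs.
sub fields_of {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;             # remove LF or CRLF (assumed possible in the input)
    return split /\t/, $line, -1;     # -1 keeps trailing empty fields
}

# Two rows from the test file, written here with CRLF endings.
my @lines = (
    "jsmith\t101\t777-222-5555\tOffice 1\tBuilding 1\tManager\r\n",
    "jbraza\t105\t777-222-5557\tOffice 3\r\n",
);

for my $line (@lines) {
    my @info = fields_of($line);
    # Print the field count first so short rows are easy to spot.
    printf "%d fields: %s\n", scalar @info, join( ';', @info );
}
```

This prints "6 fields: ..." for the first row and "4 fields: ..." for the second, so each row is reported with exactly the columns it contains.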


You should post some sample data and code and explain desired behavior in terms of what the code currently does and what you want it to do. split will give you as many fields as there are in the input.

#!/usr/bin/perl

use strict; use warnings;

while ( my $row = <DATA> ) {
    last unless $row =~ /\S/;
    chomp $row;
    my @cells = split /\t/, $row;
    print "< @cells >\n";
}

__DATA__
1 2 3 4 5
a b c d e f
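
If every row should end up with the same number of columns, one option is to pad the short rows after splitting. The following is only a sketch: the helper name padded_fields and the width of 6 are assumptions chosen to match the two sample rows above, not something from the asker's script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return at least $width fields for a tab-separated line.
sub padded_fields {
    my ( $line, $width ) = @_;
    chomp $line;
    my @cells = split /\t/, $line, -1;        # -1 keeps trailing empty fields
    push @cells, '' while @cells < $width;    # pad short rows with empty strings
    return @cells;                            # longer rows are returned unchanged
}

# The two sample rows, written with explicit tabs; width 6 is an assumption.
for my $line ( "1\t2\t3\t4\t5\n", "a\tb\tc\td\te\tf\n" ) {
    print join( '|', padded_fields( $line, 6 ) ), "\n";
}
```

The first row comes out as `1|2|3|4|5|` (five values plus one empty cell) and the second as `a|b|c|d|e|f`, so downstream code can rely on six columns in both cases.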


The Text::CSV module can also be used to parse tab-separated values. In fact, Text::CSV can parse values delimited by any character.

Relevant excerpt from its POD:

The module accepts either strings or files as input and can utilize any user-specified characters as delimiters, separators, and escapes so it is perhaps better called ASV (anything separated values) rather than just CSV.

#!/usr/bin/env perl

use strict;
use warnings;

use Text::CSV;

my $csv = Text::CSV->new( { 'sep_char' => "\t" } );

open my $fh, '<', 'data.tsv' or die "Unable to open: $!";

my @rows;
while ( my $row_ref = $csv->getline($fh) ) {
    push @rows, $row_ref;
}

$csv->sep_char('|');
for my $row_ref (@rows) {
    $csv->combine(@$row_ref);
    print $csv->string(), "\n";
}