How can I create a CSV file from unequal length arrays in Perl?

I have to parse a file so that I can import it into Excel, so I thought the best way was to create a CSV file. In this file, I have to divide the contents into different categories and represent them in different columns. I have parsed the file to create different arrays corresponding to the categories. Now I am trying to create a CSV file from these arrays (I thought of using a for loop). The problem is that the arrays are of unequal length.

INPUT

NM_144736.3
NM_144963.1
XM_144975.2
BC144986.1
NM_144989.1
BC145001.1
XM_145018.2
NM_145015.2
XM_030711.2
AK145024.1
AK145030.1
NM_145034.1

I have used a regex to parse the data into different arrays: all the NM entries go to @array1, XM to @array2, BC to @array3, and AK to @array4. If creating arrays is not a good idea, please let me know what is. How else can I go about generating a CSV file from the above data?

Edit:

OUTPUT

NM_144736.3,XM_144975.2,BC144986.1,AK145024.1
NM_144963.1,XM_145018.2,BC145001.1,AK145030.1
NM_144989.1,XM_030711.2
NM_145015.2
NM_145034.1


Parse and write directly to an Excel spreadsheet, without importing:

use strict;
use warnings;
use Spreadsheet::WriteExcel;

my %hash;

# Parse the data into a hash of arrayrefs, keyed by the two-letter prefix.
# Assumes the input lines follow a __DATA__ marker at the end of the script.
chomp( my @lines = <DATA> );
push @{ $hash{ substr $_, 0, 2 } }, $_ for @lines;

# Create spreadsheet
my $workbook  = Spreadsheet::WriteExcel->new('perl.xls');
my $worksheet = $workbook->add_worksheet;

# Loop through the sorted hash keys
my @array = sort keys %hash;
for ( 0 .. $#array ) {

    # Write one column per key, starting at row 0
    $worksheet->write_col( 0, $_, $hash{ $array[$_] } );
}

# Close and save spreadsheet
$workbook->close;


Using parallel arrays like that is a bad idea. In fact, whenever you find yourself using names such as @array1, @array2 etc., recognize that it is a bad idea. And no, naming the arrays @NM, @XM etc. would not have made it better.

The way I see it, you have a single column of data and you have not specified how to split that single column into multiple columns. ... Nope, my mind-reading abilities fell short. Please post the desired output and don't leave it to our imagination to figure it out.

use strict; use warnings;
use List::AllUtils qw( each_arrayref);

my @fields = qw( NM XM BC AK );
my %data;

while ( <DATA> ) {
    chomp;
    if ( /^([A-Z]{2})_?[0-9]+\.[0-9]$/ ) {
        push @{ $data{$1} }, $_;
    }
}

print join(',', @fields), "\n";

my $it = each_arrayref @data{ @fields };

while ( my @values = $it->() ) {
    print join(',', map{ defined($_) ? $_ : '' } @values ), "\n";
}

__DATA__
NM_144736.3
NM_144963.1
XM_144975.2
BC144986.1
NM_144989.1
BC145001.1
XM_145018.2
NM_145015.2
XM_030711.2
AK145024.1
AK145030.1
NM_145034.1

Output:

NM,XM,BC,AK
NM_144736.3,XM_144975.2,BC144986.1,AK145024.1
NM_144963.1,XM_145018.2,BC145001.1,AK145030.1
NM_144989.1,XM_030711.2,,
NM_145015.2,,,
NM_145034.1,,,
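
If installing List::AllUtils is not an option, the same padding can be done with core Perl only. The following is a minimal sketch, not part of the original answer: the helper name columns_to_rows and the shortened sample data are made up for illustration. It finds the longest column and emits an empty field wherever a shorter column has run out.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of array refs (one per column) and
# returns CSV row strings, padding short columns with empty fields.
sub columns_to_rows {
    my @columns = @_;

    # The longest column decides how many rows are emitted.
    my $max = 0;
    for my $col (@columns) {
        $max = @$col if @$col > $max;
    }

    my @rows;
    for my $i ( 0 .. $max - 1 ) {
        # Reading past the end of a shorter column yields undef,
        # which is replaced by an empty string.
        push @rows, join ',',
            map { defined $_->[$i] ? $_->[$i] : '' } @columns;
    }
    return @rows;
}

# Shortened sample data (an assumption, for illustration only)
my %data = (
    NM => [qw( NM_144736.3 NM_144963.1 NM_144989.1 )],
    XM => [qw( XM_144975.2 XM_145018.2 )],
    BC => [qw( BC144986.1 )],
);

print "$_\n" for columns_to_rows( @data{qw( NM XM BC )} );
```

Note that a plain join is only safe here because the accession numbers contain no commas or quotes; for arbitrary data a proper CSV writer such as Text::CSV would be the safer choice.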