Perl help: read a file, do some math, write a file? New to Perl

I have a file that looks like this:

DATA_SET1
INFO1 INFO2 INFO3 = ### ### ###
INFO4 = ###
INFO5 = ###
INFO6 = ###
INFO7 = ###
INFO8 = ###
DATA_SET2
INFO1 INFO2 INFO3 = ### ### ###
INFO4 = ###
INFO5 = ###
INFO6 = ###
INFO7 = ###
INFO8 = ###
etc...

I need to compute some statistics on the numbers. For example: the average of the INFO4 number across DATA_SET1, DATA_SET2, etc. Then I have to write the average to another file:

STATISTICS:
INFO4 Average: ### 

I am VERY VERY new to Perl. This is probably very easy to do, I just have no idea where to start.

Thank you for your help!


First you need to get the file into a data structure. Something like this should work as long as the formatting of your file stays consistent.

#!/usr/bin/env perl

use strict;
use warnings;

use Data::Dumper; # for printing the hash of results at the end

my $file = $ARGV[0]; # Specify the file as first command line argument
die "Usage: $0 <file>\n" unless defined $file;
open my $fh, '<', $file or die "Cannot open '$file': $!";

my %data;
my $current_set = 'default';
while(my $line = <$fh>) {
  chomp $line;
  if ($line =~ /(DATA_SET\d+)/) {
    $current_set = $1;
  } elsif ($line =~ /=/) {

    my ($vars, $vals) = split(/\s*=\s*/, $line);

    # split ' ' skips leading whitespace and treats runs of spaces as one separator
    my @vars = split(' ', $vars);
    my @vals = split(' ', $vals);

    die "Number of variables does not match number of values on line $."
      unless (@vars == @vals);

    while (@vars) {
      my $var = shift @vars;
      my $val = shift @vals;
      $data{$current_set}{$var} = $val;
    }
  }
}

print Dumper \%data;

# This assumes that each DATA_SET has an INFO4 term.
#   N.B. Data sets without INFO4 are skipped rather than counted as zero.
my @INFO4 = grep { defined } map { $data{$_}{'INFO4'} } keys %data;
die "Nothing to average" unless @INFO4;

my $sum;
foreach (@INFO4) {
  $sum += $_;
}
my $av = $sum / scalar @INFO4;

print "$av\n";

At the end I just print the data structure that was built; you will need to do your homework here to put it to use (EDIT: added averaging over the INFO4 terms). perldoc is a good place to start. Also, if you need some high-powered math, I would look at the Perl Data Language (PDL), which implements a fast, Matlab-like array math language in Perl.
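To get from the hash to the output file the question asks for, something along these lines might work. This is only a sketch: the helper names average_of and write_statistics, the file name statistics.txt, the two-decimal formatting, and the sample numbers are all made up for illustration. It assumes the hash has the same shape as the %data built above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(sum);

# Average one field over every data set that defines it.
# Returns undef when no data set has the field.
sub average_of {
    my ($data, $key) = @_;
    my @vals = grep { defined } map { $data->{$_}{$key} } keys %$data;
    return @vals ? sum(@vals) / @vals : undef;
}

# Print a STATISTICS block for the requested fields to a filehandle.
sub write_statistics {
    my ($fh, $data, @keys) = @_;
    print {$fh} "STATISTICS:\n";
    for my $key (@keys) {
        my $av = average_of($data, $key);
        next unless defined $av;    # skip fields that never appear
        printf {$fh} "%s Average: %.2f\n", $key, $av;
    }
}

# Made-up sample values, shaped like the %data hash built above.
my %data = (
    DATA_SET1 => { INFO4 => 10, INFO5 => 1 },
    DATA_SET2 => { INFO4 => 20, INFO5 => 4 },
);

# Hypothetical output file name; change it to whatever you need.
my $out_file = 'statistics.txt';
open my $out, '>', $out_file or die "Cannot open '$out_file': $!";
write_statistics($out, \%data, 'INFO4');
close $out or die "Cannot close '$out_file': $!";
```

With the sample values, statistics.txt ends up containing "STATISTICS:" followed by "INFO4 Average: 15.00". Passing more field names to write_statistics adds one line per field.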

Good luck.


If you're just calculating the average of the INFO4 values, you could probably just use a regex to identify those lines and pull the values out. Here's a rough sample script that you might be able to use to get started (these aren't necessarily best practices, but I tried to make it clear what is going on). It reads a data file and adds to a running total whenever the line holds an INFO4 value. (I used substr on the assumption that there are no errors in the input data; again, this is just meant to be a rough answer that works for your sample case.) You may also want to use sprintf to round the result as needed (the average is currently printed as a float). Hope this helps.

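use strict;
use warnings;

my ($total, $count) = (0, 0);

open(my $in, '<', 'file.txt') or die "Unable to open input file: $!\n";
while (my $line = <$in>) {
    chomp($line);
    if ($line =~ m/INFO4/i) {
        # "INFO4 = " is 8 characters long, so the value starts at offset 8.
        # Note that int() drops any fractional part of the value.
        $total += int(substr($line, 8));
        $count++;
    }
}
close($in);
my $average = $count > 0 ? $total / $count : 0;

open(my $out, '>', 'output.txt') or die "Unable to open output file: $!\n";
print $out "INFO4 Average: $average\n";
close($out);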
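For the rounding mentioned above, a minimal sketch with sprintf could look like this. The helper name format_average and the value 44 / 3 are made up for illustration and are not taken from your data.

```perl
use strict;
use warnings;

# Round a number to a fixed number of decimal places (2 by default)
# and return it as a string.
sub format_average {
    my ($value, $places) = @_;
    $places //= 2;
    return sprintf("%.${places}f", $value);
}

my $average = 44 / 3;    # made-up value, roughly 14.6667
print "INFO4 Average: ", format_average($average), "\n";    # INFO4 Average: 14.67
```

sprintf only changes how the number is printed; keep the unrounded value around if you need it for further calculations.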


I'd start with Learning Perl or Programming Perl, and with the Perl Cookbook if you already know more than one language.
