How can I sum data over five minute time intervals in Perl?

2022-12-09 22:13 Q&A author:

I have a file in the format below.

DATE Time, v1,v2,v3
05:33:25,n1,n2,n3
05:34:25,n4,n5,n5
05:35:24,n6,n7,n8
and so on up to 05:42:25.

I want to sum the values v1, v2 and v3 over every 5-minute interval. I have written the sample code below.

while (<STDIN>) {
    my ($dateTime, $v1, $v2, $v3) = split /,/, $_;
    my ($date, $time) = split / /, $dateTime;
}

I can read all the values, but I need help summing them over every 5-minute interval. Can anyone suggest code that groups the rows by time and adds up the values for each 5-minute block?

Required output

05:33 v1(sum 05:33 to 05:37) v2(sum 05:33 to 05:37) v3(sum 05:33 to 05:37)
05:38 v1(sum 05:38 to 05:42) v2(sum 05:38 to 05:42) v3(sum 05:38 to 05:42)
and so on...

The code is a variation on the answer by Sinan Ünür below, except:

(1) The timelocal function lets you read in the day, month and year as well, so you can sum over any five-minute gap.

(2) It should deal with the case where the final time gap is less than 5 minutes.

#!/usr/bin/perl
use strict;
use warnings;
use Time::Local;
use POSIX qw(strftime);

my ( $start_time, $end_time, $current_time );
my ( $totV1,      $totV2,    $totV3 );          # running totals for the current time band

while (<DATA>) {
    # skip lines that do not look like HH:MM:SS,v1,v2,v3
    my ( $hour, $min, $sec, $v1, $v2, $v3 ) =
      /(\d+):(\d+):(\d+),(\d+),(\d+),(\d+)/
      or next;

    # convert the time of day to epoch seconds, using today's date
    $current_time =
      timelocal( $sec, $min, $hour, (localtime)[ 3, 4, 5 ] );    # sec,min,hr,day,mon,yr

    if ( !defined $end_time ) {
        # first record: open the first band
        $start_time = $current_time;
        $end_time   = $start_time + 5 * 60;    # band covers [start, start + 5 min)
    }
    if ( $current_time < $end_time ) {
        $totV1 += $v1;
        $totV2 += $v2;
        $totV3 += $v3;
    }
    else {
        # the record falls outside the band: print it and open a new one
        print strftime( "%H:%M:%S", localtime($start_time) ),
          " $totV1,$totV2,$totV3\n";
        $start_time = $current_time;
        $end_time   = $start_time + 5 * 60;    # plus 5 min
        ( $totV1, $totV2, $totV3 ) = ( $v1, $v2, $v3 );
    }
}

# Print the final (possibly partial) band, if any data was read
if ( defined $start_time ) {
    print strftime( "%H:%M:%S", localtime($start_time) ),
      " $totV1,$totV2,$totV3\n";
}

__DATA__
05:33:25,29,74,96
05:34:25,41,69,95
05:35:25,24,38,55
05:36:25,96,63,70
05:37:25,84,65,74
05:38:25,78,58,93
05:39:25,51,38,19
05:40:25,86,40,64
05:41:25,80,68,65
05:42:25,4,93,81

Output:

05:33:25 274,309,390
05:38:25 299,297,322
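
To make point (1) a bit more concrete, here is a minimal sketch of passing a full date to timelocal. It assumes timestamps of the form YYYY-MM-DD HH:MM:SS, which is not the format shown in the question, and the helper name to_epoch is made up for illustration.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Hypothetical helper: convert "YYYY-MM-DD HH:MM:SS" to epoch seconds.
# The input format is an assumption; the question only shows HH:MM:SS.
sub to_epoch {
    my ($stamp) = @_;
    my ( $y, $mon, $d, $h, $min, $s ) =
      $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
      or return;    # no match: return undef

    # timelocal expects a zero-based month (0 = January)
    return timelocal( $s, $min, $h, $d, $mon - 1, $y );
}

my $t1 = to_epoch('2022-12-09 23:58:25');
my $t2 = to_epoch('2022-12-10 00:01:25');

# The difference stays meaningful across midnight.
printf "%d seconds apart\n", $t2 - $t1;
```

With epoch seconds in hand, the five-minute banding in the loop above should work unchanged for data that spans several days.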

Obviously, not tested much, for lack of sample data. For parsing the CSV, use either Text::CSV_XS or Text::xSV rather than the naive split below.

Note:

This code does not make sure the output has all consecutive five minute blocks if the input data has gaps.
You will have problems if there are time stamps from multiple days. In fact, if the time stamps are not in 24-hour format, you will have problems even if the data are from a single day.

With those caveats, it should still give you a starting point.

#!/usr/bin/perl

use strict;
use warnings;

my $split_re = qr/ ?, ?/;
my @header = split $split_re, scalar <DATA>;    # last field keeps its newline
my @data;

my $time_block;    # start of the current block, in seconds since midnight

while ( my $data = <DATA> ) {
    last unless $data =~ /\S/;
    chomp $data;
    my ($ts, @vals) = split $split_re, $data;

    my ($hr, $min, $sec) = split /:/, $ts;
    my $secs = 3600*$hr + 60*$min + $sec;

    # open a new block on the first row, or once 300 seconds have passed
    if ( !defined $time_block or $secs >= $time_block + 300 ) {
        $time_block = $secs;
        push @data, [ $time_block ];
    }

    for my $i (1 .. @vals) {
        $data[-1]->[$i] += $vals[$i - 1];
    }
}

print join(', ', @header);
for my $row ( @data ) {
    my $ts = shift @$row;
    print join(', ',
        # gmtime, not localtime: $ts is seconds since midnight, not an epoch time
        sprintf('%02d:%02d', (gmtime($ts))[2,1])
        , @$row
    ), "\n";
}


__DATA__
DATE Time, v1,v2,v3
05:33:25,1,3,5
05:34:25,2,4,6
05:35:24,7,8,9
05:55:24,7,8,9
05:57:24,7,8,9

Output:

DATE Time, v1, v2, v3
05:33, 10, 15, 20
05:55, 14, 16, 18
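
On the first caveat (gaps in the input): one possible way to get every consecutive five-minute block is to compute a block index relative to the first timestamp and then print all indices, filling empty blocks with zeros. This is only a sketch. The name sum_with_gaps is made up, and it assumes the rows are in ascending time order within a single day.

```perl
use strict;
use warnings;

# Hypothetical helper: sum "HH:MM:SS,v1,v2,..." rows into consecutive
# 300-second blocks anchored at the first timestamp. Blocks without any
# rows are kept and reported with zero totals.
sub sum_with_gaps {
    my @rows = @_;
    my ( $first, $ncols, @blocks );

    for my $row (@rows) {
        my ( $ts, @vals ) = split /\s*,\s*/, $row;
        my ( $hr, $min, $sec ) = split /:/, $ts;
        my $secs = 3600 * $hr + 60 * $min + $sec;

        # remember the first timestamp and the number of value columns
        ( $first, $ncols ) = ( $secs, scalar @vals ) unless defined $first;

        my $idx = int( ( $secs - $first ) / 300 );    # block this row falls in
        $blocks[$idx][$_] += $vals[$_] for 0 .. $#vals;
    }
    return unless defined $first;

    my @out;
    for my $i ( 0 .. $#blocks ) {
        my $start = $first + 300 * $i;                # block start, secs since midnight
        my $blk   = $blocks[$i] || [];                # empty block if there was a gap
        my @tot   = map { $blk->[$_] // 0 } 0 .. $ncols - 1;
        push @out, join ', ',
          sprintf( '%02d:%02d', int( $start / 3600 ), int( $start % 3600 / 60 ) ),
          @tot;
    }
    return @out;
}

print "$_\n" for sum_with_gaps(
    '05:33:25,1,3,5', '05:34:25,2,4,6', '05:35:24,7,8,9',
    '05:55:24,7,8,9', '05:57:24,7,8,9',
);
```

Because the blocks here sit on a fixed grid starting at 05:33, the last two rows land in a block labelled 05:53 rather than 05:55, and the three empty blocks in between are printed with zeros.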

This is a good problem for Perl to solve. The hardest part is taking the value from the datetime field and identifying which 5 minute bucket it belongs to. The rest is just hashes.

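my (%v1,%v2,%v3);
while (<STDIN>) {
    chomp;
    my ($datetime,$v1,$v2,$v3) = split /,/, $_;
    # skip the header line and blank or incomplete lines
    next unless defined $v3 && $datetime =~ /\d+:\d+/;
    # the time is the last whitespace-separated part, with or without a leading date
    my $time = (split ' ', $datetime)[-1];
    my $bucket = get_bucket_for($time);
    $v1{$bucket} += $v1;
    $v2{$bucket} += $v2;
    $v3{$bucket} += $v3;
}
# bucket labels are zero-padded HH:MM, so a plain string sort is chronological
foreach my $bucket (sort keys %v1) {
    print "$bucket $v1{$bucket} $v2{$bucket} $v3{$bucket}\n";
}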

Here's one way you could implement &get_bucket_for:

my $first_hhmm;
sub get_bucket_for {
    my ($time) = @_;
    my ($hh,$mm) = split /:/, $time;  # looks like seconds are not important

    # buckets are five minutes apart, but not necessarily at multiples of 5 min
    # (i.e., buckets could go 05:33,05:38,... instead of 05:30,05:35,...)
    # Use the value from the first time this function is called to decide
    # what the starting point of the buckets is.
    if (!defined $first_hhmm) {
        $first_hhmm = $hh * 60 + $mm;
    }

    my $bucket_index = int(($hh * 60 + $mm - $first_hhmm) / 5);
    my $bucket_start = $first_hhmm + 5 * $bucket_index;
    return sprintf "%02d:%02d", $bucket_start / 60, $bucket_start % 60;

}

I'm not sure why you would use the times starting from the first time, instead of round 5 minute intervals (00 - 05, 05 - 10, etc), but this is a quick and dirty way to do it your way:

my %output;
my $last_min = -10; # -10 + 5 is less than any valid minute count
while (<STDIN>) {
    chomp;
    next unless /^\d+:\d+/;    # skip the header and blank lines
    my ($dt, $v1, $v2, $v3) = split(/,/, $_);
    my ($h, $m, $s) = split(/:/, $dt);
    my $ts = $m + ($h * 60);   # minutes since midnight; seconds are ignored
    # start a new bucket once five minutes have passed since the last bucket began
    if (($last_min + 5) <= $ts) {
        $last_min = $ts;
    }
    $output{$last_min}{1} += $v1;
    $output{$last_min}{2} += $v2;
    $output{$last_min}{3} += $v3;
}
foreach my $ts (sort {$a <=> $b} keys %output) {
    my $hour = int($ts / 60);
    my $minute = $ts % 60;
    printf("%02d:%02d v1(%i) v2(%i) v3(%i)\n", (
            $hour,
            $minute,
            $output{$ts}{1},
            $output{$ts}{2},
            $output{$ts}{3},
        ));
}

I am not sure why you would do it this way, but there it is in procedural Perl, as an example. For more on printf formatting, see perldoc -f sprintf.
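
For comparison, the round-interval alternative mentioned above (00 - 05, 05 - 10, etc.) might look roughly like this. It is a sketch only: round_bucket is a made-up name, and the rows are borrowed from the sample data in the earlier answer.

```perl
use strict;
use warnings;

# Hypothetical helper: label of the round five-minute bucket that contains
# an HH:MM:SS time, e.g. 05:33:25 falls in the bucket starting at 05:30.
sub round_bucket {
    my ($time) = @_;
    my ( $h, $m ) = split /:/, $time;
    return sprintf '%02d:%02d', $h, $m - $m % 5;    # round minutes down to a multiple of 5
}

my @rows = ( '05:33:25,1,3,5', '05:34:25,2,4,6', '05:35:24,7,8,9' );

my %sum;    # bucket label => array ref of column totals
for my $row (@rows) {
    my ( $time, @vals ) = split /,/, $row;
    my $bucket = round_bucket($time);
    $sum{$bucket}[$_] += $vals[$_] for 0 .. $#vals;
}

# zero-padded labels sort chronologically as plain strings
print "$_ @{ $sum{$_} }\n" for sort keys %sum;
```

Here the bucket depends only on the row itself, so no state has to be carried between rows and the input does not need to be sorted.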
