Adding more features to a Perl script

In the Perl script below, I compare my folder name (which is a date in a format like 11-08-31) with the current date. If they match, I process the folder. If there is no folder for today's date, the script also checks for the previous day's folder. I already asked a question like this here, but I need to make some changes and add new features as well:

  • The script checks for the previous date if today's folder is not found. But I need to check whether the previous date has already been processed so that I do not process it again. Do I need to keep a list for that?

  • The script checks only one previous date. What if I have to check the two previous days? Thanks for your help. I hope my questions are clear.

Updated: This Perl script runs automatically and compares the current date with the folder name. The folder is a tar folder which is loaded from another server.

So, basically, I need to run the script when the folder name matches the current date.

Problem: Sometimes I get the folder the next day, and my Perl script checks only for the current date. The folder I get is named with the previous date (not the current date), so I have to process the folder manually. I need to automate this in my Perl script.


#!/usr/bin/perl
use strict;
use warnings;
use Cwd;
use DateTime;
use File::Copy;

# set to your desired time zone
my $today = DateTime->now( time_zone => "America/New_York" );
my $td = $today->strftime("%y-%m-%d");

# strongly recommended to do date math in the 'floating'/UTC zone
my $yesterday = $today->set_time_zone('floating')->subtract( days => 1);
my $yd = $yesterday->set_time_zone('America/New_York')->strftime("%y-%m-%d");

my $dir = shift or die "Provide path on command line. $!";

if ($dir eq '.') {
    $dir = cwd;
}
elsif ($dir !~ /^\//) {
    $dir = cwd() . "/$dir"; 
}

opendir my $dh, $dir or die $!;
# readdir returns bare names, so test -d against the full path
my @dir = sort grep {-d "$dir/$_" and /$td/ || /$yd/} readdir $dh;
closedir $dh or die $!;
@dir or die "Found no date directories.";

my $dday = "$dir/$dir[-1]"; # is today unless today not found, then yesterday
my $fdir = '/some/example/path/';
my @gzfiles = glob("$dday/*tar.gz");

foreach my $zf (@gzfiles) {  
    next if (($zf =~ /BMP/) || ($zf =~ /LG/) || ($zf =~ /MAP/) || ($zf =~ /STR/)); 
    print "$zf\n";
    copy($zf, $fdir) or die "Unable to copy. $!";
}


Well, another way to do it, as suggested by mugen kenichi, is to use Storable. This approach stores a hash of all processed directories. When you run your program, it can check the hash to see whether a directory has already been processed.

You would need a one-time script to set up the hash of processed directories.

#!/usr/bin/perl
use strict;
use warnings;
use Storable;

# This script to be run 1 time only. Sets up 'processed' directories hash.
# After this script is run, ready to run the daily script.

my $dir = '.'; # or what ever directory the date-directories are stored in

opendir my $dh, $dir or die "Opening failed for directory $dir $!";
my @dir = grep {-d "$dir/$_" && /^\d\d-\d\d-\d\d$/ && $_ le '11-04-21'} readdir $dh;
closedir $dh or die "Unable to close $dir $!";

my %processed = map {$_ => 1} @dir;

store \%processed, 'processed_dirs.dat';

Then, a script to be run periodically to find and process your date directories.

#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;
use Storable;

my $dir = shift or die "Provide path on command line. $!";

my $processed = retrieve('processed_dirs.dat'); # $processed is a hashref

opendir my $dh, $dir or die "Opening failed for directory $dir $!";
my @dir = grep {-d "$dir/$_" && /^\d\d-\d\d-\d\d$/ && !$processed->{$_} } readdir $dh;
closedir $dh or die "Unable to close $dir $!";
@dir or die "Found no unprocessed date directories";

my $fdir = '/some/example/path';

for my $date (@dir) {
    my $dday = "$dir/$date";
    my @gzfiles = glob("$dday/*tar.gz");

    foreach my $zf (@gzfiles) {  
        next if $zf =~ /BMP/ || $zf =~ /LG/ || $zf =~ /MAP/ || $zf =~ /STR/; 
        print "$zf\n";
        copy($zf, $fdir) or die "Unable to copy $zf to $fdir. $!";
    }
    $processed->{ $date } = 1;
}

store $processed, 'processed_dirs.dat';
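
If you only want to look back a fixed number of days (the question mentions two) rather than pick up every unprocessed directory, one option is to build the list of candidate names first. The following is only a minimal sketch and not part of the scripts above: recent_date_names is a hypothetical helper, and it uses the core POSIX module in place of DateTime, assuming the server's local time zone is the one you care about.

```perl
use strict;
use warnings;
use POSIX qw(strftime mktime);

# Return the yy-mm-dd names for today and the previous $days_back days,
# newest first. $now is an epoch time and defaults to the current time.
sub recent_date_names {
    my ($days_back, $now) = @_;
    $now = time unless defined $now;

    # Anchor at local noon so that stepping back in 24-hour units
    # cannot skip or repeat a date across a daylight-saving change.
    my @t    = localtime $now;
    my $noon = mktime(0, 0, 12, @t[3, 4, 5]);

    return map { strftime('%y-%m-%d', localtime($noon - $_ * 86_400)) }
           0 .. $days_back;
}

# Today plus the two previous days.
print join(' ', recent_date_names(2)), "\n";
```

The returned names could then be turned into a lookup hash, for example my %want = map { $_ => 1 } recent_date_names(2);, and $want{$_} added to the grep condition alongside !$processed->{$_}.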


If you want to persist the status of whether these directories were processed beyond a single run of your app, you could create a .processed file in each directory and check for the existence of this file before you process the directory.
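
A minimal sketch of that marker-file idea might look like the following. The sub names (is_processed, mark_processed, unprocessed_dirs) are hypothetical helpers made up for illustration, and the yy-mm-dd directory naming is assumed from the question.

```perl
use strict;
use warnings;

# Marker file name, as suggested above.
my $MARKER = '.processed';

# True if the directory already carries the marker file.
sub is_processed {
    my ($path) = @_;
    return -e "$path/$MARKER";
}

# Create an empty marker file once the directory has been handled.
sub mark_processed {
    my ($path) = @_;
    open my $fh, '>', "$path/$MARKER"
        or die "Unable to create $path/$MARKER: $!";
    close $fh or die "Unable to close $path/$MARKER: $!";
    return;
}

# Return the date-named subdirectories of $dir that have no marker yet,
# oldest first.
sub unprocessed_dirs {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Opening failed for directory $dir: $!";
    my @dirs = sort grep {
        /^\d\d-\d\d-\d\d$/ && -d "$dir/$_" && !is_processed("$dir/$_")
    } readdir $dh;
    closedir $dh or die "Unable to close $dir: $!";
    return @dirs;
}
```

A daily script would loop over unprocessed_dirs($dir), copy the files it needs, and call mark_processed("$dir/$date") only after the copy succeeded, so a failed run is retried next time.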

If you just need to store the status of these directories (processed or unprocessed) during the execution of your script, you could use a hash keyed with the directory name:

my %PROCESSED = ();

if ($processing_done) {
  $PROCESSED{$dirname} = 1;  # a single hash element takes the $ sigil
} else {
  $PROCESSED{$dirname} = 0;
}

You can check to see if each directory has been processed by reading the key value from the hash:

if (!$PROCESSED{$dirname}) {
  # ... do some processing
} else {
  # ... this one is already done
}


This solution finds all directories that are newer than the most recently processed directory date and have yet to be processed. You have to record that date manually the first time (before the script is run). The script will update it from that point on.

The file could be named as in my $last = 'dir_last.dat';. I just created the file at the command line like this:

C:\Old_Data\perlp>echo 11-07-14 > dir_last.dat

C:\Old_Data\perlp>type dir_last.dat
11-07-14

This assumes the newest processed directory was 11-07-14. You must find this out yourself before running the script.

#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;

my $dir = shift or die "Provide path on command line. $!";

my $last = 'dir_last.dat';

open my $fh, "<", $last or die "Unable to open $last $!";
chomp(my $last_proc = <$fh>);
close $fh or die "Unable to close $last $!";

opendir my $dh, $dir or die "Opening failed for directory $dir $!";
my @dir = sort grep {-d "$dir/$_" && /^\d\d-\d\d-\d\d$/ && $_ gt $last_proc} readdir $dh;
closedir $dh or die "Unable to close $dir $!";
@dir or die "Found no date directories after last update: $last_proc";

my $fdir = '/some/example/path';

for my $date (@dir) {
    my $dday = "$dir/$date";
    my @gzfiles = glob("$dday/*tar.gz");

    foreach my $zf (@gzfiles) {  
        next if $zf =~ /BMP/ || $zf =~ /LG/ || $zf =~ /MAP/ || $zf =~ /STR/; 
        print "$zf\n";
        copy($zf, $fdir) or die "Unable to copy $zf to $fdir. $!";
    }
}

open  $fh, ">", $last or die "Unable to open $last $!";
print $fh "$dir[-1]\n"; # record the newest date-directory as processed
close $fh or die "Unable to close $last $!";
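
The gt test in the grep works because zero-padded yy-mm-dd names sort as plain strings in the same order as the dates they represent (as long as all of them fall in the same century). Here is a small self-contained sketch of just that comparison; newer_than is a hypothetical helper used only for illustration.

```perl
use strict;
use warnings;

# Return the date names that are later than $last_proc, oldest first.
# Plain string comparison is enough for zero-padded yy-mm-dd names.
sub newer_than {
    my ($last_proc, @names) = @_;
    return sort grep { /^\d\d-\d\d-\d\d$/ && $_ gt $last_proc } @names;
}

my @found = newer_than('11-07-14',
                       qw(11-08-01 11-07-13 11-07-14 11-07-15 notes));
print "$_\n" for @found;   # prints 11-07-15, then 11-08-01
```

This would not work for names like dd-mm-yy, where the string order and the date order differ.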

Notice that I didn't rely on cwd as the first script did. It really wasn't needed there and isn't needed here: opendir, glob and copy can all handle the dot (cwd) directory and relative paths.

The header includes the lines use strict; and use warnings;. Their purpose is to alert you to errors in your code (almost all Perl scripts should use them unless an expert decides to exclude them, for what reason I don't know). The first line tells Unix where to find the interpreter (perl).
