How do I get an alphabetically sorted list of all HTML files in all subdirectories in Perl?

Currently I am doing this:

# Find all files
File::Find::find(
    sub {
        my $file = $_;
        return if -d $file;
        return if $file !~ /\.html?$/;    # keep only .htm and .html files

        ...my processing code

    }, $inputdir
);

But I want to process the files in alphabetical order. Ultimately I'd like to store all the file names in an array, sort the array, and then run my processing code in a foreach loop over it, but I am completely stuck on how to do that.

I've done lots of googling, but like everything else in Perl there are hundreds of ways to do it, and none of them seem to cover everything I want: all files ending in .html, in all subdirectories of a specific directory, sorted alphabetically by file name rather than by directory structure.

Can anyone help me out? I know this can be done fairly easily, I just cannot figure it out.

Thanks :)

edit: I've tried doing this:

File::Find::find(
    sub {
        #Only process html files
        my $file = $_;
        return if -d $file;
        return if $file !~ /\.html?$/;

        push(@files, $File::Find::name);

    }, $inputdir 
);

But if I then sort the array @files, it is sorted on the entire path string, and I just want to sort on the file name. I don't think there is a way to do this inside File::Find::find itself, as it cannot know the order until it has traversed all the files, so I need to do the sort afterwards.


You can use File::Basename (which parses file paths into directory, filename and suffix) together with a Schwartzian transform to sort the files by file name, like this:

 use File::Basename;

 @files = map { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map { [$_, fileparse($_, qr/\.html?/)] } @files;

The fileparse() routine of File::Basename splits a file path into its file name, its directory part and (optionally) its suffix, returned in that order. The file name is what gets used as the sort key in the Schwartzian transform.
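
Putting the pieces together, a minimal sketch of the whole flow (collect with File::Find, sort by file name, then loop) might look like the following. The helper name html_files_by_name is made up for this example, and breaking ties on the full path is an assumption of mine, not something the question asked for.

```perl
use strict;
use warnings;
use File::Find ();
use File::Basename qw(fileparse);

# Hypothetical helper (not from the original post): returns every .htm/.html
# file under $dir, ordered by file name rather than by full path.
sub html_files_by_name {
    my ($dir) = @_;
    my @found;
    File::Find::find(
        sub {
            return unless -f $_;               # skip directories and other non-files
            return unless /\.html?\z/;         # keep .htm and .html only
            push @found, $File::Find::name;    # remember the full path
        },
        $dir,
    );

    # Schwartzian transform: pair each path with its file name (suffix removed),
    # sort on that name, then unwrap. Assumption: ties fall back to the full path.
    return map  { $_->[0] }
           sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] }
           map  { [ $_, ( fileparse( $_, qr/\.html?/ ) )[0] ] }
           @found;
}

# Process each directory given on the command line, in file-name order.
for my $dir (@ARGV) {
    for my $path ( html_files_by_name($dir) ) {
        print "$path\n";    # the real processing code would go here
    }
}
```

Because the sort happens after the traversal has finished, the order no longer depends on which directory File::Find happened to visit first.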


File::Next has sort options.


Another solution is to cache the sort keys in a hash: first get each file name with File::Basename and store it in a cache, then simply sort on the cached values, i.e.:

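use File::Basename;

my %cache;
foreach my $file (@files) {
    # list assignment keeps only the first value fileparse returns, the file name
    ($cache{$file}) = fileparse($file, qr/\.html?/);
}
@files = sort { $cache{$a} cmp $cache{$b} } @files;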
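
To see how this differs from a plain sort on the full path, here is a small self-contained sketch. The sample paths and the function name sort_by_cached_name are invented for illustration, and the tie-break on the full path is again my own assumption.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical sample paths, for illustration only.
my @paths = ( 'site/b/index.html', 'site/a/zebra.html', 'site/c/about.htm' );

# Sort paths by file name, computing each name once and caching it in a hash.
sub sort_by_cached_name {
    my @list = @_;
    my %name_of;
    ( $name_of{$_} ) = fileparse( $_, qr/\.html?/ ) for @list;
    return sort { $name_of{$a} cmp $name_of{$b} or $a cmp $b } @list;
}

print "plain sort:   $_\n" for sort @paths;
print "by file name: $_\n" for sort_by_cached_name(@paths);
```

With these three paths the plain sort is driven by the directory names (a, b, c), while the cached sort is driven by the file names (about, index, zebra), so the two orders come out reversed.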


This will not be a performance winner, but it is included to show off the excellent File::Find::Rule, and it might be fun and acceptable for small file trees. It also uses Path::Class.

use warnings;
use strict;
use File::Find::Rule;
use Path::Class qw( file );

my @files = map { file($_) }
    File::Find::Rule->file()
    ->name('*.html')
    ->in(shift||".");

for my $file ( sort { lc($a->basename) cmp lc($b->basename) } @files )
{
    print $file, $/;
}