开发者

How can I sort an array so that certain file extensions sort to the top?

I have an array containing a list of files. I want to sort it in a way that it will let me have .txt files in the beginning of the array and the rest of files after that.

This is what I'm doing now, which works fine.

@files = (grep(/\.txt$/,@files),grep(!/\.txt$/,@files));

Is there a be开发者_运维百科tter way to do it though?


You asked a follow-up comment about doing this for more than one file extension. In that case, I'd build off the Schwartzian Transform. If you're new to the ST, I recommend Joseph Hall's explanation in Effective Perl Programming. Although the Second Edition is coming out very soon, we basically left his explanation as is so the first edition is just as good. Google Books seems to only show one inch of each page for the first edition, so you're out of luck there.

In this answer, I use a weighting function to decide which extensions should move to the top. If an extension doesn't have an explicit weight, I just sort it lexigraphically. You can fool around with the sort to get exactly the order that you want:

@files = qw(
    buster.pdf
    mimi.xls
    roscoe.doc
    buster.txt
    mimi.txt
    roscoe.txt
    buster.rpm
    mimi.rpm
    );

my %weights = qw(
    txt 10
    rpm  9
    );

my @sorted = 
    map { $_->{name} }
    sort { 
        $b->{weight} <=> $a->{weight}
         ||
        $a->{ext}    cmp $b->{ext}
         ||
        $a cmp $b
        }
    map {
        my( $ext ) = /\.([^.]+)\z/;
            { # anonymous hash constructor
            name => $_,
            ext => $ext,
            weight => $weights{$ext} || 0,
            }
        }
    @files;

$" = "\n";
print "@sorted\n";


 

@sorted = sort { $b=~/\.txt$/ <=> $a=~/\.txt$/  ||  $a cmp $b } @files

will put .txt files first and otherwise sort lexicographically (alphabetically).

@sorted = sort { $b=~/\.txt$/ <=> $a=~/\.txt$/ } @files

will put .txt files first and otherwise preserve the original order (sort is stable since Perl 5.8)


You just need to add a sort in front of each of your greps:

 my @sorted =
   (
   sort( grep /\.txt\z/,   @files ),
   sort( grep ! /\.txt\z/, @files )
   );

The trick here is that you are partitioning the list then sorting each partition independently. Depending on what you are doing, this might be a lot better than trying to do everything in one sort operation. Conversely, it might not always be better.

There are various other ways to get this done, but they aren't this simple. :)

Here's a quick benchmark on my MacBook Air with vanilla Perl 5.10.1:

There are 600 files to sort
     brian:  3 wallclock secs @ 369.75/s (n=1161)
   control:  3 wallclock secs @ 1811.99/s (n=5744)
      leon:  4 wallclock secs @ 146.98/s (n=463)
   mobrule:  3 wallclock secs @ 101.57/s (n=324)
      sort:  4 wallclock secs @ 559.62/s (n=1746)

Here's the script:

use Benchmark;

use vars qw(@files);

@files = qw(
    buster.pdf
    mimi.xls
    roscoe.doc
    buster.txt
    mimi.txt
    roscoe.txt
    ) x 100;


printf "There are %d files to sort\n", scalar @files;

sub leon {  
    my @sorted = 
        map { $_->[0] } 
        sort { $a->[1] <=> $b->[1] } 
        map { [ $_, !/\.txt$/ ] 
        } @files;
    }

sub brian {
     my @sorted =
       (
       sort( grep /\.txt\z/,   @files ),
       sort( grep ! /\.txt\z/, @files )
       );
    }

sub mobrule {
    my @sorted = 
        sort { ($b=~/\.txt\z/) <=> ($a=~/\.txt\z/)  ||  $a cmp $b } 
        @files;
    }

sub plain_sort {
    my @sorted = sort @files;
    }

sub control {
    my @sorted = @files;
    }

timethese( -3,
     {
     brian   => \&brian,
     leon    => \&leon,
     mobrule => \&mobrule,
     control => \&control,
     sort    => \&plain_sort,
     }
     );


Sort takes an optional block as first argument, though in this case a Schwartzian transform would be quicker.

@files = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, !/\.txt$/ ] } @files;


To handle multiple extensions efficiently, you could modify brian d foy's sorted greps by partitioning your array in one pass, and then sort each partition independently.

use strict;
use warnings;

use List::MoreUtils qw(part);

my @files = qw(
    bar        Bar.pm       bar.txt
    bar.jpeg   foo          foo.pm
    foo.jpeg   zebra.txt    zebra.pm
    foo.bat    foo.c        foo.pl
    Foo.pm     foo.png      foo.tt
    orange     apple        zebra.stripe
);


my @parts = part { get_extension_priority($_) } @files;

my @sorted = map { sort( @{ $_ || [] } ) } @parts; 

print map "$_\n", @sorted;

BEGIN {

    # Set extension priority order
    my @priority = qw( stripe txt nomatch pl jpeg  );

    # make a hash to look up priority by extension
    my %p = map { $priority[$_], $_ } 0..$#priority;

    sub get_extension_priority {
        my $file = shift;

        return scalar @priority 
            unless /[.](\w*)$/;

        return scalar @priority 
            unless exists $p{$1};

        return $p{$1};
    }
}


Code golf? This will not produce nasty warnings:

@files = map { $_->[0] } sort { @$b <=> @$a } map { [$_, /\.txt$/] } @files
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜