How can I sort an array so that certain file extensions sort to the top?
I have an array containing a list of files. I want to sort it in a way that it will let me have .txt files in the beginning of the array and the rest of files after that.
This is what I'm doing now, which works fine.
@files = (grep(/\.txt$/,@files),grep(!/\.txt$/,@files));
Is there a be开发者_运维百科tter way to do it though?
You asked a follow-up comment about doing this for more than one file extension. In that case, I'd build off the Schwartzian Transform. If you're new to the ST, I recommend Joseph Hall's explanation in Effective Perl Programming. Although the Second Edition is coming out very soon, we basically left his explanation as is so the first edition is just as good. Google Books seems to only show one inch of each page for the first edition, so you're out of luck there.
In this answer, I use a weighting function to decide which extensions should move to the top. If an extension doesn't have an explicit weight, I just sort it lexigraphically. You can fool around with the sort to get exactly the order that you want:
@files = qw(
buster.pdf
mimi.xls
roscoe.doc
buster.txt
mimi.txt
roscoe.txt
buster.rpm
mimi.rpm
);
my %weights = qw(
txt 10
rpm 9
);
my @sorted =
map { $_->{name} }
sort {
$b->{weight} <=> $a->{weight}
||
$a->{ext} cmp $b->{ext}
||
$a cmp $b
}
map {
my( $ext ) = /\.([^.]+)\z/;
{ # anonymous hash constructor
name => $_,
ext => $ext,
weight => $weights{$ext} || 0,
}
}
@files;
$" = "\n";
print "@sorted\n";
@sorted = sort { $b=~/\.txt$/ <=> $a=~/\.txt$/ || $a cmp $b } @files
will put .txt files first and otherwise sort lexicographically (alphabetically).
@sorted = sort { $b=~/\.txt$/ <=> $a=~/\.txt$/ } @files
will put .txt files first and otherwise preserve the original order (sort
is stable since Perl 5.8)
You just need to add a sort in front of each of your greps:
my @sorted =
(
sort( grep /\.txt\z/, @files ),
sort( grep ! /\.txt\z/, @files )
);
The trick here is that you are partitioning the list then sorting each partition independently. Depending on what you are doing, this might be a lot better than trying to do everything in one sort operation. Conversely, it might not always be better.
There are various other ways to get this done, but they aren't this simple. :)
Here's a quick benchmark on my MacBook Air with vanilla Perl 5.10.1:
There are 600 files to sort
brian: 3 wallclock secs @ 369.75/s (n=1161)
control: 3 wallclock secs @ 1811.99/s (n=5744)
leon: 4 wallclock secs @ 146.98/s (n=463)
mobrule: 3 wallclock secs @ 101.57/s (n=324)
sort: 4 wallclock secs @ 559.62/s (n=1746)
Here's the script:
use Benchmark;
use vars qw(@files);
@files = qw(
buster.pdf
mimi.xls
roscoe.doc
buster.txt
mimi.txt
roscoe.txt
) x 100;
printf "There are %d files to sort\n", scalar @files;
sub leon {
my @sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, !/\.txt$/ ]
} @files;
}
sub brian {
my @sorted =
(
sort( grep /\.txt\z/, @files ),
sort( grep ! /\.txt\z/, @files )
);
}
sub mobrule {
my @sorted =
sort { ($b=~/\.txt\z/) <=> ($a=~/\.txt\z/) || $a cmp $b }
@files;
}
sub plain_sort {
my @sorted = sort @files;
}
sub control {
my @sorted = @files;
}
timethese( -3,
{
brian => \&brian,
leon => \&leon,
mobrule => \&mobrule,
control => \&control,
sort => \&plain_sort,
}
);
Sort takes an optional block as first argument, though in this case a Schwartzian transform would be quicker.
@files = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, !/\.txt$/ ] } @files;
To handle multiple extensions efficiently, you could modify brian d foy's sorted grep
s by partitioning your array in one pass, and then sort each partition independently.
use strict;
use warnings;
use List::MoreUtils qw(part);
my @files = qw(
bar Bar.pm bar.txt
bar.jpeg foo foo.pm
foo.jpeg zebra.txt zebra.pm
foo.bat foo.c foo.pl
Foo.pm foo.png foo.tt
orange apple zebra.stripe
);
my @parts = part { get_extension_priority($_) } @files;
my @sorted = map { sort( @{ $_ || [] } ) } @parts;
print map "$_\n", @sorted;
BEGIN {
# Set extension priority order
my @priority = qw( stripe txt nomatch pl jpeg );
# make a hash to look up priority by extension
my %p = map { $priority[$_], $_ } 0..$#priority;
sub get_extension_priority {
my $file = shift;
return scalar @priority
unless /[.](\w*)$/;
return scalar @priority
unless exists $p{$1};
return $p{$1};
}
}
Code golf? This will not produce nasty warnings:
@files = map { $_->[0] } sort { @$b <=> @$a } map { [$_, /\.txt$/] } @files
精彩评论