The fastest way (execution time) to find the longest element in an list

2023-01-24 21:55 问答作者：

Is this the fastest (execution time) way to find the longest element in a list?

#!/usr/bin/env perl
use warnings;
use 5.012;
use List::Util qw(reduce);
use List::Util::XS;

my @array = qw( one two three four five six seven eight nine ten eleven 开发者_如何学Go);

my $l = reduce{ length($a) > length($b) ? $a : $b } @array;

say $l;

When only trying to find one element of a list, there is no need to construct an N sized data structure as many answers here have done. The fastest O(N) way to do this is to walk the array, keeping track of the largest element. That way you have O(N) accesses of the list, and O(1) memory usage.

sub longest {
    my $max = -1;
    my $max_i = 0;
    for (0 .. $#_) {              # for each index
        my $len = length $_[$_];  # only get length once per item
        if ($len > $max) {        # save index and update max if larger
            $max = $len;
            $max_i = $_;
        }
    }
    $_[$max_i]   # return the largest item
}

If you are going to be running the above code many times, I would suggest inlining the body of the subroutine.

EDIT:

drewk's benchmark revealed that the array index in the above code is a bit of a bottleneck. Experimenting a little more, I have finally found a method that is faster than the reduce solution:

sub fastest {
    my $max = -1;
    my $max_ref;
    for (@_) {
        if (length > $max) {  # no temp variable, length() twice is faster
            $max = length;
            $max_ref = \$_;   # avoid any copying
        }
    }
    $$max_ref
}

which results in the following benchmark:

           Rate longest   drewk  reduce fastest
longest 44245/s      --    -21%    -30%    -47%
drewk   55854/s     26%      --    -11%    -33%
reduce  63014/s     42%     13%      --    -25%
fastest 83638/s     89%     50%     33%      --

Here is slightly modified version of OMG_peanuts with for and less variables:

my $len = length $array[0];
my $longest = 0;
for my $i (1 .. $#array) {
    my $i_len = length $array[$i];
    if($i_len > $len) {
        $longest = $i;
        $len = $i_len;
    }
}
my $l = $array[$longest];

I was playing a bit with benchmarks, getting this for small numbers (original array)

           Rate REDUCE TMPVAR TMPFOR
REDUCE 234862/s     --    -0%    -7%
TMPVAR 235643/s     0%     --    -6%
TMPFOR 251326/s     7%     7%     --

and this for larger number or items (original array x 100)

         Rate TMPVAR TMPFOR REDUCE
TMPVAR 3242/s     --   -28%   -32%
TMPFOR 4503/s    39%     --    -5%
REDUCE 4750/s    47%     5%     --

Note that suitability of algorithm heavily varies due to data specifics (I would guess longer strings may increase weight of length function in algorithm).

EDIT: Here is full code for the benchmark (long array version, short is missing x 100 in array definition)

use Benchmark  qw(:all);
use List::Util qw(reduce);

my @array = qw( one two three four five six seven eight nine ten eleven ) x 100;

cmpthese(-2, {
    REDUCE => sub {
        my $l = reduce{ length($a) gt length($b) ? $a : $b } @array;
    },
    TMPVAR => sub {
        my $idx = 1;
        my $lastLength = length $array[0];
        my $lastElt = $array[0];
        my $listLength = scalar @array;
        while ($idx < $listLength) {
          my $tmpLength = length $array[$idx];
          if ($tmpLength > $lastLength) {
            $lastElt = $array[$idx];
            $lastLength = $tmpLength
          }
          $idx++
        }
        my $l = $lastElt;
    },
    TMPFOR => sub {
        my $len = length $array[0];
        my $longest = 0;
        for my $i (1 .. $#array) {
            my $i_len = length $array[$i];
            if($i_len > $len) {
                $longest = $i;
                $len = $i_len;
            }
        }
        my $l = $array[$longest];
    },
});

My fastest is:

sub drewk {
    my $len = -1;
    for (@_) {
        my $tmp=length($_);
        if ( $tmp > $len ) {
            $longest = $_;
            $len = $tmp;
        }
    }
    return $longest;
}

But benchmarking against:

sub strom {
    my $max = -1;
    my $max_i = 0;
    for (0 .. $#_) {              # for each index
        my $len = length $_[$_];  # only get length once per item
        if ($len > $max) {        # save index and update max if larger
            $max = $len;
            $max_i = $_;
        }
    }
    $_[$max_i]   # return the largest item
}

sub red {
    return reduce{ length($a) > length($b) ? $a : $b } @_;
}

Shows that reduce is fastest:

            Rate  strom  drewk reduce
strom  1323455/s     --   -38%   -45%
drewk  2144549/s    62%     --   -10%
reduce 2390707/s    81%    11%     --

The other benchmark is Eric Strom's sub

A little golfish:

my @unsorted = qw( one two three four five six seven eight nine ten eleven );
my $longest =  (
    map { $_->[0] } 
    sort { $b->[1] <=> $a->[1] } 
    map { [ $_, length $_ ] } @unsorted 
)[0];

say $longest;

EDIT: the map/sort/map is a Schwartzian transform for anyone unfamiliar with that technique and wondering.

Assuming the goal is to just find the longest string, not its index:

my $longest = $array[0];
my $len = length $longest;
for my $str (@array) {
    if ( length($str) > $len ) {
        $longest = $str;
        $len = length($str);
    }
}

If you really want to cut-off number of computed length's, then take a look at Schwartian transform and adopt it to your problem.

EDIT:

I see that no one posted complete example I meant, so here it is (I haven't benchmarked it yet as I'm not in from of my personal computer):

my @array = qw( one two three four five six seven eight nine ten eleven );
my $longest = (
                reduce { $a->[1] > $b->[1] ? $a : $b } 
                map { [ $_, length $_ ] }
                @array
              )[0];

say $longest;

This appears to be significantly faster than other solutions (based on fastest_Eric_Storm),

use warnings;
use 5.012;
use Benchmark qw(:all) ;
use List::Util qw(reduce);

my @array = map { ($_) x 50 } qw( one two three four five six seven eight nine ten eleven );

sub list_util_xs {
    my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
    return $l;
}

sub fastest_Eric_Strom {
    my $max = -1; my $max_ref;
    for (@array) {
        if (length > $max) {
            $max = length;
            $max_ref = \$_;
        }
    }
    return $$max_ref;
}

sub ysth {
    my $longest = $array[0];
    my $len = length $longest;
    for my $str (@array) {
    if ( length($str) > $len ) {
        $longest = $str;
        $len = length($str);
    }
    }
    return $longest;
}

sub mpapec {
    my $max = -1;
    my $max_ref;
    length > $max and ($max, $max_ref) = (length, \$_) for @array;
    return $$max_ref;
}

cmpthese( -10, {
    'list_util_xs'  => sub{ list_util_xs() },
    'fastest_Eric_Storm' => sub{ fastest_Eric_Strom() },
    'ysth'      => sub{ ysth() },
    'mpapec'    => sub{ mpapec() },
});

output

                      Rate list_util_xs fastest_Eric_Storm       ysth     mpapec
list_util_xs       13479/s           --               -24%       -24%       -29%
fastest_Eric_Storm 17662/s          31%                 --        -0%        -6%
ysth               17680/s          31%                 0%         --        -6%
mpapec             18885/s          40%                 7%         7%         --

You could use some temp var to avoid calculing length again and again:

my @unsorted = qw( one two three four five six seven eight nine ten eleven ); 
my $idx = 1;
my $lastLength = length $unsorted[0];
my $lastElt = $unsorted[0];
my $listLength = scalar @unsorted;
while ($idx < $listLength) {
  my $tmpLength = length $unsorted[$idx];
  if ($tmpLength > $lastLength) {
    $lastElt = $unsorted[$idx];
    $lastLength = $tmpLength
  }
  $idx++
}
print "Longest element:$lastElt";

Benchmark results :

           Rate REDUCE TMPVAR
REDUCE 169297/s     --   -29%
TMPVAR 237926/s    41%     --

First I tested if all routines give me the right result. MBO's routine didn't pass the first test (it returns an array-reference); to give him a second change I modified the routine to get the right result.

I run the benchmark some times and I didn't always get the same order. So I would say ( as already sed here ) ysth's and fasest_Eric_Strom are the fastest but list_utils reduce is almost as fast as they are; what's easy to read from the results is that David Precious sort-version is the slowest and MBO's modified reduce-version is second slowest.

My conclusion: list_utils reduce is the winner of the best price-performance ratio.
edit: I was too fast with the prize giving ceremony: List::Util - reduce - length - encoding - question

David_Precious      64147/s             -- -36%        -73%               -79%  -80% -81%         -85%               -86% -87%
MBO                100195/s            56%   --        -58%               -67%  -69% -70%         -77%               -79% -80%
OMG_peanuts        237772/s           271% 137%          --               -21%  -27% -30%         -45%               -50% -52%
longest_Eric_Strom 300466/s           368% 200%         26%                 --   -8% -11%         -31%               -36% -40%
drewk              325883/s           408% 225%         37%                 8%    --  -4%         -25%               -31% -34%
bvr                338156/s           427% 237%         42%                13%    4%   --         -22%               -28% -32%
list_util_xs       434114/s           577% 333%         83%                44%   33%  28%           --                -8% -13%
fastest_Eric_Strom 471812/s           636% 371%         98%                57%   45%  40%           9%                 --  -5%
ysth               497198/s           675% 396%        109%                65%   53%  47%          15%                 5%   --

#!/usr/bin/env perl
use warnings;
use 5.012;
use Benchmark qw(:all) ;
use List::Util qw(reduce);

my @array = qw( one two three four five six seven eight nine very_long_long ten eleven );

sub list_util_xs {
    my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
    return $l;
}

sub longest_Eric_Strom {
    my $max = -1; my $max_i = 0;
    for (0 .. $#array) {
        my $len = length $array[$_]; 
        if ($len > $max) { 
            $max = $len;
            $max_i = $_;
        }
    }
    return $array[$max_i];
}

sub fastest_Eric_Strom {
    my $max = -1; my $max_ref;
    for (@array) {
        if (length > $max) {
            $max = length;
            $max_ref = \$_;
        }
    }
    return $$max_ref;
}

sub David_Precious {
    my $longest =  ( map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { [ $_, length $_ ] } @array )[0];
    return $longest;
}

sub MBO {
    my $longest = ( reduce { $a->[1] > $b->[1] ? $a : $b } map { [ $_, length $_ ] } @array )[0];
    return $longest->[0];
}

sub drewk {
    my $len = -1; my $longest;
    for (@array) {
        my $tmp=length($_);
        if ( $tmp > $len ) {
            $longest = $_;
            $len = $tmp;
        }
    }
    return $longest;
}

sub ysth {
    my $longest = $array[0];
    my $len = length $longest;
    for my $str (@array) {
    if ( length($str) > $len ) {
        $longest = $str;
        $len = length($str);
    }
    }
    return $longest;
}

sub bvr {
    my $len = length $array[0];
    my $longest = 0;
    for my $i (1 .. $#array) {
    my $i_len = length $array[$i];
    if($i_len > $len) {
        $longest = $i;
        $len = $i_len;
    }
    }
    return $array[$longest];
}

sub OMG_peanuts {
    my $idx = 1;
    my $lastLength = length $array[0];
    my $lastElt = $array[0];
    my $listLength = scalar @array;
    while ($idx < $listLength) {
    my $tmpLength = length $array[$idx];
    if ($tmpLength > $lastLength) {
    $lastElt = $array[$idx];
    $lastLength = $tmpLength
    }
    $idx++
    }
    return $lastElt;
}

cmpthese( -10, {
    'list_util_xs'  => sub{ list_util_xs() },
    'longest_Eric_Storm' => sub{ longest_Eric_Strom() },
    'fastest_Eric_Storm' => sub{ fastest_Eric_Strom() },
    'David_Precious'    => sub{ David_Precious() },
    'MBO'       => sub{ MBO() },
    'drewk'         => sub{ drewk() },
    'ysth'      => sub{ ysth() },
    'OMG_peanuts'   => sub{ OMG_peanuts() },
    'bvr'       => sub{ bvr() },
});

You could reduce the number of times you have to calculate the string's length by reducing to a struct or array, containing the length next to the string itself.

Further, the iteration is (to be) optimized by the reduce algorithm, the length call is hardly optimizeable.

继续阅读：algorithm performance perl

The fastest way (execution time) to find the longest element in an list

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？