开发者

Perl 5 - Iterator

I have implemented a simple iterator in perl. I normally work with C#, and use iterators and functional programming quite frequently. So I thought it would be simple to get some basics working in perl.

Problem is, I'm getting some poor performance, I don't expect be any faster than for or foreach, but I thought someone could give me some insight in how to speed it up.

Here is the guts of my package:

package Iterator;
use strict;

#Constructor for Iterator type
sub new {
  my $type = $_[0];
  my $this = {};

  #set default values
  $this->{Array} = @_[1];
  $this->{Index} = 0;
 $this->{Sub} = sub { 
   my @array = @{$this->{Array}};
   return $#array >= $this->{Index} ? $array[$this->{Index}++] : undef;
  };

  #return self
  bless($this, $type);
  return $this;
}

#Iterates next
sub Next {
 return $_[0]->{Sub}->();
}

Allows you to do this:

my $iterator = Iterator->new(\@array);
while (defined(my $current = $iterator->Next())) {
  print $current, "\n";
}

Not flashy... yet.

Also enables some functional code like this:

my $sum = 0;
Iterator
  ->new(\@array)
  ->Where(sub { $_[0] % 2 == 0 })
  ->ForEach(sub { $sum += $_[0] });

Which would sum up all the even values of an array开发者_运维知识库.

My bottleneck is the iteration code:

$this->{Sub} = sub { 
   my @array = @{$this->{Array}};
   return $#array >= $this->{Index} ? $array[$this->{Index}++] : undef;
  };

Any pointers to speed this up?


A bit late to the game here, but since you are concerned about performance, one of the largest bottlenecks in iterator type code is that the fields of your hash based object need to be dereferenced on each access. One way to combat this is to use closures in which the fields are closed over variables, avoiding unneeded dereferencing.

In my module List::Gen which contains a fairly performant implementation of lazy lists, I wrote the utility function curse which makes closure based objects behave like normal Perl objects.

Here is a short example of your iterator class written with curse. In a simple benchmark summing 1000 numbers, this method is twice as fast as yours, even after fixing all of the inefficiencies noted in the other answers.

{package Iterator;
    use List::Gen 'curse';
    sub new {
        my ($class, $array) = @_;
        my $index = 0;
        curse {
            next  => sub {$$array[$index++]},
            index => sub :lvalue {$index},
        } => $class
    }
    sub reset {shift->index = 0}
} 

If you are really pushing for more speed, since the next method does not need to be passed anything, you could even write:

my $next = $iterator->can('next');

while (defined (my $x = $next->()) {...}

Which will give you a 30% to 40% speed boost over a normal method call.

You can read the source of List::Gen for more advanced usage of curse


You might find it useful to read a bit of Higher Order Perl.


this line:

my @array = @{$this->{Array}};

duplicates the array into @array, and I don't think you want to. Just do $#{$this->{Array}} to find the endpoint of your array.


A much more efficient version:

package Iterator;
use strict;

#Constructor for Iterator type
sub new {
  my $type = shift;
  my $array = shift;
  my $this = {};
  $this->{array} = $array;
  $this->{index} = 0;
  bless($this, $type);
  return $this;
}

#Iterates next
sub Next {
  my $this = shift;
  return $this->{array}->[$this->{index}++];
}


Summing even numbers is easier done using grep and List::Util:

use List::Util 'sum';
say sum grep { not $_ % 2 } (1 .. 10); // 30

It seems very likely to me that that the code suggested by your question is over-engineering. Unless you can come up with a decent example that cannot be easily solved using the traditional Perl primitives.


Have a look at List::Util and List::MoreUtils for utilities that may help you with this. You can even use perl5i for a more modern looking syntax.

Example:

use perl5i::2;

my @nums = (0..100);
my $sumEven = @nums->grep(sub { $_ % 2 == 0 })->reduce(sub { $a+$b });

say $sumEven;


There is already an array iterator in CPAN, so you can look at its approach if you have not done it yet.

By the way in your code you have:

#set default values
$this->{Array} = @_[1];

I assume you want to say $_[1]. With @_[1] you are requesting an array slice of one element. At the end the result is the same but the semantics isn't. The curious thing is that I was expecting to have an array of one element if I do @_[1] or an error but tested in the debugger and you obtain the scalar (at least in perl 5.10). Perl 6 will go for this behaviour anyway and will not change sigil for accessing elements in arrays or hashes so you are coding 'advanced' Perl ;-)


Don't unload the stored array. You're copying every element of an array from where it is pointed at by $this->{Array} to the local list @array when you do this:

my @array = @{$this->{Array}};

Also if you know that you are going to stop when you hit undef, then you don't have to even check bounds.

$this->{Sub} = sub { return $this->{Array}[++$this->{Index}]; }

Is all you need. When {Index} gets out of range, it will return undef.

In addition, you can write your expression in Perl like:

$sum += $_ foreach grep { $_ % 2 == 0 } @array;


A much simpler Perl iterator:

my @array = (1, 2, 3, 4);
while (my $i = shift @array)
{
    print $i . "\n";
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜