开发者

Computing number of arrays containing an element in perl

I think I've just been looking at this too long. I have some data that looks like this:

@a = (
    { name => 'ethan', depth => 0 },
    { name => 'victoria', depth => 1 },
    { name => 'stephen', depth => 2 },
    { name => 'christopher', depth => 3 },
    { name => 'isabella', depth => 2 },
    { name => 'ethan', depth => 3 },
    { name => 'emma', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'olivia', depth => 2 },
    { name => 'alexander', depth => 3 },
    { name => 'stephen', depth => 1 },
    { name => 'sophia', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'ava', depth => 1 },
    { name => 'joshua', depth => 2 }
);

This is a simple 'tree' data structure. Every time 'depth' = 0, that is the beginning of a new 'tree'. What I would like to know is in how many of these trees do each of the names appear? The desired result would be a single hash with the names as the key, and the count as the value.

The kink in this is, if you look clos开发者_开发技巧ely, the first tree contains the name 'ethan' twice. Any tree can have any name more than once, but that should only count as 1, since they all occur in the same tree. However, 'michael' would have a count of 2, since he appears in two different trees.

I can think of a few ways of doing this, but they all involve multiple loops and seem somewhat brute force and inelegant. Hopefully, someone else out there can come up with a better solution.


I'm not 100% sure about your problem spec -- is this the correct output?

  alexander 1
        ava 1
christopher 1
       emma 1
      ethan 1
   isabella 1
     joshua 1
    michael 2
     olivia 1
     sophia 1
    stephen 2
   victoria 1

If so, then this code seems to do the job:

my @names = (
  # your @a above
);

my (%seen, %count);

for my $entry (@names) {
  if ($entry->{depth} == 0) {
    ++$count{$_} for keys %seen;
    %seen = ();
  }
  ++$seen{ $entry->{name} };
}

++$count{$_} for keys %seen;
print "$_\t$count{$_}\n" for sort keys %count;

that is, just keep a tally of names that only gets shuffled into the global count when we reach the root of the tree.


Here is one more way:

#!/usr/bin/perl

use strict; use warnings;

my @data = (
    { name => 'ethan', depth => 0 },
    { name => 'victoria', depth => 1 },
    { name => 'stephen', depth => 2 },
    { name => 'christopher', depth => 3 },
    { name => 'isabella', depth => 2 },
    { name => 'ethan', depth => 3 },
    { name => 'emma', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'olivia', depth => 2 },
    { name => 'alexander', depth => 3 },
    { name => 'stephen', depth => 1 },
    { name => 'sophia', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'ava', depth => 1 },
    { name => 'joshua', depth => 2 }
);

my @trees;

for my $x ( @data ) {
    push @trees, {} unless $x->{depth};
    $trees[-1]->{ $x->{name} } = undef;
}

my @names = keys %{ { map { $_->{name} => undef } @data } };
for my $name ( sort @names ) {
    printf "%s appears in %d tree(s)\n",
        $name, scalar grep { exists $_->{$name} } @trees;
}


%count = ();
for (@a)
{
    %found = () unless $_->{depth};
    my $name = $_->{name};
    unless ($found{$name}) 
    {
        ++$count{$name};
        $found{$name} = 1;
    }
}
return %count;

Basically, what we're doing is keeping a hash of the names we've found in the current tree (which is cleared out whenever the tree switches). If the current name hasn't been found yet, we bump the count and note that we've found it so it won't get counted again til the tree switches again.


I think this is shortest and simplest solution

my (%count, %seen);
for my $e (@a) {
  %seen = () unless $e->{depth};
  $count{$e->{name}}++ unless $seen{$e->{name}}++;
}

print "$_ => $count{$_}\n" for sort keys %count;

Result:

alexander => 1
ava => 1
christopher => 1
emma => 1
ethan => 1
isabella => 1
joshua => 1
michael => 2
olivia => 1
sophia => 1
stephen => 2
victoria => 1


The way I'd go about solving this is to use a recursive function that you map over the array. The edge case is is a depth of 0, in which case you return. otherwise, hold on to the root name, incrementing its value in a hash counter, and then recurse into the subtree where you increment that name as long as it doesn't match the root name, then recurse again, etc.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜