Perl - How do I count and print occurrences of domains in email address array?

2023-03-04 08:43

I have been struggling with this for a couple days now and cannot seem to figure it out.

I have an array of email addresses that were created via push(@emails,$email) in a while loop.

I am attempting to create a list of the unique domains in the array, with an occurrence count for each, ordered by number of occurrences.

So, if the array @emails has:

john@yadoo.com ringo@geemail.net george@zoohoo.org paul@yadoo.com

I can print:

yadoo.com 2
geemail.net 1
zoohoo.org 1

I found this example based on emails in a file, but it is WAY over my head. Can someone help me with a more verbose code example that can be used with an array of email addresses?

perl -e 'while(<>){chomp;/^[^@]+@([^@]+)$/;$h{$1}++;}
foreach $k (sort { $h{$b} <=> $h{$a} } keys %h)  {print $h{$k}." ".$k."\n";}' infile

I also tried the following, which is closer to my (limited) level of understanding:

foreach my $domain (sort keys %$domains) {
  print "$domain"."=";
  print $domains->{$domain}."\n";
};

AND

my %countdoms;
$countdoms{$_}++ for @domains;
print "$_ $countdoms{$_}\n" for keys %countdoms;

The best result I got from many different attempts was a total count (1812, which is accurate) with a number 2 next to it. Am I close, possibly?

Instead of giving you another answer, let me explain what your code example is doing:

while(<>){chomp;/^[^@]+@([^@]+)$/;$h{$1}++;}
foreach $k (sort { $h{$b} <=> $h{$a} } keys %h)  {print $h{$k}." ".$k."\n";}

The first line counts the domains from emails in files.

while(<>) iterates over the input files line by line. The input files are the file(s) passed as arguments or stdin if no arguments were passed. Each line is placed in $_.

chomp; simply removes the newline from the end of $_.

/^[^@]+@([^@]+)$/ is the regular expression that parses out the domain and is applied to $_. It checks for something that has no '@' in the first part, then a '@' and then no '@' in the last part. It remembers the last part, which will be stored in $1. ^ and $ stand for the beginning and the end of the string, respectively.

$h{$1}++; uses the domain (in $1) as a key and increments its count in the hash %h. This works even if the key is not yet present, because undef behaves like 0 here.
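To see the match and the increment in isolation, here is a minimal sketch run on a single address. The sub name extract_domain is my own and is not part of the one-liner; it only wraps the same regular expression.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original one-liner): returns the
# domain part, or undef if the string does not look like local@domain.
sub extract_domain {
    my ($address) = @_;
    # In list context a successful match returns its captures,
    # and a failed match returns an empty list (so $domain stays undef).
    my ($domain) = $address =~ /^[^@]+@([^@]+)$/;
    return $domain;
}

my %h;
my $domain = extract_domain('john@yadoo.com');
$h{$domain}++;    # key did not exist yet; undef counts as 0, so this becomes 1
$h{$domain}++;    # now 2
print "$domain $h{$domain}\n";
```

An address with no '@', or with more than one, makes the sub return undef, so the caller can decide whether to skip it.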

In order to make this work for your list, you can just do

foreach (@emails) { $h{$1}++ if /^[^@]+@([^@]+)$/; }
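Since the question asked for something more verbose, the same loop can be spelled out with named variables. This is only a sketch: the sub count_domains and its variable names are my own, and it assumes you want to skip entries that do not match rather than count them.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of addresses and returns a hash
# reference mapping each domain to the number of times it occurs.
sub count_domains {
    my @emails = @_;
    my %count;
    foreach my $email (@emails) {
        # Only count the address if the pattern matches; after a failed
        # match $1 would still hold the capture from an earlier match.
        if ($email =~ /^[^@]+@([^@]+)$/) {
            my $domain = $1;
            $count{$domain}++;
        }
    }
    return \%count;
}

my @emails = qw(john@yadoo.com ringo@geemail.net george@zoohoo.org paul@yadoo.com);
my $count = count_domains(@emails);
print "yadoo.com was seen $count->{'yadoo.com'} times\n";
```

Returning a hash reference keeps the counting separate from the printing, so the sorting step can be applied to the result afterwards.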

The second line prints the domains from the hash %h.

sort { $h{$b} <=> $h{$a} } keys %h returns a list of domains sorted by descending number of occurrences. The comparison function $h{$b} <=> $h{$a} looks up the count for each of the two keys being compared. Note that it's $b <=> $a, not $a <=> $b; putting $b first is what makes the order descending.

The rest of line 2 prints each count followed by its domain; swap $h{$k} and $k to print the domain first, as in your example.
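The sorting step can also be tried on its own. In this sketch the sub name domains_by_count is my own, and the alphabetical tie-break for domains with equal counts is an addition that the one-liner does not have (without it, the order of ties is not guaranteed).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the domains ordered by descending count;
# domains with equal counts are ordered alphabetically (my own choice).
sub domains_by_count {
    my ($h) = @_;    # hash reference: domain => count
    return sort { $h->{$b} <=> $h->{$a} or $a cmp $b } keys %$h;
}

my %h = ('yadoo.com' => 2, 'geemail.net' => 1, 'zoohoo.org' => 1);
foreach my $k (domains_by_count(\%h)) {
    print "$k $h{$k}\n";    # domain first, then its count
}
```

With the sample data from the question this prints yadoo.com 2, then geemail.net 1, then zoohoo.org 1.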

If you have your email addresses in an array, this will get you a count for each domain. I'm sure someone can produce something prettier!

my @emails = ('john@yadoo.com','ringo@geemail.net','george@zoohoo.org','paul@yadoo.com');

my %domainCount;

foreach(@emails){
    if ($_ =~ /@(\w+.*)/){
        $domainCount{$1}++;
    }
}

for my $domain (sort { $domainCount{$b} <=> $domainCount{$a}} keys %domainCount ){
    print "$domain - $domainCount{$domain}\n";
}

It's a bit crude because I am rusty on Perl but this should do the job:

use strict;
$|=1;
my ($dom, %hsh);
my @arr = ('john@yadoo.com', 'ringo@geemail.net', 'george@zoohoo.org', 'paul@yadoo.com');
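foreach (@arr) {
    ($dom) = ($_ =~ /.*\@(.*)$/);
    next unless defined $dom;    # skip entries without an '@'
    $hsh{$dom}++;
}
# sort by descending count so the most frequent domain comes first
foreach (sort { $hsh{$b} <=> $hsh{$a} } keys %hsh) {
    print "$_:$hsh{$_}\n";
}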

Another variation:

use strict;
use warnings;

my @array 
    = qw<john@yadoo.com ringo@geemail.net george@zoohoo.org paul@yadoo.com>
    ;
my %dom_count;
$dom_count{ $_ }++ foreach map { ( split '@' )[-1] } @array;
foreach my $pair ( 
    sort { $b->[1] <=> $a->[1] or $a->[0] cmp $b->[0] } 
    map  { [ $_ => $dom_count{ $_ } ] } keys %dom_count 
    ) { 
    print "@$pair\n";
}
