How do I efficiently filter lines of input against an array of data?
I am trying to read a file into a temporary variable, filtering out lines based on the items in an array. Inside the while loop that reads the file, I run a second loop (a very bad idea, IMO) that checks whether the line matches any item in the array; if it does, the line is discarded and processing moves on to the next line.
It works, but it's slow when there are 20,000 lines of input. I am checking against an array of 10 items, which essentially turns it into processing a 200,000-line file.
Is there a way to process this quicker?
Assuming you want to discard a line if any item in your array is found, the any function from List::MoreUtils will stop searching through the array as soon as it has found a match.
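    use List::MoreUtils qw(any);

    # @list holds the items to filter on; each item is used as a regex
    while (<>) {
        my $line = $_;
        # skip this line as soon as any item matches
        next if any { $line =~ /$_/ } @list;
        # do your processing
    }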
If you happen to know which items in your array are more likely to occur in your lines, you could sort your array accordingly.
You should also benchmark your approaches (for example with the core Benchmark module) to make sure your optimization efforts are worth it.
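A rough sketch of such a comparison, using the core Benchmark module to time the per-item loop against a single combined pattern. The data is made up, the sub names are hypothetical, and `any` is taken from core List::Util (1.33 or later) rather than List::MoreUtils, purely to avoid a non-core dependency. Items are matched literally via `\Q...\E` and `quotemeta`, which is an assumption about what you want.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(any);   # assumption: core any() instead of List::MoreUtils

# Hypothetical data: 10 filter items and 2,000 synthetic lines.
my @list  = map { "<item$_>" } 1 .. 10;
my @lines = map { "line $_ mentions <item" . ($_ % 40) . ">\n" } 1 .. 2000;

# One precompiled alternation built from all items, escaped to match literally.
my $combined = join '|', map { quotemeta($_) } @list;
my $regex    = qr/$combined/;

# Approach 1: test each item in turn, stopping at the first hit.
sub count_with_any {
    my $kept = 0;
    for my $line (@lines) {
        next if any { $line =~ /\Q$_\E/ } @list;
        $kept++;
    }
    return $kept;
}

# Approach 2: a single match against the combined pattern.
sub count_with_regex {
    my $kept = 0;
    for my $line (@lines) {
        next if $line =~ $regex;
        $kept++;
    }
    return $kept;
}

# A negative count means: run each candidate for at least 1 CPU second.
cmpthese(-1, {
    any_loop  => \&count_with_any,
    one_regex => \&count_with_regex,
});
```

The numbers will depend heavily on your real data, so it is worth swapping in a sample of your actual file and array before drawing conclusions.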
Mash the array items together into a big regex: e.g., if your array is qw{red white green}, use /(red|white|green)/. The $1 variable will tell you which one matched. If you need exact matching, anchor the end-points: /^(red|white|green)$/.
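Building that regex from the array by hand gets tedious, so here is a minimal sketch of doing it with join. The array contents and sample lines are invented for illustration, and the quotemeta call assumes your items are literal strings rather than patterns; drop it if they are meant to be regexes.

```perl
use strict;
use warnings;

# Hypothetical filter items; substitute your own array.
my @list = qw(red white green);

# quotemeta escapes regex metacharacters so each item matches literally.
my $alternation = join '|', map { quotemeta($_) } @list;

# Compile once, outside the read loop.
my $filter = qr/($alternation)/;

# Hypothetical input lines standing in for the file contents.
my @lines = ("a red door\n", "plain text\n", "green light\n", "c++ (blue)\n");

my @kept;
for my $line (@lines) {
    next if $line =~ $filter;   # on a match, $1 holds the item that was found
    push @kept, $line;
}

print @kept;
```

Compiling the pattern once with qr// means each line needs only one match attempt instead of one per array item.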