Find All Possible Combinations of Features (Columns) in Tab-Delimited Data
I have data that looks like this:
1 1:-0.394668 2:-0.794872 3:-1 4:-0.871341 5:0.9365 6:0.75597
1 1:-0.463641 2:-0.897436 3:-1 4:-0.871341 5:0.44378 6:0.121824
1 1:-0.469432 2:-0.897436 3:-1 4:-0.871341 5:0.32668 6:0.302529
-1 1:-0.241547 2:-0.538462 3:-1 4:-0.871341 5:0.9994 6:0.987166
1 1:-0.757233 2:-0.948718 3:-1 4:-0.871341 5:-0.33904 6:0.915401
1 1:-0.167147 2:-0.589744 3:-1 4:-0.871341 5:0.95078 6:0.991566
The first column is the class, and the next 6 columns are features. I am trying to find all possible combinations of features (2 features, 3 features, ..., 5 features) and write each combination to its own file,
e.g.:
feat1 - feat2
feat1 - feat3
...
feat5 - feat6
...
feat1 - feat2 - feat3 - feat4 - feat5
feat1 - feat2 - feat3 - feat4 - feat6
..etc..
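For a sense of scale, the number of feature subsets this produces can be counted directly. The following is a minimal sketch in core Perl; the helper name `choose` is made up for illustration and does not come from any module.

```perl
use strict; use warnings;

# Hypothetical helper: binomial coefficient C(n, k), computed iteratively.
sub choose {
    my ($n, $k) = @_;
    my $c = 1;
    $c = $c * ($n - $_ + 1) / $_ for 1 .. $k;
    return $c;
}

# Count the subsets of 6 features for each requested size (2 through 5).
my $total = 0;
for my $k (2 .. 5) {
    my $count = choose(6, $k);
    $total += $count;
    print "$k features: $count subsets\n";
}
print "total: $total\n";
```

With 6 features this gives 15 + 20 + 15 + 6 = 56 subsets, so one file per subset would mean 56 output files.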
For example, one of the output files, feat12.txt, would contain:
1 1:-0.394668 2:-0.794872
1 1:-0.463641 2:-0.897436
1 1:-0.469432 2:-0.897436
-1 1:-0.241547 2:-0.538462
1 1:-0.757233 2:-0.948718
1 1:-0.167147 2:-0.589744
Is there any existing implementation of that in Perl?
There is, of course, Algorithm::Combinatorics and/or Set::CrossProduct, but it is hard to tell from your problem description which one would be more appropriate.
Maybe you can use something like this as a starting point:
#!/usr/bin/perl
use strict; use warnings;
use Algorithm::Combinatorics qw( combinations );

while ( my $line = <DATA> ) {
    last unless $line =~ /\S/;    # stop at the first blank line
    # Collect the "index:value" feature tokens; the leading class label is not captured.
    my $row = [ $line =~ /([1-6]:\S+)/g ];
    for my $i (2 .. 6) {          # subset sizes 2 through 6
        my $it = combinations($row, $i);
        while ( my $x = $it->next ) {
            print "@$x\n";
        }
    }
}
__DATA__
1 1:-0.394668 2:-0.794872 3:-1 4:-0.871341 5:0.9365 6:0.75597
1 1:-0.463641 2:-0.897436 3:-1 4:-0.871341 5:0.44378 6:0.121824
1 1:-0.469432 2:-0.897436 3:-1 4:-0.871341 5:0.32668 6:0.302529
-1 1:-0.241547 2:-0.538462 3:-1 4:-0.871341 5:0.9994 6:0.987166
1 1:-0.757233 2:-0.948718 3:-1 4:-0.871341 5:-0.33904 6:0.915401
1 1:-0.167147 2:-0.589744 3:-1 4:-0.871341 5:0.95078 6:0.991566
C:\Temp> c
1:-0.167147 2:-0.589744 3:-1
1:-0.167147 2:-0.589744 4:-0.871341
1:-0.167147 2:-0.589744 5:0.95078
…
2:-0.589744 3:-1 5:0.95078 6:0.991566
2:-0.589744 4:-0.871341 5:0.95078 6:0.991566
3:-1 4:-0.871341 5:0.95078 6:0.991566
1:-0.167147 2:-0.589744 3:-1 4:-0.871341 5:0.95078
1:-0.167147 2:-0.589744 3:-1 4:-0.871341 6:0.991566
1:-0.167147 2:-0.589744 3:-1 5:0.95078 6:0.991566
1:-0.167147 2:-0.589744 4:-0.871341 5:0.95078 6:0.991566
1:-0.167147 3:-1 4:-0.871341 5:0.95078 6:0.991566
2:-0.589744 3:-1 4:-0.871341 5:0.95078 6:0.991566
1:-0.167147 2:-0.589744 3:-1 4:-0.871341 5:0.95078 6:0.991566
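The output above lists combinations row by row and drops the class label, whereas the question asks for one file per feature subset with the label kept. One way to bridge that gap might look like the sketch below. It rests on several assumptions: `k_subsets` and `project_rows` are hypothetical names, a small hand-rolled recursion stands in for Algorithm::Combinatorics so that only core Perl is needed, and rows are collected in a hash keyed by a file-style name (such as feat12) instead of being written to disk.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Hypothetical helper: all k-element subsets of @$items, in lexicographic order.
# Must be called in list context.
sub k_subsets {
    my ($items, $k) = @_;
    return ([]) if $k == 0;
    return ()   if $k > @$items;
    my ($first, @rest) = @$items;
    return (
        ( map { [ $first, @$_ ] } k_subsets(\@rest, $k - 1) ),  # subsets with $first
        k_subsets(\@rest, $k),                                  # subsets without it
    );
}

# Hypothetical helper: returns { "feat12" => [ "class 1:v 2:v", ... ], ... }
# for every subset size from $min to $max.
sub project_rows {
    my ($lines, $min, $max) = @_;
    my %out;
    for my $line (@$lines) {
        next unless $line =~ /\S/;
        my ($class, @feats) = split ' ', $line;
        # Build index => value from the "index:value" tokens.
        my %value = map { my ($i, $v) = split /:/, $_, 2; ($i => $v) } @feats;
        my @ids = sort { $a <=> $b } keys %value;
        for my $k ($min .. $max) {
            for my $subset ( k_subsets(\@ids, $k) ) {
                my $name = 'feat' . join('', @$subset);
                my @cols = map { $_ . ':' . $value{$_} } @$subset;
                push @{ $out{$name} }, join(' ', $class, @cols);
            }
        }
    }
    return \%out;
}

my @rows = (
    '1 1:-0.394668 2:-0.794872 3:-1 4:-0.871341 5:0.9365 6:0.75597',
    '-1 1:-0.241547 2:-0.538462 3:-1 4:-0.871341 5:0.9994 6:0.987166',
);
my $files = project_rows(\@rows, 2, 5);
print "$_\n" for @{ $files->{feat12} };
```

For these two sample rows, the feat12 entry holds `1 1:-0.394668 2:-0.794872` and `-1 1:-0.241547 2:-0.538462`, matching the layout of feat12.txt in the question. Writing each hash entry to `"$name.txt"` would be the remaining step. Note that concatenating indices into the name becomes ambiguous once there are more than 9 features, so a separator would be needed in that case.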