Merge tab delimited text files into a single file
What is the easiest method for joining/merging all files in a folder (tab delimited) into a single file? They all share a unique column (primary key). Actually, I only need to combine a certain column and link on this primary key, so the output file would contain a new column for each file. Ex:
KEY# Ratio1 Ratio2 Ratio3
1 5.1 4.4 3.3
2 1.2 2.3 3.2
etc....
There are many other columns in each file that I do开发者_如何学Cn't need to combine in the output file, I just need these "ratio" columns linked by the unique key column.
I am running OS X Snow Leopard but have access to a few Linux machines.
use the join(1) utility
I actually spent some time learning Perl and solved the issue on my own. I figured I'd share the source code if anyone has a similar problem to solve.
#!/usr/bin/perl -w
#File: combine_all.pl
#Description: This program will combine the rates from all "gff" files in the current directory.
use Cwd; #provides current working directory related functions
my(@handles);
print "Process starting... Please wait this may take a few minutes...\n";
unlink"_combined.out"; #this will remove the file if it exists
for(<./*.gff>){
@file = split("_",$_);
push(@files, substr($file[0], 2));
open($handles[@handles],$_);
}
open(OUTFILE,">_combined.out");
foreach (@files){
print OUTFILE"$_" . "\t";
}
#print OUTFILE"\n";
my$continue=1;
while($continue){
$continue=0;
for my$op(@handles){
if($_=readline($op)){
my@col=split;
if($col[8]) {
$gibberish=0;
$col[3]+=0;
$key = $col[3];
$col[5]+=0; #otherwise you print nothing
$col[5] = sprintf("%.2f", $col[5]);
print OUTFILE"$col[5]\t";
$continue=1;
} else {
$key = "\t";
$continue=1;
$gibberish=1;
}
}else{
#do nothing
}
}
if($continue != 0 && $gibberish != 1) {
print OUTFILE"$key\n";
} else {
print OUTFILE"\n";
}
}
undef@handles; #closes all files
close(OUTFILE);
print "Process Complete! The output file is located in the current directory with the filename: _combined.out\n";
精彩评论