Extending String by Recursion in Perl
Given a seed string, I would like to extend it using a prefix hash and a read list, in this way:
- Take the input seed string $seed
- Extract the last k bases of that seed
- Check the prefix_hash for reads in the read list whose first k bases are the same as the last k bases of the seed
- Merge those reads onto the end of $seed
- Repeat from the second step for the merged reads until no ends can be extended
I'm stuck with my code below:
use strict;
use Data::Dumper;
use Carp;
my $k = 2;
my %readlist = (
"read1" => "ACTGA",
"read2" => "ACAAA",
"read3" => "CTCGC",
"read4" => "GAGGG",
"read5" => "TTTCC",
);
my %prefix_hash = (
# This is a hash of arrays (the prefix hash).
# Each key is a k-base prefix, and its value lists
# the reads whose first k bases match that prefix
# (in this case k = 2).
"AC" => ["read1","read2"],
"CT" => ["read3"], # read3 is CTCGC, so its 2-base prefix is CT
"GA" => ["read4"],
"TT" => ["read5"]
);
my $seed = "AAAAC";
my @newreads = extend_seed($seed);
sub extend_seed {
my $str = shift;
my @new_str;
my $first_lastk_str = substr($str,-($k));
print "$first_lastk_str\n";
# I'm stuck here how can I recurse and merge
return @new_str;
}
Given the example above, I want to get the following output:
Initial AAAAC
First_merge AAAACTGA # Seed merged with read1
AAAACAAA # Seed merged with read2
Last_merge AAAACTGAGGG # First_merge merged with read4
What's the way to go about it?
First, you need a merge_strings routine:
sub merge_strings {
    my ($x, $y, $k) = @_;
    return sprintf '%s%s', $x, substr $y, $k;
}
The routine assumes that the last $k characters of $x and the first $k characters of $y match.
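Because the routine trusts its caller, a defensive variant could confirm the overlap before merging. The following is only a sketch: merge_strings_checked is a hypothetical name that does not appear elsewhere in this answer, and it assumes $k is a positive integer.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical checked variant of merge_strings: croaks instead of
# silently producing a wrong merge when the overlap does not match.
sub merge_strings_checked {
    my ($x, $y, $k) = @_;
    croak "k must be a positive integer" if $k < 1;
    croak "both strings must be at least $k characters long"
        if length($x) < $k || length($y) < $k;

    my $suffix = substr($x, -$k);     # last $k characters of $x
    my $prefix = substr($y, 0, $k);   # first $k characters of $y
    croak "overlap mismatch: '$suffix' vs '$prefix'" if $suffix ne $prefix;

    # Append everything in $y after the shared overlap.
    return $x . substr($y, $k);
}

print merge_strings_checked('AAAAC', 'ACTGA', 2), "\n";   # AAAACTGA
```

If the prefix hash is built correctly, the check never fires, so it is mostly useful while debugging hand-written test data.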
use strict; use warnings;
use Data::Dumper;
my $k = 2;
my %readlist = (
"read1" => "ACTGA",
"read2" => "ACAAA",
"read3" => "CTCGC",
"read4" => "GAGGG",
"read5" => "TTTCC",
);
my %prefix_hash = (
"AC" => ["read1","read2"],
"CT" => ["read3"],
"GA" => ["read4"],
"TT" => ["read5"]
);
my $seed = "AAAAC";
my @newreads = extend_seed($seed, $k, \%prefix_hash, \%readlist);
print Dumper \@newreads;
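sub merge_strings {
    my ($x, $y, $k) = @_;
    return sprintf '%s%s', $x, substr $y, $k;
}
sub extend_seed {
    my ($x, $k, $prefix, $reads) = @_;
    # Look up reads whose first $k bases match the last $k bases of $x.
    my $key = substr $x, -$k;
    return unless exists $prefix->{$key};
    # Merge $x with each matching read (hash slice over the read names).
    my @ret = map merge_strings($x, $_, $k),
        @{$reads}{@{ $prefix->{$key} }};
    # Recurse on every merged string and append any further extensions.
    push @ret, map extend_seed($_, $k, $prefix, $reads), @ret;
    return @ret;
}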
Output:
$VAR1 = [ 'AAAACTGA', 'AAAACAAA', 'AAAACTGAGGG' ];
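Two things are worth guarding against with real data. A hand-typed prefix hash can drift out of sync with the reads, and reads that overlap in a cycle would make the recursion above run forever. The sketch below is one possible way to handle both, not the only one: build_prefix_hash and extend_seed_once are hypothetical names introduced here, and the "use each read at most once per extension path" rule is an assumption about what you want.

```perl
use strict;
use warnings;

# Index read names by the first $k characters of each read,
# so the prefix hash is always derived from the read list.
sub build_prefix_hash {
    my ($reads, $k) = @_;
    my %prefix;
    for my $name (sort keys %$reads) {
        next if length($reads->{$name}) < $k;
        push @{ $prefix{ substr($reads->{$name}, 0, $k) } }, $name;
    }
    return \%prefix;
}

# Like extend_seed, but a read is never reused on the same extension
# path, so reads that overlap in a cycle cannot recurse forever.
sub extend_seed_once {
    my ($x, $k, $prefix, $reads, $used) = @_;
    $used ||= {};
    my $key = substr($x, -$k);
    my @ret;
    for my $name (@{ $prefix->{$key} || [] }) {
        next if $used->{$name};
        my $merged = $x . substr($reads->{$name}, $k);
        # Depth-first: each merge is followed by its own extensions.
        push @ret, $merged,
            extend_seed_once($merged, $k, $prefix, $reads, +{ %$used, $name => 1 });
    }
    return @ret;
}

my %readlist = (
    read1 => "ACTGA",
    read2 => "ACAAA",
    read3 => "CTCGC",
    read4 => "GAGGG",
    read5 => "TTTCC",
);
my $prefix = build_prefix_hash(\%readlist, 2);
print "$_\n" for extend_seed_once("AAAAC", 2, $prefix, \%readlist);
# Prints AAAACTGA, AAAACTGAGGG, AAAACAAA (one per line)
```

The results are the same three strings as above, but in depth-first order: each merged string is immediately followed by its own extensions.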