
Extending String by Recursion in Perl

Given a seed string, I would like to extend it using a prefix hash and a read list, in this way:

  1. Input a seed string $seed.
  2. Extract the last k bases of that seed.
  3. Look up those k bases in prefix_hash to find the reads in read_list whose first k bases are the same.
  4. Merge each of those reads onto the end of $seed.
  5. Repeat from step 2 for each merged string until no end can be extended any further.

I'm stuck with my code below:

use strict;
use Data::Dumper;
use Carp;

my $k = 2;

my %readlist = (

    "read1" => "ACTGA",
    "read2" => "ACAAA",
    "read3" => "CTCGC",
    "read4" => "GAGGG",
    "read5" => "TTTCC",
);


my %prefix_hash = (

    # This is a hash of arrays (the prefix hash).
    # Each key is a k-base prefix,
    # and the array lists the reads that contain those bases
    # in their first k positions.
    # In this case k = 2.

    "AC" => ["read1","read2"],
    "AG" => ["read3"],
    "GA" => ["read4"],
    "TT" => ["read5"]

);


my $seed = "AAAAC";


my @newreads = extend_seed($seed);


sub extend_seed {

    my  $str    = shift;

    my @new_str;
    my $first_lastk_str = substr($str,-($k));

    print "$first_lastk_str\n";
    # I'm stuck here: how can I recurse and merge?


    return @new_str;
}

Given the example above, I want to get the following output:

Initial     AAAAC

First_merge AAAACTGA  # Seed merge with Read 1 
            AAAACAAA  # Seed merge with Read 2

Last_merge  AAAACTGAGGG # First_merge merge with Read 4

What's the way to go about it?


First, you need a merge_strings routine:

sub merge_strings {
    my ($x, $y, $k) = @_;
    return sprintf '%s%s', $x, substr $y, $k;
}

The routine assumes that the last $k characters of $x and the first $k characters of $y match.
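
Because merge_strings trusts that assumption, a pair that does not actually overlap is glued together without complaint. Below is a minimal sketch of a checked variant. The name merge_strings_checked is hypothetical and not part of the code in this answer; it assumes $k is a positive integer and croaks when the overlap does not match.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical checked variant of merge_strings (an illustration only).
# Croaks unless the last $k characters of $x equal the first $k of $y.
sub merge_strings_checked {
    my ($x, $y, $k) = @_;
    croak "k must be a positive integer" unless $k >= 1;
    croak "strings shorter than k=$k"
        if length($x) < $k or length($y) < $k;

    my $tail = substr $x, -$k;       # last k characters of $x
    my $head = substr $y, 0, $k;     # first k characters of $y
    croak "overlap mismatch: '$tail' vs '$head'" unless $tail eq $head;

    # Append everything in $y after the shared k characters.
    return $x . substr($y, $k);
}

print merge_strings_checked('AAAAC', 'ACTGA', 2), "\n";   # prints AAAACTGA
```

With a correct prefix hash the check never fires, so it mainly helps catch a key that points at the wrong read.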

With that in place, the complete script is:

use strict;
use warnings;
use Data::Dumper;

my $k = 2;

my %readlist = (
    "read1" => "ACTGA",
    "read2" => "ACAAA",
    "read3" => "CTCGC",
    "read4" => "GAGGG",
    "read5" => "TTTCC",
);

my %prefix_hash = (
    "AC" => ["read1","read2"],
    "AG" => ["read3"],
    "GA" => ["read4"],
    "TT" => ["read5"]
);

my $seed = "AAAAC";

my @newreads = extend_seed($seed, $k, \%prefix_hash, \%readlist);
print Dumper \@newreads;

sub merge_strings {
    my ($x, $y, $k) = @_;
    return sprintf '%s%s', $x, substr $y, $k;
}

sub extend_seed {
    my ($x, $k, $prefix, $reads) = @_;
    my $key = substr $x, -$k;

    return unless exists $prefix->{$key};

    my @ret = map merge_strings($x, $_, $k),
                  @{$reads}{@{ $prefix->{$key} }};

    push @ret, map extend_seed($_, $k, $prefix, $reads), @ret;
    return @ret;
}

Output:

$VAR1 = [
          'AAAACTGA',
          'AAAACAAA',
          'AAAACTGAGGG'
        ];
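
The prefix hash in the example is typed by hand. It could also be derived from %readlist, which keeps the two in sync. The following is a sketch only: build_prefix_hash is a hypothetical helper name, and reads shorter than k are simply skipped. With derived keys, read3 (CTCGC) is filed under CT rather than AG. The output above is the same either way, because no extended string ends in AG or CT.

```perl
use strict;
use warnings;

# Hypothetical helper: map each k-character prefix to the names of the
# reads that start with it. Names are sorted so the result is deterministic.
sub build_prefix_hash {
    my ($reads, $k) = @_;
    my %prefix;
    for my $name (sort keys %$reads) {
        my $read = $reads->{$name};
        next if length($read) < $k;    # too short to have a k-prefix
        push @{ $prefix{ substr $read, 0, $k } }, $name;
    }
    return \%prefix;
}

my %readlist = (
    read1 => "ACTGA",
    read2 => "ACAAA",
    read3 => "CTCGC",
    read4 => "GAGGG",
    read5 => "TTTCC",
);

my $prefix_hash = build_prefix_hash(\%readlist, 2);

# Prints one line per prefix:
#   AC => read1 read2
#   CT => read3
#   GA => read4
#   TT => read5
print "$_ => @{ $prefix_hash->{$_} }\n" for sort keys %$prefix_hash;
```

The returned reference can be passed to extend_seed as its third argument in place of \%prefix_hash.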