How can I extract DNA sequence using a Perl script from UCSC if I have their coordinates?

2022-12-28 09:21

How can I use a Perl script to extract DNA sequences from the UCSC Genome Browser if I have their coordinates?

You can pipe a DAS sequence request into a Perl script that parses out the XML element containing the sequence.

For example, the following curl request to UCSC's DAS server discards standard error and pipes the response to parseSeq.pl:

$ curl 'http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=1:10000,10999' 2>/dev/null | parseSeq.pl

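The output of curl will be an XML document containing a 1000-base DNA sequence from the hg19 assembly of the human genome. The request asks for bases 10000 through 10999 of chromosome 1; DAS segment ranges include both endpoints, which is why this spans 1000 bases. (Remember that UCSC's tables and BED files use 0-based starts, so adjust coordinates taken from those sources.) The XML also carries some metadata that is useful for logging and error checking.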
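If the coordinates live in variables rather than in a hand-typed command, the request URL can be assembled in Perl. The following is only a sketch: the helper name das_dna_url and the DAS_FETCH environment variable are made up for illustration, and the URL pattern is simply the one from the curl example above. HTTP::Tiny ships with Perl 5.14 and later; the fetch is skipped unless DAS_FETCH is set, so the script runs without network access.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: build a DAS "dna" request URL from coordinates.
# The URL pattern is copied from the curl example; it is not checked
# against the current state of the UCSC server.
sub das_dna_url {
    my ($assembly, $chrom, $start, $end) = @_;
    die "start must not exceed end\n" if $start > $end;
    return sprintf('http://genome.ucsc.edu/cgi-bin/das/%s/dna?segment=%s:%d,%d',
                   $assembly, $chrom, $start, $end);
}

my $url = das_dna_url('hg19', '1', 10000, 10999);
print "$url\n";

# Only contact the server when DAS_FETCH is set (an assumption of this sketch).
if ($ENV{DAS_FETCH}) {
    my $response = HTTP::Tiny->new->get($url);
    die "Request failed: $response->{status} $response->{reason}\n"
        unless $response->{success};
    print $response->{content};
}
```

Running it with DAS_FETCH=1 prints the raw XML, which can then be handed to the parsing step described next.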

After piping the XML into a Perl script, you can use Perl's XML::Simple module to quickly parse out the element you need.

To help you get started, your parseSeq.pl file might start with:

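#!/usr/bin/perl

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# Parse the XML arriving on standard input ('-' tells XMLin to read STDIN)
my $xml = XML::Simple->new;
my $ref = $xml->XMLin('-');

# Dump the parsed structure to see where the sequence is stored
print Dumper($ref);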

The dumped structure should show you where the DNA sequence sits inside $ref, which is enough of a start to pull it out.
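As a possible next step, here is a minimal sketch of pulling the sequence out of the parsed structure. It assumes the response follows the DAS DNA layout (a DASDNA root containing a SEQUENCE element with a DNA child); the sample document embedded below is a shortened, made-up stand-in for a real response, and the helper name extract_sequence is hypothetical. KeyAttr and ForceArray are set explicitly so the shape of $ref does not depend on XML::Simple's defaults.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical helper: return the bare sequence from a DAS "dna" response.
sub extract_sequence {
    my ($xml_text) = @_;
    # Always treat SEQUENCE as a list and never fold it on its id attribute.
    my $ref = XMLin($xml_text, KeyAttr => [], ForceArray => ['SEQUENCE']);
    my $dna = $ref->{SEQUENCE}[0]{DNA};
    # When DNA carries attributes, XML::Simple keeps its text under "content".
    my $seq = ref($dna) eq 'HASH' ? $dna->{content} : $dna;
    die "No DNA element found in the response\n" unless defined $seq;
    $seq =~ s/\s+//g;    # drop the line breaks inside the sequence text
    return $seq;
}

# Made-up, shortened sample in the assumed DAS DNA layout (not real data).
my $sample_xml = <<'XML';
<?xml version="1.0" standalone="no"?>
<DASDNA>
<SEQUENCE id="chr1" start="10000" stop="10019" version="1.00">
<DNA length="20">
acgtacgtac
ggttaaccgg
</DNA>
</SEQUENCE>
</DASDNA>
XML

print extract_sequence($sample_xml), "\n";
```

To use this in the curl pipeline, replace $sample_xml with the text read from standard input, for example do { local $/; <STDIN> }.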
