Convert GenBank Flatfiles to FASTA

2023-03-12 10:32

I need to parse a preliminary GenBank Flatfile. The sequence hasn't been published yet, so I can't look it up by accession and download a FASTA file. I'm new to Bioinformatics, so could someone show me where I could find a BioPerl or BioPython script to do this myself? Thanks!

You need the Bio::SeqIO module to read or write out bioinformatics data. The SeqIO HOWTO should tell you everything you need to know, but here's a small read-a-GenBank-file script in Perl to get you started!

Here is a Biopython solution. I will first assume your GenBank file describes a genome sequence, then give a different solution assuming it describes a gene sequence instead. It would have helped to know which of the two you are dealing with.

Genome Sequence Parsing:

Read your GenBank flatfile from disk with:

from Bio import SeqIO
record = SeqIO.read("yourGenbankFileDirectory/yourGenbankFile.gb","genbank")

If you just want the raw sequence as a string:

rawSequence = str(record.seq)

You will probably also need a name for this sequence, because a FASTA record starts with a ">header" line. Let's see what names came with the GenBank .gb file:

nameSequence = record.features[0].qualifiers

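This returns the qualifiers dictionary of the first feature in the file, which is normally the source feature covering the whole sequence. Each key (for example organism or strain) maps to a list of strings supplied by whoever annotated the file. record.id, record.name and record.description are also available as header candidates.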
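Once you have a name and the raw sequence, building the FASTA text is simple: a ">" header line followed by the sequence, usually wrapped at a fixed width. The sketch below is one possible way to do it with plain strings such as rawSequence; the function name format_fasta, the 60-column width and the sample values are illustrative assumptions, not part of Biopython.

```python
def format_fasta(header, sequence, width=60):
    """Return one FASTA record as text (hypothetical helper, not a Biopython API)."""
    # Remove any whitespace or line breaks that crept into the sequence.
    cleaned = "".join(sequence.split())
    # Cut the sequence into fixed-width lines.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Made-up values standing in for the name and sequence taken from the record.
example = format_fasta("my_unpublished_genome", "ATGC" * 20)
print(example, end="")
```

Write the returned text to a file opened in "w" mode to get the .fasta. If you do not need control over the header, Biopython can also write the record directly with SeqIO.write(record, "out.fasta", "fasta").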

Gene Sequence Parsing:

Read your GenBank flatfile from disk with:

from Bio import SeqIO
record = SeqIO.read("yourGenbankFileDirectory/yourGenbankFile.gb","genbank")

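To get a list of raw sequences, one per annotated feature:

rawSequenceList = [str(gene.extract(record.seq)) for gene in record.features]

Note that record.features holds every feature in the file (source, gene, CDS and so on). Add if gene.type == "gene" to the comprehension if you only want the gene features, and use the same condition for any list you build alongside it.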

To get a list of names for each gene sequence (more precisely, a dictionary of synonyms for each gene):

nameSequenceList = [gene.qualifiers for gene in record.features]
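To turn these two lists into a multi-record FASTA file, each sequence needs a single header string rather than a whole dictionary. The sketch below assumes the qualifiers look the way Biopython reports them (a mapping from qualifier name to a list of strings) and prefers gene, then locus_tag, then product. That preference order, the function names and the sample data are assumptions made for illustration.

```python
def pick_name(qualifiers, fallback):
    """Choose one header string from a feature's qualifiers (hypothetical helper)."""
    for key in ("gene", "locus_tag", "product"):
        values = qualifiers.get(key)
        if values:
            # Qualifier values are lists of strings; take the first one
            # and avoid spaces so the whole name stays in the FASTA ID.
            return values[0].replace(" ", "_")
    return fallback


def to_multifasta(names, sequences):
    """Pair qualifier dictionaries with sequences and return FASTA text."""
    records = []
    for index, (quals, seq) in enumerate(zip(names, sequences), start=1):
        header = pick_name(quals, fallback="feature_%d" % index)
        records.append(">%s\n%s\n" % (header, seq))
    return "".join(records)


# Made-up stand-ins for nameSequenceList and rawSequenceList.
sample_names = [{"gene": ["abcA"]}, {"product": ["hypothetical protein"]}, {}]
sample_seqs = ["ATGAAATAA", "ATGCCCTAG", "ATGGGGTGA"]
print(to_multifasta(sample_names, sample_seqs), end="")
```

Features with none of the preferred qualifiers get a numbered placeholder name, so no sequence is dropped silently.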
