Assessing the significance of a BLASTn score?

2022-12-12 00:39 Q&A author:

I am running standalone command line blast to align many query sequences against a large database sequence of nucleotides. I can modify the command line parameters of the blastn program to change various parameters such as the match/mismatch scores.

I am wondering - for the 'bit score' that blastn outputs, does it make sense to compare the bit scores for alignments with identical query and database sequences but different match/mismatch parameters? I am trying to assess how well blast is performing with various parameter values, but I want to make sure that everything is being compared on even grounds. Thanks.

It's not clear to me why you think that comparing bit scores will give you an insight as to how well BLAST is performing. The usual method for doing

Unfortunately, much of the theory behind BLAST and other alignment programs was worked out for local, ungapped alignments, and that theory has only been extended to gapped alignments empirically. In particular, the bit scores are calculated like this:

S' = ( lambda * S - ln(K) ) / ln(2)

In the formula above, K and lambda are constants for your substitution matrix, S is the raw score (sum of substitution and gap scores), and S' is the bit score. This means that your bit scores will certainly change when you vary the gap open/gap extend parameters, so your comparison is invalid. This is an unfortunate consequence of there being little theory about gapped alignments: the optimal gap scores for a given system have to be measured empirically.
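To make the formula concrete, here is a minimal Python sketch of the conversion from raw score to bit score. The function name `bit_score` and the lambda/K numbers in the usage lines are my own placeholders for illustration, not authoritative values; as far as I know, BLAST prints the Lambda and K it actually used in the statistics footer of its report, so take the real ones from there.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S into a bit score S'.

    Implements S' = (lambda * S - ln(K)) / ln(2) from the answer above.
    lam and k must be the statistical parameters that belong to the
    scoring system that produced raw_score.
    """
    if k <= 0:
        raise ValueError("K must be positive")
    return (lam * raw_score - math.log(k)) / math.log(2)


# Placeholder parameters for two hypothetical scoring systems.
# These numbers are assumptions for illustration only.
system_a = {"lam": 1.28, "k": 0.46}
system_b = {"lam": 1.37, "k": 0.71}

# The same alignment usually gets a different raw score under each system,
# so each raw score has to be converted with its own lambda and K.
bits_a = bit_score(40, **system_a)
bits_b = bit_score(34, **system_b)
print(f"system A: {bits_a:.1f} bits, system B: {bits_b:.1f} bits")
```

The point of the sketch is only that lambda and K travel with the scoring system: converting a raw score with the wrong pair gives a meaningless bit score.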

Because bit scores aren't comparable, I suggest you do your assessment based on an alternate set of data that doesn't involve the alignment scores. For example, if I'm interested in the optimal gap opening/gap extension parameters for comparing protein sequences, I can look at proteins of known structure and assess each parameter set based on its ability to make alignments that make structural sense. This avoids comparing the alignment scores entirely, which is good because comparing bit scores on their own isn't obviously useful.
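A rough sketch of what that kind of score-free assessment could look like for a nucleotide search: rank each parameter set by how many known-correct query/subject pairs it recovers. Everything here is hypothetical (the `recall_against_truth` helper, the sequence IDs, and the hit lists); in practice the hits would come from parsing your own blastn output and the truth set from whatever independent reference you trust.

```python
def recall_against_truth(hits, truth):
    """Fraction of known-correct (query, subject) pairs found among the hits."""
    if not truth:
        raise ValueError("truth set must not be empty")
    return len(set(hits) & set(truth)) / len(truth)


# Hypothetical reference pairs that are known to be correct.
truth = {("q1", "s1"), ("q2", "s2"), ("q3", "s3")}

# Hypothetical hits per (match, mismatch) setting, e.g. parsed from
# tabular BLAST output. The IDs are made up for illustration.
hits_by_params = {
    (1, -2): [("q1", "s1"), ("q2", "s2"), ("q3", "s9")],
    (1, -3): [("q1", "s1")],
}

# Rank the parameter sets by recall, best first. No alignment score is used.
ranking = sorted(
    hits_by_params,
    key=lambda params: recall_against_truth(hits_by_params[params], truth),
    reverse=True,
)

for params in ranking:
    recall = recall_against_truth(hits_by_params[params], truth)
    print(params, f"recall={recall:.2f}")
```

Recall alone ignores false positives, so a real comparison would probably also track precision, but the idea is the same: judge the parameters by an external criterion rather than by the scores they produce.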

I'm not sure you can do that. Do you really need to vary the match/mismatch parameters? What is your aim?

It's not necessarily true that bit scores aren't comparable. From the BLAST documentation on NCBI's web site:

"Bit scores are normalized, which means that the bit scores from different alignments can be compared, even if different scoring matrices have been used."

http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=handbook&part=ch16
