Problem with GFF parser by BCBio
I am trying to parse a GFF file using the BCBio GFF parser and I get the following error. Can anybody help me in resolving this problem?
Traceback (most recent call last):
  File "gff_parse.py", line 6, in <module>
    for rec in GFF.parse(in_handle):
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 709, in parse
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 299, in parse_in_parts
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 320, in parse_simple
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 603, in _gff_process
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 634, in _lines_to_out_info
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 183, in _gff_line_map
ValueError: invalid literal for int() with base 10: 'New Start'
Here is my code:
from BCBio import GFF

in_file = "infile.gff"
# The with block closes the file even if parsing raises an error.
with open(in_file) as in_handle:
    for rec in GFF.parse(in_handle):
        print(rec)
Thanks,
Tulika
How did you generate the GFF file? It appears to contain at least one invalid line. The fourth column must contain an integer, the start coordinate of the feature; the error message shows that the parser found the value 'New Start' there instead, which looks more like a column header than a coordinate.
The GFF3 specification page has some examples of valid GFF, and the online validator can help with debugging formatting issues like this.
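If you want to locate the offending line yourself before fixing the file, something along these lines may help. It is only a rough sketch using the standard library, not part of BCBio: the function names find_bad_coordinate_lines and report_bad_lines are made up for this example, and it assumes a tab-separated file with nine columns where columns 4 and 5 hold the start and end coordinates.

```python
def find_bad_coordinate_lines(lines):
    """Yield (line_number, text) for feature lines that look malformed.

    Hypothetical helper, not part of BCBio. Assumes tab-separated GFF
    with nine columns; columns 4 and 5 are the start and end coordinates.
    """
    for line_number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # An optional FASTA section follows this directive; stop checking there.
        if text.startswith("##FASTA"):
            break
        # Skip blank lines, comments, and directives.
        if not text.strip() or text.startswith("#"):
            continue
        columns = text.split("\t")
        if len(columns) != 9:
            yield line_number, text
            continue
        try:
            int(columns[3])  # start coordinate
            int(columns[4])  # end coordinate
        except ValueError:
            yield line_number, text


def report_bad_lines(path):
    """Print every suspicious line in the GFF file at the given path."""
    with open(path) as handle:
        for line_number, text in find_bad_coordinate_lines(handle):
            print("line %d: %s" % (line_number, text))
```

Calling report_bad_lines("infile.gff") should then list the lines the parser is likely to choke on, including a stray header row if there is one.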