Reading and extracting data from a file using Python

I am new to Python, and I want to extract the data from this format:

<seq id> <alignment start> <alignment end> <envelope start> <envelope end> <hmm acc> <hmm name> <type> <hmm start> <hmm end> <hmm length> <bit score> <E-value> <significance> <clan>

**FBpp0143497**      **5    151**      5    157 PF00339.22  **Arrestin_N**        Domain     1   135   149     83.4   **1.1e-23**   1 CL0135   
**FBpp0143497**    **183    323**    183    324 PF02752.15  Arrestin_C        Domain     1   137   138     58.5     **6e-16**   1 CL0135   
FBpp0131987     60    280     51    280 PF00089.19  Trypsin           Domain    14   219   219    127.7   3.7e-37   1 CL0124  

to this format

>FBpp0143497
 5      151        Arrestin_N     1.1e-23

>FBpp0143497
 183    323        Arrestin_C     6e-16


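You could parse the file with the 'csv' module, using a space as the delimiter. Because the columns are separated by runs of spaces rather than a single space, you would also pass skipinitialspace=True so that repeated spaces are not read as empty fields. See the documentation for csv.reader.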
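A minimal sketch of the csv approach, under a few assumptions: the asterisks in the question are only highlighting and are not in the real file, the header line starts with `<` or `#`, and the columns sit in the order given in the header (sequence id at index 0, alignment start and end at 1 and 2, hmm name at 6, E-value at 12). The names `SAMPLE` and `read_hits` are made up for illustration.

```python
import csv
import io

# Sample rows copied from the question, with the highlighting asterisks removed.
SAMPLE = """\
<seq id> <alignment start> <alignment end> <envelope start> <envelope end> <hmm acc> <hmm name> <type> <hmm start> <hmm end> <hmm length> <bit score> <E-value> <significance> <clan>

FBpp0143497      5    151      5    157 PF00339.22  Arrestin_N        Domain     1   135   149     83.4   1.1e-23   1 CL0135
FBpp0143497    183    323    183    324 PF02752.15  Arrestin_C        Domain     1   137   138     58.5     6e-16   1 CL0135
FBpp0131987     60    280     51    280 PF00089.19  Trypsin           Domain    14   219   219    127.7   3.7e-37   1 CL0124
"""


def read_hits(handle):
    """Yield (seq_id, start, end, hmm_name, evalue) for each data row."""
    # skipinitialspace=True makes the reader ignore the extra spaces in a run,
    # so repeated spaces do not produce empty fields between columns.
    reader = csv.reader(handle, delimiter=" ", skipinitialspace=True)
    for row in reader:
        # Skip blank lines, short lines, and the header/comment line.
        if len(row) < 13 or row[0].startswith(("#", "<")):
            continue
        yield row[0], row[1], row[2], row[6], row[12]


hits = list(read_hits(io.StringIO(SAMPLE)))
for seq_id, start, end, name, evalue in hits:
    print(f">{seq_id}")
    print(f" {start}\t{end}\t{name}\t{evalue}")
    print()
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by something like `open(path, newline="")`, where `path` is whatever the file is called.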


As this is proteomic data, you could probably find a dedicated parser in the BioPython package.


You can use split() with no arguments to separate each line at runs of whitespace, and then print out the values you want from the returned list.
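The split() suggestion might look roughly like the sketch below. It assumes the same column order as the header in the question (indices 0, 1, 2, 6 and 12 for the wanted fields), that the asterisks are highlighting rather than file content, and that header or comment lines start with `<` or `#`. The function name `format_hits` and the column widths are illustrative choices, not something the question specifies.

```python
# Two of the rows from the question, with the highlighting asterisks removed.
LINES = [
    "FBpp0143497      5    151      5    157 PF00339.22  Arrestin_N        Domain     1   135   149     83.4   1.1e-23   1 CL0135",
    "FBpp0143497    183    323    183    324 PF02752.15  Arrestin_C        Domain     1   137   138     58.5     6e-16   1 CL0135",
]


def format_hits(lines):
    """Return one '>id' record string per data line."""
    records = []
    for line in lines:
        # split() with no arguments splits on any run of whitespace
        # and ignores leading and trailing whitespace.
        fields = line.split()
        # Skip blank lines, short lines, and header/comment lines.
        if len(fields) < 13 or fields[0].startswith(("#", "<")):
            continue
        seq_id, start, end = fields[0], fields[1], fields[2]
        name, evalue = fields[6], fields[12]
        # Left-align each value in a fixed-width column.
        records.append(f">{seq_id}\n {start:<6} {end:<10} {name:<14} {evalue}\n")
    return records


records = format_hits(LINES)
for record in records:
    print(record)
```

To process a file instead of a list, the same function can be given an open file object, since iterating over it yields one line at a time.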
