
How do I remove the second and remaining digits after a period in column one of each line?

How do I remove everything after the first digit that follows the period in column one?

For example,

HP_000083.21423    N  -1  NO  99.8951%    0.000524499999999983
NP_075561.1_1908    N   -1  NO  99.9697%    0.000151499999999971

I would like to remove "_1908" from "NP_075561.1_1908"

and "1423" from "HP_000083.21423"

without removing other items from the subsequent columns.

The expected rows would be:

HP_000083.2         N          -1       NO        99.8951%  0.000524499999999983
NP_075561.1             N           -1      NO        99.9697%  0.000151499999999971

Here's my code; some of you provided part of this solution in the past.

    for line in fname:
        line = re.sub('[\(\)\{\}\'\'\,<>]','', line)
        line = re.sub(r"(\.\d+)_\d+", r"\1", line) 
        fields = line.rstrip("\n").split()
        outfile.write('%s  %s  %s  %s  %s  %s\n' % (fields[0],fields[1],fields[2],fields[3],fields[4],(fields[5])))

Thanks in advance guys, Cheers,


I'd avoid using regular expressions in this case. You can easily make do with standard string methods:

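    for line in infile:
        # Split off column one; the remaining columns stay untouched in `rest`.
        first_col, rest = line.split(" ", 1)
        # Keep everything up to and including the first digit after the period.
        first_col = first_col[:first_col.index(".") + 2]
        output_line = " ".join((first_col, rest))
        outfile.write(output_line)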
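
To check this against the two sample rows from the question, here is a minimal self-contained sketch. The helper name `trim_first_column` is hypothetical, and it assumes that column one always contains a period and is followed by at least one space (not a tab).

```python
def trim_first_column(line):
    """Keep only one digit after the first period in column one.

    Hypothetical helper. Assumes column one contains a period and is
    followed by at least one space character.
    """
    # Split off column one; `rest` keeps its original spacing and newline.
    first_col, rest = line.split(" ", 1)
    # Slice up to and including the first digit after the period.
    first_col = first_col[:first_col.index(".") + 2]
    return " ".join((first_col, rest))


samples = [
    "HP_000083.21423    N  -1  NO  99.8951%    0.000524499999999983\n",
    "NP_075561.1_1908    N   -1  NO  99.9697%    0.000151499999999971\n",
]
for sample in samples:
    print(trim_first_column(sample), end="")
```

If a line might have no period in column one, `str.index` raises `ValueError`, so you would need to handle that case yourself.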


Here is a solution with a pretty minimal change to the code you provided:

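    for line in fname:
        # Raw string avoids invalid-escape warnings; the character class is unchanged.
        line = re.sub(r"[(){}',<>]", '', line)
        # Keep ".<digit>" and drop any further digits and an optional "_<digits>" suffix;
        # count=1 limits the substitution to the first match on the line (column one).
        line = re.sub(r"(\.\d)\d*_?\d*", r"\1", line, count=1)
        fields = line.rstrip("\n").split()
        outfile.write('%s  %s  %s  %s  %s  %s\n' % (fields[0],fields[1],fields[2],fields[3],fields[4],(fields[5])))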
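
One caveat: the pattern above is not anchored, so on a line whose first column has no period it would instead shorten the first later match, such as `99.8951%`. A possible variant, sketched here under that assumption, anchors the match to the start of the line. The names `FIRST_COL` and `trim_with_regex` are hypothetical.

```python
import re

# Hypothetical pattern: starting at the beginning of the line, lazily match
# non-whitespace up to the first ".<digit>", then consume the rest of column one.
FIRST_COL = re.compile(r"^(\S*?\.\d)\S*")


def trim_with_regex(line):
    """Trim column one to a single digit after its first period.

    Lines whose first column has no ".<digit>" are returned unchanged.
    """
    return FIRST_COL.sub(r"\1", line, count=1)


print(trim_with_regex("HP_000083.21423    N  -1  NO  99.8951%    0.000524499999999983"))
print(trim_with_regex("NP_075561.1_1908    N   -1  NO  99.9697%    0.000151499999999971"))
```

Because the pattern only consumes non-whitespace, the spacing and the later columns are left exactly as they were.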