How do I remove the second and all later digits after a period in column one of each line?
How do I remove everything after the first digit that follows the period in column one?
For example:
HP_000083.21423 N -1 NO 99.8951% 0.000524499999999983
NP_075561.1_1908 N -1 NO 99.9697% 0.000151499999999971
I would like to remove "_1908" from "NP_075561.1_1908"
and "1423" from "HP_000083.21423"
without removing other items from the subsequent columns.
The expected rows would be:
HP_000083.2 N -1 NO 99.8951% 0.000524499999999983
NP_075561.1 N -1 NO 99.9697% 0.000151499999999971
Here's my code (some of you provided part of this solution in the past):
for line in fname:
    line = re.sub('[\(\)\{\}\'\'\,<>]','', line)
    line = re.sub(r"(\.\d+)_\d+", r"\1", line)
    fields = line.rstrip("\n").split()
    outfile.write('%s %s %s %s %s %s\n' % (fields[0],fields[1],fields[2],fields[3],fields[4],(fields[5])))
Thanks in advance, guys. Cheers,
I'd avoid using regular expressions in this case. You can easily make do with standard string methods:
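for line in infile:
    # Split off the first column; the rest of the line (including "\n") stays untouched.
    first_col, rest = line.split(" ", 1)
    # Keep everything up to and including the first digit after the period.
    first_col = first_col[:first_col.index(".") + 2]
    output_line = " ".join((first_col, rest))
    outfile.write(output_line)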
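To try the string-method approach without any files, here is a minimal sketch run on the two sample lines from the question. The function name trim_first_column is made up for illustration, and the sketch assumes columns are separated by a single space and that each line has at least two columns. Unlike the loop above, it leaves a line unchanged when column one has no period instead of raising ValueError.

```python
def trim_first_column(line):
    """Keep only one character after the first period in column one.

    Hypothetical helper: assumes space-separated columns, at least two per line.
    """
    first_col, sep, rest = line.partition(" ")
    dot = first_col.find(".")
    if dot != -1:  # no period in column one: leave the line as it is
        first_col = first_col[:dot + 2]
    return first_col + sep + rest


samples = [
    "HP_000083.21423 N -1 NO 99.8951% 0.000524499999999983\n",
    "NP_075561.1_1908 N -1 NO 99.9697% 0.000151499999999971\n",
]
for sample in samples:
    print(trim_first_column(sample), end="")
```

Because only the first column is sliced, the percentages and decimals in the later columns are never touched.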
Here is a solution with a pretty minimal change to the code you provided:
for line in fname:
    line = re.sub('[\(\)\{\}\'\'\,<>]','', line)
    # Keep one digit after the first period; count=1 limits this to the first match.
    line = re.sub(r"(\.\d)\d*_?\d*", r"\1", line, count=1)
    fields = line.rstrip("\n").split()
    outfile.write('%s %s %s %s %s %s\n' % (fields[0],fields[1],fields[2],fields[3],fields[4],(fields[5])))
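One thing to keep in mind with this pattern: it is not tied to column one, so if a line's first column has no period, the single substitution would land on the first later column that does (for instance the percentage). The variant below is only a sketch of one way to guard against that by anchoring the pattern to the start of the line. The names FIRST_COL and trim_with_regex are made up for illustration, and it assumes column one is the first run of non-whitespace characters.

```python
import re

# Anchored to the line start: lazily match up to the first period in column one,
# keep one digit after it, then consume the rest of that column.
FIRST_COL = re.compile(r"^(\S*?\.\d)\S*")


def trim_with_regex(line):
    """Hypothetical helper: trim column one to a single digit after the period."""
    return FIRST_COL.sub(r"\1", line, count=1)


samples = [
    "HP_000083.21423 N -1 NO 99.8951% 0.000524499999999983\n",
    "NP_075561.1_1908 N -1 NO 99.9697% 0.000151499999999971\n",
]
for sample in samples:
    print(trim_with_regex(sample), end="")
```

Since \S never matches whitespace, the match cannot run past the first column, and a line whose first column has no period comes back unchanged.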