Calculating difference within lists

2022-12-21 21:59

I have two files and the content is as follows:

alt text http://img144.imageshack.us/img144/4423/screencapture2b.png

alt text http://img229.imageshack.us/img229/9153/screencapture1c.png

Please only consider the bolded column and the red column. The remaining text is junk and unnecessary. As is evident, the two files are similar in many ways. I am trying to compare the bolded text in file_1 with the same column in file_2 (it is not bolded there, but I hope you can make out that it is the same column), and if they are different, I want to print out the red text from file_1. I achieved this with the following script:

import codecs
import itertools
import os

chain_id=[]
for file in os.listdir("."):
    basename = os.path.basename(file)
    if basename.startswith("d.complex"):
        chain_id.append(basename)

for i in chain_id:
    print i
    g=codecs.open(i,  encoding='utf-8')

    f=codecs.open("ac_chain_dssp.dssp",  encoding='utf-8')
    for (x, y) in itertools.izip(g,  f): 
            if y[11]=="C":
                if y[35:38]!= "EN":
                    if y[35:38] != "OTE":
                        if x[11]=="C":
                            if x[12] != "C":
                                if y[35:38] !=x[35:38]:
                                    print x [7:10]


    g.close()
    f.close()

But the results I got were not what I expected. Now I want to modify the above code so that, when I compare the bolded column, it prints the result only if the difference between the values is more than 2. For example, row 1 of the bolded column is 83 in file_1 and 84 in file_2; since the difference between the two is less than two, I want it to be rejected.

Can someone help me in adding the remaining code? Cheers, Chavanak

PS: This is not homework :)

The direct answer to your question is to alter the last condition, if y[35:38] != x[35:38]:, so that the "fields" at [35:38] are converted to int (or float...) and their difference can be tested. That gives something like:

   try:
     iy = int(y[35:38])
     ix = int(x[35:38])
   except ValueError:
     # take whatever action is appropriate here, including silently ignoring the record
     print("Unexpected value for record # %s" % x[7:10])
   else:
     # only compare when both fields converted cleanly
     if abs(ix - iy) > 2:
       print(x[7:10])
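For context, here is a minimal Python 3 sketch of how that check might sit inside the question's line-by-line loop. The names `changed_labels`, `make_line` and `threshold`, as well as the sample lines, are illustrative assumptions; only the slice positions ([7:10], [11], [12], [35:38]) come from the question.

```python
def changed_labels(lines_a, lines_b, threshold=2):
    """Yield x[7:10] for paired lines whose [35:38] fields differ by more than threshold.

    Hypothetical helper: the slice positions are copied from the question,
    everything else is an assumption.
    """
    for x, y in zip(lines_a, lines_b):
        # Skip lines too short to contain the numeric field.
        if len(x) < 38 or len(y) < 38:
            continue
        # Same filters as the question: column 11 is "C" in both, column 12 of x is not "C".
        if x[11] != "C" or y[11] != "C" or x[12] == "C":
            continue
        try:
            ix = int(x[35:38])
            iy = int(y[35:38])
        except ValueError:
            continue  # non-numeric field (for example a header line): ignore it
        if abs(ix - iy) > threshold:
            yield x[7:10]


def make_line(label, chain, value):
    # Build a made-up 38-character line: label at [7:10], chain at [11], value at [35:38].
    return " " * 7 + label.rjust(3) + " " + chain + " " * 23 + str(value).rjust(3)


file_1 = [make_line("1", "C", 83), make_line("2", "C", 90)]
file_2 = [make_line("1", "C", 84), make_line("2", "C", 95)]
changed = list(changed_labels(file_1, file_2))
```

With these made-up lines, the first pair (83 vs 84) is rejected and only the label of the second pair (90 vs 95) is reported.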

More indirectly, the snippet in the question prompts the following remarks, which may in turn suggest different approaches to the problem.

First off, if the files are strictly "fixed format", if they are very big, and/or if nothing else is done with any of the other "field" values found in the file, the current approach is valid and probably very efficient.
Alternatively, the logic may be made more resilient to possible variations in the file structure by parsing the "fields" of the file, rather than addressing them as slices of a long string. Look into the standard library's csv module for possible parser support.
Some tests seem goofy or always true (like comparing a 3-character slice to a 2-character string literal). Aside from being logically wrong, this too points to a more "parsed" solution, where such logical errors are more readily avoided or more obvious.
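One way to read the last two remarks is to name each fixed-width field once and work with parsed values afterwards. The Python 3 sketch below assumes the slice positions from the question's code; the `Record` type and its field names (`label`, `chain`, `value`) are invented for illustration, since the question does not say what the columns mean.

```python
from collections import namedtuple

# Hypothetical field names; only the slice positions come from the question.
Record = namedtuple("Record", ["label", "chain", "value"])


def parse_record(line):
    """Return a Record, or None if the line is too short or its value is not an integer."""
    if len(line) < 38:
        return None
    try:
        value = int(line[35:38])
    except ValueError:
        return None
    return Record(label=line[7:10].strip(), chain=line[11], value=value)


# Made-up sample line: label " 12" at [7:10], chain "C" at [11], value " 83" at [35:38].
sample = " " * 7 + " 12" + " " + "C" + " " * 23 + " 83"
rec = parse_record(sample)
```

Once lines are parsed this way, a test such as `rec.value` against a string literal of the wrong length becomes much harder to write by accident.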

Nothing to do with your problem, but this:

        if y[11]=="C":
            if y[35:38]!= "EN":
# I don't see any "EN" or "OTE" anywhere in your sample input.
# In any case the above condition will always be true, because
# y[35:38] appears to be a 3-byte string but "EN" is a 2-byte string.
                if y[35:38] != "OTE":
                    if x[11]=="C":
                        if x[12] != "C":
                            if y[35:38] !=x[35:38]:
                                print x [7:10]

is ummmmm ...

You may wish to consider an alternative way of expression e.g.

if (x[11] == "C" == y[11]
and x[12] != "C"
and y[35:38] not in ("EN?", "OTE")
and y[35:38] != x[35:38]):
    print x[7:10]
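A quick Python 3 check of the two points above, using a made-up sample string: a slice that really is 3 characters long can never equal a 2-character literal, and the chained comparison requires both sides to be "C".

```python
field = "abcdefgh"[2:5]      # a 3-character slice, standing in for y[35:38]
always_true = field != "EN"  # the lengths differ, so the strings can never be equal


def both_are_c(x, y):
    # x == "C" == y is evaluated as (x == "C") and ("C" == y)
    return x == "C" == y
```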

I haven't fully understood your problem, but suppose the files look like this:

File 1

100 C 20.2
300 B 33.3

File 2

110 C 20.23
320 B 33.34

and you want to compare the 3rd column of the two files.

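# file1 and file2 are assumed to be file objects that are already open
lines1 = file1.readlines()
list1 = [float(line.split()[2]) for line in lines1]  # list of 3rd column values

lines2 = file2.readlines()
list2 = [float(line.split()[2]) for line in lines2]

# True where the two values are within 2 of each other
result = [abs(a - b) < 2 for a, b in zip(list1, list2)]

# or: keep only the differences whose magnitude exceeds 2
result = [a - b for a, b in zip(list1, list2) if abs(a - b) > 2]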

Is this what you want??
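A self-contained Python 3 version of the same idea, as a sketch: `io.StringIO` stands in for the two open files, and the helper names `third_column` and `large_differences` are illustrative rather than part of the answer.

```python
import io


def third_column(fileobj):
    """Return the 3rd whitespace-separated column of each non-blank line as floats."""
    return [float(line.split()[2]) for line in fileobj if line.strip()]


def large_differences(list1, list2, threshold=2):
    """Return (index, a, b) for pairs whose absolute difference exceeds threshold."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(list1, list2))
            if abs(a - b) > threshold]


# The sample rows from above, held in memory instead of on disk.
file1 = io.StringIO("100 C 20.2\n300 B 33.3\n")
file2 = io.StringIO("110 C 20.23\n320 B 33.34\n")
list1 = third_column(file1)
list2 = third_column(file2)
result = large_differences(list1, list2)
```

With the sample rows shown earlier, every pair differs by far less than 2, so `result` is empty; returning the index alongside the values makes it easy to look up the matching line afterwards.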
