Python: Unexpected ordering in lists
I've encountering a weird behavior while working with lists in Python. I've implemented a method that returns a list of lists of Integers; in particular, those are cycles within a graph each including three nodes:
simple_cycles = compute_cycles(graph)
That returns me something like this:
[[40000,20000,30000],[700,500,600],[600,500,700],..]
Now, I need to (1) order each list of the list, and after that, I need to (2) remove duplicates from the entire list, and (3) I need to sort that entire list, again. The desired result then might look as follows:
[[500,600,700],[20000,30000,40000]]
Task (1) is achieved by sorting the internal lists prior to returning them via compute_cycles. Tasks (2) and (3) are obtained by executing the following line:
cycles = dict((x[0], x) for x in simple_cycles).values()
This works for the first graph processed. Each following graph fails, because the ordering within the internal lists is sometimes wrong. I tried the last source code line twice, and the second result was other than expected. For example, I got as x in the second run:
[29837921, 27629939, 27646591]
instead of
[27629939, 27646591, 29837921]
This result in choosing 29837921 as the key in the dictionary instead of 27629939. Thus, the initial ordering with sorted(x) seems already to be false. But why?
I tried to reproduce that behavior outside of my program, but I can't. In my application, I am parsing an XML document like this:
detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
..
def parse(self, infile, handler):
parser = etree.XMLParser(target=handler)
etree.parse(infile, parser)
When executing, for example,
detector = MyParser()
handler =开发者_JAVA技巧 MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
detector.parse(filename, handler)
then the ordering of the second run is unexpected.
I know, my source code example is not good to reproduce it by yourself, but maybe I am missing some elemental Python stuff while working with lists.
Update
Here is the creation of the lists:
from networkx import dfs_successors
def compute_cycles(graph):
cycles = []
for node in graph.nodes():
a = graph.successors(node);
for a_node in a:
b = graph.successors(a_node)
for next_node in b:
c = graph.successors(next_node);
if len(c) > 1:
if c[0] == node:
cycles.append(sorted([node, a_node, next_node]))
elif c[1] == node:
cycles.append(sorted([node, a_node, next_node]))
else:
if c == node:
cycles.append(sorted([node, a_node, next_node]))
#fi
#rof
#rof
#rof
return cycles
Update
If made a big mistake: I've overwritten the __repr__
function of my Node object used within the graph, so that it returns an integer. Maybe, the sorting fails because I am dealing with real objects instead of integers. I changed my call to the sort
function this way:
cycles.append(sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))
I'll have to see if that makes a difference. The node class is defined as follows:
class Node(object):
def __init__(self, revision, revision_hash):
self.rev = revision
self.revhash = revision_hash
def __repr__(self):
return repr((self.rev.revid))
I don't understand why you're using dict
.
print sorted(set(tuple(sorted(x)) for x in L))
Dictionaries do not necessarily keep the order. They are allowed to change it. Put this in the interpreter: {'a': 1, 'b': 2, 'c': 3}
. I got {'a': 1, 'c': 3, 'b': 2}
.
My problem is finally solved. Because I put objects in lists instead of simple Integers
, I had to use the sort
method as follows:
sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))
Here, I am accessing the member variable containing the Integer
, which was already returned by __str__
. However, the implicit conversion while sorting wasn't stable.
精彩评论