开发者

Python: Unexpected ordering in lists

I've encountering a weird behavior while working with lists in Python. I've implemented a method that returns a list of lists of Integers; in particular, those are cycles within a graph each including three nodes:

simple_cycles = compute_cycles(graph)

That returns me something like this:

[[40000,20000,30000],[700,500,600],[600,500,700],..]

Now, I need to (1) order each list of the list, and after that, I need to (2) remove duplicates from the entire list, and (3) I need to sort that entire list, again. The desired result then might look as follows:

[[500,600,700],[20000,30000,40000]]

Task (1) is achieved by sorting the internal lists prior to returning them via compute_cycles. Tasks (2) and (3) are obtained by executing the following line:

cycles = dict((x[0], x) for x in simple_cycles).values()

This works for the first graph processed. Each following graph fails, because the ordering within the internal lists is sometimes wrong. I tried the last source code line twice, and the second result was other than expected. For example, I got as x in the second run:

[29837921, 27629939, 27646591]

instead of

[27629939, 27646591, 29837921]

This result in choosing 29837921 as the key in the dictionary instead of 27629939. Thus, the initial ordering with sorted(x) seems already to be false. But why?

I tried to reproduce that behavior outside of my program, but I can't. In my application, I am parsing an XML document like this:

detector = MyParser()
handler = MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)

..

def parse(self, infile, handler):
  parser = etree.XMLParser(target=handler)
  etree.parse(infile, parser)

When executing, for example,

detector = MyParser()
handler =开发者_JAVA技巧 MyHandler()
handler.subscribe(detector.update)
detector.parse(filename, handler)
detector.parse(filename, handler)

then the ordering of the second run is unexpected.

I know, my source code example is not good to reproduce it by yourself, but maybe I am missing some elemental Python stuff while working with lists.

Update

Here is the creation of the lists:

from networkx import dfs_successors

def compute_cycles(graph):
  cycles = []
  for node in graph.nodes():
    a = graph.successors(node);
    for a_node in a:
      b = graph.successors(a_node)
      for next_node in b:
        c = graph.successors(next_node);
        if len(c) > 1:
          if c[0] == node:
            cycles.append(sorted([node, a_node, next_node]))
          elif c[1] == node:
            cycles.append(sorted([node, a_node, next_node]))
        else:
          if c == node:
            cycles.append(sorted([node, a_node, next_node]))
        #fi
      #rof
    #rof
  #rof
  return cycles

Update

If made a big mistake: I've overwritten the __repr__ function of my Node object used within the graph, so that it returns an integer. Maybe, the sorting fails because I am dealing with real objects instead of integers. I changed my call to the sort function this way:

cycles.append(sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))

I'll have to see if that makes a difference. The node class is defined as follows:

class Node(object):
  def __init__(self, revision, revision_hash):
    self.rev = revision
    self.revhash = revision_hash

  def __repr__(self):
    return repr((self.rev.revid))


I don't understand why you're using dict.

print sorted(set(tuple(sorted(x)) for x in L))


Dictionaries do not necessarily keep the order. They are allowed to change it. Put this in the interpreter: {'a': 1, 'b': 2, 'c': 3}. I got {'a': 1, 'c': 3, 'b': 2}.


My problem is finally solved. Because I put objects in lists instead of simple Integers, I had to use the sort method as follows:

sorted([node, a_node, next_node], key=lambda revision: revision.rev.revid))

Here, I am accessing the member variable containing the Integer, which was already returned by __str__. However, the implicit conversion while sorting wasn't stable.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜