
Why is int(50) < str(5) in Python 2.x?

In Python 3, int(50)<'2' raises a TypeError, and well it should. In Python 2.x, however, int(50)<'2' returns True (this is also the case for other number formats, but int exists in both py2 and py3). My question, then, has several parts:

  1. Why does Python 2.x (< 3?) allow this behavior?
    • (And who thought it was a good idea to allow this to begin with???)
  2. What does it mean that an int is less than a str?
    • Is it referring to ord / chr?
    • Is there some binary format which is less obvious?
  3. Is there a difference between '5' and u'5' in this regard?


It works like this.[1]

>>> float() == long() == int() < dict() < list() < str() < tuple()
True

Numbers compare as less than containers. Numeric types are converted to a common type and compared based on their numeric value. Containers are compared alphabetically by the names of their types.[2]

From the docs:

CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.

Objects of different builtin types compare alphabetically by the name of their type, with numbers as the exception: a number compares as less than an object of any other type. So any int is less than any str.
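The rule above can be roughly imitated in Python 3 with a sort key. This is only a sketch, not CPython 2's actual implementation: the helper name py2_style_key is made up for illustration, it treats only int and float as numbers, and it will still raise a TypeError if two non-number objects of the same type cannot be ordered (two different dicts, for example).

```python
def py2_style_key(obj):
    """Hypothetical sort key that roughly imitates CPython 2's mixed-type ordering."""
    if isinstance(obj, (int, float)):
        # Numbers sort before everything else and compare by numeric value.
        return (0, "", obj)
    # Everything else is grouped by type name, then ordered within its own type.
    return (1, type(obj).__name__, obj)


mixed = [3, "b", 1.5, "a", 2]
result = sorted(mixed, key=py2_style_key)
print(result)  # [1.5, 2, 3, 'a', 'b']
```

Because the first tuple element separates numbers from everything else, a number is never compared directly against a string, which is what avoids the Python 3 TypeError.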

  1. I have no idea.
    • A drunken master.
  2. It means that a formal order has been introduced on the builtin types.
    • It's referring to an arbitrary order.
    • No.
  3. No. Strings and unicode objects are considered the same for this purpose. Try it out.

In response to the comment about long < int:

>>> int < long
True

That compares the type objects themselves. You probably meant values of those types though, in which case the numeric comparison applies.

[1] This is all on Python 2.6.5.

[2] Thanks to kRON for clearing this up for me. I'd never thought to compare a number to a dict before, and comparison of numbers is one of those things that's so obvious that it's easy to overlook.


These comparisons are allowed because of sorting. Python 2.x can sort lists containing mixed types, including strings and integers; integers always appear first. Python 3.x does not allow this, for the exact reasons you pointed out.

Python 2.x:

>>> sorted([1, '1'])
[1, '1']
>>> sorted([1, '1', 2, '2'])
[1, 2, '1', '2']

Python 3.x:

>>> sorted([1, '1'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()


(And who thought it was a good idea to allow this to begin with???)

I can imagine that the reason might be to allow objects of different types to be stored in tree-like structures, which use comparisons internally.


As Aaron said. Breaking it up into your points:

  1. Because it makes sort do something halfway usable where it otherwise would make no sense at all (mixed lists). It's not a good idea generally, but much in Python is designed for convenience over strictness.
  2. Ordered by type name. This means things of the same type group together, where they can be sorted. They should probably be grouped by type class, such as numbers together, but there's no proper type class framework. There may be a few more specific rules in there (there probably is one for numeric types); I'd have to check the source.
  3. One is a string and the other is unicode. They may well have a direct comparison operation, but it's conceivable that a non-comparable type would get grouped between them, causing a mess. I don't know if there's code to avoid this.

So, it doesn't make sense in the general case, but occasionally it's helpful.

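# Python 2: range() returns a list, so it can be concatenated and shuffled in place.
from random import shuffle
letters = list('abcdefgh')
ints = range(8)
both = ints + letters
shuffle(ints)
shuffle(letters)
shuffle(both)
print sorted(ints + letters)  # ints first, then letters
print sorted(both)            # same result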

Both print the ints first, then the letters.

As a rule, you don't want to mix types randomly within a program, and apparently Python 3 prevents it where Python 2 tries to make vague sense where none exists. You could still sort by lambda a,b: cmp(repr(a),repr(b)) (or something better) if you really want to, but it appears the language developers agreed it's an impractical default. Which behaviour is least surprising probably varies, but a problem is a lot harder to detect under the Python 2 behaviour.
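For what it's worth, cmp no longer exists in Python 3, so the repr-based comparison would need functools.cmp_to_key, or more simply key=repr. A minimal sketch, where cmp_by_repr is a name made up for illustration:

```python
from functools import cmp_to_key


def cmp_by_repr(a, b):
    """Hypothetical stand-in for Python 2's cmp(repr(a), repr(b))."""
    ra, rb = repr(a), repr(b)
    # (x > y) - (x < y) yields 1, 0 or -1, like the old cmp builtin.
    return (ra > rb) - (ra < rb)


mixed = [2, 'b', 1, 'a']
by_cmp = sorted(mixed, key=cmp_to_key(cmp_by_repr))
by_key = sorted(mixed, key=repr)  # same ordering, without a comparison function
print(by_cmp)  # ['a', 'b', 1, 2]
```

Note that this puts the strings before the ints, the opposite of Python 2's default, because the repr of a string starts with a quote character, which sorts before the digits.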
