In Python, are single character strings guaranteed to be identical?
I read somewhere (an SO post, I think, and probably somewhere else, too) that Python automatically reuses the same object for single-character strings, so not only does 'a' == 'a', but 'a' is 'a'.
However, I can't remember reading whether this is guaranteed behavior in Python or just implementation-specific.
Bonus points for official sources.
It's implementation specific. It's difficult to tell, because (as the reference says):
... for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed.
The interpreter's pretty good about ensuring they're identical, but it doesn't always work:
x = u'a'
y = u'abc'[:1]
print x == y, x is y
Run on CPython 2.6, this gives True False.
It is all implementation defined.
The documentation for intern says: "Normally, the names used in Python programs are automatically interned, and the dictionaries used to hold module, class or instance attributes have interned keys."
That means that anything that could be a name and which is known at compile time is likely (but not guaranteed) to be the same as any other occurrences of the same name.
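If you actually need identity rather than equality, you can intern explicitly instead of relying on what the compiler happens to do. A minimal sketch, assuming Python 3, where the built-in intern quoted above lives at sys.intern; the variable names are only for illustration:

```python
import sys

# Build one of the strings at run time so the compiler cannot merge the two.
parts = ["ab", "c"]
x = "".join(parts)
y = "abc"

print(x == y)  # equality always holds for equal text
print(x is y)  # implementation-dependent; do not rely on the result

# sys.intern returns the canonical copy from the interned-string table,
# so two equal strings map to the same object after interning.
xi = sys.intern(x)
yi = sys.intern(y)
print(xi is yi)
```

Only the last comparison is backed by documented behavior; the bare "x is y" line may print either value depending on the interpreter and version.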
Other strings aren't stated to be interned. Constant strings appearing in the same compilation unit are folded together (but that is also just an implementation detail) so you get:
>>> a = '!'
>>> a is '!'
False
>>> a = 'a'
>>> a is 'a'
True
>>>
A string that looks like an identifier is interned, so you get the same object even across different compilations. A string that is not an identifier is shared only within the same compilation unit:
>>> '!' is '!'
True
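To check what your own interpreter does with separate compilations, something like the following may help. It is only a sketch, assuming Python 3; literal_pair is a made-up helper name, and the identity column it prints is deliberately not predicted here because it can differ between implementations and versions:

```python
def literal_pair(source):
    """Evaluate the same literal source text in two separate compilations."""
    first = eval(compile(source, "<first>", "eval"))
    second = eval(compile(source, "<second>", "eval"))
    return first, second


for source in ("'a'", "'!'", "'hello world'"):
    first, second = literal_pair(source)
    # Equality is guaranteed; identity depends on the interpreter.
    print(source, first == second, first is second)
```

Whatever the last column shows, it describes that one interpreter only, which is why comparing strings with == is the safe choice.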