Python: Object identity assertions thrown by differences in import statement notations
When checking an object's identity, I am getting assertion errors because the object creation code imports the object-defining module under one notation (base.other_stuff.BarCode) and the identity-checking code imports that same module under a different notation (other_stuff.BarCode). (Please see below for gory details.)
It seems that the isinstance() call is sticky about the references to the object definition module, and wants it imported under the exact same notation. (I'm using version 2.5.)
I suppose I could fix this by changing the import notation in the code checking the identity, but I'm worried that I'll just propagate the same problem to other code that depends on it. And I'm sure there is some more elegant solution that I probably should be using in the first place.
So how do I fix this?
DETAILS
PythonPath: '/', '/base/'
Files:
/__init__.py
base/__init__.py
base/other_stuff/__init__.py
base/other_stuff/BarCode.py
base/stuff/__init__.py
base/stuff/FooCode.py
camp/__init__.py
camp/new_code.py
Text of base/stuff/FooCode.py:
import other_stuff.BarCode as bc

class Foo:
    def __init__(self, barThing):
        assert isinstance(barThing, bc.Bar)
Text of camp/new_code.py:
import base.stuff.FooCode as fc
import base.other_stuff.BarCode as bc
thisBar = bc.Bar()
assert isinstance(thisBar, bc.Bar)
thisFoo = fc.Foo(barThing=thisBar)
This fails. thisBar survives the assertion in new_code.py, but blows up on the assertion inside Foo.__init__ in FooCode.py.
However, it works when I modify new_code.py to import BarCode.py with import other_stuff.BarCode as bc instead, because /base/ is also on the PythonPath, so other_stuff is importable as a top-level package.
It looks like you have <root>/ and <root>/base in your sys.path, which is always bad. When you do import other_stuff.BarCode as bc from base/stuff/FooCode.py, it imports other_stuff as a top-level package, not as a subpackage of base. So after also doing import base.other_stuff.BarCode as bc, you end up with the BarCode module imported twice: once as other_stuff.BarCode and once as base.other_stuff.BarCode.
The best solution would be:
- Remove <root>/base from sys.path (or $PYTHONPATH).
- Use a relative import in base/stuff/FooCode.py: from ..other_stuff import BarCode as bc (explicit relative imports are available in Python 2.5; see the sketch below).
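A sketch of what the second point looks like in practice (assuming <root>/base really has been removed from the path, so FooCode.py is only ever imported as base.stuff.FooCode):

from ..other_stuff import BarCode as bc   # base/stuff/FooCode.py, explicit relative import (PEP 328)

class Foo:
    def __init__(self, barThing):
        # bc now refers to the single base.other_stuff.BarCode module,
        # the same one new_code.py imports, so the isinstance check holds.
        assert isinstance(barThing, bc.Bar)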
Your code layout is seriously broken. You should not have package directories in sys.path.
In your situation, Python will use two different search paths to find BarCode.py, therefore loading it twice as separate modules, base.other_stuff.BarCode and other_stuff.BarCode. This means that every object in this module exists twice, wasting memory, and naturally the object identity checks will fail:
>>> from base.other_stuff import BarCode as bc1
>>> from other_stuff import BarCode as bc2
>>> bc1
<module 'base.other_stuff.BarCode' from '.../base/other_stuff/BarCode.pyc'>
>>> bc2
<module 'other_stuff.BarCode' from '.../other_stuff/BarCode.pyc'>
>>> bc1 == bc2
False
>>> bc1 is bc2
False
Although they originate from the same source file, Python treats bc1 and bc2 as different modules.
Make sure that every module you are using can be identified uniquely by its fully-qualified name, in your case base.other_stuff.BarCode. If a module is part of a package, never add the package directory to sys.path.
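If you want to confirm that this double loading is what has happened in a running interpreter, one quick check (a small diagnostic sketch of my own, not part of the question's code) is to scan sys.modules for entries backed by the same file:

import os
import sys

# Print every loaded module whose file is BarCode.py (or its .pyc); after the
# path is cleaned up there should be exactly one entry, base.other_stuff.BarCode.
for name, mod in sys.modules.items():
    filename = getattr(mod, '__file__', None)
    if filename and os.path.basename(filename).startswith('BarCode.py'):
        print name, '->', filename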
"Notation" is the least of issues -- different notations that are defined to semantically refer to the same module are guaranteed to produce the same object. E.g.:
>>> import sys as foobar
>>> import sys as zapzip
>>> foobar is zapzip
True
The problem is, rather, that it's surely possible to import the same file more than once, in ways that don't let the import mechanism fully know what you're doing, and thus end up with distinct module objects. Overlapping paths like you're using could easily produce that, for example.
One approach (if you insist on writing code, and/or laying out your filesystem, in such a potentially confusing/misleading way;-) is to set __builtin__.__import__ to your own function that, after calling the previous/normal version, checks the __file__ attribute of the newly imported module against those already in sys.modules (worth maintaining your own dict of those, mapping each file to the canonical module object for that file), using os.path.normpath (or even stronger ways to detect synonyms for a single file, e.g. symlinks and hard links, via functionality in the standard library module os).
With this hook, you can make sure that all imports of any single given file will always result in a single canonical module object, almost no matter what gyrations occur in the paths and filesystem in question (would still be possible for a clever attacker to foil the checks by installing a tricky filesystem of their own devising, but I don't think you're actually trying to guard against deliberate cunning attacks;-).
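For concreteness, here is roughly what such a hook could look like. This is a sketch of my own (every name in it is invented, not from any library), written for Python 2.x; a real implementation would need more care around packages, fromlist handling, reloads, and extension modules:

import __builtin__
import os
import sys

_real_import = __builtin__.__import__
_canonical = {}   # normalized file path -> canonical module object

def _canon(path):
    # Normalize a module's __file__ so synonyms compare equal: resolve symlinks,
    # collapse '..' segments, and ignore .pyc/.pyo suffixes.
    path = os.path.realpath(os.path.normpath(path))
    if path.endswith('.pyc') or path.endswith('.pyo'):
        path = path[:-1]
    return os.path.normcase(path)

def _dedup_import(name, globals=None, locals=None, fromlist=None, level=-1):
    result = _real_import(name, globals, locals, fromlist, level)
    # After each import, alias any module whose file we have already seen
    # to the first (canonical) module object loaded from that file.
    for modname, mod in sys.modules.items():
        filename = getattr(mod, '__file__', None)
        if not filename:
            continue
        key = _canon(filename)
        canonical = _canonical.setdefault(key, mod)
        if canonical is not mod:
            sys.modules[modname] = canonical
    return result

__builtin__.__import__ = _dedup_import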
You are having problems because you have both <root>/ and <root>/base in your sys.path, which makes both base and other_stuff importable as top-level packages.
To the Python interpreter there are two distinct BarCode modules: base.other_stuff.BarCode and other_stuff.BarCode. The first is located inside the top-level package base.other_stuff; the second is in the separate top-level package other_stuff.
When the Python interpreter searches sys.path it finds two completely different modules. When you try to use classes from these two separate modules interchangeably, you get the errors you are seeing.
You need to clean up your Python path, probably by putting only the parent folder of base on the path.
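As a minimal sanity check (a sketch, assuming only the project root / remains on the path and FooCode.py has been changed to import BarCode via its fully-qualified or relative name), the original failing snippet should then pass, because only one BarCode module exists:

# Sketch: with /base removed from sys.path there is exactly one BarCode module,
# so the isinstance check inside Foo.__init__ now succeeds.
import base.stuff.FooCode as fc
import base.other_stuff.BarCode as bc

thisFoo = fc.Foo(barThing=bc.Bar())   # no AssertionError any more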