
Python: object identity assertions failing because of differences in import statement notation

When checking an object's identity, I am getting assertion errors because the object creation code imports the object-defining module under one notation (base.other_stuff.BarCode) and the identity-checking code imports that same module under a different notation (other_stuff.BarCode). (Please see below for gory details.)

It seems that the isinstance() call is sticky about references to the object-defining module, and wants it imported under exactly the same notation. (I'm using Python 2.5.)

I suppose I could fix this by changing the import notation in the code checking the identity, but I'm worried that I'll just propagate the same problem to other code that depends on it. And I'm sure there is some more elegant solution that I probably should be using in the first place.

So how do I fix this?

DETAILS

PythonPath: '/', '/base/'

Files:

/__init__.py
base/__init__.py
base/other_stuff/__init__.py
base/other_stuff/BarCode.py
base/stuff/__init__.py
camp/__init__.py

Text of base/stuff/FooCode.py:

import other_stuff.BarCode as bc

class Foo:
    def __init__(self, barThing):
        assert isinstance(barThing, bc.Bar)

Text of camp/new_code.py:

import base.stuff.FooCode as fc
import base.other_stuff.BarCode as bc

thisBar = bc.Bar()
assert isinstance(thisBar, bc.Bar)
thisFoo = fc.Foo(barThing=thisBar)

This fails: it survives its own assertion, but blows up on the assertion inside Foo.__init__.

However, it works when I modify new_code to import BarCode.py with:

import other_stuff.BarCode as bc

. . . because both base/ and base/other_stuff are on the PythonPath.


It looks like you have <root>/ and <root>/base in your sys.path, which is always bad. When you do import other_stuff.BarCode as bc from base/stuff/FooCode.py, it imports other_stuff as a top-level package, not as a subpackage of base. So after doing import base.other_stuff.BarCode as bc you end up with the BarCode module imported twice: once as other_stuff.BarCode and once as base.other_stuff.BarCode.

The best solution would be:

  1. Remove <root>/base from sys.path (or $PYTHONPATH).
  2. Use a relative import in base/stuff/FooCode.py: from ..other_stuff import BarCode as bc (a sketch is shown below).
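
For concreteness, here is a minimal sketch of what base/stuff/FooCode.py would look like with that relative import, assuming <root>/base has been removed from the path and FooCode is always imported as part of the base.stuff package (as camp/new_code.py already does):

# base/stuff/FooCode.py -- sketch of the suggested fix
from ..other_stuff import BarCode as bc    # explicit relative import (Python 2.5+)

class Foo:
    def __init__(self, barThing):
        # Both this file and its callers now see the single
        # base.other_stuff.BarCode module, so bc.Bar is one class.
        assert isinstance(barThing, bc.Bar)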


Your code layout is seriously broken. You should not have package directories in sys.path.

In your situation, Python will use two different search paths to find BarCode.py, therefore loading it twice as separate modules, base.other_stuff.BarCode and other_stuff.BarCode. This means that every object in this module exists twice, wasting memory, and naturally object identity checks will fail:

>>> from base.other_stuff import BarCode as bc1
>>> from other_stuff import BarCode as bc2
>>> bc1
<module 'base.other_stuff.BarCode' from '.../base/other_stuff/BarCode.pyc'>
>>> bc2
<module 'other_stuff.BarCode' from '.../other_stuff/BarCode.pyc'>
>>> bc1 == bc2
False
>>> bc1 is bc2
False

Although they originate from the same source file, Python treats bc1 and bc2 as different modules.

Make sure that every module you are using can be identified uniquely by its fully-qualified name, in your case base.other_stuff.BarCode. If a module is part of a package, never add the package directory to sys.path.
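
If you want to check whether this kind of double import has already happened in a running program, a quick diagnostic (my own sketch, not part of the advice above) is to scan sys.modules for entries that were loaded from the same file under different dotted names:

# Rough diagnostic sketch (Python 2): report modules that share a source
# file but are registered in sys.modules under different names.
import os
import sys

seen = {}
for name, module in sorted(sys.modules.items()):
    path = getattr(module, '__file__', None)
    if not path:
        continue                      # built-ins and C extensions have no file
    key = os.path.normcase(os.path.abspath(path))
    if key.endswith(('.pyc', '.pyo')):
        key = key[:-1]                # treat BarCode.pyc and BarCode.py alike
    if key in seen:
        print 'same file, two names:', seen[key], 'and', name
    else:
        seen[key] = name

In the layout above this prints a line pairing base.other_stuff.BarCode with other_stuff.BarCode once both have been imported.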


"Notation" is the least of issues -- different notations that are defined to semantically refer to the same module are guaranteed to produce the same object. E.g.:

>>> import sys as foobar
>>> import sys as zapzip
>>> foobar is zapzip
True

The problem is, rather, that it's surely possible to import the same file more than once, in ways that don't let the import mechanism fully know what you're doing, and thus end up with distinct module objects. Overlapping paths like you're using could easily produce that, for example.

One approach (if you insist on writing code, and/or laying out your filesystem, in such a potentially confusing/misleading way;-) is to set __builtin__.__import__ to your own function that, after calling the previous/normal version, checks the __file__ attribute of the newly imported module against those already in sys.modules. It's worth maintaining your own dict of those, mapping each file to the canonical module object for that file, and comparing with os.path.normpath (or even stronger ways of detecting synonyms for a single file, e.g. symlinks and hard links, via functionality in the standard library module os).

With this hook, you can make sure that all imports of any single given file will always result in a single canonical module object, almost no matter what gyrations occur in the paths and filesystem in question (would still be possible for a clever attacker to foil the checks by installing a tricky filesystem of their own devising, but I don't think you're actually trying to guard against deliberate cunning attacks;-).
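
For what it's worth, here is a very rough sketch of that kind of hook (Python 2; the names _by_file and _canonical_import are my own, and it only handles the plain import a.b.c form, ignoring from-imports and intermediate packages), so treat it as an outline of the idea rather than a drop-in fix. It has to be installed before the problematic imports happen, e.g. at the very top of camp/new_code.py:

# Sketch: wrap __builtin__.__import__ so that two modules loaded from the
# same file collapse into one canonical module object.
import __builtin__
import os
import sys

_real_import = __builtin__.__import__
_by_file = {}   # normalized source path -> canonical module object

def _canonical_import(name, globals=None, locals=None, fromlist=None, level=-1):
    result = _real_import(name, globals, locals, fromlist, level)
    module = sys.modules.get(name)        # the (sub)module that was just imported
    path = getattr(module, '__file__', None)
    if module is None or path is None:
        return result                     # built-in or C extension: nothing to do
    key = os.path.normcase(os.path.realpath(path))   # realpath also resolves symlinks
    if key.endswith(('.pyc', '.pyo')):
        key = key[:-1]                    # compare on the .py source file
    canonical = _by_file.setdefault(key, module)
    if canonical is not module:
        # The same file was already imported under another dotted name:
        # alias this name to the canonical module object.
        sys.modules[name] = canonical
        if '.' in name:
            parent_name, leaf = name.rsplit('.', 1)
            parent = sys.modules.get(parent_name)
            if parent is not None:
                setattr(parent, leaf, canonical)
    return result

__builtin__.__import__ = _canonical_import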


You are having problems because both base and other_stuff are reachable as top-level packages from the directories on sys.path.

To the Python interpreter there are two different BarCode modules: base.other_stuff.BarCode and other_stuff.BarCode. The first is located in the package base.other_stuff, and the other in the separate top-level package other_stuff.

When the Python interpreter searches sys.path it finds two completely different modules. When you try to use classes from these two separate modules interchangeably, you get the errors you are seeing.

You need to clean up your Python path, probably putting only the parent folder of base on the path.
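
Concretely (my own reconstruction of the fixed setup, assuming PYTHONPATH lists only the project root and that base/stuff/FooCode.py has been changed to import BarCode via base.other_stuff or a relative import), camp/new_code.py then works unmodified:

import base.stuff.FooCode as fc
import base.other_stuff.BarCode as bc

thisBar = bc.Bar()
assert isinstance(thisBar, bc.Bar)
thisFoo = fc.Foo(barThing=thisBar)   # the assertion inside Foo.__init__ now passes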
