开发者

Reuse existing objects for immutable objects?

In Python, how is it possible to reuse existing equal immutable objects (like is done for str)? Can t开发者_如何学Pythonhis be done just by defining a __hash__ method, or does it require more complicated measures?


If you want to create via the class constructor and have it return a previously created object then you will need to provide a __new__ method (because by the time you get to __init__ the object has already been created).

Here is a simple example - if the value used to initialise has been seen before then a previously created object is returned rather than a new one created:

class Cached(object):
    """Simple example of immutable object reuse."""

    def __init__(self, i):
        self.i = i

    def __new__(cls, i, _cache={}):
        try:
            return _cache[i]
        except KeyError:
            # you must call __new__ on the base class
            x = super(Cached, cls).__new__(cls)
            x.__init__(i)
            _cache[i] = x
            return x

Note that for this example you can use anything to initialise as long as it's hashable. And just to show that objects really are being reused:

>>> a = Cached(100)
>>> b = Cached(200)
>>> c = Cached(100)
>>> a is b
False
>>> a is c
True


There are two 'software engineering' solutions to this that don't require any low-level knowledge of Python. They apply in the following scenarios:

First Scenario: Objects of your class are 'equal' if they are constructed with the same constructor parameters, and equality won't change over time after construction. Solution: Use a factory that hashses the constructor parameters:

class MyClass:
  def __init__(self, someint, someotherint):
    self.a = someint
    self.b = someotherint

cachedict = { }
def construct_myobject(someint, someotherint):
  if (someint, someotherint) not in cachedict:
    cachedict[(someint, someotherint)] = MyClass(someint, someotherint)
  return cachedict[(someint, someotherint)]

This approach essentially limits the instances of your class to one unique object per distinct input pair. There are obvious drawbacks as well: not all types are easily hashable and so on.

Second Scenario: Objects of your class are mutable and their 'equality' may change over time. Solution: define a class-level registry of equal instances:

class MyClass:
  registry = { }

  def __init__(self, someint, someotherint, third):
    MyClass.registry[id(self)] = (someint, someotherint)
    self.someint = someint
    self.someotherint = someotherint
    self.third = third

  def __eq__(self, other):
    return MyClass.registry[id(self)] == MyClass.registry[id(other)]

  def update(self, someint, someotherint):
    MyClass.registry[id(self)] = (someint, someotherint)

In this example, objects with the same someint, someotherint pair are equal, while the third parameter does not factor in. The trick is to keep the parameters in registry in sync. As an alternative to update, you could override getattr and setattr for your class instead; this would ensure that any assignment foo.someint = y would be kept synced with your class-level dictionary. See an example here.


I believe you would have to keep a dict {args: object} of instances already created, then override the class' __new__ method to check in that dictionary, and return the relevant object if it already existed. Note that I haven't implemented or tested this idea. Of course, strings are handled at the C level.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜