Architecture for providing different linear algebra back-ends
I am prototyping a new system in Python; the functionality is mostly numerical.
An important requirement is the ability to use different linear algebra back-ends: from individual user implementations to generic libraries, such as NumPy. The linear algebra implementation (that is, the back-end) must be independent from the interface.
My initial architectural attempt is as follows:
(1) Define the system interface
>>> v1 = Vector([1,2,3])
>>> v2 = Vector([4,5,6])
>>> print(v1 * v2)
>>> # prints "Vector([4, 10, 18])"
(2) Implement the code that lets that interface be used independently of the back-end
# this example uses numpy as the back-end, but I mean
# to do this for a general back-end
import numpy


def numpy_array(*args):
    # creates a numpy array from the arguments
    return numpy.array(*args)


class VectorBase(type):
    def __init__(cls, name, bases, attrs):
        engine = attrs.pop("engine", None)
        if not engine:
            raise RuntimeError("you need to specify an engine")

        # this implementation would change depending on `engine`
        def new(cls, *args):
            return numpy_array(*args)

        setattr(cls, "new", classmethod(new))


# Python 3 metaclass syntax (Python 2 uses a `__metaclass__` attribute instead)
class Vector(object, metaclass=VectorBase):
    # I could change this at run time
    # and offer alternative back-ends
    engine = "numpy"

    @classmethod
    def create(cls, v):
        # wrap an existing back-end vector without converting it again
        nv = cls()
        nv._v = v
        return nv

    def __init__(self, *args):
        self._v = None
        if args:
            self._v = self.new(*args)

    def __repr__(self):
        l = [item for item in self._v]
        return "Vector(%s)" % repr(l)

    def __mul__(self, other):
        try:
            # other is a Vector: multiply the two back-end vectors
            return Vector.create(self._v * other._v)
        except AttributeError:
            # other is a scalar or a raw back-end object
            return Vector.create(self._v * other)

    def __rmul__(self, other):
        return self.__mul__(other)
This simple example works as follows: the Vector class keeps a reference to a vector instance made by the back-end (numpy.ndarray in the example); all arithmetic calls are implemented by the interface, but their evaluation is deferred to the back-end.
In practice, the interface overloads all the appropriate operators and defers to the back-end (the example only shows __mul__ and __rmul__, but you can follow that the same would be done for every operation).
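As a rough illustration of "the same would be done for every operation", the forwarding methods could be generated in a loop instead of written by hand. This is only a sketch under assumptions: the `_delegate` helper is hypothetical, and a plain Python list stands in for the back-end's native vector so the snippet runs without NumPy. Reflected operators such as __rmul__ are left out for brevity.

```python
import operator


class Vector:
    def __init__(self, values=()):
        # A plain list stands in for the back-end's native vector (assumption).
        self._v = list(values)

    @classmethod
    def create(cls, native):
        # Wrap an already-built native vector.
        nv = cls()
        nv._v = native
        return nv

    def __repr__(self):
        return "Vector(%r)" % list(self._v)


def _delegate(op):
    # Hypothetical helper: build a binary operator method that forwards `op`
    # element by element to the stand-in back-end.
    def method(self, other):
        if isinstance(other, Vector):
            result = [op(a, b) for a, b in zip(self._v, other._v)]
        else:
            result = [op(a, other) for a in self._v]
        return Vector.create(result)
    return method


# Attach one forwarding method per operator instead of writing each by hand.
for name in ("add", "sub", "mul", "truediv"):
    setattr(Vector, "__%s__" % name, _delegate(getattr(operator, name)))

print(Vector([1, 2, 3]) * Vector([4, 5, 6]))  # Vector([4, 10, 18])
print(Vector([1, 2, 3]) + 1)                  # Vector([2, 3, 4])
```

With a real back-end such as NumPy, the body of `method` would presumably shrink to a single call like `op(self._v, other._v)`, since the native arrays already support the operators.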
I am willing to lose some performance in exchange for customizability. Even though my example works, it does not feel right -- I would be crippling the back-end with so many constructor calls! This calls for a different metaclass implementation, allowing for better call deferral.
So, how would you recommend that I implement this functionality? I need to stress the importance of keeping all of the system's Vector instances homogeneous and independent of the linear algebra back-end.
You should check out PEP 3141 and the standard library's abc module, which provides ABCMeta.
For a detailed explanation of how to use ABCMeta, the always helpful PyMOTW has a nice write-up.
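To make the ABCMeta suggestion a bit more concrete, here is a minimal sketch of how an abstract base class could pin down what every back-end has to provide. The names `VectorBackend`, `ListBackend`, and `IncompleteBackend` and their methods are made up for illustration; they are not part of any library.

```python
from abc import ABCMeta, abstractmethod


class VectorBackend(metaclass=ABCMeta):
    """Hypothetical contract that every linear algebra back-end must fulfil."""

    @abstractmethod
    def new(self, values):
        """Build the back-end's native vector from an iterable."""

    @abstractmethod
    def mul(self, a, b):
        """Return the element-wise product of two native vectors."""


class ListBackend(VectorBackend):
    """Pure-Python back-end, used here so the sketch has no dependencies."""

    def new(self, values):
        return list(values)

    def mul(self, a, b):
        return [x * y for x, y in zip(a, b)]


class IncompleteBackend(VectorBackend):
    """Forgets to implement mul(), so it cannot be instantiated."""

    def new(self, values):
        return list(values)


backend = ListBackend()
product = backend.mul(backend.new([1, 2, 3]), backend.new([4, 5, 6]))
print(product)  # [4, 10, 18]

error = None
try:
    IncompleteBackend()
except TypeError as exc:
    # ABCMeta refuses to instantiate a class with unimplemented abstract methods.
    error = str(exc)
print(error is not None)  # True
```

The point of the design is that a back-end missing an operation fails at construction time, rather than later in the middle of a computation.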
Why not simply make a "virtual" class (AbstractVector) which is like Vector in your example, and make different subclasses of it for each implementation?
An engine could be chosen by doing Vector = NumPyVector or something like that.
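A minimal sketch of that idea follows. `ListVector` and the underscore-prefixed hooks are hypothetical names; a pure-Python subclass is used so the example runs without NumPy, and a NumPyVector would presumably implement the same three hooks on top of numpy.ndarray.

```python
class AbstractVector:
    """Shared interface; subclasses supply the back-end specific pieces."""

    def __init__(self, data):
        self._v = self._from_sequence(data)

    @classmethod
    def _wrap(cls, native):
        # Wrap an existing native vector without running the constructor again.
        obj = cls.__new__(cls)
        obj._v = native
        return obj

    # Hooks each back-end subclass has to implement.
    def _from_sequence(self, data):
        raise NotImplementedError

    def _mul(self, other):
        raise NotImplementedError

    def _to_list(self):
        raise NotImplementedError

    def __mul__(self, other):
        native = other._v if isinstance(other, AbstractVector) else other
        return self._wrap(self._mul(native))

    # Element-wise multiplication is commutative, so reuse __mul__.
    __rmul__ = __mul__

    def __repr__(self):
        return "Vector(%r)" % self._to_list()


class ListVector(AbstractVector):
    """Pure-Python back-end (stand-in for something like a NumPyVector)."""

    def _from_sequence(self, data):
        return list(data)

    def _mul(self, other):
        if isinstance(other, list):
            return [a * b for a, b in zip(self._v, other)]
        return [a * other for a in self._v]

    def _to_list(self):
        return list(self._v)


# Choosing the engine is a single assignment.
Vector = ListVector

v1 = Vector([1, 2, 3])
v2 = Vector([4, 5, 6])
print(v1 * v2)  # Vector([4, 10, 18])
print(2 * v1)   # Vector([2, 4, 6])
```

Because results are wrapped with `_wrap` rather than passed back through `__init__`, the back-end's constructor is not called again for every arithmetic result.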
Just FYI, you can easily configure and build NumPy to use Intel's Math Kernel Library or AMD's Core Math Library instead of the usual ATLAS + LAPACK. This is as simple as creating a site.cfg file with the blas_libs, lapack_libs, library_dirs, and include_dirs variables set appropriately. (Details for setting these options for MKL and ACML are readily Googleable.) Place it alongside the setup.py script and build as usual.
To switch between these standard linear algebra libraries, you could build a different instance of NumPy for each and manage them using virtualenvs, for example.
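If you go that route, it helps to confirm which libraries the NumPy in a given virtualenv was actually built against. A small sketch using numpy.show_config(); the `describe_numpy_build` wrapper is a hypothetical helper, and it only reports what the installed build says about itself.

```python
try:
    import numpy
except ImportError:
    numpy = None


def describe_numpy_build():
    """Print the build configuration of the active NumPy, if one is installed."""
    if numpy is None:
        print("NumPy is not installed in this environment")
        return False
    # show_config() prints the BLAS/LAPACK details recorded at build time.
    numpy.show_config()
    return True


describe_numpy_build()
```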
I know that doesn't give you the flexibility you need to use your own custom math libraries, but just thought I'd throw that out there. And though I haven't looked into it, I imagine you might also be able to get NumPy to build against a custom library with less effort than it would take to build your own front-end, especially if you want to retain the extensive functionality of the NumPy/SciPy edifice.