Python Shared Libraries
As I understand it, there are two types of modules in Python (CPython):

- the .so (C extension)
- the .py
The .so modules are loaded into memory only once, even when different processes/interpreters import them.
The .py modules are loaded once per process/interpreter (unless reloaded explicitly).
Is there a way .py can be shared by multiple processes/interpreters?
One would still need some layer where one could store modifications made to the module. I'm thinking one could embed the interpreter in a .so as a first step. Is there an already developed solution?
I acknowledge I may be very far off in terms of feasible ideas about this. Please excuse my ignorance.
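The premise in the question can be checked directly. The following is a minimal POSIX-only sketch (it uses os.fork, so it won't run on Windows) showing that a module-level variable is private to each process: a change made in a forked child never reaches the parent.

```python
import os

# A "module-level" variable, standing in for state inside any .py module.
counter = 0

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child process: mutate the variable; copy-on-write gives the child
    # its own pages, so this change is invisible to the parent.
    counter += 100
    os.write(w, str(counter).encode())
    os._exit(0)
else:
    os.waitpid(pid, 0)
    child_value = int(os.read(r, 32))
    print(counter, child_value)  # parent still sees 0; child saw 100
```

Each interpreter holds its own copy of every imported .py module's namespace, which is exactly why the memory isn't shared.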
The reason .so (or .pyd) files take up memory space only once (except for their variables segment) is that they are recognized by the OS kernel as object code. .py files are recognized only as text files/data; it's the Python interpreter that grants them "code" status. Embedding the Python interpreter in a shared library won't resolve this.
Loading .py files only once despite their use in multiple processes would require changes deep inside CPython.
Your best option, if you want to save memory space, is to compile Python modules to .so files using Cython. That may require some changes to the modules.
No, there is no way. Python is so highly dynamic that each process keeps its own copy of module state, and I'm not sure sharing would make sense anyway, since any process could monkey-patch the modules, for example. Perhaps there would be a way to share the code itself, but the benefit would be very small for something that is likely to be a lot of work.
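The monkey-patching point can be illustrated with a short sketch: any process may rebind names inside an imported module at runtime, so if two processes shared one in-memory copy of a .py module, one process's patch would leak into the other, breaking Python's semantics.

```python
# Sketch: a process-local monkey-patch of a stdlib module. Only this
# interpreter sees the change; that per-process isolation is exactly
# what sharing .py modules across processes would destroy.
import json

original = json.dumps

def noisy_dumps(obj, **kwargs):
    # Wrap the original function; other processes still see json.dumps
    # unchanged.
    return "patched:" + original(obj, **kwargs)

json.dumps = noisy_dumps
print(json.dumps({"a": 1}))  # patched:{"a": 1}
```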
The best answer I can give you is "not impossible, but I don't know if it happens".
You have to think about what is actually happening. When Python encounters a .py file, it has to read the file, compile it, and then execute the resulting byte code. Compilation takes place inside the process, so its result can't be shared.
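Those per-process steps can be sketched with the built-in compile and exec, which is roughly what import does internally for a .py file (the source string and module name here are made up for illustration):

```python
# Sketch of what import does with a .py file: read the source, compile
# it to a code object inside this process, then execute that byte code
# in a fresh module namespace.
source = "ANSWER = 6 * 7\n"

code = compile(source, "<fake_module.py>", "exec")  # per-process step
namespace = {}
exec(code, namespace)                               # runs the byte code
print(namespace["ANSWER"])  # 42
```

Because the code object lives in this process's heap, no other interpreter can reuse it.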
When you encounter a .so file, the operating system maps in the memory that has been reserved for that library. All processes share the same read-only code region, and so you save memory.
Python already has a third way of loading modules. If it can, upon loading a .py file, it creates a pre-compiled .pyc file that is faster to load (you avoid compilation). The next time, it loads the .pyc file instead. It could conceivably load the .pyc file by just mmapping it into memory (using MAP_PRIVATE in case other things mess with that byte code later). If it did that, shared modules would by default wind up in shared memory.
I have no idea whether it has actually been implemented in this way.
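The mmap idea from the answer above can be sketched in a few lines: compile a throwaway module to a .pyc with py_compile, then map the file with MAP_PRIVATE. This is a POSIX-only illustration (MAP_PRIVATE/PROT_READ are not available on Windows), not how CPython's importer actually works.

```python
# Sketch: byte-compile a module, then mmap the resulting .pyc with
# MAP_PRIVATE (copy-on-write), the mapping mode the answer imagines
# import could use so identical byte code pages are shared until written.
import importlib.util
import mmap
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "mod.py")
    with open(src, "w") as f:
        f.write("X = 1\n")
    # Compile to byte code, much as import would on first load.
    pyc = py_compile.compile(src, cfile=os.path.join(d, "mod.pyc"))

    with open(pyc, "rb") as f:
        mapped = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ,
                           flags=mmap.MAP_PRIVATE)
    # A .pyc starts with CPython's magic number for this version.
    magic_ok = mapped[:4] == importlib.util.MAGIC_NUMBER
    size = len(mapped)
    mapped.close()

print(magic_ok, size)
```

In practice CPython reads the .pyc into ordinary heap objects rather than executing from the mapping, which is why the sharing described here doesn't happen today.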