Compilers targeting .pyc files?
Out of curiosity, are there many compilers out there which target .pyc
files?
After a bit of Googling, the only two I can find are:
- unholy: why_'s Ruby-to-pyc compiler
- Python: The PSF's Python to pyc compiler
So… Are there any more?
(as a side note, I got thinking about this because I want to write a Scheme-to-pyc compiler)
(as a second side note, I'm not under any illusion that a Scheme-to-pyc compiler would be useful, but it would give me an incredible excuse to learn some internals of both Scheme and Python)
"I want to write a Scheme-to-pyc compiler".
My brain hurts! Why would you want to do that? Python byte code is an intermediate language specifically designed to meet the needs of the Python language and designed to run on Python virtual machines that, again, have been tailored to the needs of Python. Some of the most important areas of Python development these days are moving Python to other "virtual machines", such as Jython (JVM), IronPython (.NET), PyPy and the Unladen Swallow project (moving CPython to an LLVM-based representation). Trying to squeeze the syntax and semantics of another, very different language (Scheme) into the intermediate representation of another high-level language seems to be attacking the problem (whatever the problem is) at the wrong level. So, in general, it doesn't seem like there would be many .pyc compilers out there and there's a good reason for that.
I wrote a compiler several years ago which accepted a lisp-like language called "Noodle" and produced Python bytecode. While it never became particularly useful, it was a tremendously good learning experience both for understanding Common Lisp better (I copied several of its features) and for understanding Python better.
I can think of two particular cases when it might be useful to target Python bytecode directly, instead of producing Python and passing it on to a Python compiler:
- Full closures: in Python before 3.0 (before the nonlocal keyword), you can't modify the value of a closed-over variable without resorting to bytecode hackery. You can mutate values instead, so it's common practice to have a closure referencing a list, for example, and changing the first element in it from the inner scope. That can get real annoying. The restriction is part of the syntax, though, not the Python VM. My language had explicit variable declaration, so it successfully provided "normal" closures with modifiable closed-over values.
- Getting at a traceback object without referencing any builtins. Real niche case, for sure, but I used it to break an early version of the "safelite" jail. See my posting about it.
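To make the closure point in the first bullet concrete, here is a minimal sketch of the two idioms it mentions: the mutable-list workaround and the Python 3 nonlocal form. The function names are made up for illustration and do not come from Noodle.

```python
def make_counter_with_cell():
    # Workaround usable before `nonlocal` existed: keep the value in a
    # one-element list and mutate the list instead of rebinding the name.
    count = [0]

    def increment():
        count[0] += 1
        return count[0]

    return increment


def make_counter_with_nonlocal():
    # Python 3 form: `nonlocal` lets the inner function rebind the
    # variable that belongs to the enclosing scope.
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment
```

Both counters behave the same from the caller's side; only the second rebinds the closed-over name directly.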
So yeah, it's probably way more work than it's worth, but I enjoyed it, and you might too.
I suggest you focus on CPython.
http://www.network-theory.co.uk/docs/pytut/CompiledPythonfiles.html
Rather than a Scheme to .pyc translator, I suggest you write a Scheme to Python translator, and then let CPython handle the conversion to .pyc. (There is precedent for doing it this way; the first C++ compiler was Cfront which translated C++ into C, and then let the system C compiler do the rest.)
From what I know of Scheme, it wouldn't be that difficult to translate Scheme to Python.
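As a rough illustration of the translate-then-compile approach, the sketch below turns a tiny prefix-arithmetic expression (already parsed into nested Python lists) into Python source and hands it to the built-in compile(). The names to_python and evaluate are hypothetical, and this covers only +, - and * with two or more operands, nowhere near real Scheme.

```python
ARITHMETIC_OPS = ("+", "-", "*")


def to_python(expr):
    """Translate a nested-list prefix expression into Python source text."""
    if isinstance(expr, (int, float)):
        return repr(expr)
    if isinstance(expr, str):
        # Assumption: symbols are restricted to valid Python identifiers.
        if not expr.isidentifier():
            raise ValueError("unsupported symbol: %r" % expr)
        return expr
    op, *args = expr
    if op not in ARITHMETIC_OPS or len(args) < 2:
        raise ValueError("unsupported form: %r" % (expr,))
    parts = [to_python(arg) for arg in args]
    return "(" + f" {op} ".join(parts) + ")"


def evaluate(expr, env=None):
    """Compile the generated source with CPython and evaluate it."""
    code = compile(to_python(expr), "<scheme-sketch>", "eval")
    return eval(code, {"__builtins__": {}}, dict(env or {}))
```

For example, to_python(["+", 1, ["*", 2, 3]]) yields the string "(1 + (2 * 3))". To get an actual .pyc file, the generated source could be written to disk and passed to py_compile.compile from the standard library.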
One warning: Scheme code running on the Python virtual machine will probably not be as fast as it would be on a native Scheme implementation. For example, Python doesn't automatically turn tail recursion into iteration, and Python has a relatively shallow stack, so your translator would actually need to turn tail recursion into iteration itself.
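One way a translator might deal with tail calls is a trampoline: instead of making the tail call, the generated function returns a zero-argument thunk, and a driver loop keeps calling thunks until a real value comes back. This is only a sketch of one possible technique, not something the answer prescribes; trampoline and sum_to are hypothetical names.

```python
def trampoline(fn, *args):
    # Run fn, then keep invoking returned thunks in a loop, so the Python
    # call stack never grows with the number of tail calls.
    # Caveat of this sketch: a function whose real result is itself
    # callable would be mistaken for a thunk.
    result = fn(*args)
    while callable(result):
        result = result()
    return result


def sum_to(n, acc=0):
    # Tail-recursive sum of 1..n, written in trampolined style: the tail
    # call is wrapped in a lambda instead of being executed directly.
    if n == 0:
        return acc
    return lambda: sum_to(n - 1, acc + n)
```

Calling trampoline(sum_to, 100000) completes without hitting the recursion limit, whereas a direct recursive version of the same function would raise RecursionError at that depth under CPython's default settings.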
As a bonus, once Unladen Swallow speeds up Python, your Scheme-to-Python translator would benefit, and at that point might even become practical!
If this seems like a fun project to you, I say go for it. Not every project has to be immediately practical.
P.S. If you want a project that is somewhat more practical, you might want to write an AWK to Python translator. That way, people with legacy AWK scripts could easily make the leap forward to Python!
Just for your interest, I have written a toy compiler from a simple LISP to Python. Practically, this is a LISP to pyc compiler.
Have a look: sinC - The tiniest LISP compiler
Probably a bit late to the party, but if you're still interested, the clojure-py project (https://github.com/halgari/clojure-py) is now able to compile a significant subset of Clojure to Python bytecode. Some help is always welcome, though.
Targeting bytecode is not that hard in itself, except for one thing: it is not stable across Python versions (e.g. MAKE_FUNCTION pops 2 elements from the stack in Python 3 but only 1 in Python 2), and these differences are not clearly documented in a single spot (afaict), so you will probably need some abstraction layer.
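A small sketch of how one might inspect what the running interpreter expects before emitting bytecode for it, using only the standard dis, importlib.util and sys modules. The helper names are hypothetical; an abstraction layer like the one mentioned above could start from this kind of information.

```python
import dis
import importlib.util
import sys


def describe_bytecode_target():
    """Summarise the bytecode flavour of the running interpreter."""
    return {
        "version": sys.version_info[:2],
        # The magic number at the start of every .pyc file; it changes
        # whenever the bytecode format changes.
        "magic": importlib.util.MAGIC_NUMBER,
        # Number of opcode names known to this interpreter.
        "opcode_count": len(dis.opmap),
    }


def opcodes_used(source):
    """Return the sorted set of opcode names CPython emits for `source`."""
    code = compile(source, "<sketch>", "exec")
    return sorted({ins.opname for ins in dis.get_instructions(code)})
```

Running opcodes_used on the same snippet under different Python versions is a quick way to see how the instruction set has drifted between releases.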