RAII in Python - automatic destruction when leaving a scope
I've been trying to find RAII in Python. Resource Allocation Is Initialization is a pattern in C++ whereby an object is initialized as it is created. If it fails, then it throws an exception. In this way, the programmer knows that the object will never be left in a half-constructed state. Python can do this much.
But RAII also works with the scoping rules of C++ to ensure the prompt destruction of the object. As soon as the variable pops off the stack it is destroyed. This may happen in Python, but only if there are no external or circular references.
More importantly, a name for an object still exists until the function it is in exits (and sometimes longer). Variables at the module level will stick around for the life of the module.
I'd like to get an error if I do something like this:
for x in some_list:
...
... 100 lines later ...
for i in x:
# Oops! Forgot to define x first, but... where's my error?
...
I could manually delete the names after I've used it, but that would be quite ugly, and require effort on my part.
And I'd like it to Do-What-I-Mean in this case:
for x in some_list:
surface = x.getSurface()
new_points = []
for x,y,z in surface.points:
... # Do something with the points
new_points.append( (x,y,z) )
surface.points = new_points
x.setSurface(surface)
Python does some scoping, but not at the indentation level, just at the functional level. It seems silly to require that I make a new function just to scope the variables so I can reuse a name.
Python 2.5 has the "with" statement
but that requires that I explicitly put in __enter__
and __exit__
functions
and generally seems more oriented towards cleaning up resources like files
and mutex locks regardless of the exit vector. It doesn't help with scoping.
Or am I missing something?
I've searched for "Python RAII" and "Python scope" and I wasn't able to find anything that addressed the issue directly and authoritatively. I've looked over all the PEPs. The concept doesn't seem to be addressed within Python.
Am I a bad person because I want to have scoping variables in Python? Is that just too un-Pythonic?
Am I not grokking it?
Perhaps I'm trying to take away the bene开发者_开发百科fits of the dynamic aspects of the language. Is it selfish to sometimes want scope enforced?
Am I lazy for wanting the compiler/interpreter to catch my negligent variable reuse mistakes? Well, yes, of course I'm lazy, but am I lazy in a bad way?
tl;dr RAII is not possible, you mix it up with scoping in general and when you miss those extra scopes you're probably writing bad code.
Perhaps I don't get your question(s), or you don't get some very essential things about Python... First off, deterministic object destruction tied to scope is impossible in a garbage collected language. Variables in Python are merely references. You wouldn't want a malloc
'd chunk of memory to be free
'd as soon as a pointer pointing to it goes out of scope, would you? Practical exception in some circumstances if you happen to use ref counting - but no language is insane enough to set the exact implementation in stone.
And even if you have reference counting, as in CPython, it's an implementation detail. Generally, including in Python which has various implementations not using ref counting, you should code as if every object hangs around until memory runs out.
As for names existing for the rest of a function invocation: You can remove a name from the current or global scope via the del
statement. However, this has nothing to do with manual memory management. It just removes the reference. That may or may not happen to trigger the referenced object to be GC'd and is not the point of the exercise.
- If your code is long enough for this to cause name clashes, you should write smaller functions. And use more descriptive, less likely-to-clash names. Same for nested loops overwriting the out loop's iteration variable: I'm yet to run into this issue, so perhaps your names are not descriptive enough or you should factor these loops apart?
You are correct, with
has nothing to do with scoping, just with deterministic cleanup (so it overlaps with RAII in the ends, but not in the means).
Perhaps I'm trying to take away the benefits of the dynamic aspects of the language. Is it selfish to sometimes want scope enforced?
No. Decent lexical scoping is a merit independent of dynamic-/staticness. Admittedly, Python (2 - 3 pretty much fixed this) has weaknesses in this regard, although they're more in the realm of closures.
But to explain "why": Python must be conservative with where it starts a new scope because without declaration saying otherwise, assignment to a name makes it a local to the innermost/current scope. So e.g. if a for loop had it's own scope, you couldn't easily modify variables outside of the loop.
Am I lazy for wanting the compiler/interpreter to catch my negligent variable reuse mistakes? Well, yes, of course I'm lazy, but am I lazy in a bad way?
Again, I imagine that accidential resuse of a name (in a way that introduces errors or pitfalls) is rare and a small anyway.
Edit: To state this again as clearly as possible:
- There can't be stack-based cleanup in a language using GC. It's just not possibly, by definition: a variable is one of potentially many references to objects on the heap that neither know nor care about when variables go out of scope, and all memory management lies in the hands of the GC, which runs when it likes to, not when a stack frame is popped. Resource cleanup is solved differently, see below.
- Deterministic cleanup happens through the
with
statement. Yes, it doesn't introduce a new scope (see below), because that's not what it's for. It doesn't matter the name the managed object is bound to isn't removed - the cleanup happened nonetheless, what remains is a "don't touch me I'm unusable" object (e.g. a closed file stream). - Python has a scope per function, class, and module. Period. That's how the language works, whether you like it or not. If you want/"need" more fine-grained scoping, break the code into more fine-grained functions. You might wish for more fine-grained scoping, but there isn't - and for reasons pointed out earlier in this answer (three paragraphs above the "Edit:"), there are reasons for this. Like it or not, but this is how the language works.
You are right about
with
-- it is completely unrelated to variable scoping.Avoid global variables if you think they are a problem. This includes module level variables.
The main tool to hide state in Python are classes.
Generator expressions (and in Python 3 also list comprehensions) have their own scope.
If your functions are long enough for you to lose track of the local variables, you should probably refactor your code.
But RAII also works with the scoping rules of C++ to ensure the prompt destruction of the object.
This is considered unimportant in GC languages, which are based on the idea that memory is fungible. There is no pressing need to reclaim an object's memory as long as there's enough memory elsewhere to allocate new objects. Non-fungible resources like file handles, sockets, and mutexes are considered a special case to be dealt with specially (e.g., with
). This contrasts with C++'s model that treats all resources the same.
As soon as the variable pops off the stack it is destroyed.
Python doesn't have stack variables. In C++ terms, everything is a shared_ptr
.
Python does some scoping, but not at the indentation level, just at the functional level. It seems silly to require that I make a new function just to scope the variables so I can reuse a name.
It also does scoping at the generator comprehension level (and in 3.x, in all comprehensions).
If you don't want to clobber your for
loop variables, don't use so many for
loops. In particular, it's un-Pythonic to use append
in a loop. Instead of:
new_points = []
for x,y,z in surface.points:
... # Do something with the points
new_points.append( (x,y,z) )
write:
new_points = [do_something_with(x, y, z) for (x, y, z) in surface.points]
or
# Can be used in Python 2.4-2.7 to reduce scope of variables.
new_points = list(do_something_with(x, y, z) for (x, y, z) in surface.points)
Basically you are probably using the wrong language. If you want sane scoping rules and reliable destruction then stick with C++ or try Perl. The GC debate about when memory is released seems to miss the point. It's about releasing other resources like mutexes and file handles. I believe C# makes the distinction between a destructor that is called when the reference count goes to zero and when it decides to recycle the memory. People aren't that concerned about the memory recycling but do want to know as soon as it is no longer referenced. It's a pity as Python had real potential as a language. But it's unconventional scoping and unreliable destructors (or at least implementation dependent ones) means that one is denied the power you get with C++ and Perl.
Interesting the comment made about just using new memory if it's available rather than recycling old in GC. Isn't that just a fancy way of saying it leaks memory :-)
When switching to Python after years of C++, I have found it tempting to rely on __del__
to mimic RAII-type behavior, e.g. to close files or connections. However, there are situations (e.g. observer pattern as implemented by Rx) where the thing being observed maintains a reference to your object, keeping it alive! So, if you want to close the connection before it is terminated by the source, you won't get anywhere by trying to do that in __del__
.
The following situation arises in UI programming:
class MyComponent(UiComponent):
def add_view(self, model):
view = TheView(model) # observes model
self.children.append(view)
def remove_view(self, index):
del self.children[index] # model keeps the child alive
So, here is way to get RAII-type behavior: create a container with add and remove hooks:
import collections
class ScopedList(collections.abc.MutableSequence):
def __init__(self, iterable=list(), add_hook=lambda i: None, del_hook=lambda i: None):
self._items = list()
self._add_hook = add_hook
self._del_hook = del_hook
self += iterable
def __del__(self):
del self[:]
def __getitem__(self, index):
return self._items[index]
def __setitem__(self, index, item):
self._del_hook(self._items[index])
self._add_hook(item)
self._items[index] = item
def __delitem__(self, index):
if isinstance(index, slice):
for item in self._items[index]:
self._del_hook(item)
else:
self._del_hook(self._items[index])
del self._items[index]
def __len__(self):
return len(self._items)
def __repr__(self):
return "ScopedList({})".format(self._items)
def insert(self, index, item):
self._add_hook(item)
self._items.insert(index, item)
If UiComponent.children
is a ScopedList
, which calls acquire
and dispose
methods on the children, you get the same guarantee of deterministic resource acquisition and disposal as you are used to in C++.
精彩评论