Why does python use two underscores for certain things? [duplicate]

2023-01-11 07:02 问答作者：

This question already has answers here: Why do some functions have underscores "__" before and after the function name? (6 answers) Closed 3 months ago.

I'm fairly new to actual programming languages, and Python is my first one. I know my way around Linux a bit, enough to get a summer job with it (I'm still in开发者_运维百科 high school), and on the job, I have a lot of free time which I'm using to learn Python.

One thing's been getting me though. What exactly is different in Python when you have expressions such as

x.__add__(y) <==> x+y
x.__getattribute__('foo') <==> x.foo

I know what methods do and stuff, and I get what they do, but my question is: How are those double underscore methods above different from their simpler looking equivalents?

P.S., I don't mind being lectured on programming history, in fact, I find it very useful to know :) If these are mainly historical aspects of Python, feel free to start rambling.

Here is the creator of Python explaining it:

... rather than devising a new syntax for special kinds of class methods (such as initializers and destructors), I decided that these features could be handled by simply requiring the user to implement methods with special names such as __init__, __del__, and so forth. This naming convention was taken from C where identifiers starting with underscores are reserved by the compiler and often have special meaning (e.g., macros such as __FILE__ in the C preprocessor).

...

I also used this technique to allow user classes to redefine the behavior of Python's operators. As previously noted, Python is implemented in C and uses tables of function pointers to implement various capabilities of built-in objects (e.g., “get attribute”, “add” and “call”). To allow these capabilities to be defined in user-defined classes, I mapped the various function pointers to special method names such as __getattr__, __add__, and __call__. There is a direct correspondence between these names and the tables of function pointers one has to define when implementing new Python objects in C.

When you start a method with two underscores (and no trailing underscores), Python's name mangling rules are applied. This is a way to loosely simulate the private keyword from other OO languages such as C++ and Java. (Even so, the method is still technically not private in the way that Java and C++ methods are private, but it is "harder to get at" from outside the instance.)

Methods with two leading and two trailing underscores are considered to be "built-in" methods, that is, they're used by the interpreter and are generally the concrete implementations of overloaded operators or other built-in functionality.

Well, power for the programmer is good, so there should be a way to customize behaviour. Like operator overloading (__add__, __div__, __ge__, ...), attribute access (__getattribute__, __getattr__ (those two are differnt), __delattr__, ...) etc. In many cases, like operators, the usual syntax maps 1:1 to the respective method. In other cases, there is a special procedure which at some point involves calling the respective method - for example, __getattr__ is only called if the object doesn't have the requested attribute and __getattribute__ is not implemented or raised AttributeError. And some of them are really advanced topics that get you deeeeep into the object system's guts and are rarely needed. So no need to learn them all, just consult the reference when you need/want to know. Speaking of reference, here it is.

They are used to specify that the Python interpreter should use them in specific situations.

E.g., the __add__ function allows the + operator to work for custom classes. Otherwise you will get some sort of not defined error when attempting to add.

From an historical perspective, leading underscores have often been used as a method for indicating to the programmer that the names are to be considered internal to the package/module/library that defines them. In languages which do not provide good support for private namespaces, using underscores is a convention to emulate that. In Python, when you define a method named '__foo__' the maintenance programmer knows from the name that something special is going on which is not happening with a method named 'foo'. If Python had choosen to use 'add' as the internal method to overload '+', then you could never have a class with a method 'add' without causing much confusion. The underscores serve as a cue that some magic will happen.

A number of other questions are now marked as duplicates of this question, and at least two of them ask what either the __spam__ methods are called, or what the convention is called, and none of the existing answers cover that, so:

There actually is no official name for either.

Many developers unofficially call them "dunder methods", for "Double UNDERscore".

Some people use the term "magic methods", but that's somewhat ambiguous between meaning dunder methods, special methods (see below), or something somewhere between the two.

There is an official term "special attributes", which overlaps closely but not completely with dunder methods. The Data Model chapter in the reference never quite explains what a special attribute is, but the basic idea is that it's at least one of the following:

An attribute that's provided by the interpreter itself or its builtin code, like __name__ on a function.
An attribute that's part of a protocol implemented by the interpreter itself, like __add__ for the + operator, or __getitem__ for indexing and slicing.
An attribute that the interpreter is allowed to look up specially, by ignoring the instance and going right to the class, like __add__ again.

Most special attributes are methods, but not all (e.g., __name__ isn't). And most use the "dunder" convention, but not all (e.g., the next method on iterators in Python 2.x).

And meanwhile, most dunder methods are special attributes, but not all—in particular, it's not that uncommon for stdlib or external libraries to want to define their own protocols that work the same way, like the pickle protocol.

[Speculation] Python was influenced by Algol68, Guido possibly used Algol68 at the University of Amsterdam where Algol68 has a similar "stropping regime" called "Quote stropping". In Algol68 the operators, types and keywords can appear in a different typeface (usually **bold**, or __underlined__), in sourcecode files this typeface is achieved with quotes, e.g. 'abs' (quoting similar to quoting in 'wikitext')

Algol68 ⇒ Python (Operators mapped to member functions)

'and' ⇒ __and__
'or' ⇒ __or__
'not' ⇒ not
'entier' ⇒ __trunc__
'shl' ⇒ __lshift__
'shr' ⇒ __rshift__
'upb' ⇒ __sizeof__
'long' ⇒ __long__
'int' ⇒ __int__
'real' ⇒ __float__
'format' ⇒ __format__
'repr' ⇒ __repr__
'abs' ⇒ __abs__
'minus' ⇒ __neg__
'minus' ⇒ __sub__
'plus' ⇒ __add__
'times' ⇒ __mul__
'mod' ⇒ __mod__
'div' ⇒ __truediv__
'over' ⇒ __div__
'up' ⇒ __pow__
'im' ⇒ imag
're' ⇒ real
'conj' ⇒ conjugate

In Algol68 these were refered to as bold names, e.g. abs, but "under-under-abs" __abs__ in python.

My 2 cents: ¢ So sometimes — like a ghost — when you cut and paste python classes into a wiki you will magically reincarnate Algol68's bold keywords. ¢

继续阅读：python syntax

Why does python use two underscores for certain things? [duplicate]

Algol68 ⇒ Python (Operators mapped to member functions)

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

Algol68 ⇒ Python (Operators mapped to member functions)

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？