开发者

Purpose of #!/usr/bin/python3 shebang

I have noticed this in a couple of scripting languages, but in this example, I am using python. In many tutorials, they would start with #!/usr/bin/python3 on the first line. I don't understand why we have this.

  • Shouldn't the operating system know it's a python script (obviously it's installed since you are making a reference to it)
  • What if the user is using a operating system that isn't unix based
  • The language is installed in a different folder for whatever reason
  • The user has a different versio开发者_JS百科n. Especially when it's not a full version number(Like Python3 vs Python32)

If anything, I could see this breaking the python script because of the listed reasons above.


#!/usr/bin/python3 is a shebang line.

A shebang line defines where the interpreter is located. In this case, the python3 interpreter is located in /usr/bin/python3. A shebang line could also be a bash, ruby, perl or any other scripting languages' interpreter, for example: #!/bin/bash.

Without the shebang line, the operating system does not know it's a python script, even if you set the execution flag (chmod +x script.py) on the script and run it like ./script.py. To make the script run by default in python3, either invoke it as python3 script.py or set the shebang line.

You can use #!/usr/bin/env python3 for portability across different systems in case they have the language interpreter installed in different locations.


That's called a hash-bang. If you run the script from the shell, it will inspect the first line to figure out what program should be started to interpret the script.

A non Unix based OS will use its own rules for figuring out how to run the script. Windows for example will use the filename extension and the # will cause the first line to be treated as a comment.

If the path to the Python executable is wrong, then naturally the script will fail. It is easy to create links to the actual executable from whatever location is specified by standard convention.


This line helps find the program executable that will run the script. This shebang notation is fairly standard across most scripting languages (at least as used on grown-up operating systems).

An important aspect of this line is specifying which interpreter will be used. On many development-centered Linux distributions, for example, it is normal to have several versions of python installed at the same time.

Python 2.x and Python 3 are not 100% compatible, so this difference can be very important. So #! /usr/bin/python and #! /usr/bin/python3 are not the same (and neither are quite the same as #! /usr/bin/env python3 as noted elsewhere on this page.


To clarify how the shebang line works for windows, from the 3.7 Python doc:

  • If the first line of a script file starts with #!, it is known as a “shebang” line. Linux and other Unix like operating systems have native support for such lines and they are commonly used on such systems to indicate how a script should be executed.
  • The Python Launcher for Windows allows the same facilities to be used with Python scripts on Windows
  • To allow shebang lines in Python scripts to be portable between Unix and Windows, the launcher supports a number of ‘virtual’ commands to specify which interpreter to use. The supported virtual commands are:
    • /usr/bin/env python
      • The /usr/bin/env form of shebang line has one further special property. Before looking for installed Python interpreters, this form will search the executable PATH for a Python executable. This corresponds to the behaviour of the Unix env program, which performs a PATH search.
    • /usr/bin/python
    • /usr/local/bin/python
    • python


  1. And this line is how.

  2. It is ignored.

  3. It will fail to run, and should be changed to point to the proper location. Or env should be used.

  4. It will fail to run, and probably fail to run under a different version regardless.


Actually the determination of what type of file a file is very complicated, so now the operating system can't just know. It can make lots of guesses based on -

  • extension
  • UTI
  • MIME

But the command line doesn't bother with all that, because it runs on a limited backwards compatible layer, from when that fancy nonsense didn't mean anything. If you double click it sure, a modern OS can figure that out- but if you run it from a terminal then no, because the terminal doesn't care about your fancy OS specific file typing APIs.

Regarding the other points. It's a convenience, it's similarly possible to run

python3 path/to/your/script

If your python isn't in the path specified, then it won't work, but we tend to install things to make stuff like this work, not the other way around. It doesn't actually matter if you're under *nix, it's up to your shell whether to consider this line because it's a shellcode. So for example you can run bash under Windows.

You can actually ommit this line entirely, it just mean the caller will have to specify an interpreter. Also don't put your interpreters in nonstandard locations and then try to call scripts without providing an interpreter.


The exec system call of the Linux kernel understands shebangs (#!) natively

When you do on bash:

./something

on Linux, this calls the exec system call with the path ./something.

This line of the kernel gets called on the file passed to exec: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_script.c#L25

if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!'))

It reads the very first bytes of the file, and compares them to #!.

If the comparison is true, then the rest of the line is parsed by the Linux kernel, which makes another exec call with path /usr/bin/python3 and current file as the first argument:

/usr/bin/python3 /path/to/script.py

and this works for any scripting language that uses # as a comment character.

And analogously, if you decide to use env instead, which you likely should always do to work on systems that have the python3 in a different location, notably pyenv, see also this question, the shebang:

#!/usr/bin/env python3

ends up calling analogously:

/usr/bin/env python3 /path/to/script.py

which does what you expect from env python3: searches PATH for python3 and runs /usr/bin/python3 /path/to/script.py.

And yes, you can make an infinite loop with:

printf '#!/a\n' | sudo tee /a
sudo chmod +x /a
/a

Bash recognizes the error:

-bash: /a: /a: bad interpreter: Too many levels of symbolic links

#! just happens to be human readable, but that is not required.

If the file started with different bytes, then the exec system call would use a different handler. The other most important built-in handler is for ELF executable files: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_elf.c#L1305 which checks for bytes 7f 45 4c 46 (which also happens to be human readable for .ELF). Let's confirm that by reading the 4 first bytes of /bin/ls, which is an ELF executable:

head -c 4 "$(which ls)" | hd 

output:

00000000  7f 45 4c 46                                       |.ELF|
00000004                                                                 

So when the kernel sees those bytes, it takes the ELF file, puts it into memory correctly, and starts a new process with it. See also: How does kernel get an executable binary file running under linux?

Finally, you can add your own shebang handlers with the binfmt_misc mechanism. For example, you can add a custom handler for .jar files. This mechanism even supports handlers by file extension. Another application is to transparently run executables of a different architecture with QEMU.

I don't think POSIX specifies shebangs however: https://unix.stackexchange.com/a/346214/32558 , although it does mention in on rationale sections, and in the form "if executable scripts are supported by the system something may happen". macOS and FreeBSD also seem to implement it however.

PATH search motivation

Likely, one big motivation for the existence of shebangs is the fact that in Linux, we often want to run commands from PATH just as:

basename-of-command

instead of:

/full/path/to/basename-of-command

But then, without the shebang mechanism, how would Linux know how to launch each type of file?

Hardcoding the extension in commands:

 basename-of-command.py

or implementing PATH search on every interpreter:

python3 basename-of-command

would be a possibility, but this has the major problem that everything breaks if we ever decide to refactor the command into another language.

Shebangs solve this problem beautifully.

See also: Why do people write #!/usr/bin/env python on the first line of a Python script?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜