What's the significance of the "No newline at end of file" log?

When doing a git diff it says "No newline at end of file".

What's the significance of the message and what's it trying to tell us?


It indicates that you do not have a newline (usually \n, aka LF, or \r\n, aka CRLF, on Windows) at the end of the file.

That is, simply speaking, the last byte (or bytes if you're on Windows) in the file is not a newline.
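As a rough illustration of that check, here is a minimal Python sketch that classifies the final bytes of a file's contents. The function name describe_ending is made up for this example and is not part of Git or any library.

```python
def describe_ending(data: bytes) -> str:
    # Check CRLF first, because every CRLF ending also ends with LF.
    if data.endswith(b"\r\n"):
        return "CRLF"
    if data.endswith(b"\n"):
        return "LF"
    return "none"


print(describe_ending(b"first line\nsecond line"))        # none
print(describe_ending(b"first line\nsecond line\n"))      # LF
print(describe_ending(b"first line\r\nsecond line\r\n"))  # CRLF
```

To inspect a real file, you could pass it the result of open(path, "rb").read(); reading in binary mode matters because text mode may translate line endings.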

The message is displayed because otherwise there would be no way to tell the difference between a file that has a newline at the end and one that does not. Diff has to output a newline after the last line anyway, or the result would be harder to read or process automatically.

Note that it is good style to always end a text file with a newline, if the file format allows it. Furthermore, for C source and header files, the language standard requires a non-empty file to end with a newline; C++ had the same requirement before C++11.


It's not just bad style, it can lead to unexpected behavior when using other tools on the file.

Here is test.txt:

first line
second line

There is no newline character on the last line. Let's see how many lines are in the file:

$ wc -l test.txt
1 test.txt

Maybe that's what you want, but in most cases you'd probably expect there to be 2 lines in the file.

Also, if you wanted to combine files it may not behave the way you'd expect:

$ cat test.txt test.txt
first line
second linefirst line
second line
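The two results above can be reproduced in a few lines of Python. This is only a sketch of what the tools do with the bytes, assuming test.txt contains exactly the two lines shown and no final newline; it is not how wc or cat are actually implemented.

```python
data = b"first line\nsecond line"  # contents of test.txt, no final newline

# wc -l counts newline characters, not lines as a human would count them.
newline_count = data.count(b"\n")
print(newline_count)  # 1

# cat writes the bytes of each file one after another, adding nothing between them.
combined = data + data
print(combined.decode())
```

The second print shows "second linefirst line" on one line, because nothing separates the end of the first copy from the start of the second.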

Finally, it would make your diffs slightly more noisy if you were to add a new line. If you added a third line, it would show an edit to the second line as well as the new addition.


If you add a new line of text at the end of the existing file which does not already have a newline character at the end, the diff will show the old last line as having been modified, even though conceptually it wasn’t.

This is at least one good reason to add a newline character at the end.

Example

A file contains:

A() {
    // do something
}

Hexdump:

00000000: 4128 2920 7b0a 2020 2020 2f2f 2064 6f20  A() {.    // do 
00000010: 736f 6d65 7468 696e 670a 7d              something.}

You now edit it to

A() {
    // do something
}
// Useful comment.

Hexdump:

00000000: 4128 2920 7b0a 2020 2020 2f2f 2064 6f20  A() {.    // do 
00000010: 736f 6d65 7468 696e 670a 7d0a 2f2f 2055  something.}.// U
00000020: 7365 6675 6c20 636f 6d6d 656e 742e 0a    seful comment..

The git diff will show:

-}
\ No newline at end of file
+}
+// Useful comment.

In other words, it shows a larger diff than conceptually occurred. It shows that you deleted the line } and added the line }\n. This is, in fact, what happened, but it’s not what conceptually happened, so it can be confusing.
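The same effect can be seen outside Git with Python's difflib, as a rough sketch. Note that difflib does not print the "\ No newline at end of file" marker; the point here is only that "}" and "}\n" compare as different lines, so the old last line is reported as removed and re-added.

```python
import difflib

old = "A() {\n    // do something\n}"                        # no final newline
new = "A() {\n    // do something\n}\n// Useful comment.\n"  # ends with a newline

# keepends=True keeps each line's terminator, so "}" and "}\n" are different lines.
old_lines = old.splitlines(keepends=True)
new_lines = new.splitlines(keepends=True)

diff = list(difflib.unified_diff(old_lines, new_lines, "old", "new"))
removed = [line for line in diff if line.startswith("-") and not line.startswith("---")]
added = [line for line in diff if line.startswith("+") and not line.startswith("+++")]

print(removed)  # ['-}']
print(added)    # ['+}\n', '+// Useful comment.\n']
```

Had the original file ended with a newline, only the comment line would appear in added and removed would be empty.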


The only reason is that Unix historically had a convention of all human-readable text files ending in a newline. At the time, this avoided extra processing when displaying or joining text files, and avoided treating text files differently to files containing other kinds of data (eg raw binary data which isn't human-readable).

Because of this convention, many tools from that era expect the ending newline, including text editors, diffing tools, and other text processing tools. Mac OS X was built on BSD Unix, and Linux was developed to be Unix-compatible, so both operating systems have inherited the same convention, behaviour and tools.

Windows wasn't developed to be Unix-compatible, so it doesn't have the same convention, and most Windows software will deal just fine with no trailing newline.

But, since Git was developed for Linux first, and a lot of open-source software is built on Unix-compatible systems like Linux, Mac OS X, FreeBSD, etc, most open-source communities and their tools (including programming languages) continue to follow these conventions.

There are technical reasons which made sense in 1971, but in this era it's mostly convention and maintaining compatibility with existing tools.


It just indicates that the end of the file doesn't have a newline. It's not a catastrophe; it's just a message to make it clearer that there isn't one when looking at a diff on the command line.


The reason this convention came into practice is that on UNIX-like operating systems a newline character is treated as a line terminator and/or message boundary (this includes piping between processes, line buffering, etc.).

Consider, for example, that a file with just a newline character is treated as a single, empty line. Conversely, a file with a length of zero bytes is actually an empty file with zero lines. This can be confirmed with the wc -l command.

Altogether, this behavior is reasonable because there would be no other way to distinguish between an empty text file versus a text file with a single empty line if the \n character was merely a line-separator rather than a line-terminator. Thus, valid text files should always end with a newline character. The only exception is if the text file is intended to be empty (no lines).
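To make the terminator-versus-separator distinction concrete, here is a small Python sketch that counts lines under each interpretation. Both function names are invented for this illustration.

```python
def count_as_terminator(text: str) -> int:
    # Every newline ends one line; an unterminated tail counts as one more line.
    unterminated_tail = 0 if text == "" or text.endswith("\n") else 1
    return text.count("\n") + unterminated_tail


def count_as_separator(text: str) -> int:
    # Every newline sits between two lines, so n newlines always give n + 1 lines.
    return len(text.split("\n"))


for sample in ["", "\n", "a\n", "a\nb"]:
    print(repr(sample), count_as_terminator(sample), count_as_separator(sample))
```

Under the terminator reading, "" has 0 lines and "\n" has 1. Under the separator reading the count can never be 0, so an empty file and a file holding one empty line could not both be represented.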


There is one thing that I don't see in previous responses. Warning about no end-of-line could be a warning when a portion of a file has been truncated. It could be a symptom of missing data.


The core problem is how you define a line, and whether the end-of-line character sequence is part of the line or not. UNIX-based editors (such as Vim) and tools (such as Git) use the EOL character sequence as a line terminator, so it is part of the line. It's similar to the use of the semicolon (;) in C and Pascal: in C the semicolon terminates statements, in Pascal it separates them.


This actually can cause a problem: when line endings are modified automatically, files show up as changed even though you made no changes to them. See this post for a resolution:

git replacing LF with CRLF


Source files are often concatenated by tools (C, C++: header files, Javascript: bundlers). If you omit the newline character, you could introduce nasty bugs (where the last line of one source is concatenated with the first line of the next source file). Hopefully all the source code concat tools out there insert a newline between concatenated files anyway but that doesn't always seem to be the case.
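Here is a minimal Python sketch of that failure mode, using two hypothetical JavaScript files held as strings. The names concat_naive, concat_safe, a_js and b_js are invented for the example and do not correspond to any real bundler.

```python
def concat_naive(sources: list[str]) -> str:
    # Joins the files character for character, with no newline handling.
    return "".join(sources)


def concat_safe(sources: list[str]) -> str:
    # Ensures every non-empty source ends with a newline before joining.
    return "".join(s if s == "" or s.endswith("\n") else s + "\n" for s in sources)


a_js = "var a = 1; // end of a.js"  # hypothetical file without a final newline
b_js = "startApp();\n"              # hypothetical second file

print(concat_naive([a_js, b_js]))  # startApp(); ends up inside the comment
print(concat_safe([a_js, b_js]))
```

In the naive output the call to startApp() is swallowed by the trailing line comment of the first file, so it silently never runs.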

The crux of the issue is - in most languages, newlines have semantic meaning and end-of-file is not a language defined alternative for the newline character. So you ought to terminate every statement/expression with a newline character -- including the last one.


Your original file probably had no newline character.

However, some editors, like gedit on Linux, silently add a newline at the end of the file. You cannot get rid of this message while using this kind of editor.

To work around this, I opened the file in the Visual Studio Code editor.

This editor clearly shows the last line, and you can delete it if you wish.


ubuntu$> vi source.cpp

:set binary noeol


What

When doing a git diff it says "No newline at end of file".

In some ways, yes, but it's more nuanced and subtle.

When doing a git diff, git shows the difference between versions of files, displayed as chunks where those files differ. This may include a chunk at the end of the file.

If neither version of the file ends in a newline, the end of the chunk will read

\ No newline at end of file

whereas if only one version doesn't end in a newline, the chunk will end in either

-last line
\ No newline at end of file
+new last line

or

-last line
+new last line
\ No newline at end of file
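The rule behind these cases is that the backslash line applies to the diff line directly before it. Below is a hedged Python sketch of how a reader of the diff could work out which version lacks the final newline. The function missing_newline_sides is hypothetical and is not how Git itself is implemented; it only looks at the first character of each line in one chunk.

```python
def missing_newline_sides(hunk_lines: list[str]) -> set[str]:
    # Returns which versions lack a final newline: "old", "new", or both.
    sides = set()
    previous = ""
    for line in hunk_lines:
        if not line.startswith("\\"):
            previous = line
            continue
        # The marker describes the line just before it.
        if previous[:1] in ("-", " "):
            sides.add("old")
        if previous[:1] in ("+", " "):
            sides.add("new")
    return sides


marker = "\\ No newline at end of file"
print(sorted(missing_newline_sides(["-last line", marker, "+new last line"])))  # ['old']
print(sorted(missing_newline_sides(["-last line", "+new last line", marker])))  # ['new']
print(sorted(missing_newline_sides([" last line", marker])))                    # ['new', 'old']
```

A marker after a context line (leading space) means the unchanged last line lacks a newline in both versions.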

Why?

What's the significance of the message

The reason is simple. One of the main purposes of git diff is to display changes unambiguously, so that its output can be used as input to git apply. To do this, Git needs to know what it is supposed to do with newlines when applying a diff. Should it remove, keep or change them?

The \ No newline at end of file line is a way of doing that. It's also helpful to humans who want to be aware of such changes, because in some instances having or not having a final newline is important to the file.

and what's it trying to tell us?

Well, just that there is no newline at the end of this file. Or that there wasn't but now there is. Or that there was but now isn't.
