Git: how to renormalize line endings in all files in all revisions?

I have an existing repository where the line endings are all messed up. I'd like to rewrite the entire repository and fix line endings once and for all. There are text files and binary files; let's assume that git's heuristics for detecting binary files will work just fine.

What's the easiest way to repopulate the entire repository with files with normalized line endings?


Since Git 2.16 (Q1 2018), there is a new and safer way (other than emptying the index) to record the fact that you are correcting the end-of-line convention:

git add --renormalize .

See commit 9472935 (16 Nov 2017) by Torsten Bögershausen (tboegi).
(Merged by Junio C Hamano -- gitster -- in commit af6e0fe, 27 Nov 2017)

add: introduce "--renormalize"

Make it safer to normalize the line endings in a repository.
Files that had been committed with CRLF will be committed with LF.

The old way to normalize a repo was like this:

# Make sure that there are not untracked files
$ echo "* text=auto" >.gitattributes
$ git read-tree --empty
$ git add .
$ git commit -m "Introduce end-of-line normalization"

The user must make sure that there are no untracked files, otherwise they would have been added and tracked from now on.

The new "add --renormalize" does not add untracked files:

$ echo "* text=auto" >.gitattributes
$ git add --renormalize .
$ git commit -m "Introduce end-of-line normalization"

Note that "git add --renormalize <pathspec>" is the short form for "git add -u --renormalize <pathspec>".
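
To confirm what the renormalization actually did, "git ls-files --eol" reports the line endings stored in the index and in the working tree. The following is a minimal sketch in a throwaway repository; the file name crlf.txt, the temporary directory, and the identity settings are placeholders chosen for illustration. Since --renormalize implies -u, a newly created and still untracked .gitattributes is not staged by it, so the sketch adds that file separately.

```shell
# Throwaway repository; all names below are placeholders for illustration.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config core.autocrlf false
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.invalid"

# Commit a file that has CRLF line endings.
printf 'one\r\ntwo\r\n' > "$repo/crlf.txt"
git -C "$repo" add crlf.txt
git -C "$repo" commit -q -m "Add file with CRLF line endings"
eol_before=$(git -C "$repo" ls-files --eol crlf.txt)

# Turn on normalization and re-stage the tracked files.
echo "* text=auto" > "$repo/.gitattributes"
git -C "$repo" add --renormalize .
git -C "$repo" add .gitattributes   # untracked, so --renormalize skipped it
eol_after=$(git -C "$repo" ls-files --eol crlf.txt)

# The first column (i/...) describes the index, the second (w/...) the working tree.
echo "before: $eol_before"
echo "after:  $eol_after"
```

The index column should change from i/crlf to i/lf, while the working tree column stays w/crlf until the file is checked out again.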


Note: Git 2.21 (Feb. 2019) fixes a related bug: "git add --ignore-errors" did not work as advertised and instead worked as an unintended synonym for "git add --renormalize".

See commit 9e5da3d (17 Jan 2019) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 1c41824, 05 Feb 2019)

add: use separate ADD_CACHE_RENORMALIZE flag

Commit 9472935 (add: introduce "--renormalize", 2017-11-16, Git 2.16) taught git add to pass HASH_RENORMALIZE to add_to_index(), which then passes the flag along to index_path().
However, the flags taken by add_to_index() and the ones taken by index_path() are distinct namespaces.
We cannot take HASH_* flags in add_to_index(), because they overlap with the ADD_CACHE_* flags we already take (in this case, HASH_RENORMALIZE conflicts with ADD_CACHE_IGNORE_ERRORS).

We can solve this by adding a new ADD_CACHE_RENORMALIZE flag, and using it to set HASH_RENORMALIZE within add_to_index().
In order to make it clear that these two flags come from distinct sets, let's also change the name "newflags" in the function to "hash_flags".

Also: See commit e2c2a37 (07 Feb 2019) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 9293bf6, 07 Feb 2019)

add_to_index(): convert forgotten HASH_RENORMALIZE check

Commit 9e5da3d (add: use separate ADD_CACHE_RENORMALIZE flag, 2019-01-17) switched out using HASH_RENORMALIZE in our flags field for a new ADD_CACHE_RENORMALIZE flag.
However, it forgot to convert one of the checks for HASH_RENORMALIZE into the new flag, which totally broke "git add --renormalize".


With Git 2.37.3 (Q3 2022), the documentation of "git add --renormalize" clarifies a corner case.

See commit efae7ce (10 Aug 2022) by Philip Oakley (PhilipOakley).
(Merged by Junio C Hamano -- gitster -- in commit 58ded4a, 18 Aug 2022)

doc add: renormalize is not idempotent for CRCRLF

Signed-off-by: Philip Oakley
Reviewed-by: Torsten Bögershausen

A bug report noted that a file containing \r\r\n needed renormalising twice.

This is by design.
Lone CR characters, not paired with an LF, are left unchanged.
Note this limitation of the "clean" filter in the documentation.

Renormalize was introduced in commit 9472935 (add: introduce "--renormalize", Torsten Bögershausen, 2017-11-16, Git v2.16.0-rc0 -- merge listed in batch #6).

git add now includes in its man page:

This option implies -u.
Lone CR characters are untouched, thus while a CRLF cleans to LF, a CRCRLF sequence is only partially cleaned to CRLF.


If you just want to renormalize your current commit after having set core.autocrlf or text=auto, so you can have all the line ending normalization in one commit, run these commands:

git rm --cached -rf .
git add .

To also normalize the files in your working dir, run:

git checkout .


This script can be used without git: it normalizes line endings directly in the files on disk. Then, later on, git commit the code base.

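# Requires bash. Skip .git so repository internals are never rewritten;
# -print0 / read -d '' keep file names with spaces intact.
find . -path ./.git -prune -o -type f -print0 |
while IFS= read -r -d '' f; do
    if grep -qP '\x00' "$f"; then
        # file contains a NUL byte: treat it as binary and skip it
        continue
    fi

    # slurp the whole file, then CRLF -> LF, then any remaining lone CR -> LF
    perl -0777 -pi -e 's/\x0d\x0a/\x0a/g; s/\x0d/\x0a/g' "$f"
done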

The grep assumes that any file containing a NUL byte is binary.

perl is used to edit each file in-place. First, Windows style newlines are changed to Unix style newlines. Then Mac style newlines are changed to Unix style newlines.
