Git: how to renormalize line endings in all files in all revisions?
I have an existing repository where line endings are all messed up. I'd like to rewrite the entire repository and fix line endings once and for all. There are text files and binary files, let's assume that git's heuristics for detecting binary files will work just fine.
What's the easiest way to repopulate the entire repository with files with normalized line endings?
Since Git 2.16 (Q1 2018) there is another way (other than deleting the index content), which is a new and safer way to record the fact that you are correcting the end-of-line convention:
git add --renormalize .
See commit 9472935 (16 Nov 2017) by Torsten Bögershausen (tboegi).
(Merged by Junio C Hamano -- gitster -- in commit af6e0fe, 27 Nov 2017)
add: introduce "--renormalize"
Make it safer to normalize the line endings in a repository.
Files that had been committed with CRLF will be committed with LF.
The old way to normalize a repo was like this:
# Make sure that there are not untracked files
$ echo "* text=auto" >.gitattributes
$ git read-tree --empty
$ git add .
$ git commit -m "Introduce end-of-line normalization"
The user must make sure that there are no untracked files, otherwise they would have been added and tracked from now on.
The new "add --renormalize" does not add untracked files:
$ echo "* text=auto" >.gitattributes
$ git add --renormalize .
$ git commit -m "Introduce end-of-line normalization"
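To see the effect end to end, here is a minimal sketch that builds a throwaway repository, commits a file with CRLF endings, and then renormalizes it. The temporary directory, the file name crlf.txt and the demo identity are all hypothetical. Since --renormalize leaves untracked files alone, the sketch stages .gitattributes explicitly; that extra step is an addition of this sketch, not part of the quoted commit message.

```shell
#!/bin/sh
# Sketch only: runs in a throwaway repository (all names are hypothetical).
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git config core.autocrlf false   # make sure the first commit keeps its CRLF bytes

printf 'one\r\ntwo\r\n' > crlf.txt
git add crlf.txt
git commit -q -m "Add file with CRLF endings"

echo "* text=auto" > .gitattributes
git add .gitattributes           # untracked, so --renormalize would skip it
git add --renormalize .
git commit -q -m "Introduce end-of-line normalization"

# Count the CR bytes that remain in the committed blob
cr_count=$(git cat-file -p HEAD:crlf.txt | tr -cd '\r' | wc -c | tr -d ' ')
echo "CR bytes in committed blob: $cr_count"
```

With a Git version that supports --renormalize, the count should be 0, while the working-tree copy of crlf.txt still contains CRLF until it is checked out again.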
Note that "git add --renormalize <pathspec>" is the short form for "git add -u --renormalize <pathspec>".
Note: Git 2.21 (Feb. 2019) fixes a bug related to this: "git add --ignore-errors" did not work as advertised and instead worked as an unintended synonym for "git add --renormalize", which has been fixed.
See commit 9e5da3d (17 Jan 2019) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 1c41824, 05 Feb 2019)
add: use separate ADD_CACHE_RENORMALIZE flag
Commit 9472935 (add: introduce "--renormalize", 2017-11-16, Git 2.16) taught git add to pass HASH_RENORMALIZE to add_to_index(), which then passes the flag along to index_path().
However, the flags taken by add_to_index() and the ones taken by index_path() are distinct namespaces.
We cannot take HASH_* flags in add_to_index(), because they overlap with the ADD_CACHE_* flags we already take (in this case, HASH_RENORMALIZE conflicts with ADD_CACHE_IGNORE_ERRORS).
We can solve this by adding a new ADD_CACHE_RENORMALIZE flag, and using it to set HASH_RENORMALIZE within add_to_index().
In order to make it clear that these two flags come from distinct sets, let's also change the name "newflags" in the function to "hash_flags".
Also: See commit e2c2a37 (07 Feb 2019) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 9293bf6, 07 Feb 2019)
add_to_index(): convert forgotten HASH_RENORMALIZE check
Commit 9e5da3d (add: use separate ADD_CACHE_RENORMALIZE flag, 2019-01-17) switched out using HASH_RENORMALIZE in our flags field for a new ADD_CACHE_RENORMALIZE flag.
However, it forgot to convert one of the checks for HASH_RENORMALIZE into the new flag, which totally broke "git add --renormalize".
With Git 2.37.3 (Q3 2022), the documentation for "git add --renormalize" clarifies a corner case.
See commit efae7ce (10 Aug 2022) by Philip Oakley (PhilipOakley).
(Merged by Junio C Hamano -- gitster -- in commit 58ded4a, 18 Aug 2022)
doc add: renormalize is not idempotent for CRCRLF
Signed-off-by: Philip Oakley
Reviewed-by: Torsten Bögershausen
A bug report noted that a file containing /r/r/n needed renormalising twice.
This is by design.
Lone CR characters, not paired with an LF, are left unchanged.
Note this limitation of the "clean" filter in the documentation.
Renormalize was introduced at 9472935 (add: introduce "--renormalize", 2017-11-16, Git v2.16.0-rc0 -- merge listed in batch #6).
git add now includes in its man page:
This option implies -u.
Lone CR characters are untouched, thus while a CRLF cleans to LF, a CRCRLF sequence is only partially cleaned to CRLF.
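The CRCRLF corner case can be illustrated without Git at all. The sketch below is only a model of the documented rule (a CR is dropped only when an LF follows it directly), written with perl; it is not Git's own clean filter, and the helper names clean_once and count_cr are made up for this example.

```shell
#!/bin/sh
# Model of the documented rule, not Git's implementation:
# remove a CR only when it is immediately followed by an LF.
clean_once() { perl -pe 'BEGIN{ undef $/ } s/\x0d\x0a/\x0a/g'; }

# Helper: count the CR bytes on standard input
count_cr() { tr -cd '\r' | wc -c | tr -d ' '; }

once=$(printf 'a\r\r\n' | clean_once | count_cr)
twice=$(printf 'a\r\r\n' | clean_once | clean_once | count_cr)

echo "CR bytes after one pass:   $once"
echo "CR bytes after two passes: $twice"
```

One pass turns CRCRLF into CRLF, leaving a single CR; only a second pass over that result removes it, which is why such a file needs renormalizing twice.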
If you just want to renormalize your current commit after having set core.autocrlf or text=auto, so you can have all the line ending normalization in one commit, run these commands:
git rm --cached -rf .
git add .
To also normalize the files in your working dir, run:
git checkout .
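To confirm what the index-rewriting commands above did, git ls-files --eol reports the line-ending state of each file in the index (i/...) and in the working tree (w/...). The sketch below is a rough check in a throwaway repository; the directory, the file name crlf.txt and the demo identity are hypothetical, and it deliberately stops before the git checkout step.

```shell
#!/bin/sh
# Sketch only: inspect index line endings before and after re-adding everything.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git config core.autocrlf false   # keep CRLF bytes in the first commit

printf 'one\r\ntwo\r\n' > crlf.txt
git add crlf.txt
git commit -q -m "Add file with CRLF endings"
before=$(git ls-files --eol crlf.txt | awk '{print $1}')

echo "* text=auto" > .gitattributes
git rm --cached -rf -q .         # empty the index, keep the working tree
git add .                        # re-add everything through the clean filter
after=$(git ls-files --eol crlf.txt | awk '{print $1}')

echo "index before: $before"
echo "index after:  $after"
```

The first column should change from i/crlf to i/lf; running git ls-files --eol without a path lists every tracked file, which helps spot anything that was missed.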
The following script can be used without git. Then, later on, git commit the code base.
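# Requires bash (read -d). Skips .git so repository internals are never rewritten,
# and handles file names that contain spaces.
find . -type f ! -path './.git/*' -print0 |
while IFS= read -r -d '' f; do
if grep -qP '\x00' "$f" ; then
# file is binary
continue
fi
perl -i -pe 'BEGIN{ undef $/ } s/\x0d\x0a/\x0a/g; s/\x0d/\x0a/g' "$f"
done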
The grep is assuming anything containing a null character is a binary file.
perl is used to edit each file in-place. First, Windows style newlines are changed to Unix style newlines. Then Mac style newlines are changed to Unix style newlines.