To hook or not to hook - git

Our bespoke IDE outputs XML files with an encoding that makes them look like binary files. Diffs and merges of these files fail.

We can create ASCII versions of these files with the tr command. I would like to get to a state where these files are always automatically converted to ASCII before they are committed.

I picked up my copy of Version Control with Git and it wholeheartedly warns me away from using hooks unless I really need to.

Should I be using a hook for this purpose? Or can I do something else to ensure the files are always converted before commit?

Windows XP with msysgit 1.7.4

--= update =--

Thanks everyone for your help and patience. Based on this question, I tried the following, but it does not work:

echo "*.xrp    filter=xrp" > .git/info/attributes
git config --global filter.xrp.clean 'tr -cd '\''\11\12\15\40-\176'\'''
git config --global filter.xrp.smudge cat
git checkout --force

The files remain unchanged after this config change, even when I delete them and check them out again.

The tr command configured as the clean task does work in isolation. Proof:

$ head -n 1 cashflow/repo/C_GMM_CashflowRepo.xrp
ÿþ< ! - -   X M L   R e p o s i t o r y   f i l e   1 . 0   - - >

$ tr -cd '\''\11\12\15\40-\176'\' < cashflow/repo/C_GMM_CashflowRepo.xrp | head -n 1
<!-- XML Repository file 1.0 -->

Can anyone see what is wrong with my config?


One issue with hooks is that they aren't distributed.

.gitattributes has directives that control how a file is diffed, but another option is a filter attribute (also declared in .gitattributes), which can automatically convert those files on commit.
(That assumes the clean script is able to recognize those files based on their content alone.)


Per this chat discussion, the OP Synesso reports a success:

.gitattributes:
*.xrp filter=xrp

~/.gitconfig:
[filter "xrp"]
clean = \"C:/Program Files/Git/bin/tr.exe\" -cd "\\''\\11\\12\\15\\40-\\176'\\'"
smudge = cat

Then I had to modify the file, add, commit, delete, checkout ... and THEN it was fixed. :)
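
The extra steps make sense once you note that a clean filter only runs when a file is staged (git add), while a checkout only runs the smudge filter. Blobs committed before the filter existed stay as they were, so git checkout --force on its own cannot change them. Here is a minimal sketch in a scratch repository; the file name sample.xrp, the temporary directory, and the repo-local (rather than --global) config are assumptions for illustration:

```shell
# Scratch repository, so nothing real is touched (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"

# Same filter as in the question, configured for this repository only.
git config filter.xrp.clean 'tr -cd "\11\12\15\40-\176"'
git config filter.xrp.smudge cat
echo '*.xrp filter=xrp' > .gitattributes

# A tiny UTF-16LE file with a byte-order mark, standing in for the IDE output.
printf '\377\376<\0!\0-\0-\0\n\0' > sample.xrp

# The clean filter runs here, at staging time.
git add .gitattributes sample.xrp
git commit -q -m "Add filtered file"

# The stored blob is already ASCII, even though the working copy is not yet.
git cat-file -p HEAD:sample.xrp

# Delete and check out again to bring the working copy in line with the blob.
rm sample.xrp
git checkout -- sample.xrp
cat sample.xrp
```

Files committed before the filter was configured need to be staged again before their stored content changes.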

Note that for any change that concerns not just one user but anyone cloning the repo, I prefer adding (and committing) a .gitattributes file in which the filter is declared, rather than modifying the .git/info/attributes file (which isn't cloned around).

From the gitattributes man page:

  • If you wish to affect only a single repository (i.e., to assign attributes to files that are particular to one user’s workflow for that repository), then attributes should be placed in the $GIT_DIR/info/attributes file.
  • Attributes which should be version-controlled and distributed to other repositories (i.e., attributes of interest to all users) should go into .gitattributes files.
  • Attributes that should affect all repositories for a single user should be placed in a file specified by the core.attributesfile configuration option.
  • Attributes for all users on a system should be placed in the $(prefix)/etc/gitattributes file.

http://git-scm.com/docs/gitattributes
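
To see which attribute git actually applies to a path, git check-attr can help. A small sketch, assuming a scratch repository and reusing the path from the question (the file does not need to exist for the check to work):

```shell
# Scratch repository for the demonstration (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Version-controlled attribute, visible to everyone who clones the repo.
echo '*.xrp filter=xrp' > .gitattributes
shared=$(git check-attr filter -- cashflow/repo/C_GMM_CashflowRepo.xrp)
echo "$shared"

# A per-repository entry in info/attributes takes precedence and is not cloned.
mkdir -p .git/info
echo '*.xrp -filter' > .git/info/attributes
local_only=$(git check-attr filter -- cashflow/repo/C_GMM_CashflowRepo.xrp)
echo "$local_only"
```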


phyatt adds in the comments:

I made an example similar to this for sqlite3.
You can add it into the correct files with two lines:

git config diff.sqlite3.textconv 'sqlite3 $1 .dump'
echo '*.db diff=sqlite3' >> $(git rev-parse --show-toplevel)/.gitattributes 

Similar lines can be used for writing other git config paths.


Does diff stand a chance of working on them as is (i.e. they just contain a handful of strange bytes but are otherwise text) or not? If it does, you can just force git to treat them as text with .gitattributes. If not, it still might be better to create custom diff and merge scripts (that use tr as needed to convert) and tell git to use them, again with .gitattributes.

In either case you will not be using hooks (those run on particular operations), but .gitattributes, which is file-specific.
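
As a rough illustration of the first option, a bare diff attribute tells git to diff matching files as text even when they contain bytes that would otherwise make it treat them as binary. The sample file and its contents are made up for the demo, and a real UTF-16 file would still show NUL bytes between the characters:

```shell
# Scratch repository for the demonstration (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"

# Mostly text, with one stray NUL byte that makes git consider it binary.
printf 'alpha\0\nbeta\n' > sample.xrp
git add sample.xrp
git commit -q -m "Add sample"
printf 'alpha\0\ngamma\n' > sample.xrp

# Without the attribute, git only reports that the binary files differ.
git diff -- sample.xrp

# With the attribute set, git produces a line-by-line diff.
echo '*.xrp diff' > .gitattributes
git diff -- sample.xrp
```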


If your preferred editing format were ASCII and only your builds required the binary files, I would recommend using build rules to generate the binary version from the preferred source, which you would commit to the repository.

Given that your IDE makes the files in the binary format already, I think the best thing is to store them in the repository in that format.

Rather than hooks, look at git help attributes, especially diff and textconv which allow you to configure files matching certain patterns to use alternate means of diffing. You should be able to produce working ASCII diffs without having to compromise how you store the files or edit them.

EDIT: Based on your comment elsewhere that "every other byte is 0", that suggests the file is UTF-16 or UCS-2. See this answer for a diff which can handle Unicode: Can I make git recognize a UTF-16 file as text?
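
A sketch of the textconv route for UTF-16 files, assuming iconv is installed and the files carry a byte-order mark. The driver name utf16 and the sample file are hypothetical:

```shell
# Scratch repository for the demonstration (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"

# Diff driver that converts to UTF-8 before diffing; git appends the file name.
git config diff.utf16.textconv 'iconv -f UTF-16 -t UTF-8'
echo '*.xrp diff=utf16' > .gitattributes

# Commit a small UTF-16LE file, then change one character.
printf '\377\376a\0\n\0' > sample.xrp
git add .gitattributes sample.xrp
git commit -q -m "Add UTF-16 sample"
printf '\377\376b\0\n\0' > sample.xrp

# The diff is computed on the converted text; the stored file stays UTF-16.
git diff -- sample.xrp
```

Note that textconv only changes how diffs are displayed. The file is stored as is, so this does not help with merges.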
