Merge git repository in subdirectory

I'd like to merge a remote git repository into my working git repository as a subdirectory of it. I'd like the resulting repository to contain the merged history of the two repositories, and also for each file of the merged-in repository to retain its history as it was in the remote repository. I tried using the subtree strategy as described in "How to use the subtree merge strategy", but after following that procedure, although the resulting repository does contain the merged history of the two repositories, individual files coming from the remote one haven't retained their history (`git log` on any of them just shows a message "Merged branch...").

Also I don't want to use submodules because I do not want the two combined git repositories to be separate anymore.

Is it possible to merge a remote git repository in another one as a subdirectory with individual files coming from the remote repository retaining their history?

Thanks very much for any help.

EDIT: I'm currently trying out a solution that uses git filter-branch to rewrite the merged-in repository history. It does seem to work, but I need to test it some more. I'll return to report on my findings.

EDIT 2: To make myself clearer, here are the exact commands I used with git's subtree strategy, which result in an apparent loss of history for the files of the remote repository. Let A be the git repo I'm currently working in and B the git repo I'd like to incorporate into A as a subdirectory of it. I did the following:

git remote add -f B <url-of-B>
git merge -s ours --no-commit B/master
git read-tree --prefix=subdir/Iwant/to/put/B/in/ -u B/master
git commit -m "Merge B as subdirectory in subdir/Iwant/to/put/B/in."

After these commands and going into directory subdir/Iwant/to/put/B/in, I see all files of B, but git log on any one of them shows just the commit message "Merge B as subdirectory in subdir/Iwant/to/put/B/in." Their file history as it is in B is lost.

What seems to work (since I'm a beginner on git I may be wrong) is the following:

git remote add -f B <url-of-B>
git checkout -b B_branch B/master  # make a local branch following B's master
git filter-branch --index-filter \
   'git ls-files -s | sed "s-\t\"*-&subdir/Iwant/to/put/B/in/-" |
        GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
                git update-index --index-info &&
        mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD 
git checkout master
git merge B_branch

The command above for filter-branch is taken from git help filter-branch, in which I only changed the subdir path.


git-subtree is a script designed for exactly this use case of merging multiple repositories into one while preserving history (and/or splitting the history of subtrees, though that seems to be irrelevant to this question). It has been distributed as part of the git tree since release 1.7.11.

To merge a repository <repo> at revision <rev> as subdirectory <prefix>, use git subtree add as follows:

git subtree add -P <prefix> <repo> <rev>

git-subtree implements the subtree merge strategy in a more user friendly manner.

The downside is that in the merged history the files are unprefixed (not in a subdirectory). Say you merge repository a into b. As a result git log a/f1 will show you all the changes (if any) except those in the merged history. You can do:

git log --follow -- f1

but that won't show any changes other than those in the merged history.

In other words, if you don't change a's files in repository b, then you need to specify --follow and an unprefixed path. If you change them in both repositories, then you have two commands, neither of which shows all the changes.

More on it here.


After getting the fuller explanation of what is going on, I think I understand it and in any case at the bottom I have a workaround. Specifically, I believe what is happening is rename detection is being fooled by the subtree merge with --prefix. Here is my test case:

mkdir -p z/a z/b
cd z/a
git init
echo A>A
git add A
git commit -m A
echo AA>>A
git commit -a -m AA
cd ../b
git init
echo B>B
git add B
git commit -m B
echo BB>>B
git commit -a -m BB
cd ../a
git remote add -f B ../b
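# Git 2.9 and later refuse to merge unrelated histories unless --allow-unrelated-histories is added
git merge -s ours --no-commit B/master
git read-tree --prefix=bdir/ -u B/master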
git commit -m "subtree merge B into bdir"
cd bdir
echo BBB>>B
git commit -a -m BBB

We make git directories a and b with several commits each. We do a subtree merge, and then we do a final commit in the new subtree.

Running gitk (in z/a) shows that the history does appear, and so does git log. However, looking at a specific file has a problem: git log bdir/B shows only the subtree merge commit and the commits made after it, not the file's earlier history from B.

Well, there is a trick we can play. We can look at the pre-rename history of a specific file using --follow: git log --follow -- B. This is good but isn't great, since it fails to link the pre-merge history with the post-merge history.

I tried playing with -M and -C, but I wasn't able to get it to follow one specific file.
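
The difference between the two log invocations can be checked in a scratch directory. The following is a minimal sketch of the test case above, assuming Git 2.9 or later (for --allow-unrelated-histories); the throwaway identity, the main branch name, and the temporary directory are assumptions made for the demo and are not part of the original commands.

```shell
#!/bin/sh
# Sketch: rebuild the subtree-merge test case and compare two `git log` calls.
set -eu
# Throwaway identity so commits work without any git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)                        # scratch location (assumption)
mkdir -p "$work/a" "$work/b"

cd "$work/a"
git init -q
git symbolic-ref HEAD refs/heads/main    # fixed branch name for the demo
echo A >A;   git add A; git commit -q -m A
echo AA >>A; git commit -q -a -m AA

cd "$work/b"
git init -q
git symbolic-ref HEAD refs/heads/main
echo B >B;   git add B; git commit -q -m B
echo BB >>B; git commit -q -a -m BB

# Subtree merge of B into bdir/, then one more commit on the merged file.
cd "$work/a"
git remote add B ../b
git fetch -q B
git merge -q -s ours --no-commit --allow-unrelated-histories B/main
git read-tree --prefix=bdir/ -u B/main
git commit -q -m "subtree merge B into bdir"
echo BBB >>bdir/B
git commit -q -a -m BBB

# Subjects reported for the file at its new, prefixed path.
prefixed=$(git log --format=%s -- bdir/B)
# Subjects reported when following the old, unprefixed path from the repo root.
unprefixed=$(git log --follow --format=%s -- B)

printf 'git log -- bdir/B:\n%s\n\n' "$prefixed"
printf 'git log --follow -- B:\n%s\n' "$unprefixed"
```

Comparing the two lists shows which commits each path can reach; neither invocation is expected to list both the commits made in b and the commit made after the merge.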

So, the solution, I feel, is to tell git about the rename that will be taking place as part of the subtree merge. Unfortunately git-read-tree is pretty fussy about subtree merges so we have to work through a temporary directory, but that can go away before we commit. Afterwards, we can see the full history.

First, create an "A" repository and make some commits:

mkdir -p z/a z/b
cd z/a
git init
echo A>A
git add A
git commit -m A
echo AA>>A
git commit -a -m AA

Second, create a "B" repository and make some commits:

cd ../b
git init
echo B>B
git add B
git commit -m B
echo BB>>B
git commit -a -m BB

And the trick to making this work: force Git to recognize the rename by creating a subdirectory and moving the contents into it.

mkdir bdir
git mv B bdir
git commit -a -m bdir-rename

Return to repository "A" and fetch and merge the contents of "B":

cd ../a
git remote add -f B ../b
git merge -s ours --no-commit B/master
# According to Alex Brown and pjvandehaar, newer versions of git need --allow-unrelated-histories
# git merge -s ours --allow-unrelated-histories --no-commit B/master
git read-tree --prefix= -u B/master
git commit -m "subtree merge B into bdir"

To show that they're now merged:

cd bdir
echo BBB>>B
git commit -a -m BBB

To prove the full history is preserved in a connected chain:

git log --follow B
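
The sequence above can be tried end to end in a scratch directory. This is a minimal sketch, assuming Git 2.9 or later; the throwaway identity, the main branch name, and the temporary directory are assumptions for the demo. It also swaps in a plain git merge --allow-unrelated-histories for the merge -s ours plus read-tree pair, which should give the same tree here because the two repositories share no paths.

```shell
#!/bin/sh
# Sketch: record the move into bdir/ as a commit in B, merge B into A,
# then check that --follow walks back through the rename.
set -eu
# Throwaway identity so commits work without any git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)                        # scratch location (assumption)
mkdir -p "$work/a" "$work/b"

cd "$work/a"
git init -q
git symbolic-ref HEAD refs/heads/main    # fixed branch name for the demo
echo A >A; git add A; git commit -q -m A

cd "$work/b"
git init -q
git symbolic-ref HEAD refs/heads/main
echo B >B;   git add B; git commit -q -m B
echo BB >>B; git commit -q -a -m BB
# The trick: make the rename an explicit commit in B before merging.
mkdir bdir
git mv B bdir
git commit -q -m bdir-rename

cd "$work/a"
git remote add B ../b
git fetch -q B
# Plain merge used here instead of `merge -s ours` + `read-tree` (assumption).
git merge -q --allow-unrelated-histories -m "merge B into bdir" B/main
echo BBB >>bdir/B
git commit -q -a -m BBB

# Subjects of the commits that --follow reports for the merged file.
followed=$(git log --follow --format=%s -- bdir/B)
printf '%s\n' "$followed"
```

If the rename is followed, the printed subjects include commits made before the merge (in b) as well as the one made after it.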

We get the history after doing this, but the problem is that if you are actually keeping the old "b" repo around and occasionally merging from it (say it is actually a third party separately maintained repo) you are in trouble since that third party will not have done the rename. You must try to merge new changes into your version of b with the rename and I fear that will not go smoothly. But if b is going away, you win.


I wanted to

  1. keep a linear history without explicit merge, and
  2. make it look like the files of the merged repository had always existed in the subdirectory, and as a side effect make git log -- file work without --follow.

Step 1: Rewrite history in the source repository to make it look like all files always existed below the subdirectory.

Create a temporary branch for the rewritten history.

git checkout -b tmp_subdir

Then use git filter-branch as described in How can I rewrite history so that all files, except the ones I already moved, are in a subdirectory?:

git filter-branch --prune-empty --tree-filter '
if [ ! -e foo/bar ]; then
    mkdir -p foo/bar
    git ls-tree --name-only $GIT_COMMIT | xargs -I files mv files foo/bar
fi'

Step 2: Switch to the target repository. Add the source repository as remote in the target repository and fetch its contents.

git remote add sourcerepo .../path/to/sourcerepo
git fetch sourcerepo

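Step 3: Use rebase --onto to add the commits of the rewritten source repository on top of the target repository. (The --preserve-merges option was removed in Git 2.34; on newer versions use --rebase-merges instead.)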

git rebase --preserve-merges --onto master --root sourcerepo/tmp_subdir

You can check the log to see that this really got you what you wanted.

git log --stat

Step 4: After the rebase you’re in “detached HEAD” state. You can fast-forward master to the new head.

git checkout -b tmp_merged
git checkout master
git merge tmp_merged
git branch -d tmp_merged

Step 5: Finally some cleanup: Remove the temporary remote.

git remote rm sourcerepo
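
Steps 1 to 5 can be rehearsed on two tiny repositories before touching real ones. The following is a minimal sketch under stated assumptions: the repositories, file names, main branch name, and throwaway identity are invented for the demo, FILTER_BRANCH_SQUELCH_WARNING is set only to skip filter-branch's deprecation pause on newer Git, and the merge-preserving rebase option is left out because the demo history is linear.

```shell
#!/bin/sh
# Sketch: rewrite a source repo into foo/bar, then replay it onto the target.
set -eu
# Throwaway identity so commits work without any git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation pause

work=$(mktemp -d)                        # scratch location (assumption)
mkdir -p "$work/sourcerepo" "$work/target"

cd "$work/sourcerepo"
git init -q
git symbolic-ref HEAD refs/heads/main    # fixed branch name for the demo
echo B >B;   git add B; git commit -q -m B
echo BB >>B; git commit -q -a -m BB

# Step 1: rewrite a temporary branch so files always lived in foo/bar.
git checkout -q -b tmp_subdir
git filter-branch --prune-empty --tree-filter '
if [ ! -e foo/bar ]; then
    mkdir -p foo/bar
    git ls-tree --name-only $GIT_COMMIT | xargs -I files mv files foo/bar
fi' >/dev/null 2>&1

# Step 2: fetch the rewritten branch into the target repository.
cd "$work/target"
git init -q
git symbolic-ref HEAD refs/heads/main
echo A >A; git add A; git commit -q -m A
git remote add sourcerepo ../sourcerepo
git fetch -q sourcerepo

# Step 3: replay the rewritten commits on top of main (ends on a detached HEAD).
git rebase -q --onto main --root sourcerepo/tmp_subdir

# Step 4: fast-forward main to the rebased commits.
git checkout -q -b tmp_merged
git checkout -q main
git merge -q tmp_merged
git branch -q -d tmp_merged

# Step 5: remove the temporary remote.
git remote rm sourcerepo

file_log=$(git log --format=%s -- foo/bar/B)      # no --follow needed
merge_count=$(git rev-list --merges --count HEAD) # linear history check
printf '%s\n' "$file_log"
echo "merge commits: $merge_count"
```

The two values at the end correspond to the two goals stated at the top: a history without merge commits, and a per-file log that works on the prefixed path without --follow.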


If you really want to stitch things together, look up grafting. You should also be using git rebase --preserve-merges --onto. There is also a rebase option, --committer-date-is-author-date, that reuses the author date as the committer date.


I found the following solution workable for me. First I go into project B and create a new branch in which all files are moved into the new subdirectory, then I push this new branch to origin. Next I go to project A, add and fetch the remote of B, check out the moved branch, go back to master and merge:

# in local copy of project B
git checkout -b prepare_move
mkdir subdir
git mv <files_to_move> subdir/
git commit -m 'move files to subdir'
git push origin prepare_move

# in local copy of project A
git remote add -f B_origin <remote-url>
git checkout -b from_B B_origin/prepare_move
git checkout master
git merge from_B  # Git 2.9 and later: add --allow-unrelated-histories

If I go to sub directory subdir, I can use git log --follow and still have the history.

I'm not a git expert, so I cannot comment whether this is a particularly good solution or if it has caveats, but so far it seems all fine.


Have you tried adding the extra repository as a git submodule? It won't merge the history with the containing repository, in fact, it will be an independent repository.

I mention it, because you haven't.


Say you want to merge repository a into b (I'm assuming they're located alongside one another):

cd a
git filter-repo --to-subdirectory-filter a
cd ..
cd b
git remote add a ../a
git fetch a
git merge --allow-unrelated-histories a/master
git remote remove a

For this you need git-filter-repo installed (filter-branch is discouraged).

An example of merging 2 big repositories, putting one of them into a subdirectory: https://gist.github.com/x-yuri/9890ab1079cf4357d6f269d073fd9731

More on it here.


Similar to hfs' answer I wanted to

  • keep a linear history without explicit merge and
  • make it look like the files of the merged repository had always existed in the subdirectory, and as a side effect make git log -- file work without --follow.

However, I chose the more modern filter-repo (assuming the new repo exists and is checked out):

git clone git@host/repo/old.git
cd old
git checkout -b tmp_subdir
git filter-repo --to-subdirectory-filter old

cd ../new
git remote add old ../old
git fetch old
git rebase --rebase-merges --onto main --root old/tmp_subdir --committer-date-is-author-date

You might need to fix conflicts manually, or change the rebase command to include --merge -s recursive -X theirs if you want to try resolving them with the "theirs" version:

git rebase --rebase-merges --onto main --root old/tmp_subdir --committer-date-is-author-date --merge -s recursive -X theirs

You end up on a detached HEAD, so create a new branch and merge it into main. Note that modern repositories should use a "main" branch rather than a "master" branch, for more inclusive language.

git checkout -b old_merge
git checkout main
git merge old_merge

cleanup

git branch -d old_merge
git remote rm old
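
The effect of --committer-date-is-author-date in the rebase above can be checked in isolation. This is a minimal sketch, assuming a Git version whose rebase accepts that flag together with --root; it skips the filter-repo step entirely, and the repository names, file names, throwaway identity, and the fixed 2001 timestamp are all invented for the demo.

```shell
#!/bin/sh
# Sketch: rebase one old commit onto another repo's main and compare its
# author and committer timestamps afterwards.
set -eu
# Throwaway identity so commits work without any git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)                        # scratch location (assumption)
mkdir -p "$work/old" "$work/new"

cd "$work/old"
git init -q
git symbolic-ref HEAD refs/heads/main    # fixed branch name for the demo
echo old >old.txt; git add old.txt
# Fixed date in the past (2001-01-01 UTC) so the flag's effect is visible.
GIT_AUTHOR_DATE='978307200 +0000' GIT_COMMITTER_DATE='978307200 +0000' \
    git commit -q -m "old commit"

cd "$work/new"
git init -q
git symbolic-ref HEAD refs/heads/main
echo new >new.txt; git add new.txt; git commit -q -m "new commit"
git remote add old ../old
git fetch -q old

# Replay the old history onto main, reusing author dates as committer dates.
git rebase -q --committer-date-is-author-date --onto main --root old/main

author_ts=$(git log -1 --format=%at)     # author timestamp of rebased commit
committer_ts=$(git log -1 --format=%ct)  # committer timestamp of rebased commit
echo "author=$author_ts committer=$committer_ts"
```

Without the flag, the committer timestamp of the rebased commit would be the time of the rebase, which makes date-ordered views of the combined history harder to read.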