Update Git submodule to latest commit on origin

I have a project with a Git submodule. It is from an ssh://... URL, and is on commit A. Commit B has been pushed to that URL, and I want the submodule to retrieve the commit, and change to it.

Now, my understanding is that git submodule update should do this, but it doesn't. It doesn't do anything (no output, success exit code). Here's an example:

$ mkdir foo
$ cd foo
$ git init .
Initialized empty Git repository in /.../foo/.git/
$ git submodule add ssh://user@host/git/mod mod
Cloning into mod...
user@host's password: hunter2
remote: Counting objects: 131, done.
remote: Compressing objects: 100% (115/115), done.
remote: Total 131 (delta 54), reused 0 (delta 0)
Receiving objects: 100% (131/131), 16.16 KiB, done.
Resolving deltas: 100% (54/54), done.
$ git commit -m "Hello world."
[master (root-commit) 565b235] Hello world.
 2 files changed, 4 insertions(+), 0 deletions(-)
 create mode 100644 .gitmodules
 create mode 160000 mod
# At this point, ssh://user@host/git/mod changes; submodule needs to change too.
$ git submodule init
Submodule 'mod' (ssh://user@host/git/mod) registered for path 'mod'
$ git submodule update
$ git submodule sync
Synchronizing submodule url for 'mod'
$ git submodule update
$ man git-submodule 
$ git submodule update --rebase
$ git submodule update
$ echo $?
0
$ git status
# On branch master
nothing to commit (working directory clean)
$ git submodule update mod
$ ...

I've also tried git fetch mod, which appears to do a fetch (but can't possibly, because it's not prompting for a password!), but git log and git show deny the existence of new commits. Thus far I've just been rm-ing the module and re-adding it, but this is both wrong in principle and tedious in practice.


The git submodule update command actually tells Git that you want your submodules to each check out the commit already specified in the index of the superproject. If you want to update your submodules to the latest commit available from their remote, you will need to do this directly in the submodules.

So in summary:

# Get the submodule initially
git submodule add ssh://bla submodule_dir
git submodule init

# Time passes, submodule upstream is updated
# and you now want to update

# Change to the submodule directory
cd submodule_dir

# Checkout desired branch
git checkout master

# Update
git pull

# Get back to your project root
cd ..

# Now the submodules are in the state you want, so
git commit -am "Pulled down update to submodule_dir"

Or, if you're a busy person:

git submodule foreach git pull origin master


Git 1.8.2 features a new option, --remote, that will enable exactly this behavior. Running

git submodule update --remote --merge

will fetch the latest changes from upstream in each submodule, merge them in, and check out the latest revision of the submodule. As the documentation puts it:

--remote

This option is only valid for the update command. Instead of using the superproject’s recorded SHA-1 to update the submodule, use the status of the submodule’s remote-tracking branch.

This is roughly equivalent to running git pull <remote> <default_branch> (usually git pull origin master or git pull origin main) in each submodule, which is generally what you want.
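To see the difference between the two forms without touching a real project, here is a minimal sketch that builds two throwaway repositories in a temporary directory. The names lib, super and mod, the demo identity, and the protocol.file.allow override (recent Git versions need it for submodules cloned from a local path) are assumptions for illustration only, not part of the question's setup.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Recent Git refuses local-path submodule transports unless explicitly allowed.
allow="-c protocol.file.allow=always"

# 1. "Upstream" repository with commit A.
git init -q lib
git -C lib commit -q --allow-empty -m "A"

# 2. Superproject that records commit A of the submodule.
git init -q super
git -C super $allow submodule --quiet add "$work/lib" mod
git -C super commit -q -m "Add mod at A"
commit_a=$(git -C super/mod rev-parse HEAD)

# 3. Upstream moves on to commit B.
git -C lib commit -q --allow-empty -m "B"
commit_b=$(git -C lib rev-parse HEAD)

# 4. Plain update: the superproject still records A, so nothing moves.
git -C super $allow submodule --quiet update
after_plain=$(git -C super/mod rev-parse HEAD)

# 5. --remote: fetch upstream and check out its tip instead.
git -C super $allow submodule --quiet update --remote
after_remote=$(git -C super/mod rev-parse HEAD)

echo "recorded (A):       $commit_a"
echo "upstream tip (B):   $commit_b"
echo "after plain update: $after_plain"
echo "after --remote:     $after_remote"

cd / && rm -rf "$work"
```

After step 5 the superproject sees mod as modified; you still need git add mod and a commit to record the new pointer.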


In your project parent directory, run:

git submodule update --init

Or if you have recursive submodules run:

git submodule update --init --recursive

Sometimes this still doesn't work because the submodule directory contains local changes when the update runs.

Most of the time these local changes are not something you want to commit; they can come from an accidentally deleted file in the submodule, for example. If so, reset the submodule directory and then, from your project parent directory, run again:

git submodule update --init --recursive


Your main project points to a particular commit that the submodule should be at. git submodule update tries to check out that commit in each submodule that has been initialized. The submodule is really an independent repository - just creating a new commit in the submodule and pushing that isn't enough. You also need to explicitly add the new version of the submodule in the main project.

So, in your case, you should find the right commit in the submodule - let's assume that's the tip of master:

cd mod
git checkout master
git pull origin master

Now go back to the main project, stage the submodule and commit that:

cd ..
git add mod
git commit -m "Updating the submodule 'mod' to the latest version"

Now push your new version of the main project:

git push origin master

From this point on, if anyone else updates their main project, then git submodule update for them will update the submodule, assuming it's been initialized.
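The "pointer" that the main project stores can be inspected directly. The sketch below is only an illustration in throwaway repositories (lib, super, the demo identity and the protocol.file.allow override are assumptions); it reads the commit recorded for mod before and after git add mod.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity and local-path transport override (assumptions).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
allow="-c protocol.file.allow=always"

# Upstream repository and a main project that embeds it as "mod".
git init -q lib
git -C lib commit -q --allow-empty -m "A"
git init -q super
git -C super $allow submodule --quiet add "$work/lib" mod
git -C super commit -q -m "Add mod"

# Upstream gets a new commit, and we pull it inside the submodule.
git -C lib commit -q --allow-empty -m "B"
git -C super/mod $allow pull -q --ff-only origin HEAD
tip=$(git -C super/mod rev-parse HEAD)

# The main project still records the old commit until we stage the path.
recorded_before=$(git -C super rev-parse HEAD:mod)
git -C super add mod
git -C super commit -q -m "Updating the submodule 'mod' to the latest version"
recorded_after=$(git -C super rev-parse HEAD:mod)

echo "submodule tip:   $tip"
echo "recorded before: $recorded_before"
echo "recorded after:  $recorded_after"

cd / && rm -rf "$work"
```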


It seems like two different scenarios are being mixed together in this discussion:

Scenario 1

Using my parent repository's pointers to submodules, I want to check out the commit in each submodule that the parent repository is pointing to, possibly after first iterating through all submodules and updating/pulling these from remote.

This is, as pointed out, done with

git submodule foreach git pull origin BRANCH
git submodule update

Scenario 2, which I think is what OP is aiming at

New stuff has happened in one or more submodules, and I want to 1) pull these changes and 2) update the parent repository to point to the HEAD (latest) commit of this/these submodules.

This would be done by

git submodule foreach git pull origin BRANCH
git add module_1_name
git add module_2_name
......
git add module_n_name
git push origin BRANCH

Not very practical, since you would have to hardcode n paths to all n submodules in e.g. a script to update the parent repository's commit pointers.

It would be cool to have an automated iteration through each submodule, updating the parent repository pointer (using git add) to point to the head of the submodule(s).

For this, I made this small Bash script:

git-update-submodules.sh

#!/bin/bash

APP_PATH=$1
shift

if [ -z "$APP_PATH" ]; then
  echo "Missing 1st argument: should be path to folder of a git repo";
  exit 1;
fi

BRANCH=$1
shift

if [ -z "$BRANCH" ]; then
  echo "Missing 2nd argument (branch name)";
  exit 1;
fi

echo "Working in: $APP_PATH"
# Stop if the directory is missing, so later commands never run in the wrong place
cd "$APP_PATH" || exit 1

git checkout $BRANCH && git pull --ff origin $BRANCH

git submodule sync
git submodule init
git submodule update
git submodule foreach "(git checkout $BRANCH && git pull --ff origin $BRANCH && git push origin $BRANCH) || true"

for i in $(git submodule foreach --quiet 'echo $path')
do
  echo "Adding $i to root repo"
  git add "$i"
done

git commit -m "Updated $BRANCH branch of deployment repo to point to latest head of submodules"
git push origin $BRANCH

To run it, execute

git-update-submodules.sh /path/to/base/repo BRANCH_NAME

Elaboration

First of all, I assume that the branch with name $BRANCH (second argument) exists in all repositories. Feel free to make this even more complex.

The first couple of sections check that the arguments are there. Then I pull the parent repository's latest changes (I prefer to use --ff (fast-forwarding) whenever I'm just doing pulls. I have rebase off, BTW).

git checkout $BRANCH && git pull --ff origin $BRANCH

Then some submodule initialization, which might be necessary if new submodules have been added or are not initialized yet:

git submodule sync
git submodule init
git submodule update

Then I update/pull all submodules:

git submodule foreach "(git checkout $BRANCH && git pull --ff origin $BRANCH && git push origin $BRANCH) || true"

Notice a few things: First of all, I'm chaining some Git commands using &&, meaning each command only runs if the previous one executed without error.

After a possible successful pull (if new stuff was found on the remote), I do a push to ensure that a possible merge-commit is not left behind on the client. Again, it only happens if a pull actually brought in new stuff.

Finally, the final || true ensures that the script continues on errors. To make this work, everything in the iteration must be wrapped in double quotes, and the Git commands are grouped in parentheses so that || true clearly applies to the whole chain.
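The effect of the trailing || true can be checked in isolation. This is a small sketch with stand-in functions (step_ok and step_fail are hypothetical placeholders for the Git commands); it matters because git submodule foreach stops at the first submodule whose command returns non-zero.

```shell
#!/bin/sh
# Stand-ins for the Git commands in the chain (hypothetical helpers).
step_ok()   { return 0; }
step_fail() { return 1; }

# Without a fallback, the chain reports the failure of its second step.
status_plain=0
( step_ok && step_fail && step_ok ) || status_plain=$?

# With "|| true", the command as a whole succeeds, so the loop moves on.
status_guarded=0
{ ( step_ok && step_fail && step_ok ) || true; } || status_guarded=$?

echo "plain chain:   exit status $status_plain"
echo "guarded chain: exit status $status_guarded"
```

The trade-off is that real failures in a submodule are silently skipped, so it may be worth logging them if you adopt this pattern.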

My favourite part:

for i in $(git submodule foreach --quiet 'echo $path')
do
  echo "Adding $i to root repo"
  git add "$i"
done

Iterate over all submodules with --quiet, which removes the 'Entering MODULE_PATH' output. Using 'echo $path' (must be in single quotes), the path to each submodule gets written to output.

This list of relative submodule paths is captured with command substitution ($(...)) and split on whitespace, so it assumes that submodule paths contain no spaces. Finally, iterate over it and do git add $i to update the parent repository.

Finally, a commit with some message explaining that the parent repository was updated. If nothing changed, git commit reports that there is nothing to commit and no commit is created. Push this to origin, and you're done.

I have a script running this in a Jenkins job that chains to a scheduled automated deployment afterwards, and it works like a charm.

I hope this will be of help to someone.


Note: the modern form of updating submodule commits would be:

git submodule update --recursive --remote --force

See Gabriel Staples's answer for an alternative take, not using --merge --force.

The --force option allows for the checkout to take place even if the commit specified in the index of the containing repository already matches the commit checked out in the submodule.

The --merge option does not seem necessary in this case: "the commit recorded in the superproject will be merged into the current branch in the submodule."


The older form was:

git submodule foreach --quiet git pull --quiet origin

Except... this second form is not really "quiet".

See commit a282f5a (12 Apr 2019) by Nguyễn Thái Ngọc Duy (pclouds).
(Merged by Junio C Hamano -- gitster -- in commit f1c9f6c, 25 Apr 2019)

submodule foreach: fix "<command> --quiet" not being respected

Robin reported that

git submodule foreach --quiet git pull --quiet origin

is not really quiet anymore.
It should be quiet before fc1b924 (submodule: port submodule subcommand 'foreach' from shell to C, 2018-05-10, Git v2.19.0-rc0) because parseopt can't accidentally eat options then.

"git pull" behaves as if --quiet is not given.

This happens because parseopt in submodule--helper will try to parse both --quiet options as if they are foreach's options, not git-pull's.
The parsed options are removed from the command line. So when we do pull later, we execute just this

git pull origin

When calling submodule helper, adding "--" in front of "git pull" will stop parseopt for parsing options that do not really belong to submodule--helper foreach.

PARSE_OPT_KEEP_UNKNOWN is removed as a safety measure. parseopt should never see unknown options or something has gone wrong. There are also a couple usage string update while I'm looking at them.

While at it, I also add "--" to other subcommands that pass "$@" to submodule--helper. "$@" in these cases are paths and less likely to be --something-like-this.
But the point still stands, git-submodule has parsed and classified what are options, what are paths.
submodule--helper should never consider paths passed by git-submodule to be options even if they look like one.


And Git 2.23 (Q3 2019) fixes another issue: "git submodule foreach" did not protect command line options passed to the command to be run in each submodule correctly, when the "--recursive" option was in use.

See commit 30db18b (24 Jun 2019) by Morian Sonnet (momoson).
(Merged by Junio C Hamano -- gitster -- in commit 968eecb, 09 Jul 2019)

submodule foreach: fix recursion of options

Calling:

git submodule foreach --recursive <subcommand> --<option>

leads to an error stating that the option --<option> is unknown to submodule--helper.
That is of course only, when <option> is not a valid option for git submodule foreach.

The reason for this is, that above call is internally translated into a call to submodule--helper:

git submodule--helper foreach --recursive \
   -- <subcommand> --<option>

This call starts by executing the subcommand with its option inside the first level submodule and continues by calling the next iteration of the submodule foreach call

git --super-prefix <submodulepath> submodule--helper \
  foreach --recursive <subcommand> --<option>

inside the first level submodule. Note that the double dash in front of the subcommand is missing.

This problem starts to arise only recently, as the PARSE_OPT_KEEP_UNKNOWN flag for the argument parsing of git submodule foreach was removed in commit a282f5a.
Hence, the unknown option is complained about now, as the argument parsing is not properly ended by the double dash.

This commit fixes the problem by adding the double dash in front of the subcommand during the recursion.


Note that, before Git 2.29 (Q4 2020), "git submodule update --quiet" did not squelch underlying "rebase" and "pull" commands.

See commit 3ad0401 (30 Sep 2020) by Theodore Dubois (tbodt).
(Merged by Junio C Hamano -- gitster -- in commit 300cd14, 05 Oct 2020)

submodule update: silence underlying merge/rebase with "--quiet"

Signed-off-by: Theodore Dubois

Commands such as

$ git pull --rebase --recurse-submodules --quiet  

produce non-quiet output from the merge or rebase.
Pass the --quiet option down when invoking "rebase" and "merge".

Also fix the parsing of git submodule update -v.

When e84c3cf3 ("git-submodule.sh: accept verbose flag in cmd_update to be non-quiet", 2018-08-14, Git v2.19.0-rc0 -- merge) taught "git submodule update" to take "--quiet", it apparently did not know how ${GIT_QUIET:+--quiet} works, and reviewers seem to have missed that setting the variable to "0", rather than unsetting it, still results in "--quiet" being passed to underlying commands.
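The ${GIT_QUIET:+--quiet} pitfall described in that commit message can be reproduced in any POSIX shell. This is a small sketch, not code from Git itself; flag_for is a hypothetical helper name.

```shell
#!/bin/sh
# ${VAR:+word} expands to "word" whenever VAR is set and non-empty,
# no matter what the value actually is.

flag_for() {
  # Hypothetical helper: print the flag that would be passed down.
  GIT_QUIET=$1
  printf '%s' "${GIT_QUIET:+--quiet}"
}

quiet_one=$(flag_for 1)      # non-empty value
quiet_zero=$(flag_for 0)     # "0" is still a non-empty string
quiet_empty=$(flag_for "")   # empty value

echo "GIT_QUIET=1  -> '$quiet_one'"
echo "GIT_QUIET=0  -> '$quiet_zero'"
echo "GIT_QUIET='' -> '$quiet_empty'"
```

Only the empty (or unset) variable yields no flag, which is why setting it to "0" did not turn quiet mode off.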


With Git 2.38 (Q3 2022), git-submodule.sh is prepared to be turned into a builtin, meaning the submodule--helper which has issues described above is being faded out.

See commit 5b893f7, commit 2eec463, commit 8f12108, commit 36d4516, commit 6e556c4, commit 0d68ee7, commit d9c7f69, commit da3aae9, commit 757d092, commit 960fad9, commit 8577525 (28 Jun 2022) by Ævar Arnfjörð Bjarmason (avar).
See commit b788fc6 (28 Jun 2022) by Glen Choo (chooglen).
(Merged by Junio C Hamano -- gitster -- in commit 361cbe6, 14 Jul 2022)

git-submodule.sh: use "$quiet", not "$GIT_QUIET"

Signed-off-by: Ævar Arnfjörð Bjarmason

Remove the use of the "$GIT_QUIET" variable in favor of our own "$quiet", ever since b3c5f5c ("submodule: move core cmd_update() logic to C", 2022-03-15, Git v2.36.0-rc0 -- merge) we have not used the "say" function in git-sh-setup.sh, which is the only thing that's affected by using "GIT_QUIET".

We still want to support --quiet for our own use though, but let's use our own variable for that.
Now it's obvious that we only care about passing "--quiet" to git submodule--helper, and not to change the output of any "say" invocation.


Plain and simple, to fetch the submodules:

git submodule update --init --recursive

And now proceed updating them to the latest master branch (for example):

git submodule foreach git pull origin master


git pull --recurse-submodules

This will pull all the latest commits.


This works for me to update to the latest commits

git submodule update --recursive --remote --init


In my case, I wanted git to update to the latest and at the same time re-populate any missing files.

The following restored the missing files (thanks to --force which doesn't seem to have been mentioned here), but it didn't pull any new commits:

git submodule update --init --recursive --force

This did:

git submodule update --recursive --remote --merge --force


If you don't know the branch name in advance, use this:

git submodule foreach git pull origin $(git rev-parse --abbrev-ref HEAD)

Your shell expands $(git rev-parse --abbrev-ref HEAD) once, in the main Git repository, before foreach runs; each submodule then pulls the branch with that same name.


@Jason is correct in a way but not entirely.

update

Update the registered submodules, i.e. clone missing submodules and checkout the commit specified in the index of the containing repository. This will make the submodules HEAD be detached unless --rebase or --merge is specified or the key submodule.$name.update is set to rebase or merge.

So, git submodule update does checkout, but it is to the commit in the index of the containing repository. It does not yet know of the new commit upstream at all. So go to your submodule, get the commit you want and commit the updated submodule state in the main repository and then do the git submodule update.


If you are looking to checkout master branch for each submodule -- you can use the following command for that purpose:

git submodule foreach git checkout master


For me, none of the git submodule commands worked. But this worked:

cd <path/to/submodule>
git pull

It downloads and thus updates the third party repo. Then

cd <path/to/repo>
git commit -m "update latest version" <relative_path/to/submodule>
git push

which updates your remote repo (with the link to the last commit repo@xxxxxx).


How to update all git submodules in a repo (two ways to do two very different things!)

Quick summary

# Option 1: as a **user** of the outer repo, pull the latest changes of the
# sub-repos as previously specified (pointed to as commit hashes) by developers
# of this outer repo.
# - This recursively updates all git submodules to their commit hash pointers as
#   currently committed in the outer repo.
git submodule update --init --recursive

# Option 2. As a **developer** of the outer repo, update all subrepos to force
# them each to pull the latest changes from their respective upstreams (ex: via
# `git pull origin main` or `git pull origin master`, or similar, for each
# sub-repo). 
git submodule update --init --recursive --remote

# For both options above: now add and commit these subrepo changes
git add -A
git commit -m "Update all subrepos to their latest upstream changes"

Details

  1. Option 1: as a user of the outer repo, trying to get all submodules into the state intended by the developers of the outer repo:
    git submodule update --init --recursive
    
  2. Option 2: as a developer of the outer repo, trying to update all submodules to the latest commit pushed to the default branch of each of their remote repos (ie: update all subrepos to the latest state intended by the developers of each subrepo):
    git submodule update --init --recursive --remote
    
    ...in place of using git submodule foreach --recursive git pull origin master or git submodule foreach --recursive git pull origin main.

It seems to me that the best answer for both options above is to not use the --merge and --force options I see in some other answers.

Explanation of the options used above:

  • the --init part above initializes the submodule in case you just cloned the repo and haven't done that yet
  • --recursive does this for submodules within submodules, recursively down forever
  • and --remote says to update the submodule to the latest commit on the default branch on the default remote for the submodule. It is like doing git pull origin master or git pull origin main in most cases, for example, for each submodule. If you want to update to the commit specified by the outer-most repo (super repo) instead, leave --remote off.
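As a side note on --remote: per the git-submodule documentation, the branch it follows can be overridden for a given submodule with submodule.<name>.branch in .gitmodules or .git/config. The following is a minimal sketch in throwaway repositories; lib, super, mod, the develop branch, the demo identity and the protocol.file.allow override are all assumptions for illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity and local-path transport override (assumptions).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
allow="-c protocol.file.allow=always"

# Upstream repository with a default branch and a "develop" branch.
git init -q lib
git -C lib commit -q --allow-empty -m "A"
git -C lib branch develop

git init -q super
git -C super $allow submodule --quiet add "$work/lib" mod
git -C super commit -q -m "Add mod"

# A new commit lands on develop only; the default branch stays at A.
git -C lib checkout -q develop
git -C lib commit -q --allow-empty -m "B on develop"
git -C lib checkout -q -
default_tip=$(git -C lib rev-parse HEAD)
develop_tip=$(git -C lib rev-parse develop)

# Tell --remote to follow develop for this submodule, then update.
git -C super config -f .gitmodules submodule.mod.branch develop
git -C super $allow submodule --quiet update --remote
tracked=$(git -C super/mod rev-parse HEAD)

echo "default branch tip: $default_tip"
echo "develop tip:        $develop_tip"
echo "submodule is at:    $tracked"

cd / && rm -rf "$work"
```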

git submodule foreach --recursive git pull (don't use this--it frequently fails) vs git submodule update --recursive --remote (use this!--it always works)

I left the following comments under this answer. I think they are important so I am putting them in my answer too.

Basically, for some situations, git submodule foreach --recursive git pull might work. For others, git submodule foreach --recursive git pull origin master might be what you need instead. For others, git submodule foreach --recursive git pull origin main might be what you need. And for others still, none of those might work! You might need git submodule foreach --recursive git pull upstream develop, for instance. OR, even worse, there might not be any git submodule foreach command which works for your outer repo, as each submodule might require a different command to update itself from its default remote and default branch. In all cases I can find, however, this does work, including for all cases you might use one of the several git submodule foreach commands I just presented above. So, use this instead:

git submodule update --recursive --remote

Anyway, here are my several comments about that under this answer:

(1/4) @DavidZ, a lot of people think that git submodule foreach git pull and git submodule update --remote are the same thing, with the latter simply being the newer command. They aren't the same thing, however. git submodule foreach git pull will fail under multiple circumstances for which git submodule update --remote works just fine! If your submodule points to a commit hash that doesn't have a branch pointing to it, which is frequently the case in real-life development where you want a particular version of the submodule for your outer repo, then that submodule...

(2/4)...is in a detached HEAD state. In this case, git submodule foreach git pull fails to run git pull on that submodule since a detached HEAD cannot have an upstream branch. git submodule update --remote, however, works just fine! It appears to call git pull origin main on that submodule if origin is the default remote and main is the default branch on that default remote, or git pull origin master, for instance, if origin is the default remote but master is the default branch.

(3/4) Furthermore, git submodule foreach git pull origin master will even fail in many cases where git submodule update --remote works just fine, since many submodules use master as the default branch, and many other submodules use main as the default branch since GitHub changed from master to main recently in order to get away from terms related to slavery in the United States ("master" and "slave").

(4/4) So, I added the explicit remote and branch to make it more clear that they are frequently needed, and to remind people that git pull is frequently not enough, and git pull origin master may not work, and git pull origin main may work when the former doesn't, but also may not even work, and that none of them by themselves are the same as git submodule update --remote, since that latter command is smart enough to just do git pull <default_remote> <default_branch> for you for each submodule, apparently adjusting the remote and branch as necessary for each submodule.

Related, & other research

  1. How to find the primary branch of a repo: https://stackoverflow.com/a/49384283/4561887
  2. How to update each subrepo by running a custom command in it via git submodule foreach <cmd>: https://stackoverflow.com/a/45744725/4561887
  3. man git submodule - then search for foreach, --remote, etc.


Here's an awesome one-liner to update everything to the latest on master:

git submodule foreach 'git fetch origin --tags; git checkout master; git pull' && git pull && git submodule update --init --recursive

Thanks to Mark Jaquith


The simplest way to handle Git projects containing submodules is to add

--recurse-submodules

to each Git command that supports it. For example:

git fetch --recurse-submodules

Another:

git pull --update --recurse-submodules

etc...
