
How to complete a git clone for a big project on an unstable connection?

I am trying to git clone the LibreOffice codebase, but at the moment I have an internet connection of about 300 kbps and it's anything but stable. The connection can drop at any moment, and when it does the git clone process has already stopped working, with no way to get it running again. Is there some way to make the git clone download more failure-resistant?

One option I considered myself is to download someone else's .git directory, but that is overly dependent on others and doesn't seem like the best possible solution to me.


Two solutions (or rather workarounds) that come to mind are:

  • Use a shallow clone, i.e. git clone --depth=1, then deepen this clone using git fetch --depth=N with increasing N. You can use git fetch --unshallow (since git 1.8.3) to download all remaining revisions.

  • Ask somebody to create a bundle up to some tagged release (see the git-bundle(1) manpage). The bundle itself is an ordinary file, which you can download any way you like: via HTTP/FTP with resume support, via BitTorrent, via rsync, etc. Then you can create a clone from the bundle, fix the configuration, and do further fetches from the official LibreOffice repository.
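Both workarounds boil down to a few commands. Here is a minimal sketch against a local stand-in repository (the `upstream` path and commit contents are made up so the example runs offline; substitute the real LibreOffice URL in practice):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Throwaway "upstream" repo standing in for the real remote.
git init -q upstream
for n in 1 2 3; do
  git -C upstream -c user.email=a@b -c user.name=a \
      commit -q --allow-empty -m "commit $n"
done

# Workaround 1: shallow clone, then deepen step by step.
# (--depth needs a file:// URL to take effect on local paths.)
git clone -q --depth=1 "file://$tmp/upstream" shallow
git -C shallow fetch -q --depth=2          # grab one more revision
git -C shallow fetch -q --unshallow        # download everything that remains

# Workaround 2: pack the whole repository into a single resumable file.
git -C upstream bundle create "$tmp/repo.bundle" HEAD --all
git clone -q "$tmp/repo.bundle" from-bundle
```

The bundle file is the key part: since it is just one ordinary file, any downloader with resume support can move it across a flaky link.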


I don't think this is ready yet. There's an old GSoC page that planned to implement your desired feature. My best bet is, as you suggested, to download it as a directory. I'm assuming you are able to resume downloads over other protocols.

Restartable Clone

When cloning a large repository (such as KDE, Open Office, Linux kernel) there is currently no way to restart an interrupted clone. It may take considerable time for a user on the end of a small pipe to download the data, and if the clone is interrupted in the middle the user currently needs to start over from the beginning and try again. For some users this may make it impossible to clone a large repository.

Goal: Allow git-clone to automatically resume a previously failed download over the native git:// protocol. Language: C Mentor: Shawn Pearce Suggested by: Shawn Pearce on gmane


Update

Along with the shallow cloning (git clone --depth=1) suggestion in one of the other answers, it may be helpful if someone can make a bare repository for you, if you can communicate with the provider. You can easily convert the bare repository into a full repository. Also read the comments in that answer, as a shallow clone may not always help.
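Converting a bare repository into a normal one is just a local clone. A minimal sketch, with a locally fabricated `repo.git` standing in for whatever bare repository you were given:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Stand-in for the bare repository somebody prepared for you.
git init -q src
echo hello > src/README
git -C src add README
git -C src -c user.email=a@b -c user.name=a commit -q -m "initial"
git clone -q --bare src repo.git

# The bare repo has no working tree; cloning it locally gives you
# a full repository in one instant, offline operation.
git clone -q repo.git repo
```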


This method uses 3rd party server.

First, do a git clone --bare on the server, then transfer the result with rsync -v -P -e ssh user@host:repo.git . (the -P option makes rsync keep partial files and show progress, so an interrupted copy can be resumed). You can use msys under Windows.


"Never underestimate the bandwidth of a carrier pigeon and a bundle of SD cards" would be the modern form of this answer. Tar it up, plain old cp -a it, whatever, and mail the damn thing. Find someone willing to take two minutes of their time to drop a thumb drive into an SASE. Find a contact there; they might even do it for you.


You can "download someone else's .git directory", but with that someone else being the official repository itself. The LibreOffice repositories are available via http, for instance their build.git is at http://anongit.freedesktop.org/git/libreoffice/build.git/ (see http://cgit.freedesktop.org/libreoffice/ for the complete list, the http URL is at the bottom of each repository's page).

What you see at these http URLs is nothing more than a .git directory (actually a "bare" repository, which has only what you would find in the .git directory). It is the same directory the server for the git:// protocol (git daemon) would read. If you make a copy of these directories with a web downloader (for instance wget -m -np), you can clone from your copy and it will work as well as if you had cloned directly from the http repository.

So, what you can do is: for each repository, get a copy of it with your favorite web downloader (which will deal with all the issues with resuming broken downloads), and clone from that copy. When you want to update, use again your favorite web downloader to update your copy, and pull from that copy. Now your clones and updates are as resistant to bad connections as your favorite web downloader is.
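A sketch of the idea, with cp -a standing in for the web downloader so the example runs offline (in real use you would point wget -m -np at the anongit http URLs above):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Stand-in bare repository, playing the role of build.git on the server.
git init -q --bare build.git
git clone -q build.git seed
(cd seed
 echo x > file
 git add file
 git -c user.email=a@b -c user.name=a commit -q -m "add file"
 git push -q origin HEAD)

# "wget -m -np http://.../build.git/" would produce exactly this kind of
# local copy; cp -a plays the downloader's role here.
cp -a build.git mirror.git

# Clone from the mirrored copy; later, re-run the downloader to update
# the mirror and pull from it again.
git clone -q mirror.git work
```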


I would like to put in my two cents here. This is what actually helped me solve the issue:

  • Turn off compression
  • Increase http.postBuffer
  • Do a partial clone
  • Navigate to the cloned directory and fetch the rest of the clone
  • Pull the rest
git config --global core.compression 0
git config --global http.postBuffer 524288000
git clone <your_git_http_url_here> --depth 1
cd <your_repo_directory>
git fetch --unshallow
git pull --all

This helped me clone a ~3 GB repo over an 8 Mbps ADSL connection; of course I had to perform the fetch and pull a few times, but still...


Increasing the buffer size will help you with this problem. Just follow these steps.

  1. Open a terminal or Git Bash and cd to the location where you want to clone the repo.

  2. Set compression to 0

    git config --global core.compression 0
    
  3. Set postBuffer size

    git config --global http.postBuffer 1048576000
    
  4. Set maxRequestBuffer size

    git config --global http.maxRequestBuffer 100M
    
  5. Now start clone

    git clone <repo url>
    
  6. Wait until the clone completes.


Let's break git clone down into its component parts, and use git checkout to prevent re-downloading files.

When git clone runs, the first few things it does are equivalent to

git init
git remote add origin <repo_url>
git fetch origin <branch>

If you run the above steps manually, and assuming that they completed correctly, you can now run the following as many times as necessary:

git checkout --force <branch>

Note that it will check out all files each time it's run, but you will not have to re-download files, which may save you a ton of time.
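On a flaky link you can also wrap the fetch step in a retry loop; each attempt reuses the objects already sitting in .git, so nothing is re-downloaded. A sketch against a local stand-in remote (the paths and the branch name main are assumptions):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Local stand-in for the remote repository.
git init -q -b main origin-repo
git -C origin-repo -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial"

# The manual equivalent of git clone, as described above.
git init -q -b main work
cd work
git remote add origin "$tmp/origin-repo"

# Retry until the fetch finally completes; interrupted attempts still
# leave their downloaded objects behind for the next round.
until git fetch -q origin; do
  echo "fetch interrupted, retrying in 5s..." >&2
  sleep 5
done

# -B creates (or resets) the local branch from the fetched ref.
git checkout -q --force -B main origin/main
```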


git clone --depth <Number> <repository> --branch <branch name> --single-branch

This command helped me (thanks to Nicola Paolucci).

For example:

git clone --depth 1 https://github.com/gokhanmoral/siyahkernel3 --branch ics  --single-branch


If you have access to a 3rd-party server, you could clone there and then copy.


Use a git proxy, such as ngitcached or git-proxy.


This problem bit me too. In my case there is a work-around. It may or may not apply in your case.

I sometimes use a mobile phone to initiate git operations on a remote system. If my Wi-Fi breaks, of course the session ends and git drops the whole clone operation without recovering. But since the internet connection from my remote system to the git master is solid, there's no need for the clone to stop. All I need is the common sense to detach the clone from the terminal session. This can be done by using screen/tmux or nohup/daemon. So it's a liveware malfunction in my case.
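A minimal sketch of detaching the clone with nohup (screen/tmux achieve the same interactively; the repository here is a local stand-in so the example runs offline):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Local stand-in for the remote repository.
git init -q src
git -C src -c user.email=a@b -c user.name=a commit -q --allow-empty -m init

# nohup detaches the clone from the terminal: if the SSH session (or
# the phone's Wi-Fi) dies, the clone keeps running on the remote box.
nohup git clone -q "$tmp/src" "$tmp/dst" > "$tmp/clone.log" 2>&1 &
wait $!

# With tmux you would instead run:  tmux new -s clone 'git clone <url>'
# and later reattach with:          tmux attach -t clone
```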


Use Ctrl+Z to suspend the cloning. Don't close the terminal; put the system/laptop into hibernation, then continue later with the fg command. I was facing this same problem today while trying to clone a repo from GitHub. This came as a time-saver for me.


Same problem here: I have a really flaky internet connection, often with not more than 10-15 kB/s :-P

For me the wget way worked very well.

Go to the repository page with the green "Clone or download" button, click it, and copy the link of the ZIP download option.

Then pass the link to the wget command:

wget -c -m -np https://github.com/your/repository/archive/master.zip

Works like a charm...


If we assume the server has good bandwidth (and you have a server), another answer is to:

  1. create your own server using a server-side Git wrapper
  2. clone the repository on your server
  3. zip it using a server-side zip archiver
  4. download it from the server with resume support

But this only requires very basic web-development experience ;) and you also need git.exe on your server


The best workaround that worked for me:

I faced the same issue with a bad internet connection, so I came up with the following solution:

I created a small PHP file on my server to download the package as a zip file:

<?php
// Pull the zipball onto the server, where the connection is stable,
// so it can then be fetched locally with any resume-capable downloader.
$url = "https://codeload.github.com/CocoaPods/Specs/zip/master";
file_put_contents("coco.zip", fopen($url, 'r'));
?>

<a href="coco.zip">coco.zip</a>

Then download the zip file using any download manager that supports resume.


You can try to use mercurial with the hg-git extension.

If that doesn't work, you can use git fetch <commit-id> to fetch only parts of a remote git repository (you can fetch into an empty git repository; there is no need to create it with clone). But you might need to correct the branch configuration (i.e. create local and remote-tracking branches) when you use this approach.
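A sketch of the fetch-into-an-empty-repository approach against a local stand-in remote (the branch name main and the paths are assumptions):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Local stand-in for the remote repository.
git init -q -b main remote-repo
git -C remote-repo -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial"

# Fetch into an empty repository; no clone needed.
git init -q -b main partial
cd partial
git remote add origin "$tmp/remote-repo"
git fetch -q origin HEAD            # or a commit id, if the server allows it
git reset -q --hard FETCH_HEAD

# Fix up the branch configuration so a future 'git pull' works.
git config branch.main.remote origin
git config branch.main.merge refs/heads/main
```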
