开发者

Which is the smarter git protocol, ssh or git(over ssh) or https protocol?

Which is efficient? SSH:// or Git:// (File compression)

I understand in Git, git protocol is smart because there is a protocol agent on both ends of communication to compress the file transfer resulting in faster clone by efficiently using the network bandwidth.

From an O'Reilly book I found the following statements.

For secure, authenticated connections, the Git native 
protocol can be tunneled over an SSH connection using
the following URL templates:

ssh: //[user@]example.com[:port]/path/to/repo.git
ssh: //[user@]example.com/path/to/repo.git
ssh: //[user@]example.com/~user2/path/to/repo.git
ssh: //[user@]example.com/~/path/to/repo.git*

I'm not sure if the author means what he says. He talks of git protocol getting tunneled over SSH.

From my perspective, unless you connect to the git port (agent port), the protocol is not in effect. And SSH is merely an uncompressed file transfer. But as per the author, if we use SSH he says the git protocol is tunneled over it. So is SSH smarter in GIT?

Von C, Thanks for your answer. "Network protocols (HTTP and Git) are generally read-only" Git can be made rw when you run the daemon with --enable=receive-pack.

Following are my concerns.

When they say git protocol is smart, they mean when you execute git clone, Git server agent compresses the data that is sent back to the client, so the clone should be faster. In my use case I'll be setting the Git server in Hongkong and using it on San Jose and other countries as well, So I want to be efficient over network due to latency concerns.

So my question is whe开发者_Python百科n I use git clone ssh://user@server/reposloc do I get the benefits of git protocol also? As per O'Reilly author book he means git is tunneled over ssh, then how does git protocol work when I don't have git daemon running on the server.

So using SSh://xyz... does it give both the benefit of ssh and git protocols?

Appreciate your answers in advance.


Update 2010-2014:

Both ssh and https are equivalent, since Git 1.6.6+ (2010) and the implementation of smart http protocol:

Which is the smarter git protocol, ssh or git(over ssh) or https protocol?

You now can use ssh or https for read/write access to your repos.
You can also detect if your remote server supports smart http.
Add the right environment variable if you have to use a proxy.

Q3 2015, as Yousha Aleayoub mentions in the comments:

GitHub "Which remote URL should I use?"

The https:// clone URLs are available on all repositories, public and private.
They are smart, so they will provide you with either read-only or read/write access, depending on your permissions to the repository.

The git-http-backend is the:

simple CGI program to serve the contents of a Git repository to Git clients accessing the repository over http:// and https:// protocols.
The program supports clients fetching using both the smart HTTP protocol and the backwards-compatible dumb HTTP protocol, as well as clients pushing using the smart HTTP protocol.


Original answer (July 2010):

From the Pro Git Book:

Probably the most common transport protocol for Git is SSH.
This is because SSH access to servers is already set up in most places — and if it isn’t, it’s easy to do.

SSH is also the only network-based protocol that you can easily read from and write to. The other two network protocols (HTTP and Git) are generally read-only, so even if you have them available for the unwashed masses, you still need SSH for your own write commands.

SSH is also an authenticated network protocol; and because it’s ubiquitous, it’s generally easy to set up and use.

So it is not "smarter" than Git protocol, just a complementary protocol for certain features not addressed by the Git protocol.

The downside of the Git protocol is the lack of authentication. It’s generally undesirable for the Git protocol to be the only access to your project.
Generally, you’ll pair it with SSH access for the few developers who have push (write) access and have everyone else use git:// for read-only access

It also requires firewall access to port 9418, which isn’t a standard port that corporate firewalls always allow. Behind big corporate firewalls, this obscure port is commonly blocked.

(that is why in my shop, I need to use ssh+git and not just git, even for read access: 9418 is blocked...)


Take a look at the second part of this page

The only "dumb" protocol is straight HTTP, which requires no special effort on the server. In both the git:// and ssh:// protocols, a git upload-pack process (which is not a daemon) is forked on the server that communicates with the client who's running git fetch-pack. In both ssh:// and git://, you get "smart" communication.


(I wanted to add a comment to @erjiang answer, but I'm not allowed because I don't have enough StackOverflow reputation.)

It seems that since Git 1.6.6, HTTP is not "dumb" anymore. From Git website's blog:

As of the release of version 1.6.6 at the end of last year (2009), however, Git can now use the HTTP protocol just about as efficiently as the git or ssh versions


When you access git over ssh it just tunnels the git protocol over ssh, way easier to set up and way more secure, this the preferred way to access remote repositories.

This is actually "smarter" than the bare git protocol, because it can enforce user authentication via ssh mechanisms. git does all the compressing and what not on the client regardless of the transport layer, and it decompresses it on the server.

The "git" server doesn't do this, all this happens when using ssh as well. the git server should be avoided if you want to be able to write to the remote repository. if you want read only access git or HTTP transports are "OK", but if you have developers that need to write to the respository you should just use ssh. Setting up tunnels for the git server is just adding to complexity and configuration that will be brittle and gain you nothing.


From Wikipedia:

To set up an SSH tunnel, one configures an SSH client to forward a specified local port to a port on the remote machine. Once the SSH tunnel has been established, the user can connect to the specified local port to access the network service. The local port need not have the same port number as the remote port.

If you need some kind of ASCII art representation:

Git Data ---> [SSH encrypts data] ----- Internet -----> [SSH decrypts data] ----> Git Data


The various protocols are at different levels (e.g. the ISO 7 layer model), so you can have both, just as you could be connected by Wires or Wirelessly, or fibre.


A quick search of the pack files during a git clone will list a single pack file that is sent from the server to the client. The pack file is listed under .git/objects/pack/pack-XXXX.pack. The files to be sent from the server to the client are first packed, compressed. Then there is a single copy of the packed contents. This can be seen when comparing the packed files using lsof -p on the server side and lsof -p on the client side. In the sample case a 200MB files is uploaded from the server to the client....

1) Server side 
   lsof -p 8079 | grep pack
   git-uploa 8079  REG  253,2 277896169 5140075 /home/server/work/work0617/.git/objects/pack/pack-492945ae602a975d46df133f6ded9642146fb6a7.pack
   git-uploa 8079  REG  253,2   1703472 5140076 /home/server/work/work0617/.git/objects/pack/pack-492945ae602a975d46df133f6ded9642146fb6a7.idx
   git-uploa 8079  REG  253,2 277896169 5140075 /home/server/work/work0617/.git/objects/pack/pack-492945ae602a975d46df133f6ded9642146fb6a7.pack

2) Client side
   lsof -p 3140 | grep pack
   git     3140  3u   REG    8,1 101031935 3681610 /home/client/work/work0617/work0617/.git/objects/pack/tmp_pack_pRfYPa

 3) The server side pack file 277MB. The file on the client side is 101MB and growing. So a single compressed file is copied over.
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜