Is there a way to easily convert a series of tarballs of a source tree into a git repository?

I'm new to git and I have a moderately large number of weekly tarballs from a long running project. Each tarball has on average a few hundred files in it. I'm looking for a git strategy that will allow me to add the expanded contents of each tarball to a new git repository, starting from version 1.001 and going through version 1.650. At this stage of the project 99.5% of tarball(n) is just a copy of version(n-1) - in other words, a perfect candidate for git. The desired end result is to have only the master branch remaining at the end of the process.

I think I know git well enough to do this "by hand". As I understand it there is no possibility of a merge conflict, since there will be no opportunity to change the master before the next version is added and committed. A shell script is my first guess, but I'm not sure how well bash will like it when git checkout branch_n gets processed while bash is executing in branch_n-1. For the purposes of this project the host environment is Ubuntu 10.04; available resources are 8 GB RAM, 500 GB of free disk space, and four CPU cores at 3 GHz.

I don't need someone else to solve the problem but I could use a nudge in the right direction as to how a git expert would approach it. Any advice from someone who's "been there done that" would be appreciated.

Hotei

PS: I have looked at the site's suggested "related questions" and found nothing relevant.


Take a look at $GIT_SRC_DIR/contrib/fast-import/import-tars.perl
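That script ships with git's source and drives git fast-import, creating one commit per tarball. Per the usage notes in its header (the paths and the *.tar.gz glob here are my assumptions; check the script's own comments for your git version), an invocation looks roughly like:

    mkdir project && cd project
    git init
    perl /path/to/git-src/contrib/fast-import/import-tars.perl ../tarballs/*.tar.gz
    git log import-tars        # the import lands on a branch named import-tars
    git checkout import-tars   # populate the working tree from it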


Regarding this comment:

I'm not sure how well bash will like it when git checkout branch_n gets processed while bash is executing in branch_n-1

Are you concerned about two operations running concurrently and getting in each other's way? This shouldn't be a problem unless you intentionally run operations in parallel.

Assuming the tarballs follow a linear evolution, branching shouldn't come into this at all.

The process should be fairly straightforward (a bash sketch of the full loop follows the list):

  1. git init
  2. untar ball _n_
  3. git add --all .; git commit (with appropriate flags)
  4. git tag -a v1.001 -m "Version 1.001."
  5. rm -rf * (to record deletions in the history; this leaves .git intact, but note that * skips other top-level dotfiles too; the sketch below uses find instead)
  6. goto 2
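A minimal bash sketch of that loop, assuming the tarballs sit in ../tarballs and follow a hypothetical project-1.001.tar.gz ... project-1.650.tar.gz naming scheme (adjust the glob and the version extraction to your actual names):

    #!/bin/bash
    set -e
    git init repo
    cd repo
    for ball in ../tarballs/project-*.tar.gz; do
        ver=$(basename "$ball" .tar.gz)   # project-1.001
        ver=${ver#project-}               # 1.001
        # Clear the work tree but keep .git, so deletions between
        # versions get recorded; unlike 'rm -rf *' this also catches
        # top-level dotfiles.
        find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
        # Add --strip-components=1 if each tarball wraps its files
        # in a top-level directory.
        tar -xzf "$ball"
        git add --all .
        git commit -m "Import version $ver"   # add --allow-empty if some weeks had no changes
        git tag -a "v$ver" -m "Version $ver"
    done

Glob expansion sorts lexicographically, which is the right order here because the version numbers are zero-padded.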


What I would do in this situation, since your tarballs are in effect 'tagged versions':

  1. create an empty git repository
  2. extract a tarball into that directory, overwriting any existing files
  3. add all files: git add .
  4. git commit -a -m 'version foo'
  5. tag the current version: git tag -a v<version>
  6. remove all files (keeping .git intact)
  7. repeat from step 2 for each tarball

In your case it's not necessary to create branches, as all your tarballs are distinct, successive versions; each iteration overwrites the previous one.
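Step 5 glosses over where the tag name comes from; if it is encoded in the tarball filename, a fragment like this derives it (the project- prefix is a hypothetical naming scheme):

    ball=project-1.001.tar.gz        # hypothetical filename
    ver=$(basename "$ball" .tar.gz)  # project-1.001
    ver=${ver#project-}              # 1.001
    git tag -a "v$ver" -m "Version $ver"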


Without having been exactly there, you should simply:

  • untar an archive anywhere you want
  • rsync it with the git working directory in order to:
    • update the files that changed
    • add the new files from that archive to the working directory
    • remove the files from the working directory that are no longer part of the current archive
  • git add -A
  • git commit -m "archive n"
  • repeat

The idea is not to check out branch_n+1, but to stay on the same branch, committing each tarball's contents one after the other in the same git repo.
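A bash sketch of that loop, assuming each tarball has already been unpacked into its own directory under ../extracted (e.g. ../extracted/1.001/) and that the script runs from inside the git working directory:

    #!/bin/bash
    set -e
    for dir in ../extracted/*/; do
        n=$(basename "$dir")
        # --delete removes files that vanished between versions;
        # rsync leaves excluded paths alone, so .git stays intact.
        rsync -a --delete --exclude=.git "$dir" ./
        git add -A
        git commit -m "archive $n"
    done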
Should you really have two concurrent processes somehow, you could then:

  • git clone the first git repo
  • git checkout -b a_new_branch to make sure you isolate that parallel process on its own branch, which you will be able to push back to the first repo when done.


Take a look at git-weave. You feed it a directory containing all the expanded tarballs, together with a log file that gives the sequence and connections between them (it handles branches) and the commit messages, and it creates a git repository from this.

In your case of some 600 tarballs this looks like a daunting task; you'll probably need to write a script to cobble together the log file.
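A hedged sketch of the unpacking half only; the log file format is git-weave's own, so consult its documentation for that part:

    #!/bin/bash
    # Unpack each tarball into a numbered directory for git-weave
    # to consume. The naming scheme here is hypothetical.
    set -e
    mkdir -p weave
    i=0
    for ball in tarballs/project-*.tar.gz; do
        i=$((i + 1))
        d=$(printf 'weave/%04d' "$i")
        mkdir -p "$d"
        tar -xzf "$ball" -C "$d"
    done
    # Generating the matching log file (messages, parent links) is
    # left to a small script keyed to your tarball names.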
