Converting a Subversion repository to Git

(7 steps to migrate a complete mirror of svn in git)

When I first realized that I needed a version control system, the best system at the time was CVS. (No, really.) Subversion was nearing 1.0, so I waited for its release and then used it everywhere. Well, that was 2003. Time for a change.

This past year, it became obvious that there were many Git users within the Drupal community, so Drupal has decided to move to Git. Since then I've started learning and researching the best ways to convert all my development to a Git-based workflow. So far… it rocks.

svn boxes go into the factory; git ponies come out.

When getting my toes wet in Git, I started using an extremely useful Git command called git-svn, which is primarily used to check out a Subversion repository to a local Git repo and then push your changes back to the original Subversion repository. That worked great as a stop-gap measure, but now I’m ready to chuck all my svn repos and convert them to Git.

Supposedly, git-svn can also be used to convert a Subversion repo to Git. Unfortunately, after reading the git-svn docs carefully and several useful resources (like the slightly-obscure Git FAQ, the Git Community Book, Paul Dowman’s blog and Alexis Midon’s blog), it became apparent that all the resources are piecemeal and nothing gives you the BIG HONKIN’ PICTURE. So here it is, ponies and all…

A complete guide to git-svn conversions

Our goal is to do a complete conversion of our Subversion repository and end up with a bare Git repository acceptable for sharing with others (privately or publicly). Bare repositories are ones without a local working checkout of the files available for modifications. They are the recommended format for shared repositories.

1. Retrieve a list of all Subversion committers

Subversion simply lists the username for each commit. Git’s commits have much richer data, but at its simplest, the commit author needs to have a name and email listed. By default the git-svn tool will just list the SVN username in both the author and email fields. But with a little bit of work, you can create a list of all SVN users and what their corresponding Git name and emails are. This list can be used by git-svn to transform plain svn usernames into proper Git committers.

From the root of your local Subversion checkout, run this command:

svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt

That will grab all the log messages, pluck out the usernames, sort them, eliminate any duplicates and place the result into an “authors-transform.txt” file. Now edit each line in the file. For example, convert:

jwilkins = jwilkins <jwilkins>

into this:

jwilkins = John Albin Wilkins <johnalbin@example.com>
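Before moving on, it can help to check that no line was left unedited. The following is a minimal sketch, not part of the original procedure: the function name `unedited_authors` and the sample entries are made up for illustration, and it assumes each line still uses the `username = Name <email>` layout produced by the command above.

```shell
# Print the SVN usernames whose entry still looks like "user = user <user>",
# i.e. lines that have not been edited yet. (Hypothetical helper.)
unedited_authors() {
  awk -F ' = ' '$2 == ($1 " <" $1 ">") { print $1 }' "$1"
}

# Demo on a throwaway file with one edited and one unedited entry.
demo=$(mktemp)
printf '%s\n' \
  'jwilkins = John Albin Wilkins <johnalbin@example.com>' \
  'jdoe = jdoe <jdoe>' > "$demo"
unedited_authors "$demo"   # prints: jdoe
rm -f "$demo"
```

Running `unedited_authors authors-transform.txt` against the real file should print nothing once every entry has a proper name and email.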

2. Clone the Subversion repository using git-svn

git svn clone [SVN repo URL] --no-metadata -A authors-transform.txt --stdlayout ~/temp

This will do the standard git-svn transformation (using the authors-transform.txt file you created in step 1) and place the git repository in the “~/temp” folder inside your home directory.

3. Convert svn:ignore properties to .gitignore

If your svn repo was using svn:ignore properties, you can easily convert this to a .gitignore file using:

cd ~/temp
git svn show-ignore > .gitignore
git add .gitignore
git commit -m 'Convert svn:ignore properties to .gitignore.'

4. Push repository to a bare git repository

First, create a bare repository and make its default branch match svn’s “trunk” branch name.

git init --bare ~/new-bare.git
cd ~/new-bare.git
git symbolic-ref HEAD refs/heads/trunk

Then push the temp repository to the new bare repository.

cd ~/temp
git remote add bare ~/new-bare.git
git config remote.bare.push 'refs/remotes/*:refs/heads/*'
git push bare

You can now safely delete the ~/temp repository.
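If the push refspec in this step seems opaque, the sketch below replays it in a throwaway sandbox so you can see what ends up in the bare repository. Everything in it is an assumption made for illustration: the `mktemp` directory, the `Demo` identity, and the hand-made refs `trunk` and `tags/1.0` that stand in for what git-svn would create. It also assumes a Git new enough to support `-C` (1.8.5 or later).

```shell
# Throwaway sandbox; nothing here touches ~/temp or ~/new-bare.git.
work=$(mktemp -d)

# A stand-in for the git-svn clone: one empty commit, with refs placed under
# refs/remotes/ the way git-svn leaves them (trunk plus a tag "branch").
git init -q "$work/temp"
git -C "$work/temp" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'Initial commit'
git -C "$work/temp" update-ref refs/remotes/trunk HEAD
git -C "$work/temp" update-ref refs/remotes/tags/1.0 HEAD

# The same commands as step 4, pointed at the sandbox.
git init -q --bare "$work/new-bare.git"
git -C "$work/temp" remote add bare "$work/new-bare.git"
git -C "$work/temp" config remote.bare.push 'refs/remotes/*:refs/heads/*'
git -C "$work/temp" push -q bare

# List the branches that now exist in the bare repository.
git --git-dir="$work/new-bare.git" for-each-ref --format='%(refname)' refs/heads
```

The last command should list `refs/heads/tags/1.0` and `refs/heads/trunk`: every ref under `refs/remotes/` has become an ordinary branch, which is why steps 5 and 6 then rename trunk and turn the `tags/...` branches into real tags. Remove the sandbox afterwards with `rm -rf "$work"`.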

5. Rename “trunk” branch to “master”

Your main development branch will be named “trunk”, which matches its name in Subversion. You’ll want to rename it to Git’s standard “master” branch using:

cd ~/new-bare.git
git branch -m trunk master

6. Clean up branches and tags

git-svn makes all of Subversion’s tags into very short branches in Git of the form “tags/name”. You’ll want to convert all those branches into actual Git tags using:

cd ~/new-bare.git
git for-each-ref --format='%(refname)' refs/heads/tags |
cut -d / -f 4 |
while read ref
do
  git tag "$ref" "refs/heads/tags/$ref";
  git branch -D "tags/$ref";
done

This step will take a bit of typing. :-) But, don’t worry; your unix shell will provide a > secondary prompt for the extra-long command that starts with git for-each-ref.
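One caveat with the loop above: `cut -d / -f 4` keeps only the fourth path component, so a tag branch whose name itself contains a slash would be truncated. With a standard layout that may never happen, but if it does, stripping the prefix instead is a possible alternative. This is a sketch only; the function name `tag_names` and the sample ref names are made up for illustration.

```shell
# Strip the "refs/heads/tags/" prefix from each ref name read on stdin.
# Unlike `cut -d / -f 4`, this keeps tag names that contain slashes intact.
# (Hypothetical helper.)
tag_names() {
  sed 's|^refs/heads/tags/||'
}

# Demo with two made-up ref names.
printf '%s\n' refs/heads/tags/1.0 refs/heads/tags/release/2.0 | tag_names
# prints:
#   1.0
#   release/2.0
```

To use it, replace the `cut -d / -f 4 |` line of the loop with `tag_names |`; the rest of the loop stays the same.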

7. Drink

If you’ve got just the one Subversion repo to convert…Congratulations! You’re done. Go party. Just take your “new-bare.git” folder and share it.

If, on the other hand, you’ve got a bunch of Subversion repositories to convert, you’ve got a long, long night in front of you if you want to convert them all by hand. You’re going to need a drink (or several).

Since I had 141 svn repositories that needed to be converted, I wrote a set of wrapper scripts to ease the work… which I’ll discuss in my next blog post.

Comments

Pro Tip for Windows users: Having been through this recently myself, don't bother with git-svn on Windows, instead get yourself a Linux VM and VMware Player and do your conversion on that. The scraping from Subversion ran about 10 times faster for me than running it "natively" on Windows and I had none of the quirks that I was finding with git-svn on Windows.

Hi John, I also didn't find the existing guides totally satisfactory, so I wrote my own, here: http://ao2.it/wiki/How_to_migrate_an_SVN_repository_to_Git

As you can see my use case involved gitosis for the repositories administration, and I had an unusual layout to convert too, but the base is the same after all.

The svn:ignore bits are interesting, I think I'll add that to my wiki, if you don't anticipate me :)

Regards,
Antonio

Switch and Drop Legacy? Or, you could do as I do and drop your past SVN history.

Not the best solution, of course, but you can keep SVN running somewhere if you need to go back in time. However, I picked a good point where development was at a slowdown, scrapped SVN, and set everything up in a fresh Git repository :)

Oops. I just noticed the awk command in step 1 was slightly off. If you have a space character in your SVN username (for example "(no author)"), it would only include the part of the username before the space. This is the proper awk command:


svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt

I’ve corrected the article above.

Thanks for your post, it helped a lot in understanding what is going on and what to do.

Can anyone explain what --no-metadata does exactly? I've read that it excludes some info during transfers, but did not find what exactly.

Sometimes it helps to read the documentation. It is not recommended to use --no-metadata, even for one-way imports:

This gets rid of the git-svn-id: lines at the end of every commit.

This option can only be used for one-shot imports as git svn will not be able to fetch again without metadata. Additionally, if you lose your .git/svn/*/.rev_map. files, git svn will not be able to rebuild them.

The git svn log command will not work on repositories using this, either. Using this conflicts with the useSvmProps option for (hopefully) obvious reasons.

This option is NOT recommended as it makes it difficult to track down old references to SVN revision numbers in existing documentation, bug reports and archives. If you plan to eventually migrate from SVN to git and are certain about dropping SVN history, consider git-filter-branch(1) instead. filter-branch also allows reformatting of metadata for ease-of-reading and rewriting authorship info for non-"svn.authorsFile" users.

@Balu

You need to re-read my blog post and the git-svn docs carefully, because you've somehow misread them.

The docs about --no-metadata you quoted directly say “This option can only be used for one-shot imports”. [emphasis mine] One shot imports are precisely the point of this blog post. I fully expect you to toss the svn repo in the bin after doing this conversion to git.

This option is NOT recommended as it makes it difficult to track down old references to SVN revision numbers in existing documentation, bug reports and archives.

That is actually a good point. But svn commit numbers are not something that I personally needed to preserve. For those of you who do need to retain svn commit numbers, I recommend following Balu’s advice.

Hi, thanks for your guide. You may want to add that it works from Git 1.7 onward, not with Git 1.6, where, e.g., "git init" does not yet accept a directory argument.

@Valery: Good point about Git 1.6. But I consider any version of Git below 1.7 barbaric. ;-)

Wouldn't it be better to actually tag the second to last commit in the tag: "refs/heads/tags/$ref"^ ? Otherwise, we're tagging the commit that says "Tagging for X version". The difference being that in Git the tags themselves are not commits and we'd then be able to see the tags in tools like GitX when looking at the branch history from where the tag was made.

Otherwise, we're tagging the commit that says "Tagging for X version".

Unfortunately, because of the way Subversion works, you can make changes while making a tag. So an svn tag may contain a changeset in addition to the tag name. :-p

That's why the underlying git-svn command makes a git tag where it does.

Hey John.

I converted 3 svn repos to git last night, to join the rest of the repos. I noticed that in step 3, I created the .gitignore file, however when I push to the -bare.git in step 4, this commit isn't pushed as well. You have any ideas what might be up?

(What I ended up doing, in lieu of drinking, was pulling from the bare repo, committing the .gitignore, pushing back to -bare.git, then doing a push --mirror to the central server.)

@Terin

Step 3 works fine. Did you miss the command to add and commit the .gitignore file to the ~/temp repository?

I got this error when trying to create the .gitignore file.

$ git svn show-ignore > .gitignore
config --get svn-remote.svn.fetch :refs/remotes/git-svn$: command returned error: 1

When I added -i trunk, it worked fine. Maybe this can help others if they have the same problem.

Lars

Thanks Lars D! I had the same error and your addition fixed it. The command that worked for me was:
git svn show-ignore -i trunk > .gitignore

Awesome, thanks John!

Thanks so much for this very useful article. I used it to successfully convert two large repos from Subversion to Git.

I ended up leaving off the --no-metadata option, as we have references in our BTS and other systems to the Subversion revision numbers.

Also, I added the --shared option to init the bare repo, as this sets up the correct permissions for a repo shared within a group.

Again, thanks!

Please note that the "while-do" loop in step 6, as it is currently written, will only work in bash or another Bourne-compatible shell; you'll get a syntax error otherwise.

svn2git probably works great for most repos, but it didn't for me. When it finished I noticed I was missing several recent commits (and the changes from them!). I didn't investigate to see how deeply the problem went, I just used John's git-svn-migrate, which worked like a charm.

I am trying to do the same thing regarding Git. Better late than never ;)

Very useful post, thanks for sharing it.
