When I first realized that I needed a version control system, the best system at the time was CVS. (No, really.) Subversion was nearing 1.0, so I waited for its release and then used it everywhere. Well, that was 2003. Time for a change.
This past year, it became obvious that there were many Git users within the Drupal community, so Drupal has decided to move to Git. Since then I've started learning and researching the best ways to convert all my development to a Git-based workflow. So far… it rocks.
When getting my toes wet in Git, I started using an extremely useful git command called git-svn, which is primarily used to check out a Subversion repository into a local Git repo and then push your changes back to the original Subversion repository. That worked great as a stop-gap measure, but now I’m ready to chuck all my svn repos and convert them to Git.
Supposedly, git-svn can also be used to convert a Subversion repo to Git. Unfortunately, after carefully reading the git-svn docs and several useful resources (like the slightly-obscure Git FAQ, the Git Community Book, Paul Dowman’s blog and Alexis Midon’s blog), it became apparent that all the resources are piecemeal and nothing gives you the BIG HONKIN’ PICTURE. So here it is, ponies and all…
A complete guide to git-svn conversions
Our goal is to do a complete conversion of our Subversion repository and end up with a bare Git repository acceptable for sharing with others (privately or publicly). Bare repositories are ones without a local working checkout of the files available for modifications. They are the recommended format for shared repositories.
1. Retrieve a list of all Subversion committers
Subversion simply lists the username for each commit. Git’s commits have much richer data, but at its simplest, the commit author needs to have a name and email listed. By default, the git-svn tool will just list the SVN username in both the author and email fields. But with a little bit of work, you can create a list of all SVN users and their corresponding Git names and emails. This list can be used by git-svn to transform plain svn usernames into proper Git committers.
From the root of your local Subversion checkout, run this command:
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt
That will grab all the log messages, pluck out the usernames, eliminate any duplicate usernames, sort the usernames and place them into an “authors-transform.txt” file. Now edit each line in the file. For example, convert:
jwilkins = jwilkins <jwilkins>
into this:
jwilkins = John Albin Wilkins <johnalbin@example.com>
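If you want to see what the awk program from step 1 produces before pointing it at a real repository, here is a minimal sketch that feeds it a hand-written sample of `svn log -q` output. It uses the form of the command with the `<…>` email placeholder. The usernames, dates, and the `sample_log` and `authors` variable names are invented for illustration.

```shell
# Hypothetical sample of `svn log -q` output; usernames and dates are made up.
sample_log='------------------------------------------------------------------------
r3 | jwilkins | 2010-06-01 10:00:00 +0000 (Tue, 01 Jun 2010)
------------------------------------------------------------------------
r2 | (no author) | 2010-05-30 09:00:00 +0000 (Sun, 30 May 2010)
------------------------------------------------------------------------
r1 | jwilkins | 2010-05-29 08:00:00 +0000 (Sat, 29 May 2010)
------------------------------------------------------------------------'

# Split each revision line on "|", trim the username in field 2,
# and print one "username = username <username>" mapping per unique user.
authors=$(printf '%s\n' "$sample_log" |
  awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' |
  sort -u)

printf '%s\n' "$authors"
```

The result has one line per distinct username, including usernames that contain spaces, and each line is ready to be edited into the "Name <email>" form.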
2. Clone the Subversion repository using git-svn
git svn clone [SVN repo URL] --no-metadata -A authors-transform.txt --stdlayout ~/temp
This will do the standard git-svn transformation (using the authors-transform.txt file you created in step 1) and place the git repository in the “~/temp” folder inside your home directory.
3. Convert svn:ignore properties to .gitignore
If your svn repo was using svn:ignore properties, you can easily convert them to a .gitignore file using:
cd ~/temp
git svn show-ignore > .gitignore
git add .gitignore
git commit -m 'Convert svn:ignore properties to .gitignore.'
4. Push repository to a bare git repository
First, create a bare repository and make its default branch match svn’s “trunk” branch name.
git init --bare ~/new-bare.git
cd ~/new-bare.git
git symbolic-ref HEAD refs/heads/trunk
Then push the temp repository to the new bare repository.
cd ~/temp
git remote add bare ~/new-bare.git
git config remote.bare.push 'refs/remotes/*:refs/heads/*'
git push bare
You can now safely delete the ~/temp repository.
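To see what the 'refs/remotes/*:refs/heads/*' push setting does, here is a minimal sketch that fakes a git-svn style layout with plain Git, so no Subversion server is needed. The scratch directory, the "Demo User" identity, and the "1.0" tag name are placeholders invented for this illustration; a real conversion would use the ~/temp and ~/new-bare.git paths from this step.

```shell
# Placeholder identity so the demo commit works without any git config.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)                       # scratch area, not your real ~/temp
git init -q "$demo/temp"
git init -q --bare "$demo/new-bare.git"

# Stand-in for what git-svn leaves behind: a commit referenced from refs/remotes/.
git -C "$demo/temp" commit -q --allow-empty -m 'Imported r1'
git -C "$demo/temp" update-ref refs/remotes/trunk HEAD
git -C "$demo/temp" update-ref refs/remotes/tags/1.0 HEAD

# Same remote setup as in step 4.
git -C "$demo/temp" remote add bare "$demo/new-bare.git"
git -C "$demo/temp" config remote.bare.push 'refs/remotes/*:refs/heads/*'
git -C "$demo/temp" push -q bare

# List the branches that arrived in the bare repository.
bare_heads=$(git -C "$demo/new-bare.git" for-each-ref --format='%(refname)' refs/heads)
printf '%s\n' "$bare_heads"

rm -rf "$demo"
```

Every ref under refs/remotes/ becomes a branch of the same name in the bare repository. Running the same for-each-ref listing on your real bare repository is a quick way to check which branch names the push actually created.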
5. Rename “trunk” branch to “master”
Your main development branch will be named “trunk” which matches the name it was in Subversion. You’ll want to rename it to Git’s standard “master” branch using:
cd ~/new-bare.git
git branch -m trunk master
6. Clean up branches and tags
git-svn makes all of Subversion’s tags into very short branches in Git of the form “tags/name”. You’ll want to convert all those branches into actual Git tags using:
cd ~/new-bare.git
git for-each-ref --format='%(refname)' refs/heads/tags |
cut -d / -f 4 |
while read ref
do
git tag "$ref" "refs/heads/tags/$ref";
git branch -D "tags/$ref";
done
This step will take a bit of typing. :-) But don’t worry; your unix shell will provide a > secondary prompt for the extra-long command that starts with git for-each-ref.
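Before running this loop on a real conversion, it can be reassuring to try it on a throwaway repository. The sketch below builds one with two fake "tags/…" branches and then runs the same pipeline from this step (with `read -r` instead of `read`, a small defensive tweak of mine). The scratch directory, the "Demo User" identity, and the version numbers are placeholders.

```shell
# Placeholder identity so the demo commit works without any git config.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)
git init -q "$demo/repo"
git -C "$demo/repo" commit -q --allow-empty -m 'Imported r1'
git -C "$demo/repo" branch tags/1.0     # stand-ins for git-svn's tag branches
git -C "$demo/repo" branch tags/1.1

(
  cd "$demo/repo" || exit 1
  # Turn every refs/heads/tags/NAME branch into a real tag called NAME.
  git for-each-ref --format='%(refname)' refs/heads/tags |
  cut -d / -f 4 |
  while read -r ref
  do
    git tag "$ref" "refs/heads/tags/$ref"
    git branch -q -D "tags/$ref"
  done
)

tags=$(git -C "$demo/repo" tag -l)
left=$(git -C "$demo/repo" for-each-ref refs/heads/tags)
printf 'tags:\n%s\nleftover tag branches: [%s]\n' "$tags" "$left"

rm -rf "$demo"
```

Note that `cut -d / -f 4` keeps only the fourth path component, so this sketch (like the loop it copies) assumes tag names that contain no slash.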
7. Drink
If you’ve got just the one Subversion repo to convert…Congratulations! You’re done. Go party. Just take your “new-bare.git” folder and share it.
If, on the other hand, you’ve got a bunch of Subversion repositories to convert, you’ve got a long, long night in front of you if you want to convert them all by hand. You’re going to need a drink (or several).
Since I had 141 svn repositories that needed to be converted, I wrote a set of wrapper scripts to ease the work… which I’ll discuss in my next blog post.
Comments (63)
Pro Tip for Windows users: Having been through this recently myself, don't bother with git-svn on Windows, instead get yourself a Linux VM and VMware Player and do your conversion on that. The scraping from Subversion ran about 10 times faster for me than running it "natively" on Windows and I had none of the quirks that I was finding with git-svn on Windows.
Hi John, I also didn't find the existing guides totally satisfactory, so I wrote my own, here: http://ao2.it/wiki/How_to_migrate_an_SVN_repository_to_Git
As you can see my use case involved gitosis for the repositories administration, and I had an unusual layout to convert too, but the base is the same after all.
The svn:ignore bits are interesting, I think I'll add that to my wiki, if you don't anticipate me :)
Regards,
Antonio
Switch and Drop Legacy? Or, you could do as I do and drop your past SVN history.
Not the best solution, of course, but you can keep SVN running somewhere if you need to go back in time. However, I picked a good point where development was at a slowdown, scrapped SVN, and set everything up in a fresh Git repository :)
awk command is slightly wrong
Oops. I just noticed the awk command in step 1 is slightly off. If you have a space character in your SVN username (for example, "(no author)"), it will only include the part of the username before the space. This is the proper awk command:
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt
I’ve corrected the article above.
Did you look at this existing tool? http://github.com/nirvdrum/svn2git
Thanks
Thanks for your post, it helped a lot in understanding what is going on and what to do.
Can anyone explain what --no-metadata does exactly? I've read that it excludes some info during transfers, but did not find what exactly.
No-Metadata
Sometimes it helps to read the documentation. It is not recommended to use --no-metadata, even for one-way imports:
"This option is NOT recommended as it makes it difficult to track down old references to SVN revision numbers in existing documentation, bug reports and archives. If you plan to eventually migrate from SVN to git and are certain about dropping SVN history, consider git-filter-branch(1) instead. filter-branch also allows reformatting of metadata for ease-of-reading and rewriting authorship info for non-"svn.authorsFile" users."
@Balu
You need to re-read my blog post and the git-svn docs carefully, because you've somehow misread them.
The docs about --no-metadata you quoted directly say “This option can only be used for one-shot imports”. [emphasis mine] One-shot imports are precisely the point of this blog post. I fully expect you to toss the svn repo in the bin after doing this conversion to git.
That is actually a good point. But svn commit numbers are not something that I personally needed to preserve. For those of you who do need to retain svn commit numbers, I recommend following Balu’s advice.
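For readers weighing this trade-off: when --no-metadata is left off, git-svn appends a "git-svn-id:" line (URL, revision number, repository UUID) to each commit message, and that line is what lets you look up an old SVN revision later. The sketch below simulates two such commits with plain Git and searches for one of them. The URL, UUID, commit subjects, and "Demo User" identity are invented for illustration.

```shell
# Placeholder identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)
git init -q "$demo/repo"

# Simulated git-svn commits; the URL and repository UUID are made up.
svn_id='git-svn-id: https://svn.example.com/repo/trunk'
uuid='01234567-89ab-cdef-0123-456789abcdef'
git -C "$demo/repo" commit -q --allow-empty -m 'Fix typo'    -m "$svn_id@42 $uuid"
git -C "$demo/repo" commit -q --allow-empty -m 'Add feature' -m "$svn_id@43 $uuid"

# Find the commit that came from Subversion revision 42.
found=$(git -C "$demo/repo" log --format=%s --grep='git-svn-id: .*@42 ')
printf '%s\n' "$found"

rm -rf "$demo"
```

With --no-metadata those lines are never written, so a search like this finds nothing. That is the cost being discussed in this exchange.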
your recipe works from git 1.7
Hi, thanks for your guide. You may want to add that it works from git 1.7 onward, not with git 1.6, where, e.g., "git init" has no directory argument yet.
@Valery: Good point about Git 1.6. But I consider any version of Git below 1.7 barbaric. ;-)
Tagging the last real commit
Wouldn't it be better to actually tag the second to last commit in the tag: "refs/heads/tags/$ref"^ ? Otherwise, we're tagging the commit that says "Tagging for X version". The difference being that in Git the tags themselves are not commits and we'd then be able to see the tags in tools like GitX when looking at the branch history from where the tag was made.
Re: tagging the SVN tag
Unfortunately, because of the way Subversion works, you can make changes while making a tag. So an svn tag may contain a changeset in addition to the tag name. :-p
That's why the underlying git-svn command makes a git tag where it does.
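One possible middle ground between these two comments, offered only as a sketch and not as what git-svn or this guide does: tag the parent commit only when the tag commit changes no files, and keep tagging the tag commit itself otherwise. The helper name `tag_target`, the scratch repository, and the "Demo User" identity are all hypothetical.

```shell
# Placeholder identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# tag_target REPO REF: print "REF^" when REF's commit changes no files compared
# with its first parent (a pure tagging commit); otherwise print REF unchanged.
tag_target() {
  if git -C "$1" rev-parse -q --verify "$2^" >/dev/null &&
     git -C "$1" diff --quiet "$2^" "$2"; then
    printf '%s^\n' "$2"
  else
    printf '%s\n' "$2"
  fi
}

demo=$(mktemp -d)
git init -q "$demo/repo"
git -C "$demo/repo" symbolic-ref HEAD refs/heads/master   # force the branch name
echo one > "$demo/repo/file.txt"
git -C "$demo/repo" add file.txt
git -C "$demo/repo" commit -q -m 'Imported r1'

# tags/1.0: a commit with no file changes, like a plain tagging commit.
git -C "$demo/repo" checkout -q -b tags/1.0 master
git -C "$demo/repo" commit -q --allow-empty -m 'Tagging the 1.0 release'

# tags/1.1: a tag commit that also changed a file.
git -C "$demo/repo" checkout -q -b tags/1.1 master
echo two > "$demo/repo/file.txt"
git -C "$demo/repo" commit -q -a -m 'Tagging 1.1 with a last-minute fix'

clean=$(tag_target "$demo/repo" refs/heads/tags/1.0)
dirty=$(tag_target "$demo/repo" refs/heads/tags/1.1)
printf '%s\n%s\n' "$clean" "$dirty"

rm -rf "$demo"
```

Inside the step 6 loop, the tagging line could then become something like `git tag "$ref" "$(tag_target . "refs/heads/tags/$ref")"`, assuming the function is defined in the same shell.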
gitignore
Hey John.
I converted 3 svn repos to git last night, to join the rest of the repos. I noticed that in step 3, I created the .gitignore file, however when I push to the -bare.git in step 4, this commit isn't pushed as well. You have any ideas what might be up?
(What I ended up doing, in lieu of drinking, was pulling from the bare, committing the .gitignore, pushing back to -bare.git, then push --mirror to the central server.)
Adding .gitignore
@Terin
Step 3 works fine. Did you miss the command to add and commit the .gitignore file to the ~/temp repository?
error when getting ignore
I got this error when trying to create the .gitignore file.
$ git svn show-ignore > .gitignore
config --get svn-remote.svn.fetch :refs/remotes/git-svn$: command returned error: 1
When I added -i trunk, it worked fine. Maybe this can help others if they have the same problem.
Lars
Thanks Lars D! I had the same error and your addition fixed it. The command that worked for me was:
git svn show-ignore -i trunk > .gitignore
Awesome, thanks John!
Very useful!
Thanks so much for this very useful article. I used it to successfully convert two large repos from Subversion to Git.
I ended up leaving off the --no-metadata option, as we have references in our BTS and other systems to the Subversion revision.
Also, I added the --shared option to init the bare repo, as this sets up the correct permissions for a repo shared within a group.
Again, thanks!
use the bash!
Please, note that the "while-do" cycle in step 6, as it is currently written, will only work if you're using a bash shell; you'll get a syntax error otherwise.
svn2git probably works great for most repos, but it didn't for me. When it finished I noticed I was missing several recent commits (and the changes from them!). I didn't investigate to see how deeply the problem went, I just used John's git-svn-migrate, which worked like a charm.
I see in your git-svn-migrate.sh script you have added another line that pushes the .gitignore commit. I had the same problem as Terin until I found that I had to do this after the git push bare command
git push bare master:trunk
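The behaviour described in this comment can be reproduced with plain Git. The sketch below fakes the git-svn layout, adds a local-only .gitignore commit as in step 3, and compares the tip of "trunk" in the bare repository before and after the extra push. The scratch paths and the "Demo User" identity are placeholders, and the local branch is forced to be called "master" to match the guide.

```shell
# Placeholder identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)
git init -q "$demo/temp"
git -C "$demo/temp" symbolic-ref HEAD refs/heads/master   # force the branch name
git init -q --bare "$demo/new-bare.git"

git -C "$demo/temp" commit -q --allow-empty -m 'Imported r1'
git -C "$demo/temp" update-ref refs/remotes/trunk HEAD    # simulated git-svn ref

# Local-only commit, like the .gitignore commit from step 3.
: > "$demo/temp/.gitignore"
git -C "$demo/temp" add .gitignore
git -C "$demo/temp" commit -q -m 'Convert svn:ignore properties to .gitignore.'

git -C "$demo/temp" remote add bare "$demo/new-bare.git"

# The wildcard push only covers refs/remotes/*, so master's new commit is left out.
git -C "$demo/temp" push -q bare 'refs/remotes/*:refs/heads/*'
before=$(git -C "$demo/new-bare.git" log -1 --format=%s trunk)

# The extra push suggested in this comment fast-forwards trunk to include it.
git -C "$demo/temp" push -q bare master:trunk
after=$(git -C "$demo/new-bare.git" log -1 --format=%s trunk)
printf 'before: %s\nafter:  %s\n' "$before" "$after"

rm -rf "$demo"
```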
Hey,
Could you perhaps explain why you need the bare repo step ? I found another article which did the same, and they used yours as a reference … what difference does it make, if I just add a remote to the "temporary" repo after conversion and cleanup ? Why would I need an intermediary one ?
Thanks !
Hi Greg!
Basically, the issue is that git-svn creates a lot of overhead in order to maintain the "svn-ness" of the repository. By pushing just the “refs/remotes/*:refs/heads/*” references to a bare repository, you end up purging all of the svn remnants and having a cleaner repository.
Hey thanks John.
Right, I hadn't realized you were doing a "selective" push there. I am using git-svn-abandon for cleaning up, I assume it does something similar, if not the exact same :)
Oh and one more thing; the svn:ignore > .gitignore conversion - am I missing something (again) or shouldn't you be doing this on all (or some) of your branches instead just master ?
Thanks for these great instructions, they worked for me when migrating from a beanstalk svn repo to a bitbucket git repo.
I followed your instructions and got the following error with "git push bare":
" No refs in common and none specified; doing nothing.
Perhaps you should specify a branch such as 'master'.
fatal: The remote end hung up unexpectedly
error: failed to push some refs to '/Users/bodirsky/new-bare.git' "
Any ideas?
Thank you for this article, it saved me a couple hours of my life.
Thanks John for this great tutorial. It was a great help, and I only had to make some minor changes to this workflow, i.e. a new latest-svn tag.
But also I have problems with the .gitignore file which does not go into the repo (or into the correct branch).
Here you see what I did and that everything completed without error, but in the final ls there is just no .gitignore: http://pastebin.com/VhthD4VN
This is really AWESOME! Works like a charm!
Thanks a lot for the great tutorial.
FYI, on Windows this command does not work:
git config remote.bare.push 'refs/remotes/*:refs/heads/*'
It works if you remove the single quotes:
git config remote.bare.push refs/remotes/*:refs/heads/*
Would be fantastic if there were Windows methods of getting the username mappings and cleaning up branches and tags. Otherwise, thanks for the writeup!
Hi John,
Great article, it was a massive help when converting my old svn repos into git. It ran without any modifications on Cygwin. I just thought I'd mention that I got the following error when converting one of my svn repo to git:
'fatal: refs/remotes/trunk: not a valid SHA1'.
It happened in step 2. The problem turned out to be that my SVN repo wasn't standard: the root directory wasn't trunk but had a custom name. This confused git, and it didn't know where the master branch needed to be. I fixed this problem by passing in the parameter --trunk=<directory> in step 2, where <directory> is the directory that is your primary branch.
Hope that helps someone!
I did the tag conversion like so:
git branch -ar | grep '^ tags/' | sed -r 's|^\s*tags/(\S*)\s*$|git checkout tags/\1 \&\& git tag \1|' | sh
git-svn also found some branches, which I took care of here:
sed -r 's|^\s*tags/(\S*)\s*$|git checkout tags/\1 \&\& git tag \1|'
You could also add something like this to the end of your migrate script to push repos to github, changing REPO to the placeholder for the current name:
#curl -u 'USER:PASS' https://api.github.com/user/repos -d '{"name":"REPO"}'
#git remote add origin git@github.com:USER/REPO.git
#git push origin master
#git push --all
#git push --tags
Nice tutorial. But please note that if your SVN structure is not standard and you do not use "tags" for your tags but, for example, "tag", then step #2 should look like this (in my case NAME was "tag", without quotes):
git svn clone [SVN repo URL] --tags=NAME --no-metadata -A authors-transform.txt --stdlayout ~/temp
I removed --stdlayout and it was able to import properly; the rest was OK. Hope that helps...
Great tool! I am using the bash scripts from John's git-svn-migrate. A lot easier than typing...and allows me to set up a few test runs and repeat them...
My problem now is that one of my repositories is being difficult. I am getting the following:
r3678 = e0fecdab6326c07f41c85cd9199645b52d3ea83c (refs/remotes/branches)
error: there are still refs under 'refs/remotes/tags'
fatal: Cannot lock the ref 'refs/remotes/tags'.
update-ref -m r3679 refs/remotes/tags 77d2e062df2ada907635f086f4942c3a5d01f48f: command returned error: 128
- Converting svn:ignore properties into a .gitignore file...
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
rev-list --first-parent --pretty=medium HEAD: command returned error: 128
Since I am an extremely new GIT user, any advice on how I should tackle this repository?
Thanks!
Excellent guide - had to convert a repo preserving authors, and svn was archived alongside rotary dial in my brain. Thanks!
It's likely been a while since you've updated this, but I'd like to let you know that this guide is still relevant and still works. I adapted it to take an existing svn repo on the web, check it out, use your guide to convert it to a local git repo, and then push it to github. The only thing that I'd like to add is that after you convert the tag-branches to real tags, you need to run
git push --tags
for the tags to appear on github. I'm not sure how this affects a local repo. I guess after the tags are created, they need to be committed?
Thanks, this guide is excellent. I ran it on Ubuntu 12.04, with git 1.7.9.5. Here are the things I had to do differently, mostly based on the discussion here:
1. The standard git installation does not include git-svn, so:
apt-get install git-svn
2. In Step 1, reading the log on my local machine failed to retrieve all author names. So instead, I ran:
svn log -q REPO-URL
3. I saw the same error as others on this thread, and in Step 3 had to use:
git svn show-ignore -i trunk > .gitignore
4. The .gitignore commit did not make its way into the bare repository. I had to add at the end of Step 4:
git push bare master:trunk
Hey John,
very nice tutorial. I just ended up with a git repo that has a complete history for each branch that was in svn. So I have, let's say, a branch "master" and a branch "v1.1". Both have the same first 1000 commits, and at a certain point they differ (surely the point where the v1.1 branch was created from trunk). Then the v1.1 branch has a few more commits and ends. The master branch has many more commits and ends at the last commit svn had. So basically it seems that git did not extract where the svn branches came from and where they went. Is this behavior "correct" because of the nature of svn's repositories? Is there an (easy) way to fix this?
Thanks a lot
@John: Thx a lot for these instructions! Could you please correct the error in "chapter" 4 and add "git push bare master:trunk" after "git push bare"?
For all who are using a single svn repository containing multiple projects, you have to specify the layout like this:
git svn clone file://${SVNROOT} --no-metadata -A ${AUTHORS} --trunk="${SVN_PROJECT}/trunk" --branches="${SVN_PROJECT}/branches" --tags="${SVN_PROJECT}/tags" ${OUTPUTDIR}
Does not seem to capture merges.
Thanks, this guide helped me successfully convert a svn with 20K revisions from SVN to GIT, using windows and powershell.
For cleaning up branches to tags (step 6), the following PowerShell script does the job:
& git for-each-ref --format='%(refname)' refs/heads/tags | % {
#Extract the 4th field from every line
$_.Split("/")[3]
} | % {
#Foreach value extracted in the previous loop
& git tag $_ "refs/heads/tags/$_"
& git branch -D "tags/$_"
}
Powershell command from Frode. F. - http://stackoverflow.com/questions/14778617/convert-all-subversion-bran…
This was really helpful, however one small thing: In step 4 I found I had to do
git config remote.bare.push 'refs/remotes/origin/*:refs/heads/*'
To get everything to work.
I followed the steps and in the end I do not see my Branch and Tags imported.
git branch -a
* master
remotes/svn/trunk
I just see trunk/ being imported.
SVN structure:
trunk/
branches/
tags/
Any ideas?
Thank you for writing this article, it helped me convert a Google Code project to GitHub. I had to do it manually and remove a file larger than 100 MB, because GitHub's limits meant it wouldn't auto-import. Very detailed, thank you.
git branch -m trunk master
replies with error "error: refname refs/heads/trunk not found". I fixed this by doing
cp -Rf refs/heads/origin/* refs/heads
before
git branch -m trunk master
git 2.3.2, svn 1.8.13 on OSX
I was getting the following error after the branch rename command
error: refname refs/heads/trunk not found
fatal: Branch rename failed
so I changed the setting for the push command to
git config remote.bare.push 'refs/remotes/origin/*:refs/heads/*'
Not sure if that is due to my particular setup or a change in git svn.
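Several comments here hit the same "refs/heads/trunk not found" error. As far as I can tell from the git-svn documentation, newer versions (Git 2.0 and later) place imported branches under refs/remotes/origin/ by default, while older versions used refs/remotes/ directly, which would explain why the refspec from step 4 works for some readers and not others. The sketch below is one hedged way to pick the matching refspec. The helper name `pick_refspec`, the scratch repositories, and the "Demo User" identity are hypothetical.

```shell
# Placeholder identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# pick_refspec REPO: print the push refspec that matches where REPO keeps
# its imported branches (under refs/remotes/origin/ or directly in refs/remotes/).
pick_refspec() {
  if [ -n "$(git -C "$1" for-each-ref refs/remotes/origin)" ]; then
    printf '%s\n' 'refs/remotes/origin/*:refs/heads/*'
  else
    printf '%s\n' 'refs/remotes/*:refs/heads/*'
  fi
}

# Two simulated clones: one with the older layout, one with the newer one.
demo=$(mktemp -d)
for layout in old new; do
  git init -q "$demo/$layout"
  git -C "$demo/$layout" commit -q --allow-empty -m 'Imported r1'
done
git -C "$demo/old" update-ref refs/remotes/trunk HEAD
git -C "$demo/new" update-ref refs/remotes/origin/trunk HEAD

old_spec=$(pick_refspec "$demo/old")
new_spec=$(pick_refspec "$demo/new")
printf 'old layout: %s\nnew layout: %s\n' "$old_spec" "$new_spec"

rm -rf "$demo"
```

In a real conversion this could be used as `git push bare "$(pick_refspec .)"` from inside the cloned repository, after which the branch in the bare repository should be named "trunk" rather than "origin/trunk".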
Everything works fine until step5. When I run
git branch -m trunk master
All that is output is:
error: refname refs/heads/trunk not found
fatal: Branch rename failed
Is there something I am missing?
you need to add in step 2 the command
sudo apt-get install git-svn
in order to use git svn
This is awesome! I didn't even consider preserving history, but it's great to know that it's possible, and it did what I needed it to.
Six years later, and on git 2.9.3, some problems I experience with this guide:
My remotes folder in the repository that git svn clone creates actually looks like:
.git/refs/remotes/origin/branch1
.git/refs/remotes/origin/branch2
etc.
With the suggested setting, these branches will be named "origin/branch1" also in the "bare" repository. Is that intentional really? Looks wrong to me. If I specify 'refs/remotes/origin/*:refs/heads/*' instead, the branches will avoid the "origin" prefix.
In any case, the above will only push the origin branches. The local commit of the conversion from svn:ignore to .gitignore will be on the master branch, which is not referenced from any of the origin branches identified by the refspec in remote.bare.push, so this commit will not be pushed by git push bare, and thus not available in the bare repository. So we need to push master explicitly. Then we will get both a master and a trunk branch in "bare" which are the same except for the additional .gitignore commit on master. So instead of renaming trunk to master, we can just remove trunk.
We can specify the refspec on the command for push, and we don't seem to need to bother with setting symbolic-ref in bare before pushing, so in summary, these are the commands I ended up using instead of step 2 to 5:
git svn clone https://.../svn/Repo -A authors-transform.txt --stdlayout --preserve-empty-dirs RepoGit
git init --bare RepoGitBare
cd RepoGit
git svn show-ignore > .gitignore
git add .gitignore
git commit -m 'Convert svn:ignore properties to .gitignore.'
git remote add bare ../RepoGitBare
git push bare refs/remotes/origin/*:refs/heads/* master
cd ../RepoGitBare
git branch -d trunk
For some reason, the trunk branch ended up with a different name (see below), so I had to use
git branch -m origin/trunk master
I think it worked, but I'm not sure...
> git branch -r
bare/origin/polz
bare/origin/tags/pre-polz
bare/origin/trunk
origin/polz
origin/tags/pre-polz
origin/trunk
Thanks for suggesting this. I struggled for quite some time with this on my Windows 10 machine, getting strange connection problems etc., but when I did exactly the same on a Linux box (or even from a bash prompt in the new "Windows Subsystem for Linux"), it worked immediately. Bit strange that this kind of problem is still around, almost 7 years after you commented on it...
Hi, at the step 5. to rename trunk to master, the command line is actually:
git branch -m origin/trunk master
When I ran the git svn clone command, I got the message "Using higher level of URL: http://server/svn/folder/folder/trunk => http://server/svn/folder". To get around this, I also passed in the parameter --no-minimize-url. Hope it helps.
Hi John,
Great article. But I found out that getting the authors list is a very difficult process for huge repositories, like 100 GB in size. As I am using Visual SVN, I don't have the actual SVN repositories on my hard drive. So is the only way to check out the repository and run the svn log command to get the authors list? Or is there another way, such as using a URL with the svn log command?
Please help me as i wasted 2 weeks of time to migrate a huge SVN repo but no luck..
TO FIX THE No refs in common and none specified; doing nothing. error:
Don't put the URL in the brackets! IE: https://my.svn.com/svn/stuff NOT [https://my.svn.com/svn/stuff]
GitHub will import an SVN repo quite easily when you create a new repo (you just specify import code after you create it). You get a progress bar (which isn't clear when you use git-svn clone). It has a fancy author-matching interface, and I'm guessing it's more robust with tags than this manual way. It's worth checking it out, if you have a private GitHub account (for a private Git) or don't mind a public copy of your code.
Hi Manuel, did you manage to fix this problem? I'm having the same issue.
This guide gives dangerously bad advice.
git-svn is subtly broken; it has a tendency to misplace branch joins, especially in the presence of cvs2svn artifacts and certain common sorts of Subversion operator errors. No conversion pipeline based on git-svn is safe or reliable.
Use this instead: http://www.catb.org/esr/reposurgeon/
Guide to conversion pragmatics is here: http://www.catb.org/~esr/reposurgeon/dvcs-migration-guide.html
Mr. Albin, please amend your post to add a suitable warning about this.
The command in Step 6 - Clean up branches and tags produces a syntax error with the following bash versions:
- bash 4.3.48(1)-release
- bash 4.4.23(1)-release
I don't know if it works for other bash versions.
The syntax error is rather vague "bash: syntax error near unexpected token `done'".
Can this be fixed?