Dividing Git Repositories

I’ve been working in a single git repository containing many projects (subfolders) and needed to divide it: move some folders, including their history, to a new repository, and remove those files, again including their history, from the original repository. There are a number of posts on this, but none of them worked exactly as described in my case. I had to combine instructions from various sources, which is why I decided to write this summary.

Short Instructions

Basically, you have to perform these steps:

  1. Make a backup
  2. Clone the source repository
  3. Execute git subtree commands on all folders to be moved
  4. Create a new repository
  5. Pull the isolated folders into the new repository and move them to subdirectories
  6. Restore the original repository
  7. Delete all moved folders (including history) from the original repository

Detailed Instructions

First of all: Make a backup. Seriously, these operations will rewrite your git history and can be destructive, so make sure everything is stored elsewhere.
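One possible way to take that backup, sketched in a throwaway sandbox. The repository name `my-repo`, the backup file names and the temporary directory are assumptions made for this illustration, not part of my original setup. A mirror clone copies every branch and tag, while a bundle packs them into a single file that is easy to store elsewhere.

```shell
#!/bin/sh
# Hypothetical sandbox: a tiny stand-in for the real repository.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/my-repo"
cd "$work/my-repo"
echo "hello" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Backup variant 1: a bare mirror clone that holds every ref.
git clone -q --mirror "$work/my-repo" "$work/my-repo-backup.git"

# Backup variant 2: a single bundle file containing all refs.
git bundle create "$work/my-repo-backup.bundle" --all

# Sanity check: the bundle must be readable and complete.
git bundle verify "$work/my-repo-backup.bundle"
```

Either variant can later serve as the source for git clone when the original repository has to be restored.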

The first step is to create a new working copy of the existing repository.

git clone git://my-server.tld/my-repo.git

Imagine the following git repository with the following folders on top level:

A
B
C
D

Let’s say we want to move A and B to another, new git repository and leave C and D in the original repository. To do so we invoke git subtree split, a command that creates a branch containing only the history of a given subdirectory, with the content of that subdirectory placed at the root of the branch. The syntax is:

git subtree split -P <directory> -b <branch>

In our example, we isolate folder A on branch onlyA and folder B on branch onlyB:

git subtree split -P A -b onlyA
git subtree split -P B -b onlyB
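To see what the split branches actually contain, here is a small sandbox sketch. The folder layout mirrors the example above, but the file names, commit messages and temporary directory are made up for the illustration. It assumes git subtree is available in your git installation.

```shell
#!/bin/sh
# Hypothetical sandbox repository with the top-level folders A, B, C and D.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/my-repo"
cd "$work/my-repo"
for dir in A B C D; do
    mkdir "$dir"
    echo "content of $dir" > "$dir/file1"
    git add "$dir"
    git commit -q -m "Add $dir/file1"
done
echo "more" >> A/file1
git commit -q -am "Update A/file1"

# Isolate the history of each folder on its own branch.
git subtree split -P A -b onlyA
git subtree split -P B -b onlyB

# The files of A now sit at the root of the onlyA branch.
git ls-tree -r --name-only onlyA

# Only the commits that touched A remain on that branch.
git log --format=%s onlyA
```

The working tree and the original branch are left untouched by the split; only the two new branches are created.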

The source repository is now prepared. The next step is to create the new target repository:

cd ..
mkdir newrepo
cd newrepo
git init

Now we pull a branch from the original repository:

git pull /path/to/original/repository onlyA

The path to the original repository must be an absolute path; at least, relative paths did not work for me. When you list the directory contents, you will notice that everything was imported into the root of the new repository. If you want the content to live in a subfolder again, you need to create the folder and move the files into it manually using git mv. Don’t forget to move hidden files as well (ls -la reveals them), and don’t forget to commit the moved files.

mkdir A
git mv file1 A
git mv file2 A
...
git commit -m "Merged A and moved to subfolder"
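Typing one git mv per file gets tedious, and hidden files are easy to miss. As a rough sketch, the move can be driven by the list of tracked top-level entries that git itself reports, which includes dotfiles but neither the .git directory nor the freshly created target folder. The sandbox content below (file1, file2, .hiddenfile) is a made-up stand-in for what a pull of onlyA would leave behind, and the loop assumes file names without newlines or other unusual characters.

```shell
#!/bin/sh
# Hypothetical stand-in for a new repository right after pulling onlyA.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/newrepo"
cd "$work/newrepo"
echo one > file1
echo two > file2
echo secret > .hiddenfile
git add .
git commit -q -m "Stand-in for the pulled onlyA content"

mkdir A
# Ask git for the tracked top-level entries of the last commit and
# move each of them, hidden files included, into the subfolder.
git ls-tree --name-only HEAD | while IFS= read -r entry; do
    git mv "$entry" A/
done
git commit -q -m "Merged A and moved to subfolder"

# Everything should now live below A/.
git ls-files
```

Because the list comes from the commit rather than from the file system, untracked files in the working directory stay where they are.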

Note: Apparently it is not possible to move the files in the original repository and then use the split command. When doing this, my history got lost. For me it only worked the other way round (first split, then move the files in the target repository). Feel free to comment if you can explain this behaviour.

Repeat this process for all other folders to be moved:

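# Git 2.9 and later refuse to merge unrelated histories unless told otherwise;
# --no-rebase makes the pull merge explicitly (newer versions ask for a strategy).
git pull --no-rebase --allow-unrelated-histories /path/to/original/repository onlyB
mkdir B
git mv file1 B
git mv file2 B
...
git commit -m "Merged B and moved to subfolder"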
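The whole pull-and-move sequence can be tried out safely in a sandbox first. The sketch below uses made-up folders and file names and a temporary directory. One thing to be aware of: the second branch shares no commit with what is already in the new repository, and git versions from 2.9 on refuse such a merge with "refusing to merge unrelated histories" unless --allow-unrelated-histories is passed. The --no-rebase and --no-edit options are my additions so that the pull merges without asking for a strategy or opening an editor.

```shell
#!/bin/sh
# Hypothetical sandbox: source repository with folders A, B and C.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/my-repo"
cd "$work/my-repo"
for dir in A B C; do
    mkdir "$dir"
    echo "content of $dir" > "$dir/file1"
    git add "$dir"
    git commit -q -m "Add $dir/file1"
done
git subtree split -P A -b onlyA
git subtree split -P B -b onlyB

git init -q "$work/newrepo"
cd "$work/newrepo"

# First pull: the new repository is empty, so there is nothing to merge with.
git pull -q "$work/my-repo" onlyA
mkdir A
git mv file1 A/
git commit -q -m "Merged A and moved to subfolder"

# Second pull: onlyB is unrelated to the current history, so the merge
# has to be allowed explicitly.
git pull -q --no-rebase --no-edit --allow-unrelated-histories "$work/my-repo" onlyB
mkdir B
git mv file1 B/
git commit -q -m "Merged B and moved to subfolder"

git log --oneline --graph
```

Moving the content of A into its subfolder before pulling onlyB also keeps files with the same name in both folders from colliding at the root.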

After that we restore the original repository completely (either using git clone or from a backup). Then we remove all moved folders (including history) from that repository. This is achieved using git filter-branch commands:

git filter-branch -f --tree-filter 'rm -rf A' HEAD
git filter-branch -f --tree-filter 'rm -rf B' HEAD
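On larger repositories the tree filter can be slow, because it checks out every commit. A variant I would consider, sketched here in a sandbox with made-up content, is to use --index-filter, which only rewrites the index, and to handle both folders in a single pass. --prune-empty drops commits that no longer change anything once A and B are gone. The environment variable only silences the warning that newer git versions print before running filter-branch.

```shell
#!/bin/sh
# Hypothetical sandbox repository with the top-level folders A, B, C and D.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/my-repo"
cd "$work/my-repo"
for dir in A B C D; do
    mkdir "$dir"
    echo "content of $dir" > "$dir/file1"
    git add "$dir"
    git commit -q -m "Add $dir/file1"
done

# Remove A and B from every commit reachable from HEAD in one pass.
# --ignore-unmatch keeps commits that never contained A or B from failing.
git filter-branch -f --prune-empty \
    --index-filter 'git rm -r -q --cached --ignore-unmatch A B' HEAD

# Only C and D should be left, in the tree as well as in the history.
git ls-tree --name-only HEAD
git log --format=%s
```

Note that filter-branch keeps the old commits reachable under refs/original (and in the reflog), so the repository only shrinks on disk after those references are removed and git gc has run.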

That’s it! We have now moved a part of the original repository (including history) to a new repository and removed that part (including history) from the original repository. Effectively this divided the original repository into two independent repositories.


4 thoughts on “Dividing Git Repositories”

  1. Thank you Dave!
    This is a great breakdown! I also found that other posts talking about this seemed to have conflicting instructions, and with the different versions of git and the subtree command, I was glad to find and follow your instructions! I couldn’t get the old folders removed using “git filter-branch -f --tree-filter 'rm -rf A' HEAD”, but that was the least of my worries.

    Thanks again!
    -fabio

  2. Thanks for that really useful article. Everybody developing with git may have this problem at some day. But I’m sure nobody really knows how to do this correctly and without pain.

  3. I’m able to run `git pull /path/to/original/repository onlyA`, but when I try `git pull /path/to/original/repository onlyB`, I get the error “fatal: refusing to merge unrelated histories”. This is with git 2.12.1. Any idea why?
