Let’s say you have two projects that have been living in their own repositories for ages. And all of a sudden it makes sense to move a subset of code, and its whole commit history, from the first to the second.
That’s a topic I already addressed. See for instance my previous articles on an internal corporate project I open-sourced, a migration from SVN to Git, and Git sub-tree cleaning.
Should we really keep revisiting the subject again and again? Yes, because things have changed! Since v1.7.2, Git supports orphan branches. And we’ll now use them to keep unrelated histories living in the same repository until their merging point.
We start with the source repository. The code we’re about to move is available in the develop branch:
$ git clone https://github.com/kdeldycke/source-project.git
$ cd ./source-project
$ git checkout develop
To prevent any bad move, we detach the local copy of the repository from its remote branches:
$ git remote rm origin
Now we can start cleaning the develop branch to only keep the subset of code we’d like to move.
First we remove other branches and all tags:
$ git branch -D master
$ git tag -d `git tag | grep -E '.'`
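If the repository has more than one extra branch, or no tags at all (in which case git tag -d is called without arguments and complains), a loop may be more forgiving. Here is a minimal sketch, run against a throwaway repository created in a temporary directory. The feature branch and the v1.0 and v1.1 tags are made up for the demo:

```shell
set -e

# Throwaway repository standing in for the source project (hypothetical content).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch master
git branch feature
git tag v1.0
git tag v1.1

# Delete every local branch except the one we want to keep.
keep="develop"
git for-each-ref --format='%(refname:short)' refs/heads/ | while read -r branch; do
  [ "$branch" = "$keep" ] || git branch -q -D "$branch"
done

# Delete every tag; the loop body simply never runs when there are none.
git tag -l | while read -r tag; do
  git tag -d "$tag" >/dev/null
done

# Show what is left.
git branch --list
git tag -l
```

Only the branch named in the keep variable should survive, and the tag list should come back empty.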
In my case, and after studying the whole commit history, the code lived under the following past and current locations:
- ./folder1/lib/*
- ./folder2/subfolder/data-lib/*
- ./folder3/scripts/*
- ./script_tools.py
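One way to draw up such a list is to ask Git for every path that was ever touched in the history. The sketch below does this on a throwaway repository; the helpers.py file and its move between folders are invented for the demo, and on a real project you would only need the git log pipeline at the end:

```shell
set -e

# Throwaway repository with a file that changes location over time.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir -p folder1/lib
echo "helpers" > folder1/lib/helpers.py
git add .
git commit -q -m "Add lib"

mkdir -p folder2/subfolder/data-lib
git mv folder1/lib/helpers.py folder2/subfolder/data-lib/helpers.py
git commit -q -m "Move lib"

# List every path ever added, modified or deleted, across all branches.
# --no-renames makes a move show up as both its old and its new path.
historical_paths=$(git log --all --no-renames --name-only --pretty=format: \
  | sed '/^$/d' | sort -u)
echo "$historical_paths"
```

Both the past and the current location of the file show up in the result, which is the kind of list the find patterns then have to cover.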
I then managed to produce a one-liner find command satisfying all these path constraints. Combined with the filter-branch action, it allowed me to remove all content but these paths within the whole commit history:
$ git filter-branch --force --prune-empty --tree-filter 'find . -type f -not -ipath "*lib*" -and -not -ipath "*script*" -and -not -ipath "./.git*" -and -not -path "." -print -exec rm -rf "{}" \;' -- --all
I revisited the new commit log, which was way cleaner. But the command above was too coarse-grained, so I had to repeat the operation a few more times to get the exact sub-tree I was looking for:
$ git filter-branch --force --prune-empty --tree-filter 'find . -type f -iname ".gitignore" -print -exec rm -rf "{}" \;' -- --all
$ git filter-branch --force --prune-empty --tree-filter 'if [ -d ./calibration ]; then rm -rf ./calibration; fi' -- --all
$ git filter-branch --force --prune-empty --tree-filter 'if [ -f ./script_tools.py ]; then mkdir -p ./folder3/scripts; mv ./script_tools.py ./folder3/scripts/; fi' -- --all
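Before running this kind of filter on a real repository, it can help to rehearse it on a disposable one. The following sketch builds a two-commit history in a temporary directory (the file names are invented), applies a simplified version of the find filter, and shows that --prune-empty drops the commit that no longer touches anything. FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice printed by recent Git versions; I assume older versions simply ignore it:

```shell
set -e

# Disposable repository: one commit we want to keep, one we want gone.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir -p folder1/lib docs
echo "helpers" > folder1/lib/helpers.py
git add .
git commit -q -m "Add lib"
echo "manual" > docs/readme.txt
git add .
git commit -q -m "Add docs"

# Keep only files whose path contains "lib", in every commit of every branch.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --force --prune-empty \
  --tree-filter 'find . -type f -not -ipath "*lib*" -exec rm -f "{}" \;' \
  -- --all

# The "Add docs" commit became empty and was pruned.
git log --oneline
git ls-files
```

The tree filter runs inside a temporary checkout of each commit, so the paths seen by find are relative to the root of the project.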
Now that I have the perfect history for the minimal subset of code I’m targeting, we can flatten the commit log:
$ git rebase --root
Rebasing is not an exact science and you might end up with empty commits:
(...)
Could not apply 2a4f66a6fa114846bb80c3d488e41a186bce4894...
The previous cherry-pick is now empty, possibly due to conflict resolution.
If you wish to commit it anyway, use:
git commit --allow-empty
Otherwise, please use 'git reset'
(...)
In which case I simply ignore the problem and order the rebasing action to continue, as many times as necessary to let the process complete:
$ git rebase --continue
$ git rebase --continue
$ git rebase --continue
Finally, we clean up:
$ git reflog expire --all
$ git gc --aggressive --prune
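Keep in mind that, with their default settings, these two commands only expire reflog entries and loose objects that are already old, and that filter-branch keeps a backup of the pre-rewrite refs under refs/original/. If you are sure you no longer need the old history, a stricter variant could look like the sketch below. It runs on a throwaway repository (the file names are invented) and checks that the pre-rewrite commit is really gone:

```shell
set -e

# Throwaway repository with a file we rewrite out of history.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
echo "keep" > keep.txt
echo "drop" > drop.txt
git add .
git commit -q -m "Initial commit"
old_head=$(git rev-parse HEAD)

export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --force --prune-empty --tree-filter 'rm -f drop.txt' -- --all

# Remove the backup refs left behind by filter-branch.
git for-each-ref --format='%(refname)' refs/original/ | while read -r ref; do
  git update-ref -d "$ref"
done

# Expire every reflog entry and prune every unreachable object, whatever its age.
git reflog expire --expire=now --all
git gc --quiet --aggressive --prune=now

# Check whether the pre-rewrite commit still exists in the object store.
if git cat-file -e "$old_head" 2>/dev/null; then
  old_commit_state="still present"
else
  old_commit_state="pruned"
fi
echo "Pre-rewrite commit is $old_commit_state"
```

This is destructive by design: once the objects are pruned there is no way back, so only run it on a detached working copy like the one prepared here.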
Let’s now switch to the repository that will become the new home for our code:
$ cd ..
$ git clone https://github.com/kdeldycke/destination-project.git
$ cd ./destination-project
We create a new detached, orphan branch:
$ git symbolic-ref HEAD refs/heads/orphan
$ rm .git/index
$ git clean -fdx
$ git commit -m 'Temporary initial commit for orphan branch.' --allow-empty
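Since orphan branches are natively supported from v1.7.2 onwards, the same result can probably be reached with git checkout --orphan. The sketch below tries it on a throwaway repository (the app.py file is invented) and verifies that the new commit has no parent. Note that git rm -rf only removes tracked files, whereas git clean -fdx also wipes untracked and ignored ones:

```shell
set -e

# Throwaway repository standing in for the destination project.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
echo "existing code" > app.py
git add app.py
git commit -q -m "Existing project history"

# Start a branch with no history, then empty the index and working tree.
git checkout -q --orphan orphan
git rm -rf -q .
git commit -q --allow-empty -m 'Temporary initial commit for orphan branch.'

# rev-list --parents prints the commit followed by its parents:
# a single word means the commit is a root commit.
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "Parents of the orphan commit: $parent_count"
```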
Then publish that new orphan branch upstream:
$ git push --set-upstream origin orphan
At this point I encourage you to check in your GUI that said orphan branch is really detached from your usual branches.
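If you have no GUI at hand, the command line can answer the same question: git merge-base exits with an error when two branches have no common ancestor. The sketch below replays the orphan-branch recipe on a throwaway repository (the app.py file is invented) and then performs the check:

```shell
set -e

# Throwaway repository standing in for the destination project.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
echo "existing code" > app.py
git add app.py
git commit -q -m "Existing project history"
usual_branch=$(git symbolic-ref --short HEAD)

# Same orphan-branch recipe as in the article.
git symbolic-ref HEAD refs/heads/orphan
rm .git/index
git clean -q -fdx
git commit -q --allow-empty -m 'Temporary initial commit for orphan branch.'

# Two separate roots should appear in the graph.
git log --graph --oneline --all

# merge-base fails when the branches share no history.
if git merge-base orphan "$usual_branch" >/dev/null; then
  relationship="shared history"
else
  relationship="no common ancestor"
fi
echo "orphan vs $usual_branch: $relationship"
```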
Time to import the branch we cleaned from the source repository:
$ git checkout orphan
$ git remote add code_import ../source-project
$ git fetch code_import
We then replace our orphan branch with the code we just imported:
$ git rebase code_import/develop
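To rehearse the import without touching the real projects, here is a minimal sketch using two local throwaway repositories. Their content is invented, and the destination is simplified to contain nothing but the orphan branch. Depending on your Git version, the temporary empty commit may either vanish or be kept on top of the imported history, so the sketch only checks that the imported commits are part of the orphan branch and that both trees are identical:

```shell
set -e

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the cleaned source repository.
git init -q source-project
cd source-project
git symbolic-ref HEAD refs/heads/develop
mkdir -p folder3/scripts
echo "print('tools')" > folder3/scripts/script_tools.py
git add .
git commit -q -m "Add script tools"
cd ..

# Stand-in for the destination repository and its orphan branch.
git init -q destination-project
cd destination-project
git symbolic-ref HEAD refs/heads/orphan
git commit -q --allow-empty -m 'Temporary initial commit for orphan branch.'

# Import the cleaned branch and replay the orphan branch on top of it.
git remote add code_import ../source-project
git fetch -q code_import
git rebase -q code_import/develop

git log --oneline orphan

# The imported history must be contained in orphan, with the same content.
git merge-base --is-ancestor code_import/develop orphan
git diff --quiet code_import/develop orphan
echo "orphan now carries the imported history"
```

If the temporary commit does survive the rebase and you would rather not keep it, resetting the orphan branch to code_import/develop with git reset --hard should give the same content without it.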
Once everything’s to your taste, we can remove the relationship with the source project and push the newly populated orphan branch upstream:
$ git branch -r -D code_import/develop
$ git push --force --set-upstream origin orphan