How to remove all files from the git history that are not currently present?
I’ve seen several articles and questions about how to remove a single file from all git history. Example: How to remove/delete a large file from commit history in Git repository?
What I’d like to do is remove all files that are not currently present at the head of the master branch.
My use case is that I’m splitting off a smaller repository (call it small) from a monolithic repository (call it monolith). I want to preserve the git history when creating small, but only the relevant git history.
First, I created a new repository small on GitHub. Then, on my laptop, I added it as a remote named origin-small to my local monolith repository, and pushed the current state of the master branch of monolith to origin-small. I then removed the remote origin-small from monolith, changed directories, and cloned small from GitHub. Voilà, I had a copy of my original repository, monolith, with its full history.
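For reference, the setup described above can be sketched roughly as follows. This is a toy reproduction, not the exact commands I ran: a local bare repository stands in for the GitHub repo, and the temporary paths, file name, and commit identity are placeholders.

```shell
# Toy reproduction of the setup; a local bare repo plays the role of "small" on GitHub.
work=$(mktemp -d)
git init -q --bare "$work/small.git"

# A minimal "monolith" with a single commit on master.
git init -q "$work/monolith"
git -C "$work/monolith" symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo hello > "$work/monolith/file.txt"
git -C "$work/monolith" add file.txt
git -C "$work/monolith" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit"

# Add the new repo as a remote, push master to it, then drop the remote again.
git -C "$work/monolith" remote add origin-small "$work/small.git"
git -C "$work/monolith" push -q origin-small master
git -C "$work/monolith" remote remove origin-small

# Clone the new repo; it carries the full history that was pushed.
git clone -q --branch master "$work/small.git" "$work/small"
commits=$(git -C "$work/small" rev-list --count master)
echo "$commits"
```

With a real GitHub repository, the bare-repo path would be replaced by the repository URL.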
But there are loads of files in the history of small that are no longer relevant, and they are bloating the repo.
What I’d like to do is:
- Delete all of the unnecessary files from small.
- Run a command to clear the whole git history of the files that I just deleted.
Is there a way to do this with a single command? Or do I need to run git filter-branch once for every file/directory that I want to remove?
I ended up using git-filter-repo. WARNING: This approach is NOT able to update tags on the remote, if there are any.
- Install git-filter-repo, for example with Homebrew.
brew install git-filter-repo
- Clone your desired repo, in mirror form.
git clone --mirror <my-repo-url>
- Enter the repo directory.
- Analyze the repo to identify all files that are in the history, but no longer exist.
git filter-repo --analyze
In the analysis output directory, there will be a file named path-deleted-sizes.txt that contains a list of all files that were committed at some point and later deleted, but that still exist in the git history.
- Create a new file that contains only the paths, without the header lines and the other columns.
tail -n +3 ./filter-repo/analysis/path-deleted-sizes.txt | tr -s ' ' | cut -d ' ' -f 5- > ./filter-repo/analysis/path-deleted.txt
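To see what that pipeline does, here is a small sketch run against made-up input. The sample assumes the report has two header lines followed by rows of right-aligned sizes, a date, and the path; the sizes, dates, and paths below are invented for illustration.

```shell
# Made-up sample in the assumed layout of path-deleted-sizes.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
=== Deleted paths by reverse accumulated size ===
Format: unpacked size, packed size, date deleted, path name(s)
    12345678    1234567 2019-03-01 assets/old video.mp4
        4096       1024 2020-07-15 docs/legacy/README.md
EOF

# tail skips the two header lines, tr squeezes runs of spaces into one,
# and cut keeps field 5 onward. Field 1 is empty because each row starts
# with a space, so the path begins at field 5.
paths=$(tail -n +3 "$sample" | tr -s ' ' | cut -d ' ' -f 5-)
printf '%s\n' "$paths"
# prints:
# assets/old video.mp4
# docs/legacy/README.md
```

A path containing a single space survives because cut keeps everything from field 5 onward, but a path with several consecutive spaces would be altered by tr, so it is worth skimming the resulting file before using it.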
- Clean the git history of all files that no longer exist. This will also clean dirty commits, remove empty commits, and recompress everything for you.
git filter-repo --invert-paths --paths-from-file ./filter-repo/analysis/path-deleted.txt
- Clean up the ./filter-repo directory, or you won’t be able to push your changes.
rm -rf ./filter-repo
- Force-push all refs to the origin. Because the repo was cloned with --mirror, a plain git push force-updates the refs, even though the command doesn’t indicate it. It also updates all branches on the remote, which is convenient. If you have branch protection enabled on some branches in GitHub/Bitbucket/etc., then you will need to allow force-pushes. You can always re-run this command if you find that some refs could not be force-pushed.
git push
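As a rough sanity check, plain git can list the paths that appear somewhere in the history but are absent from HEAD. The sketch below is not part of git-filter-repo; history_only_paths is a hypothetical helper name, and the toy repository exists only to demonstrate it.

```shell
# Print paths that occur in any commit reachable from any ref but are not in HEAD.
# The body runs in a subshell so its variables do not leak into the caller.
history_only_paths() (
  repo=$1
  all=$(mktemp)
  current=$(mktemp)
  git -C "$repo" -c core.quotePath=false log --all --pretty=format: --name-only \
    | grep -v '^$' | LC_ALL=C sort -u > "$all"
  git -C "$repo" -c core.quotePath=false ls-tree -r --name-only HEAD \
    | LC_ALL=C sort > "$current"
  LC_ALL=C comm -23 "$all" "$current"   # lines only in the history list
  rm -f "$all" "$current"
)

# Toy repository: two files are added, then one of them is deleted.
demo=$(mktemp -d)
git init -q "$demo"
echo keep > "$demo/keep.txt"
echo drop > "$demo/drop.txt"
git -C "$demo" add .
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "add two files"
git -C "$demo" rm -q drop.txt
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "remove drop.txt"

leftover=$(history_only_paths "$demo")
echo "$leftover"   # prints: drop.txt
```

Run against the rewritten mirror clone, the helper would be expected to print nothing, provided no other branch or tag still carries paths that are missing from HEAD.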