Git remove file from history

From UVOO Tech Wiki

Remove binary file from git history in order to reduce size of git clone

To remove a binary file from the entire Git history and shrink the repository, use git filter-repo. It is faster and less error-prone than the older git filter-branch, whose own documentation now recommends git filter-repo instead.
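Before rewriting anything, it can help to confirm which files actually dominate the repository size. The following is a minimal sketch using only standard git plumbing; the function name largest_blobs is a hypothetical helper, not part of git or git filter-repo.

```shell
# Hypothetical helper: list the N largest blobs found anywhere in history.
# Usage: largest_blobs <repo-dir> [count]
# Output: one line per blob, "<size in bytes><TAB><path>", largest first.
largest_blobs() {
  local repo="$1" count="${2:-10}"
  # rev-list prints "<sha> <path>" for every object reachable from any ref;
  # cat-file adds the object type and size and passes the path through as %(rest).
  git -C "$repo" rev-list --objects --all |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { size = $2; sub(/^[^ ]+ [^ ]+ /, ""); print size "\t" $0 }' |
    sort -rn |
    head -n "$count"
}

# Example (run from anywhere):
#   largest_blobs /path/to/your-repo.git 5
```

The paths it prints are the values to pass to git filter-repo --path.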

Steps

  1. Install git filter-repo:
    • If you don't have git filter-repo installed, you can install it using pip:
    pip install git-filter-repo
    
  2. Clone the Repository:
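    • Make a fresh mirror clone (a bare copy that includes every branch and tag). git filter-repo refuses to run on a clone that is not fresh unless forced: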
    git clone --mirror https://github.com/your-username/your-repo.git
    cd your-repo.git
    
  3. Remove the Binary File:
    • Use git filter-repo to remove the binary file from the entire history. --path selects the file and --invert-paths keeps everything except that selection. Replace path/to/binary-file with the actual path to the binary file you want to remove:
    git filter-repo --path path/to/binary-file --invert-paths
    
  4. Force Push the Changes:
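    • git filter-repo removes the origin remote as a safeguard, so name the remote URL explicitly when force pushing the rewritten history:
    git push --force --mirror https://github.com/your-username/your-repo.git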
    

Example

Assuming you want to remove a binary file located at binaries/large-file.bin, here are the commands you would run:

# Install git filter-repo if not already installed
pip install git-filter-repo

# Clone the repository as a mirror
git clone --mirror https://github.com/your-username/your-repo.git
cd your-repo.git

# Remove the binary file from the entire history
git filter-repo --path binaries/large-file.bin --invert-paths

# Force push the rewritten history (git filter-repo removes the origin remote, so pass the URL)
git push --force --mirror https://github.com/your-username/your-repo.git
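After the rewrite, and before pushing, it is worth checking that the file is really gone. A minimal sketch, assuming it is run against the rewritten mirror clone; path_in_history is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit 0) if any commit reachable from any ref
# touches the given path, fails (exit 1) otherwise.
# Usage: path_in_history <repo-dir> <path>
path_in_history() {
  local repo="$1" path="$2"
  # rev-list prints at most one matching commit hash; empty output means no match.
  [ -n "$(git -C "$repo" rev-list --all --max-count=1 -- "$path")" ]
}

# Example, from inside your-repo.git after running git filter-repo:
#   if path_in_history . binaries/large-file.bin; then
#     echo "file is still in history"
#   else
#     echo "file removed from history"
#   fi
#   git count-objects -vH   # size-pack shows the on-disk size of the repository
```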

Important Considerations

  • Backup: Before performing these operations, back up your repository. The rewrite is destructive, and once it has been force pushed it cannot be undone without a backup.
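  • Collaboration: Inform your collaborators about the history rewrite. Every rewritten commit gets a new hash, so they should re-clone the repository rather than pull, and should not push branches based on the old history.
  • GitHub Actions and CI/CD: If you use GitHub Actions or other CI/CD tools, check any workflow that pins a commit hash or caches a clone of the repository, since both will refer to the old history.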
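One way to take the backup mentioned above is a git bundle, which stores every ref and object in a single file. This is a sketch, not the only option (a second mirror clone works as well); backup_repo is a hypothetical helper name.

```shell
# Hypothetical helper: write all refs of <repo-dir> into one bundle file and verify it.
# Usage: backup_repo <repo-dir> <absolute-path-to-bundle>
# Use an absolute bundle path: git -C changes directory before resolving it.
backup_repo() {
  local repo="$1" bundle="$2"
  git -C "$repo" bundle create "$bundle" --all &&
    git -C "$repo" bundle verify "$bundle"
}

# Example:
#   backup_repo /path/to/your-repo.git /path/to/backups/your-repo.bundle
# To restore later, clone from the bundle file:
#   git clone /path/to/backups/your-repo.bundle restored-repo
```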

Alternative: BFG Repo-Cleaner

Another tool you can use is the BFG Repo-Cleaner, which is designed to remove large files and sensitive data from Git repositories.

  1. Download BFG Repo-Cleaner:
    • Download the BFG jar (called bfg.jar below). It needs a Java runtime.
  2. Run BFG Repo-Cleaner:
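    • Use BFG to remove the binary file. --delete-files matches file names only, not paths, so pass large-file.bin rather than binaries/large-file.bin. BFG also leaves the contents of the latest commit untouched by default, so delete the file in a normal commit first:
    java -jar bfg.jar --delete-files large-file.bin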
    
  3. Clean Up and Push:
    • Follow up with a git gc and force push:
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
    git push --force
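Because BFG protects the latest commit by default, a file that still exists in HEAD survives the cleanup. The sketch below checks for that before running BFG; name_in_head is a hypothetical helper name, and it matches on the base name only, the same way --delete-files does.

```shell
# Hypothetical helper: succeeds (exit 0) if HEAD's tree contains a file
# whose base name equals <file-name>, fails (exit 1) otherwise.
# Usage: name_in_head <repo-dir> <file-name>
name_in_head() {
  local repo="$1" name="$2"
  # ls-tree -r lists every file path in HEAD; awk compares the last path segment.
  git -C "$repo" ls-tree -r --name-only HEAD |
    awk -F/ -v n="$name" '$NF == n { found = 1 } END { exit !found }'
}

# Example:
#   if name_in_head . large-file.bin; then
#     echo "delete large-file.bin in a normal commit and push before running BFG"
#   fi
```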
    

By following these steps, you can remove a binary file from the entire Git history, reducing the size of your repository and making git clone operations faster.