Git remove file from history

From UVOO Tech Wiki
Revision as of 16:54, 9 October 2024 by Busk

Remove binary file from git history in order to reduce size of git clone

To remove a binary file from the entire Git history and so reduce the size of the repository, you can use the git filter-repo tool. It is faster and less error-prone than the older git filter-branch, whose own documentation recommends git filter-repo as a replacement.
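Before rewriting anything, it can help to confirm which paths actually dominate the history. The following is a minimal sketch that uses only stock Git commands; the function name list_largest_blobs and the throwaway demo repository (with a made-up 100000-byte file) are assumptions for illustration, not part of the original guide.

```shell
#!/bin/sh
# Sketch: list the largest blobs reachable from any ref, biggest first.
set -eu

# list_largest_blobs [N]: print "<size-in-bytes> <path>" for the N largest
# blobs in the repository in the current directory (N defaults to 10).
list_largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort -rn |
    awk -v n="${1:-10}" 'NR <= n'
}

# Demo on a throwaway repository (hypothetical paths and sizes).
demo_repo="$(mktemp -d)"
git init -q "$demo_repo"
(
  cd "$demo_repo"
  mkdir binaries
  head -c 100000 /dev/zero > binaries/large-file.bin
  echo "hello" > README.md
  git add .
  git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add files"
)

# Ask for the single largest blob in the demo repository.
largest="$(cd "$demo_repo" && list_largest_blobs 1)"
echo "$largest"
```

In a real repository, run list_largest_blobs from inside the clone. The sizes shown are uncompressed object sizes, so they indicate which files to target rather than the exact on-disk saving.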

Step-by-Step Guide

  1. Install git filter-repo:
    • If you don't have git filter-repo installed, you can install it using pip:
    pip install git-filter-repo
    
  2. Clone the Repository:
    • Clone your repository as a bare mirror. A mirror clone copies every branch and tag, so all refs are rewritten together:
    git clone --mirror https://github.com/your-username/your-repo.git
    cd your-repo.git
    
  3. Remove the Binary File:
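    • Use git filter-repo to remove the binary file from the entire history. The --path option selects the file and --invert-paths inverts the selection, so everything except that path is kept. Replace path/to/binary-file with the actual path to the binary file you want to remove: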
    git filter-repo --path path/to/binary-file --invert-paths
    
  4. Force Push the Changes:
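    • git filter-repo normally removes the origin remote after rewriting, as a guard against accidental pushes. If git remote -v shows no remote, add it back, then force push the rewritten history to the remote repository:
    git remote add origin https://github.com/your-username/your-repo.git
    git push --force --mirror origin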
    

Example

Assuming you want to remove a binary file located at binaries/large-file.bin, here are the commands you would run:

# Install git filter-repo if not already installed
pip install git-filter-repo

# Clone the repository as a mirror
git clone --mirror https://github.com/your-username/your-repo.git
cd your-repo.git

# Remove the binary file from the entire history
git filter-repo --path binaries/large-file.bin --invert-paths

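# Re-add the remote if git filter-repo removed it, then force push the rewritten history
git remote add origin https://github.com/your-username/your-repo.git
git push --force --mirror origin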
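After the rewrite, it is worth checking that the path is really gone from every ref before pushing. The sketch below uses plain git log; the helper name path_in_history and the demo repository are assumptions for illustration. The demo also shows why a rewrite is needed at all: deleting a file in a later commit leaves it in the history.

```shell
#!/bin/sh
# Sketch: check whether a path still appears anywhere in the history.
set -eu

# path_in_history PATH: succeed if any commit on any ref touched PATH
# in the repository in the current directory.
path_in_history() {
  [ -n "$(git log --all --format=%H -- "$1")" ]
}

# Demo: add a file, then delete it with an ordinary commit (no rewrite).
demo_repo="$(mktemp -d)"
git init -q "$demo_repo"
(
  cd "$demo_repo"
  mkdir binaries
  head -c 1000 /dev/zero > binaries/large-file.bin
  git add binaries/large-file.bin
  git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add binary"
  git rm -q binaries/large-file.bin
  git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Delete binary"
)

# The deleted file is still reachable through the older commit.
if (cd "$demo_repo" && path_in_history binaries/large-file.bin); then
  still_present=yes
else
  still_present=no
fi

# A path that was never committed is reported as absent.
if (cd "$demo_repo" && path_in_history no/such/file); then
  never_committed=yes
else
  never_committed=no
fi

echo "deleted file still in history: $still_present"
echo "unknown path in history: $never_committed"
```

Run inside the rewritten mirror clone, path_in_history binaries/large-file.bin should fail once git filter-repo has removed the file.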

Important Considerations

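  • Backup: Before performing these operations, back up your repository. The rewrite replaces every affected commit, and once it has been force pushed the old history cannot be recovered from the remote.
  • Collaboration: Inform your collaborators about the history rewrite. They will need to re-clone the repository rather than pull, because pushing from an old clone would reintroduce the removed file.
  • GitHub Actions and CI/CD: Rewriting history changes commit hashes. If you use GitHub Actions or other CI/CD tools, check anything that refers to a specific commit and confirm that your workflows still run.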
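One possible way to take the backup mentioned above is a Git bundle, which stores all refs in a single file. This is a sketch under that assumption, since the guide does not prescribe a backup method; the helper name backup_repo and the demo paths are hypothetical.

```shell
#!/bin/sh
# Sketch: back up every ref of a repository into one bundle file.
set -eu

# backup_repo REPO_DIR BUNDLE_FILE: create the bundle and verify it.
# BUNDLE_FILE should be an absolute path, because git -C changes
# directory before the bundle path is resolved.
backup_repo() {
  git -C "$1" bundle create "$2" --all
  git -C "$1" bundle verify "$2"
}

# Demo on a throwaway repository.
demo_repo="$(mktemp -d)"
backup_dir="$(mktemp -d)"
git init -q "$demo_repo"
(
  cd "$demo_repo"
  echo "hello" > README.md
  git add README.md
  git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Initial commit"
)
backup_repo "$demo_repo" "$backup_dir/backup.bundle"

# Restoring is an ordinary clone that uses the bundle file as the source.
git clone -q --bare "$backup_dir/backup.bundle" "$backup_dir/restored.git"
restored_commits="$(git -C "$backup_dir/restored.git" rev-list --all --count)"
echo "commits restored: $restored_commits"
```

Keeping a second, untouched mirror clone works as a backup too. A bundle is simply easier to copy to other storage.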

Alternative: BFG Repo-Cleaner

Another tool you can use is the BFG Repo-Cleaner, which is designed to remove large files and sensitive data from Git repositories.

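  1. Download BFG Repo-Cleaner:
    • Download the BFG jar from the project's site and save it as bfg.jar, the name the command below assumes. BFG needs a Java runtime.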
  2. Run BFG Repo-Cleaner:
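    • Use BFG to remove the binary file. Run it from inside the mirror clone. BFG matches on file name only, not on path, so pass just the name. By default it also leaves the latest commit untouched, so delete the file in a normal commit first:
    java -jar bfg.jar --delete-files large-file.bin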
    
  3. Clean Up and Push:
    • BFG rewrites the commits but leaves the old objects in place. Expire the reflog and run git gc to delete them, then force push:
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
    git push --force
    

By following these steps, you can remove a binary file from the entire Git history, reducing the size of your repository and making git clone operations faster.
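To see whether the cleanup paid off, you can compare the object storage size before and after. The sketch below sums the size and size-pack fields of git count-objects -v, both reported in KiB; the helper name repo_size_kib and the demo repository are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: report how much disk space a repository's objects occupy.
set -eu

# repo_size_kib: print loose plus packed object storage, in KiB, for the
# repository in the current directory.
repo_size_kib() {
  git count-objects -v |
    awk '$1 == "size:" || $1 == "size-pack:" { total += $2 } END { print total + 0 }'
}

# Demo on a throwaway repository: measure, repack, measure again.
demo_repo="$(mktemp -d)"
git init -q "$demo_repo"
(
  cd "$demo_repo"
  head -c 100000 /dev/zero > large-file.bin
  git add large-file.bin
  git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add binary"
)
size_before="$(cd "$demo_repo" && repo_size_kib)"
(cd "$demo_repo" && git gc -q --prune=now)
size_after="$(cd "$demo_repo" && repo_size_kib)"

echo "before gc: ${size_before} KiB"
echo "after gc: ${size_after} KiB"
```

In practice, call repo_size_kib in the mirror clone once before the rewrite and once after the final git gc. The difference approximates what a fresh clone will save.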