GitHub - bloomberg/repofactor: Tools for refactoring history of git repositories
source link: https://github.com/bloomberg/repofactor
Finding the causes of repository bloat
This project contains a bunch of tools to help analyse the largest blobs (by "on disk" storage) in a repository.
Here is a sample sequence of commands showing typical usage:
- Typically start with a clean clone of the repository that you want to analyse. It can be bare. For reasonable performance it should be cloned onto "local" disk on a reasonably fast Linux machine.
- Add these tools to your PATH or use a full path to each script or executable.
- Run these tools from the repository undergoing analysis and cleaning.
- Work out a suitable threshold size by running generate-larger-than with experimental parameters. 50000 might be a good starting point. The size is "average bytes after compression by Git".
- Generate a sorted list of objects with file information:

      generate-larger-than 50000 | sort -k3n | add-file-info >../largeobjs.txt

- Make a report showing the summary of each commit together with the paths which introduce the large objects, their uncompressed size and file information:

      report-on-large-objects ../largeobjs.txt
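To make the idea of "on disk" size concrete, here is a minimal sketch that lists large blobs using only stock Git plumbing. It is not how generate-larger-than is implemented, and its output columns are not claimed to match that tool's; the function name list_large_blobs and its column layout are made up for illustration. Note that for objects stored as deltas, Git reports the size of the delta, so the numbers are an approximation of each blob's share of the pack.

```shell
#!/bin/sh
# Sketch only: list blobs whose on-disk size is at least THRESHOLD bytes.
# Uses stock Git plumbing; this is NOT the repofactor implementation.
#
# Output columns: <blob id> <on-disk bytes> <uncompressed bytes> <path>
# Steps:
#   1. git rev-list --objects --all   every reachable object, with a path for trees/blobs
#   2. git cat-file --batch-check     look up type and sizes; %(rest) echoes the path
#   3. awk                            keep blobs at or above the threshold, drop the type column
#   4. sort                           order by on-disk size, largest last
list_large_blobs() {
    threshold=${1:-50000}   # hypothetical default, borrowed from the suggestion in the text
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize:disk) %(objectsize) %(rest)' |
        awk -v min="$threshold" '$1 == "blob" && $3 >= min { sub(/^blob /, ""); print }' |
        sort -k2,2n
}
```

Run it from inside the repository being analysed, for example `list_large_blobs 50000`, and compare the blob ids it prints with those in the report.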
Filtering out large blobs
- Create a temporary work directory and export RFWORK_DIR to point to it (RFWORK_DIR defaults to the current directory).
- Again, run all commands from the repository being analysed.
- From the above report, edit down a list of blob ids that can be eliminated. Call this large-objects.txt.
- Generate a remove script:

      make-remove-blobs large-objects.txt >"$RFWORK_DIR"/remove-blobs.pl
      chmod +x "$RFWORK_DIR"/remove-blobs.pl

- Optionally edit the remove script to filter out, in the same pass, any paths that are not required.
- Run the filter branch:

      run-filter-branch

- Create a new "easy rebase" script for moving work-in-progress branches from the old history to the new history:

      make-mtnh >"$RFWORK_DIR"/move-to-new-history

- Push the rewritten refs and the rewrite-commit-map branch to all central repositories.
- Deploy move-to-new-history for users to use.
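The first step of this list, creating a work directory and exporting RFWORK_DIR, could be done along these lines. This is only a sketch: the helper name setup_rfwork_dir and the rfwork.XXXXXX directory template are assumptions for illustration, and the only thing taken from the text above is that the tools read RFWORK_DIR from the environment.

```shell
#!/bin/sh
# Sketch only: create a fresh scratch directory and export it as RFWORK_DIR.
# The function name and the directory name template are hypothetical.
setup_rfwork_dir() {
    # mktemp -d creates a uniquely named directory and prints its path.
    RFWORK_DIR=$(mktemp -d "${TMPDIR:-/tmp}/rfwork.XXXXXX") || return 1
    export RFWORK_DIR
    # Echo the location so the user can find the generated scripts later.
    printf '%s\n' "$RFWORK_DIR"
}
```

Call the function directly in the shell session that will run the tools, not inside `$(...)`, since an export made in a subshell does not reach the parent shell.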