Github Topologically sort db-dump.tar.gz by dtolnay · Pull Request #3409 · rust-...

 3 years ago
source link: https://github.com/rust-lang/crates.io/pull/3409

dtolnay commented 29 days ago

Before:

$ tar tf db-dump.tar.gz
2021-03-06-140039/
2021-03-06-140039/README.md
2021-03-06-140039/schema.sql
2021-03-06-140039/import.sql
2021-03-06-140039/export.sql
2021-03-06-140039/metadata.json
2021-03-06-140039/data
2021-03-06-140039/data/teams.csv
2021-03-06-140039/data/reserved_crate_names.csv
2021-03-06-140039/data/version_downloads.csv
2021-03-06-140039/data/keywords.csv
2021-03-06-140039/data/crates.csv
2021-03-06-140039/data/dependencies.csv
2021-03-06-140039/data/version_authors.csv
2021-03-06-140039/data/users.csv
2021-03-06-140039/data/crates_keywords.csv
2021-03-06-140039/data/versions.csv
2021-03-06-140039/data/categories.csv
2021-03-06-140039/data/metadata.csv
2021-03-06-140039/data/badges.csv
2021-03-06-140039/data/crate_owners.csv
2021-03-06-140039/data/crates_categories.csv

After:

$ tar tf db-dump.tar.gz
2021-03-06-140039
2021-03-06-140039/export.sql
2021-03-06-140039/import.sql
2021-03-06-140039/metadata.json
2021-03-06-140039/schema.sql
2021-03-06-140039/README.md
2021-03-06-140039/data
2021-03-06-140039/data/metadata.csv
2021-03-06-140039/data/reserved_crate_names.csv
2021-03-06-140039/data/categories.csv
2021-03-06-140039/data/teams.csv
2021-03-06-140039/data/keywords.csv
2021-03-06-140039/data/users.csv
2021-03-06-140039/data/crates.csv
2021-03-06-140039/data/crates_categories.csv
2021-03-06-140039/data/badges.csv
2021-03-06-140039/data/crates_keywords.csv
2021-03-06-140039/data/crate_owners.csv
2021-03-06-140039/data/versions.csv
2021-03-06-140039/data/version_authors.csv
2021-03-06-140039/data/dependencies.csv
2021-03-06-140039/data/version_downloads.csv

Notice that, for example, the dependencies table, which contains foreign keys into the crates and versions tables, used to appear in the tar BEFORE the versions table, which made a streaming import impossible. Now the files are sorted topologically, so every table appears after all of the tables that its foreign keys reference.
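
To illustrate the idea, here is a minimal sketch of ordering tables by their foreign-key references using Kahn's algorithm. It is not the code from this PR: the function name `topological_sort` and the shape of the input map are hypothetical, and the caller is assumed to supply, for each table, the list of tables its foreign keys point at.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Order tables so that each one comes after every table it references.
/// `deps` maps a table name to the tables its foreign keys point at
/// (hypothetical input shape, not taken from the crates.io code base).
/// Returns `None` if the references form a cycle.
fn topological_sort<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    // How many referenced tables each table is still waiting on.
    let mut pending: BTreeMap<&str, usize> = BTreeMap::new();
    // Reverse edges: referenced table -> tables that reference it.
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();

    for (&table, referenced) in deps {
        pending.entry(table).or_insert(0);
        // Deduplicate and ignore self-references, which cannot affect file order.
        let unique: BTreeSet<&str> = referenced
            .iter()
            .copied()
            .filter(|r| *r != table)
            .collect();
        for r in unique {
            pending.entry(r).or_insert(0);
            dependents.entry(r).or_default().push(table);
            *pending.get_mut(table).unwrap() += 1;
        }
    }

    // Start with the tables that reference nothing.
    let mut ready: VecDeque<&str> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(&t, _)| t)
        .collect();
    let mut order = Vec::with_capacity(pending.len());

    while let Some(table) = ready.pop_front() {
        order.push(table.to_string());
        if let Some(children) = dependents.get(table) {
            for &child in children {
                let n = pending.get_mut(child).unwrap();
                *n -= 1;
                if *n == 0 {
                    ready.push_back(child);
                }
            }
        }
    }

    // Any table left unemitted is part of a reference cycle.
    if order.len() == pending.len() {
        Some(order)
    } else {
        None
    }
}
```

Given a map in which dependencies references crates and versions (as described above) and versions references crates (an assumption made for this example), the function yields crates, versions, dependencies. That is the same relative order as in the "After" listing.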

Tested via cargo run --release --bin enqueue-job -- compress /path/to/2021-03-06-140039 using 287580b.

