Working with ZSTD Files

source link: https://fuzzyblog.io/blog/linux/2019/09/09/working-with-zst-files.html

Sep 9, 2019

I love open source developers, but there are times when I question their damn naming practices. I'm currently working with a giant data repository stored as a "Z Standard" or "zstd" compressed file. And while I know the name means "Z Standard", I can't help but read it as "Z std". Oy.

Anyway. Zstd is a compression algorithm open-sourced by Facebook, and it is strikingly effective. I've got over 100 gigs of JSON-encoded data stored in a 13.7 gig file. I'm aware that text compresses quite well, but 100 gigs squeezed into 13.7 gigs of space still feels like wow.
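
For the curious, here is that ratio worked out. This is only a quick sketch using the rounded sizes quoted above, not exact byte counts, and the variable names are just for illustration:

```shell
# Sizes in gigs, rounded as quoted above (not exact byte counts).
original=100
compressed=13.7

# Compression ratio: original size divided by compressed size.
ratio=$(awk -v o="$original" -v c="$compressed" 'BEGIN { printf "%.1f", o / c }')

# Space saved, as a percentage of the original size.
saved=$(awk -v o="$original" -v c="$compressed" 'BEGIN { printf "%.1f", (1 - c / o) * 100 }')

echo "ratio: ${ratio}:1, saved: ${saved}%"
# prints: ratio: 7.3:1, saved: 86.3%
```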

Tools

If you're on a Mac, then brew, as always, is your very best friend:

brew install zstd

Useful Command Lines

Assume that pol.zst is the name of the archive and it is located in your current directory.

Examining a handful of records:

zstd -cd pol.zst | head -n100

This dumps a stream of records to standard output, which is then fed into head to limit the output to the first 100 lines.

The zstd -c and -d options mean:

-c     : force write to standard output, even if it is the console
-d     : decompression
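
If you want to try these commands without a multi-gig archive on hand, a small test file can be built first. This is a minimal sketch: make_sample is a hypothetical helper, and the file name and record contents are invented for illustration (they are not the real data). It relies on zstd reading standard input when no input file is given, with -o naming the output file, -f allowing overwrite and -q silencing progress output.

```shell
# make_sample FILE N: write N one-line JSON records and compress them with zstd.
# Hypothetical helper; the record contents are made up for illustration.
make_sample() {
  file=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    printf '{"timestamp": %d, "comment": "comment %d"}\n' "$((1568000000 + i))" "$i"
    i=$((i + 1))
  done | zstd -q -f -o "$file"
}
```

For example, make_sample sample.zst 5 followed by zstd -cd sample.zst | head -n2 should show the first two of the five records.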

Integrating the often useful jq (here it pulls a single JSON element, timestamp, out of each record):

zstd -cd pol.zst | jq '.timestamp'

And like all good *nix pipelines, this is composable. This example extracts the first 1000 records and then reduces them to only the comment element from the JSON:

zstd -cd pol.zst | head -n1000 | jq '.comment'
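
The same pipeline can keep more than one element per record. A small sketch, assuming each record is a JSON object with timestamp and comment keys as in the examples above; pick_fields is a hypothetical helper name, and jq's -c flag prints each result as one compact line:

```shell
# pick_fields FILE N: take the first N records from a zstd archive and
# reduce each one to just its timestamp and comment elements.
# Hypothetical helper; assumes one JSON object per line.
pick_fields() {
  zstd -cd "$1" | head -n "$2" | jq -c '{timestamp, comment}'
}
```

Calling pick_fields pol.zst 1000 would then print one small object per record, which is handy for piping into other tools.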

To count the total records in the zst file (this assumes one record per line, as in the examples above):

zstd -cd pol.zst | wc -l
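
Counting lines means decompressing the whole archive, which takes a while on a big file. For a quicker look, zstd can also test an archive's integrity (-t) and list what it knows about it (-l). A hedged sketch, with check_archive as a hypothetical helper name; note that the listing may not show an uncompressed size if the archive was created from a stream:

```shell
# check_archive FILE: verify the archive decompresses cleanly, then print
# zstd's summary of it (frames, sizes, ratio where available).
# Hypothetical helper wrapping two standard zstd options.
check_archive() {
  zstd -t "$1" && zstd -l "$1"
}
```

Running check_archive pol.zst returns a non-zero exit status if the file is damaged or is not a zstd archive at all.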

Happily, help is also available with:

zstd --help

Thank You's

Kudos to Facebook for another great bit of Open Source contributed back to the world. Also thanks to Grant Vousden-Dishington, the contributor of these command lines. He's been doing Zstd for a while; I'm the noob here.


Posted In: #linux #zstd

