Multithreading for performance in shell scripts
source link: https://www.vidarholen.net/contents/blog/?tag=multithreading
Now that everyone and their grandmother has at least two cores, you can nearly double throughput by distributing the workload across them. However, multithreading support in pure shell scripts is terrible, even though scripts often do things that can take a while, like encoding a bunch of chip tunes to Ogg Vorbis:
mkdir ogg
for file in *.mod
do
    xmp -d wav -o - "$file" | oggenc -q 3 -o "ogg/$file.ogg" -
done
This is exactly the kind of operation that is conceptually trivial to parallelize, but not obvious to implement in a shell script. Sure, you could run them all in the background and wait for them, but that will give you a load average equal to the number of files. Not fun when there are hundreds of files.
You can run two (or however many) in the background, wait, and then start two more, but that’ll give terrible performance when the jobs aren’t of roughly equal length, since at the end of each batch the longest-running job will be blocking the other eager cores.
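The batch pattern just described can be sketched as follows. This is only an illustration: encode is a hypothetical stand-in for the real xmp and oggenc pipeline, and run_in_batches is a made-up helper name.

```shell
# Sketch of the "start two, wait, start two more" pattern.
# encode is a hypothetical stand-in for the xmp | oggenc pipeline.
encode() {
    echo "encoded $1"
}

run_in_batches() {
    batch_size=2
    running=0
    for item in "$@"
    do
        encode "$item" &
        running=$((running + 1))
        if [ "$running" -ge "$batch_size" ]
        then
            wait        # returns only once every job in the batch is done
            running=0
        fi
    done
    wait                # pick up a final, partial batch
}

run_in_batches a.mod b.mod c.mod
```

The wait inside the loop is the weak spot: a core that finishes its job early gets no new work until the slowest job of the batch is done.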
Instead of listing ways that won’t work, I’ll get to the point: GNU (and FreeBSD) xargs has a -P option for specifying the number of jobs to run in parallel!
Let’s rewrite that conversion loop to run in parallel:
mod2ogg() {
    for arg; do xmp -d wav -o - "$arg" | oggenc -q 3 -o "ogg/$arg.ogg" -; done
}
export -f mod2ogg
find . -name '*.mod' -print0 | xargs -0 -n 1 -P 2 bash -c 'mod2ogg "$@"' --
And if we already had a mod2ogg script, similar to the function just defined, it would have been simpler:
find . -name '*.mod' -print0 | xargs -0 -n 1 -P 2 mod2ogg
Voila. Twice as fast, and you can just increase the -P value with fancier hardware.
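Rather than hard-coding the job count, a script could ask the system how many cores are online. A minimal sketch, assuming either nproc (GNU coreutils) or getconf _NPROCESSORS_ONLN is available; the fallback of 2 and the use of echo as a stand-in for mod2ogg are choices made for this example, not part of the original script:

```shell
# Number of online CPU cores; fall back to 2 if neither tool is available.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)

# echo stands in for mod2ogg so the sketch runs without xmp or oggenc.
printf '%s\0' a.mod b.mod c.mod |
    xargs -0 -n 1 -P "$cores" echo would encode:
```

The output lines can appear in any order, since the jobs run concurrently.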
I also added -n 1 to xargs here, to ensure an even distribution of work. If the work units are so small that starting the command becomes a sizable portion of the run time, you can increase the -n value to make xargs run mod2ogg with more files at a time (which is why it’s a loop in the example).
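To see what a larger -n does, here is a small sketch in which an inline sh script stands in for mod2ogg and only reports how many file names it was handed. The helper name count_per_invocation and the file names are made up for the example.

```shell
# Sketch only: the inline sh script is a stand-in for mod2ogg and just
# reports how many file names each invocation received.
count_per_invocation() {
    printf '%s\0' "$@" |
        xargs -0 -n 2 -P 2 sh -c 'echo "invocation got $# file(s)"' _
}

count_per_invocation a.mod b.mod c.mod d.mod e.mod
```

With five names and -n 2, xargs starts three invocations: two receive two files each and one receives the remaining file. The order of the output lines may vary.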