Oil 0.10.0 - Can Unix Shell Error Handling Be Fixed Once and For All?

source link: https://www.oilshell.org/blog/2022/05/release-0.10.0.html

2022-05-05

This is the latest version of Oil, a Unix shell that's our upgrade path from bash:

Oil version 0.10.0 - Source tarballs and documentation.

To build and run it, follow the instructions in INSTALL.txt. The wiki has tips on How To Test OSH. If you're new to the project, see Why Create a New Shell? and posts tagged #FAQ.


Here are some comics for context. The first one describes set -e -u -o pipefail, which is sometimes called "bash strict mode".

I use it in all my shell scripts, but it's not enough. Strict mode has holes and pitfalls.

This release announcement describes what Oil does about it!

Error Handling Overhaul: try and _status

So, can shell's error handling be fixed once and for all? I believe Oil 0.10.0 has done this. It's the first shell with reliable error handling in 50 years :-)

Basic Idea

Recall that Oil is designed to be familiar to Python and JavaScript users. So a program should stop by default on any failure, like:

cp: cannot create regular file '/nonexistent': Permission denied
  cp myfile /nonexistent
  ^~
hello.sh:1: errexit PID 29556: Command failed with status 1
# shell exits with status 1

Additionally, Oil fixes the holes in shell and bash, and steers you away from the pitfalls.

You can also handle those failures in a custom way with the new try builtin:

try {
  cp myfile /nonexistent  # exit status may be non-zero
  var item = a[i]         # index may be out of range
}                         # try sets _status, not $?
if (_status !== 0) {
  echo 'error'
}

This work is now documented!

Oil vs. Shell Idioms > Error Handling. This comprehensive list of examples is the first stop for users.

Oil Fixes Shell's Error Handling (errexit). I spent over a week writing and revising this design doc and reference.

  • It explains 4 fundamental issues with shell and Unix, and enumerates 9 error handling pitfalls.
  • It describes Oil's new error handling constructs.
  • It has FAQs on language design and global options like command_sub_errexit.
  • The 4 fundamental issues:
    1. When is $? Set?
    2. What Does $? Mean?
    3. The Meaning of if
    4. Design Mistake: The Disabled errexit Quirk
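To make the fourth issue concrete, here is a minimal sketch of the disabled errexit quirk in plain bash (not Oil). The function name `deploy` is hypothetical and exists only for illustration; the inner script runs in a separate `bash -c` process so the surrounding shell's options are untouched.

```shell
#!/usr/bin/env bash
# Plain bash, not Oil. `deploy` is a hypothetical function for illustration.
out=$(bash -c '
  set -e
  deploy() {
    false                      # fails, yet errexit is ignored here
    echo "deploy kept going"   # so execution continues past the failure
  }
  if deploy; then              # using deploy as a condition disables errexit inside it
    echo "if saw success"
  fi
')
echo "$out"
```

Even with `set -e` on, both messages print: the failure of `false` is silently lost because the function was called as an `if` condition.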

I've also updated A Tour of the Oil Language, which explains the language from scratch.

Language Design Notes

Let's compare Oil with 2 popular languages:

  1. Unlike Go, errors in Oil are fatal by default.
  2. Oil code is as short as or shorter than Python, while still being explicit.

For example, ignoring a failure is often one line:

try ls /bad

rather than four lines:

try:
  myls('/bad')
except Exception:
  pass

Error handling code with 3 branches (true/false/error) is also shorter.

So even though the design is constrained by compatibility with both shell and bash, I'm very happy with how it turned out. This has been a consistent theme: there have been surprisingly few compromises in the Oil language!

  • For more notes on the language design, see the design doc and the appendix to this post.
  • Another appendix explains why Oil's error handling constructs changed in this release.

Backward Compatibility: The Eternal Puzzle

What do I mean by "Oil fixes shell's error handling"? I mean something pretty strong:

  1. OSH runs existing shell scripts as is, whether you use errexit or not.
    • It doesn't add any new error handling, because you may want to run your script under other shells.
  2. Oil has the correct defaults. It's a clean slate language that should be familiar to Python and JavaScript users.
    • Every failure is fatal by default.
    • You can handle errors using the new try builtin.
  3. There's an upgrade path from OSH to Oil.

Four Ways to Use OSH / Oil

What is the upgrade path? I recently figured out a concise way to explain how global options like shopt --set oil:basic work. There are four use cases for OSH and Oil:

  1. Run old scripts as is (no options)
  2. Improve old scripts while keeping compatibility with sh or bash (strict:all)
  3. Upgrade old scripts, dropping compatibility (oil:basic)
  4. Write new scripts (oil:all, or use bin/oil)
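As a rough sketch of use case 2, a script that must keep running under bash could turn on the strict group only when the shell understands it. This assumes OSH accepts the bash-style `shopt -s` spelling for option groups; under bash the group name is unknown, so the error is hidden and ignored.

```shell
#!/usr/bin/env bash
# Intended to enable the strict:all group under OSH (assumption: -s is accepted there).
# Under bash, shopt rejects the unknown name; the message is discarded and
# `|| true` keeps the line from failing, which also makes it safe under set -e.
shopt -s strict:all 2>/dev/null || true
compat_status=$?

echo "continuing, status=$compat_status"
```

The rest of the script stays ordinary sh or bash code, which is the point of this use case.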

This explanation is on the OSH versus Oil wiki page, and I should write a longer post with examples.

Good News: 50K Euros From NLNet

I applied for an NLNet grant in February, and we got it in April!

So now we have 50K euros to help pay a compiler engineer to translate Oil to C++. See the blog post last month:

Oil Is Being Implemented "Middle Out". I show evidence that Oil can be fast, and list five ways you can help.

Please Sponsor Oil

I'm glad we have this grant to kick things off. But it's likely not enough to pay someone to completely "own" the task of translating Oil to C++.

If you like this error handling work, or Oil in general, you should sponsor it:

The NLNet grant comes with the constraint that we hire someone in the European Union. I would like to also pay people in other parts of the world. If you understand compilers, C++, and Python, that person could be you!


I hear the feedback loud and clear that the docs are incomplete. However, they take a long time to write and revise, and I need help.

For example, I spent 4-5 days implementing the language changes this release, but documenting everything took even longer! But I think we've finally unraveled and documented the decades-old mystery of shell error handling.

So this project can be finished, but we need help. The easiest way to help is to donate.

Groundhog Day

This work goes back to the original motivation for the project: Shell made me a productive programmer, but I can't recommend it to my friends with a straight face! It has too many holes and pitfalls.

I noted in 2019 that David Korn, Tom Duff, and Richard Stallman complained about these same problems with shell in 1991 and 1994. Those complaints were closer to the creation of Unix in 1970 than we are to 1994!

So I view shell as a "groundhog day" problem -- we keep having the same conversations over and over again without making progress.

Have we lost our memory, and our collective will to build and fix things? Are we able to pass on knowledge to new generations of programmers?

Here are a couple recent examples.

Never-Ending Arguments About Shell

A recent "troll" blog post generated dozens of comments across multiple sites:

My comment on Please stop writing shell scripts (pythonspeed.com)
79 points, 112 comments on 2022-03-22 (lobste.rs)
38 points, 59 comments - 43 days ago (Hacker News)

This conversation is largely a waste of time, because people said the same things 10 and 20 years ago, and the situation hasn't changed. Quite the contrary: shell is more popular than ever.

(I should write a blog post about why shell is more common these days, and why it will be more common in the future. Short answer: scale, the increasing heterogeneity of interconnected systems, and the ratio between apps and operating systems.)

On the other hand, the comments re-affirm that many people use shell to solve problems you care about! Even if you don't use shell directly, you do want a better shell.

Comments:

I used to give the same advice, but I completely changed my opinion over the past 10 years or so. I eventually put in the time and learned shell scripting.

Y'all writing bash scripts without set -u and error checking?

Here's an example from 2008, which was largely before the rise of a cloud full of virtual machines and containers:

Unfortunately I don't think there's a really good Unix programming language to replace the Bourne shell, which is one of the reasons that writing programs in the Bourne shell remains so tempting

Prediction

In the comment threads for this release announcement, some people will react negatively because Oil is a shell. They won't understand that Oil fixes exactly the problems that make shell frustrating! We're on the same side.

They may also say that they "switch" to Python after 100 lines of shell. Given that shell is in such poor shape, this is reasonable! But I still want to write about The Shell XOR Python Fallacy (my related comment in the thread above).

Production Incidents

This recent blog post noted that a missing set -o pipefail caused a production incident at Cloudflare.

My comment on PIPEFAIL: How a missing shell option slowed Cloudflare down (blog.cloudflare.com)
64 points, 22 comments on 2022-04-05 (lobste.rs)
17 points, 8 comments - 29 days ago (Hacker News)

Note that pipefail is part of Oil's option groups oil:basic and oil:all, so using Oil would help here.
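For readers who haven't hit this class of bug, here is a small sketch in plain bash (not Oil, and not the script from the Cloudflare post) showing what the option changes. Each pipeline runs in its own `bash -c` process so the option doesn't leak into the calling shell.

```shell
#!/usr/bin/env bash
# Plain bash, not Oil. The same failing pipeline, with and without pipefail.
without=$(bash -c 'false | cat; echo "status=$?"')
with=$(bash -c 'set -o pipefail; false | cat; echo "status=$?"')

echo "default:  $without"   # the status of cat wins, so the failure is hidden
echo "pipefail: $with"      # the failure of false is reported
```

By default a pipeline's status is that of its last command, so a failure on the left side never reaches errexit.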


It reminded me of this similar post from 2017:

When Bash scripts bite (janestreet.com via Hacker News)
233 points, 139 comments - on May 12, 2017

This is the problem that Oil's command_sub_errexit fixes. Moreover, Oil is the only shell with this option.
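The general shape of the problem can be sketched in plain bash (not Oil). This is an illustration of the pitfall, not the script from the linked post, and the function name `show` is hypothetical.

```shell
#!/usr/bin/env bash
# Plain bash, not Oil. Failures inside $(...) are lost even with errexit on.
out=$(bash -c '
  set -e
  show() {
    local d=$(false)           # local itself returns 0, hiding the failure
    echo "after local: $?"
  }
  show
  echo "value=[$(false)]"      # echo returns 0, so errexit never fires
  echo "reached end"
')
echo "$out"
```

The script runs to completion even though two command substitutions failed. An option like command_sub_errexit is meant to turn such failures into fatal errors.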


There are many other posts in this genre -- feel free to reference them in the comments. Again, Oil patches all the holes in shell. If you disagree, please file a bug.

Acknowledgements

  • Thanks to ca2013 and Aidenn0 for extensive proofreading of the error handling doc.
  • Thanks to Albin Otterhäll for feedback on the Oil language. Aside from the error handling changes, this release removes bare assignment from Oil (shopt --set parse_equals), leaving it for the "configuration dialect".
  • Thanks to Nathan Sketch and glyh for contributions to the prior release, which I didn't announce.
    • 3d489ab8 Nathan Sketch re2c patterns should never match NUL bytes (#1095)
    • adaddb2c glyh Implement jobs -p (#1098)
  • Thanks to Julia Evans for the great comics (which are not affiliated with Oil). Here is more material to check out:
    • Comics -- search for bash.
    • Bite Size Bash. I don't have this one, but I bought print copies of others, and they are great!

What's Next?

I look forward to feedback on Oil's error handling features. I expect that the next few months will be filled with recruiting and fundraising.

  • To meet our goals, we probably need a Why Sponsor Oil? link on every page of the site. I don't want this to annoy people who have already donated!
  • I want to write Brief Descriptions of mycpp and send it to Python experts. As mentioned in the cross reference, this tool is a hybrid between the recent mypyc compiler and the old Shed Skin compiler.

This may help find the people with the right skill sets and interest. Again, check out Oil Is Being Implemented "Middle Out" as well as Compiler Engineer Job on the wiki!

Appendices

Why Change try?

Here's some background on the language changes in this release.

In October 2020, I said that one of the Four Features That Justify a Unix Shell is reliable error handling.

At that point, I had implemented most of what these new docs describe, like the command_sub_errexit and strict_errexit options. However:

  • The try builtin was awkward.
    • There was an idiom if try myproc, along with flags --assign and --allow-status-01. This worked, and was shell-like, but I believe it was unfamiliar to many programmers. Remember that Oil is for Python and JavaScript users that avoid shell.
    • try only accepted a simple command, not a block. It would force you to use small functions, possibly breaking the flow of the code.
  • We fixed error handling for OSH, but not Oil. It wasn't clear how to handle errors in Oil expressions like 42 / 0 (divide by zero) and a[i] (index out of bounds). The new try builtin handles both command and expression failures consistently.
  • The work was only lightly documented. I spent a while writing that announcement, and then I was eager to work on the garbage collector.

As a result, few users tried it. So I've learned my lesson with respect to documentation-driven development! A feature isn't done until it's documented and users give feedback.

So it's important that you download Oil 0.10.0, read the docs, and give feedback on it. Again, I claim that this is the first shell in 50 years with reliable error handling. Prove me wrong!


Also, ca2013 reported a bad error message with strict_errexit, which led me to overhaul it. It's now stricter, which makes idiomatic Oil code more straightforward and consistent. I avoided the "meta-pitfall".

More Design Notes

Here are a couple notes for language designers, and readers following the #software-architecture concepts on this blog.

Should a Shell Have Exceptions?

There is no notion of Python-like exceptions in Oil. This avoids what I call a Perlis-Thompson problem. When you add new features to a language, they have to compose with the old ones.

I also like this design because Oil remains a thin layer over the kernel. The exit status of processes is a kernel concept, and it's naturally extended to shell builtins and functions.

The Meta Language Influences the Language

In hindsight, the design of try is obvious and has been missing from shell for decades!

For example, in Oil's own test harnesses, I frequently need to handle errors from shell functions while errexit is on. This is difficult to do correctly in shell, but try makes it easy.
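To show why this is awkward, here is a sketch of one commonly cited workaround in plain bash (not Oil): temporarily turn errexit off, run the function in a subshell that turns it back on, and save the status. The function name `build` is hypothetical, and this is not necessarily how Oil's own test harnesses did it.

```shell
#!/usr/bin/env bash
# Plain bash, not Oil. `build` is a hypothetical function for illustration.
out=$(bash -c '
  set -e
  build() {
    false                      # should abort the function
    echo "should not print"
  }
  set +e                       # stop the parent from exiting on failure
  ( set -e; build )            # subshell with errexit on, not used as a condition
  status=$?
  set -e                       # restore errexit for the rest of the script
  echo "build status=$status"
')
echo "$out"
```

That is four lines of ceremony around one call, and forgetting any of them quietly changes the behavior, which is the kind of pitfall a block-based try is meant to remove.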

So why did no other shell come up with this solution? I believe a major reason is that they're tree interpreters written in C. Error recovery is hard in C, so shells have avoided language features that require it!

In contrast, Python exceptions are easy to use, and we use them to implement Oil's try. (But again there's no notion of exceptions in Oil itself.)

So this is highly related to the point about metalanguages I've been making, particularly in the last post. In the appendix, I mention the draft of Shell's Implementation Language Has Always Been a Problem, where I further justify this point. This is justified by comments on longjmp() by Stephen Bourne, and a comment on Lisp exceptions in the bash source code.


On the other hand, why didn't I come up with this solution the first time? I think the original design was actually too minimal, focusing on slight adjustments to if. Also, language design is just hard :-)

Closed Issues

This release closed 6 issues. You can also view the full changelog.

#1113 strict_errexit can occur in a child process, which doesn't abort the whole script
#1111 error handling changes: rename try -> bool, new try that takes a block
#1107 strict_errexit error message points to the wrong line
#1106 Unexpected block args in builtins should be errors
#942 test and document var x = $(false)
#937 shopt parse_equals should only be on in config mode

Metrics for the 0.10.0 Release

These metrics help me keep track of the project. Let's compare this release with version 0.9.7, which I discussed in January Release Notes and Themes.

Spec Tests

The spec test suites for both OSH and Oil continue to expand and turn green.


Source Code Size

Despite the new features, the code is still compact. Significant lines:

Physical lines:

Benchmarks

The oil-native Parser performance hasn't changed:

Runtime performance for the Python / "OVM" build:

These measurements are noisy, but this looks like a regression. There's a change on both machines, even though the size is different. It could be related to some of the runtime checks for error handling.

I will keep an eye on it, but what matters is the speed of oil-native, not the speed of this reference implementation. We already know this is too slow!

Native Code Metrics

I didn't work on the translation during this release, but we're not regressing:

The following deltas are proportional. Generated source lines:

Binary size:

