

People should aim to make “badly written” code “just work” (2009)
source link: https://lwn.net/Articles/326505/

On Wed, 25 Mar 2009, Kyle Moffett wrote:
>
> Well, I think the goal is not to *replace* the POSIX API or even
> provide "transactional" guarantees. The performance penalty for
> atomic transactions is pretty high, and most programs (like GIT) don't
> really give a damn, as they provide that on a higher level.

Speaking with my 'git' hat on, I can tell that

 - git was designed to have almost minimal requirements from the filesystem, and to not do anything even half-way clever.

 - despite that, we've hit an absolute metric sh*tload of filesystem bugs and misfeatures. Some very much in Linux. And some I bet git was the first to ever notice, exactly because git tries to be really anal, in ways that I can pretty much guarantee no normal program _ever_ is.

For example, the latest one came from git actually checking the error code from 'close()'. Tell me the last time you saw anybody do that in a real program. Hint: it's just not done. EVER. Git does it (and even then, git does it only for the core git object files that we care about so much), and we found a real data-loss CIFS bug thanks to that. Afaik, the bug has been there for a year and half. Don't tell me nobody uses cifs.

Before that, we had cross-directory rename bugs. Or the inexplicable "pread() doesn't work correctly on HP-UX". Or the "readdir() returns the same entry multiple times" bug. And all of this without ever doing anything even _remotely_ odd. No file locking, no rewriting of old files, no lseek()ing in directories, no nothing.

Anybody who wants more complex and subtle filesystem interfaces is just crazy. Not only will they never get used, they'll definitely not be stable.

> To be honest I think we could provide much better data consistency
> guarantees and remove a lot of fsync() calls with just a basic
> per-filesystem barrier() call.

The problem is not that we have a lot of fsync() calls. Quite the reverse. fsync() is really really rare. So is being careful in general. The number of applications that do even the _minimal_ safety-net of "create new file, rename it atomically over an old one" is basically zero. Almost everybody ends up rewriting files with something like

	open(name, O_CREAT | O_TRUNC, 0666)
	write();
	close();

where there isn't an fsync in sight, nor any "create temp file", nor likely even any real error checking on the write(), much less the close().

And if we have a Linux-specific magic system call or sync action, it's going to be even more rarely used than fsync(). Do you think anybody really uses the OS X FSYNC_FULL ioctl? Nope. Outside of a few databases, it is almost certainly not going to be used, and fsync() will not be reliable in general.

So rather than come up with new barriers that nobody will use, filesystem people should aim to make "badly written" code "just work" unless people are really really unlucky. Because like it or not, that's what 99% of all code is.

The undeniable FACT that people don't tend to check errors from close() should, for example, mean that delayed allocation must still track disk full conditions, for example. If your filesystem returns ENOSPC at close() rather than at write(), you just lost error coverage for disk full cases from 90% of all apps. It's that simple.

Crying that it's an application bug is like crying over the speed of light: you should deal with *reality*, not what you wish reality was. Same goes for any complaints that "people should write a temp-file, fsync it, and rename it over the original". You may wish that was what they did, but reality is that "open(filename, O_TRUNC | O_CREAT, 0666)" thing.

Harsh, I know.

And in the end, even the _good_ applications will decide that it's not worth the performance penalty of doing an fsync(). In git, for example, where we generally try to be very very very careful, 'fsync()' on the object files is turned off by default.

Why? Because turning it on results in unacceptable behavior on ext3. Now, admittedly, the git design means that a lost new DB file isn't deadly, just potentially very very annoying and confusing - you may have to roll back and re-do your operation by hand, and you have to know enough to be able to do it in the first place.

The point here? Sometimes those filesystem people who say "you must use fsync() to get well-defined semantics" are the same people who SCREWED IT UP SO DAMN BADLY THAT FSYNC ISN'T ACTUALLY REALISTICALLY USEABLE!

Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

		Linus
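
For reference, the "write a temp-file, fsync it, and rename it over the original" sequence that the message says almost no application performs might look roughly like the sketch below. This is only an illustration and not code from git or from the original message: the helper name `replace_file` and the fixed ".tmp" suffix are assumptions made here for brevity (a real program would more likely use mkstemp() and might also fsync the containing directory).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and ".tmp" suffix are assumptions, not from
 * the original message): replace `path` with `len` bytes from `buf`
 * without ever truncating the old file in place.
 * Returns 0 on success, -1 on failure with errno set.
 */
int replace_file(const char *path, const void *buf, size_t len)
{
    char tmp[4096];
    const char *p = buf;
    int fd, rc, saved;

    /* Build the temporary name next to the target so rename() stays
     * within one filesystem. */
    rc = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (rc < 0 || (size_t)rc >= sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    /* Check every write(): short writes are retried, errors abort. */
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += w;
        len -= (size_t)w;
    }

    /* Push the data to stable storage before the rename makes it visible. */
    if (fsync(fd) < 0)
        goto fail;

    /* Check close() too: some filesystems report errors only here. */
    rc = close(fd);
    fd = -1;
    if (rc < 0)
        goto fail;

    /* Atomically swap the new file into place. */
    if (rename(tmp, path) < 0)
        goto fail;
    return 0;

fail:
    saved = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmp);          /* leave the old file untouched */
    errno = saved;
    return -1;
}
```

A caller would use something like `replace_file("config", data, size)` in place of the open/write/close sequence quoted above. The fsync() call is exactly the step the message describes as too slow on ext3 to leave enabled, so whether to keep it is a trade-off rather than a given.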