2

TL;DR: Memory-Model Recommendations for Rusting the Linux Kernel

 2 years ago
source link: https://paulmck.livejournal.com/65341.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
These recommendations assume that the initial Linux-kernel targets for Rust developers are device drivers that do not have unusual performance and scalability requirements, meaning that wrappering of small C-language functions is tolerable. (Please note that most device drivers fit into this category.) It also assumes that the main goal is to reduce memory-safety bugs, although other bugs might be addressed as well. Or, Murphy being Murphy, created as well. But that is a risk in all software development, not just Rust in the Linux kernel.

Those interested in getting Rust into Linux-kernel device drivers sooner rather than later should look at the short-term recommendations, while those interested in extending Rust's (and, for that matter, C's) concurrency capabilities might be more interested in the long-term recommendations.

Short-Term Recommendations

The goal here is to allow the Rust language considerable memory-model flexibility while also providing Rust considerable freedom in what it might be used for within the Linux kernel. The recommendations are as follows:
  1. Provide wrappers around the existing Linux kernel's locking, MMIO, I/O-barrier, and I/O-access primitives. If I was doing this work, I would add wrappers incrementally as they were actually needed, but I freely admit that there are benefits to providing a full set of wrappers from the get-go.
  2. Either wrapper READ_ONCE() or be careful to take Alpha's, Itanium's, and ARMv8's requirements into account as needed. The situation with WRITE_ONCE() appears more straightforward. The same considerations apply to the atomic_read() and atomic_set() family of primitives.
  3. Atomic read-modify-write primitives should be made available to Rust programs via wrappers around C code. But who knows? Maybe it is once again time to try out intrinsics. Again, if I was doing the work, I would add wrappers/implementations incrementally.
  4. Although there has been significant discussion surrounding how sequence locking and RCU might be handled by Rust code, more work appears to be needed. For one thing, it is not clear that people proposing solutions are aware of the wide range of Linux-kernel use cases for these primitives. In the meantime, I recommend keeping direct use of sequence locking and RCU in C code, and providing Rust wrappers for the resulting higher-level APIs. In this case, it might be wise to make good use of compiler directives in order to limit Rust's ability to apply code-motion optimizations.
  5. Control dependencies should be confined to C code. If needed, higher-level APIs whose C-language implementations require control dependencies can be wrappered for Rust use. But my best guess is that it will be some time before Rust code needs control dependencies.
Taking this approach should avoid memory-model-mismatch issues between Rust and C code in the Linux kernel.

Of course, situations that do not fit this set of recommendations can be addressed on a case-by-case basis. I would of course be happy to help.

Long-Term Recomendations

This section takes a more utopian view. Please note that middle-of-the-road approaches are likely to be useful, for example, those that simply provide wrappers around the C-language implementations.

What would a perfect Rust sequence-locking implementation/wrapper look like? Here are some off-the-cuff desiderata, none of which are met by the current C-code Linux-kernel implementation:
  1. Use of quantities computed from sequence-lock-protected variables in a failed reader should result in a warning. But please note that reliably associating variables with sequence locks may not be easy.
  2. Improper access to variables not protected by the sequence lock should result in a warning. But please note that it is quite difficult to define "improper" in this context, let alone detect it in real code.
  3. Data races in failed sequence-lock readers should not cause failures. But please note that this is extremely difficult in general if the data races involve non-volatile C-language accesses in the reader. For example, the compiler would be within its rights to refetch the value after the old value had been checked. (This is why the sequence-locking post suggests marked accesses to sequence-lock-protected variables.)
  4. Data races involving sequence-locking updaters are detected, for example, via KCSAN.
As noted elsewhere, use of a wrapper around the existing C-language implementation allows C and Rust to use the same sequence lock. This might or might not prove to be important.

Similarly, what would a perfect Rust RCU wrapper look like? Again, here are some off-the-cuff desiderata, similarly unmet by existing C-code Linux-kernel implementations:
  1. Make call_rcu() and friends cause the specified object to make an end-of-grace-period transition in other threads' ownership from readers to unowned. Again, not an immediate transition at the time of call_rcu() invocation, but rather a deferred transition that takes place at the end of some future grace period.
  2. Some Rust notion of type safety would be useful in slab caches flagged as SLAB_TYPESAFE_BY_RCU.
  3. Some Rust notion of existence guarantee would be useful for RCU readers.
  4. Detect pointers to RCU-protected objects being improperly leaked from RCU read-side critical sections, where proper leakage is possible via locks and reference counters.
  5. Detect mishandling of dependency-carrying pointers returned by rcu_dereference() and friends. Or perhaps introduce some marking such pointers to that the compiler will avoid breaking the ordering. See the address/data dependency post for a list of C++ working papers moving towards this goal.
There has recently been some work attempting to make the C compiler understand control dependencies. Perhaps Rust could make something useful happen in this area, perhaps by providing markings that allow the compiler to associate the reads with the corresponding control-dependent writes.

Perhaps Rust can better understand more of the ownership schemes that are used within the Linux kernel.

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK