Gaynor: Buffers on the edge: Python and Rust

source link: https://lwn.net/Articles/912181/

[Posted October 24, 2022 by corbet]
Alex Gaynor examines the awkwardness that comes when trying to interface Python and Rust code.
> The challenge is that if you want to pass some bytes to a Rust library to parse them (or do any other processing for that matter), the library almost certainly expects a &[u8], and there’s no way to turn a &[ReadOnlyCell<u8>] into a &[u8] safely, without allocating and copying. And of course, the whole point of the Python buffer protocol is to avoid these sorts of inefficiencies.
>
> Therefore, the regrettable solution is that, right now, there is no way to have all three of: efficiency, interoperability, and soundness.
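
The copying fallback mentioned in the quote can be sketched roughly as follows. This is only an illustration under stated assumptions: `std::cell::Cell<u8>` stands in for pyo3's `ReadOnlyCell<u8>`, and `checksum`, `snapshot`, and `checksum_shared` are hypothetical names, not part of any real library.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for "a Rust library that expects a &[u8]".
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Copy every byte out of the shared cells into an owned buffer.
/// This is the allocation and copy that the buffer protocol is meant to avoid.
fn snapshot(cells: &[Cell<u8>]) -> Vec<u8> {
    cells.iter().map(Cell::get).collect()
}

/// Sound but not zero-copy: the library only ever sees the private copy,
/// so later writes through the cells cannot affect the parse.
fn checksum_shared(cells: &[Cell<u8>]) -> u32 {
    let owned = snapshot(cells);
    checksum(&owned)
}
```

The sketch keeps soundness and interoperability and gives up efficiency, which is one of the three possible "pick two" outcomes.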


Gaynor: Buffers on the edge: Python and Rust

Posted Oct 24, 2022 15:15 UTC (Mon) by floppus (subscriber, #137245) [Link]

It's a valid point. But are there *any* languages that are interoperable with Rust in the way the author describes?

Naively, it seems to me that if this hypothetical parsing library intends to be used by other general-purpose languages, then it ought to learn how to parse a &[ReadOnlyCell<u8>].
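
One way such a library could accept both kinds of input is to be generic over a byte source rather than tied to a slice type. The following is a rough sketch, assuming `std::cell::Cell<u8>` as a stand-in for `ReadOnlyCell<u8>`; the function names are hypothetical.

```rust
use std::cell::Cell;

/// Core routine written against any stream of bytes, not a particular slice type.
fn count_lines<I: IntoIterator<Item = u8>>(bytes: I) -> usize {
    bytes.into_iter().filter(|&b| b == b'\n').count()
}

/// Entry point for ordinary immutable Rust buffers.
fn count_lines_slice(data: &[u8]) -> usize {
    count_lines(data.iter().copied())
}

/// Entry point for shared buffers: each byte is read through the cell exactly once.
fn count_lines_cells(data: &[Cell<u8>]) -> usize {
    count_lines(data.iter().map(Cell::get))
}
```

This avoids the copy, but a parser that reads byte by byte gives up slice-based fast paths, and it can still observe a buffer that changes between reads.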

Gaynor: Buffers on the edge: Python and Rust

Posted Oct 24, 2022 16:15 UTC (Mon) by matthias (subscriber, #94967) [Link]

The problem seems to be on the Python side of the interface. ReadOnlyCell<T> is defined in the pyo3 crate, which provides the Rust bindings for Python. And the definition says:
> &ReadOnlyCell<T> is basically a safe version of *const T: The data cannot be modified through the reference, but other references may be modifying the data.

Thus, to avoid UB, it is necessary to guard all accesses with a lock. But this lock has to be provided by Python, as the Rust code cannot know which other accesses are happening to the buffer.
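
In pure Rust, the lock-guarded access described above might look like the sketch below. It assumes the buffer owner hands out a `Mutex<Vec<u8>>`, which is not something the Python buffer protocol provides; `sum_bytes` and `sum_locked` are hypothetical names.

```rust
use std::sync::Mutex;

/// Hypothetical library function that expects a plain &[u8].
fn sum_bytes(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// While the guard is alive no other thread can write to the buffer,
/// so lending the contents out as &[u8] is sound and needs no copy.
fn sum_locked(buf: &Mutex<Vec<u8>>) -> u32 {
    let guard = buf.lock().unwrap();
    sum_bytes(&guard)
}
```

The point of the sketch is that the lock has to belong to whoever owns the buffer. Rust code on the receiving end cannot retrofit one.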

The hypothetical parsing library can be called by other general-purpose languages just fine, even if it expects a &[u8]. The caller just needs to uphold the Rust safety properties, i.e., there may not be any mutable references to the buffer while the Rust function is running. For C code this is a non-issue. You just use *const T as the type on the C side. If the programmer chooses to mutate the buffer, then this is UB. But in C, the programmer is responsible for avoiding any UB. Therefore, it is perfectly valid to just assume that the programmer will never do this.
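
A C-facing wrapper of this kind could be sketched as below. The parser and the exported function are hypothetical; the important part is the safety contract in the doc comment, which shifts the "no concurrent mutation" obligation onto the caller.

```rust
use std::slice;

/// Hypothetical parser: the first byte is a length, followed by that many payload bytes.
fn parse_len_prefixed(data: &[u8]) -> Option<&[u8]> {
    let (&len, rest) = data.split_first()?;
    rest.get(..usize::from(len))
}

/// Returns the payload length, or -1 on malformed input or a null pointer.
///
/// # Safety
/// `ptr` must point to `len` readable bytes, and nothing may write to them
/// until this function returns. The caller promises this; Rust cannot check it.
/// (A real export would also need a no_mangle attribute so C can link to it.)
pub unsafe extern "C" fn payload_len(ptr: *const u8, len: usize) -> isize {
    if ptr.is_null() {
        return -1;
    }
    // Sound only under the contract above: the bytes are treated as immutable.
    let data = unsafe { slice::from_raw_parts(ptr, len) };
    match parse_len_prefixed(data) {
        Some(payload) => payload.len() as isize,
        None => -1,
    }
}
```

On the C side the matching declaration would take a `const uint8_t *` and a `size_t`. A Python binding cannot make the same promise, because other Python code may hold a writable view of the same buffer.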

And languages that care about memory safety should already provide the means to ensure that there cannot be any data races with respect to the buffer. It is a bit strange that the Python interface does not have such means.

Pick two, any two

Posted Oct 24, 2022 15:17 UTC (Mon) by Wol (subscriber, #4433) [Link]

This appears to be a fundamental law of life ...

:-)
Cheers,
Wol

