Buffers on the edge: Python and Rust
One of my least favorite kinds of bug is when two different systems are interacting and the result has bad behavior but it’s difficult to say which (if either!) system is at fault. This is one of those stories, about Python’s buffer protocol and Rust’s memory model.
Python buffer protocol
Python’s Buffer Protocol is a
set of APIs which allow Python objects to expose their backing memory, so that
0-copy interoperability is possible between different data structures. For
example, they can be used to seamlessly share memory between an image parsing
library and numpy. They also support various metadata to enable more advanced
interoperability, such as multi-dimensional arrays and arrays of different
types. But for the rest of this post, we’re going to pretend they’re just a
uint8_t * and a length, for simplicity.
If you have a Python object and want to obtain its buffer, you can do so with
memoryview in Python or PyObject_GetBuffer in C. If you’re defining a class
and want to expose a buffer, you can do so in Python by… actually, for most of
Python’s history you couldn’t: only classes implemented in C could implement
the buffer protocol (Python 3.12 added a __buffer__ method for Python classes).
To implement the buffer protocol in C, you provide the bf_getbuffer and
bf_releasebuffer functions, which are called to obtain a buffer from an object
and when that buffer is being released, respectively.
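As a minimal sketch of the consumer side, the snippet below obtains a buffer from a bytearray with memoryview and then releases it. The variable names are illustrative only; the behavior shown is that of the built-in bytearray type, and other exporters may behave differently.

```python
# bytearray implements the buffer protocol; memoryview() requests its buffer.
data = bytearray(b"hello world")
view = memoryview(data)

# The view aliases the bytearray's memory, so no bytes are copied.
view[0] = ord("H")
print(data)  # the write made through the view is visible in `data`

# While a buffer is exported, bytearray refuses to resize itself,
# since resizing could move the memory out from under the view.
try:
    data.extend(b"!")
    resize_blocked = False
except BufferError:
    resize_blocked = True

# Releasing the view tells the exporter the buffer is no longer in use.
view.release()
data.extend(b"!")
```

Note that the exporter only guards against resizing here; it does nothing to stop writes to the bytes themselves.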
Shifting gears slightly, let’s talk about data races. Data races are a type of race condition that happens when a write and a read or write to the same address occur from different threads without synchronization. Synchronization could mean a lock, or an explicit atomic operation. Data races are undefined behavior in C [1]. Undefined behavior is Latin for “the code will often do what you want, but the compiler is free to do whatever it wants including cause security vulnerabilities”. Undefined behavior should be avoided.
Are data races possible with Python buffer objects? Sadly, yes. Imagine we have an object which implements the buffer protocol, and we request two buffers from it. This will give us two pointers to the same memory location. Now we spin up two threads, one reading from the buffer and the other writing to it. We’ve got ourselves a data race.
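The aliasing half of that scenario is easy to see from Python. This small sketch (names are illustrative) requests two buffers from one bytearray; both are writable, and both point at the same bytes.

```python
shared = bytearray(4)

# Two independent buffer requests against the same object.
reader = memoryview(shared)
writer = memoryview(shared)

# Nothing stops both views from being writable at the same time.
both_writable = not reader.readonly and not writer.readonly

# A write through one view is observed through the other.
writer[0] = 0xFF
seen_by_reader = reader[0]
```

Run on a single thread this is harmless; the trouble starts once the two views are used from different threads without synchronization.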
Perhaps you are thinking, “doesn’t the GIL prevent this?” If we are imagining pure Python code, then yes, the GIL would prevent this – it’s a lock, which means accesses are synchronized. But one of the goals of the buffer protocol is to allow C extensions to release the GIL while processing buffers. Therefore the reads and writes to our buffer could be coming from a C extension which has released the GIL – now we have no synchronization.
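One way to set up this situation using only the standard library is sketched below. It assumes CPython's hashlib, whose documentation states that the GIL is released while hashing inputs larger than 2047 bytes. The buffer size and names are arbitrary choices for illustration, and the digest is deliberately not predicted: it depends on how the unsynchronized write interleaves with the hash.

```python
import hashlib
import threading

SIZE = 32 * 1024 * 1024  # large enough that hashing runs without the GIL
buf = bytearray(SIZE)
digests = []

def hash_buffer():
    # hashlib requests buf's buffer and reads it from C code.
    digests.append(hashlib.sha256(buf).digest())

reader = threading.Thread(target=hash_buffer)
reader.start()

# Meanwhile, this thread keeps writing to the same memory.
# No lock orders these writes against the reads made by the hash.
flips = 0
while reader.is_alive():
    buf[SIZE // 2] ^= 0xFF
    flips += 1

reader.join()
digest = digests[0]
print(digest.hex(), flips)
```

Both threads are doing something individually reasonable (hashing a buffer, writing a byte), which is exactly why it is hard to say where the bug lives.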
If we imagine that our reading and writing code comes from the same C extension we might say that’s a bug in the extension. But what if they come from totally separate packages (the point of the buffer protocol!)? Neither side is buggy, it’s totally correct to either read or write from a buffer. That’d mean the Python code which invoked them in parallel was buggy. But Python code (even buggy Python code!) is not supposed to be able to trigger C-level undefined behavior, that’s part of the point of using a high-level language like Python. It seems the design of the buffer protocol and C’s notion of data race undefined behavior may not play nicely with this.
Rust
Let’s talk about Rust. In Rust a sequence of objects in memory, and their
length, are represented with a slice. A slice of bytes is written &[u8], or
&mut [u8] for a mutable slice. Rust implements a simple (but powerful!)
rule: references may be mutable XOR shared. This means that if a mutable
reference to memory exists then that reference must be the only one that
exists, and vice versa if there are multiple references to a piece of memory
then they must be immutable references. This has basic implications: for
example, if you have a &[u8], it must be the case that no one is mutating it
behind your back, because there can’t be any mutable references. This is
different from C/C++’s notion of const, which means “this reference may not be
used for mutation, but other mutable references may exist”.
Rust also introduces a notion of “soundness”, which is a stronger notion than
C’s undefined behavior. Most C undefined behavior is defined by what happens
at runtime. Soundness is about how a function could be used, regardless of
how it actually is used. A function is sound if it’s impossible to trigger
undefined behavior with any combination of arguments it accepts. Conversely, a
function is unsound if it’s a safe function (i.e. not declared unsafe fn) and
it’s possible to trigger undefined behavior with it. The Rust community
considers all instances of unsoundness to be security issues, even if they’re
improbable in practice. Using unsafe to violate the mutable XOR shared rule is
undefined behavior, and a safe function that allows it is therefore unsound.
Putting it all together
If we have a Python buffer, and we want to represent the data in Rust, how
should we do so? The natural answer would be a &[u8], but as you may have
picked up, in the face of the possibility of concurrent writes, this would be
unsound. Similarly, a &mut [u8] is unacceptable because the Python buffer
protocol provides no assurances that only one mutable buffer is handed out at
a time. Importantly, because Rust’s notion of unsoundness is a source-code
level concern, the code would still be unsound even if Python code never
actually creates multiple buffers in this fashion.
pyo3 is a popular Rust library for binding to the CPython C-API. Its solution
to this is interior mutability, which is a pattern in Rust code where
structures safely encapsulate mutation behind shared references. In pyo3, a
Python buffer’s contents are represented as &[ReadOnlyCell<u8>].
This is safe and sound, but unfortunately struggles with interoperability.
The challenge is that if you want to pass some bytes to a Rust library to
parse them (or do any other processing for that matter), the library almost
certainly expects a &[u8], and there’s no way to turn a &[ReadOnlyCell<u8>]
into a &[u8] safely, without allocating and copying.
And of course, the whole point of the Python buffer protocol is to avoid these
sorts of inefficiencies.
Therefore, the regrettable conclusion is that, right now, there is no way to have all three of: efficiency, interoperability, and soundness.
A better future?
That’s the current state of the world. What could we do to improve things?
The simplest answer I can come up with is for Python’s buffer protocol to
implement Rust’s mutable XOR shared semantics. Providing such semantics would
also address the possibility of undefined behavior from C code. It could
further be done in a backwards compatible way by providing a flag that allows
implementors of the buffer protocol to signal that they provide these
semantics, and thus that their buffers can safely be represented as &[u8]. In
fact, an implementor of the buffer protocol could provide these semantics
today; the only problem is that code requesting a buffer would have no way of
knowing that they were adhered to.
Perhaps there are other solutions that address this problem too! I’m very excited to hear other people’s thoughts on how we can address this. As Python extension modules written in Rust become more prominent, finding an efficient, interoperable, and sound way of handling Python buffers in Rust will become important.
[1] The exact language of the spec is: “The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.” C11 (ISO/IEC 9899:2011), section 5.1.2.4.