In our previous post, we announced that Android now supports the Rust programming language for developing the OS itself. Related to this, we are also participating in the effort to evaluate the use of Rust as a supported language for developing the Linux kernel. In this post, we discuss some technical aspects of this work using a few simple examples.
C has been the language of choice for writing kernels for almost half a century because it offers the level of control and predictable performance required by such a critical component. The density of memory safety bugs in the Linux kernel is generally quite low due to high code quality, high standards of code review, and carefully implemented safeguards. However, memory safety bugs do still occur regularly. On Android, vulnerabilities in the kernel are generally considered high-severity because the kernel runs in privileged mode, so they can result in a bypass of the security model.
We feel that Rust is now ready to join C as a practical language for implementing the kernel. It can help us reduce the number of potential bugs and security vulnerabilities in privileged code while playing nicely with the core kernel and preserving its performance characteristics.
We developed an initial prototype of the Binder driver to allow us to make meaningful comparisons between the safety and performance characteristics of the existing C version and its Rust counterpart. The Linux kernel has over 30 million lines of code, so naturally our goal is not to convert it all to Rust but rather to allow new code to be written in Rust. We believe this incremental approach allows us to benefit from the kernel’s existing high-performance implementation while providing kernel developers with new tools to improve memory safety and maintain performance going forward.
We joined the Rust for Linux organization, where the community had already done and continues to do great work toward adding Rust support to the Linux kernel build system. We also need designs that allow code in the two languages to interact with each other: we're particularly interested in safe, zero-cost abstractions that allow Rust code to use kernel functionality written in C, and how to implement functionality in idiomatic Rust that can be called seamlessly from the C portions of the kernel.
Since Rust is a new language for the kernel, we also have the opportunity to enforce best practices in terms of documentation and uniformity. For example, we have specific machine-checked requirements around the usage of unsafe code: for every unsafe function, the developer must document the requirements that need to be satisfied by callers to ensure that its usage is safe; additionally, for every call to unsafe functions (or usage of unsafe constructs like dereferencing a raw pointer), the developer must document the justification for why it is safe to do so.
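As an illustration of that documentation rule, here is a minimal user-space sketch. The function names `read_raw` and `read_value` are hypothetical and not part of the kernel crate. The `# Safety` doc section and `// SAFETY:` comments follow a common Rust convention; Clippy's `missing_safety_doc` and `undocumented_unsafe_blocks` lints are one way such a rule can be machine-checked, and whether the kernel tooling relies on them is an assumption here.

```rust
/// Reads the `u32` that `ptr` points to.
///
/// # Safety
///
/// `ptr` must be non-null, properly aligned, and point to an initialized
/// `u32` that stays valid for reads for the duration of the call.
unsafe fn read_raw(ptr: *const u32) -> u32 {
    // SAFETY: the caller upholds the requirements listed under `# Safety`.
    unsafe { ptr.read() }
}

/// Safe wrapper: the reference type already guarantees what `read_raw` needs.
fn read_value(value: &u32) -> u32 {
    // SAFETY: a `&u32` is always non-null, aligned, and points to an
    // initialized `u32` that outlives this call.
    unsafe { read_raw(value as *const u32) }
}
```

The unsafe function states what callers must guarantee, and the call site states why those guarantees hold, so a reviewer can check one against the other.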
Just as important as safety, Rust support needs to be convenient and helpful for developers to use. Let’s get into a few examples of how Rust can assist kernel developers in writing drivers that are safe and correct.
We'll use an implementation of a semaphore character device. Each device has a current value; writes of n bytes result in the device value being incremented by n; reads decrement the value by 1 unless the value is 0, in which case they will block until they can decrement the count without going below 0.
Suppose semaphore is a file representing our device. We can interact with it from the shell as follows:
> cat semaphore
When semaphore is a newly initialized device, the command above will block because the device's current value is 0. It will be unblocked if we run the following command from another shell because it increments the value by 1, which allows the original read to complete:
> echo -n a > semaphore
We could also increment the count by more than 1 if we write more data, for example:
> echo -n abc > semaphore
increments the count by 3, so the next 3 reads won't block.
To allow us to show a few more aspects of Rust, we'll add the following features to our driver: remember what the maximum value was throughout the lifetime of a device, and remember how many reads each file issued on the device.
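Before looking at the driver, the intended behavior can be modeled in a few lines of ordinary user-space Rust. This is only a sketch of the semantics described above, not driver code: `DeviceModel`, `FileModel`, and `try_read` are hypothetical names, and `try_read` returns `false` at the point where the real device would block.

```rust
/// Device-wide state: the current value and the largest value ever reached.
#[derive(Debug, Default)]
struct DeviceModel {
    count: usize,
    max_seen: usize,
}

/// Per-open-file state: how many reads this file has completed.
#[derive(Debug, Default)]
struct FileModel {
    read_count: u64,
}

impl DeviceModel {
    /// A write of `n` bytes adds `n` to the value and updates the maximum.
    fn write(&mut self, n: usize) {
        self.count = self.count.saturating_add(n);
        self.max_seen = self.max_seen.max(self.count);
    }

    /// A read takes 1 from the value. Returns `false` where the real
    /// device would block because the value is 0.
    fn try_read(&mut self, file: &mut FileModel) -> bool {
        if self.count == 0 {
            return false;
        }
        self.count -= 1;
        file.read_count += 1;
        true
    }
}
```

Writing 3 bytes to a fresh `DeviceModel` lets exactly three `try_read` calls succeed, which mirrors the `echo -n abc > semaphore` example.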
We'll now show how such a driver would be implemented in Rust, contrasting it with a C implementation. Note, however, that this work is at an early stage, so everything shown here is subject to change. The aspect we'd like to emphasize is how Rust can assist the developer: at compile time it lets us eliminate, or greatly reduce the chance of introducing, whole classes of bugs, while remaining flexible and adding minimal overhead.
A developer needs to do the following to implement a driver for a new character device in Rust:
The following outlines how the first two steps of our example compare in Rust and C:
Rust:
impl FileOpener<Arc<Semaphore>> for FileState {
    fn open(shared: &Arc<Semaphore>) -> KernelResult<Box<Self>> {
        [...]
    }
}

impl FileOperations for FileState {
    type Wrapper = Box<Self>;

    fn read(&self, _: &File, data: &mut UserSlicePtrWriter, offset: u64) -> KernelResult<usize> {
        [...]
    }

    fn write(&self, data: &mut UserSlicePtrReader, _offset: u64) -> KernelResult<usize> {
        [...]
    }

    fn ioctl(&self, file: &File, cmd: &mut IoctlCommand) -> KernelResult<i32> {
        [...]
    }

    fn release(_obj: Box<Self>, _file: &File) {
        [...]
    }

    declare_file_operations!(read, write, ioctl);
}

C:
static int semaphore_open(struct inode *nodp, struct file *filp)
{
    struct semaphore_state *shared =
        container_of(filp->private_data, struct semaphore_state, miscdev);
    [...]
}

static ssize_t semaphore_write(struct file *filp, const char __user *buffer,
                               size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    [...]
}

static ssize_t semaphore_read(struct file *filp, char __user *buffer,
                              size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    [...]
}

static long semaphore_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
    struct file_state *state = filp->private_data;
    [...]
}

static int semaphore_release(struct inode *nodp, struct file *filp)
{
    struct file_state *state = filp->private_data;
    [...]
}

static const struct file_operations semaphore_fops = {
    .owner = THIS_MODULE,
    .open = semaphore_open,
    .read = semaphore_read,
    .write = semaphore_write,
    .compat_ioctl = semaphore_ioctl,
    .release = semaphore_release,
};
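The Rust side groups the operations as methods of a trait instead of function pointers in a table. The sketch below shows, in plain user-space Rust, one way such a trait can let a driver implement only the operations it supports. `FileOps`, `OneByte`, and this `Error` enum are hypothetical stand-ins; they do not reproduce the kernel crate's `FileOperations` trait or the `declare_file_operations!` macro.

```rust
/// Error codes used by this sketch (a stand-in for the kernel's error type).
#[derive(Debug, PartialEq)]
enum Error {
    Einval,
}

/// Hypothetical, simplified file-operations trait. Every operation has a
/// default that rejects the request, so a driver overrides only what it
/// actually supports.
trait FileOps {
    fn read(&self, _buf: &mut Vec<u8>) -> Result<usize, Error> {
        Err(Error::Einval)
    }

    fn write(&self, _data: &[u8]) -> Result<usize, Error> {
        Err(Error::Einval)
    }
}

/// A driver that supports reads only: each read yields a single zero byte.
struct OneByte;

impl FileOps for OneByte {
    fn read(&self, buf: &mut Vec<u8>) -> Result<usize, Error> {
        buf.push(0);
        Ok(1)
    }
}
```

Because the method signatures come from the trait, a handler with the wrong argument or return type fails to compile rather than being silently stored in a function-pointer table.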
Character devices in Rust benefit from a number of safety features:
For a driver to provide a custom ioctl handler, it needs to implement the ioctl function that is part of the FileOperations trait, as exemplified in the table below.
Rust:
fn ioctl(&self, file: &File, cmd: &mut IoctlCommand) -> KernelResult<i32> {
    cmd.dispatch(self, file)
}

impl IoctlHandler for FileState {
    fn read(
        &self,
        _file: &File,
        cmd: u32,
        writer: &mut UserSlicePtrWriter,
    ) -> KernelResult<i32> {
        match cmd {
            IOCTL_GET_READ_COUNT => {
                writer.write(&self.read_count.load(Ordering::Relaxed))?;
                Ok(0)
            }
            _ => Err(Error::EINVAL),
        }
    }

    fn write(
        &self,
        _file: &File,
        cmd: u32,
        reader: &mut UserSlicePtrReader,
    ) -> KernelResult<i32> {
        match cmd {
            IOCTL_SET_READ_COUNT => {
                self.read_count.store(reader.read()?, Ordering::Relaxed);
                Ok(0)
            }
            _ => Err(Error::EINVAL),
        }
    }
}

C:
#define IOCTL_GET_READ_COUNT _IOR('c', 1, u64)
#define IOCTL_SET_READ_COUNT _IOW('c', 1, u64)

static long semaphore_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
    struct file_state *state = filp->private_data;
    void __user *buffer = (void __user *)arg;
    u64 value;

    switch (cmd) {
    case IOCTL_GET_READ_COUNT:
        value = atomic64_read(&state->read_count);
        if (copy_to_user(buffer, &value, sizeof(value)))
            return -EFAULT;
        return 0;
    case IOCTL_SET_READ_COUNT:
        if (copy_from_user(&value, buffer, sizeof(value)))
            return -EFAULT;
        atomic64_set(&state->read_count, value);
        return 0;
    default:
        return -EINVAL;
    }
}
Ioctl commands are standardized such that, given a command, we know whether a user buffer is provided, its intended use (read, write, both, none), and its size. In Rust, we provide a dispatcher (accessible by calling cmd.dispatch) that uses this information to automatically create user memory access helpers and pass them to the caller.
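To make the encoding concrete, the following user-space sketch builds and decodes command numbers using the generic Linux layout (8 bits of command number, 8 bits of type, 14 bits of size, 2 bits of direction). It assumes that generic layout; some architectures split the bits differently. The helper names `ioc`, `direction`, and `size` are hypothetical and are not the kernel crate's API.

```rust
// Field layout of the generic Linux ioctl encoding. Some architectures use
// a different split; this sketch assumes the generic one.
const NR_BITS: u32 = 8;
const TYPE_BITS: u32 = 8;
const SIZE_BITS: u32 = 14;
const TYPE_SHIFT: u32 = NR_BITS;
const SIZE_SHIFT: u32 = TYPE_SHIFT + TYPE_BITS;
const DIR_SHIFT: u32 = SIZE_SHIFT + SIZE_BITS;

/// Direction of the data transfer, named from user space's point of view.
#[derive(Debug, PartialEq)]
enum Direction {
    NoData,
    Write,
    Read,
    ReadWrite,
}

/// Builds a command number, like the C `_IOC(dir, type, nr, size)` macro.
const fn ioc(dir: u32, ty: u8, nr: u8, size: u32) -> u32 {
    (dir << DIR_SHIFT) | (size << SIZE_SHIFT) | ((ty as u32) << TYPE_SHIFT) | (nr as u32)
}

/// Extracts the direction bits (0 = none, 1 = write, 2 = read, 3 = both).
fn direction(cmd: u32) -> Direction {
    match cmd >> DIR_SHIFT {
        0 => Direction::NoData,
        1 => Direction::Write,
        2 => Direction::Read,
        _ => Direction::ReadWrite,
    }
}

/// Extracts the size, in bytes, of the user buffer the command expects.
fn size(cmd: u32) -> usize {
    ((cmd >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1)) as usize
}

const U64_SIZE: u32 = std::mem::size_of::<u64>() as u32;
const IOCTL_GET_READ_COUNT: u32 = ioc(2, b'c', 1, U64_SIZE); // _IOR('c', 1, u64)
const IOCTL_SET_READ_COUNT: u32 = ioc(1, b'c', 1, U64_SIZE); // _IOW('c', 1, u64)
```

Given only the command number, `direction` and `size` recover whether a buffer is expected, which way the data flows, and how large it is, which is the information a dispatcher needs in order to hand the driver a reader or a writer of the right size.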
A driver is not required to use this though. If, for example, it doesn't use the standard ioctl encoding, Rust offers the flexibility of simply calling cmd.raw to extract the raw arguments and using them to handle the ioctl (potentially with unsafe code, which will need to be justified).
However, if a driver implementation does use the standard dispatcher, it will benefit from not having to implement any unsafe code, and:
All of the above could potentially also be done in C, but there it is very easy for developers to break, most likely unintentionally, the contracts that keep the code safe. Rust requires unsafe blocks to do so, and these should be used only in rare cases and draw additional scrutiny. Additionally, Rust offers the following:
We allow developers to use mutexes and spinlocks to provide interior mutability. In our example, we use a mutex to protect mutable data; in the tables below we show the data structures we use in C and Rust, and how we implement a wait until the count is nonzero so that we can satisfy a read:
Rust:
struct SemaphoreInner {
    count: usize,
    max_seen: usize,
}

struct Semaphore {
    changed: CondVar,
    inner: Mutex<SemaphoreInner>,
}

struct FileState {
    read_count: AtomicU64,
    shared: Arc<Semaphore>,
}

C:
struct semaphore_state {
    struct kref ref;
    struct miscdevice miscdev;
    wait_queue_head_t changed;
    struct mutex mutex;
    size_t count;
    size_t max_seen;
};

struct file_state {
    atomic64_t read_count;
    struct semaphore_state *shared;
};
Rust:
fn consume(&self) -> KernelResult {
    let mut inner = self.shared.inner.lock();
    while inner.count == 0 {
        if self.shared.changed.wait(&mut inner) {
            return Err(Error::EINTR);
        }
    }
    inner.count -= 1;
    Ok(())
}

C:
static int semaphore_consume(struct semaphore_state *state)
{
    DEFINE_WAIT(wait);

    mutex_lock(&state->mutex);
    while (state->count == 0) {
        prepare_to_wait(&state->changed, &wait, TASK_INTERRUPTIBLE);
        mutex_unlock(&state->mutex);
        schedule();
        finish_wait(&state->changed, &wait);
        if (signal_pending(current))
            return -EINTR;
        mutex_lock(&state->mutex);
    }
    state->count--;
    mutex_unlock(&state->mutex);
    return 0;
}
We note that such waits are not uncommon in the existing C code, for example, a pipe waiting for a "partner" to write, a unix-domain socket waiting for data, an inode search waiting for completion of a delete, or a user-mode helper waiting for state change.
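The same wait pattern can be tried out in user space with the standard library's `Mutex` and `Condvar`. This is a sketch under stated assumptions, not the kernel abstraction: `std::sync::Condvar::wait` takes and returns the guard instead of reporting a pending signal, so the `EINTR` path has no counterpart here, and `produce` and `handoff` are hypothetical helper names.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct Inner {
    count: usize,
}

struct Semaphore {
    changed: Condvar,
    inner: Mutex<Inner>,
}

impl Semaphore {
    fn new() -> Self {
        Semaphore {
            changed: Condvar::new(),
            inner: Mutex::new(Inner { count: 0 }),
        }
    }

    /// Blocks until the count is nonzero, then takes one unit.
    fn consume(&self) {
        let mut inner = self.inner.lock().unwrap();
        while inner.count == 0 {
            // `wait` releases the lock while sleeping and reacquires it
            // before returning, so `count` is only read with the lock held.
            inner = self.changed.wait(inner).unwrap();
        }
        inner.count -= 1;
    } // The guard is dropped here, which unlocks the mutex.

    /// Adds `n` units and wakes every waiter.
    fn produce(&self, n: usize) {
        {
            let mut inner = self.inner.lock().unwrap();
            inner.count = inner.count.saturating_add(n);
        } // Unlock before notifying so woken threads can take the lock.
        self.changed.notify_all();
    }
}

/// One reader blocks while a writer adds 3; returns the count left over.
fn handoff() -> usize {
    let sem = Arc::new(Semaphore::new());
    let reader = {
        let sem = Arc::clone(&sem);
        thread::spawn(move || sem.consume())
    };
    sem.produce(3);
    reader.join().unwrap();
    let left = sem.inner.lock().unwrap().count;
    left
}
```

Note that `count` lives inside the `Mutex`, so the only way to reach it is through the guard returned by `lock`. There is no code path that touches it without holding the lock, and no explicit unlock call to forget.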
The following are benefits from the Rust implementation:
In the tables below, we show how open, read, and write are implemented in our example driver:
Rust:
fn read(&self, _: &File, data: &mut UserSlicePtrWriter, offset: u64) -> KernelResult<usize> {
    if data.is_empty() || offset > 0 {
        return Ok(0);
    }
    self.consume()?;
    data.write_slice(&[0u8; 1])?;
    self.read_count.fetch_add(1, Ordering::Relaxed);
    Ok(1)
}

C:
static ssize_t semaphore_read(struct file *filp, char __user *buffer,
                              size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    char c = 0;
    int ret;

    if (count == 0 || *ppos > 0)
        return 0;

    ret = semaphore_consume(state->shared);
    if (ret)
        return ret;

    if (copy_to_user(buffer, &c, sizeof(c)))
        return -EFAULT;

    atomic64_add(1, &state->read_count);
    *ppos += 1;
    return 1;
}
Rust:
fn write(&self, data: &mut UserSlicePtrReader, _offset: u64) -> KernelResult<usize> {
    {
        let mut inner = self.shared.inner.lock();
        inner.count = inner.count.saturating_add(data.len());
        if inner.count > inner.max_seen {
            inner.max_seen = inner.count;
        }
    }
    self.shared.changed.notify_all();
    Ok(data.len())
}

C:
static ssize_t semaphore_write(struct file *filp, const char __user *buffer,
                               size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    struct semaphore_state *shared = state->shared;

    mutex_lock(&shared->mutex);
    shared->count += count;
    if (shared->count < count)
        shared->count = SIZE_MAX;
    if (shared->count > shared->max_seen)
        shared->max_seen = shared->count;
    mutex_unlock(&shared->mutex);

    wake_up_all(&shared->changed);
    return count;
}
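The two write handlers above treat overflow of the count differently in form but not in effect. The sketch below isolates that one step in user-space Rust: `add_saturating` mirrors the `saturating_add` call, and `add_manual` mirrors the add-then-check logic of the C version. Both function names are hypothetical.

```rust
/// Mirrors the Rust handler: clamp at `usize::MAX` instead of overflowing.
fn add_saturating(count: usize, n: usize) -> usize {
    count.saturating_add(n)
}

/// Mirrors the C handler: add with wraparound, then detect the wrap by hand.
fn add_manual(count: usize, n: usize) -> usize {
    let sum = count.wrapping_add(n);
    if sum < n {
        // The addition wrapped past the maximum, so clamp the result.
        usize::MAX
    } else {
        sum
    }
}
```

Both produce the same result, but the Rust version names the intent in a single call, while the C version depends on the developer remembering to write the wraparound check and getting the comparison right.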
Rust:
fn open(shared: &Arc<Semaphore>) -> KernelResult<Box<Self>> {
    Ok(Box::try_new(Self {
        read_count: AtomicU64::new(0),
        shared: shared.clone(),
    })?)
}

C:
static int semaphore_open(struct inode *nodp, struct file *filp)
{
    struct semaphore_state *shared =
        container_of(filp->private_data, struct semaphore_state, miscdev);
    struct file_state *state;

    state = kzalloc(sizeof(*state), GFP_KERNEL);
    if (!state)
        return -ENOMEM;

    kref_get(&shared->ref);
    state->shared = shared;
    atomic64_set(&state->read_count, 0);

    filp->private_data = state;
    return 0;
}
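The reference counting that the C version performs with `kref_get` is carried by the `Arc` type on the Rust side. The user-space sketch below makes the counts visible with the standard library's `Arc::strong_count`. It is an illustration under assumptions: it uses the infallible `Box::new` because a stable `Box::try_new` is not available in the standard library, the `Semaphore` here is an empty placeholder, and `reference_counts` is a hypothetical helper.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Placeholder for the shared device state (fields elided in this sketch).
struct Semaphore;

struct FileState {
    read_count: AtomicU64,
    shared: Arc<Semaphore>,
}

/// Creates per-file state that holds its own reference to the device.
fn open(shared: &Arc<Semaphore>) -> Box<FileState> {
    Box::new(FileState {
        read_count: AtomicU64::new(0),
        // Cloning the `Arc` increments the reference count.
        shared: Arc::clone(shared),
    })
}

/// Returns the reference count before open, while the file is open, and
/// after it is released, plus the file's initial read count.
fn reference_counts() -> (usize, usize, usize, u64) {
    let device = Arc::new(Semaphore);
    let before = Arc::strong_count(&device);
    let file = open(&device);
    let reads = file.read_count.load(Ordering::Relaxed);
    let while_open = Arc::strong_count(&file.shared);
    // Dropping the file state drops its `Arc`, which decrements the count.
    drop(file);
    let after = Arc::strong_count(&device);
    (before, while_open, after, reads)
}
```

The decrement happens automatically when the `FileState` is dropped, so there is no separate release call that could be forgotten or issued twice.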
These open, read, and write implementations illustrate other benefits brought by Rust:
The examples above are only a small part of the whole project. We hope they give readers a glimpse of the kinds of benefits that Rust brings. At the moment we have nearly all generic kernel functionality needed by Binder neatly wrapped in safe Rust abstractions, so we are in the process of gathering feedback from the broader Linux kernel community with the intent of upstreaming the existing Rust support.
We also continue to make progress on our Binder prototype, implement additional abstractions, and smooth out some rough edges. This is an exciting time and a rare opportunity to potentially influence how the Linux kernel is developed, as well as inform the evolution of the Rust language. We invite those interested to join us in Rust for Linux and attend our planned talk at Linux Plumbers Conference 2021!
Thanks to Nick Desaulniers, Kees Cook, and Adrian Taylor for contributions to this post. Special thanks to Jeff Vander Stoep for contributions and editing, and to Greg Kroah-Hartman for reviewing and contributing to the code examples.