Posted by Jeff Vander Stoep and Stephen Hines, Android Team

Correctness of code in the Android platform is a top priority for the security, stability, and quality of each Android release. Memory safety bugs in C and C++ continue to be the most-difficult-to-address source of incorrectness. We invest a great deal of effort and resources into detecting, fixing, and mitigating this class of bugs, and these efforts are effective in preventing a large number of bugs from making it into Android releases. Yet in spite of these efforts, memory safety bugs continue to be a top contributor of stability issues, and consistently represent ~70% of Android’s high severity security vulnerabilities.

In addition to ongoing and upcoming efforts to improve detection of memory bugs, we are ramping up efforts to prevent them in the first place. Memory-safe languages are the most cost-effective means for preventing memory bugs. In addition to memory-safe languages like Kotlin and Java, we’re excited to announce that the Android Open Source Project (AOSP) now supports the Rust programming language for developing the OS itself.

Systems programming

Managed languages like Java and Kotlin are the best option for Android app development. These languages are designed for ease of use, portability, and safety. The Android Runtime (ART) manages memory on behalf of the developer. The Android OS uses Java extensively, effectively protecting large portions of the Android platform from memory bugs. Unfortunately, for the lower layers of the OS, Java and Kotlin are not an option.

Lower levels of the OS require systems programming languages like C, C++, and Rust. These languages are designed with control and predictability as goals. They provide access to low level system resources and hardware. They are light on resources and have more predictable performance characteristics.

For C and C++, the developer is responsible for managing memory lifetime. Unfortunately, it's easy to make mistakes when doing this, especially in complex and multithreaded codebases.

Rust provides memory safety guarantees by using a combination of compile-time checks to enforce object lifetime/ownership and runtime checks to ensure that memory accesses are valid. This safety is achieved while providing equivalent performance to C and C++.

The limits of sandboxing

C and C++ languages don’t provide these same safety guarantees and require robust isolation. All Android processes are sandboxed and we follow the Rule of 2 to decide if functionality necessitates additional isolation and deprivileging. The Rule of 2 is simple: given three options, developers may only select two of the following three options.

For Android, this means that if code is written in C/C++ and parses untrustworthy input, it should be contained within a tightly constrained and unprivileged sandbox. While adherence to the Rule of 2 has been effective in reducing the severity and reachability of security vulnerabilities, it does come with limitations. Sandboxing is expensive: the new processes it requires consume additional overhead and introduce latency due to IPC and additional memory usage. Sandboxing doesn’t eliminate vulnerabilities from the code and its efficacy is reduced by high bug density, allowing attackers to chain multiple vulnerabilities together.

Memory-safe languages like Rust help us overcome these limitations in two ways:

Lowers the density of bugs within our code, which increases the effectiveness of our current sandboxing.
Reduces our sandboxing needs, allowing introduction of new features that are both safer and lighter on resources.

But what about all that existing C++?

Of course, introducing a new programming language does nothing to address bugs in our existing C/C++ code. Even if we redirected the efforts of every software engineer on the Android team, rewriting tens of millions of lines of code is simply not feasible.

The above analysis of the age of memory safety bugs in Android (measured from when they were first introduced) demonstrates why our memory-safe language efforts are best focused on new development and not on rewriting mature C/C++ code. Most of our memory bugs occur in new or recently modified code, with about 50% being less than a year old.

The comparative rarity of older memory bugs may come as a surprise to some, but we’ve found that old code is not where we most urgently need improvement. Software bugs are found and fixed over time, so we would expect the number of bugs in code that is being maintained but not actively developed to go down over time. Just as reducing the number and density of bugs improves the effectiveness of sandboxing, it also improves the effectiveness of bug detection.

Limitations of detection

Bug detection via robust testing, sanitization, and fuzzing is crucial for improving the quality and correctness of all software, including software written in Rust. A key limitation for the most effective memory safety detection techniques is that the erroneous state must actually be triggered in instrumented code in order to be detected. Even in code bases with excellent test/fuzz coverage, this results in a lot of bugs going undetected.

Another limitation is that bug detection is scaling faster than bug fixing. In some projects, bugs that are being detected are not always getting fixed. Bug fixing is a long and costly process.

Each of these steps is costly, and missing any one of them can result in the bug going unpatched for some or all users. For complex C/C++ code bases, often there are only a handful of people capable of developing and reviewing the fix, and even with a high amount of effort spent on fixing bugs, sometimes the fixes are incorrect.

Bug detection is most effective when bugs are relatively rare and dangerous bugs can be given the urgency and priority that they merit. Our ability to reap the benefits of improvements in bug detection require that we prioritize preventing the introduction of new bugs.

Prioritizing prevention

Rust modernizes a range of other language aspects, which results in improved correctness of code:

Memory safety - enforces memory safety through a combination of compiler and run-time checks.
Data concurrency - prevents data races. The ease with which this allows users to write efficient, thread-safe code has given rise to Rust’s Fearless Concurrency slogan.
More expressive type system - helps prevent logical programming bugs (e.g. newtype wrappers, enum variants with contents).
References and variables are immutable by default - assist the developer in following the security principle of least privilege, marking a reference or variable mutable only when they actually intend it to be so. While C++ has const, it tends to be used infrequently and inconsistently. In comparison, the Rust compiler assists in avoiding stray mutability annotations by offering warnings for mutable values which are never mutated.
Better error handling in standard libraries - wrap potentially failing calls in Result, which causes the compiler to require that users check for failures even for functions which do not return a needed value. This protects against bugs like the Rage Against the Cage vulnerability which resulted from an unhandled error. By making it easy to propagate errors via the ? operator and optimizing Result for low overhead, Rust encourages users to write their fallible functions in the same style and receive the same protection.
Initialization - requires that all variables be initialized before use. Uninitialized memory vulnerabilities have historically been the root cause of 3-5% of security vulnerabilities on Android. In Android 11, we started auto initializing memory in C/C++ to reduce this problem. However, initializing to zero is not always safe, particularly for things like return values, where this could become a new source of faulty error handling. Rust requires every variable be initialized to a legal member of its type before use, avoiding the issue of unintentionally initializing to an unsafe value. Similar to Clang for C/C++, the Rust compiler is aware of the initialization requirement, and avoids any potential performance overhead of double initialization.
Safer integer handling - Overflow sanitization is on for Rust debug builds by default, encouraging programmers to specify a wrapping_add if they truly intend a calculation to overflow or saturating_add if they don’t. We intend to enable overflow sanitization for all builds in Android. Further, all integer type conversions are explicit casts: developers can not accidentally cast during a function call when assigning to a variable or when attempting to do arithmetic with other types.

Where we go from here

Adding a new language to the Android platform is a large undertaking. There are toolchains and dependencies that need to be maintained, test infrastructure and tooling that must be updated, and developers that need to be trained. For the past 18 months we have been adding Rust support to the Android Open Source Project, and we have a few early adopter projects that we will be sharing in the coming months. Scaling this to more of the OS is a multi-year project. Stay tuned, we will be posting more updates on this blog.

Java is a registered trademark of Oracle and/or its affiliates.

Thanks Matthew Maurer, Bram Bonne, and Lars Bergstrom for contributions to this post. Special thanks to our colleagues, Adrian Taylor for his insight into the age of memory vulnerabilities, and to Chris Palmer for his work on “The Rule of 2” and “The limits of Sandboxing”.

April 6, 2021

Rust in the Android platform