October 7, 2020

Privacy-Preserving Smart Input with Gboard

Google Keyboard (a.k.a Gboard) has a critical mission to provide frictionless input on Android to empower users to communicate accurately and express themselves effortlessly. In order to accomplish this mission, Gboard must also protect users' private and sensitive data. Nothing users type is sent to Google servers. We recently launched privacy-preserving input by further advancing the latest federated technologies. In Android 11, Gboard also launched the contextual input suggestion experience by integrating on-device smarts into the user's daily communication in a privacy-preserving way.

Before Android 11, input suggestions were surfaced to users in several different places. In Android 11, Gboard launched a consistent and coordinated approach to access contextual input suggestions. For the first time, we've brought Smart Replies to the keyboard suggestions - powered by system intelligence running entirely on device. The smart input suggestions are rendered with a transparent layer on top of Gboard’s suggestion strip. This structure maintains the trust boundaries between the Android platform and Gboard, meaning sensitive personal content cannot be not accessed by Gboard. The suggestions are only sent to the app after the user taps to accept them.

For instance, when a user receives the message “Have a virtual coffee at 5pm?” in Whatsapp, on-device system intelligence predicts smart text and emoji replies “Sounds great!” and “👍”. Android system intelligence can see the incoming message but Gboard cannot. In Android 11, these Smart Replies are rendered by the Android platform on Gboard’s suggestion strip as a transparent layer. The suggested reply is generated by the system intelligence. When the user taps the suggestion, Android platform sends it to the input field directly. If the user doesn't tap the suggestion, gBoard and the app cannot see it. In this way, Android and Gboard surface the best of Google smarts whilst keeping users' data private: none of their data goes to any app, including the keyboard, unless they've tapped a suggestion.

Additionally, federated learning has enabled Gboard to train intelligent input models across many devices while keeping everything individual users type on their device. Today, the emoji is as common as punctuation - and have become the way for our users to express themselves in messaging. Our users want a way to have fresh and diversified emojis to better express their thoughts in messaging apps. Recently, we launched new on-device transformer models that are fine-tuned with federated learning in Gboard, to produce more contextual emoji predictions for English, Spanish and Portuguese.

Furthermore, following the success of privacy-preserving machine learning techniques, Gboard continues to leverage federated analytics to understand how Gboard is used from decentralized data. What we've learned from privacy-preserving analysis has let us make better decisions in our product.

When a user shares an emoji in a conversation, their phone keeps an ongoing count of which emojis are used. Later, when the phone is idle, plugged in, and connected to WiFi, Google’s federated analytics server invites the device to join a “round” of federated analytics data computation with hundreds of other participating phones. Every device involved in one round will compute the emoji share frequency, encrypt the result and send it a federated analytics server. Although the server can’t decrypt the data individually, the final tally of total emoji counts can be decrypted when combining encrypted data across devices. The aggregated data shows that the most popular emoji is 😂 in Whatsapp, 😭 in Roblox(gaming), and ✔ in Google Docs. Emoji 😷 moved up from 119th to 42nd in terms of frequency during COVID-19.

Gboard always has a strong commitment to Google’s Privacy Principles. Gboard strives to build privacy-preserving effortless input products for users to freely express their thoughts in 900+ languages while safeguarding user data. We will keep pushing the state of the art in smart input technologies on Android while safeguarding user data. Stay tuned!

No comments:

Post a Comment

You are welcome to contribute comments, but they should be relevant to the conversation. We reserve the right to remove off-topic remarks in the interest of keeping the conversation focused and engaging. Shameless self-promotion is well, shameless, and will get canned.

Note: Only a member of this blog may post a comment.