In July 2023, four Googlers from the Enterprise Security and Access Security organizations developed a tool that aimed at revolutionizing the way Googlers interact with Access Control Lists - SpeakACL. This tool, awarded the Gold Prize during Google’s internal Security & AI Hackathon, allows developers to create or modify security policies using simple English instructions rather than having to learn system-specific syntax or complex security principles. This can save security and product teams hours of time and effort, while helping to protect the information of their users by encouraging the reduction of permitted access by adhering to the principle of least privilege.
Access Control Policies in BeyondCorp
Google requires developers and owners of enterprise applications to define their own access control policies, as described in BeyondCorp: The Access Proxy. We have invested in reducing the difficulty of self-service ACL and ACL test creation to encourage these service owners to define least privilege access control policies. However, it is still challenging to concisely transform their intent into the language acceptable to the access control engine. Additional complexity is added by the variety of engines, and corresponding policy definition languages that target different access control domains (i.e. websites, networks, RPC servers).
To adequately implement an access control policy, service developers are expected to learn various policy definition languages and their associated syntax, in addition to sufficiently understanding security concepts. As this takes time away from core developer work, it is not the most efficient use of developer time. A solution was required to remove these challenges so developers can focus on building innovative tools and products.
Making it Work
We built a prototype interface for interactively defining and modifying access control policies for the BeyondCorp access control engine using the PaLM 2 Large Language Model (LLM). using the PaLM 2 Large Language Model (LLM). We used Google Colab to provide the model with a diverse, highly variable, dataset using in-context learning and fine-tuning. In-context learning allows the model to learn from a dataset of examples that are relevant to the task at hand, which we provided via few-shot learning. Fine-tuning allows the model to be adapted to a specific task by adjusting its parameters. Tuning the model with a diverse labeled dataset that we curated for this task allowed us to improve its ability to generate ACLs that are both syntactically accurate and adhered to the principle of least privilege.
With SpeakACL, and other tools leveraging AI in security, it is always recommended to take a conservative approach with the autonomy you give an AI agent. To ensure our model outputs are correct & safe to use, we combined our tool with existing safeguards that exist at Google for all access policy modifications:
Request LGTM from a teammate to ensure that the intent of the proposed change is correct.
Automated Risk Assessment occurs on proposed security policy at Google.
Manual Review by Security Engineers is performed on changes not assessed as low risk to ensure compliance with security policies and guidelines.
Linting, unit tests, and integration tests ensure that the access control language syntax is correct, and that the change does not break any expected access or permit unexpected access.
Looking to the future
While progress in AI is impressive, it is crucial we as an industry continue to prioritize safety while navigating the landscape. Other than adding checks to syntactically and semantically verify access policies produced by our model, we also designed safeguards for sensitive information disclosure, data leaking, prompt injections, and supply chain vulnerabilities to make sure our model is performing at the highest level of security.
SpeakACL is an ACL Generation tool that has the potential to revolutionize the way access policies are created and managed. The efficiency, security, and ease of use achieved by this AI-powered ACL Generation Engine reflects Google’s ongoing commitment to leveraging AI across domains to develop cutting-edge products and infrastructure.
No comments:
Post a Comment
You are welcome to contribute comments, but they should be relevant to the conversation. We reserve the right to remove off-topic remarks in the interest of keeping the conversation focused and engaging. Shameless self-promotion is well, shameless, and will get canned.
Note: Only a member of this blog may post a comment.