New blueprint helps secure confidential data in AI Platform Notebooks

May 3, 2021

Core to Google Cloud’s efforts to be the industry’s most Trusted Cloud is our belief in shared fate – taking an active stake to help customers achieve better security outcomes on our platforms. To make it easier to build security into deployments, we provide opinionated guidance for customers in the form of security blueprints. We recently released our updated Google Cloud security foundations guide and deployable blueprint to help our customers build security into their starting point on Google Cloud. Today, we’re adding to our portfolio of blueprints with the publication of our Protecting confidential data in AI Platform Notebooks blueprint guide and deployable blueprint, which can help you apply data governance and security policies that protect your AI Platform Notebooks containing confidential data.

Security and privacy are particularly important when it comes to AI, because confidential data is often at the heart of AI and ML projects. This blog post focuses on securing the following high-level notebook flow at all relevant security layers.

[Diagram: high-level notebook flow]

AI Platform Notebooks offer an integrated and secure JupyterLab environment for enterprises. Data science practitioners in enterprises use AI Platform Notebooks to experiment, develop code, and deploy models. With a few clicks, you can easily get started with a notebook preloaded with popular deep learning frameworks (TensorFlow Enterprise, PyTorch, RAPIDS, and many others). Today, AI Platform Notebooks can be run on Deep Learning VM Images or Deep Learning Containers.

Enterprise customers, particularly those in highly regulated industries like financial services, healthcare, and life sciences, may want to run their JupyterLab notebooks in a secure perimeter and control access to the notebooks and data. AI Platform Notebooks were built from the ground up with such customers in mind, with security and access control as the pillars of the service. Recently, we announced the general availability of several security features, including VPC Service Controls (VPC-SC), customer-managed encryption keys (CMEK), and more for AI Platform Notebooks. However, security is more than just features; practices and processes are just as important. Let’s walk through the blueprint, which serves as a step-by-step guide to help secure your data and the Notebooks environment.

AI Platform Notebooks support popular Google Cloud Platform enterprise security architectures through VPC-SC, shared VPC, and private IP controls. You can run a Shielded VM as your compute instance for AI Platform Notebooks, and encrypt your data on disk with CMEK. You can choose between two predefined user access modes to AI Platform Notebooks: single-user or via a service account. You can also customize access based on your Cloud Identity and Access Management (IAM) configuration. Let’s take a closer look at these security features in the context of AI Platform Notebooks.

Compute Engine security

AI Platform Notebooks with Shielded VM supports a set of security controls that help defend against rootkits and bootkits. Available in Notebooks API and DLVM Debian 10 images, this functionality helps you protect enterprise workloads from threats like remote attacks, privilege escalation, and malicious insiders. This feature leverages advanced platform security capabilities such as secure and measured boot, a virtual trusted platform module (vTPM), UEFI firmware, and integrity monitoring. On a Shielded VM Notebook instance, Compute Engine enables the virtual Trusted Platform Module (vTPM) and integrity monitoring options by default. In addition to this functionality, Notebooks API provides an upgrade endpoint which allows you to perform operating system updates to the latest DLVM image, either manually or automatically via auto-upgrade.
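As an illustration (not part of the blueprint itself), a Shielded VM notebook instance can be declared in Terraform using the `google_notebooks_instance` resource; the instance name, zone, and image family below are placeholders:

```hcl
# Sketch: AI Platform Notebooks instance on a Shielded VM (Debian 10 DLVM image).
resource "google_notebooks_instance" "trusted_notebook" {
  name         = "trusted-notebook"   # hypothetical name
  location     = "us-central1-a"
  machine_type = "n1-standard-4"

  vm_image {
    project      = "deeplearning-platform-release"
    image_family = "common-cpu-notebooks-debian-10"
  }

  # Shielded VM controls: secure boot, vTPM, and integrity monitoring.
  shielded_instance_config {
    enable_secure_boot          = true
    enable_vtpm                 = true
    enable_integrity_monitoring = true
  }
}
```

With this configuration, Compute Engine verifies the boot sequence and reports integrity measurements for the instance.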

Data encryption

When you enable CMEK for an AI Platform Notebooks instance, the key that you designate, rather than a key managed by Google, is used to encrypt data on the boot and data disks of the VM. In general, CMEK is most useful if you need full control over the keys used to encrypt your data. With CMEK, you can manage your keys within Cloud KMS. For example, you can rotate or disable a key, or you can set up a rotation schedule using the Cloud KMS API.
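For example, a CMEK-protected notebook can be sketched in Terraform by creating a key in Cloud KMS with an automatic rotation schedule and referencing it from the instance; the key ring, key, and instance names here are hypothetical:

```hcl
# Sketch: a customer-managed key with a 90-day rotation schedule.
resource "google_kms_key_ring" "trusted" {
  name     = "trusted-keyring"
  location = "us-central1"
}

resource "google_kms_crypto_key" "notebook_key" {
  name            = "notebook-disk-key"
  key_ring        = google_kms_key_ring.trusted.id
  rotation_period = "7776000s" # rotate every 90 days
}

# Notebook instance whose boot and data disks are encrypted with the key above.
resource "google_notebooks_instance" "cmek_notebook" {
  name            = "cmek-notebook"
  location        = "us-central1-a"
  machine_type    = "n1-standard-4"
  disk_encryption = "CMEK"
  kms_key         = google_kms_crypto_key.notebook_key.id

  vm_image {
    project      = "deeplearning-platform-release"
    image_family = "common-cpu-notebooks-debian-10"
  }
}
```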

Data exfiltration mitigation

VPC Service Controls (VPC-SC) improves your ability to mitigate the risk of data exfiltration from Google Cloud services such as Cloud Storage and BigQuery. 

AI Platform Notebooks supports VPC-SC, which prevents reading data from or copying data to a resource outside the perimeter using service operations, such as copying to a public Cloud Storage bucket using the “gsutil cp” command or to a permanent external BigQuery table using the “bq mk” command.
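A service perimeter covering the Notebooks, Cloud Storage, and BigQuery APIs can be sketched in Terraform as follows; the access policy ID and project number are placeholder variables:

```hcl
# Sketch: a VPC-SC perimeter restricting data-exfiltration paths.
resource "google_access_context_manager_service_perimeter" "trusted_perimeter" {
  parent = "accessPolicies/${var.access_policy_id}"
  name   = "accessPolicies/${var.access_policy_id}/servicePerimeters/trusted_perimeter"
  title  = "trusted_perimeter"

  status {
    # Only resources inside the perimeter may call these services.
    restricted_services = [
      "notebooks.googleapis.com",
      "storage.googleapis.com",
      "bigquery.googleapis.com",
    ]
    resources = ["projects/${var.trusted_project_number}"]
  }
}
```

With this in place, a command such as “gsutil cp” targeting a bucket outside the perimeter is denied by VPC Service Controls.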

Access control and audit logging

AI Platform Notebooks has a specific set of Identity and Access Management (IAM) roles. Each predefined role contains a set of permissions. When you add a new member to a project, you can use an IAM policy to give that member one or more IAM roles. Each IAM role contains permissions that grant the member access to specific resources. AI Platform Notebooks IAM permissions are used to manage Notebook instances; you can create, delete, and modify AI Platform Notebooks instances via Notebooks API. (To configure JupyterLab access, please refer to this troubleshooting resource.)

AI Platform Notebooks writes Admin Activity audit logs, which record operations that modify the configuration or metadata of a resource.
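As a sketch, granting a group a predefined Notebooks role and turning on Data Access audit logs for the Notebooks API could look like this in Terraform; the project variable and group address are hypothetical:

```hcl
# Sketch: bind a predefined Notebooks IAM role to a data-science group.
resource "google_project_iam_member" "notebooks_admin" {
  project = var.trusted_analytics_project
  role    = "roles/notebooks.admin"
  member  = "group:trusted-data-scientists@example.com"
}

# Admin Activity logs are always on; Data Access logs must be enabled explicitly.
resource "google_project_iam_audit_config" "notebooks_audit" {
  project = var.trusted_analytics_project
  service = "notebooks.googleapis.com"

  audit_log_config {
    log_type = "ADMIN_READ"
  }
  audit_log_config {
    log_type = "DATA_READ"
  }
}
```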

With these security features in mind, let’s take a look at a few use cases where AI Platform Notebooks can be particularly useful:

  1. Customers want the same security measures and controls they apply to their IT infrastructure applied to their data and notebook instances.

  2. Customers want uniform security policies that can be easily applied when their data science teams access data.

  3. Customers want to tune sensitive data access for specific individuals or teams, and prevent broader access to that data.

AI Platform Notebook Security Best Practices

Google Cloud provides features and products that address security concerns at multiple layers including network, endpoint, application, data, and user access. Although every organization is unique, many of our customers have common requirements when it comes to securing their Cloud environments, including notebooks deployments. 

The new Protecting confidential data in AI Platform Notebooks blueprint guide can help you set up security controls and mitigate data exfiltration when using AI Platform Notebooks by: 

  1. Helping you implement a set of best practices based on common customer inputs.

  2. Minimizing time to deployment by using a declarative configuration with Terraform.

  3. Allowing for reproducibility by leveraging the Google Cloud security foundations blueprint.

The blueprint deploys the following architecture:

[Diagram: architecture deployed by the blueprint]

The above diagram illustrates an architecture for implementing security with the following approach:

  1. Gather resources around common contexts as early as possible.

  2. Create network boundaries that only allow for necessary communications.

  3. Apply least-privilege principles when setting up authorization policies.

  4. Protect sensitive information at the data and software level.

1. Gather resources around common contexts as early as possible

With Google Cloud, you can gather resources that share a common theme using a resource hierarchy that you can customize. The Google Cloud security foundations blueprint sets a default organization’s hierarchy. The blueprint adds a folder and projects related to handling sensitive production data while using AI Platform Notebooks.

A “trusted” folder under the “production” folder contains three projects, organized by logical function:

  • “trusted-kms” gathers resources such as keys and secrets that protect data.

  • “trusted-data” gathers sensitive data.

  • “trusted-analytics” gathers resources such as notebooks that access data.
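This hierarchy can be sketched in Terraform; the parent folder reference and project ID suffix below are assumptions, and project IDs must be globally unique:

```hcl
# Sketch: the "trusted" folder and its three projects.
resource "google_folder" "trusted" {
  display_name = "trusted"
  parent       = google_folder.production.name # assumes a "production" folder exists
}

resource "google_project" "trusted_kms" {
  name       = "trusted-kms"
  project_id = "trusted-kms-${var.suffix}"
  folder_id  = google_folder.trusted.name
}

resource "google_project" "trusted_data" {
  name       = "trusted-data"
  project_id = "trusted-data-${var.suffix}"
  folder_id  = google_folder.trusted.name
}

resource "google_project" "trusted_analytics" {
  name       = "trusted-analytics"
  project_id = "trusted-analytics-${var.suffix}"
  folder_id  = google_folder.trusted.name
}
```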

Grouping resources around a common context allows for high level resource management and provides the following advantages compared to setting rules at the resource level:

  1. Reduces the risk of a security breach. You can apply security rules to a desired entity and propagate them to lower levels via policy inheritance across your resource hierarchy.

  2. Ensures that administrators have to actively create bridges between resources. By default, projects are sandboxed environments of resources.

  3. Facilitates future organizational changes. Setting rules at a high level makes it easier to move groups of resources together.

2. Create network boundaries that only allow for necessary communications.

Google Cloud provides VPCs for defining networks of resources. The previous sections cover the separation of functions through projects. VPCs belong to projects, so by default, resources in one VPC cannot communicate with resources in another VPC.

An administrator must explicitly allow or block network communications:

  1. With the internet: Instances in Google Cloud can have internal and external IP addresses. The blueprint sets a default policy at the trusted folder level that forbids the use of external IP addresses.

  2. With Google APIs: Without external IP addresses, instances cannot access the public endpoints of Cloud Storage and BigQuery. The blueprint sets private connectivity to Google APIs at the VPC level to allow notebooks communication with those services.

  3. Within boundaries: Limits the services, such as BigQuery or Cloud Storage, that notebooks can access. The blueprint sets up VPC Service Controls to create trusted perimeters, within which only resources in certain projects can access certain services, based on access policies for user and device clients.

  4. Between resources: The blueprint creates notebooks using an existing shared VPC. The shared VPC should have restrictive firewall rules to limit the protocols that instances can use to communicate with each other.

The blueprint uses Google Cloud’s network features to set the minimum required network paths as follows:

  • Enables users to access Google Cloud endpoints through allowlisted devices.

  • Allows for the creation of SSH tunnels for users to access notebook instances.

  • Connects instances to Google services through private connections within an authorized perimeter.
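Two of these network controls can be sketched in Terraform as follows; the folder reference, subnet range, and VPC reference are placeholders:

```hcl
# Sketch: forbid external IPs for all VMs under the trusted folder.
resource "google_folder_organization_policy" "no_external_ip" {
  folder     = google_folder.trusted.name
  constraint = "constraints/compute.vmExternalIpAccess"

  list_policy {
    deny {
      all = true
    }
  }
}

# Sketch: a subnet with Private Google Access, so instances without external
# IPs can still reach Google APIs (e.g., BigQuery, Cloud Storage) privately.
resource "google_compute_subnetwork" "trusted_subnet" {
  name                     = "trusted-subnet"
  ip_cidr_range            = "10.0.0.0/24"
  region                   = "us-central1"
  network                  = google_compute_network.trusted_vpc.id
  private_ip_google_access = true
}
```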

3. Apply least-privilege principles when setting up authorization policies.

Google Cloud provides a default Cloud IAM setup to make platform onboarding easier. For production environments, we recommend not relying on most of those defaults; instead, use Cloud IAM to create custom identities and authorization rules based on your requirements.

Google Cloud provides features to implement the least-privileged principle while setting up a separation of duties:

  1. Custom roles provide a way to group a minimum set of permissions for restricting access. This ensures that a role allows identities to only perform the tasks expected of them.

  2. Service accounts can represent an instance identity and act on behalf of trusted users. This allows for consistent behavior and limits user actions outside of those computing resources.

  3. Logical identity groups based on user persona simplify management by limiting the number of lone and possibly forgotten identities.

  4. Cloud IAM policies link roles and identities. This provides users with the means to do their job while mitigating the risk of unauthorized actions.

For example, the blueprint:

  • Creates a service account with enough roles to run jobs and act as an identity for notebook instances in the trusted-analytics project.

  • Assigns roles to a pre-created group of trusted scientists to allow them to use notebooks to interact with data.

  • Creates a custom role in the trusted-data project with view-only access to sensitive information in BigQuery, without being allowed to modify or export the data.

  • Binds the custom role to relevant user groups and service accounts so they can interact with data in the trusted-data project.
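The custom view-only role and its binding can be sketched in Terraform; the project variables, role ID, and the specific BigQuery permissions chosen here are illustrative:

```hcl
# Sketch: a custom role allowing reads of BigQuery table data, but not
# modification or export.
resource "google_project_iam_custom_role" "restricted_data_viewer" {
  project = var.trusted_data_project
  role_id = "restrictedDataViewer"
  title   = "Restricted Data Viewer"

  permissions = [
    "bigquery.tables.get",
    "bigquery.tables.getData",
  ]
}

# Service account that acts as the identity of the notebook instances.
resource "google_service_account" "sa_p_notebook_compute" {
  project    = var.trusted_analytics_project
  account_id = "sa-p-notebook-compute"
}

# Bind the custom role to the notebook service account in the data project.
resource "google_project_iam_member" "bind_viewer" {
  project = var.trusted_data_project
  role    = google_project_iam_custom_role.restricted_data_viewer.id
  member  = "serviceAccount:${google_service_account.sa_p_notebook_compute.email}"
}
```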

Through Terraform, the blueprint creates the following flow:

  1. Adds users from the trusted_scientists variable to the pre-created trusted-data-scientists Google Group.

  2. Sets a policy for identities in the trusted-data-scientists group to use the service account sa_p_notebook_compute.

  3. Creates an individual notebook instance per trusted user and leverages the sa_p_notebook_compute service account as an identity for the instances.

With this setup, users can access confidential data in the trusted-data project through the service account, which acts as an identity for instances in the trusted-analytics project. 

Note: All trusted users can access all confidential data. Setting narrower permissions is out of scope for this blueprint. Narrower permissions can be set by creating multiple service accounts and limiting their data access at the required level (a specific column, for example), then assigning each service account to the relevant group of identities.

4. Protect sensitive information at the data and software level.

Google Cloud provides default features to protect data at rest, and additional security features for creating a notebook.

The blueprint encrypts data at rest using keys, and shows how to:

  • Create highly available customer-managed keys in your own project.

  • Limit key access to select identities.

  • Use keys to protect data from BigQuery, Cloud Storage and AI Platform Notebooks in other projects within the relevant perimeter.

For more details, see the key management section of the blueprint guide.

AI Platform Notebooks leverage Jupyter notebooks set up on Compute Engine instances. When creating a notebook, the blueprint uses AI Platform Notebooks customization features to:

  • Set additional security parameters, such as preventing “sudo”.

  • Limit access to external sources when calling deployment scripts.

  • Modify the Jupyter setup to mitigate the risk of file downloads from the JupyterLab UI.
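These hardening options map to Deep Learning VM metadata keys that can be set when the instance is created. A sketch in Terraform (instance name and zone are placeholders):

```hcl
# Sketch: a hardened notebook instance with no public IP and DLVM
# metadata keys that restrict sudo, downloads, exports, and the terminal.
resource "google_notebooks_instance" "hardened_notebook" {
  name         = "hardened-notebook"
  location     = "us-central1-a"
  machine_type = "n1-standard-4"
  no_public_ip = true

  metadata = {
    notebook-disable-root      = "true" # prevent "sudo" in the notebook
    notebook-disable-downloads = "true" # hide file downloads in the JupyterLab UI
    notebook-disable-nbconvert = "true" # block notebook export/conversion
    notebook-disable-terminal  = "true" # disable the JupyterLab terminal
  }

  vm_image {
    project      = "deeplearning-platform-release"
    image_family = "common-cpu-notebooks-debian-10"
  }
}
```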

For more details, see the AI Platform Notebooks security controls section of the blueprint guide.

To learn more about protecting your confidential data while better enabling your data scientists, read the guide: Protecting confidential data in AI Platform Notebooks. We hope that this blueprint, as well as our ever-expanding portfolio of blueprints available on our Google Cloud security best practices center, helps you build security into your Google Cloud deployments from the start, and helps make you safer with Google.
