What is GKE Workload identity?

by Linus Karlsson
2020-02-17
7 min

Workload identity is a modern way to provision keys for pods running on Google Kubernetes Engine. It allows individual pods to use a service account with a suitable set of permissions, without manually managing Kubernetes secrets.

In this article, we will describe Workload identity, compare it to other approaches, and finally show a real-world example of how to configure a Kubernetes cluster with Workload identity enabled. Here at Debricked, we love security-enhancing measures that help companies and individuals protect their data and reduce their cybersecurity risk, and Workload identity is one such measure you can use to strengthen your cloud deployment.

Previous approaches to Workload identity

Workload identity is a fairly recent addition to Google Cloud Platform, introduced in the middle of 2019. It tries to remedy the lack of a flexible but secure way to access other Google Cloud Services from within Kubernetes pods. It does so by associating Kubernetes service accounts with Google service accounts. Details can be found in the introductory blog post from GCP, as well as in the official documentation for Workload identity. Let’s briefly discuss some previously used strategies, and their main drawbacks, providing a motivation for the introduction of Workload identity.

Exported service account keys

With this approach, a Google service account is manually created and granted the required roles for the desired cloud services. To actually use the service account inside the pod, the service account keys are exported as a JSON file. This JSON file must then be provided to the pod in some way, for example through the use of Kubernetes secrets. In that case, the secret is mounted inside the container by Kubernetes, and can be used by applications in the pod. Since the keys must be explicitly handled by the user, they are called user-managed keys.
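
As a sketch of this legacy approach (the service account, file, and secret names below are just examples), the key export and secret creation might look like this:

```shell
# Export a user-managed key for an existing Google service account
# (account name, project, and file name are illustrative).
gcloud iam service-accounts keys create key.json \
  --iam-account my-sa@my-project.iam.gserviceaccount.com

# Store the exported key as a Kubernetes secret, to be mounted into the pod.
kubectl create secret generic my-sa-key --from-file=key.json=key.json
```

The application in the pod would then point `GOOGLE_APPLICATION_CREDENTIALS` at the mounted key file.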

The main drawback of this solution is that the exported keys have a very long lifetime, 10 years by default. Best practices include periodic rotation of service account keys, which is inconvenient to enforce when the keys are user-managed. Manual key management also increases the likelihood of user mistakes, such as accidentally leaving keys on systems where they don’t belong. This is especially troublesome when the keys have a long lifetime.

Compute engine service accounts

Another option is to modify the service account the individual Compute Engine instances run as. Each Kubernetes node will run on an instance, and each pod can by default use the service account of the node it runs on. By modifying the roles of the instance’s service account, it can be granted permission to access the desired cloud services. Since these accounts are managed by GCP, it means that, e.g., key rotation is handled automatically.

The main drawback is that Kubernetes nodes in a cluster can run several different pods, where each pod may require a different set of roles of the service account. Since the service account is connected to the instance, this requires the service account to be granted permission for the sum of all roles required by the different pods. This means that each individual pod may be granted higher privileges than required, which violates the principle of least privilege.

Workload identity

The idea of Workload identity is to provide a construction that solves the drawbacks described above, by:

  • Having the credentials managed by GCP, which provides automatic key rotation without users handling the keys manually, and prevents accidental exposure by removing the key export step.
  • Making it possible to assign a service account to individual pods, giving each pod only the set of permissions it requires.

Each pod retrieves its key from the metadata server, which means that the instance’s service account does not need to be used. Thus, not only do the pods run with a minimal set of roles, but the instance itself does as well.
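
From inside a pod on a Workload identity-enabled cluster, the credentials are served by the GKE metadata server. As a quick sanity check, you can query which identity a pod sees (the hostname and header below follow the standard GCE metadata conventions):

```shell
# Ask the metadata server which service account identity this pod has.
# The Metadata-Flavor header is required, or the request is rejected.
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"
```

On a pod with a bound Kubernetes service account, this returns the email of the associated Google service account rather than the instance’s account.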

In the figure below, we see an overview of what is achieved by Workload identity, where each individual pod can use a dedicated service account with a suitable set of roles.

Setting up a Kubernetes cluster

In this section, we show a brief example of how to set up a small Kubernetes cluster with Google Kubernetes Engine. The service will run a simple Node.js application, which will read and write data to Google Storage, using Workload identity to securely provision the pods with keys. The underlying Compute Engine instances will run with a minimal set of privileges.

Get started with Google Kubernetes Engine

We assume you already have an account on Google Cloud Platform, the Google Cloud SDK installed on your local machine, and credentials for your own personal account available so that you can use the gcloud and gsutil commands. Furthermore, you need a storage bucket that can be used for the example below. If you have not used any of the tools before, we recommend that you familiarize yourself with the Google Kubernetes Engine Quickstart before you continue with this example.
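
If you still need to authenticate and create a bucket, a minimal sketch looks like this (the project ID, location, and bucket name are examples; bucket names must be globally unique):

```shell
# Authenticate and select a project (project ID is an example).
gcloud auth login
gcloud config set project my-gcp-project

# Create a bucket for the example (name and location are examples).
gsutil mb -l europe-north1 gs://my-unique-example-bucket
```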

All source code for this example can be found in a GitHub repository.

Example application

Our example application is a small Node.js application which can list the files in the bucket, as well as read and write a file with the fixed name test.txt. The source code can be found below, or in the GitHub repo for this example.

const express = require('express')
const { Storage } = require('@google-cloud/storage')
const app = express()
 
const storage = new Storage()
const bucketName = process.env.BUCKET_NAME
 
app.get('/', (req, res) => {
 res.send('This is an example application')
})
 
app.get('/list', async (req, res) => {
 const [files] = await storage.bucket(bucketName).getFiles()
 res.send({
   files: files.map(file => file.name)
 })
})
 
app.get('/test', (req, res) => {
 storage.bucket(bucketName).file('test.txt').createReadStream()
   .on('error', err => {
     console.error('Got error while reading file.')
     console.error(err)
     res.status(500).send(`Could not read file, got error: ${JSON.stringify(err)}`)
   })
   .pipe(res)
})
 
app.post('/test', (req, res) => {
 const file = storage.bucket(bucketName).file('test.txt')
 req
   .pipe(file.createWriteStream())
   .on('error', err => {
     console.error('Got error while writing file.')
     console.error(err)
     res.status(500).send(`Could not write file, got exception: ${JSON.stringify(err)}`)
   })
   .on('finish', () => {
     res.sendStatus(204)
   })
})

const port = process.env.PORT || 8080
app.listen(port, () => console.log(`Listening on port ${port}`))

Example Kubernetes cluster

We now want to run this small application within a Kubernetes cluster. We define two objects in two separate files: one for the deployment, and one for the load balancer. Their contents can be seen below or in the GitHub repository. An important part to notice is the namespace, which will be used later on when enabling Workload identity.

First, the deployment file. Note that you will have to modify at least the bucket name to match your own bucket name. If you want to modify the application from the previous section, you also need to modify the image name and push your own image to a container registry.

# This deployment describes the pods.
apiVersion: apps/v1
kind: Deployment
metadata:
 name: example-gke-workload-identity
 namespace: storage-consumer-ns
spec:
 replicas: 2
 selector:
   matchLabels:
     app: hello
 template:
   metadata:
     labels:
       app: hello
   spec:
     serviceAccountName: storage-consumer-ksa
     containers:
     - name: hello-app
       # TODO: Replace with your own image, or use the debricked example.
       image: debricked/example-gke-workload-identity-app:latest
       ports:
       - containerPort: 8080
       env:
         - name: PORT
           value: "8080"
       # TODO: Replace with your own bucket name.
         - name: BUCKET_NAME
           value: gke-workload-identity-playground-bucket

Then the load balancer service.

# The service provides a load-balancer to access the pods.
apiVersion: v1
kind: Service
metadata:
 name: hello
 namespace: storage-consumer-ns
spec:
 type: LoadBalancer
 selector:
   app: hello
 ports:
 - port: 80
   targetPort: 8080

Commands to launch cluster with Workload identity enabled

We now have all the prerequisites for our small example cluster; what remains is to actually enable Workload identity. Below, I will describe the different commands, but you can also find them as a shell script in the GitHub repo. Don’t forget to modify the variables at the top of the file to match your environment!

First, if you just wish to run the example code as-is, clone the GitHub repo, modify the variables at the top of the file, and:

  • Run ./cluster_workload_identity.sh create to create the cluster, and
  • Run ./cluster_workload_identity.sh destroy to destroy it when finished.

However, keep reading to get an explanation of the most important steps of cluster_workload_identity.sh.
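
The script relies on a handful of variables. A minimal sketch of what they might look like is shown below; all values are placeholders you should adjust to your own environment, and only the namespace and Kubernetes service account names must match the YAML manifests above.

```shell
# Example variable definitions; all values are placeholders.
PROJECT="my-gcp-project"
ZONE="europe-north1-a"
NETWORK="default"
CLUSTER="storage-consumer"
BUCKET="my-unique-example-bucket"

# Google service accounts: one for the nodes, one for the pods.
RUNNER_GSA="gke-node-runner"
RUNNER_GSA_FULL="${RUNNER_GSA}@${PROJECT}.iam.gserviceaccount.com"
GSA="storage-consumer"
GSA_FULL="${GSA}@${PROJECT}.iam.gserviceaccount.com"

# Kubernetes namespace and service account (must match the manifests).
K8S_NAMESPACE="storage-consumer-ns"
KSA="storage-consumer-ksa"
```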

Creating the restricted instance runner account

First, we create the service account the instances will run under. This account will have a very limited set of roles, namely: logging.logWriter, monitoring.metricWriter, and monitoring.viewer. The name of the service account is defined in the RUNNER_GSA variable, while RUNNER_GSA_FULL contains the fully-qualified identity string.

# Create restricted service account to run cluster nodes under.
 gcloud iam service-accounts create "${RUNNER_GSA}" --display-name="${RUNNER_GSA}"
 
 gcloud projects add-iam-policy-binding ${PROJECT} \
   --member "serviceAccount:${RUNNER_GSA_FULL}" \
   --role roles/logging.logWriter
 
 gcloud projects add-iam-policy-binding ${PROJECT} \
   --member "serviceAccount:${RUNNER_GSA_FULL}" \
   --role roles/monitoring.metricWriter
 
 gcloud projects add-iam-policy-binding ${PROJECT} \
   --member "serviceAccount:${RUNNER_GSA_FULL}" \
   --role roles/monitoring.viewer

Creating the cluster with Workload identity support

To enable Workload identity on the GKE cluster, it needs to be assigned to an identity namespace. This namespace contains the mapping between Google service accounts and Kubernetes service accounts. The assignment is done using the --identity-namespace flag during cluster creation (in newer versions of the gcloud SDK, this flag has been renamed to --workload-pool). We also ensure that our runner account from above is used by the underlying instances of the cluster.

 gcloud beta container clusters create "${CLUSTER}" \
   --enable-ip-alias \
   --enable-autoupgrade \
   --zone="$ZONE" \
   --network="${NETWORK}" \
   --metadata disable-legacy-endpoints=true \
   --identity-namespace="$PROJECT".svc.id.goog \
   --service-account="${RUNNER_GSA_FULL}"

Creating the service account for cloud storage access

Next, we create the service account that the pods will actually use to access Google Cloud Platform, and give it full access to the specified bucket. It will not have access to other services.

# create service account that the pod should use
 gcloud iam service-accounts create "$GSA" --display-name="${GSA}"
 
 # give it admin permissions to this storage bucket only
 gsutil iam ch "serviceAccount:${GSA_FULL}:roles/storage.objectAdmin" "gs://${BUCKET}"

Create namespace and Kubernetes service accounts

We can now move to the Kubernetes part and create a Kubernetes service account and a Kubernetes namespace. We will later connect these with the Google service account from the previous section.

 # get credentials to cluster
 gcloud container clusters get-credentials "${CLUSTER}" --zone="$ZONE"
 
 # create k8s namespace
 kubectl create namespace "$K8S_NAMESPACE"
 
 # create k8s service account in namespace
 kubectl create serviceaccount --namespace "$K8S_NAMESPACE" "$KSA"

Connecting the different accounts

Finally, we can bind the two different types of service accounts together. This allows the Kubernetes service account to act as the Google service account, thus allowing the pod to access cloud services.

 # Allow the Kubernetes service account to use the Google service account by creating a Cloud IAM policy
 # binding between the two. This binding allows the Kubernetes Service account to act as the Google service account.
 gcloud iam service-accounts add-iam-policy-binding \
   --role roles/iam.workloadIdentityUser \
   --member "serviceAccount:${PROJECT}.svc.id.goog[${K8S_NAMESPACE}/${KSA}]" \
   "${GSA_FULL}"
 
 kubectl annotate serviceaccount \
   --namespace "${K8S_NAMESPACE}" \
   "${KSA}" \
   "iam.gke.io/gcp-service-account=${GSA_FULL}"

Deployment

With everything connected, we can deploy our application using kubectl as below.

 kubectl apply -f deployment.yaml
 kubectl apply -f loadbalancer.yaml

After a while (up to a minute), we can fetch the external IP of our load balancer using the following command:

 kubectl get services --namespace "${K8S_NAMESPACE}"
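
To grab just the external IP (assuming the service name hello from the manifest above), a jsonpath output expression can be used:

```shell
# Print only the external IP of the load balancer (may be empty until
# the cloud provider has assigned one).
kubectl get service hello --namespace "${K8S_NAMESPACE}" \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```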

Trying it out

You can now connect to the application either by using the browser or by using curl.

To add some content to test.txt, use, e.g., curl to POST data to store in the file.

curl -d 'Some test data' http://<IP-ADDRESS>/test 

Retrieve the data again:

curl http://<IP-ADDRESS>/test

List all files in the bucket:

curl http://<IP-ADDRESS>/list

Destroying the cluster

When finished, destroy the cluster and remove the created service accounts to avoid paying for an unused cluster.

 # to delete cluster when done
 gcloud container clusters delete "${CLUSTER}" --zone="$ZONE"
 
 # delete service account, and its assigned roles.
 gcloud iam service-accounts remove-iam-policy-binding --role roles/iam.workloadIdentityUser --member "serviceAccount:${PROJECT}.svc.id.goog[${K8S_NAMESPACE}/${KSA}]" "${GSA_FULL}"
 gsutil iam ch -d "serviceAccount:${GSA_FULL}" "gs://${BUCKET}"
 gcloud iam service-accounts delete "${GSA_FULL}"
 
 # deleting runner, and its assigned roles.
 gcloud projects remove-iam-policy-binding ${PROJECT} --member "serviceAccount:${RUNNER_GSA_FULL}" --role roles/logging.logWriter
 gcloud projects remove-iam-policy-binding ${PROJECT} --member "serviceAccount:${RUNNER_GSA_FULL}" --role roles/monitoring.metricWriter
 gcloud projects remove-iam-policy-binding ${PROJECT} --member "serviceAccount:${RUNNER_GSA_FULL}" --role roles/monitoring.viewer
 gcloud iam service-accounts delete "${RUNNER_GSA_FULL}"
  1. Nitin
    about 3 years ago

    Thanks for the great article. I have one query: suppose I am connecting to Firestore and Pub/Sub in my code, and I use the JSON key of the corresponding service accounts to create the builders like below.

    SubscriberServiceApiClientBuilder or FirestoreDbBuilder objects are created using the respective Service account keys.
    For example as below:

    _firestoreInstance = new FirestoreDbBuilder
    {
    ProjectId = projectId, JsonCredentials = _fireStoreSAJsonKey
    }.Build();

    I was reading the JSON _fireStoreSAJsonKey from a secret. Now the question is: with Workload identity coming into the picture, how will I get the content of the service account key?

    1. Emil Olsson
      about 3 years ago

      Hi! I haven’t used Firestore together with Workload identity myself, but this question on Stack Overflow seems to provide an answer https://stackoverflow.com/a/62825558

      1. Nitin
        about 3 years ago

        Thanks for the answer. It helped me.

  2. yasuhisa katsumi
    about 4 years ago

    Hi, Thanks for the good explanation, this page helped me a lot for using workload identity.

    regards
    katsumi