
Set up multi-cluster Anthos Service Mesh using managed control plane


This is a simple step-by-step guide to set up a multi-cluster Anthos Service Mesh (ASM) using the managed control plane. 

The accompanying code can be found here: https://github.com/caddac/anthos-examples/tree/main/asm 

What is a service mesh and why would you need one?

A service mesh is an abstraction built on top of various compute platforms (in this case, Kubernetes clusters) that simplifies inter-service communication, observability and security. It moves these responsibilities into the platform, which keeps the applications themselves simpler. In this blog post, we'll deploy Anthos Service Mesh (ASM), which is a managed Istio installation. Istio is a service mesh for Kubernetes and a powerful tool for gaining consistent insight into service-to-service observability and security. Istio achieves this by routing all inter-service communication through Envoy proxies that run as sidecars to your pods.

Services communicate with each other through Istio's consistent, transparent patterns and don't need to concern themselves with how to reach another service, how to secure incoming requests, or how to publish metrics about inter-service traffic; Istio handles all of it. These features matter even more in a multi-cluster configuration, where the operational overhead is even higher.

Using a service mesh does come with drawbacks, including setup complexity, resource overhead and cost. Istio has many features and several ways to configure it, including istioctl, the IstioOperator API and Helm. Note, however, that Anthos Service Mesh with the managed control plane supports neither the IstioOperator API nor Helm.

OK, here we go!

In this tutorial, we’ll build two GKE clusters and deploy Istio’s helloworld sample application to them. Then we’ll install a single ingress gateway (on just one cluster) and show that traffic is load-balanced across the four helloworld pods running on the two GKE clusters.


Deploy the terraform

In the asm/infra directory, update the values in the var_inputs.auto.tfvars file.
Initialize and apply it:

$ terraform init
$ terraform apply


Note: If the apply fails, try applying again. Terraform sometimes gets ahead of itself when applying this many dependent resources at once; I had to apply three times before all the resources were created.

There is a lot going on here, so let's cover it briefly.

This deploys a new GCP project and enables several required APIs [1]. A quick note here: the official docs seem to be missing the sts and anthos APIs, so make sure you enable those.
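For reference, the API enablement looks roughly like the sketch below. The exact service list and the var.project_id variable are illustrative; check the repo for the authoritative version.

# Sketch only: enable the APIs ASM needs, including the sts and anthos
# services that the official docs leave out.
locals {
  services = [
    "container.googleapis.com",
    "gkehub.googleapis.com",
    "mesh.googleapis.com",
    "sts.googleapis.com",
    "anthos.googleapis.com",
  ]
}

resource "google_project_service" "apis" {
  for_each = toset(local.services)

  project            = var.project_id # placeholder for the new project's ID
  service            = each.value
  disable_on_destroy = false
}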

It also deploys a VPC, two subnetworks with secondary ranges and a firewall rule to let everything communicate. Finally, it enables the hub mesh feature for the project.
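The less obvious pieces look something like the sketch below; the names, CIDRs and the google_compute_network.vpc reference are placeholders, and the repo has the real values.

# One of the two subnets. The secondary ranges feed the cluster's
# ip_allocation_policy later on.
resource "google_compute_subnetwork" "cluster1" {
  name          = "cluster1-subnet"
  region        = "us-west1"
  network       = google_compute_network.vpc.id
  ip_cidr_range = "10.0.0.0/20"

  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.4.0.0/14"
  }
  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.8.0.0/20"
  }
}

# Let nodes and pods in both clusters talk to each other.
resource "google_compute_firewall" "allow_internal" {
  name          = "allow-internal"
  network       = google_compute_network.vpc.id
  source_ranges = ["10.0.0.0/8"]

  allow {
    protocol = "all"
  }
}

# Enable the fleet-level mesh feature for the project.
resource "google_gke_hub_feature" "mesh" {
  provider = google-beta
  name     = "servicemesh"
  location = "global"
}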

The cluster files each deploy a cluster; they differ only in cluster name, region and subnet references. Two things are worth noting in the cluster config. First, we set the mesh_id label so GCP knows which mesh the cluster belongs to. Second, we enable Workload Identity, which allows Kubernetes service accounts to act as GCP service accounts [2] without needing to download or even create JSON keys.
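The interesting parts of the cluster resource look roughly like this. It's a sketch that builds on the subnet above; var.project_number, the node count and the range names are placeholders.

resource "google_container_cluster" "cluster1" {
  name               = "${var.project_id}-cluster1"
  location           = "us-west1"
  network            = google_compute_network.vpc.id
  subnetwork         = google_compute_subnetwork.cluster1.id
  initial_node_count = 2

  # GCP uses this label to associate the cluster with a mesh.
  resource_labels = {
    mesh_id = "proj-${var.project_number}"
  }

  # Workload Identity: Kubernetes service accounts can act as GCP service
  # accounts without JSON keys.
  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }

  # VPC-native networking backed by the subnet's secondary ranges.
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }
}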

We then register each cluster to the fleet, using the cluster name as the membership ID.
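Continuing the sketch, the fleet registration is a single resource per cluster:

resource "google_gke_hub_membership" "cluster1" {
  # Use the cluster name as the membership ID.
  membership_id = google_container_cluster.cluster1.name

  endpoint {
    gke_cluster {
      resource_link = "//container.googleapis.com/${google_container_cluster.cluster1.id}"
    }
  }
}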

Finally, we install the managed control plane. We’re using automatic management (that’s the “managed” part), but which revision do we get? Check the revision label on the namespace to see which revision of the control plane is being used [5]. Yes, it’s that easy!
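In Terraform terms this is one feature-membership resource per cluster with automatic management turned on; again a sketch building on the resources above.

resource "google_gke_hub_feature_membership" "mesh_cluster1" {
  provider   = google-beta
  location   = "global"
  feature    = google_gke_hub_feature.mesh.name
  membership = google_gke_hub_membership.cluster1.membership_id

  mesh {
    # Google picks and upgrades the control-plane revision based on the
    # cluster's release channel.
    management = "MANAGEMENT_AUTOMATIC"
  }
}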

At this point, we have two clusters in the same fleet, both with ASM installed using the managed control plane. We could start installing Istio components now, but the clusters don’t have cross-cluster service discovery working yet.

Configure cross cluster service discovery

We need to install secrets for each cluster onto every other cluster in our fleet. Luckily, there is a CLI tool for this: asmcli. Unfortunately, it doesn’t work on macOS yet, so it’s easiest to run it in Cloud Shell.

Open Cloud Shell in your project by browsing to the GCP console and clicking the Activate Cloud Shell button.


After the shell pops up at the bottom of the screen, run the following commands:

$ curl https://storage.googleapis.com/csm-artifacts/asm/asmcli_1.13 > asmcli
$ chmod +x asmcli
$ ./asmcli create-mesh ${GOOGLE_CLOUD_PROJECT} \
   ${GOOGLE_CLOUD_PROJECT}/us-west1/${GOOGLE_CLOUD_PROJECT}-cluster1 \
   ${GOOGLE_CLOUD_PROJECT}/us-east1/${GOOGLE_CLOUD_PROJECT}-cluster2


Quick note here: If you are using private clusters, there are additional steps to enable cross-cluster service discovery [6][7].

Deploy the apps and Istio components

From the asm directory, we can install the apps in manifests/ to test our mesh. On cluster 1, we’re going to install the helloworld app and the Istio components for ingress; cluster 2 only gets the helloworld app.

Start by getting the contexts for the clusters:

export PROJECT_ID=<gcp_project_id>
gcloud container clusters get-credentials ${PROJECT_ID}-cluster1 --region us-west1 --project ${PROJECT_ID}
gcloud container clusters get-credentials ${PROJECT_ID}-cluster2 --region us-east1 --project ${PROJECT_ID}


Install cluster 1 by running:

kubectl apply -k manifests/ --context=gke_${PROJECT_ID}_us-west1_${PROJECT_ID}-cluster1


And cluster 2 by running:

kubectl apply -k manifests/app/ --context=gke_${PROJECT_ID}_us-east1_${PROJECT_ID}-cluster2


Review your work

After a few minutes, everything should be up, and you can inspect your work.

Verify all the helloworld pods are running on both clusters:

$ export PROJECT_ID=<gcp_project_id>
$ kubectl get pods -n helloworld --context=gke_${PROJECT_ID}_us-west1_${PROJECT_ID}-cluster1

$ kubectl get pods -n helloworld --context=gke_${PROJECT_ID}_us-east1_${PROJECT_ID}-cluster2


Verify the ingress-gateway deployment is running on cluster 1:

$ kubectl get deploy -n asm-gateways --context=gke_${PROJECT_ID}_us-west1_${PROJECT_ID}-cluster1


Verify the ingress-gateway service has an external IP on cluster 1:

$ kubectl get svc -n asm-gateways --context=gke_${PROJECT_ID}_us-west1_${PROJECT_ID}-cluster1



Finally, verify cross-cluster load balancing works by querying the public endpoint several times. You’ll notice the version and pod ID change as different pods are hit; you should see four different pod IDs and two different versions.


$ curl <your external ip here>/hello   



Customizing and viewing logs

Enable Envoy access logs

We already did this back when we applied the ASM configuration. Check out the file asm/manifests/gateways/asm-config.yaml: the ConfigMap it deploys configures the Envoy proxies to write their access logs to stdout. There are other optional features you can enable in similar ways [3].

Control plane logs

You can still view control plane logs when using the managed control plane. They are in Logs Explorer under the resource “Istio Control Plane”.

References

  1. https://cloud.google.com/service-mesh/docs/managed/auto-control-plane-with-fleet#before_you_begin 
  2. https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
  3. https://cloud.google.com/service-mesh/docs/managed/enable-managed-anthos-service-mesh-optional-features
  4. https://cloud.google.com/service-mesh/docs/managed/auto-control-plane-with-fleet
  5. https://cloud.google.com/service-mesh/docs/managed/select-a-release-channel#how_to_select_a_release_channel 
  6. https://cloud.google.com/service-mesh/docs/unified-install/gke-install-multi-cluster#private-clusters-endpoint
  7. https://cloud.google.com/service-mesh/docs/unified-install/gke-install-multi-cluster#private-clusters-authorized-network
