Version: 1.5.x

Revisioned Istio Control Plane

Alpha Feature

Revisioned Istio control plane is an Alpha feature and is not recommended for production usage.

An Istio deployment on a control plane cluster can be marked with an arbitrary revision. When installed as revisioned, every component of the Istio control plane is identified by the configured revision. This allows more than one Istio control plane to run side by side on the same cluster, which makes upgrades safer and easier to control.

Installation

Upgrade

To upgrade from a non-revisioned to a revisioned control plane, follow the steps in the Upgrades section.

For a fresh installation, follow the standard steps for onboarding a control plane cluster, with the following changes:

Deploy Operators

Customize the tctl command to specify the revision when generating the cluster operators manifest. There are two ways to do this.

  1. From the CLI itself, using the --set flag:

    The following example uses canary as the revision value. You can replace it with any value you want.

    tctl install manifest cluster-operators \
    --registry <registry-location> \
    --set "operator.deployment.env[0].name=CONTROL_PLANE_REVISION" \
    --set "operator.deployment.env[0].value=canary" > clusteroperators.yaml
  2. By passing a values file that you created. As an example:

    operator:
      deployment:
        env:
        - name: CONTROL_PLANE_REVISION
          value: canary

    And then running:

    tctl install manifest cluster-operators \
    --registry <registry-location> --values /path/to/values.yaml > clusteroperators.yaml
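
The values-file approach can also be scripted when the revision changes between environments. The following is a minimal sketch, assuming a POSIX shell; write_revision_values is a hypothetical helper name and is not part of tctl.

```shell
# Write a values file that sets CONTROL_PLANE_REVISION for the operator.
# Usage: write_revision_values <revision> <output-file>
# NOTE: write_revision_values is a hypothetical helper, not a tctl command.
write_revision_values() {
  cat > "$2" <<EOF
operator:
  deployment:
    env:
    - name: CONTROL_PLANE_REVISION
      value: $1
EOF
}
```

For example, write_revision_values canary values.yaml produces a file that can be passed to tctl with --values, as in the command above.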

Apply this using

kubectl apply -f clusteroperators.yaml
Data plane operator

The clusteroperators.yaml manifest contains no data plane operator deployment, because a revisioned control plane no longer needs one. Gateway (Ingress/Egress/Tier1) deployments are handled by the TSB control plane operator itself.

Gateway Upgrades Approaches

There are two ways to upgrade gateways with a revisioned control plane. You choose between them by setting the ENABLE_INPLACE_GATEWAY_UPGRADE variable for the XCP component in the ControlPlane CR.

  1. ENABLE_INPLACE_GATEWAY_UPGRADE=true is the default. With an in-place gateway upgrade, the existing gateway deployment is patched with the new proxy image and keeps using the same gateway service, so you don't have to change anything related to the gateway external IP.
  2. ENABLE_INPLACE_GATEWAY_UPGRADE=false creates a new gateway service and deployment for the canary version, so there are now two services:
    1. <gateway-name>, which handles traffic for workloads on the non-revisioned control plane.
    2. <gateway-name>-canary, which handles traffic for workloads on the revisioned control plane. A new external IP is allocated to this newly created service. You can shift traffic between the two versions by using an external load balancer or by updating the DNS entry.
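
The naming rule for the two approaches can be summarised in a few lines of shell. This is only an illustrative sketch of the rule described above; revisioned_gateway_service is a hypothetical helper and does not query the cluster.

```shell
# Print the name of the service that fronts the revisioned gateway.
# Usage: revisioned_gateway_service <gateway-name> <revision> <inplace: true|false>
# NOTE: hypothetical helper that only mirrors the naming rule described above.
revisioned_gateway_service() {
  if [ "$3" = "true" ]; then
    # In-place upgrade: the existing service is reused.
    printf '%s\n' "$1"
  else
    # Canary upgrade: a second service suffixed with the revision is created.
    printf '%s-%s\n' "$1" "$2"
  fi
}
```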

Control Plane Installation

A new field, revision, must be set under the xcp component of the ControlPlane custom resource (CR), with the same value that was used when generating clusteroperators.yaml. By default, TSB performs an in-place gateway upgrade. Set ENABLE_INPLACE_GATEWAY_UPGRADE to false if you want a separate canary deployment and service for the gateway.

apiVersion: install.tetrate.io/v1alpha1
kind: ControlPlane
metadata:
  name: controlplane
  namespace: istio-system
spec:
  hub: <registry-location>
  telemetryStore:
    elastic:
      host: <tsb-address>
      port: <tsb-port>
      version: <elastic-version>
      selfSigned: <is-elastic-use-self-signed-certificate>
  managementPlane:
    host: <tsb-address>
    port: <tsb-port>
    clusterName: <cluster-name-in-tsb>
    selfSigned: <is-mp-use-self-signed-certificate>
  components:
    xcp:
      revision: 'canary' # Revision value. Must match the operator revision value
      centralAuthMode: 'JWT'

This can then be applied to your Kubernetes cluster:

kubectl apply -f controlplane.yaml

After the installation steps are done, look at the deployments, configmaps, and webhooks in the istio-system namespace. All resources that are part of the revisioned Istio control plane have the revision as a suffix in their name.

kubectl get deployment -n istio-system | grep canary
# Output
istio-operator-canary 1/1 1 1 96s
istiod-canary 1/1 1 1 32s
kubectl get configmap -n istio-system | grep canary
# Output
istio-canary 2 105s
istio-sidecar-injector-canary 2 105s
kubectl get validatingwebhookconfiguration -n istio-system | grep canary
# Output
istio-validator-canary-istio-system 1 2m43s
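
The three checks above can be combined into one pass. The following is a minimal sketch, assuming kubectl is configured for the control plane cluster; check_revisioned_resources is a hypothetical helper, and it treats any resource name containing -<revision> as a match.

```shell
# Report, per resource kind, whether a resource named with the revision exists.
# Usage: check_revisioned_resources <revision>
# NOTE: hypothetical helper; the body runs in a subshell, so "exit" only ends
# the function and its variables do not leak into the calling shell.
check_revisioned_resources() (
  revision="$1"
  status=0
  for kind in deployment configmap validatingwebhookconfiguration; do
    if kubectl get "$kind" -n istio-system -o name | grep -q -- "-${revision}"; then
      echo "ok: found ${kind} for revision ${revision}"
    else
      echo "missing: no ${kind} for revision ${revision}" >&2
      status=1
    fi
  done
  exit "$status"
)
```

Calling check_revisioned_resources canary returns a non-zero status if any of the three kinds has no matching resource.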

Sidecar upgrades

Workload namespaces must be labelled with the matching revision so that sidecar proxies connect to the revisioned control plane. If you are upgrading from a non-revisioned control plane, also remove the istio-injection label. In the following example, replace the bookinfo namespace with your application namespace.

kubectl label namespace bookinfo istio-injection- istio.io/rev=canary --overwrite

Then restart the workload pods so that the sidecar proxies are re-injected by the revisioned control plane. You can use rollout restart.

kubectl rollout restart deployment -n bookinfo
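
When several namespaces have to be moved, the two commands above can be wrapped in a small function. This is a sketch under the assumption that every deployment in the namespace should be restarted; move_namespace_to_revision is a hypothetical helper name.

```shell
# Relabel a namespace for a revision and restart its deployments.
# Usage: move_namespace_to_revision <namespace> <revision>
# NOTE: hypothetical helper; it only chains the two kubectl commands shown above.
move_namespace_to_revision() {
  kubectl label namespace "$1" istio-injection- "istio.io/rev=$2" --overwrite &&
    kubectl rollout restart deployment -n "$1"
}
```

The restart only runs if the label command succeeds, so a failed relabel does not trigger an unnecessary rollout.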

To verify that the sidecar is connected to the intended istiod, use the istioctl command:

istioctl pc bootstrap deploy/details-v1 -n bookinfo -o json | grep -i discovery
# Output
"discoveryAddress": "istiod-canary.istio-system.svc:15012",

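The same check can be turned into a pass/fail test for scripts. The sketch below assumes the bootstrap output contains a discoveryAddress line in the form shown above; uses_revision is a hypothetical helper that reads that output from standard input.

```shell
# Succeed if the bootstrap config read from stdin points at istiod-<revision>.
# Usage: istioctl pc bootstrap deploy/details-v1 -n bookinfo -o json | uses_revision canary
# NOTE: uses_revision is a hypothetical helper, not an istioctl subcommand.
uses_revision() {
  grep -i 'discoveryAddress' | grep -q "istiod-$1\.istio-system\.svc"
}
```
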
Gateway Deployment

For each gateway (Ingress/Egress/Tier1) resource, you must set the matching revision.

For example, in your Ingress gateway deployment manifest:

apiVersion: install.tetrate.io/v1alpha1
kind: IngressGateway
metadata:
  name: tsb-gateway-bookinfo
  namespace: bookinfo
spec:
  revision: canary # Revision value. Must match the operator revision value

Apply this using

kubectl apply -f ingress-gateway.yaml

Once applied, this results in a revisioned gateway deployment.
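
For a gateway resource that already exists in the cluster, the revision could also be set with a merge patch instead of editing the manifest. This is a sketch, not a documented TSB procedure: set_gateway_revision is a hypothetical helper, and it assumes the resource name you pass (for example ingressgateway) resolves unambiguously to the install.tetrate.io kind in your cluster.

```shell
# Set spec.revision on an existing gateway install resource.
# Usage: set_gateway_revision <kind> <name> <namespace> <revision>
# ASSUMPTION: <kind> is a resource name kubectl can resolve in your cluster,
# for example ingressgateway for the IngressGateway kind shown above.
set_gateway_revision() {
  kubectl patch "$1" "$2" -n "$3" --type merge \
    -p "{\"spec\":{\"revision\":\"$4\"}}"
}
```

For example: set_gateway_revision ingressgateway tsb-gateway-bookinfo bookinfo canary.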

In-place gateway upgrade

If you're using the in-place gateway upgrade described above in Gateway Upgrades Approaches, the deployment keeps the same name, a new pod is created for it, and the same service is reused.

kubectl get deployments -n bookinfo
# Output
tsb-gateway-bookinfo 1/1 1 1 1m12s
kubectl get svc -n bookinfo
# Output
tsb-gateway-bookinfo LoadBalancer 10.255.10.85 172.29.255.157 15443:31159/TCP,8080:31789/TCP,...

You can inspect the deployment, the service, or the newly created pod to confirm that they now carry the label istio.io/rev=canary.

kubectl get deployments -n bookinfo --show-labels | grep canary

This prints the deployments whose output line contains canary, together with all of their labels.

kubectl get svc -n bookinfo --show-labels | grep canary

This prints the matching services in the same way.

Canary gateway upgrade

If you're using the canary gateway upgrade (i.e. ENABLE_INPLACE_GATEWAY_UPGRADE=false) described above in Gateway Upgrades Approaches, a new deployment and a new pod are created with the revision value as a name suffix, along with a new service that gets a new external IP.

kubectl get deployments -n bookinfo
# Output
tsb-gateway-bookinfo 1/1 1 1 8m12s
tsb-gateway-bookinfo-canary 1/1 1 1 4m19s
kubectl get svc -n bookinfo
# Output
tsb-gateway-bookinfo LoadBalancer 10.255.10.81 172.29.255.151 15443:31159/TCP,8080:31789/TCP,...
tsb-gateway-bookinfo-canary LoadBalancer 10.255.10.85 172.29.255.152 15443:31159/TCP,8080:31789/TCP,...
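
To feed the new address into a load balancer or DNS update, the external IP of each service can be read with a JSONPath query. The following is a minimal sketch; gateway_external_ip is a hypothetical helper, and it assumes the load balancer reports an IP address (some providers report a hostname instead, in which case the field is .hostname).

```shell
# Print the external IP assigned to a LoadBalancer service.
# Usage: gateway_external_ip <service> <namespace>
# NOTE: hypothetical helper; prints nothing if no IP has been assigned yet.
gateway_external_ip() {
  kubectl get svc "$1" -n "$2" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```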

Troubleshooting

  1. Look for the ingressdeployment, egressdeployment, and tier1deployment resources in the istio-system namespace. They correspond to the TSB IngressGateway, EgressGateway, and Tier1Gateway resources respectively.

    kubectl get ingressdeployment -n istio-system
    # Output
    NAME AGE
    tsb-gateway-bookinfo 79s

    If it is missing, the TSB control plane operator did not reconcile the TSB gateway resource into the corresponding XCP resource. First, re-verify that the revision of the TSB control plane operator matches the revision of the gateway resource. Next, check the operator logs for hints.

  2. Look for the corresponding IstioOperator resource in the istio-system namespace. For example:

    kubectl get iop -n istio-system | grep canary
    # Output
    xcpgw-tsb-gateway-bookinfo-canary canary 15m

    If it is missing, the xcp-operator-edge logs should give some hints.

  3. If both checks above pass and the gateway deployment or services are still not deployed, or do not match the IstioOperator resource, the Istio operator deployment logs should give some hints.
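
The three steps above can be sketched as a single function that reports which logs to read next. It covers only Ingress gateways (use egressdeployment or tier1deployment for the other kinds); diagnose_gateway is a hypothetical helper, and it assumes the IstioOperator name contains the gateway name, as in the example output above.

```shell
# Walk through the checks above and print where to look next.
# Usage: diagnose_gateway <gateway-name>
# NOTE: hypothetical helper; it only runs the kubectl queries shown above.
diagnose_gateway() {
  if ! kubectl get ingressdeployment "$1" -n istio-system >/dev/null 2>&1; then
    echo "ingressdeployment $1 missing: check revisions, then TSB control plane operator logs"
  elif ! kubectl get iop -n istio-system -o name | grep -q -- "$1"; then
    echo "IstioOperator for $1 missing: check xcp-operator-edge logs"
  else
    echo "XCP and IstioOperator resources exist: check istio operator deployment logs"
  fi
}
```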

Upgrades

Non-revisioned to revisioned control plane

  • Scale down the TSB data plane operator before starting the upgrade. This avoids a race condition in which the TSB data plane operator and the TSB control plane operator both reconcile the same TSB Ingress/Egress/Tier1Gateway resources.
    kubectl scale --replicas=0 deployment tsb-operator-data-plane -n istio-gateway
  • Install the revisioned control plane by following the installation instructions.
  • To upgrade sidecars, remove the istio-injection=enabled label from the workload namespace and set the istio.io/rev label on it to the Istio revision. Then restart the application workloads.
  • To upgrade workloads running on a virtual machine (VM), restart the Envoy sidecar running on the VM.
  • To upgrade the gateways, add spec.revision to the Ingress/Egress/Tier1Gateway resource as described in the Gateway Deployment section.
  • Support for upgrading gateways running on VMs is a work in progress.
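
Because the first step guards against a race condition, it can be worth confirming that the scale-down took effect before continuing. The following is a minimal sketch; data_plane_operator_scaled_down is a hypothetical helper that reads the desired replica count of the deployment named above.

```shell
# Succeed only if the TSB data plane operator is scaled to zero replicas.
# NOTE: hypothetical helper; the body runs in a subshell, so "exit" only ends
# the function.
data_plane_operator_scaled_down() (
  replicas="$(kubectl get deployment tsb-operator-data-plane -n istio-gateway \
    -o jsonpath='{.spec.replicas}')" || exit 1
  [ "$replicas" = "0" ]
)
```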

Non-revisioned data plane cleanup

Clean up the non-revisioned Istio data plane only after the upgrade has completed, that is, once all sidecars have moved to the revisioned proxy and all application gateways have revisioned gateways running alongside the non-revisioned ones:

  1. Delete the IstioOperator resource named tsb-gateways from the istio-gateway namespace using kubectl.

    kubectl delete iop tsb-gateways -n istio-gateway

    The istio-operator deployment running in the istio-gateway namespace then cleans up all non-revisioned application gateways as it reconciles the deletion of the IstioOperator resource.

  2. Delete the istio-gateway namespace, which is no longer needed.

  3. Delete TSB data plane operator webhooks:

    kubectl delete validatingwebhookconfiguration tsb-operator-data-plane-egress tsb-operator-data-plane-ingress tsb-operator-data-plane-tier1
    kubectl delete mutatingwebhookconfiguration tsb-operator-data-plane-egress tsb-operator-data-plane-ingress tsb-operator-data-plane-tier1
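
Before running the cleanup steps above, it can help to confirm that no namespace still relies on non-revisioned injection. This sketch checks only the istio-injection=enabled namespace label, which is a proxy for the condition and does not inspect individual pods; no_namespaces_on_old_injection is a hypothetical helper.

```shell
# Succeed if no namespace is still labelled istio-injection=enabled;
# otherwise print the remaining namespaces and fail.
# NOTE: hypothetical helper; the body runs in a subshell, so "exit" only ends
# the function.
no_namespaces_on_old_injection() (
  remaining="$(kubectl get namespace -l istio-injection=enabled -o name)" || exit 1
  if [ -n "$remaining" ]; then
    printf '%s\n' "$remaining"
    exit 1
  fi
)
```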

Non-revisioned control plane cleanup

  1. Delete the IstioOperator resource named tsb-istiocontrolplane from the istio-system namespace using kubectl.
    kubectl delete iop tsb-istiocontrolplane -n istio-system
  2. Delete the Istio operator deployment and its Kubernetes RBAC resources (clusterrole and clusterrolebinding).
    kubectl delete clusterrole,clusterrolebinding istio-operator
    kubectl delete deployment,sa istio-operator -n istio-system

Rollback from revisioned to non-revisioned

  • If cleanup of the non-revisioned control plane has already been performed, first bring it back by re-installing the TSB cluster operators without a revision. If you're using the in-place gateway upgrade, scale down istio-operator-<revision> first:
    kubectl scale --replicas=0 deployment istio-operator-canary -n istio-system
    Then generate and apply the cluster operators manifest without a revision:
    tctl install manifest cluster-operators --registry $HUB > clusteroperators.yaml
    kubectl apply -f clusteroperators.yaml
    Then edit the existing ControlPlane CR to remove spec.components.xcp.revision. The non-revisioned TSB control plane operator then reconciles the non-revisioned ControlPlane resource and redeploys the non-revisioned Istio control plane.
  • Sidecars can be rolled back by changing the value of the istio.io/rev label on the workload namespace to default, followed by a rolling restart of the application deployments.
  • The older non-revisioned gateways come back automatically, because the TSB data plane operator does not care whether a revision is present in the gateway install CRs.
  • To cleanup revisioned gateways:
    • Remove spec.revision from the Ingress/Egress/Tier1Gateway TSB gateway install resources.
    • Delete corresponding IstioOperator resources from the istio-system namespace.
  • Delete the revisioned control plane IstioOperator resource (xcp-iop-<revision>) from the istio-system namespace using kubectl.
    kubectl delete iop xcp-iop-<revision> -n istio-system
    Then delete the revisioned Istio operator deployment and its Kubernetes RBAC resources (clusterrole and clusterrolebinding).
    kubectl delete sa,deployment istio-operator-<revision> -n istio-system
    kubectl delete clusterrole,clusterrolebinding istio-operator-<revision>