Version: 1.5.x

TSB Upgrade

This page walks you through upgrading TSB using the tctl CLI: rendering the Kubernetes manifests for the different operators and applying them to your clusters with kubectl.

Before you start, make sure that you have:

✓ Checked the new version's requirements

The upgrade procedure between operator-based releases is fairly simple. Once the operator pods are updated with the new release images, the newly spun up operator pods will upgrade all the necessary components to the new version for you.

Create Backups

To make sure you can restore everything if something goes wrong, please create backups of the Management Plane and of each cluster's local Control Plane.

Backup the Management Plane

Backup the tctl binary

Since each new tctl binary potentially comes with new operators and configurations to deploy and configure TSB, you should back up the tctl binary you are currently using. Do this before syncing the new images.

Copy the tctl binary with a version suffix (e.g. -1.4.0) so you can quickly restore the older one if needed.

cp ~/.tctl/bin/tctl ~/.tctl/bin/tctl-{version}
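
For example, if you are currently running TSB 1.4.0:

cp ~/.tctl/bin/tctl ~/.tctl/bin/tctl-1.4.0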

If you have misplaced your binary, you may be able to find the right version from this URL. However, it is strongly recommended that you back up your current copy to be sure.

Backup the Management Plane Custom Resource

Create a backup of the Management Plane CR by executing the following command:

kubectl get managementplane -n tsb -o yaml > mp-backup.yaml

Backup the PostgreSQL database

Create a backup of your PostgreSQL database.

The exact procedure for connecting to the database may differ depending on your environment; please refer to the documentation for your environment.
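
As a minimal sketch, assuming you have direct network access to the database and the standard PostgreSQL client tools installed (the host, port, user and database name below are placeholders for your own values):

pg_host=<replace_this>
pg_port=<replace_this>
pg_user=<replace_this>
pg_db=<replace_this>
# Dump the TSB database to a local file in custom format; you will be prompted for the password.
pg_dump -h "$pg_host" -p "$pg_port" -U "$pg_user" -d "$pg_db" -Fc -f tsb-postgres-backup.dump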

Backup the Control Plane Custom Resource

Create a backup of all ControlPlane CRs by executing the following command on each of your onboarded clusters:

kubectl get controlplane -n istio-system -o yaml > cp-backup.yaml

Before Upgrade

Upgrading to TSB 1.5.x from 1.4.x

If you are upgrading to TSB 1.5.x from 1.4.x, you may need to remove old Elasticsearch indices and templates that are used by SkyWalking.

If your Elasticsearch deployment contains an index named skywalking_ui_template, please remove it before upgrading by following the instructions below. Otherwise, you can skip this section.

Run the following shell script against your Elasticsearch installation. It is assumed that you have all of the necessary information to access Elasticsearch, such as es-host, es-port, etc.

es_host=<replace_this>
es_port=<replace_this>
es_user=<replace_this>
es_pass=<replace_this>
curl -u "$es_user:$es_pass" http://$es_host:$es_port/skywalking_ui_template -XDELETE ;

The above command assumes that your Elasticsearch instance accepts plain HTTP connections with basic authentication. Modify the command as necessary. For example, if you are forcing HTTPS connections, you will need to change the URL scheme to https.

Once the above script runs successfully and the skywalking_ui_template index is deleted, you can proceed with upgrading the Management Plane.

Once you have upgraded the Management Plane, make sure that the oap deployment is running properly again. The Management Plane should have recreated the skywalking_ui_template index.
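
As a quick sanity check, assuming the Management Plane runs in the tsb namespace and reusing the Elasticsearch variables from the script above:

# Verify the oap deployment is available again
kubectl get deployment oap -n tsb
# Verify the skywalking_ui_template index has been recreated
curl -u "$es_user:$es_pass" http://$es_host:$es_port/skywalking_ui_template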

Adding cert provider configuration to ManagementPlane and ControlPlane CR

TSB 1.5 requires a certificate provider to support certificate provisioning for the various webhook certificates required by our operators. TSB also provides out-of-the-box support for installing and managing Cert-Manager as a certificate provider in your cluster. You can achieve this by setting the following fields in your ManagementPlane and ControlPlane CRs after you have installed the TSB operator:

  components:
    internalCertProvider:
      certManager:
        managed: INTERNAL

TSB runs in the above configuration by default. If you are already using Cert-Manager, you can skip the installation and have TSB use the existing installation by setting the managed field to EXTERNAL.
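
For example, a minimal sketch of the same fields pointing TSB at an existing installation:

  components:
    internalCertProvider:
      certManager:
        managed: EXTERNAL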

For more details about the CertProvider configuration, please refer to the Internal Cert Provider section.

Existing Cert-Manager installation

To avoid overriding an existing Cert-Manager installation, the TSB operator looks for an existing cert-manager installation in the cluster and, if one is found, its own cert-manager installation will fail. If you are using Cert-Manager in your cluster, please set the managed field to EXTERNAL in your ManagementPlane and ControlPlane CRs, as suggested above, before upgrading.

Migrate XCP to use JWT

Starting with TSB 1.5, the default authentication method for traffic between XCP central and the edges is JWT. If you want to keep using mTLS, you have to set spec.components.xcp.centralAuthModes to mTLS in the ManagementPlane CR, and spec.components.xcp.centralAuthModes to MUTUAL_TLS in the ControlPlane CR. To migrate from mTLS to JWT, follow the steps described here.

Upgrade Procedure

Download tctl and Sync Images

Now that you have taken backups, download the new version's tctl binary, then obtain the new TSB container images.

Details on how to do this are described in the Requirements and Download page.

Create the Management Plane Operator

Create the base manifest which will allow you to update the management plane operator from your private Docker registry:

tctl install manifest management-plane-operator \
--registry <your-docker-registry> \
> managementplaneoperator.yaml
Management namespace name

Starting with TSB 0.9.0, the default Management Plane namespace name is tsb, as opposed to tcc used in older versions. If you installed TSB using a version earlier than 0.9.0, your Management Plane probably lives in the tcc namespace. You will need to add the --management-namespace tcc flag to reflect this.
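
For example, in that case the manifest would be rendered with:

tctl install manifest management-plane-operator \
--registry <your-docker-registry> \
--management-namespace tcc \
> managementplaneoperator.yaml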

Customization

The managementplaneoperator.yaml file created by the install command can now be used as a base template for your Management Plane upgrade. If your existing TSB configuration contains specific adjustments on top of the standard configuration, you should copy them over to the new template.

Now, add the manifest to source control or apply it directly to the management plane cluster by using the kubectl client:

kubectl apply -f managementplaneoperator.yaml

After applying the manifest, you will see the new operator running in the tsb namespace:

kubectl get pod -n tsb
NAME                                            READY   STATUS    RESTARTS   AGE
tsb-operator-management-plane-d4c86f5c8-b2zb5   1/1     Running   0          8s

For more information on the manifest and how to configure it, please review the ManagementPlane resource reference

Delete Elasticsearch legacy templates

You only need to delete legacy templates if you are using Elasticsearch version 7.8 or later.

To check which version of Elasticsearch you are using, you can run the following curl command. You will find the version in the version.number field:

es_host=<replace_this>
es_port=<replace_this>
es_user=<replace_this>
es_pass=<replace_this>
curl -u "$es_user:$es_pass" http://$es_host:$es_port;
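
The response is a JSON document; a trimmed example of the relevant part (the values shown here are illustrative only):

{
  "version" : {
    "number" : "7.10.2"
  }
}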

After you have upgraded the TSB Management Plane to 1.5, a new SkyWalking version will be deployed that uses its own bespoke Elasticsearch client library due to Elasticsearch licensing changes.

Since Elasticsearch version 7.8, a new template system called composable index templates is used, which deprecates the previous template system, from now on called legacy templates.
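
If you want to check which SkyWalking legacy templates currently exist before removing them, you can list them with a plain GET (reusing the variables from the version check above):

curl -u "$es_user:$es_pass" "http://$es_host:$es_port/_template/skywalking_*"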

To delete any existing SkyWalking legacy templates, please execute the following curl command:

es_host=<replace_this>
es_port=<replace_this>
es_user=<replace_this>
es_pass=<replace_this>
curl -u "$es_user:$es_pass" "http://$es_host:$es_port/_template/skywalking_*" -XDELETE ;

Create the Certificates

In TSB 1.5 new certificates have been added. Check Configuring Secrets for more information. The new certificates to create are xcp-central-ca-bundle, which contains the CA used to sign xcp-central-cert (only needed if you're going to use JWT mode for XCP), and mp-certs, which contains the CA used to sign tsb-certs; both must have the same SANs.

Now you can generate the new certificates and tokens by running the following command:

tctl install manifest control-plane-secrets \
--elastic-password ${ELASTIC_PASSWORD} \
--elastic-username ${ELASTIC_USERNAME} \
--elastic-ca-certificate "${ELASTIC_CA}" \
--management-plane-ca-certificate="$(cat mp-certs.crt)" \
--xcp-central-ca-bundle="$(cat xcp-central-ca-bundle.crt)" \
--cluster <cluster_name> \
--cluster-service-account="$(tctl install cluster-service-account --cluster <cluster_name>)" | kubectl apply -f -

Additionally, if the tsb-certs certificate is a self-signed certificate, or is signed by an internal CA, you will need to set spec.managementPlane.selfSigned to true in the ControlPlane CR.
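
A minimal sketch of that setting in the ControlPlane CR (only the relevant fields shown):

spec:
  managementPlane:
    selfSigned: true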

Create the Control and Data Plane operators

To deploy the new Control and Data Plane operators in your application clusters, you must run tctl install manifest cluster-operators to retrieve the Control Plane and Data Plane operator manifests for the new version.

tctl install manifest cluster-operators \
--registry <your-docker-registry> \
> clusteroperators.yaml
Customization

The clusteroperators.yaml file can now be used for your cluster upgrade. If your existing Control and Data Planes have specific adjustments on top of the standard configuration, you should copy them over to the template.

Applying the Manifest

Now, add the manifest to source control or apply it directly to the appropriate clusters by using the kubectl client:

kubectl apply -f clusteroperators.yaml

For more information on each of these manifests and how to configure them, please review the ControlPlane and DataPlane resource references.

Rollback

In case something goes wrong and you want to rollback TSB to the previous version, you will need to rollback both the Management Plane and the Control Planes.

Rollback the Control Plane

Scale down istio-operator and tsb-operator

kubectl scale deployment \
-l "platform.tsb.tetrate.io/component in (tsb-operator,istio)" \
-n istio-system \
--replicas=0

Delete the IstioOperator Resource

Deleting the operator requires removing the finalizer protecting the Istio object. Do so with the following commands:

kubectl patch iop tsb-istiocontrolplane -n istio-system --type='json' -p='[{"op": "remove", "path": "/metadata/finalizers", "value":""}]'

kubectl delete istiooperator -n istio-system --all

Scale down istio-operator and tsb-operator for the Data Plane operator

kubectl scale deployment \
-l "platform.tsb.tetrate.io/component in (tsb-operator,istio)" \
-n istio-gateway \
--replicas=0

Delete the IstioOperator Resource for the Data Plane

Deleting the operator requires removing the finalizer protecting the Istio object. Do so with the following commands:

kubectl patch iop tsb-gateways -n istio-gateway --type='json' -p='[{"op": "remove", "path": "/metadata/finalizers", "value":""}]'

kubectl delete istiooperator -n istio-gateway --all

Restore the tctl binary

Restore tctl from the backup copy that you made, or download the binary for the specific version you would like to use.

mv ~/.tctl/bin/tctl-{version} ~/.tctl/bin/tctl

Create the Cluster Operators, and rollback the Control Plane CR

Using the tctl binary from the previous version, follow the instructions to create the cluster operators.

Then apply the backup of the Control Plane CR:

kubectl apply -f cp-backup.yaml

Rollback the Management Plane

Scale Down Pods in Management Plane

Scale down all of the pods in the Management Plane so that it is inactive.

kubectl scale deployment tsb iam -n tsb --replicas=0

Restore PostgreSQL

Restore your PostgreSQL database from your backup. The exact procedure for connecting to the database may differ depending on your environment; please refer to the documentation for your environment.

Restore the Management Plane operator

Follow the instructions for upgrading to create the Management Plane operator. Then apply the backup of the Management Plane CR:

kubectl apply -f mp-backup.yaml

Scale back the deployments

Finally, scale back the deployments.

kubectl scale deployment tsb iam -n tsb --replicas 1
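
You can then verify that the Management Plane pods come back up, for example:

kubectl get pods -n tsb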

Zipkin and OAP

After rolling back, if you find that the zipkin and oap pods are not starting, please follow the steps in the Elasticsearch wipe procedure.

Clean up any leftovers from the upgrade process

You may want to perform a clean-up to avoid collisions or errors in future deployments and upgrades. If you have enabled the GitOps feature, new CRDs and webhooks have been created in the cluster.

GitOps

Delete both validating webhooks.

kubectl delete validatingwebhookconfiguration tsb-gitops-direct-webhook 

kubectl delete validatingwebhookconfiguration tsb-gitops-webhook

Remove the new CRDs created by the tsb-operator

The full set of TSB Kubernetes CRDs can be downloaded here. Delete the CRDs with the following command:

kubectl delete -f tsb-crds.gen-124508f63c2dce6b69cb337dee18178a.yaml

Remove issuers and certificates created by the tsb-operator in 1.5

Issuers:

kubectl delete issuers -n istio-system csr-signer-control-plane

kubectl delete issuer -n tsb csr-signer-management-plane

Certificates:

kubectl delete certificates -n istio-system ca-control-plane

kubectl delete certificates -n tsb ca-management-plane

kubectl delete certificates -n tsb <cluster-name>-xcp-edge-cert

ClusterIssuers:

kubectl delete clusterissuers selfsigned-issuer-control-plane

kubectl delete clusterissuers selfsigned-issuer-management-plane