TSB Upgrade
This page will walk you through how to upgrade TSB using the tctl CLI, rendering the Kubernetes manifests for the different operators and applying them to the clusters with kubectl.
Before you start, make sure that you have:
✓ Checked the new version's requirements
The upgrade procedure between operator-based releases is fairly simple. Once the operator pods are updated with the new release images, the newly spun up operator pods will upgrade all the necessary components to the new version for you.
Create Backups
To make sure you can restore everything if something goes wrong, create backups of the Management Plane and of each cluster's local Control Plane.
Backup the Management Plane
Backup the tctl binary
Since each new tctl binary potentially comes with new operators and configurations to deploy and configure TSB, you should back up the tctl binary you are currently using. Do this before syncing the new images.
Copy the tctl binary with a version suffix (e.g. -1.4.0) so you can quickly restore the older version if needed.
cp ~/.tctl/bin/tctl ~/.tctl/bin/tctl-{version}
If you have misplaced your binary, you may be able to find the right version from this URL. However, it is strongly recommended that you backup your current copy to be sure.
Backup the Management Plane Custom Resource
Create a backup of the Management Plane CR by executing the following command:
kubectl get managementplane -n tsb -o yaml > mp-backup.yaml
Backup the PostgreSQL database
Create a backup of your PostgreSQL database.
The exact procedure for connecting to the database differs depending on your environment; refer to your environment's documentation.
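As a minimal sketch, assuming a standard PostgreSQL instance reachable with the pg_dump client (the connection values below are placeholders for your environment):

```shell
pg_host="<replace_this>"
pg_user="<replace_this>"
pg_db="<replace_this>"
# Custom-format dump; restore it later with pg_restore if a rollback is needed.
pg_dump -h "$pg_host" -U "$pg_user" -Fc "$pg_db" -f tsb-postgres-backup.dump \
  || echo "pg_dump failed; check your connection settings"
```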
Backup the Control Plane Custom Resource
Create a backup of all ControlPlane CRs by executing the following command on each of your onboarded clusters:
kubectl get controlplane -n istio-system -o yaml > cp-backup.yaml
Before Upgrade
Upgrading to TSB 1.5.x from 1.4.x
If you are upgrading to TSB 1.5.x from 1.4.x, you may need to remove old Elasticsearch indices and templates used by SkyWalking.
If your Elasticsearch deployment contains an index named skywalking_ui_template, remove it before upgrading by following the instructions below. Otherwise, you can skip this section.
Run the following shell script against your Elasticsearch installation. It assumes that you have the necessary details to access Elasticsearch, such as the host, port, and credentials.
es_host=<replace_this>
es_port=<replace_this>
es_user=<replace_this>
es_pass=<replace_this>
curl -u "$es_user:$es_pass" http://$es_host:$es_port/skywalking_ui_template -XDELETE ;
The above command assumes that your Elasticsearch instance accepts plain HTTP connections with basic authentication. Modify the command as necessary; for example, if you force HTTPS connections, you will need to change the URL scheme to https.
Once the above script runs successfully and the skywalking_ui_template index is deleted, you can proceed with upgrading the Management Plane.
Remember to make sure that the oap deployment is properly running again once you have upgraded the Management Plane. The Management Plane should have created the skywalking_ui_template index.
Adding cert provider configuration to ManagementPlane and ControlPlane CR
TSB 1.5 requires a certificate provider to support certificate provisioning for various webhook certificates required by our operators. TSB also provides out of the box support for installing and managing Cert-Manager as a certificate provider in your cluster. You can achieve this by setting the following fields in your ManagementPlane and ControlPlane CRs after you have installed the TSB operator:
components:
internalCertProvider:
certManager:
managed: INTERNAL
TSB runs in the above-mentioned configuration by default. If you are already using Cert-Manager, you can skip the installation and let TSB use the existing installation by setting the managed field to EXTERNAL.
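For instance, reusing an existing Cert-Manager installation would look like this in the CR (same fields as above, with the managed value swapped):

```yaml
components:
  internalCertProvider:
    certManager:
      managed: EXTERNAL
```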
For more details about CertProvider configuration, refer to the Internal Cert Provider section.
Existing Cert-Manager installation
To avoid overriding an existing Cert-Manager installation, the TSB operator looks for an existing cert-manager installation in the cluster and, if one is found, fails the installation of its own cert-manager. If you are using Cert-Manager in your cluster, set the managed field to EXTERNAL in your ManagementPlane and ControlPlane CRs, as described above, before upgrading.
Migrate XCP to use JWT
Starting with TSB 1.5, the default authentication method for XCP central and edge traffic is JWT. If you want to keep using mTLS, you have to set spec.components.xcp.centralAuthModes to mTLS in the ManagementPlane CR, and spec.components.xcp.centralAuthModes to MUTUAL_TLS in the ControlPlane CR. To migrate from mTLS to JWT, follow the steps described here.
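As an illustration, the fragments below render the field paths just described as YAML; consult the ManagementPlane and ControlPlane resource references for the exact schema:

```yaml
# ManagementPlane CR: keep mTLS for XCP central traffic
spec:
  components:
    xcp:
      centralAuthModes: mTLS
---
# ControlPlane CR: keep mutual TLS towards XCP central
spec:
  components:
    xcp:
      centralAuthModes: MUTUAL_TLS
```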
Upgrade Procedure
Download tctl and Sync Images
Now that you have taken backups, download the new version's tctl binary, then obtain the new TSB container images. Details on how to do this are described in the Requirements and Download page.
Create the Management Plane Operator
Create the base manifest which will allow you to update the management plane operator from your private Docker registry:
tctl install manifest management-plane-operator \
--registry <your-docker-registry> \
> managementplaneoperator.yaml
Management namespace name
Starting with TSB 0.9.0 the default Management Plane namespace name is tsb, as opposed to tcc used in older versions. If you installed TSB using a version earlier than 0.9.0, your Management Plane probably lives in the tcc namespace. You will need to add a --management-namespace tcc flag to reflect this.
Customization
The managementplaneoperator.yaml file created by the install command can now be used as a base template for your Management Plane upgrade. If your existing TSB configuration contains specific adjustments on top of the standard configuration, you should copy them over to the new template.
Now, add the manifest to source control or apply it directly to the management plane cluster by using the kubectl client:
kubectl apply -f managementplaneoperator.yaml
After applying the manifest, you will see the new operator running in the tsb namespace:
kubectl get pod -n tsb
NAME READY STATUS RESTARTS AGE
tsb-operator-management-plane-d4c86f5c8-b2zb5 1/1 Running 0 8s
For more information on the manifest and how to configure it, please review the ManagementPlane resource reference
Delete Elasticsearch legacy templates
You only need to delete legacy templates if you are using an Elasticsearch version >= 7.8.
To check which version of Elasticsearch you are using, run the following curl command; you will find the version under the number field:
es_host=<replace_this>
es_port=<replace_this>
es_user=<replace_this>
es_pass=<replace_this>
curl -u "$es_user:$es_pass" http://$es_host:$es_port;
After you have upgraded the TSB Management plane to 1.5, a new SkyWalking version will be deployed that uses its own bespoke Elasticsearch client library due to Elasticsearch licensing changes.
Since Elasticsearch version 7.8, a new template system called Composable index templates is used, which deprecates the previous template system, from now on called legacy.
To delete any existing SkyWalking legacy templates, please execute the following curl command:
es_host=<replace_this>
es_port=<replace_this>
es_user=<replace_this>
es_pass=<replace_this>
curl -u "$es_user:$es_pass" http://$es_host:$es_port/_template/skywalking_*;
Create the Certificates
In TSB 1.5, new certificates are added. Check Configuring Secrets for more information.
In short, the new certificates to create are xcp-central-ca-bundle, which contains the CA used to sign xcp-central-cert (only if you are going to use JWT mode for XCP), and mp-certs, which contains the CA used to sign tsb-certs; both mp-certs and tsb-certs must have the same SANs.
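Since mp-certs and tsb-certs must carry the same SANs, it is worth verifying this before creating the secrets. A small sketch using openssl (file names are illustrative; the -ext flag needs OpenSSL 1.1.1 or newer):

```shell
# Print a certificate's subjectAltName entries, one per line.
sans() {
  openssl x509 -in "$1" -noout -ext subjectAltName | tail -n +2 | tr -d ' ' | tr ',' '\n'
}
```

Run, for example, `diff <(sans mp-certs.crt) <(sans tsb-certs.crt)`; empty diff output means the SANs match.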
Now you can generate the new certificates and tokens by running the following command:
tctl install manifest control-plane-secrets \
--elastic-password ${ELASTIC_PASSWORD} \
--elastic-username ${ELASTIC_USERNAME} \
--elastic-ca-certificate "${ELASTIC_CA}" \
--management-plane-ca-certificate="$(cat mp-certs.crt)" \
--xcp-central-ca-bundle="$(cat xcp-central-ca-bundle.crt)" \
--cluster <cluster_name> \
--cluster-service-account="$(tctl install cluster-service-account --cluster <cluster_name>)" | kubectl apply -f -
Additionally, if the tsb-certs certificate is self-signed, or signed by an internal CA, you will need to set spec.managementPlane.selfSigned to true in the ControlPlane CR.
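Rendered as YAML, the ControlPlane CR fragment would look like:

```yaml
spec:
  managementPlane:
    selfSigned: true
```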
Create the Control and Data Plane operators
To deploy the new Control and Data Plane operators in your application clusters, you must run tctl install manifest cluster-operators to retrieve the Control Plane and Data Plane operator manifests for the new version.
tctl install manifest cluster-operators \
--registry <your-docker-registry> \
> clusteroperators.yaml
Customization
The clusteroperators.yaml file can now be used for your cluster upgrade. If your existing Control and Data Planes have specific adjustments on top of the standard configuration, you should copy them over to the template.
Applying the Manifest
Now, add the manifest to source control or apply it directly to the appropriate clusters by using the kubectl client:
kubectl apply -f clusteroperators.yaml
For more information on each of these manifests and how to configure them, please check out the following guides:
Rollback
In case something goes wrong and you want to rollback TSB to the previous version, you will need to rollback both the Management Plane and the Control Planes.
Rollback the Control Plane
Scale down istio-operator and tsb-operator
kubectl scale deployment \
-l "platform.tsb.tetrate.io/component in (tsb-operator,istio)" \
-n istio-system \
--replicas=0
Delete the IstioOperator Resource
Deleting the operator requires removing the finalizer protecting the Istio object, using the following commands:
kubectl patch iop tsb-istiocontrolplane -n istio-system --type='json' -p='[{"op": "remove", "path": "/metadata/finalizers", "value":""}]'
kubectl delete istiooperator -n istio-system --all
Scale down istio-operator and tsb-operator for the Data Plane operator
kubectl scale deployment \
-l "platform.tsb.tetrate.io/component in (tsb-operator,istio)" \
-n istio-gateway \
--replicas=0
Delete the IstioOperator Resource for the Data Plane
Deleting the operator requires removing the finalizer protecting the Istio object, using the following commands:
kubectl patch iop tsb-gateways -n istio-gateway --type='json' -p='[{"op": "remove", "path": "/metadata/finalizers", "value":""}]'
kubectl delete istiooperator -n istio-gateway --all
Restore the tctl binary
Restore tctl from the backup copy that you made, or download the binary for the specific version you would like to use.
mv ~/.tctl/bin/tctl-{version} ~/.tctl/bin/tctl
Create the Cluster Operators, and rollback the Control Plane CR
Using the tctl binary from the previous version, follow the instructions to create the cluster operators.
Then apply the backup of the Control Plane CR:
kubectl apply -f cp-backup.yaml
Rollback the Management Plane
Scale Down Pods in Management Plane
Scale down all of the pods in the Management Plane so that it is inactive.
kubectl scale deployment tsb iam -n tsb --replicas=0
Restore PostgreSQL
Restore your PostgreSQL database from your backup. The exact procedure for connecting to the database differs depending on your environment; refer to your environment's documentation.
Restore the Management Plane operator
Follow the instructions for upgrading to create the Management Plane operator. Then apply the backup of the Management Plane CR:
kubectl apply -f mp-backup.yaml
Scale back the deployments
Finally, scale back the deployments.
kubectl scale deployment tsb iam -n tsb --replicas 1
Zipkin and OAP
After rolling back, if you find that the zipkin and oap pods are not starting, follow the steps in the Elasticsearch wipe procedure.
Clean up any leftovers from the upgrade process
You may want to perform a clean-up to avoid collisions or errors in future deployments and upgrades. If you have enabled the GitOps feature, new CRDs and webhooks were created in the cluster.
GitOps
Delete both validating webhooks.
kubectl delete validatingwebhookconfiguration tsb-gitops-direct-webhook
kubectl delete validatingwebhookconfiguration tsb-gitops-webhook
Remove the new CRDs created by the tsb-operator
The full set of TSB Kubernetes CRDs can be downloaded here. Delete the CRDs with the following command:
kubectl delete -f tsb-crds.gen-124508f63c2dce6b69cb337dee18178a.yaml
Remove issuers and certificates created by the tsb-operator in 1.5
Issuers:
kubectl delete issuers -n istio-system csr-signer-control-plane
kubectl delete issuer -n tsb csr-signer-management-plane
Certificates:
kubectl delete certificates -n istio-system ca-control-plane
kubectl delete certificates -n tsb ca-management-plane
kubectl delete certificates -n tsb <cluster-name>-xcp-edge-cert
ClusterIssuers:
kubectl delete clusterissuers selfsigned-issuer-control-plane
kubectl delete clusterissuers selfsigned-issuer-management-plane