Version: 1.5.x

Resource Consumption and Capacity Planning

This document provides conservative guidelines for capacity planning of the Tetrate Service Bridge (TSB) Management and Control Planes.

These parameters apply to production installations; in a demo-like environment, TSB will run with minimal resources.

disclaimer

The resource provisioning guidelines described in this document are very conservative.

Also, please be aware that the resource provisioning described in this document applies to vertical resource scaling. Multiple replicas of the same TSB component do not share the load with each other, so you cannot expect the combined resources of multiple replicas to have the same effect as a single, larger instance. Replicas of TSB components should be used for high availability purposes only.

Recommended baseline production installation resource requirements

For a baseline installation of TSB with 1 registered cluster and 1 deployed service within that cluster, the following resources are recommended.

To reiterate, the amounts of memory described below are very conservative. Also, the actual performance delivered by a given number of vCPUs tends to fluctuate depending on your underlying infrastructure. You are advised to verify the results in your environment.

Component	vCPU #	Memory MiB
TSB server (Management Plane) ¹	2	512
XCP Central Components ²	2	128
XCP Edge	1	128
Front Envoy	1	50
IAM	1	128
TSB UI	1	256
OAP	4	5192
OTEL-collector	2	1024
Zipkin	2	2048

¹ Including the Kubernetes operator and persistent data reconciliation processes.
² Including the Kubernetes operator.
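
As a rough aid for sizing, the baseline figures above can be added up to see the overall footprint. The following is a minimal sketch rather than an official tool: the `BASELINE` dictionary and the `baseline_totals` helper are illustrative names introduced here, and the numbers are simply copied from the table.

```python
# Baseline figures copied from the table above: (vCPU count, memory in MiB).
# The dictionary name and helper below are illustrative, not part of TSB.
BASELINE = {
    "TSB server (Management Plane)": (2, 512),
    "XCP Central Components": (2, 128),
    "XCP Edge": (1, 128),
    "Front Envoy": (1, 50),
    "IAM": (1, 128),
    "TSB UI": (1, 256),
    "OAP": (4, 5192),
    "OTEL-collector": (2, 1024),
    "Zipkin": (2, 2048),
}


def baseline_totals(components=BASELINE):
    """Return (total vCPUs, total memory in MiB) across all listed components."""
    total_vcpu = sum(vcpu for vcpu, _ in components.values())
    total_mem_mib = sum(mem for _, mem in components.values())
    return total_vcpu, total_mem_mib


if __name__ == "__main__":
    vcpu, mem = baseline_totals()
    print(f"Baseline total: {vcpu} vCPU, {mem} MiB")
```

The totals only describe one replica of each component. Since replicas do not share load, adding replicas for high availability multiplies the footprint without adding capacity.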

Recommended scaling resource parameters

The TSB stack is mostly CPU-bound. Additional clusters registered with TSB via XCP increase the CPU utilization by ~4%.

The effect of additional registered clusters or additional deployed workload services on memory utilisation is almost negligible. Likewise, the effect of additional clusters or workloads on resource consumption of the majority of TSB components is mostly negligible, with the notable exceptions of TSB, XCP Central component, TSB UI and IAM.

note

The resource utilisation of components in the visibility stack (e.g. OTel, Zipkin) is driven by requests, so their resource scaling should follow the user request rate statistics. As a general rule of thumb, more than 1 vCPU is preferred. It is also important to note that the performance of the visibility stack is largely bound by Elasticsearch performance.

Thus, we recommend vertically scaling the components by 1 vCPU for a number of deployed workloads:

Management Plane

Apart from OAP, no component requires any resource adjustment. These components are architected and tested to support very large clusters.

OAP in the Management Plane requires extra CPU and memory: approximately 100 millicores of CPU and 1024 MiB of RAM per 1000 services. For example, 4000 services aggregated in the TSB Management Plane from all TSB clusters would require approximately 400 millicores of CPU and 4096 MiB of RAM in total.
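
The rule of thumb above can be written as a small calculation. This is a sketch under the assumption that the requirement scales in direct proportion to the service count (no rounding up to whole blocks of 1000 services); the function name `oap_extra_resources` and its parameters are hypothetical.

```python
def oap_extra_resources(total_services, cpu_m_per_1000=100, mem_mib_per_1000=1024):
    """Estimate extra OAP resources in the Management Plane.

    Assumes proportional scaling: ~100 millicores of CPU and ~1024 MiB of RAM
    per 1000 services aggregated from all TSB clusters.
    Returns (CPU in millicores, memory in MiB).
    """
    if total_services < 0:
        raise ValueError("total_services must be non-negative")
    cpu_m = total_services * cpu_m_per_1000 / 1000
    mem_mib = total_services * mem_mib_per_1000 / 1000
    return cpu_m, mem_mib


if __name__ == "__main__":
    cpu_m, mem_mib = oap_extra_resources(4000)
    print(f"4000 services: ~{cpu_m:.0f}m CPU, ~{mem_mib:.0f} MiB RAM")
```

For 4000 services this reproduces the figures quoted above (400 millicores and 4096 MiB).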

Control Plane Resource Requirements

The following table shows typical peak resource utilization for the TSB control plane under the following assumptions:

50 services with sidecars
Traffic on the entire cluster is 500 requests per second (rps)
Zipkin sampling rate is 1% of the traffic
Metrics are captured for every request at every workload

Note that average CPU utilization would be a fraction of the typical peak value.

Component	Typical Peak CPU (m)	Typical Peak Memory (Mi)
Istiod	300m	250Mi
OAP	2500m	2500Mi
Zipkin	200m	1000Mi
XCP Edge	100m	100Mi
Istio Operator - Control Plane	50m	100Mi
Istio Operator - Data Plane	150m	100Mi
TSB Control Plane Operator	100m	100Mi
TSB Data Plane Operator	150m	100Mi
OTEL Collector	50m	100Mi
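
The values in this table use Kubernetes-style quantity suffixes (`m` for millicores, `Mi` for mebibytes). The sketch below parses those strings and totals the typical peaks, which may help when checking that a node pool has enough headroom. The parser helpers are illustrative and deliberately handle only the suffixes that appear in this document plus whole cores and `Gi`; they are not a full implementation of Kubernetes quantity parsing.

```python
# Typical peak values copied from the table above as (CPU, memory) strings.
CONTROL_PLANE_PEAKS = {
    "Istiod": ("300m", "250Mi"),
    "OAP": ("2500m", "2500Mi"),
    "Zipkin": ("200m", "1000Mi"),
    "XCP Edge": ("100m", "100Mi"),
    "Istio Operator - Control Plane": ("50m", "100Mi"),
    "Istio Operator - Data Plane": ("150m", "100Mi"),
    "TSB Control Plane Operator": ("100m", "100Mi"),
    "TSB Data Plane Operator": ("150m", "100Mi"),
    "OTEL Collector": ("50m", "100Mi"),
}


def parse_cpu_millicores(quantity):
    """Convert a CPU quantity such as '300m' or '2' (whole cores) to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def parse_memory_mib(quantity):
    """Convert a memory quantity such as '250Mi' or '2Gi' to MiB."""
    quantity = quantity.strip()
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    if quantity.endswith("Gi"):
        return int(float(quantity[:-2]) * 1024)
    raise ValueError(f"unsupported memory quantity: {quantity!r}")


def total_peak(peaks=CONTROL_PLANE_PEAKS):
    """Return (total CPU in millicores, total memory in MiB) across components."""
    cpu = sum(parse_cpu_millicores(c) for c, _ in peaks.values())
    mem = sum(parse_memory_mib(m) for _, m in peaks.values())
    return cpu, mem


if __name__ == "__main__":
    cpu, mem = total_peak()
    print(f"Control plane typical peak: {cpu}m CPU, {mem}Mi memory")
```

Summing peaks is a pessimistic estimate, since the components are unlikely to peak at the same moment and average CPU utilization is a fraction of the peak.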

TSB/Istio Operator resource usage per Ingress Gateway

The following table shows the resources used by the TSB Operator and the Istio Operator for a given number of Ingress Gateways.

note

Keep in mind that these numbers are estimates. Actual consumption varies with the applications you deploy, but the values give a general idea of what to expect.

Ingress Gateways	TSB Operator CPU(m)	TSB Operator Mem(Mi)	Istio Operator CPU(m)	Istio Operator Mem(Mi)
0	100m	50Mi	10m	45Mi
50	2600m	125Mi	1100m	120Mi
100	3500m	200Mi	1300m	175Mi
150	3800m	250Mi	1400m	200Mi
200	4000m	325Mi	1400m	250Mi
250	4700m	325Mi	1750m	300Mi
300	5000m	475Mi	1750m	400Mi
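
For gateway counts that fall between the rows of the table, one simple approach is linear interpolation between the two nearest measured rows. The sketch below does that; it is an approximation chosen for illustration (the measured growth is clearly not linear overall), and the names `GATEWAY_TABLE` and `estimate_operator_usage` are hypothetical. Counts above 300 are clamped to the last row rather than extrapolated, since the table provides no data beyond that point.

```python
from bisect import bisect_left

# Rows copied from the table above:
# (gateways, TSB Operator CPU m, TSB Operator Mem Mi,
#  Istio Operator CPU m, Istio Operator Mem Mi)
GATEWAY_TABLE = [
    (0, 100, 50, 10, 45),
    (50, 2600, 125, 1100, 120),
    (100, 3500, 200, 1300, 175),
    (150, 3800, 250, 1400, 200),
    (200, 4000, 325, 1400, 250),
    (250, 4700, 325, 1750, 300),
    (300, 5000, 475, 1750, 400),
]


def estimate_operator_usage(gateways):
    """Linearly interpolate operator usage for a given Ingress Gateway count.

    Returns (tsb_cpu_m, tsb_mem_mi, istio_cpu_m, istio_mem_mi).
    Values above the last measured row are clamped, not extrapolated.
    """
    if gateways < 0:
        raise ValueError("gateways must be non-negative")
    counts = [row[0] for row in GATEWAY_TABLE]
    if gateways >= counts[-1]:
        return GATEWAY_TABLE[-1][1:]
    i = bisect_left(counts, gateways)
    if counts[i] == gateways:
        return GATEWAY_TABLE[i][1:]
    lower, upper = GATEWAY_TABLE[i - 1], GATEWAY_TABLE[i]
    frac = (gateways - lower[0]) / (upper[0] - lower[0])
    return tuple(lo + frac * (hi - lo) for lo, hi in zip(lower[1:], upper[1:]))


if __name__ == "__main__":
    print(estimate_operator_usage(75))
```

Treat the interpolated values as a starting point and verify them against measurements in your own environment.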

Component resource utilization

The following tables show how the different components of TSB scale up to 4000 services, with traffic peaking at 60 rpm. The data is split between the Management Plane and the Control Plane.

Management Plane

Services	Gateways	Traffic(rpm)	Central CPU(m)	Central Mem(Mi)	MPC CPU(m)	MPC Mem(Mi)	OAP CPU(m)	OAP Mem(Mi)	Otel CPU(m)	Otel Mem(Mi)	TSB CPU(m)	TSB Mem(Mi)	Zipkin CPU(m)	Zipkin Mem(Mi)
0	0	0 rpm	3m	39Mi	5m	30Mi	37m	408Mi	22m	108Mi	14m	57Mi	2m	708Mi
420	7	600 rpm	4m	42Mi	15m	31Mi	116m	736Mi	24m	123Mi	50m	63Mi	14m	835Mi
820	9	600 rpm	4m	54Mi	24m	34Mi	43m	909Mi	26m	127Mi	85m	75Mi	25m	948Mi
1220	11	600 rpm	4m	59Mi	32m	41Mi	28m	1141Mi	27m	210Mi	213m	78Mi	25m	954Mi
1620	13	600 rpm	5m	63Mi	44m	48Mi	209m	1475Mi	29m	249Mi	113m	86Mi	25m	957Mi
2020	15	600 rpm	5m	73Mi	41m	51Mi	51m	1655Mi	24m	319Mi	211m	91Mi	27m	957Mi
2420	17	300 rpm	4m	84Mi	72m	62Mi	57m	1910Mi	29m	381Mi	227m	97Mi	27m	755Mi
2820	19	60 rpm	5m	90Mi	73m	65Mi	43m	2136Mi	16m	466Mi	275m	104Mi	27m	770Mi
3220	21	60 rpm	5m	106Mi	85m	78Mi	89m	2600Mi	43m	574Mi	382m	108Mi	27m	802Mi
3620	23	60 rpm	5m	123Mi	94m	71Mi	245m	2772Mi	37m	578Mi	625m	115Mi	27m	825Mi
4020	25	60 rpm	5m	147Mi	90m	81Mi	521m	3224Mi	15m	704Mi	508m	122Mi	27m	856Mi

note

IAM will peak at 509m/52Mi, LDAP at 2m/17Mi and XCP Operator at 9m/37Mi
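
To relate these measurements to the OAP sizing guideline given earlier (about 1024 MiB per 1000 services), the observed memory growth can be estimated from the first and last rows of the Management Plane table. This is a deliberately crude sketch: it uses only the two endpoints rather than a proper regression, and the names `OAP_MEM_SAMPLES` and `endpoint_slope_per_1000` are illustrative.

```python
# (services, OAP memory in Mi) pairs copied from the Management Plane table.
OAP_MEM_SAMPLES = [
    (0, 408), (420, 736), (820, 909), (1220, 1141), (1620, 1475),
    (2020, 1655), (2420, 1910), (2820, 2136), (3220, 2600),
    (3620, 2772), (4020, 3224),
]


def endpoint_slope_per_1000(samples):
    """Memory growth in Mi per 1000 services, using first and last samples only."""
    (first_services, first_mem) = samples[0]
    (last_services, last_mem) = samples[-1]
    if last_services == first_services:
        raise ValueError("samples must span more than one service count")
    return (last_mem - first_mem) / (last_services - first_services) * 1000


if __name__ == "__main__":
    slope = endpoint_slope_per_1000(OAP_MEM_SAMPLES)
    print(f"Observed OAP memory growth: ~{slope:.0f} Mi per 1000 services")
```

With the data in this table the result is roughly 700 Mi per 1000 services, which sits below the 1024 MiB guideline and is consistent with that guideline being conservative.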

Control Plane

Services	Gateways	Traffic(rpm)	Edge CPU(m)	Edge Mem(Mi)	Istiod CPU(m)	Istiod Mem(Mi)	OAP CPU(m)	OAP Mem(Mi)	Otel CPU(m)	Otel Mem(Mi)	Zipkin CPU(m)	Zipkin Mem(Mi)
0	0	0 rpm	6m	49Mi	9m	53Mi	48m	610Mi	26m	80Mi	25m	723Mi
400	2	600 rpm	350m	120Mi	600m	600Mi	900m	1510Mi	27m	86Mi	75m	931Mi
800	4	600 rpm	700m	230Mi	2170m	1140Mi	1720m	2310Mi	32m	91Mi	123m	1030Mi
1200	6	600 rpm	1010m	366Mi	2680m	1890Mi	2630m	3280Mi	35m	101Mi	139m	1080Mi
1600	8	600 rpm	1600m	438Mi	2690m	2490Mi	3610m	4030Mi	41m	180Mi	180m	1070Mi
2000	10	600 rpm	1900m	514Mi	3240m	3820Mi	4470m	5890Mi	43m	106Mi	209m	1080Mi
2400	12	300 rpm	682m	628Mi	2010m	4660Mi	3910m	5750Mi	37m	110Mi	281m	1070Mi
4000	20	600 rpm	1470m	1040Mi	3730m	9790Mi	13300m	35000Mi	37m	135Mi	465m	1100Mi

note

Metric Server will peak at 11m/32Mi, Onboarding Operator at 6m/38Mi, and XCP-Operator at 11m/46Mi
