Performance results for Kubernetes Visibility Agent

Introduction

This article lists performance metrics and test results for the Kubernetes Visibility Agent (CNO for Visibility). The results were collected by running tests on both the Informer side and the Instance side.

Purpose

These results help estimate the load, memory, and CPU usage to expect on the instance and on the Kubernetes (K8S) cluster where the Informer is deployed.

Performance Results

The following are the results from the scenarios that were tested.

Informer side test results

This test measures the performance of the K8S Informer, which is deployed on a cluster with 50K pods. The results below are based on tests performed with the following setup:

Instance Details: Nodes - 2, Version: Zurich, DB - MySQL
K8S Cluster Details: OS: CentOS, CPU - 8 cores, Memory - 16GB, Nodes - 5

1. The following table shows the Informer pod memory and CPU usage while discovering a cluster with 50K pods.

| Parameters | Values |
| --- | --- |
| Discovery Time | 35 min |
| No. of Pods on Cluster | 50K |
| No. of Informer Daemonset Pods | 5 |
| Informer Pod CPU | 13% |
| Informer Pod Memory (Max) | 1GB |
| No. of ECC Queue records created | 50 |
| ECC Queue to CMDB population time | Min: 1 second; Max: 20 seconds; Avg: 13 seconds |

2. The following table shows the Informer memory required per number of pods.

| No. of Pods | Informer Memory (Max) |
| --- | --- |
| 1K | 50 MB |
| 5K | 139 MB |
| 25K | 549 MB |

3. The following table shows the CPU usage of the node where the Informer and daemonset are deployed alongside network traffic pods. Each traffic pod makes an HTTP call every 10 seconds, so the 30 traffic pods deployed on the node generate a total of 180 calls per minute.

| Parameter | Value |
| --- | --- |
| Node Host Config | 4 CPUs |
| No. of network traffic pods | 30 |
| CPU usage without daemonset | Max: 100%; Avg: 8.76% |
| CPU usage with daemonset | Max: 100%; Avg: 9.27% |
| No. of informer daemonset | 5 |

Instance side test results

This test measures instance performance during full discovery of 600 and 1000 clusters, with 10K pods per cluster, over a duration of 3 days. The results below are based on tests performed with the following setup.

Instance Details on which tests are performed:

Nodes = 12
Semaphores = 16 * 12 = 192
Semaphores API_INT
    Available semaphores: 4
    Queue depth: 0
    Max queue depth: 47
    Queue depth limit: 50
    Maximum transaction concurrency: 4
    Maximum concurrency achieved: 4
DB Size: XL (RaptorDB + RR)
DB Connection Pool Size: 32
sn_acc_visibility.k8s_informer_max_worker_threads=14
sn_acc_visibility.max_concurrent_full_discovery=3
DMTableCleaner configured with max consumers=12, max producers=3
continuous discovery=off

The results below were collected using a simulator that acts as an Informer.

Scenario 1: 600 simulators are used, and each one sends a payload to the instance every 30 seconds for 3 days. The following table shows the results for this scenario.

| Parameter | Result |
| --- | --- |
| Payload Size | 1MB |
| Test Duration | 3 days |
| Full Discovery Frequency | 12 Hrs |
| No. of Clusters | 600 |
| No. of pods/cluster | 10K |
| Send interval (seconds) | 30 |
| ECC Queue to CMDB Population Time (Max) | 1073 seconds |
| No. of 429 Rejections | 0 |
| API_INT Queue Depth Max | 7 |
| Max Replication lag | 838 seconds |
| Deletion job run time | 41 minutes |
| Instance App Node CPU Time (avg) | 13K milliseconds |
| Instance App Node Memory Usage (avg) | 1.2 GB |
| DBI CPU Time (avg) | 78K milliseconds |

Scenario 2: 1000 simulators are used, and each one sends a payload to the instance every 30 seconds for 3 days, with the "CMDB Workspace CIs Processed Via IRE On Source Batch Job" disabled. The following table shows the results for this scenario.

| Parameter | Result |
| --- | --- |
| Payload Size | 1MB |
| Test Duration | 3 days |
| Full Discovery Frequency | 12 Hrs |
| No. of Clusters | 1000 |
| No. of pods/cluster | 10K |
| Send interval (seconds) | 30 |
| ECC Queue to CMDB Population Time (Max) | 736 seconds |
| No. of 429 Rejections | 0 |
| API_INT Queue Depth Max | 4 |
| Max Replication lag | 3 seconds |
| Deletion job run time | 25 minutes |
| Instance App Node CPU Time (avg) | 14K milliseconds |
| Instance App Node Memory Usage (avg) | 0.75 GB |
| DBI CPU Time (avg) | 51K milliseconds |

Note: These performance tests were conducted on an instance and a K8S cluster with no other load.
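
For rough sizing between the measured data points, the figures above can be turned into a small estimator. The following is a minimal sketch, not a supported sizing tool: the function names are hypothetical, the 1GB figure from the 50K-pod test is taken as 1024 MB, and linear interpolation between measured points is an assumption rather than a documented property of the Informer. Values outside the measured range are rejected instead of extrapolated.

```python
from bisect import bisect_right

# (pods, max Informer memory in MB) taken from the Informer side results.
# Assumption: the 1GB value measured at 50K pods is treated as 1024 MB.
MEASURED = [(1_000, 50.0), (5_000, 139.0), (25_000, 549.0), (50_000, 1024.0)]


def estimate_informer_memory_mb(pods: int) -> float:
    """Estimate max Informer memory by linear interpolation (an assumption)."""
    xs = [p for p, _ in MEASURED]
    if not xs[0] <= pods <= xs[-1]:
        raise ValueError("pod count is outside the measured range (1K to 50K)")
    i = bisect_right(xs, pods) - 1
    if i == len(xs) - 1:
        # Exactly at the largest measured point.
        return MEASURED[-1][1]
    (x0, y0), (x1, y1) = MEASURED[i], MEASURED[i + 1]
    return y0 + (y1 - y0) * (pods - x0) / (x1 - x0)


def calls_per_minute(traffic_pods: int, interval_seconds: float) -> float:
    """HTTP calls per minute when each pod calls once per interval."""
    return traffic_pods * 60 / interval_seconds


def payloads_per_second(simulators: int, send_interval_seconds: float) -> float:
    """Average payload arrival rate at the instance."""
    return simulators / send_interval_seconds


if __name__ == "__main__":
    print(estimate_informer_memory_mb(15_000))  # interpolated between 5K and 25K
    print(calls_per_minute(30, 10))             # traffic pod test: 180 calls/min
    print(payloads_per_second(600, 30))         # Scenario 1 arrival rate
    print(payloads_per_second(1000, 30))        # Scenario 2 arrival rate
```

With the test parameters above, Scenario 1 corresponds to an average of 20 payloads per second arriving at the instance, and Scenario 2 to roughly 33 per second. Treat the memory estimate as a starting point only, since the measurements were taken with no other load on the cluster.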