Azure discovery throwing "429 Too Many Requests" errors

### Description

Azure Cloud discovery uses Azure Resource Graph Explorer APIs to discover a number of different resources, but Azure currently enforces a hard limit on how many requests can be made per 5-second window. If this limit is reached, the API response results in the following error:

"Exception occurred while executing operation Cloud REST - add response to context. Custom operation Failed to run script due to the following error: JAVASCRIPT_CODE_FAILURE: com.snc.sw.exception.CommandFailureException: Cloud request failed. URL: https://management.azure.com/providers/Microsoft.ResourceGraph/resources?api-version=2019-04-01 Status: 429 Response: HttpResponseProxy{HTTP/1.1 429 Too Many Requests"

This in turn terminates the pattern discovery, resulting in an error status in the discovery log.

NOTE: The main workaround, which implements the RETRY mechanism, does not avoid 429 throttling errors. Because it is only a retry mechanism, there is no guarantee that the retries will succeed either, so it is possible to complete the schedule without the full inventory being successfully scanned. If data accuracy is paramount, the ADDITIONAL and SUPPLEMENTAL workarounds avoid 429s entirely and ensure all resources are discovered, at the cost of performance.

### Steps to Reproduce

1. Install the "Discovery and Service Mapping patterns" plugin.
2. Migrate from CAPI to patterns to activate the Azure LP patterns (https://support.servicenow.com/kb?id=kb_article_view&sysparm_article=KB0827153).
3. Create a new Cloud discovery schedule for an Azure account.
4. Run the discovery.
5. Check the discovery log in the discovery status to verify whether some patterns show an error status.

*Note: checking the specific pattern log will show that the reason is the 429 error. The patterns that fail most often are: VM, Security group, Host, Application gateway, Azure functions.

### Workaround

### NOTE: The following is the permanent fix implemented, but it is not 100% consistent because it does not account for thread concurrency. Please see the ADDITIONAL and SUPPLEMENTAL sections for a 100% consistent workaround, at the expense of further manual effort and reduced performance, pending a new permanent fix (internal feature enhancement).

The following approach was taken in the "Discovery and Service Mapping patterns" plugin with version 1.0.77, the June 2021 release, to reduce the number of 429 errors:

1. Add logic to the custom operation executing the Azure API calls to check whether the received error is of type 429. If this is the case, wait for 10 seconds and retry sending the same API call. This is executed a total of 3 times before throwing the 429 error, to avoid endless retries (see the sketch below).
2. Redesign some patterns to reduce the number of API calls sent in small time windows; the main design change was made to the Azure Functions pattern.

This is not considered a full fix, as the limitation comes from Azure Resource Graph, and the current value of 15 API calls per 5 seconds is simply too low to cover a full environment discovery. This is especially true for large environments, where multiple API calls have to be sent to get all the pages of just one resource response, but all tests showed that it greatly reduces the number of 429 errors, if not fully clearing them.
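For illustration, the retry behaviour described in point 1 can be sketched as follows. This is not the shipped plugin code: sendRequest() and response.getStatusCode() are hypothetical placeholders for however the custom operation issues the Azure Resource Graph call and reads the HTTP status.

```
// Sketch of the 429 retry logic described above (illustrative only).
var MAX_RETRIES = 3;   // retry at most 3 times before surfacing the 429
var WAIT_MS = 10000;   // wait 10 seconds before each retry

function sendWithRetry(sendRequest) {
    var response = sendRequest();                      // initial Azure Resource Graph call
    var retries = 0;
    while (response.getStatusCode() == 429 && retries < MAX_RETRIES) {
        Packages.java.lang.Thread.sleep(WAIT_MS);      // back off for 10 seconds
        response = sendRequest();                      // resend the same API call
        retries++;
    }
    return response;                                   // a 429 here means every retry was throttled
}
```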
### ADDITIONAL WORKAROUND: This is sufficient for environments with only a single Azure service account

In conjunction with the above:
- Specify a MID server only for Azure Cloud Discovery (Capabilities: Cloud Management, Azure)
- Reduce MID threads to 15
- Add a 5-second sleep in the ecc_agent_script_include AzureApiCommand, as seen below

Modify AzureApiCommand to add a blanket 5-second sleep:

```
performCloudRequest: function(args, CTX, clientType) {
    Packages.com.snc.sw.log.DiscoLog.getLogger("AzureApiCommand script include.").debug(" We are in AzureApiCommand.");
    var abstractApiCall = new AbstractApiCall();
    var response = abstractApiCall.execute(args, CTX, clientType, this);
    // Blanket 5-second pause after every Azure API call to stay under the throttling limit
    Packages.java.lang.Thread.sleep(5000);
    return response;
},
```

### SUPPLEMENTAL WORKAROUND: This can be leveraged in combination with ADDITIONAL to provide scalability for larger environments

Use multiple MID servers to increase performance, as this workaround will definitely SLOW things down.

Add a new field to the Azure Service Principals table.
***Note: there is a MID Servers field that cannot be added to the current Azure Service Principals form and cannot be edited (based on security constraints) on the list view when added. To avoid this issue, we create a custom reference field for MID server selection. This new field allows us to control which MID server is allocated to a service principal account.
- Navigate to Microsoft Azure Discovery > Credentials (Service Principals)
- Right click on the list header and choose Configure > Table
- Change your scope to Azure to edit the table
- Click "New" to add a new column
- Type = Reference
- Column label = MID Server
- Reference Specification: Reference = MID Server ***which is ecc_agent
- Reference qual conditions (optional) = will limit the MID Servers that are available
***Change the scope back to Global from this point onward

Add code to the CloudMidSelectionApi which will force a MID server to be associated with a Service Principal and, in turn, be used for the service accounts associated with that Service Principal account.
- Navigate to System Definition > Script Includes
- Look up the record CloudMidSelectionApi
- Scroll through the script to roughly line 60
- The insertion point is before the line: // get the provider (cloudType) and add it as a capability
- And after the lines: // This should never be null  dcType = ldcRec.getValue('sys_class_name'); }
- Insert the following code, which looks up the MID server associated with the Service Principal account and passes it to the discovery phase:

```
// Checking for Azure throttling workaround
if (!gs.nil(cloudServiceAccountId)) {
    if (!gs.nil(dcType) && dcType == 'cmdb_ci_azure_datacenter') {
        // get the current service account
        var grSA = new GlideRecord('cmdb_ci_cloud_service_account');
        grSA.addEncodedQuery("account_id=" + cloudServiceAccountId);
        grSA.query();
        while (grSA.next()) {
            // grab the discovery credentials (the Service Principal) from the service account
            var dcSysID = grSA.getValue('discovery_credentials');
            // look up the Service Principal
            var grDC = new GlideRecord('azure_service_principal');
            if (grDC.get(dcSysID)) {
                // get the MID server associated to the Service Principal
                var customMidSysId = grDC.getValue('u_mid_server');
                if (!gs.nil(customMidSysId)) {
                    // pass the selected MID server to the discovery process
                    return customMidSysId;
                }
            }
        }
    }
}
```
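As an optional check, the lookup chain this snippet relies on (service account to discovery_credentials to Service Principal to u_mid_server) can be exercised from a background script. The following is a minimal sketch using only the tables and the custom u_mid_server field described above; 'YOUR_ACCOUNT_ID' is a placeholder for a real Azure account ID, and the sketch only reports a MID Server once the association steps further below have been completed.

```
// Minimal verification sketch (assumes the custom u_mid_server field created above).
// Replace 'YOUR_ACCOUNT_ID' with a real Azure account ID.
var accountId = 'YOUR_ACCOUNT_ID';
var grSA = new GlideRecord('cmdb_ci_cloud_service_account');
grSA.addQuery('account_id', accountId);
grSA.query();
while (grSA.next()) {
    var grSP = new GlideRecord('azure_service_principal');
    if (grSP.get(grSA.getValue('discovery_credentials'))) {
        // getDisplayValue resolves the reference to the MID Server name rather than its sys_id
        gs.info('Account ' + accountId + ' -> Service Principal ' + grSP.getDisplayValue() +
                ' -> MID Server ' + grSP.getDisplayValue('u_mid_server'));
    }
}
```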
Review your current MID server(s) capabilities and configurations.
- Navigate to MID Server > Servers
- For each MID server associated with Azure infrastructure discovery, set the following:
  - Configuration Parameters > add threads.max = 15 (default is 25)
  - Capabilities > add Cloud Management and Azure

Restart your MID Server(s).
- Use the UI Action on the form: "Restart MID"
***Note: this will also push the updated MID Server > Script Include to the MID if it did not occur upon update.

Associate a MID server to the Service Principal.
- Navigate to Microsoft Azure Discovery > Credentials (Service Principals)
- Add the u_mid_server (aka MID Server) field to the list view
- Double click the MID Server column for a Service Principal and select the appropriate MID server

Associate the Azure subscription and Azure service account back to a specific Service Principal.
- Navigate to Azure Subscriptions: in the navigator type cmdb_ci_azure_subscription.list
- Select a record and, on the Service Principal field, choose one of the Service Principals with a dedicated MID Server
- Do the same for the Service Account listed in the related table on the form
- Repeat this process for each Azure Subscription
***Note: The idea here is to evenly distribute each Azure Subscription > Service Account > Service Principal > MID Server.

Example of distribution of load:
- 200 total subscriptions, each associated with its own service account
- 5 MID Servers dedicated to Azure discovery
- 5 Service Principals with ARG access across all subscriptions
- Set up 5 Azure Cloud Resource discovery schedules for subscriptions associated with each of the 5 Service Principals. The average discovery time for the 5 schedules is ~10-12 minutes.
- Set up another 5 schedules associated with 5 subscriptions against the 5 different Service Principals at a 20 minute interval after the first five (you can also do this using "After Discovery" for Run instead of "Daily").
- The time to finish all 200 subscriptions is then ~13 hours (200 subscriptions in batches of 5 is 40 batches; at a 20 minute interval that is 800 minutes, roughly 13 hours). This leaves ~11 hours for large footprints in Azure.

To increase the number of subscriptions that can be discovered without a 429 error, simply add a new Service Principal account, associate that Service Principal with a new MID server (with the configurations listed above), associate the Service Principal with the designated Subscription(s) and Service Account(s), and create discovery schedules for those subscriptions.

Related Problem: PRB1459683