Before reading this article, you need to understand two major concepts: ServiceNow "semaphores" and "worker threads". These are explained in many places, but essentially ServiceNow limits the number of threads available to handle inbound UI requests, web service traffic and background jobs/operations. When designing an outbound web service (from ServiceNow to some remote endpoint) you should consider which of these thread pools are being used and for how long, and balance the pros and cons.
If you open Diagnostics > Stats (/stats.do), you can see the current status of the thread pools for your node in the following sections. All ServiceNow instances have multiple nodes and every user session gets tied to a specific node.
1. The "Default" semaphore pool handles UI requests. There are 16* Default threads per node.
2. The "API_INT" semaphore pool handles Web service requests. There are 4* API_INT threads per node.
3. The "Scheduler Workers" handle background jobs/operations. There are 8* Scheduler Worker threads per node.
* These numbers are the default settings for all instances as of the Orlando release of ServiceNow, 2020.
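To get a feel for why holding one of these threads matters, here is a rough back-of-the-envelope sketch. It assumes every request in a pool holds its thread for the same length of time, which is a simplification, and the function name is illustrative rather than part of any ServiceNow API.

```javascript
// Rough ceiling on sustained requests per second for one node's pool,
// assuming each request holds its thread for the same number of seconds.
function maxRequestsPerSecond(threadsInPool, secondsHeldPerRequest) {
    if (secondsHeldPerRequest <= 0) {
        throw new Error('secondsHeldPerRequest must be greater than zero');
    }
    return threadsInPool / secondsHeldPerRequest;
}
```

With the defaults listed above, `maxRequestsPerSecond(16, 10)` gives 1.6. In other words, if every UI transaction waited 10 seconds on a remote endpoint, a node's Default pool could sustain fewer than two such transactions per second before requests start to queue.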
You might want to refer to the following article for additional information about ServiceNow's /stats.do page:
1. If at all possible, avoid solutions that include making the UI wait on an external integration endpoint. This includes Workflows that do not have a timer before an Activity that makes an outbound web service call (see KB0647534 - How does a Workflow Timer help?).
2. If there is no way to avoid making the UI wait on an external integration endpoint, then you should set a very low timeout threshold and use the direct execute() method. Ideally, you should also attempt to let the end user know why they are having to wait for a response - at a minimum, a little message that indicates the system is waiting on an external source and how long the system might wait for a response.
3. The illustrations here will all use RESTMessageV2 as the example, but the same behavior applies to SOAPMessageV2 as well.
4. As of ServiceNow's New York release the lag in the creation/execution of scheduled jobs has been hugely reduced as the scheduler thread (that runs on each node) now claims new jobs every 1 second. This is good. It means the asynchronous methods below that leverage scheduled jobs incur 5-9 seconds less automatic lag time. For large customers (those with 5 or more nodes per side), this was never a big lag time since each node claims jobs individually.
5. RESTResponseV2/SOAPResponseV2 getter methods, such as getBody() and getStatusCode(), will cause SOAPMessageV2/RESTMessageV2 to wait for a response - just as if you had called waitForResponse(ms)! So, you should be aware that these methods will cause the initiating thread to hang while waiting for the HTTP Response body or HTTP Response status code to be populated.
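Points 2 and 5 can be combined into a minimal sketch of a direct call that fails fast. It assumes a server-side script where the platform's `sn_ws` namespace is available. The function name, the shape of the returned object and the endpoint passed in are illustrative only.

```javascript
// Illustrative helper (not a platform API): direct call with a short timeout.
// Relies on the ServiceNow global `sn_ws`, so it only runs on an instance.
function callWithShortTimeout(endpoint, timeoutMs) {
    var result = { ok: false, status: null, body: null, error: null };
    try {
        var request = new sn_ws.RESTMessageV2();
        request.setEndpoint(endpoint);
        request.setHttpMethod('get');
        // Keep this low: the calling thread is frozen for up to this long.
        request.setHttpTimeout(timeoutMs);

        // execute() blocks the current thread until a response or a timeout.
        var response = request.execute();

        // These getters would block too, but execute() has already waited.
        result.status = response.getStatusCode();
        if (response.haveError()) {
            result.error = response.getErrorMessage();
        } else {
            result.ok = true;
            result.body = response.getBody();
        }
    } catch (ex) {
        // Treat a thrown exception the same as an error response.
        result.error = String(ex.message || ex);
    }
    return result;
}
```

A caller on the UI thread could check `result.ok` and, on failure, show the user a short message explaining that an external system did not answer in time.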
The connection timeout is the maximum time a client (i.e. the ServiceNow instance or MID server) will wait for a TCP connection to be established with the web service endpoint. This can be set manually for each request using the setHttpTimeout(milliseconds) method. This timeout value applies to both synchronous and asynchronous requests. Note that calling this method will also override the socket timeout for the request.
The socket timeout is the maximum time a client will wait to receive an individual TCP packet from the endpoint. This can be set manually for each request using the setHttpTimeout(milliseconds) method. This timeout value applies to both synchronous and asynchronous requests. Note that calling this method will also override the connection timeout for the request.
The ECC timeout value is the amount of time that an asynchronous request will wait for the response to show up in the ECC Queue. This timeout applies to all asynchronous requests, both with a MID Server and without. However, this does not apply to synchronous requests because they do not use the ECC Queue. This can be controlled globally using the properties glide.rest.outbound.ecc_response.timeout or glide.soap.outbound.ecc_response.timeout. It can also be set on a per-request basis by passing the number of seconds to the waitForResponse(seconds) method.
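The units are easy to mix up: setHttpTimeout() takes milliseconds, while waitForResponse() takes seconds. The following is a minimal sketch that sets both on one asynchronous request, assuming a server-side script with `sn_ws` available. The function and parameter names are illustrative, and the blocking wait is there only to show where the ECC timeout applies, not as a recommended pattern.

```javascript
// Illustrative helper (not a platform API). Relies on the global `sn_ws`.
function sendAsyncAndWait(endpoint, httpTimeoutMs, eccWaitSeconds) {
    var outcome = { body: null, error: null };
    try {
        var request = new sn_ws.RESTMessageV2();
        request.setEndpoint(endpoint);
        request.setHttpMethod('get');
        // Connection and socket timeout for the HTTP call, in MILLISECONDS.
        request.setHttpTimeout(httpTimeoutMs);

        var response = request.executeAsync();
        // ECC timeout for this request, in SECONDS. This call freezes the
        // current thread until the response reaches the ECC Queue.
        response.waitForResponse(eccWaitSeconds);

        if (response.haveError()) {
            outcome.error = response.getErrorMessage();
        } else {
            outcome.body = response.getBody();
        }
    } catch (ex) {
        outcome.error = String(ex.message || ex);
    }
    return outcome;
}
```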
The following are timeout-related sys_properties:
This is the default behavior. In order to use a MID Server, you must explicitly specify the MID Server to be used. In this case an HTTP request will be sent from your ServiceNow instance to the web server end point that you specify.
Using the execute() method means that a web service call will execute directly from the ServiceNow instance on the currently executing thread, and that thread will pause until a response is received. This can be problematic depending on how long it takes for the request to be processed and a response returned to ServiceNow. While ServiceNow is waiting for a response, whatever thread initiated the call will be frozen. For example, if the call was initiated by end user activity, then the end user will experience a frozen screen until the response is returned.
Benefits: Using execute() is simple. You call the method and it returns the Response object in one line of code. Then you do your response handling against the response object API (SOAPResponseV2 or RESTResponseV2).
Drawbacks: The executing thread is frozen until the response comes back. While a direct call is by far the simplest option, if your web service call is initiating from the UI, this can be a problem.
Usually executeAsync() is used with a MID Server. However, you can also use it without a MID Server. Using the executeAsync() method without a MID Server will send the SOAP/REST request through a scheduled job (sys_trigger table) and the details of the response to the SOAP/REST request will be stored in the ECC Queue (ecc_queue table). This means that the currently executing thread will move to the next line directly after the scheduled job has been created and will not wait on that line of code for the response to come back. This helps improve performance time for requests triggered by the UI. However, since the thread moves on without waiting for a response there is no way to handle the response.
Besides not being able to handle the response, executeAsync() also introduces a couple of layers of complexity due to using a scheduled job and the ECC queue. Since the request now executes on a scheduled job, whatever latency was hitting the users from the UI will now be shifted to the scheduled job queue. This could be problematic if the scheduled job queue fills up with slow web service request calls or other jobs. Also, this method incurs the slight overhead of creating and executing a scheduled job - a process that usually takes around 0 to 1 seconds (as of New York). That can have a noticeable performance impact if you need your web service to execute within a couple of milliseconds. The user who initiated the original thread won't have to wait, which is good, but the overall response time for the integration will be impacted. This makes executeAsync() a questionable choice for any integration that needs to execute in near real-time. However, if your integration can tolerate a few seconds of added latency to each request, maybe this will work for you.
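A minimal fire-and-forget sketch of executeAsync() without a MID Server might look like the following. It assumes a server-side script with `sn_ws` available and a remote endpoint that accepts a JSON POST. The function name and payload are illustrative.

```javascript
// Illustrative helper (not a platform API). Relies on the global `sn_ws`.
function sendFireAndForget(endpoint, payload) {
    var request = new sn_ws.RESTMessageV2();
    request.setEndpoint(endpoint);
    request.setHttpMethod('post');
    request.setRequestHeader('Content-Type', 'application/json');
    request.setRequestBody(JSON.stringify(payload));

    // Hands the request to a scheduled job and returns straight away.
    // The returned response object is deliberately ignored: calling
    // getBody() or getStatusCode() on it would make this thread wait.
    request.executeAsync();
}
```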
So, the above method improves the situation in some ways because now the UI is not getting hung while we wait for the web service response. However, since we are not waiting for the response we have no way to handle the response! Our web request has presumably been sent out across the ether to do its work, but how do we know if it worked or not - what response came back? Ground control to Major Tom, can you hear me Major Tom?!
To overcome the issue of not being able to handle the response, sometimes people decide to use the waitForResponse(seconds) method. However, I am going to recommend that unless you are using a MID Server, this is almost never a good idea! Why, you say? I will tell you why. Because the method waitForResponse() causes the initiating thread to freeze again, so we are back to poor response times on the initiating thread - the problem we were trying to avoid by becoming asynchronous. The fact is, when used in combination with waitForResponse(), executeAsync() is no longer asynchronous.
Don't worry. I have a solution. We can make your web service request truly asynchronous and we can handle the response too! ANNOUNCER VOICE: Now introducing (cue generic fanfare music) the setEccTopic() method! The setEccTopic() method takes a little more development work, but it allows for truly asynchronous request handling, while at the same time benefitting from the ability to define a custom "sensor" that will handle the response. There are still some drawbacks to this method. We now have two scheduled jobs that must be created, scheduled, picked up and processed and therefore have at least 1-2 seconds of added latency before we can start processing the response (remember the diagram says 6-10 because it was designed before New York improved the initial latency of scheduled jobs). However, by and large, this is a good solution as long as your web service can tolerate the added latency and the response can be handled on a different thread than the initiating request. See the diagram below. Also, there is an in-depth discussion of this method's usage here: KB0563615 - RESTMessageV2 API EccTopic Support.
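The sending half of this pattern can be sketched as follows, assuming a server-side script with `sn_ws` available. The function name and the topic value passed in are hypothetical. The response side (the custom sensor) is not shown here; the KB article referenced above covers it.

```javascript
// Illustrative helper (not a platform API). Relies on the global `sn_ws`.
function sendWithEccTopic(endpoint, payload, eccTopic) {
    var request = new sn_ws.RESTMessageV2();
    request.setEndpoint(endpoint);
    request.setHttpMethod('post');
    request.setRequestHeader('Content-Type', 'application/json');
    request.setRequestBody(JSON.stringify(payload));

    // Tag the request with a custom topic so that the response landing in
    // the ECC Queue can be picked up by a sensor written for that topic.
    request.setEccTopic(eccTopic);

    // Do not wait: the sensor handles the response on another thread.
    request.executeAsync();
}
```

A call might look like `sendWithEccTopic(url, data, 'MyIntegrationResponse')`, where `MyIntegrationResponse` is a made-up topic name that your sensor would be written to match.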
If waitForResponse() is not used, then executeAsync() method will allow the executing thread to continue immediately without waiting for a response. This can speed things up considerably - an important concern when your web service is triggered by the UI or has other time sensitive dependencies for some reason. On the other hand, if response handling is required, then you will need to weigh the options. The following is a list of options:
1. Use the setEccTopic() method in combination with executeAsync() to spin up a new background thread to process the request (more on that later). setEccTopic() allows asynchronous response handling, but like all executeAsync() method implementations, it does suffer from the inherent lag of scheduling a job and the potential of latency from other jobs causing queue overload. Also, this will not work for solutions that need to be displayed immediately, e.g. in the HTTP Response for the HTTP Request that triggered the RESTMessageV2 code in the first place.
2. Another option is to have some type of polling design where the UI makes quick requests back to the Server to see if the results it is looking for are available.
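The polling option can be sketched generically as below. This is plain JavaScript rather than a ServiceNow API, and every name in it is illustrative. `checkResult` stands for whatever quick request the UI makes back to the server (on a real form it might wrap a GlideAjax call); it is assumed to invoke its callback with `null` until the result is available.

```javascript
// Illustrative polling helper: asks checkResult repeatedly until it yields
// a value or the attempt limit is reached, then reports through onDone.
function pollForResult(checkResult, onDone, options) {
    var intervalMs = options.intervalMs;
    var maxAttempts = options.maxAttempts;
    // The scheduler is injectable so the helper can be exercised without timers.
    var schedule = options.schedule || setTimeout;
    var attempts = 0;

    function attempt() {
        attempts += 1;
        checkResult(function (result) {
            if (result !== null && result !== undefined) {
                onDone(null, result);
            } else if (attempts >= maxAttempts) {
                onDone(new Error('No result after ' + attempts + ' attempts'), null);
            } else {
                schedule(attempt, intervalMs);
            }
        });
    }

    attempt();
}
```

Keeping each check short is the point of the design: every poll borrows a thread only briefly, instead of one request holding a thread for the whole wait.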
Drawbacks: Most web service implementations expect some type of response handling for the requests that are sent out. Using the waitForResponse() method allows you to process a request, but then loses any performance advantage from being processed asynchronously.
Using a MID Server is a way to have a web service request initiated from a point within your (the ServiceNow customer's) network. A MID Server is a Java Virtual Machine that sits inside your network firewall and talks to the ServiceNow instance [mostly] through a SOAP integration.
When a MID Server is used, the execute() command implicitly becomes asynchronous, so execute() and executeAsync() act the same way. Whichever method you use to execute the request, the waitForResponse() method must be used to get the response back from the MID Server.
You might use a MID Server + executeAsync() if you want to have your REST/SOAP request be initiated from inside your network. If you don't need to wait for a response or you have some other method of getting the response (like a bi-directional integration with a correlation ID, for example) then this might be a good option. It will allow the initiating thread to be immediately released, optimizing ServiceNow semaphore resources (Default/API_INT).
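A minimal sketch of MID Server + executeAsync() with a correlation ID, assuming a server-side script with `sn_ws` available, might look like this. The function name, the payload layout and the `correlation_id` field are illustrative, and the remote system is assumed to quote that ID when it calls back later.

```javascript
// Illustrative helper (not a platform API). Relies on the global `sn_ws`.
function sendViaMidServer(endpoint, payload, midServerName, correlationId) {
    // Hypothetical envelope: the remote side is assumed to echo
    // correlation_id back in its own inbound call to ServiceNow.
    var body = { correlation_id: correlationId, data: payload };

    var request = new sn_ws.RESTMessageV2();
    request.setEndpoint(endpoint);
    request.setHttpMethod('post');
    request.setRequestHeader('Content-Type', 'application/json');
    request.setRequestBody(JSON.stringify(body));

    // Route the call through the named MID Server inside your network.
    request.setMIDServer(midServerName);

    // Release the initiating thread immediately; no waitForResponse().
    request.executeAsync();
    return correlationId;
}
```

Storing the returned ID on the originating record gives the later inbound callback something to match against.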
If you use a MID Server + executeAsync() + waitForResponse() then we are back to freezing the initiating thread. This is potentially the worst case scenario in terms of performance. It has the largest amount of lag time, highest number of points of failure, and most frozen threads waiting for responses. You should really try to avoid this!
Using a custom probe to handle the response using MID Server + executeAsync() + setEccTopic() might be a more efficient way to achieve your security goals, if you want your web service call to be initiated from within your network. However, you should consider that there are multiple points of failure and each step will incur some lag time. Make sure to use proper timeout/retry settings and write exception handling to ensure that failed operations will be noticed.