[Predictive Intelligence] Similarity solution throws NoHttpResponseException from the Prediction Server on port 443 with "failed to respond from prediction call", and throws a Read timed out exception during a push model call on very large solutions

Issue

A Similarity solution whose model changes frequently must push each updated model to the Prediction Server so that the updates are included in new predictions. For large updates to large Similarity models, or when the Prediction Server does not have the solution model in memory, the push model call and the subsequent prediction call can exceed the default timeout of 10 seconds. When this happens, the following errors for the push model call appear in the System Logs -

2022-10-05 23:28:56 (246) Default-thread-39 0740853A1BDA95582A120D4EE54BCB44 txid=0df681f61b9e DxC_ML: dxc_id=2172519121256537 Starting push model call for ml_sn_global_similar_incidents
2022-10-05 23:29:11 (544) http-18 New transaction 0740853A1BDA95582A120D4EE54BCB44 #15762752 /api/now/v1/batch
2022-10-05 23:29:29 (086) Default-thread-39 0740853A1BDA95582A120D4EE54BCB44 txid=0df681f61b9e SEVERE *** ERROR *** DxC_ML: dxc_id=2172519121256537 Received org.apache.http.NoHttpResponseException: mlpredictor-customer.ams100.service-now.com:443 failed to respond from prediction call for ml_sn_global_similar_incidents solution
2022-10-05 23:29:29 (088) Default-thread-39 0740853A1BDA95582A120D4EE54BCB44 txid=0df681f61b9e DxC_ML: dxc_id=2172519121256537 End prediction for ml_sn_global_similar_incidents took 37511 ms
2022-10-05 23:29:29 (089) Default-thread-39 0740853A1BDA95582A120D4EE54BCB44 txid=0df681f61b9e SEVERE *** ERROR *** DxC_ML: dxc_id=2172519117548700 Received com.snc.ml.prediction.common.ServiceException: org.apache.http.NoHttpResponseException: mlpredictor-customer.ams100.service-now.com:443 failed to respond from prediction call for ml_sn_global_similar_incidents solution
2022-10-05 23:29:29 (093) Default-thread-39 0740853A1BDA95582A120D4EE54BCB44 txid=0df681f61b9e SEVERE *** ERROR *** MLPredictor: Exception caught: java.lang.NullPointerException

For the prediction call that timed out, the Read timed out exception appears in the System Logs -

DxC_ML: dxc_id=734293073336436 Received Read timed out from prediction call for ml_sn_global_similar_incidents solution: java.lang.Exception: Read timed out:
com.glide.platform_ml.api.predictionserver.SimilarityPredictionClient.predict(SimilarityPredictionClient.java:285)
com.glide.platform_ml.api.Solution.predictInternal(Solution.java:435)

Therefore, when the push model call exceeds the default timeout, the prediction call also times out after 10 seconds, and the prediction fails and returns to the client with a "Read timed out" error. However, all subsequent prediction calls on the solution return predictions, because the Similarity solution model is held in the Prediction Server memory once the push model call has completed successfully.
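To find which solutions are affected, exported System Log lines can be scanned for these two errors. The following is a minimal Python sketch, not a ServiceNow tool: the function name count_timeout_errors is hypothetical, and the regular expression is inferred only from the log excerpts quoted in this article, so other releases may word the messages differently.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpts in this article (an assumption);
# it captures the error text and the solution name.
PREDICTION_ERROR = re.compile(
    r"Received (?P<error>.+?) from prediction call for (?P<solution>\S+) solution"
)


def count_timeout_errors(log_lines):
    """Count NoHttpResponseException / Read timed out errors per solution name."""
    counts = Counter()
    for line in log_lines:
        match = PREDICTION_ERROR.search(line)
        if match is None:
            continue  # not a prediction-call error line
        error = match.group("error")
        if "NoHttpResponseException" in error or "Read timed out" in error:
            counts[match.group("solution")] += 1
    return counts


# Sample lines copied from the excerpts in this article.
sample = [
    "2022-10-05 23:29:29 (086) Default-thread-39 0740853A1BDA95582A120D4EE54BCB44 "
    "txid=0df681f61b9e SEVERE *** ERROR *** DxC_ML: dxc_id=2172519121256537 "
    "Received org.apache.http.NoHttpResponseException: "
    "mlpredictor-customer.ams100.service-now.com:443 failed to respond "
    "from prediction call for ml_sn_global_similar_incidents solution",
    "DxC_ML: dxc_id=734293073336436 Received Read timed out from prediction call "
    "for ml_sn_global_similar_incidents solution: java.lang.Exception: Read timed out:",
    "2022-10-05 23:29:29 (088) Default-thread-39 0740853A1BDA95582A120D4EE54BCB44 "
    "txid=0df681f61b9e DxC_ML: dxc_id=2172519121256537 "
    "End prediction for ml_sn_global_similar_incidents took 37511 ms",
]

for solution, count in count_timeout_errors(sample).items():
    print(solution, count)
```

Each solution name reported this way is a candidate for the per-solution timeout properties described in the Resolution.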

Cause

Predictions on Similarity solutions are generally user invoked. If the solution is very large and not used often, the Prediction Server may no longer have the solution in memory, so the prediction triggers a push model call for the Prediction Server to retrieve the trained solution from the instance, and this call may exceed the default timeout of 10 seconds. The Prediction Server keeps a model in memory for only 48 hours after the last prediction call on it; after that, the model is removed from the Prediction Server memory.

If the trained Similarity solution is very large, the push model call that updates it in the Prediction Server memory can take longer than 10 seconds, exceeding the default timeout.

If the trained Similarity solution has frequent large updates, each update also requires a push model call to refresh the Similarity solution in the Prediction Server memory, which may also exceed the default timeout of 10 seconds.
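The conditions above can be summarized as a small decision function. This is an illustrative Python sketch of the behavior as described in this article, not the Prediction Server's actual logic: the function push_model_needed and its parameters are hypothetical, and treating a solution with no previous prediction as "not in memory" is an assumption.

```python
from datetime import datetime, timedelta

# Retention period stated in this article: models stay in Prediction Server
# memory for 48 hours after the last prediction call.
MODEL_RETENTION = timedelta(hours=48)


def push_model_needed(last_prediction_at, now, model_updated):
    """Return True when the next prediction is expected to need a push model call.

    last_prediction_at: datetime of the previous prediction call, or None if
        there has been none (assumed here to mean the model is not in memory).
    now: datetime of the prediction being made.
    model_updated: True if the solution model changed since it was last pushed.
    """
    if model_updated:
        return True  # updated model must be pushed to the Prediction Server
    if last_prediction_at is None:
        return True  # assumption: never loaded, so not in memory
    # Model is assumed evicted once the retention period has elapsed.
    return now - last_prediction_at > MODEL_RETENTION


now = datetime(2022, 10, 5, 23, 28, 56)
print(push_model_needed(now - timedelta(hours=1), now, model_updated=False))
print(push_model_needed(now - timedelta(hours=72), now, model_updated=False))
print(push_model_needed(now - timedelta(hours=1), now, model_updated=True))
```

Any case where this returns True is one where, for a very large solution, the first prediction may exceed the default 10 second timeout.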

Resolution

The prediction timeout and the push model timeout can be set at the Predictive Intelligence solution level, overriding the defaults, which are both 10 seconds. Be aware that if the Similarity solution is user invoked, users may have to wait longer than 10 seconds for a prediction to return while the push model call on the Similarity solution is in progress. Create two system properties for each Similarity solution, using the "Solution Name" [ml_solution] in the system property name. For example, if your custom Similarity solution name is [ml_sn_global_similar_incidents], create the two system properties as follows -

1. In the Global scope, create two new "integer" system properties:
   [glide.mlpredictor.ml_sn_global_similar_incidents.predict.request_timeout]
   [glide.mlpredictor.ml_sn_global_similar_incidents.push.request_timeout]
2. Set the value to 200000 [in ms] for both system properties.

This sets the prediction request timeout and the push model request timeout to 200 seconds for the solution [ml_sn_global_similar_incidents], to handle very large models. The push request is the call that loads the model from the instance into the Prediction Server memory.

Important note: You can create a system property to change the default prediction timeout for any Predictive Intelligence solution by following the above steps and using the solution name in the system property name. However, we do not recommend setting it lower than the default value of 10 seconds, and the timeout should only be increased in specific circumstances such as the one described in this article. As always, test in sub-production first.
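When several Similarity solutions need these overrides, the property names can be generated rather than typed by hand. The sketch below is a hypothetical Python helper, not part of the platform: it assumes the naming pattern glide.mlpredictor.<solution name>.predict.request_timeout and glide.mlpredictor.<solution name>.push.request_timeout with no spaces around the solution name, and it only builds the names and values. The properties themselves still have to be created on the instance.

```python
# Default timeout stated in this article: 10 seconds, expressed in milliseconds.
DEFAULT_TIMEOUT_MS = 10_000


def timeout_properties(solution_name, timeout_ms=200_000):
    """Build the two per-solution timeout property names and their value.

    Hypothetical helper: returns a dict of property name -> timeout in ms.
    Rejects values below the 10 second default, which the article advises against.
    """
    if timeout_ms < DEFAULT_TIMEOUT_MS:
        raise ValueError("timeout below the 10 second default is not recommended")
    prefix = f"glide.mlpredictor.{solution_name}"
    return {
        f"{prefix}.predict.request_timeout": timeout_ms,
        f"{prefix}.push.request_timeout": timeout_ms,
    }


for name, value in timeout_properties("ml_sn_global_similar_incidents").items():
    print(name, "=", value)
```

The default of 200000 ms mirrors the value used in the example above; a different value can be passed per solution after testing in sub-production.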