MID Server Connectivity/Validation Issues in Paris


Description

In Paris, it has been observed that the main issue with MID Servers revolves around the OCSP check. This is where the MID Server verifies, through entrust.net, that service-now.com is who it says it is. If the MID Server host does not have a connection on port 443 to entrust.net, the MID Server may not actually be validated (even though it is reported as validated), and it ends up having issues connecting to the instance. This could also very well create an AMB issue.
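
As a quick way to test the port 443 requirement from the MID Server host, a minimal Python sketch follows. It only checks that a TCP connection can be opened; it does not perform an OCSP request. The function name `can_connect` and the default timeout are illustrative choices, not part of any ServiceNow tooling.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only proves the port is reachable from this machine; it does not
    validate certificates or send an OCSP request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `can_connect("entrust.net")` on the MID Server host uses the hostname given in this article. The exact OCSP responder hostname the MID Server contacts may differ, so treat a successful result as a necessary check rather than proof that the OCSP check will pass.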

After a MID Server has been validated, the OCSP connection that was good during the original validation may later be put into an unstable state. This can also happen if the properties that are used to prevent the check from happening are reverted to the following:

com.glide.communications.httpclient.verify_hostname = true

mid.security.validation.endpoints = *.service-now.com

These values indicate that the check is being performed again, which creates the unstable validation state. Additionally, the following error appears in the agent log:


2020-10-28 14:11:08 (866) StartupSequencer WARNING *** WARNING *** An active MID Server with a duplicate name detected.
java.lang.Exception: An active MID Server with a duplicate name detected.
at com.service_now.mid.Instance.ensureUniquelyNamedAgentRecord(Instance.java:206)
at com.service_now.mid.Instance.ensureAgentRecord(Instance.java:165)
at com.service_now.mid.services.StartupSequencer.ensureUniqueAgentRecordFromInstance(StartupSequencer.java:243)
at com.service_now.mid.services.StartupSequencer.testsSucceeded(StartupSequencer.java:137)
at com.service_now.mid.services.StartupSequencer.startupSequencerRunnable(StartupSequencer.java:624)
at java.lang.Thread.run(Thread.java:748)

2020-10-28 14:11:08 (866) StartupSequencer WARNING *** WARNING ***
2020-10-28 14:11:08 (866) StartupSequencer WARNING *** WARNING *** Encountered error in ensuring agent record on the instance. Retry...
2020-10-28 14:11:08 (866) StartupSequencer Waiting to retry in 5 minutes. Attempt 1 of 3.


This message is somewhat misleading, as it seems to occur when multiple MID Servers fail the OCSP check.
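
To spot this pattern quickly, the agent log can be scanned for the warning text shown above. The following is a minimal Python sketch that relies only on the strings visible in the excerpt; the function name `scan_agent_log` is hypothetical, and how you read the log file into lines is left to you.

```python
import re

# Strings taken from the log excerpt in this article.
DUPLICATE_NAME = "An active MID Server with a duplicate name detected"
RETRY = re.compile(r"Attempt (\d+) of (\d+)")


def scan_agent_log(lines):
    """Return (duplicate_warning_count, retry_attempts) for the given log lines.

    Only StartupSequencer warning lines are counted, so the repeated
    java.lang.Exception line of the stack trace is not counted twice.
    """
    duplicate_warnings = 0
    attempts = []
    for line in lines:
        if "StartupSequencer" in line and DUPLICATE_NAME in line:
            duplicate_warnings += 1
        match = RETRY.search(line)
        if match:
            attempts.append((int(match.group(1)), int(match.group(2))))
    return duplicate_warnings, attempts
```

A non-zero warning count on more than one MID Server at the same time is a hint to look at the OCSP check rather than at actual duplicate names.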



Release or Environment

Paris (possibly Orlando)

Cause

Validation appears to get confused when the OCSP check is re-enabled or the connection to entrust.net is severed, which leaves the validation in an unstable state.

Resolution

The first thing to do is to verify the following properties on the instance:
1. On sys_properties.LIST:
com.glide.communications.httpclient.verify_hostname = false
2. On ecc_agent_property.LIST:
mid.security.validation.endpoints = blank (the check is enabled if the value is *.service-now.com)
3. Additionally, set this same property (mid.security.validation.endpoints) to blank in the agent/work.remote.properties file on the MID Server
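
The property checks above can be expressed as a small comparison. This is a minimal Python sketch, assuming you have already collected the current values into a dictionary by hand or by some other means; the helper name `properties_to_fix` is hypothetical, and the sketch does not query the instance or read any file.

```python
# Workaround values from this article: hostname verification off,
# validation endpoints blank.
EXPECTED = {
    "com.glide.communications.httpclient.verify_hostname": "false",
    "mid.security.validation.endpoints": "",
}


def properties_to_fix(current):
    """Return the property names whose values differ from the workaround values.

    A property missing from `current` is reported too, because this sketch
    cannot know how a missing property is treated (an assumption to verify).
    """
    wrong = []
    for name, expected in EXPECTED.items():
        value = current.get(name)
        if value is None or value.strip().lower() != expected:
            wrong.append(name)
    return wrong
```

An empty result means both properties already match the values listed above; any name returned is one to change before retrying validation.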

***Please note that this disables the OCSP check. This may be a security concern, but for the time being the check may be what is preventing the MID Server from reaching a properly validated state (even though it is reported as validated).

Now that the check is disabled, keep in mind that several MID Servers may be stuck in this state. Whether one or several MID Servers have the issue, do the following:
1. On the instance, invalidate ALL MID Servers that have the issue
2. Stop the MID Server service on all servers
and then, one MID Server at a time:
3. Delete the MID Server service (as Administrator from the command prompt): > sc delete <ServiceName>
*You can get the service name by right-clicking the service and opening Properties; the value is already highlighted so you can copy and paste it into the command.
4. Run Start.bat
5. Validate the MID Server

The MID Servers should validate. If they validate and the issue persists, try a manual rebuild of the MID Server in case a corrupted binary is causing the issue:

1. Stop the MID Server service
2. Download the install package for the MID Server from the instance (MID Server -> Downloads)
3. Rename the current agent folder to agent_old (if it does not let you, you may have a window open in that folder, or the MID Server host may require a reboot)
4. Extract the agent folder to the same location as agent_old
5. Copy agent_old/conf, agent_old/keystore, and agent_old/config.xml to the same locations in the "new" agent folder (overwrite)
6. If you are using certificates, copy agent_old/jre/lib/security/cacerts to the same location in agent (overwrite)
7. Delete the current MID Server service (as Administrator from the command prompt): > sc delete <ServiceName>
*You can get the service name by right-clicking the service and opening Properties; the value is already highlighted so you can copy and paste it into the command.
8. Run Start.bat

*You should not need to revalidate the MID Server.
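
The copy steps of the manual rebuild (steps 5 and 6) can also be scripted. The following is a minimal Python sketch, assuming that conf and keystore are folders and config.xml is a file directly under the agent folder, and that Python 3.8 or later is available on the host. The function name `restore_agent_config` is hypothetical; run it only after the service is stopped and the new agent folder has been extracted.

```python
import shutil
from pathlib import Path


def restore_agent_config(old_agent, new_agent, include_cacerts=False):
    """Copy configuration from agent_old into the freshly extracted agent folder.

    Existing files in the new folder are overwritten, as in the manual steps.
    """
    old_agent, new_agent = Path(old_agent), Path(new_agent)

    # Assumption: conf and keystore are directories.
    for folder in ("conf", "keystore"):
        shutil.copytree(old_agent / folder, new_agent / folder, dirs_exist_ok=True)

    shutil.copy2(old_agent / "config.xml", new_agent / "config.xml")

    if include_cacerts:
        # Only needed if custom certificates were imported into the old JRE.
        rel = Path("jre", "lib", "security", "cacerts")
        (new_agent / rel).parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(old_agent / rel, new_agent / rel)
```

For example, `restore_agent_config("agent_old", "agent", include_cacerts=True)` mirrors steps 5 and 6 when both folders sit in the current directory. Deleting the service and running Start.bat remain manual steps.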