LDAP Integration Using a MID Server Intermittently Fails With Error: Did not get a response from the MID server after waiting for 55 seconds


Description

The LDAP logs periodically show the following error:

Error LDAP Server: LDAPServer-LDAP URL: ldap://ldapserver.test.corp failed scheduled connection test. ErrorCode: 40100. ErrorMessage: Did not get a response from the MID server after waiting for 55 seconds.

Repeatedly testing LDAP connectivity from a browser, using the URL below, returns the following failure for some of the tests:

https://<instancename>.service-now.com/security_status.do?name=LDAPAuthStatus&action=testconnection

{ 
"LDAPServer-LDAP" : [ {
"url" : "ldap://ldapserver.test.corp",
"operational_status" : true,
"test_error_code" : 40100,
"test_error_message" : "Did not get a response from the MID server after waiting for 55 seconds",
"test_success" : false
} ]
}

At other times the same test passes:

{
"LDAPServer-LDAP" : [ {
"url" : "ldap://ldapserver.test.corp",
"operational_status" : true,
"test_error_code" : 0,
"test_error_message" : "Connected successfully",
"test_success" : true
} ]
}

This demonstrates the intermittent nature of the failure.

The failure may also have generated an automatic alert Case in HI with the subject "AHA Audit Failed: LDAP Connectivity", because our monitoring uses the same test.
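When many test responses have been collected, tallying them makes the intermittent pattern easier to see. The following is a minimal sketch, assuming the response bodies have already been saved as JSON strings in the shape shown above. The function names are illustrative and not part of the platform, and how the responses are fetched (including authentication) is left out.

```python
import json
from collections import Counter


def summarize_ldap_tests(responses):
    """Count outcomes per (server name, URL, error code) across saved responses.

    `responses` is an iterable of JSON strings shaped like the samples above.
    """
    tally = Counter()
    for body in responses:
        data = json.loads(body)
        # Top-level keys are LDAP server names; each maps to a list of URL results.
        for server_name, results in data.items():
            for result in results:
                key = (server_name, result["url"], result["test_error_code"])
                tally[key] += 1
    return tally


def is_intermittent(tally, server_name, url):
    """True if the same server/URL both passed (code 0) and failed at least once."""
    codes = {code for (name, u, code) in tally if name == server_name and u == url}
    return 0 in codes and len(codes) > 1
```

A server and URL pair that appears with error code 0 and also with error code 40100 matches the intermittent behavior described here.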

Release or Environment

Any LDAP integration that runs the LDAP connection through a MID Server. Applies to any release.

 

Cause

To check whether this cause applies:

  1. Go to MID Server > Servers and select the MID Server that the LDAP connection runs through
  2. Under "Related Links", select "Grab MID logs"
    • Two Queue = output records should appear, one with Name = agent0.log.0 and one with Name = wrapper.log
    • After a few seconds, two Queue = input records should appear, again one with Name = agent0.log.0 and one with Name = wrapper.log
    • With this issue, more than two Queue = input records are returned, for example four: two with Name = agent0.log.0 and two with Name = wrapper.log
    • If duplicated agent0.log.0 and wrapper.log records are returned, the solution provided in this KB needs to be implemented.
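The duplicate check in the steps above can be sketched as a small count over the returned records. This is only an illustration: it assumes the records have been exported as dictionaries with `queue` and `name` keys mirroring the Queue and Name columns, and the function name is hypothetical.

```python
from collections import Counter


def duplicated_log_inputs(records):
    """Return {log name: count} for every log returned more than once on the input queue.

    `records` is a list of dicts with "queue" and "name" keys (an assumed export shape).
    """
    counts = Counter(r["name"] for r in records if r["queue"] == "input")
    # A healthy MID Server returns each log exactly once.
    return {name: n for name, n in counts.items() if n > 1}
```

An empty result corresponds to the expected case of one agent0.log.0 and one wrapper.log input record. Any entry in the result corresponds to the duplicated responses described in the last two bullets.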

This may be caused by the Windows machine having two or more MID Server JVMs (in other words, two or more Windows services) running out of the same MID Server installation directory, which causes various issues with MID Server responses. In this case it delays the responses to the LDAPConnectionTesterProbe (as seen in the ECC queue), which produces the LDAP connection failure errors, and it also produces multiple responses to the "Grab MID logs" requests.

To verify this, open Windows Services on the MID Server machine, right-click each MID Server service that is in the Running state, and select Properties. Check the "Path to executable" for each service. If two or more point to the same directory path, you have this issue.
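On a machine with many services, the same comparison can be done on a list of paths instead of by opening each Properties dialog. The sketch below assumes the "Path to executable" values have already been copied into a dictionary keyed by service name. The function name, the service names, and the handling of a quoted executable followed by arguments are assumptions for illustration.

```python
import ntpath
from collections import defaultdict


def services_sharing_directory(executable_paths):
    """Group services by the directory of their executable and keep groups of two or more.

    `executable_paths` maps a service name to its "Path to executable" string.
    """
    by_dir = defaultdict(list)
    for service, raw in executable_paths.items():
        path = raw.strip()
        if path.startswith('"'):
            # Assumed form: "C:\path\to\exe" followed by optional arguments.
            path = path[1:].split('"', 1)[0]
        # Windows paths are case-insensitive, so normalize before comparing.
        directory = ntpath.dirname(ntpath.normcase(ntpath.normpath(path)))
        by_dir[directory].append(service)
    return {d: sorted(names) for d, names in by_dir.items() if len(names) > 1}
```

Any directory that appears in the result is shared by more than one service, which is the condition described above.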

Resolution

Shut down all but one of the MID Server services that share the same "Path to executable" directory on the Windows machine, keeping just one of them running. Set Startup Type to "Disabled" on the services you shut down so that they do not restart automatically.

After this, repeating "Grab MID logs" should no longer return multiple agent0.log.0 and wrapper.log files, and the LDAP connection test failures should no longer be seen.
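If you prefer the command line to the Services console, the same two actions can be expressed with the Windows `sc` utility (`sc stop` and `sc config ... start= disabled`). The sketch below only builds the command lines and does not run them. The function name and service names are hypothetical, and the commands would need to be run from an elevated prompt on the MID Server machine.

```python
def disable_commands(duplicate_services, keep):
    """Build `sc` argument lists that stop and disable every service except `keep`.

    Nothing is executed here; review the commands before running them.
    """
    if keep not in duplicate_services:
        raise ValueError(f"{keep!r} is not one of the duplicated services")
    commands = []
    for name in duplicate_services:
        if name == name and name != keep:
            commands.append(["sc", "stop", name])
            # `sc` expects "start=" and its value as separate tokens.
            commands.append(["sc", "config", name, "start=", "disabled"])
    return commands
```

Each returned list could be passed to `subprocess.run` after review. The service that is kept is left untouched so that exactly one MID Server service continues to run from the installation directory.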

 

Additional Information

This particular cause is just one potential cause of duplicate inputs, and it is quite rare. See the following for more details of the duplicate service issues. From the New York release onward, we actively prevent a second service from starting up.
KB0743043 How to debug and resolve the "A MID Server with a duplicate name or sys_id was prevented from connecting." Issue
PRB1330396 / KB0743123 MID Server start.bat fails to check if a Windows Service already exists for the installation folder before creating another service

A more likely cause is that the MID Server happens to be busy at the time of the test. See the following for a workaround that increases the priority of these test jobs so that they run immediately. These jobs run at higher priority from London Patch 9, Madrid Patch 5, and New York.
PRB1331240 / KB0743756 LDAP "Test Connection" and "Browse" features can timeout, and LDAP Monitor may show Connection Status as Not Connected, due to running at Standard(2) MID Server priority - Did not get a response from the MID server after waiting for 55 seconds