MID Server: troubleshooting WMI/Powershell issues - Credentials


Description

After running a Discovery Schedule, we sometimes find errors in Discovery Logs indicating a credentials failure. This results in Discovery status not completing successfully.

In some cases, Discovery Logs indicate a credentials failure even though the actual cause of the failed discovery is external to the MID Server. This article focuses on Discovery of Windows machines.

Credentials

In order to discover target Windows machines, we need to add Windows credentials to the Credentials table. These credential records specify a username, a password, a kind of credential (Windows, SSH, ...), and MID Servers that are "allowed" to use this credential.

When the MID Server starts or when a credential is modified, the MID Server downloads and caches all available credentials.  

MID Server configuration parameters

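The three main configuration parameters we consider in this article are:

  • mid.use_powershell: the MID Server uses PowerShell instead of WMIRunner to run Windows probes.
  • mid.powershell.use_credentials: Powershell probes use the credentials defined in the Credentials table.
  • mid.powershell.local_mid_service_credential_fallback: if all the credentials fail, the MID Server retries with the account the MID Server service is running with.
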
Starting with the Fuji release, these parameters are true by default.

Running Windows probes

Many Windows probes try to retrieve some WMI and Windows registry values from the target machine. 
There are two methods, depending on the value of the mid.use_powershell configuration parameter.

Running the WMIRunner probe

This probe cannot use the credentials defined in Discovery > Credentials. Instead, the MID Server connects to the WMI providers with the account the MID Server service is running with. We call it the "local" MID account, though this account could be a domain user account. This account must have access to the remote machines for WMIRunner to work.

Running the Powershell probe

This probe can use the credentials defined in Discovery > Credentials. In this case, the MID Server uses the PowerShell API to retrieve WMI values. If the MID Server is configured to run Powershell probes using credentials (see "MID Server configuration parameters" above), it attempts to connect to the target machine using the first defined credential. If this credential fails, the second one is attempted, and so on.
If all the credentials fail and the MID Server is configured to use the local MID account as last resort, the MID Server attempts with the user the MID Server is running with.
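
The attempt order described above can be modelled with a few lines of JavaScript. This is only an illustrative sketch, not MID Server code: the function name credentialAttemptOrder and the label used for the local MID account are hypothetical, and the behavior when mid.powershell.use_credentials is false is an assumption, because this article does not describe it.

```javascript
// Hypothetical model of the attempt order described in this article.
// It is not MID Server code and does not connect to anything.
function credentialAttemptOrder(config, credentials) {
	// Both parameters are treated as true unless explicitly set to false
	// (the article states they are true by default starting with Fuji).
	var useCredentials = config['mid.powershell.use_credentials'] !== false;
	var fallback = config['mid.powershell.local_mid_service_credential_fallback'] !== false;

	var attempts = [];
	if (useCredentials) {
		// Credentials are tried in the order in which they are defined.
		credentials.forEach(function (name) {
			attempts.push(name);
		});
	}
	// Assumption: when use_credentials is false, this sketch simply skips
	// the table credentials; the article does not cover that case.
	if (fallback) {
		// Last resort: the account the MID Server service is running with.
		attempts.push('(local MID service account)');
	}
	return attempts;
}

console.log(credentialAttemptOrder({}, ['LOCALDOMAIN\\mid', 'autolab1\\autouser']));
```

With both parameters left at their defaults, the local MID account always comes last, which is why the error returned to Discovery often refers to the local MID server service credential.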

Simulating a simple probe

Go to System Maintenance > Scripts - Background and run the following script.

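// Name of the MID Server (without the "mid.server." prefix) and IP address of the target Windows machine
var mid_server  = 'MID_SERVER_1';
var target_host = '192.168.200.14';
// Ask the MID Server to include debug output in the response payload
var debug = true;

sendProbeWmiRunner(mid_server, target_host);

// Insert an output record in the ECC Queue so that the MID Server picks up a WMIRunner probe
function sendProbeWmiRunner(mid, host) {
	var ecc = new GlideRecord('ecc_queue');
	ecc.initialize();
	ecc.agent = 'mid.server.' + mid;
	ecc.topic = 'WMIRunner';
	ecc.name = 'TestWMI';
	ecc.source = host;
	ecc.queue = 'output';
	// Fetch only Win32_BIOS.SerialNumber and skip sensor processing of the result
	ecc.payload = '<parameters><parameter name="WMI_FetchData" value="Win32_BIOS.SerialNumber"/><parameter name="port" value="135"/><parameter name="debug" value="' + debug + '"/><parameter name="skip_sensor" value="true"/></parameters>';
	ecc.insert();
}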

Replace the values of the variables mid_server and target_host with the name of your MID Server and the IP address of the remote Windows machine you are trying to discover. The script sends a WMIRunner probe to the MID Server, which tries to retrieve the BIOS serial number from the target machine. If mid.use_powershell is true, the MID Server switches the WMIRunner probe internally to Powershell.
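
If you want to change the probe parameters, for example to fetch a different WMI value, concatenating the payload by hand becomes error-prone. The following is a minimal sketch of a helper that builds the same payload string from a plain object. The names escapeXmlAttr and buildProbePayload are hypothetical helpers introduced here for illustration; they are not part of the ServiceNow API.

```javascript
// Escape characters that are not allowed inside a double-quoted XML attribute value.
function escapeXmlAttr(value) {
	return String(value)
		.replace(/&/g, '&amp;')
		.replace(/</g, '&lt;')
		.replace(/>/g, '&gt;')
		.replace(/"/g, '&quot;');
}

// Build a <parameters> payload from a plain object of name/value pairs.
// Parameters are emitted in the order in which the keys were defined.
function buildProbePayload(params) {
	var xml = '<parameters>';
	Object.keys(params).forEach(function (name) {
		xml += '<parameter name="' + escapeXmlAttr(name) +
			'" value="' + escapeXmlAttr(params[name]) + '"/>';
	});
	return xml + '</parameters>';
}

// Same parameters as the test script in this article.
var payload = buildProbePayload({
	WMI_FetchData: 'Win32_BIOS.SerialNumber',
	port: 135,
	debug: true,
	skip_sensor: true
});
console.log(payload);
```

The resulting string can be assigned to ecc.payload in place of the hand-written literal.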

Troubleshooting Windows probes

We have run the Discovery Schedule and observed some errors in Discovery Logs resulting from running WMI/Powershell probes. As described above, if the MID Server Configuration Parameter mid.use_powershell is false, troubleshoot the WMIRunner probe. Otherwise, troubleshoot the Powershell probe. 

Troubleshooting WMIRunner probes - mid.use_powershell:false

  1. Send the probe (see "Simulating a simple probe" above).
  2. Go to the ECC Queue (Discovery > ECC Queue), order the list by Created from newest to oldest and observe the output message that we have just sent (topic "WMIRunner", name "TestWMI").
  3. Wait a few seconds so that the MID Server processes the probe.
  4. Refresh the list until you get the response ECC Message.
  5. Open the message and observe the contents of Payload (click the small XML button to see the payload formatted).
  6. Observe the message within the result tag.

When the probe succeeds, you receive something similar to this:

<Win32_BIOS>
	<SerialNumber>
		VMware-56 4d 41 9f 34 6e 0c 3d-1b be f1 3c f7 2c ad 7b
	</SerialNumber>
</Win32_BIOS>

When the credential is bad, you receive this instead:

<error>
	Connection failed to WMI service. Error: Permission denied
</error>
<error>
	Thu May 28 16:24:21 2015 DEBUG: Testing WMI connection to 192.168.200.14
	Thu May 28 16:25:03 2015 DEBUG: Appending element: error
	Thu May 28 16:25:03 2015 DEBUG: getValueText()
	Thu May 28 16:25:03 2015 DEBUG: valueType: string
	Thu May 28 16:25:03 2015 DEBUG: Appended: error; Connection failed to WMI service. Error: Permission denied
	Thu May 28 16:25:03 2015 DEBUG: WMI not running
</error>

This tells us that the account the MID Server is running with is not a valid credential for the WMI API to connect to the target machine.
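
When you test many target machines, reading each payload by hand is tedious. The sketch below pulls the text of every error element out of a payload string so that the first, summary error can be inspected quickly. It is a simplification based only on the sample payloads in this article: extractErrors is a hypothetical helper, and a regular expression is used instead of a proper XML parser.

```javascript
// Return the trimmed text of every <error>...</error> element in a payload string.
// Regex-based sketch; assumes error elements are not nested.
function extractErrors(payloadXml) {
	var errors = [];
	var re = /<error>([\s\S]*?)<\/error>/g;
	var match;
	while ((match = re.exec(payloadXml)) !== null) {
		errors.push(match[1].trim());
	}
	return errors;
}

// Shortened version of the bad-credential payload shown in this article.
var samplePayload =
	'<error>\n' +
	'\tConnection failed to WMI service. Error: Permission denied\n' +
	'</error>\n' +
	'<error>\n' +
	'\tThu May 28 16:24:21 2015 DEBUG: Testing WMI connection to 192.168.200.14\n' +
	'\tThu May 28 16:25:03 2015 DEBUG: WMI not running\n' +
	'</error>';

var errors = extractErrors(samplePayload);
console.log(errors.length + ' error element(s); first: ' + errors[0]);
```

An empty array means the payload contains no error element, which is what the successful Win32_BIOS response looks like.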

Solution

There are two solutions:

  • Give the account the MID Server is running with (Windows Service) the permissions required to access the remote machine.
  • Switch mid.use_powershell to true. This is the recommended option.

Troubleshooting Powershell probes - mid.use_powershell:true

  1. Send the probe (see "Simulating a simple probe" above)
  2. Go to the ECC Queue (Discovery > ECC Queue), order the list by Created from newest to oldest and observe the output message that we have just sent (topic "WMIRunner", name "TestWMI").
  3. Wait a few seconds so that the MID Server processes the probe.
  4. Refresh the list until you get the response ECC Message.
  5. Open the message and observe the contents of Payload (click the small XML button to see the payload formatted).
  6. Observe the message within the result tag.

When the probe succeeds, you receive something similar to this:

<Win32_BIOS>
	<SerialNumber>
		VMware-56 4d 41 9f 34 6e 0c 3d-1b be f1 3c f7 2c ad 7b
	</SerialNumber>
</Win32_BIOS>

The following are the most frequent errors.

The RPC server is unavailable

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: true:

<error>
	Authentication failure with the local MID server service credential.
</error>
<error>
	Failed to access target system. Please check credentials and firewall settings on the target system to ensure accessibility: The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)
	Stack Trace:    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
	  at System.Management.ManagementScope.InitializeGuts(Object o)
	  at System.Management.ManagementScope.Initialize()
	  at System.Management.ManagementObjectSearcher.Initialize()
	  at System.Management.ManagementObjectSearcher.Get()
	  at Microsoft.PowerShell.Commands.GetWmiObjectCommand.BeginProcessing()
</error>

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: false:

<error>
	Failure(s) with available Windows credentials from the instance. Credentials tried: LOCALDOMAIN\mid,autolab1\autouser
</error>
<error>
	The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)
	Stack Trace:    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
	  at System.Management.ManagementScope.InitializeGuts(Object o)
	  at System.Management.ManagementScope.Initialize()
	  at System.Management.ManagementObjectSearcher.Initialize()
	  at System.Management.ManagementObjectSearcher.Get()
	  at Microsoft.PowerShell.Commands.GetWmiObjectCommand.BeginProcessing()
</error>

This error is confusing because the first message suggests a credential problem. In most cases, however, it is not an authentication failure caused by an incorrect credential, as shown below.

In the MID Server logs (with debug enabled):

  • The MID Server first tries to use the credentials from the Credentials table which fail with exit code 2 (provided that mid.powershell.use_credentials: true)
  • Then it tries with the account the MID server is running with, which fails with exit code 3 (provided that mid.powershell.local_mid_service_credential_fallback: true)
  • In all cases we observe that the RPC server is unavailable. (Exception from HRESULT: 0x800706BA)

The MID Server returns to Discovery only the error details from the last credential it tried. Every attempt, however, fails with the same error: The RPC server is unavailable. (Exception from HRESULT: 0x800706BA).
This indicates that the MID Server is not able to access the remote machine using RPC. Usually that is caused by a Windows firewall on the remote machine not letting RPC requests go through.

Troubleshooting

We can try to run a simple Powershell WMI query directly from the MID Server to the remote machine. 
Open a PowerShell command line and run the following:

gwmi win32_operatingsystem -computer 192.168.200.14 -credential 'LOCALDOMAIN\mid'

Substitute LOCALDOMAIN\mid with the credential that you want to test. The expected output is similar to:

SystemDirectory : C:\Windows\system32
Organization    :
BuildNumber     : 6001
RegisteredUser  : Windows User
SerialNumber    : 12345-OEM-1234567-12345
Version         : 6.0.6001

In this case you receive the following:

Get-WmiObject : The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)
At line:1 char:5
+ gwmi <<<<  win32_operatingsystem -computer 192.168.200.14 -credential 'localdomain\mid'
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand 

Solution

  1. Ensure that there is IP connectivity to the remote machine using ping and see if it responds. If it does not respond, then you have a routing problem. Contact your network administrator.
  2. If IP is ok, then the issue is related to the remote machine or a filtering device between the MID Server and the remote machine.

The solution is to fix the firewall issue on the remote machine. See WMI, PowerShell and Windows Firewalls for information about troubleshooting RPC/DCOM related issues.

Access is denied

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: true:

<error>
	Authentication failure with the local MID server service credential.
</error>
<error>
	Failed to access target system. Please check credentials and firewall settings on the target system to ensure accessibility: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
	Stack Trace:    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
	  at System.Management.ManagementScope.InitializeGuts(Object o)
	  at System.Management.ManagementScope.Initialize()
	  at System.Management.ManagementObjectSearcher.Initialize()
	  at System.Management.ManagementObjectSearcher.Get()
	  at Microsoft.PowerShell.Commands.GetWmiObjectCommand.BeginProcessing()
	  at System.Management.Automation.Cmdlet.DoBeginProcessing()
	  at System.Management.Automation.CommandProcessorBase.DoBegin()
</error>

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: false:

<error>
	Authentication failure(s) with available Windows credentials from the instance. Credentials tried: autolab1\autouserBAD,LOCALDOMAIN\midBAD
</error>

If the local MID service credential fallback is not enabled, the E_ACCESSDENIED error message does not appear in the payload. In that case we can trust the Authentication failure message: in most cases, it is caused by an incorrect credential.
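
The summary message also lists which credentials were tried, which tells you exactly which records to verify. The following sketch extracts that list and checks whether the message starts with "Authentication failure". Both function names are hypothetical, and the message format is assumed to match the samples in this article.

```javascript
// Extract the comma-separated list that follows "Credentials tried:".
// Returns an empty array when the marker is not present.
function parseCredentialsTried(errorText) {
	var marker = 'Credentials tried:';
	var idx = errorText.indexOf(marker);
	if (idx === -1) {
		return [];
	}
	return errorText
		.slice(idx + marker.length)
		.split(',')
		.map(function (name) { return name.trim(); })
		.filter(function (name) { return name.length > 0; });
}

// In the samples in this article, only the incorrect-credential case
// starts with "Authentication failure"; the others start with "Failure(s)".
function isAuthenticationFailure(errorText) {
	return errorText.trim().indexOf('Authentication failure') === 0;
}

var summary = 'Authentication failure(s) with available Windows credentials from the instance. ' +
	'Credentials tried: autolab1\\autouserBAD,LOCALDOMAIN\\midBAD';

console.log(parseCredentialsTried(summary));
console.log(isAuthenticationFailure(summary));
```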


In the MID Server logs (with debug enabled):

  • The MID Server first tries to use the credentials from the Credentials table which fail with exit code 1 (provided that mid.powershell.use_credentials: true)
  • Then it tries with the account the MID server is running with, which fails with exit code 3 (provided that mid.powershell.local_mid_service_credential_fallback: true).

The MID Server returns to Discovery only the error details from the last credential it tried. The MID Server logs report a different error for remote and local credentials, which is a bit misleading. Go through the next section to confirm that this is an incorrect credential issue.

Troubleshooting

To troubleshoot the credentials, run a simple Powershell WMI query directly from the MID Server to the remote machine. Open a PowerShell command line and run the following:

gwmi win32_operatingsystem -computer 192.168.200.14 -credential 'LOCALDOMAIN\mid'

Substitute LOCALDOMAIN\mid with the credential that you want to test. We expect something similar to:

SystemDirectory : C:\Windows\system32
Organization    :
BuildNumber     : 6001
RegisteredUser  : Windows User
SerialNumber    : 12345-OEM-1234567-12345
Version         : 6.0.6001

In this case, however, you receive:

Get-WmiObject : Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
At line:1 char:5
+ gwmi <<<<  win32_operatingsystem -computer 192.168.200.14 -credential 'localdomain\mid'
    + CategoryInfo          : NotSpecified: (:) [Get-WmiObject], UnauthorizedAccessException
    + FullyQualifiedErrorId : System.UnauthorizedAccessException,Microsoft.PowerShell.Commands.GetWmiObjectCommand

This clearly indicates that the credential (username or password or both) is incorrect.

Solution

Find a credential that works and update it in ServiceNow.

Call was canceled by the message filter

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: true:

<error>
	Authentication failure with the local MID server service credential.
</error>
<error>
	Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
	Stack Trace:    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
	  at System.Management.ManagementScope.InitializeGuts(Object o)
	  at System.Management.ManagementScope.Initialize()
	  at System.Management.ManagementObjectSearcher.Initialize()
	  at System.Management.ManagementObjectSearcher.Get()
	  at Microsoft.PowerShell.Commands.GetWmiObjectCommand.BeginProcessing()
</error>

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: false:

<error>
	Failure(s) with available Windows credentials from the instance. Credentials tried: AUTOLAB1\autouser,localdomain\mid
</error>
<error>
	Call was canceled by the message filter. (Exception from HRESULT: 0x80010002 (RPC_E_CALL_CANCELED))
	Stack Trace:    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
	  at System.Management.ManagementScope.InitializeGuts(Object o)
	  at System.Management.ManagementScope.Initialize()
	  at System.Management.ManagementObjectSearcher.Initialize()
	  at System.Management.ManagementObjectSearcher.Get()
	  at Microsoft.PowerShell.Commands.GetWmiObjectCommand.BeginProcessing()
</error>

Troubleshooting

The method for troubleshooting this error is similar to the methods used for previous errors. Run the Powershell test command and see if you can reproduce the error there. If you can, the issue is between the MID Server and the remote machine. 

The error indicates that the RPC connection is being canceled by the remote machine. Try running the same command against other Windows machines to see if it works. If it does, the problem is isolated to one particular Windows machine. The cause on the remote machine could be the Windows firewall filtering access to RPC/DCOM.

Solution

The next two articles can help you troubleshoot and fix these problems:

Server execution failed

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: true:

<error>
	Authentication failure with the local MID server service credential.
</error>
<error>
	Failed to access target system. Please check credentials and firewall settings on the target system to ensure accessibility: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
	Stack Trace:    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
	 at System.Management.ManagementScope.InitializeGuts(Object o)
	 at System.Management.ManagementScope.Initialize()
	 at System.Management.ManagementObjectSearcher.Initialize()
	 at System.Management.ManagementObjectSearcher.Get()
	 at Microsoft.PowerShell.Commands.GetWmiObjectCommand.BeginProcessing()
	 at System.Management.Automation.Cmdlet.DoBeginProcessing()
	 at System.Management.Automation.CommandProcessorBase.DoBegin()
</error>

This is how the error looks in the input payload when mid.powershell.local_mid_service_credential_fallback: false:

<error>
	Failure(s) with available Windows credentials from the instance. Credentials tried: autolab1\autouser,LOCALDOMAIN\mid
</error>
<error>
	Server execution failed (Exception from HRESULT: 0x80080005 (CO_E_SERVER_EXEC_FAILURE))
	Stack Trace:    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
	 at System.Management.ManagementScope.InitializeGuts(Object o)
	 at System.Management.ManagementScope.Initialize()
	 at System.Management.ManagementObjectSearcher.Initialize()
	 at System.Management.ManagementObjectSearcher.Get()
	 at Microsoft.PowerShell.Commands.GetWmiObjectCommand.BeginProcessing()
</error>

If the wrong credential is used, the E_ACCESSDENIED error message appears. If a valid credential is used, the CO_E_SERVER_EXEC_FAILURE error message appears. This occurs when the Windows Management Instrumentation service is paused or down on the remote machine.

Troubleshooting

Open a PowerShell command line on the MID Server and run the following:

gwmi win32_operatingsystem -computer 192.168.200.14 -credential 'LOCALDOMAIN\mid'

Substitute LOCALDOMAIN\mid with the credential that you want to test. We expect something similar to:

SystemDirectory : C:\Windows\system32
Organization    :
BuildNumber     : 6001
RegisteredUser  : Windows User
SerialNumber    : 12345-OEM-1234567-12345
Version         : 6.0.6001

In this case, however, you receive:

Get-WmiObject : Server execution failed (Exception from HRESULT: 0x80080005 (CO_E_SERVER_EXEC_FAILURE))
At line:1 char:5
+ gwmi <<<<  win32_operatingsystem -computer 192.168.200.14
    + CategoryInfo          : InvalidOperation: (:) [Get-WmiObject], COMException
    + FullyQualifiedErrorId : GetWMICOMException,Microsoft.PowerShell.Commands.GetWmiObjectCommand

This issue is usually caused by the Windows Management Instrumentation service being down on the remote machine.

Solution

Start Windows Management Instrumentation on the remote machine.
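
The four errors covered in this article can be told apart by their HRESULT code. As a quick reference, the sketch below maps each code to the likely cause given in the corresponding section. The lookup table and the classifyHresult function are hypothetical helpers, and the causes are the usual ones described above, not a guaranteed diagnosis.

```javascript
// HRESULT codes discussed in this article and their usual cause.
var KNOWN_HRESULTS = {
	'0x800706BA': {
		error: 'The RPC server is unavailable',
		likelyCause: 'RPC requests are blocked, usually by a firewall on the remote machine'
	},
	'0x80070005': {
		error: 'Access is denied (E_ACCESSDENIED)',
		likelyCause: 'Incorrect credential (username, password, or both)'
	},
	'0x80010002': {
		error: 'Call was canceled by the message filter (RPC_E_CALL_CANCELED)',
		likelyCause: 'RPC connection canceled by the remote machine, possibly a firewall filtering RPC/DCOM'
	},
	'0x80080005': {
		error: 'Server execution failed (CO_E_SERVER_EXEC_FAILURE)',
		likelyCause: 'Windows Management Instrumentation service paused or down on the remote machine'
	}
};

// Find the first "HRESULT: 0x........" code in an error message and look it up.
// Returns null when no code is found or the code is not in the table.
function classifyHresult(errorText) {
	var match = /HRESULT: (0x[0-9A-Fa-f]{8})/.exec(errorText);
	if (!match) {
		return null;
	}
	// Normalize the hex digits to upper case so the lookup is case-insensitive.
	var code = '0x' + match[1].slice(2).toUpperCase();
	return KNOWN_HRESULTS[code] || null;
}

var result = classifyHresult('The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)');
console.log(result.likelyCause);
```

Remember that when the local MID account fallback is enabled, the payload only shows the error from the last attempt, so confirm the diagnosis with the PowerShell test command before changing anything.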