Health Log Analytics Troubleshooting

Overview

This KB provides steps for troubleshooting Health Log Analytics issues. For a quick overview of Health Log Analytics, please see: Health Log Analytics. For a more in-depth overview of Health Log Analytics, please see our doc pages.

Table of Contents

Overview
Log Flow
Troubleshooting
Lifecycles

Log Flow

Passive

In general, logs from passive data inputs follow this path:

1. The log streamer sends data to the MID Server.
2. The MID Server receives the raw log data and updates the logs as configured in the pre-processors for this data input.
3. The MID Server sends the logs to the log processing node.

Troubleshooting

Cannot see logs on log viewer

Note: At least 100 raw lines must be received for a data input before the log viewer displays its logs. It may therefore take a few seconds or minutes before logs appear on the instance when you first set up a data input.

First, confirm that the logs are reaching the instance. This means checking communication from the log streamer to the MID Server, and from the MID Server to the instance.

1. Review the log streaming server logs for any errors.
   The log location depends on the log streamer. Rsyslog, for example, logs to /var/log/messages by default.
   Are there any errors in the log streaming server logs?
      Yes: Fix the errors. The fix depends on the error.
      No: Continue.

2. Can the source streaming server ping the MID Server?
      Yes: Continue.
      No: Fix network communication issues.
   Note: Continue to the next steps if your environment has ICMP/ping disabled.

3. Can the source streaming server telnet to the MID Server on the TCP port the MID Server is listening on?
      Yes: Continue.
      No: Check with your team for any network or device firewall configurations that may block this communication.

4. Confirm the MID Server is listening on the expected port(s).
   Linux: the port check can be done with the command
      lsof -i | grep "COMMAND\|<port_number>"
   Example output:
      COMMAND  PID     USER   FD    TYPE  DEVICE   SIZE/OFF  NODE  NAME
      java     146675  renan  577u  IPv6  1124446  0t0       TCP   *:6002 (LISTEN)
   Windows: the port check can be done with the command
      netstat -ano | findstr "<port_number>"
   Is the MID Server listening on the configured port?
      Yes: Continue.
      No: Restart the data input:
         a. Stop the data input if it is "Active" (check the "Status" field to determine whether the data input is active):
            Navigate to "Health Log Analytics > Data Input > Data Inputs".
            Click on the data input.
            Once loaded, click on "Stop Data Input".
         b. Start the data input:
            Navigate to "Health Log Analytics > Data Input > Data Inputs".
            Click on the data input.
            Once loaded, click on "Start Data Input".

5. Review the MID Server agent logs for errors.
   Are there errors in the MID Server agent logs?
      Yes: Fix the errors.
      No: Continue.

6. Check that firewall rules on the MID Server allow inbound traffic on the configured port.

7. Once the MID Server receives the logs, it sends the raw logs to <yourInstanceName>-data.service-now.com.
   Is outbound traffic allowed to <yourInstanceName>-data.service-now.com?
      Yes: Continue.
      No: Have your network/proxy team allow traffic to <yourInstanceName>-data.service-now.com.

Lifecycles

How are the data inputs started and stopped on the MID Server?

The following happens when you submit a Data Input:

1. The configuration is passed to DataInputConnectorAjax.createConfigurationRecord().
2. A call is made to the business rule (BR) "Start Data Input", which calls DataInputOperations.start().
3. DataInputOperations calls MIDExtensionContext.start().
4. MIDExtensionContext.start() creates an ecc_queue record where:
      Queue = Output
      Topic = MIDExtension:DataInputWrapperExtension
      Name starts with => DataInputWrapperExtension:<data_input_name>
5. The MID Server ECCQueueMonitor processes the output. This is visible in the agent logs (the following example is for a TCP passive data input):

      ECCQueueMonitor.1 (21)LogBufferService - Starting LogBufferService...
      ECCQueueMonitor.1 (21)LogBufferService - Creating channel #0
      ECCQueueMonitor.1 Starting GrpcSink...
      ECCQueueMonitor.1 GrpcSink is attempting to connect to : https://<yourInstanceName>-data.service-now.com/
      ECCQueueMonitor.1 MID is using basic auth
      ECCQueueMonitor.1 Starting marker service.
      ECCQueueMonitor.1 Enabling monitor: LogAnalyticsMetricMonitor
      ECCQueueMonitor.1 Enabling monitor: StreamingSourcesStatsMonitor
      ECCQueueMonitor.1 Creating datainput #1
      ECCQueueMonitor.1 About to retrieve script for data input: <data_input_sysId>
      ECCQueueMonitor.1 Getting instance ACLs for table: sn_occ_base_data_input_config
      ECCQueueMonitor.1 script retrieved for data input:<data_input_sysId>
      function process(sample, metadata) {
          // write your code here
          return {
              'modifiedInput': null, // manipulated raw data
              'splitEvents': null // an array of strings, treated as separate events
          };
      }
      // Do not write code here
      ECCQueueMonitor.1 [TomcatLogs] - Configuring data input
      ECCQueueMonitor.1 [TomcatLogs] - Validating properties...
      ECCQueueMonitor.1 [TomcatLogs] - Successfully validated properties
      ECCQueueMonitor.1 [TomcatLogs] - Successfully configured data input TCPDataInput
      ECCQueueMonitor.1 [TomcatLogs] - Starting data input
      ECCQueueMonitor.1 [TomcatLogs] - Successfully started data input TCPDataInput
      <data_input_name>-data-input-consumer [<data_input_name>] - Consumer started.

At this point the Data Input is ready to receive logs. If a MID Server thread dump were taken now, it would show threads like the following:

      "input-boss-<data_input_sysId>" #172 daemon prio=5 os_prio=0 cpu=8.10ms elapsed=1438.51s tid=0x00007ff9605c4800 nid=0x23fe7 runnable [0x00007ff94f538000]
      "<data_input_name>-data-input-consumer" #173 daemon prio=5 os_prio=0 cpu=0.48ms elapsed=1438.42s tid=0x00007ff9605c6000 nid=0x23fea waiting on condition [0x00007ff94f237000]
      "grpc-default-executor-36" #267 daemon prio=5 os_prio=0 cpu=194.40ms elapsed=2864.02s tid=0x00007ff97441f000 nid=0x24162 waiting on condition [0x00007ff950645000]
      "grpc-client-1-send-loop686" #472 daemon prio=5 os_prio=0 cpu=22.03ms elapsed=168.16s tid=0x00007ff9840c4800 nid=0x245c7 waiting on condition [0x00007ff94e72e000]

How is data sent from the MID Server to the log processing node?

1. A server with a log shipper is configured to send raw logs to the MID Server.
2. The MID Server receives the logs.
3. Finally, the MID Server sends the raw logs to <yourInstanceName>-data.service-now.com.

Note: These calls are made directly to the log processing server and not to one of the ServiceNow application nodes.

How is the last log time field updated for the streaming sources?

1. The MID Server thread StreamingSourcesStatsMonitor sends ecc_queue inputs to the instance with:
      Topic = queue.log_streaming.stat
      Name = StreamingSourcesStatsMonitor
2. These inputs are processed by the business rule "Process Source Streaming Stats", which calls StreamingSourceStatsECCPaylodProcessor.process().
3. This updates the sn_occ_log_streaming_source_stats fields.

How are the MID Server metrics reported to the instance?

1. The MID Server thread LogAnalyticsMetricMonitor sends ecc_queue inputs to the instance with:
      Topic = queue.analytics.stat
      Name = LogAnalyticsStats
2. These inputs are processed by the business rule "Process Log Analytics MID Metrics", which updates sn_occ_mid_metrics.
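
Scripting the connectivity checks

The two network checks from the Troubleshooting section (the telnet test from the log streamer to the MID Server port, and outbound access from the MID Server to <yourInstanceName>-data.service-now.com) can be approximated with a short script when telnet is not installed. The following is a minimal sketch, not a ServiceNow tool. The function names, the example host mid.example.com, and the instance name are hypothetical, and port 443 is an assumption based on the https:// URL shown in the agent log. A successful TCP connection only shows that the port is reachable; it does not prove that a proxy permits the traffic the MID Server actually sends.

```python
import socket


def data_endpoint(instance_name: str) -> str:
    """Build the log processing hostname described in this KB."""
    return f"{instance_name}-data.service-now.com"


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Example usage (hypothetical values, replace with your own):
#   On the log streamer host, test the MID Server listening port:
#       can_connect("mid.example.com", 6002)
#   On the MID Server host, test outbound access (port 443 is assumed):
#       can_connect(data_endpoint("myinstance"), 443)
```

A False result from the first call points to the firewall or listening-port steps; a False result from the second points to the outbound traffic step.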