Discovery Source, First Discovered, Most recent Discovery and Manual Entry don't mean what you think

Table of Contents
Introduction
In the beginning there was Discovery
Then there were imports
Then the CMDB Identification and Reconciliation engine was born
And Discovery and Service Mapping started using Patterns
Import Plugins migrated to Service Graph Connector
Field level Reconciliation, and Multisource CMDB
Conclusion

Introduction

This KB article hopes to reduce the confusion caused when the CMDB IRE re-purposed some existing Discovery fields in the CMDB table. For 'reasons' it's not as clear cut as you may think. Here's a history lesson from someone who was there through it all:

In the beginning there was Discovery

Go back a couple of decades, and there was the CMDB table, populated either by Discovery (horizontal, agent-less, IP scan) or by Discovery's Help-The-Helpdesk script (HTHD, Windows browser JavaScript).

To help customers track what was updated, when, and by what, Discovery added 3 fields to the CMDB:

Discovery source [discovery_source]: "Service-Now" (since changed to "ServiceNow" in line with the company name change a decade ago), meaning ServiceNow discovered it. As we only had Discovery at the time, it meant, and still means, the "Discovery" product. Empty meant it was probably added some other way, possibly manually, or via Asset Management.
First discovered [first_discovered]: The timestamp the CI was first inserted by Discovery.
Most recent discovery [last_discovered]: The last time the record was updated by Discovery.

Not all Discovery Sensors populated any or all three of these fields, and a couple still don't. Each Sensor had to be coded to populate them as part of the inserts/updates it was doing anyway. It was nice to have rather than mandatory. You were lucky if the Sensor populated these in the main CI, and would not expect child CIs like memory, processes, disks etc. to have them populated.

The timestamps were when the Sensor's code ran for the ecc_queue input, which held data from the Discovery Probe that had run earlier on the MID Server. So not exactly when the CI was scanned by Discovery, but close enough for that to become the accepted meaning of the field.

Note the field names contain the word Discovery, because they were for and by Discovery back then. The MID Server was also part of Discovery, and specifically for Discovery. Now CMDB and MID Server are considered general platform features, but that's why so many tables and fields, such as discovery_credentials, still have the word Discovery in their names.

Then there were imports

Customers wanted to bring data for Windows computers in from Microsoft System Center instead of relying on Discovery scans, so the first of a series of Import plugins was created, for Microsoft SCCM, Jamf, and other 3rd party management systems. This removed the need for the laptop or computer to be turned on at scan time; Discovery running at night would miss any that were switched off.

The Help-The-Helpdesk feature wasn't really up to the job either and was soon made obsolete, even before agent-based discovery from ACC-V came along.

These plugins also updated the Discovery source field directly, with e.g. "SCCM" or "ImportSet". However, as it was SCCM doing the actual scanning, some time earlier, the scan timestamps from SCCM were populated in the Most recent discovery field. It was no longer the update timestamp of the CI, but the timestamp of the scan by the 3rd party source system that subsequently provided the data.

Then the CMDB Identification and Reconciliation engine was born

The pain and confusion of duplicate CI data, and conflicting/flapping data from multiple sources, was realised to be such a bad thing that a whole new feature was created. Its 2 main features are:

Identification: Do you match with and update an existing CI record, or create a new one?
Reconciliation: Which data sources, for which fields, take priority over others?

The IRE APIs take a single JSON Payload, containing all the CI items, child and related CIs, and their related items to help with identification. One payload may result in many CI and CI relationship records being inserted or updated in one go.
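To make the idea of a single payload carrying several CIs concrete, here is a minimal sketch in Python. It assumes the commonly documented layout of an "items" list plus a "relations" list that refers to items by index; the class names, attribute values and relationship type are illustrative only, and the summarize helper is hypothetical, not part of any ServiceNow API.

```python
import json

# Illustrative values only: class names, attributes and the relationship
# type below are examples, not taken from the article.
payload = {
    "items": [
        {   # index 0: an application CI
            "className": "cmdb_ci_app_server_tomcat",
            "values": {"name": "Tomcat@web01", "install_directory": "/opt/tomcat"},
        },
        {   # index 1: the server the application runs on
            "className": "cmdb_ci_linux_server",
            "values": {"name": "web01", "serial_number": "ABC123"},
        },
    ],
    # Relationships refer to entries in "items" by their list index.
    "relations": [
        {"parent": 0, "child": 1, "type": "Runs on::Runs"},
    ],
}


def summarize(p: dict) -> dict:
    """Hypothetical helper: count the CI and relationship entries in one payload."""
    return {"cis": len(p["items"]), "relations": len(p.get("relations", []))}


print(json.dumps(summarize(payload)))
```

In this sketch one payload describes two CIs and one relationship, which is why a single IRE call can touch several CMDB records at once.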

The CMDB developers decided to re-purpose the Discovery fields for IRE. Their purpose now became:

Discovery source [discovery_source]: The "Data Source" from which the data came. By now there were many ServiceNow-written and 3rd party apps and plugins. The Discovery source field was used for 2 things to begin with: (1) the Data Source from which the data came, similar to before, and (2) marking the CI as a "Duplicate" of another CI. We soon put a stop to the second use, and now there is a separate "Duplicate Of" reference field for that purpose.
First discovered [first_discovered]: When the IRE API first processed a payload containing the CI.
Most recent discovery [last_discovered]: When the IRE API last processed a payload containing the CI.

All those CIs automatically have the 3 fields populated by the IRE code. The individual features no longer need to do this themselves. They just have to use the IRE APIs and it is done.

Note that the idea of a discovery or scan date no longer applies. The timestamp is when code, running the IRE APIs, does something with a CI. This may not even have anything to do with an Import or Discovery, but could be when a Service CI is created, or when a Dynamic CI Group is recalculated and updated.

The Reconciliation side of IRE uses these timestamps to simply compare which Data Source last updated the record, or for deeming the CI to be Stale.

These fields are now CMDB IRE metadata, for use only by the IRE code. They are not user-updatable fields.

And Discovery and Service Mapping started using Patterns

The Neebula ServiceWatch product was acquired, and soon turned into the ServiceNow Service Mapping product, for top-down Discovery of Service CIs and their associated CIs.

That's why a Data Source of "ServiceWatch" actually means "Service Mapping". That was never corrected.

The big change for CMDB updates was that Probes migrated to Patterns. A Pattern runs many steps/probes in one go, executing on the MID Server, with just one large IRE Payload being passed back to the instance to be passed through the CMDB IRE APIs. The IRE APIs add the 3 field values.

Many outstanding problems about Discovery Sensors not populating the 3 fields were closed overnight once the probes/sensors were made obsolete by the introduction of the equivalent Pattern.

That has led Discovery users to expect the 3 fields to be populated with every Discovery scan for all CIs. That is true if the data comes from a Pattern, but still not necessarily true for the remaining few Probes, which may populate the values for the main CI but not for child CIs. Updates that do not use the IRE are invisible to it, as if they had never happened, which affects reconciliation rules.

Import Plugins migrated to Service Graph Connector/ETL/RTE/E-IRE

If an older import transform field map was simply populating the discovery_source field with a value, and updating the CIs and records directly, bypassing the IRE, the IRE was not able to track these as updates at all. It meant Reconciliation Rules couldn't be used. The assumption was that if the 3 IRE fields were populated, then it must have been done via the IRE, but there were many integrations, and there still are some, that populate these fields directly, confusing the IRE (and customers/support engineers).

By now the general idea is that anything populating the CMDB should now work via the CMDB IRE APIs, and import plugins were becoming a problem.

An interim solution was to use the IRE to identify the CI, but carry on with a normal update once the sys_id was known. Slightly better was the CMDBTransformUtil script include, which had many limitations.

The final solution was a new framework called Service Graph Connector, based on IntegrationHub Extract Transform Load (ETL) and the Robust Transform Engine (RTE), with the actual committing of the data being done via the Enhanced IRE (E-IRE) APIs. Once that was in place, the old non-IRE plugins were replaced.

However, to support the idea of populating "Most recent discovery" with the timestamp when the data was collected by the 3rd party system, rather than when ServiceNow added it to the CMDB, the CMDB IRE APIs gained the ability to accept the Most recent discovery [last_discovered] field as an attribute in the IRE Payload data.

If the payload has a last_discovered attribute for a CI, then that is used instead. Otherwise the time the IRE API is run is used.

That can be controlled further using these system properties:

glide.identification_engine.skip_updating_source_last_discovered_if_older
glide.identification_engine.ire_message_listener_skip_updating_source_last_discovered_to_now

and this IRE payload property:

skip_updating_source_last_discovered_to_now

See Properties for Identification and Reconciliation.
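The basic fallback rule described above (use the payload's last_discovered if it is present, otherwise the time the IRE API runs) can be sketched as follows. This is a simplified Python model, not the IRE implementation: the function name, the sample values and the "YYYY-MM-DD HH:MM:SS" timestamp format are assumptions for illustration, and the system properties listed above are deliberately not modelled.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed timestamp format for this sketch.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def effective_last_discovered(values: dict, processed_at: Optional[datetime] = None) -> datetime:
    """Pick the timestamp that would end up in last_discovered.

    values       -- the attribute values of one CI item in the payload
    processed_at -- when the payload is processed (defaults to now, UTC)
    """
    processed_at = processed_at or datetime.now(timezone.utc)
    supplied = values.get("last_discovered")
    if supplied:
        # The source system told us when it collected the data.
        return datetime.strptime(supplied, TS_FORMAT).replace(tzinfo=timezone.utc)
    # No value supplied: fall back to the processing time.
    return processed_at


processed = datetime(2024, 5, 2, 3, 0, 0, tzinfo=timezone.utc)
with_scan_time = {"name": "laptop01", "last_discovered": "2024-05-01 18:30:00"}
without_scan_time = {"name": "laptop02"}

print(effective_last_discovered(with_scan_time, processed))
print(effective_last_discovered(without_scan_time, processed))
```

The first CI keeps the older collection time reported by the source system, while the second is stamped with the time the payload was processed.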

Field level Reconciliation, and Multisource CMDB

But after all this, the 3 fields are in effect obsolete once you realise IRE Reconciliation rules work at the field level, not the record level. Reporting on them is not going to be reliable.

Those 3 record level fields don't affect how reconciliation rules for fields work. The Data Source History [cmdb_datasource_last_update] table tracks Discovery source, and Most recent discovery, at the field level. That's also used to determine if a data source can update a stale CI.

There is also the last_scan attribute in the Source [sys_object_source] table, which pre-dates the CMDB IRE and was also repurposed for it. This is used in the identification of CIs during Imports, before falling back to the Identification rules.

And recently the Multisource CMDB feature added further tracking tables (cmdb_multisource_column_metadata and cmdb_multisource_data), saving the payloads. This allows the reconciliation rules to be changed retrospectively, and recent updates then replayed based on the new rules.

Conclusion

Discovery Source means the integration/import/discovery Data Source, or application from which code is making CMDB IRE Updates.

For example, you may see "ServiceWatch" or "CredentiallessDiscovery" as the Discovery Source even if the Service Mapping or Credential-less Discovery features are not actually being used, because Java code or script includes from those features are being re-used by things such as CSDM or the Service Model, which update sets of CIs in the background via IRE payloads and the IRE API internal to the instance.

First discovered / Most recent discovery is simply when the payload was passed through the IRE API, unless overridden by an older value in an RTE Import payload.

And the record level fields may be meaningless when multiple sources are involved. For example, SCCM may have just updated the RAM attribute, while Discovery updated the CPU attribute, on the same record, on the same day. The record may say Discovery Source=SCCM, but only some fields came from that source.
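The SCCM/Discovery example can be modelled with a small Python sketch. It is an in-memory illustration only: apply_update, the attribute names and the field_history dictionary are hypothetical stand-ins, and they do not reflect the actual schema of the Data Source History [cmdb_datasource_last_update] table.

```python
from datetime import datetime


def apply_update(ci: dict, field_history: dict, source: str, when: datetime, values: dict) -> None:
    """Apply one source's update to a CI and record who set what.

    The record level fields are overwritten by every update, while
    field_history keeps one (source, timestamp) entry per attribute.
    """
    ci.update(values)
    ci["discovery_source"] = source   # record level: last writer wins
    ci["last_discovered"] = when
    for attribute in values:
        field_history[attribute] = (source, when)   # field level tracking


ci = {"name": "web01"}
field_history = {}

# Discovery updates the CPU attribute in the morning...
apply_update(ci, field_history, "ServiceNow", datetime(2024, 5, 1, 9, 0), {"cpu_count": 8})
# ...and SCCM updates the RAM attribute the same afternoon.
apply_update(ci, field_history, "SCCM", datetime(2024, 5, 1, 14, 0), {"ram": 16384})

print(ci["discovery_source"])
for attribute, (source, when) in field_history.items():
    print(attribute, source, when.isoformat())
```

The record ends up labelled with SCCM as its Discovery source, yet the per-attribute history still shows that cpu_count came from Discovery, which is why reporting on the record level field alone can mislead.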

And finally, "Manual Entry" discovery source doesn't mean Manual Entry. Apart from a couple of exceptions (Service creation/Asset to CI sync) no updates/inserts from forms/workspaces, by actual human users, go through the IRE. Don't use that in Reconciliation rules and expect it to do anything.

Now that you know this, I hope you can avoid making decisions based on false assumptions.