Percent of Hardware and VM Instance CIs updated in last 90 days

Get Well Playbook
Percent of Hardware and Virtual Machine Instance CIs not updated within 90 days
A step-by-step guide to analyze and remediate stale CMDB data

Table of Contents
Summary
  Goal of this Playbook
  Audience
  Problem Overview
Executive Summary
  How this playbook can help you achieve business goals
  How this playbook is structured
Problem Analysis
  Upstream Causes
  Downstream Consequences
  Impact on Your Business
Engagement Questions
Remediation Plays
  Summary
  Play 1: Analysis
  Play 2: Fix
  Play 3: Data Governance

Summary

Goal of this Playbook
The goal of this playbook is to help you identify Hardware and Virtual Machine Instance Configuration Items (CIs) that have not been updated within the last 90 days. Then this playbook will help you review common reasons why these CIs might be stale and will provide high-level guidance on how to get them updated or removed from the Configuration Management Database (CMDB).

Details about this playbook
Author: David Waffen
Date: 12/09/2020
Addresses HSD #: HSD0006683
Applicable ServiceNow Releases: All
Time Required: Approximately 1 to 8 hours (depending on your environment)

Audience
- Configuration Manager or Configuration Management team
- ServiceNow Administrator or Discovery Administrator

Additional Resources
Coordination with persons fulfilling the following types of roles may be required to complete this playbook:
- Asset Manager
- Networking Engineer
- Systems Administrator (Linux, Windows, etc.)
- Third-party Application Owner or Support

Problem Overview
Investment in your CMDB will pay off only when its information is both accurate and actionable, and Hardware and Virtual Machine Instance records are CMDB foundational items.
If this information is not accurate and actionable, the value of your CMDB and the return on investment (ROI) of the associated applications, tools, and functionality offered across the ServiceNow platform (e.g., Incident, Problem, and Change Management, Service Mapping, and entity relationships) becomes limited, diminishes, or, worse, turns into a loss.

Executive Summary

How this playbook can help you achieve business goals
When managed Configuration Items accurately reflect the current state of devices, you can expect predictable process outcomes across the entire platform. Key decision makers need accurate, up-to-date reporting based on these Configuration Items to make sound decisions regarding current and future business goals.

How this playbook is structured
This playbook guides you through a series of three Plays:
- Play 1: Analysis Play. Lets you see the stale Hardware or Virtual Machine Instance CIs in your CMDB, if any.
- Play 2: Fix Play. Lists various methods you can use to refresh the stale CIs and helps you choose one or more remediation methods, depending on root cause.
- Play 3: Governance. Explains processes you can implement to ensure your managed CIs (Hardware or Virtual) stay as up to date as possible.

Problem Analysis

Upstream Causes
- The device's IP address may never have been added to Discovery Schedules or IP Ranges.
- The Discovery Schedule containing a device's IP Range may have been disabled and forgotten.
- The Discovery Credential may have expired, been rotated, or been locked out, leaving devices inaccessible.
- Third-party discovery sources or integrations are not working or have been disabled and forgotten.
- The interval between scheduled data imports is longer than the desired time period (90 days).
- A device may have been taken off the network, or the hardware may have been refreshed, without anyone following through to retire or archive the Configuration Item representing it.
- An infrastructure or network segment has been altered without communication to your Discovery Administrator, leaving Discovery coverage inadequate.
- Duplicates may exist, with only a single record being updated while the others become stale.

Downstream Consequences

Data Consequences
- Increased CMDB size (excess data due to the presence of retired devices) may hinder performance.

Operation Consequences
- Stakeholder trust in Hardware and VM Instance data in the CMDB erodes.
- Support teams relying on Hardware and VM Instance data may seek to implement alternative mini CMDBs.
- Daily operations will require additional effort to filter out inaccurate records, leading to increased query times.
- Resolution times (MTTR) rise sharply when records must be searched through to determine their accuracy.
- Risk is introduced to ITSM processes. For example, service or change requests may be rejected or fail implementation when stale Hardware and VM CIs are chosen as the affected CI.
- Health Dashboards will display a poor "Correctness" score and reflect badly on your CMDB.

Application Consequences
- Dependent applications and processes will consume inaccurate Hardware and VM Instance data. Most ServiceNow applications depend on the accuracy of the CMDB.
- Audit, Compliance, and SLA reporting may fail due to erroneous data.
- Business Rules set or triggered on Hardware and VM Instance classes will not execute if these CIs are in a stale state. IT processes dependent on these rules will perform poorly.

Impact on Your Business
Stale CI records may lead decision makers to make superficially plausible but wrong decisions, which may impact key business initiatives.

- Audit and Compliance: Data Accuracy; Data Consistency
- Lower MTTR: Data Accuracy; Facilitates easier identification of anomalies and business services
- Operational Visibility: Integrity of Relationships
- Process Automation: Data Accuracy

Engagement Questions
Consider the answers to these questions:
- Do you have automated Discovery, Service Mapping, or Import schedules?
- How often do you run your schedules? Do you see any errors?
- What is the most reliable source for your CI data and specific classes?
- What CIs have you entered manually, and why can they not be updated with a discovery source?
- Do you periodically certify your data (for all or some classes)? If not, do you have a plan or intention to do so?
- Do you have defined CI Lifecycle states? Are you using the lifecycle state definitions included with the base system? If not, what custom definition states are you using?
- Do you regularly review your Discovery logs?
- When a network segment is added or modified, how does the Networking Team notify the Discovery Administrator or Configuration Management Team?

Remediation Plays

Summary
The table below lists and summarizes each of the remediation plays in the playbook. Details are included later.

Play Name: Analysis Play
What this play is about: Helps you find stale CIs in your CMDB.
Required tasks: Import and commit the Update Set.

Play Name: Fix Plays
What this play is about: Lists the methods you can use to refresh the stale Hardware or VM CIs. Methods vary according to the root cause.
Required tasks: Use one or more of the available methods to refresh the stale CIs.

Play Name: Data Governance
What this play is about: Lists the methods you can use to limit the number of stale CIs.
Required tasks: Use one or more of the methods to limit the number of stale CIs and ensure data accuracy.

Play 1 - Analysis

What this Play is about
This play helps you find the stale CIs in your CMDB. It includes an Update Set that you use to find stale Hardware and Virtual Machine Instance CIs.

Required tasks
- Import and commit the "HSD0006683 – Percent Hardware & VM updated last 90 days" Update Set. Three modules containing the name HSD0006683 will become available:
  - HSD0006683 – List View: Percent Hardware & VM stale 90 days
  - HSD0006683 – Report: Percent Hardware stale 90 days
  - HSD0006683 – Report: Percent VMs stale 90 days
- Reload your browser window to refresh the Navigation Panel menu items.
- In the Navigation filter, type "HSD0006683".
- Use the modules containing "Report:" to display the results for stale Hardware or Virtual Machine Instances.
- Review the results of the report, which by default is grouped by class. Your reports should look similar to the following, if any stale CIs exist.
  Example: Hardware Report
  Example: Virtual Machine Report
- Click into any "slice" of the pie chart to display its records in a report List View. These reports can be scheduled or placed on a dashboard to track progress during remediation efforts.
- Use the module containing the name "List View:" to display all stale Hardware and VM Instance CI records in a List View, if any exist in either the Hardware or Virtual Machine classes.
  - If there aren't any stale CIs, you don't need to do anything else. You've completed the Analysis Play.
  - If there are stale CIs, you need to review them and complete the tasks in the following Fix Play(s).

Play 2 - Fix Play

What this Play is about
This play lists the methods you can use to refresh stale Hardware class CIs.
Choose one of five (5) methods (described below), based on the output of the Analysis Play.

Required tasks
- Using the module containing the name "List View:", display all stale Hardware and VM Instance CI records. Grouping the list by class may make reviewing easier. This play addresses just the Hardware class items and not Virtual Machine items.
- Important: Carefully review the stale Hardware CI records. This review helps you identify the root causes, which in turn helps you decide which refresh method to use. If you have more than one root cause, you need to address all of them.
- It is recommended to start with the category that has the largest number of stale CIs, or with the stale CIs in your most important CI classes.
- Choose the method you want to use, based on the root cause. The table below lists each root cause and the corresponding method to choose.

What is causing the stale CIs (root cause)    Method to use
Discovery                                     Method A
Third-party applications                      Method B
Data imports                                  Method C
Manually entered CI records                   Method D
Legacy CI records                             Method E

Method A: Root cause - Discovery should be refreshing the CIs
Work with your Discovery Administrator to refresh this data by fixing the root cause. Examine one CI class at a time.
- Verify that your Discovery schedules are marked active and are completing successfully.
- Check Discovery logs for error messages, such as problems with credentials or firewalls.
- Review your IP Ranges and any possible excludes to ensure proper coverage within Discovery Schedules.
- (Optional) Consider using a Scheduled Job that retires CIs that are not discovered.
- Confirm with Support Team members that the devices listed are still in active use. Any records for devices that were retired should be archived. This includes VM instances referencing retired vCenters.
Note: If the CI is no longer in the system, consider the answers to these questions:
- Is your asset-decommissioning process updating the CMDB when appropriate?
- Have you set up Data Archive for retired records from these CI classes?

Method B: Root cause - Third-party applications should be updating the CIs
- Contact the third-party application owner and review how the data is sent to the CMDB.
- Ask the following questions:
  - How are they managing the lifecycle of a CI, and how are those updates sent to ServiceNow?
  - Are only the deltas sent over periodically?
  - Was the data sent as a one-time only activity? Can it be set up to happen more frequently?
  - Can a "full refresh" job be performed?
  - Are support teams for these records empowered to troubleshoot this on their own?
  - Is this integration process monitored in any way? Scheduled Job failures should notify someone to act.
- Check if the third-party application supports passing the data through the Identification and Reconciliation Engine (IRE). There are options for both Transform Maps and REST integrations.
- (Optional; Paris or later) Consider using Discovery as a validation mechanism for your CIs. Multisource CMDB allows you to see the complete history of a device from many sources at once.

Method C: Root cause - Import Sets should be updating the CIs
- Reimport the data. To avoid creating duplicate CIs, update the records using coalesce field matching or, preferably, by passing all transforms through the CMDBTransformUtil.
- TIP: To ensure you ONLY update records, you can add the following text as an onBefore script to the existing transform map. The script skips any row that would create a new record, so only matching records are updated.
    if (action == 'insert') ignore = true;
- (Optional) Consider retiring or archiving records that are at the end of their lifecycles.
- (Optional) Consider using Discovery as the primary source for the CIs.
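
The update-only behavior described in the Method C tip can be illustrated outside the platform. The following is a minimal sketch in plain JavaScript, not ServiceNow API code: it simulates coalesce matching on a single field and skips any incoming row that would otherwise be inserted. The function name applyUpdateOnlyImport, the sample records, and the choice of serial_number as the coalesce field are all assumptions made for illustration.

```javascript
// Plain-JavaScript simulation of an update-only import.
// Hypothetical names throughout; this does not call any ServiceNow API.

/**
 * Apply incoming rows to existing CIs, matching on one coalesce field.
 * Rows without a matching CI are ignored instead of being inserted.
 */
function applyUpdateOnlyImport(existingCis, incomingRows, coalesceField) {
  // Index existing CIs by their coalesce value for direct lookup.
  const byKey = new Map();
  for (const ci of existingCis) {
    byKey.set(ci[coalesceField], ci);
  }

  const result = { updated: [], ignored: [] };
  for (const row of incomingRows) {
    const key = row[coalesceField];
    const match = key !== undefined ? byKey.get(key) : undefined;

    // No match means the import would insert a new record.
    const action = match ? 'update' : 'insert';
    if (action === 'insert') {
      result.ignored.push(key); // same effect as setting ignore = true
      continue;
    }

    Object.assign(match, row); // copy incoming field values onto the CI
    result.updated.push(key);
  }
  return result;
}

// Sample data (invented for this sketch).
const existingCis = [
  { serial_number: 'SN-001', name: 'web01', os: 'Linux' },
  { serial_number: 'SN-002', name: 'db01', os: 'Windows' },
];
const incomingRows = [
  { serial_number: 'SN-001', os: 'Linux 8' },                  // matches web01
  { serial_number: 'SN-999', name: 'unknown01', os: 'Linux' }, // no match
];

const summary = applyUpdateOnlyImport(existingCis, incomingRows, 'serial_number');
console.log(summary);
```

With this sample data, summary.updated contains 'SN-001' and summary.ignored contains 'SN-999', and the list of existing CIs still holds two records. In a real transform map the matching is done by the platform, so this sketch only shows why ignoring inserts prevents a reimport from creating new or duplicate CIs.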

Method D: Root cause - Manually entered CIs aren't being updated
- Confirm that the CI has not been updated recently.
- Research who manually entered the CI, and find out why it has not been updated.
- See if there is a way to automate adding the CIs, or a way to periodically certify any records entered manually.
- Ask if Discovery can access their devices, if provided a credential.
- Ask if a file (CSV, XML, or XLSX) can be generated regularly. These files can be processed by a Data Source/Transform Map to keep their otherwise inaccessible devices updated.

Method E: Root cause - Stale CIs are remnants of the legacy CMDB records
Are the legacy records still valid? Consider marking them as retired or non-operational, or consider archiving them. Archiving allows these records to remain accessible if they are required for retention purposes, but removes them from the everyday operational view so that other work is free of the historical clutter. For information on archiving the records, see the Data Archive Jump Start Guide.

Play 3: Data Governance

What this play is about
This play lists the methods you can use to limit the number of stale CIs in your CMDB. You can use one or both of the following methods:
- Method A: Use a CMDB Health Dashboard
- Method B: Certify your CMDB data

Required tasks
Decide the method you want to use and follow the instructions.

Method A: Set up the CMDB Health Dashboard
The staleness metric is included in the CMDB Correctness Scorecard.
- Complete the CMDB Fundamentals Training.
- Create health inclusion rules for your most important classes.
- (Optional) Consider using CMDB Remediation Rules to retire the stale CIs automatically.
  Example: Default Staleness Rule

Method B: Certify your CMDB Data
Even when CIs are added to the CMDB correctly, the data may still be inaccurate. To help with quality control, use an independent certification process on one or more of your records.
- Review the information about Data Certifications, and be sure to adhere to your company's development and change management guidelines.
- Schedule the data certification to run periodically.
  Example: Certification Schedule
- Review Discovery Logs regularly, especially if you do not receive status field (on/off) updates from the vCenter event collector. Credential errors should not be ignored. It is common for Cloud Discovery credentials to expire or be rotated periodically. Look for "VMWareProbe" messages that indicate credential failures. Navigate to Discovery > Output and Artifacts > Discovery Log, then review and address any messages of type Warning or Error as soon as possible. Consider setting up a nightly report to have these sent to your Discovery Administrator or other stakeholders.
  Example: Discovery Log message of a credential failure in the attempt to connect to a vCenter Server during horizontal discovery
- Check relationships between stale VM instances and their vCenter Reference. It is possible that Cloud Discovery is unable to discover the vCenter server and receive the most up-to-date information. The vCenter server may have been retired and never archived within the CMDB. Verify the existence of the vCenter Reference and correct accordingly.
  Example: Virtual Machine Instances that are stale
  Example: Virtual Machine Instances (image above) are referencing a vCenter server (vCenter Reference column) that has not been discovered in three (3) months.
- Open the vCenter server record in Form View and use the context menu item titled "Check Schedules" to search for which Discovery Schedules contain your vCenter IP.
  Example: (Right Click ... discovery schedule, run search)

Congratulations
You have completed this Get Well Playbook.