Event Management dashboard is not loading and multiple EM impact jobs are stuck


Description

The same issue has two different symptoms, which may appear separately:

1. The dashboard and/or alert panel do not load and eventually fail with a timeout exception.

2. One of the following jobs is stuck (running for a few hours):

Steps to Reproduce

1. Use an instance with a MySQL database.
2. Make sure there are many alert history records and impact status records.
3. Navigate to the dashboard. Notice that the dashboard does not load.
4. Go to Active Transactions. Note that EM jobs have been running for a few hours.

Workaround

This problem is fixed in all currently supported releases. Review the Fixed In section to determine the latest version with a permanent fix your instance can be upgraded to.

The workaround was to add the following system properties:

  1. evt_mgmt.impact_calculation.cleanup_age_seconds.em_alert_history = 259200
  2. evt_mgmt.impact_calculation.cleanup_age_seconds.em_impact_status = 259200
  3. evt_mgmt.max_objs_in_query = 300

Properties 1 and 2 enable aggressive cleanup: only the last 3 days (259200 seconds) of em_alert_history and em_impact_status records are kept, and older historical data is removed.
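
The 259200 value is simply 3 days expressed in seconds. The short Python sketch below checks that arithmetic. The helper name retention_seconds and the dictionary are hypothetical and exist only for illustration; they are not part of the platform.

```python
# Illustrative only: verifies that 3 days equals 259200 seconds.
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def retention_seconds(days: int) -> int:
    """Convert a retention window in whole days to seconds."""
    return days * SECONDS_PER_DAY


# Property names come from the workaround; the dict is a hypothetical
# container for this example, not a platform API.
cleanup_properties = {
    "evt_mgmt.impact_calculation.cleanup_age_seconds.em_alert_history": retention_seconds(3),
    "evt_mgmt.impact_calculation.cleanup_age_seconds.em_impact_status": retention_seconds(3),
}

for name, value in cleanup_properties.items():
    print(f"{name} = {value}")
```

The same helper gives the value for a different retention window, for example retention_seconds(7) for one week.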

The third property lowers the maximum number of objects in a query from 1000 to 300 to improve performance.
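
The source does not describe how the platform applies this cap. As a rough sketch, assuming a long list of object IDs is split into batches no larger than the cap, the Python below shows how 1000 objects would be divided with a limit of 300 versus 1000. The function name batches and the sample IDs are hypothetical.

```python
from typing import Iterator, List, Sequence


def batches(ids: Sequence[str], max_objs_in_query: int) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each at most max_objs_in_query long."""
    for start in range(0, len(ids), max_objs_in_query):
        yield list(ids[start:start + max_objs_in_query])


# Hypothetical sample data: 1000 placeholder object IDs.
ids = [f"obj{i}" for i in range(1000)]

sizes_300 = [len(b) for b in batches(ids, 300)]
sizes_1000 = [len(b) for b in batches(ids, 1000)]

print(sizes_300)   # [300, 300, 300, 100]
print(sizes_1000)  # [1000]
```

Under this assumption, the lower cap trades one large query for several smaller ones.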




Related Problem: PRB1353927