HLA System Properties


The purpose of this document is to list and explain the HLA System Properties (sn_occ_system_settings).

The default (out-of-the-box) settings are generally sufficient; however, in some cases customers may need to change or optimise the core HLA system settings.

The properties below are grouped by subject; they are not listed in the order in which they appear in the System Properties table in HLA.


AGGREGATOR

The middle of the data ingestion pipeline. Responsible for grouping and storing metrics.

aggregator.bloom_filter_factor 
The coefficient by which the Bloom Filter size is multiplied.

aggregator.concurrency_override 
If specified, the value will override the initial automatic allocation of resources to the aggregator.

aggregator.gauge.aggregation_type 
Controls whether gauge metrics should be tested using the average or median (default). Accepted values are: Average or Median

aggregator.metrics_bloom_filter_fpp 
Expected false positive probability for the Bloom Filter. Used for monitoring purposes.

aggregator.min_non_null_values_for_stats 
Minimum number of non-null values in a series required to calculate stats (moving average, stdev, ...).

aggregator.number_of_expected_metrics 
Approximation for how many unique metrics the aggregator should handle.

aggregator.queue_size 
Number of metrics that can be buffered in the aggregator before it starts blocking the processing pipe.

aggregator.resolution_seconds 
The resolution of the time series, i.e. each data point represents the aggregation of data over this period of seconds.

aggregator.settle_seconds 
How many seconds should pass without receiving data until the window is considered settled. Once a window is settled, the detective can start running its algorithms.

aggregator.window_max_quantity_in_period_hours 
Circuit-Breaker: Max active time-span of a metric, in hours. Events with metrics that arrive with timestamps spanning a wider time-span will not be aggregated.

aggregator.window_size_seconds 
Value must be a multiple of 60. How many seconds are considered a window. Windows are the time frame on which detection tasks are executed.

aggregator.window_size_seconds_custom 
Value must be a multiple of 60. How many seconds are considered a window for a CustomMetric. Windows are the time frame on which detection tasks are executed.
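
The windowing described above (fixed-size windows, sizes in multiples of 60 seconds) can be sketched as follows. This is an illustrative sketch only, not HLA source code; the function name is hypothetical.

```python
# Illustrative sketch (not HLA source code): how events might be bucketed
# into fixed-size windows, given aggregator.window_size_seconds.

def window_start(epoch_seconds: int, window_size_seconds: int) -> int:
    """Return the start time of the window containing the timestamp."""
    if window_size_seconds % 60 != 0:
        raise ValueError("window size must be a multiple of 60")
    return epoch_seconds - (epoch_seconds % window_size_seconds)

# With a 5-minute (300 s) window, a timestamp of 36450 s (10:07:30)
# falls into the window starting at 36300 s (10:05:00).
print(window_start(36450, 300))  # 36300
```

Once no data has arrived for a window for aggregator.settle_seconds, the window is considered settled and can be handed to the detective.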

aggregator.workload_level 
Workload level at which the Aggregator is considered stressed; options are: LOW, MEDIUM, HIGH.


ALERTS

alerts.annotation_property 
Part of the alert settings that deals with checking annotations for correlations.

alerts.is_anomaly_baseline_reference_decrease_disabled 
Indicates whether the 'anomaly_baseline_reference_decrease' alert is disabled.

alerts.max_alert_age_hours 
Anomaly detection won't apply to events older than this setting; it allows the system to identify and discard alerts that are considered 'too old'. If you are streaming real-time data and still see detection windows being dropped for age, this might indicate: 1) a delay in the processing pipeline (for example, a specific Data Input was stopped for a couple of hours and then started again), or 2) incorrect extraction of the timestamp field (for example, a wrong timezone: the timestamp being sent is supposed to be read as Eastern Standard Time, but is being read as UTC since there is no indicator in the timestamp). If you are streaming historical data, this setting MUST be increased to include the dates of the historical data. For example, if today is Jan 2021 and the historical data is being streamed from Jan 2020, make sure the value here is AT LEAST 8760 hours, i.e. 365 days * 24 hours/day. (Note that an additional setting should also be increased: broker.events.max_age_hours.)
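
The historical-data arithmetic above can be checked directly; this is just the 365 days * 24 hours calculation from the description:

```python
# Worked example for alerts.max_alert_age_hours: streaming historical data
# from one year back requires the setting to cover at least a full year.
days_of_history = 365
required_hours = days_of_history * 24
print(required_hours)  # 8760
```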

alerts.recent_events.max_size_bytes 
The maximum size (bytes) allowed for recent events

alerts.recent_events_for_timeless_gauge_period_seconds 
Same as alerts.recent_events_period_seconds, but for timeless-gauge detections.

alerts.recent_events_period_seconds 
Time period to look back from the point of the anomaly to fetch relevant events, which will be used for RCA. (It is recommended not to exceed the equivalent of roughly 24 hours, in seconds.)


ALERTS CREATOR

The last part of the pipeline before alerts are created and populated in the incident list.

alerts_creator.queue_size 
Number of detections that can be buffered in the alerts creator before it starts blocking the processing pipe.


ARCA (Automatic Root Cause Analysis)

This refers to the properties extracted via Source Types that are categorized as "ARC_only".

arca.entities_analyzer.max_days_lookback 
To build the "meaningful entities" section of the RCA report, the AI engine goes back up to this number of days to analyze relevant events. (recommendation is not to exceed 2 days)

arca.entities_analyzer.max_entity_occurrences 
For each entity presented in the root-cause section, the AI engine adds events surrounding the detection time. This setting controls the number of such events that will be added.

arca.highlights_analyzer.majority_vote 
In the highlights analyzer, the minimum number of past matching events from the same host/day/hour required to qualify as a highlight.

arca.highlights_analyzer.max_days_lookback 
To build the "highlights" section of the RCA report, the AI engine goes back up to this number of days to analyze relevant events.

arca.highlights_analyzer.number_of_highlights 
Number of highlights to be presented in the "highlights" section of the RCA

arca.mf_analyzer.number_of_changes 
Max number of changes to show in the "significant changes" section in the RCA


BROKER

This is the start of the Data Ingestion Pipeline. It is responsible for data integration and digestion, including parsing of the logs.

broker.concurrency_override 
If specified, the value will override the initial automatic allocation of resources to the event broker.

broker.events.max_age_hours 
Events older than this number of hours will be dropped

broker.headerdetection.vmware 
List of VMware apps used by header detection to classify events as VMware.

broker.header_detection.detect_beaver 
When on, the AI engine will attempt to detect and parse beaver headers. Default is ON

broker.header_detection.detect_syslog5424 
When on, the AI engine will attempt to detect and parse Syslog5424 headers. Default is ON

broker.queue_size 
Number of events that can be buffered in the event broker before it starts blocking the processing pipe.

broker.workload_level 
Workload level at which the Event Broker is considered stressed; options are: LOW, MEDIUM, HIGH.


CLOTHO

clotho.batch_size 
Bulk size for persisting data points to Clotho

clotho.duration_days 
Clotho retention duration, in days.

clotho.sampling_interval_minutes 
Clotho sampling interval, in minutes.


DATA INPUTS

Responsible for the fetching or receiving of logs from different mediums. 


data_inputs.abstract_queue_size 
Queue size of all data inputs.

data_inputs.examples_refresh_interval 
Interval, in minutes, for updating the data-input examples in the database

data_inputs.max_length_bytes_per_stream 
Max size (in bytes) of a single request that can be handled by any data input

data_inputs.preprocess.examples.buffer.size 
Size of buffer for preprocess examples

data_input_mapping.max_examples 
Define the maximum number of samples to show on the Data Input Mapping screen, up to 500


DETECTIVE

Towards the end of the data ingestion pipeline. Responsible for spotting 'regular' and 'anomalous' behavior in the data by running multiple anomaly detection algorithms.

detective.alive_period_seconds_for_signal_dead 
The minimum period, in seconds, that a signal has to be alive before "dropping dead", for a signal-dead alert to be fired. Additionally, if there was another "dead signal" with similar duration in this period, then the current one will be disqualified.

detective.allowed_future_time_minutes 
Acceptable futuristic detection period. If a detection was created with a source time further in the future, its handling will be delayed.

detective.amplitude_coefficient 
This setting affects the overall sensitivity of the anomaly detection engine. The higher the number, the fewer alerts you will see.

detective.anomaly_detection.enabled 
When set to false, the AI engine will not attempt anomaly detection.

detective.concurrency_override
If specified, the value will override the initial automatic allocation of resources to the detective.

detective.detection_task_delay_seconds 
Delay in seconds before starting a detection task after the corresponding window has settled.

detective.few_elements 
A deep setting that affects the tolerance of weaker detection techniques.

detective.global.mute_disabled 
When set to true, the mute or disable feedback will apply on a specific metric across ALL Application-services

detective.max_moments_in_memory.derivative 
This setting affects the tolerance of the derivative algorithm.

detective.max_moments_in_memory.signal_alive 
How many "similar-in-amplitude" bursts the signal-alive detector should allow in the preceding period. This setting is in effect when raising an alert.

detective.max_moments_in_memory.signal_dead 
How many "dead periods" the signal-dead detector should allow. This setting is in effect when raising an alert.

detective.memory_in_days 
The memory, in days, of the different anomaly detection models (baseline, derivative and others)

detective.min_events_per_window 
The min number of events per window for a detection to be triggered. 

detective.of_custom_alert_concurrency_override 
If specified, the value will override the initial automatic allocation of resources to the customAlertDetective.

detective.points_in_timeless_trend 
The number of samples to consider when testing for trend shifts in disperse metrics

detective.queue_over_capacity_percent 
Will create a system notification when the detective queue is greater than value% capacity for over 5 minutes.

detective.queue_size 
Number of detection tasks that can be buffered in the detective before it starts blocking the processing pipe.

detective.resolution.signal_dead 
The number of seconds a metric's signal must be consecutively "dead" (no data, graph showing zero) for a signal-dead detection to be triggered for this metric. This setting can be configured per source.

detective.sigma_coefficient 
The coefficient for the sigma-based anomaly detection

detective.workload_level 
Workload level at which the Detective is considered stressed; options are: LOW, MEDIUM, HIGH.


ELASTICSEARCH

elasticsearch.bulk_actions 
Number of entities in one bulk request; used as a threshold for elastic bulk operations (together with the request size).

elasticsearch.bulk_concurrent_requests 
Concurrency of the Bulk write-requests to Elastic

elasticsearch.bulk_size_mb 
Size of the bulk request in MB; used as a threshold for elastic bulk operations (together with the request entities number threshold).

elasticsearch.client.connect_timeout_millis 
Configures the timeout in milliseconds until a connection is established to elasticsearch.

elasticsearch.client.io_threads 
Configures the number of I/O dispatch threads to be used by the elasticsearch client

elasticsearch.client.socket_timeout_millis 
Configures the socket timeout in milliseconds to elasticsearch, which is the timeout for waiting for data or, put differently, the maximum period of inactivity between two consecutive data packets.

elasticsearch.concurrency 
How many threads should index to elasticsearch

elasticsearch.flush_interval_seconds 
Bulk flush interval for indexing. Bulk will execute sooner if either bulk_actions or bulk_size_mb has been reached.
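
The interaction between the three bulk thresholds above can be sketched as follows. This is an illustrative sketch with hypothetical names, not the HLA implementation: a bulk is flushed as soon as ANY of the thresholds is reached.

```python
# Illustrative sketch mirroring elasticsearch.bulk_actions,
# elasticsearch.bulk_size_mb and elasticsearch.flush_interval_seconds:
# the pending bulk is flushed when any one threshold is reached.

def should_flush(pending_actions: int,
                 pending_bytes: int,
                 seconds_since_last_flush: float,
                 bulk_actions: int,
                 bulk_size_mb: int,
                 flush_interval_seconds: int) -> bool:
    return (pending_actions >= bulk_actions
            or pending_bytes >= bulk_size_mb * 1024 * 1024
            or seconds_since_last_flush >= flush_interval_seconds)

# Only 100 actions and ~1 MB pending, but 12 s have passed with a
# 10 s flush interval -> the bulk is flushed anyway.
print(should_flush(100, 1_000_000, 12.0, 1000, 5, 10))  # True
```

The threshold values passed in the example are placeholders, not the product defaults.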

elasticsearch.mapping_keyword_properties 
By default, all string properties are indexed as 'keyword' (except message, rawMessage, stacktrace, and the additional_string_properties which are indexed as 'text') which allows aggregation but no partial searches. Any field with the 'property' prefix (e.g. 'property.UUID' ,or 'property.srcIp') can also be indexed as 'keyword'. Note that a change will only apply to newly created indices. Please also note that when inserting the value you are not adding the prefix 'property'.

elasticsearch.minimal_indexing 
When true, properties classified as invalid will not be indexed

elasticsearch.queue_size 
Bulk size for indexing.

elasticsearch.subsampling_ratio 
Use this to only index some of the events to Elastic. 1 -> index 1 out of 1 events. 2 -> index 1 out of 2 events. N -> index one out of N events.
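
The "one out of N" behavior described above can be sketched as follows (illustrative only; the real engine operates on the event stream, not on Python lists):

```python
# Sketch of the "1 out of N" behavior described for
# elasticsearch.subsampling_ratio.

def subsample(events, ratio: int):
    """Keep one event out of every `ratio` events."""
    return [e for i, e in enumerate(events) if i % ratio == 0]

print(subsample(list(range(10)), 2))  # [0, 2, 4, 6, 8]
print(subsample(list(range(10)), 1))  # all 10 events are kept
```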


EVENTS

events.keyword_extraction_non_patterned 
Keyword extraction from the non-patterned message, when there is no pattern (i.e. the message label is not assigned).

events.max_minutes_in_future 
Events that are further in the future than this will be dropped. Note: if you see events being dropped due to a future timestamp, double-check that your timestamps are in the correct timezone.


GLIDE

glide.datainput.max_errors_percentage_before_publish
Define the max % of errors in a data input before publishing a notification.

glide.table.change_detection.interval_seconds 
Interval, in seconds, for getting tables that changed in glide

grpc.port 
Defines the glide gRPC port.


INCIDENTS

incidents.alerts.dilute_target 
When the number of alerts in an incident is too high (see incidents.alerts.max_count), alerts are diluted (removed) until this number is reached.

incidents.alerts.max_count 
Maximum number of alerts in an incident before the dilution process of excess alerts starts.
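
The dilution logic described by these two properties can be sketched as follows. Which alerts get dropped is an assumption here (the newest are kept); the real engine's selection criteria are not documented in this list.

```python
# Illustrative sketch of the dilution described by
# incidents.alerts.max_count and incidents.alerts.dilute_target.

def dilute(alerts: list, max_count: int, dilute_target: int) -> list:
    if len(alerts) <= max_count:
        return alerts               # under the limit: nothing to do
    return alerts[-dilute_target:]  # assumed: keep only the newest alerts

print(len(dilute(list(range(100)), 50, 30)))  # 30
print(len(dilute(list(range(40)), 50, 30)))   # 40 (unchanged)
```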

incidents.alert_interval_seconds 
Time-span to consider two alerts as related if correlating by occurrence time

incidents.application_correlation 
Should the alert application be taken into account when correlating alerts

incidents.component_correlation 
Should the alert service be taken into account when correlating alerts

incidents.cooldown_period_minutes 
Minutes to wait after the creation of an incident before sending a notification about it.

incidents.detection_time_correlation 
Should the alert detection time be taken into account when correlating alerts. Note that the time frame of the correlation window is defined by the incidents.alert_interval_seconds setting.

incidents.detection_type_correlation 
Should the alert detection type (anomaly, signal-dead, baseline etc.) be taken into account when correlating alerts

incidents.entities 
List of entities that will be used for correlation if two alerts share the same value. This setting should be managed from the correlation settings pages (global or individual per source).

incidents.host_correlation 
Should the alert host be taken into account when correlating alerts

incidents.min_correlation_score_for_aggregating 
Sensitivity level for the correlation engine. The higher the number, the more two alerts will need to have in common in order to be correlated.

incidents.pattern_text_correlation 
Should the alert pattern-text be taken into account when correlating alerts

incidents.period_seconds 
Defines the time-frame for the Alerts Smart-Correlations logic (an alert might only correlate with alerts created in the preceding T hours)


KEYWORDS

dictionaries.resource.directory 
Name of the directory containing the dictionaries used in the keyword message-extraction process.

keywords.message.extractor 
When set to false, the AI engine will not attempt to automatically extract the message from Keyword-based alerts

keywords.message.max.length 
Messages longer than the specified maximum length will not be extracted.

keywords.message.stop.elsa.message 
When set to true, Elsa will not attempt to automatically extract the message labels


LICENSING

licensing.flush_interval_seconds 
Time interval after which nodes will be flushed to the glide table.

licensing.max_map_size 
Maximum number of nodes stored in memory before flushing to glide table

licensing.monitoring.interval_seconds 
Time interval after which licensing monitoring service wakes up to check for new nodes


LOGSOURCEINFO (CMDB)*

logsourceinfo.flush_interval_seconds 
Time interval, in seconds, for collecting log source host data and forwarding it to the Log-based CI candidate table. (Default value = 3600)

logsourceinfo.max_map_size 
Maximum number of data nodes to be stored before the data is forwarded to the Log-based CI candidates table. (Default value = 1000)

logsourceinfo.monitoring.interval_seconds 
Time interval, in seconds, for scanning log events to discover host-related data. (Default value = 60)

*NEW system properties available starting from HLA December Store release


METRICATOR

Middle of the Data ingestion pipeline. Responsible for storing and measuring unique metrics.

metricator.cache_eviction_factor 
Number of raw metrics to evict from the cache when eviction is needed

metricator.cache_size 
Maximum number of raw metrics to hold in memory

metricator.concurrency_override
If specified, the value will override the initial automatic allocation of resources to the metricator.

metricator.exclude_extended_keyword_metrics 
A list of regexes that will exclude extended keyword metrics of type ERROR/EXCEPTION from being created, e.g. .*love.*

metricator.min_severity 
Minimum severity of an event for creating raw and pattern metrics

metricator.new_pattern_min_severity 
Minimum severity of an event for creating new pattern metrics

metricator.queue_size 
Number of events that can be buffered in the metricator before it starts blocking the processing pipe.

metricator.workload_level 
Workload level at which the Metricator is considered stressed; options are: LOW, MEDIUM, HIGH.


NOTIFICATIONS

notifications.default.recipients.configuration_notifications 
Default recipients of configuration-related notifications, such as JS errors, timestamp parsing, etc.

notifications.default.recipients.operational_notifications 
Default recipients of operations-related notifications, such as Crashes


PATTERNATOR

Pattern recognition and analysis

patternator.cache_eviction_factor 
Number of patterns to evict from the cache when eviction is needed

patternator.cache_size 
Maximum number of patterns to hold in memory

patternator.concurrency_override 
If specified, the value will override the initial automatic allocation of resources to the patternator.

patternator.gbp.bulk.queue_size 
Max number of GBP statements pending to be written to the DB, before updates start getting dropped.

patternator.gbp.bulk.size 
Number of statements to trigger an update to the DB

patternator.gbp.examples.until.greedy 
Number of events used to learn which greedy replacements to use.

patternator.gbp.max.node.chars 
Maximum number of characters in a GBP node.

patternator.queue_size 
Number of events that can be buffered in the patternator before it starts blocking the processing pipe.

patternator.rate_limit 
Maximum number of new patterns per second

patternator.