Version: Unreleased

Alert Configs: Concepts and Examples

info

This page provides an overview of alert configuration concepts, types, and practical examples, inspired by best practices from the Alerts and Automation resource.


Quick Start

To add a new alert:

  1. Choose an alert type below.
  2. Copy the example config from the "Hash for Import" tab.
  3. Import it into your system and adjust parameters as needed.

Overview

The Alert & Automation System enables rapid detection of anomalies and automated responses by alerting administrators or triggering RESTful actions. It is designed for real-time operation, using streaming algorithms to minimize delay and maximize responsiveness.

tip

See the Events Reference for a full list and details of supported event types and their attributes.

Key Features:

  • 🚀 Real-time anomaly detection
  • 🔗 RESTful plugin integration for notifications and automation
  • ⚙️ Flexible configuration using JSON
  • 🌐 Supports a wide range of alert types and use cases

How It Works

  1. Ingestion: The system ingests incoming JSON events from various data sources (e.g., network equipment, SBCs).
  2. Stream Mapping: Events are mapped into streams using selected keys (such as user, trunk, or IP).
  3. Alert Evaluation: Each stream is checked against alerting criteria. If a threshold is exceeded, an alert is triggered.
  4. Notification/Automation: Alerts are pushed via RESTful plugins or other integrations.
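
The four steps above can be sketched in a few lines of Python. This is an illustration only, not the system's implementation: the helper names (get_attr, run_pipeline), the simple count-based threshold, the nested event layout, and the notify callback standing in for a RESTful plugin are all assumptions. Only the attribute name attrs.dst_ca_name is taken from this page.

```python
import json
from collections import defaultdict
from typing import Callable, Iterable


def get_attr(event: dict, dotted_key: str):
    """Resolve a dotted key such as 'attrs.dst_ca_name' in a nested event."""
    value = event
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def run_pipeline(lines: Iterable[str], key: str, threshold: int,
                 notify: Callable[[str, int], None]) -> None:
    """Ingest JSON lines, map them to streams by key, alert once per stream."""
    counts = defaultdict(int)   # one counter per stream
    alerted = set()             # streams that have already raised an alert
    for line in lines:
        event = json.loads(line)            # 1. ingestion
        stream = get_attr(event, key)       # 2. stream mapping
        if stream is None:
            continue
        counts[stream] += 1                 # 3. alert evaluation
        if counts[stream] > threshold and stream not in alerted:
            alerted.add(stream)
            notify(stream, counts[stream])  # 4. notification/automation


# Hypothetical sample events; real events carry many more attributes.
raw = [
    '{"attrs": {"type": "call-attempt", "dst_ca_name": "trunk-a"}}',
    '{"attrs": {"type": "call-attempt", "dst_ca_name": "trunk-a"}}',
    '{"attrs": {"type": "call-attempt", "dst_ca_name": "trunk-b"}}',
    '{"attrs": {"type": "call-attempt", "dst_ca_name": "trunk-a"}}',
]
alerts = []
run_pipeline(raw, "attrs.dst_ca_name", threshold=2,
             notify=lambda stream, n: alerts.append((stream, n)))
print(alerts)
```

Each stream keeps its own state, so a busy trunk cannot mask or trigger an alert on another one.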

Alert Types

Alert types can be grouped into two classes:

  • Predefined: Easy to set up, with a few parameters.
  • Custom: More flexible, with additional parameters like custom keys and filtering expressions.

tip

Predefined alerts are great for quick setups. Use Custom alerts for advanced scenarios and fine-tuned monitoring.

Common Alert Types & Use Cases

  • Parallel Calls (PC): Detects too many simultaneous calls on a trunk.
  • Low QoS (string-match): Monitors for calls with low MOS scores[1].
  • Failing Destination Trunk (ratio): Detects trunks with a high percentage of failed calls.
  • Long Calls (string-match): Alerts on calls exceeding a duration threshold.
  • Unstable Registrar (sudden-change): Detects rapid changes in registration counts.
  • Telemarketer (ratio): Finds users with a high ratio of short-duration calls.
  • Region Skipper (memory): Alerts on users accessing from many geographic regions.
  • IP Failing to Authenticate (rate): Detects repeated authentication failures from an IP.
  • New User-Agent (memory): Alerts on new user-agent strings.
  • DDoS Detection (ratio): Detects distributed attacks based on failed call patterns.

Alert Example Library

Below are common alert configuration examples. Each example includes a description and a ready-to-import config hash.

Parallel Calls on a Trunk

This alert is based on “parallel calls” session tracking, which counts both incoming and outgoing calls for each trunk (key attrs.dst_ca_name|attrs.src_ca_name) and raises an alert if the number of sessions on a trunk exceeds a threshold.
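
The session tracking described above can be sketched as a counter per trunk that goes up when a call starts and down when it ends. This is a simplified illustration under assumptions: the class name ParallelCallTracker and the event type name call-start are hypothetical, and the real tracker may handle timeouts and lost call-end events differently.

```python
from collections import defaultdict


class ParallelCallTracker:
    """Counts active sessions per trunk, for source and destination alike."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.active = defaultdict(int)  # trunk name -> active sessions

    def process(self, event: dict) -> list:
        """Update session counts; return trunks now above the threshold."""
        attrs = event["attrs"]
        # A call counts against both its source and destination trunk.
        trunks = {attrs.get("dst_ca_name"), attrs.get("src_ca_name")} - {None}
        over = []
        for trunk in sorted(trunks):
            if attrs["type"] == "call-start":       # assumed type name
                self.active[trunk] += 1
                if self.active[trunk] > self.threshold:
                    over.append(trunk)
            elif attrs["type"] == "call-end":
                self.active[trunk] = max(0, self.active[trunk] - 1)
        return over


tracker = ParallelCallTracker(threshold=2)
start = {"attrs": {"type": "call-start",
                   "src_ca_name": "carrier-x", "dst_ca_name": "pbx-1"}}
end = {"attrs": {"type": "call-end",
                 "src_ca_name": "carrier-x", "dst_ca_name": "pbx-1"}}

results = [tracker.process(start) for _ in range(3)]
tracker.process(end)
print(results, dict(tracker.active))
```

The third simultaneous call pushes both trunks over the threshold of 2; after the call-end event each trunk is back to two active sessions.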


Low QoS

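The alert type is “string-match”. It looks for call-end events in which the average MOS of a call direction dropped below 3.0. Example expression: attrs.rtp-MOScqex-avg-a<3.0 & attrs.rtp-MOScqex-avg-b<3.0. As written with &, the expression matches only when both directions are below 3.0; to alert when either direction is low, combine the two conditions with the expression language's OR operator instead.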
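
The intended condition, a call-end event whose average MOS is below 3.0 in either direction, can be written as a plain Python predicate. This is a sketch of the logic rather than of the product's expression language; the function name is_low_qos and the nested attrs layout are assumptions, while the attribute names come from the example expression.

```python
MOS_THRESHOLD = 3.0


def is_low_qos(event: dict, threshold: float = MOS_THRESHOLD) -> bool:
    """True for call-end events where either direction's MOS is too low."""
    attrs = event.get("attrs", {})
    if attrs.get("type") != "call-end":
        return False
    scores = [attrs.get("rtp-MOScqex-avg-a"), attrs.get("rtp-MOScqex-avg-b")]
    # Missing scores are ignored instead of being treated as bad quality.
    return any(s is not None and s < threshold for s in scores)


good = {"attrs": {"type": "call-end",
                  "rtp-MOScqex-avg-a": 4.2, "rtp-MOScqex-avg-b": 4.0}}
one_way_bad = {"attrs": {"type": "call-end",
                         "rtp-MOScqex-avg-a": 4.2, "rtp-MOScqex-avg-b": 2.1}}
print(is_low_qos(good), is_low_qos(one_way_bad))
```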


Failing Destination Trunk

Ratio alerts detect when a subset of events exceeds a given percentage of all observed events. They are often used to catch an unacceptably high failure rate. Here we observe the share of "call-attempt" events with specific SIP response codes (e.g. attrs.sip-code~'404|408|500|503') among all failed call attempts and call starts. The percentage is computed, and the alert raised, individually for every Call-Agent (trunk) because the alert key is attrs.dst_ca_name.
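
A ratio check of this kind can be sketched as two counters per key: events that match the failure condition, and all events in scope. The sketch below makes several assumptions that are not from this page: the class name RatioAlert, a min_events guard against alerting on tiny samples, counting over the whole lifetime instead of a time window, the event type name call-start, and SIP codes compared as whole strings.

```python
import re
from collections import defaultdict

FAILURE_CODES = re.compile(r"404|408|500|503")
IN_SCOPE = ("call-attempt", "call-start")  # "call-start" is an assumed name


class RatioAlert:
    def __init__(self, max_percent: float, min_events: int = 10):
        self.max_percent = max_percent
        self.min_events = min_events
        self.total = defaultdict(int)    # all in-scope events per trunk
        self.matched = defaultdict(int)  # failures of interest per trunk

    def process(self, event: dict) -> bool:
        """Count the event; return True if its trunk exceeds the ratio."""
        attrs = event["attrs"]
        key = attrs.get("dst_ca_name")
        if key is None or attrs.get("type") not in IN_SCOPE:
            return False
        self.total[key] += 1
        code = str(attrs.get("sip-code", ""))
        if attrs["type"] == "call-attempt" and FAILURE_CODES.fullmatch(code):
            self.matched[key] += 1
        if self.total[key] < self.min_events:
            return False
        percent = 100.0 * self.matched[key] / self.total[key]
        return percent > self.max_percent


def make_event(event_type: str, code: str = "") -> dict:
    return {"attrs": {"type": event_type, "dst_ca_name": "trunk-a",
                      "sip-code": code}}


ratio = RatioAlert(max_percent=50.0, min_events=4)
ratio_results = [ratio.process(e) for e in (
    make_event("call-start"),
    make_event("call-attempt", "503"),
    make_event("call-attempt", "503"),
    make_event("call-attempt", "404"),
)]
print(ratio_results)
```

After four events, three of them are matching failures (75%), which is above the 50% limit, so only the last call reports an alert.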


Long Calls

Long calls may be easily detected by checking the attribute attrs.duration in call-end events. Use alert type string-match and a matching expression like attrs.type='call-end' & attrs.duration>3600 to alert on calls over one hour in length.


Unstable Registrar

A rapid decrease or increase in registrations typically indicates a connectivity issue with a registrar or a registration "avalanche". The sudden-change alert type observes the occurrence of selected events (e.g. attrs.type='reg-expired'). An alert is raised if the short-term occurrence deviates from the smoothed mid-term occurrence by more than a threshold. The alert key is the tenant ID (tls-cn).
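
One way to compare a short-term occurrence with a smoothed mid-term occurrence is to keep two exponential moving averages of the per-interval event count. The page does not say which smoothing the system uses, so the exponential averages, the smoothing factors, and the class name SuddenChangeDetector are assumptions made purely for illustration. One detector instance would be kept per alert key (tenant).

```python
class SuddenChangeDetector:
    """Flags intervals where a fast average departs from a slow one."""

    def __init__(self, threshold: float,
                 short_alpha: float = 0.5, mid_alpha: float = 0.1):
        self.threshold = threshold
        self.short_alpha = short_alpha  # reacts quickly to new counts
        self.mid_alpha = mid_alpha      # reacts slowly, acts as a baseline
        self.short = None
        self.mid = None

    def update(self, count: float) -> bool:
        """Feed the event count of one interval; return True on an alert."""
        if self.short is None:
            self.short = self.mid = float(count)
            return False
        self.short += self.short_alpha * (count - self.short)
        # Compare against the baseline before the new count shifts it.
        alert = abs(self.short - self.mid) > self.threshold
        self.mid += self.mid_alpha * (count - self.mid)
        return alert


detector = SuddenChangeDetector(threshold=20)
# Hypothetical 'reg-expired' counts per interval for one tenant.
change_results = [detector.update(c) for c in [10, 10, 10, 10, 60]]
print(change_results)
```

Because the comparison uses the absolute difference, the same detector reacts to a sudden drop as well as to a sudden rise.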


Telemarketer

A telemarketer is detected by an abnormally high ratio of calls that are declined or terminated within the first few seconds of a call (attrs.duration<5). The alert relates to users and thus the key is set to attrs.from. The detection is implemented using the ratio alert type by alerting on a high percentage of such short-duration calls.


Region Skipper

A user who accesses a service from abnormally many regions may be a frequent traveller or, worse, a remote attacker. The memory alert type observes users keyed by attrs.from and records the geographic regions (geoip.region_code) seen for each of them. An alert is raised if too many new regions are discovered.
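
The memory behaviour can be sketched as a set of remembered values per key, with an alert when a previously unseen value pushes the set past a limit. The class name MemoryAlert, the max_values parameter, and the nested layout of the sample events are assumptions; how the real system ages out old values is not covered here.

```python
from collections import defaultdict


class MemoryAlert:
    def __init__(self, max_values: int):
        self.max_values = max_values
        self.seen = defaultdict(set)  # key -> values remembered so far

    def process(self, key, value) -> bool:
        """Remember value for key; True if a new value exceeds the limit."""
        if key is None or value is None:
            return False
        known = self.seen[key]
        if value in known:
            return False  # nothing new, nothing to report
        known.add(value)
        return len(known) > self.max_values


# Hypothetical events: one user appearing from several regions.
events = [
    {"attrs": {"from": "alice"}, "geoip": {"region_code": "CA"}},
    {"attrs": {"from": "alice"}, "geoip": {"region_code": "CA"}},
    {"attrs": {"from": "alice"}, "geoip": {"region_code": "NY"}},
    {"attrs": {"from": "alice"}, "geoip": {"region_code": "TX"}},
]
memory = MemoryAlert(max_values=2)
region_results = [
    memory.process(e["attrs"]["from"], e["geoip"]["region_code"])
    for e in events
]
print(region_results)
```

In this sketch, setting max_values to 0 makes the alert fire on the first occurrence of every new value.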


IP Failing to Authenticate

Frequent authentication failures from an IP address may be caused by an innocently forgotten password, but at a higher rate they may indicate a password-guessing attempt. This rate alert looks at failing authentications (attrs.type='auth-failed') and relates them to originating IP addresses (attrs.source). The alert is raised when 5 failures occur within 5 minutes.
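
The rule "5 failures within 5 minutes" can be sketched with a sliding window of timestamps per source IP. This is one possible reading of the rule, not the system's implementation: the class name RateAlert is hypothetical, timestamps are assumed to be in seconds and to arrive in non-decreasing order, and the caller is assumed to pass in only auth-failed events.

```python
from collections import defaultdict, deque


class RateAlert:
    def __init__(self, max_events: int, window_seconds: float):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self.times = defaultdict(deque)  # source IP -> recent timestamps

    def process(self, key: str, timestamp: float) -> bool:
        """Record one failure; True if the key reached the rate limit."""
        window = self.times[key]
        window.append(timestamp)
        # Drop failures that fell out of the sliding window.
        while window and timestamp - window[0] >= self.window_seconds:
            window.popleft()
        return len(window) >= self.max_events


rate = RateAlert(max_events=5, window_seconds=300)
# One failure per minute from the same address: the fifth one alerts.
fast = [rate.process("192.0.2.10", t) for t in (0, 60, 120, 180, 240)]
# Failures spread out over a longer time never fill the window.
slow = [rate.process("192.0.2.20", t) for t in (0, 100, 200, 300, 400)]
print(fast, slow)
```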


New User-Agent

It may be useful to alert on traffic from a new User-Agent. This may indicate an attack, a firmware update, or a new type of device. The memory alert type observes UAs keyed by attrs.from-ua and associates the UA name. An alert is raised on the very first occurrence of a user-agent.


DDoS Detection

If a traffic probe supports detection of attacker signatures, a high share of failed calls coming from SIP user-agents with a specific signature indicates a distributed attack. The ratio alert type can be used to detect this pattern.


More Information


tip

For best results, tailor alert keys and expressions to your environment and monitoring needs.

Footnotes

[1] MOS (Mean Opinion Score) is a standard measure for voice quality in VoIP systems, typically ranging from 1 (bad) to 5 (excellent). A low MOS indicates poor call quality, often due to network issues such as packet loss, jitter, or latency.