Alert Badge #
Introduction #
The Alert Badge is the application-wide status indicator for the alert engine. It sits in the top toolbar and surfaces currently active, unacknowledged alerts at a glance: it shows the highest-priority message inline, displays a separate count for each severity, and acts as the entry point to the full alerts side panel. It complements the live server-health indicator that renders alongside it.
Features #
- Inline preview of the highest-priority, most recent unacknowledged alert
- Severity-coded counts for `CRITICAL` (red), `WARNING` (yellow), and `NOTICE` (blue), capped at `99+`
- Color-coded button that follows the highest active priority (danger / warning / default)
- Bell icon with badge cluster for quick visual scanning
- Sorted order: by priority first, then by most recent `createdAt`
- Click to open a right-hand side panel listing all alerts
- Side panel filter: switch between “Unread” and “All” alerts
- Acknowledge alerts from the panel to remove them from the unread counts
- Hidden when no alerts are present — zero visual noise on a healthy system
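The display rules above can be sketched as a small pure function. This is a minimal TypeScript illustration, not the component's actual code: the `Alert` shape, the names `summarizeBadge`, `sortAlerts`, and `capCount`, and the mapping of `NOTICE` to the default button style are assumptions inferred from the feature list.

```typescript
type Severity = "CRITICAL" | "WARNING" | "NOTICE";

// Hypothetical alert shape; only the severity names and createdAt come from the text.
interface Alert {
  id: string;
  severity: Severity;
  message: string;
  createdAt: number; // epoch milliseconds (assumed unit)
  acknowledged: boolean;
}

interface BadgeSummary {
  preview: Alert | null; // highest-priority, most recent unacknowledged alert
  counts: Record<Severity, string>; // display strings, capped at "99+"
  variant: "danger" | "warning" | "default";
}

// Lower number means higher priority.
const PRIORITY: Record<Severity, number> = { CRITICAL: 0, WARNING: 1, NOTICE: 2 };

// Assumed mapping from severity to button style.
const VARIANT: Record<Severity, BadgeSummary["variant"]> = {
  CRITICAL: "danger",
  WARNING: "warning",
  NOTICE: "default",
};

function capCount(n: number): string {
  return n > 99 ? "99+" : String(n);
}

// Priority first, then most recent createdAt. Returns a sorted copy.
function sortAlerts(alerts: Alert[]): Alert[] {
  return [...alerts].sort(
    (a, b) => PRIORITY[a.severity] - PRIORITY[b.severity] || b.createdAt - a.createdAt
  );
}

// Returns null when the badge should be hidden.
// Assumption: "no alerts present" means the alert list is empty.
function summarizeBadge(alerts: Alert[]): BadgeSummary | null {
  if (alerts.length === 0) return null;

  const unread = sortAlerts(alerts.filter((a) => !a.acknowledged));
  const tally: Record<Severity, number> = { CRITICAL: 0, WARNING: 0, NOTICE: 0 };
  for (const alert of unread) tally[alert.severity] += 1;

  const preview = unread[0] ?? null;
  return {
    preview,
    counts: {
      CRITICAL: capCount(tally.CRITICAL),
      WARNING: capCount(tally.WARNING),
      NOTICE: capCount(tally.NOTICE),
    },
    variant: preview ? VARIANT[preview.severity] : "default",
  };
}
```

Keeping the summary a pure function of the alert list means acknowledging an alert in the panel only needs to update the list; the counts, preview, and button color follow from it.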
How alerts are produced #
The badge displays the output of a backend alert engine built around two layers:
- Triggers watch low-level conditions (e.g., a dataset value changing, or moving into or out of a range). Each trigger has a `retentionTime` (seconds), the window during which a fired trigger is considered active. Common evaluators include `DatasetValueChangedEvaluator` and `DatasetValueInRangeEvaluator`.
- Rules compose one or more triggers (many-to-many) into user-facing alerts. When all the triggers required by a rule are simultaneously active, the rule emits or extends an alert with a `timeoutMinutes` lifetime (default `60`).
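The relationship between the two layers can be modeled in a few lines. The sketch below is illustrative TypeScript, not the backend's implementation: the `Trigger` and `Rule` shapes, the `lastFiredAt` field, and the function names are hypothetical, and treating the retention window as half-open (active up to but not including its end) is an assumption.

```typescript
// Hypothetical model of a trigger; only retentionTime (seconds) comes from the text.
interface Trigger {
  id: string;
  retentionTime: number; // seconds
  lastFiredAt: number | null; // epoch seconds, null if never fired (assumed field)
}

// Hypothetical model of a rule; a trigger id may appear in many rules.
interface Rule {
  id: string;
  triggerIds: string[];
  timeoutMinutes: number; // the text gives 60 as the default
}

// A fired trigger counts as active for retentionTime seconds after it fired.
function isTriggerActive(trigger: Trigger, now: number): boolean {
  return (
    trigger.lastFiredAt !== null &&
    now >= trigger.lastFiredAt &&
    now < trigger.lastFiredAt + trigger.retentionTime
  );
}

// A rule is satisfied only when every required trigger is active at the same time.
function isRuleSatisfied(rule: Rule, triggers: Record<string, Trigger>, now: number): boolean {
  if (rule.triggerIds.length === 0) return false; // a rule needs at least one trigger
  return rule.triggerIds.every((id) => {
    const trigger = triggers[id];
    return trigger !== undefined && isTriggerActive(trigger, now);
  });
}

// Example: a composite rule over two triggers with different retention windows.
const exampleTriggers: Record<string, Trigger> = {
  tempOutOfRange: { id: "tempOutOfRange", retentionTime: 120, lastFiredAt: 1000 },
  valveOpen: { id: "valveOpen", retentionTime: 60, lastFiredAt: 1050 },
};
const exampleRule: Rule = {
  id: "overheatWithOpenValve",
  triggerIds: ["tempOutOfRange", "valveOpen"],
  timeoutMinutes: 60,
};
```

With these values, `isRuleSatisfied(exampleRule, exampleTriggers, 1100)` is true because both windows cover second 1100, while at 1115 the `valveOpen` window has already closed and the rule is no longer satisfied.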
Evaluation runs every 30 seconds on a Quartz-scheduled job (`@DisallowConcurrentExecution`, so two runs never overlap). Each cycle clears expired triggers, re-evaluates all triggers in batch, then evaluates rules and creates or extends alerts. Re-firing an active rule extends the existing alert's `timeoutAt` rather than creating a duplicate, so the badge counts remain stable while a condition persists. When `timeoutAt` passes, the alert disappears from the badge automatically: the system tracks current state only, not a historical event log.
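The create-or-extend behavior described above can be illustrated with a small in-memory store. This is a hedged TypeScript sketch rather than the actual Quartz job: `ActiveAlert`, `fireRule`, and `dropExpired` are hypothetical names, and keeping at most one active alert per rule id is an assumption based on the statement that re-firing does not create duplicates.

```typescript
// Hypothetical alert record; timeoutAt is the only field named in the text.
interface ActiveAlert {
  ruleId: string;
  createdAt: number; // epoch milliseconds
  timeoutAt: number; // epoch milliseconds
}

const DEFAULT_TIMEOUT_MINUTES = 60; // default lifetime given in the text

// Called when a rule fires during an evaluation cycle.
// Extends the existing alert if one is still live, otherwise creates a new one.
function fireRule(
  alerts: Record<string, ActiveAlert>,
  ruleId: string,
  now: number,
  timeoutMinutes: number = DEFAULT_TIMEOUT_MINUTES
): ActiveAlert {
  const timeoutAt = now + timeoutMinutes * 60 * 1000;
  const existing = alerts[ruleId];
  if (existing !== undefined && existing.timeoutAt > now) {
    // Same condition still holds: push the deadline out, keep createdAt.
    existing.timeoutAt = Math.max(existing.timeoutAt, timeoutAt);
    return existing;
  }
  const created: ActiveAlert = { ruleId, createdAt: now, timeoutAt };
  alerts[ruleId] = created;
  return created;
}

// Removes alerts whose timeoutAt has passed; nothing is archived,
// matching the "current state only" behavior.
function dropExpired(alerts: Record<string, ActiveAlert>, now: number): void {
  for (const ruleId of Object.keys(alerts)) {
    if (alerts[ruleId].timeoutAt <= now) delete alerts[ruleId];
  }
}
```

If a rule fires at time 0 and again one cycle later at 30 000 ms, the store still holds a single alert: its `createdAt` stays at 0 and its `timeoutAt` moves to 30 000 ms plus 60 minutes, which is why the badge count does not grow while a condition persists.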
retentionTime vs timeoutMinutes #
| | retentionTime | timeoutMinutes |
|---|---|---|
| Belongs to | Trigger | Rule |
| Units | seconds | minutes |
| Controls | how long a fired trigger stays active so other triggers can co-occur | how long the resulting alert remains visible to the user |
| Layer | low-level (condition watching) | high-level (final alert) |
| Typical value | small (10–300 s) — must overlap with sibling triggers | larger (60+ min) — must give a person time to react |
In short: retentionTime is the memory window that lets triggers combine into a rule, while timeoutMinutes is the lifetime of the alert that the rule produces.
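A short worked example makes the difference in scale concrete. The TypeScript below is an illustrative calculation only: `FiredTrigger`, `coActiveWindow`, and `alertTimeoutAt` are hypothetical helpers, all times are in seconds, and the half-open window is an assumption.

```typescript
// Hypothetical record of one trigger firing; times are in seconds.
interface FiredTrigger {
  firedAt: number;
  retentionTime: number;
}

// Interval [start, end) during which every fired trigger is active,
// or null if their retention windows never overlap.
function coActiveWindow(fired: FiredTrigger[]): { start: number; end: number } | null {
  if (fired.length === 0) return null;
  const start = Math.max(...fired.map((f) => f.firedAt));
  const end = Math.min(...fired.map((f) => f.firedAt + f.retentionTime));
  return start < end ? { start, end } : null;
}

// When the resulting alert would leave the badge, in seconds.
function alertTimeoutAt(ruleFiredAt: number, timeoutMinutes: number = 60): number {
  return ruleFiredAt + timeoutMinutes * 60;
}

// Trigger A fires at t = 0, trigger B at t = 40 with a 60 s retention.
const tooShort = coActiveWindow([
  { firedAt: 0, retentionTime: 30 }, // A has expired before B fires
  { firedAt: 40, retentionTime: 60 },
]);
const longEnough = coActiveWindow([
  { firedAt: 0, retentionTime: 60 }, // A is still active when B fires
  { firedAt: 40, retentionTime: 60 },
]);
```

Here `tooShort` is `null`, so the rule can never fire, while `longEnough` is the window from second 40 to second 60. If the rule fires at second 40 with the default `timeoutMinutes`, `alertTimeoutAt(40)` gives 3640: the triggers only need to overlap for seconds, but the alert stays visible for an hour.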
Use Cases #
- Surfacing live deviations from operational thresholds in the application toolbar
- Quick triage by severity: critical issues call attention without burying lower-priority notices
- Acknowledging alerts as they are reviewed, keeping the badge focused on what still needs attention
- Pairing low-level triggers (e.g., temperature out of range + valve open) into a single composite rule alert