What’s new
Changelog
Every improvement, fix, and new capability — in the order we shipped it. Follow along on GitHub for commit-level detail.
April 8, 2025
v0.9.0 AI Incident Correlation Engine
The AI correlation engine is now generally available. It groups related alerts into a single incident, surfaces a probable root cause, and links the relevant traces and logs automatically. In internal testing, it reduced alert noise by up to 70% on busy clusters.
- [New] AI correlation engine: groups related alerts into incidents using causal graph analysis. Reduces alert noise by up to 70% in our internal testing.
- [New] Incident timeline view: auto-generated chronological reconstruction of events from logs, traces, and metric anomalies.
- [New] Root cause hypothesis panel: the AI surfaces up to 3 ranked hypotheses with supporting evidence from your telemetry.
- [Improved] eBPF agent startup time reduced by 40%; kernel probes attach in under 200 ms on modern kernels (5.15+).
- [Improved] Ingestion pipeline throughput improved to 2M events/sec per node (up from 1.2M). Backpressure handling is now adaptive.
- [Fixed] Memory leak in the eBPF userspace buffer when kernel perf ring buffers fill under sustained high load.
- [Fixed] Alert state machine could get stuck in 'firing' if the metrics backend returned a transient empty series.
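The 'firing' fix above concerns how an alert evaluator reacts to an empty result from the metrics backend. The changelog does not describe the actual fix, so the following is only a minimal sketch of one common approach: treat an empty series as "no data", keep the current state, and let the next real sample decide. The names `AlertState` and `next_state` and the threshold rule are hypothetical, not SaviourOps code.

```python
from enum import Enum
from typing import Sequence


class AlertState(Enum):
    OK = "ok"
    FIRING = "firing"


def next_state(current: AlertState, series: Sequence[float], threshold: float) -> AlertState:
    """Compute the next alert state from the latest evaluation window."""
    # Assumption: an empty series means "no data", not "condition cleared"
    # and not an error. Keep the current state and re-evaluate next cycle.
    if not series:
        return current
    latest = series[-1]
    return AlertState.FIRING if latest > threshold else AlertState.OK


# A breach, then a transient empty series, then a healthy sample.
state = AlertState.OK
for samples in ([95.0], [], [40.0]):
    state = next_state(state, samples, threshold=90.0)
```

In this sketch the empty window neither resolves nor wedges the alert; the healthy sample that follows returns it to `OK`.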
March 18, 2025
v0.8.2 On-Call Scheduling Overhaul
On-call management gets a complete scheduling engine rewrite. Rotations now support multi-layer overrides, follow-the-sun configs, and a real-time schedule preview before you publish. We also shipped Slack-native incident commands.
- [New] Follow-the-sun rotation support: define shifts by timezone and SaviourOps hands off automatically at shift boundaries.
- [New] Slack integration v2: acknowledge, escalate, or resolve incidents directly from Slack without opening the dashboard.
- [New] Schedule override UI: drag-and-drop overrides on the calendar view with conflict detection.
- [New] On-call health score: weekly digest showing MTTA, MTTR, and alert volume per engineer to identify toil concentration.
- [Improved] Escalation policy editor redesigned to support multi-step escalations with per-step delays and fallback contacts.
- [Improved] PagerDuty import tool now handles nested teams and custom escalation policies without manual cleanup.
- [Fixed] Edge case where simultaneous acknowledgement from two engineers could leave an incident in a split-brain acknowledged/unacknowledged state.
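The simultaneous-acknowledgement fix above is a classic check-then-set race. The changelog does not say how it was resolved; the sketch below only illustrates the general technique of making the check and the write a single atomic step, so that exactly one acknowledgement wins. The `Incident` class and its fields are hypothetical, and a real service would more likely use a database-level conditional update than an in-process lock.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Incident:
    acknowledged_by: Optional[str] = None
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def acknowledge(self, engineer: str) -> bool:
        """Return True only for the first caller; later callers lose the race."""
        # The check and the write happen under one lock, so two concurrent
        # acknowledgements cannot both observe "unacknowledged".
        with self._lock:
            if self.acknowledged_by is not None:
                return False
            self.acknowledged_by = engineer
            return True


incident = Incident()
results: Dict[str, bool] = {}


def ack(name: str) -> None:
    results[name] = incident.acknowledge(name)


threads = [threading.Thread(target=ack, args=(n,)) for n in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread runs first becomes the owner; the other receives `False` and can show the engineer who already holds the incident.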
February 24, 2025
v0.8.0 eBPF Agent: Kubernetes Auto-Discovery
The eBPF agent now understands Kubernetes natively — it reads pod labels, namespace metadata, and service topology from the kube-apiserver and attaches that context to every trace span and network flow automatically. Zero YAML changes required.
- [New] Kubernetes auto-discovery: eBPF agent reads pod/service/namespace labels and enriches all telemetry automatically. No annotation changes needed.
- [New] Network flow topology map: visualize east-west traffic between services derived from eBPF socket probes, with no service mesh required.
- [New] HTTP/2 and gRPC tracing support in the eBPF agent (previously HTTP/1.1 only).
- [New] OpenTelemetry Collector compatibility: SaviourOps now accepts OTLP/gRPC and OTLP/HTTP on standard ports. Drop-in replacement for any OTLP-compatible collector.
- [Improved] eBPF agent CPU overhead reduced by 18% through batched BPF map reads and reduced context switch frequency.
- [Improved] Helm chart now supports topology spread constraints and PodDisruptionBudget for production deployments.
- [Fixed] eBPF agent crashed on kernel 6.6+ due to a renamed struct field in the socket filter program. Now tested against kernels 5.4 through 6.8.
- [Fixed] Trace context propagation dropped on HTTP requests with non-standard capitalization of the traceparent header.
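The traceparent fix above follows from the fact that HTTP header names are case-insensitive, so `traceparent`, `Traceparent`, and `TRACEPARENT` must all be treated as the same header. As an illustration only (the function name `find_traceparent` is hypothetical and this is not the agent's code), a lookup that ignores capitalization might look like this:

```python
from typing import Mapping, Optional


def find_traceparent(headers: Mapping[str, str]) -> Optional[str]:
    """Return the traceparent value regardless of header-name capitalization."""
    for name, value in headers.items():
        # Compare lowercased names; an exact-case dict lookup would miss
        # variants such as "TraceParent".
        if name.lower() == "traceparent":
            return value.strip()
    return None


incoming = {"TraceParent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
found = find_traceparent(incoming)
```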
January 30, 2025
v0.7.1 Log Explorer and Query Performance
Log Explorer ships with full-text search, structured field filtering, and a live-tail mode. Under the hood we rewrote the ClickHouse query layer to push filter predicates closer to storage — p99 log query latency dropped by 60% on large datasets.
- [New] Log Explorer: full-text search with AND/OR/NOT operators, structured field filtering, and saved searches.
- [New] Live-tail mode: stream log lines in real time from the browser, filtered by service, severity, or arbitrary fields.
- [New] Log-to-trace linking: every log line with a trace_id is automatically linked to the parent span, so you can click through without copy-pasting IDs.
- [Improved] ClickHouse query layer rewritten to push filter predicates to the storage layer. p99 query latency on 10B+ row datasets down from 4.2 s to 1.7 s.
- [Improved] Ingestion pipeline now automatically parses JSON log bodies and promotes top-level keys to indexed columns.
- [Improved] API rate limit errors now return Retry-After headers and a structured JSON error body with request_id for easier debugging.
- [Fixed] Dashboard time-range picker incorrectly adjusted timestamps for users in UTC+5:30 and similar half-hour offset timezones.
- [Fixed] Self-hosted installer failed silently if the target host had less than 4 GB of RAM instead of surfacing a pre-flight error.
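For API clients, the rate-limit change above means a throttled response now carries enough information to back off and to report the failure. The sketch below shows one way a client might read it. It rests on assumptions the changelog does not confirm: that throttling uses HTTP status 429 and that Retry-After is given in seconds rather than as an HTTP date. Only the `request_id` field comes from the entry above; the function name `parse_rate_limit` and the 1-second fallback are hypothetical.

```python
import json
from typing import Mapping, Optional, Tuple


def parse_rate_limit(
    status: int, headers: Mapping[str, str], body: str
) -> Optional[Tuple[float, Optional[str]]]:
    """Return (seconds_to_wait, request_id) for a throttled response, else None."""
    if status != 429:  # assumption: rate limiting is signalled with 429
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        # Assumption: Retry-After is a number of seconds.
        delay = max(0.0, float(lowered.get("retry-after", "1")))
    except ValueError:
        delay = 1.0  # hypothetical fallback when the value is not numeric
    try:
        request_id = json.loads(body).get("request_id")
    except (json.JSONDecodeError, AttributeError):
        request_id = None  # body missing, malformed, or not a JSON object
    return delay, request_id


throttled = parse_rate_limit(429, {"Retry-After": "2"}, '{"request_id": "req-123"}')
```

A caller would sleep for the returned delay before retrying and include the `request_id` in any log line or support ticket.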
January 6, 2025
v0.7.0 Public Beta Launch
SaviourOps is now in public beta. This release includes the foundational observability pipeline, alerting engine, basic on-call scheduling, and the eBPF agent for zero-instrumentation Linux and Kubernetes deployments.
- [New] Public beta open. Sign up at saviourops.com; the free tier includes 5 GB/month ingestion.
- [New] eBPF agent: zero-instrumentation observability for Linux processes and Kubernetes workloads.
- [New] Alerting engine with multi-condition rules, severity levels, and notification routing to email, Slack, and PagerDuty.
- [New] Basic on-call scheduling with rotation management and escalation policies.
- [New] Self-hosted deployment option via Helm chart and Docker Compose.