Datadog’s Toto: Open-Source Observability Model

Datadog unveils Toto, a 151M-parameter time-series foundation model and BOOM benchmark, sparking a new open-source, AI-driven observability wave.

Jigar Bhatt

08 Jun 2025 — 4 min read

Datadog recently unveiled Toto, a new open-source AI foundation model tailored for observability data, alongside a companion benchmark dataset called BOOM. Toto is a time series foundation model (TSFM) – essentially to metrics what large language models (LLMs) are to text – designed to learn from massive telemetry datasets and adapt to a range of monitoring tasks. Both the 151-million-parameter model and the 350-million-point BOOM benchmark have been open-sourced under a permissive license. Toto is the industry’s first foundation model focused on observability, and it could reshape how engineers apply AI in monitoring and reliability.

Observability Gets Its Own Foundation Model

Toto’s release is a milestone for the observability and AI communities. By open-sourcing a high-performing model trained on real-world telemetry, Datadog is bridging academic AI advancements with practical infrastructure monitoring needs. Zero-shot capabilities are a key highlight – Toto can flag anomalies or forecast capacity without needing custom per-metric training, an important advantage when dealing with millions of ephemeral time series in cloud environments. In essence, the model has learned generic patterns of how infrastructure and application metrics behave, enabling instant insight on fresh data. For observability engineers, this promises more accurate alerts and predictions with less manual tuning.

According to Datadog’s own evaluations, Toto achieves top-ranked performance on both the observability benchmark (BOOM) and a general time-series benchmark (GIFT-Eval). Independent peer review and follow-up papers will be worth watching.

Equally notable is the release of BOOM (Benchmark of Observability Metrics), a comprehensive dataset of 2,800+ real production time series for evaluating models. By publishing this alongside Toto, Datadog provides a standard yardstick for measuring progress on observability AI. It addresses the unique challenges of telemetry data – high sparsity, spiky outliers, short-lived series – which typical time-series benchmarks don’t fully capture. This kind of domain-focused foundation model and benchmark combo could accelerate innovation, allowing researchers and practitioners to iterate on solutions that are directly relevant to operational engineering problems.
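To make the "standard yardstick" idea concrete, here is a minimal sketch of how a single forecast might be scored against held-out data. It uses mean absolute scaled error (MASE), a widely used forecasting metric; the article does not say which metrics BOOM reports, so the choice of MASE, the function name, and the sample numbers are all assumptions for illustration.

```python
from typing import Sequence


def mase(
    history: Sequence[float],
    actual: Sequence[float],
    forecast: Sequence[float],
    season: int = 1,
) -> float:
    """Mean absolute scaled error (illustrative; not taken from BOOM's code).

    Divides the forecast's mean absolute error by the in-sample mean absolute
    error of a seasonal-naive forecast computed on the history.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(history) <= season:
        raise ValueError("history must be longer than the seasonal period")

    # Error of the naive "repeat the value from one season ago" forecast.
    naive_errors = [
        abs(history[i] - history[i - season]) for i in range(season, len(history))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        # A flat history (common for sparse metrics) leaves MASE undefined.
        raise ValueError("history is constant; MASE is undefined")

    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


# Made-up numbers: a score below 1.0 means the forecast beat the naive baseline.
score = mase(history=[10, 12, 11, 13, 12], actual=[14, 13], forecast=[13, 13])
```

The zero-scale guard hints at why telemetry is awkward for off-the-shelf metrics: a sparse series that sits flat for long stretches gives the naive baseline zero error, so a benchmark has to decide how to handle such cases.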

How Toto Stacks Up in the AIOps Landscape

Toto’s debut also invites comparison with other efforts to inject AI into DevOps and infrastructure management. Traditionally, AIOps tools have been narrower in scope – for example, open-source projects like Loglizer apply machine learning to detect anomalies in log files, and platforms like Opni use pretrained models to spot unusual patterns in Kubernetes logs and metrics. While useful, these tools are often limited to specific data types or require significant tuning for each use case. In contrast, Toto arrives as a general-purpose model for time-series telemetry, pre-trained on a vast corpus of observability data. This makes it more akin to a GPT-style model for ops: a single model that can potentially handle diverse tasks (forecasting, anomaly detection, capacity planning, etc.) out of the box.

It’s also noteworthy that many monitoring vendors have offered proprietary AI-assisted features for years (from anomaly alerts to automated root-cause hints), but those models typically operate as black boxes. Datadog’s open approach with Toto stands out. By releasing the weights and code, the company enables the broader community to validate, improve, or repurpose the model beyond Datadog’s own platform. This openness echoes a trend seen in other domains, such as computer vision and NLP, where open-source foundation models have spurred faster adoption and innovation than closed solutions.

Open Model, Open Opportunities

Having a specialized foundation model under the Apache 2.0 license means others in the industry can use Toto freely. Observability engineers could integrate Toto into their existing toolchains – for instance, plugging it into open-source monitoring stacks to get smarter anomaly detection on metrics streams. Because the model is domain-optimized, it may deliver more relevant results on telemetry data than a generic AI service. Organizations might also fine-tune Toto on their own datasets (e.g. refining it for a particular environment or for predicting specific KPIs), building on its foundation instead of starting from scratch.
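As a rough illustration of what forecast-driven anomaly detection on a metrics stream could look like, the sketch below flags points that fall outside a predicted band. It does not use Toto’s actual API: `naive_band_forecaster` is a hypothetical stand-in (mean plus or minus three standard deviations of the context window), and `flag_anomalies`, the `Forecaster` type, and the sample series are likewise assumptions. In a real integration, a pretrained model’s predicted quantiles would take the stand-in’s place.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence, Tuple

# A forecaster maps a context window to (lower, upper) bounds for the next point.
Forecaster = Callable[[Sequence[float]], Tuple[float, float]]


def naive_band_forecaster(
    context: Sequence[float], width: float = 3.0
) -> Tuple[float, float]:
    """Hypothetical stand-in for a pretrained model: mean +/- width * stdev."""
    center = mean(context)
    spread = stdev(context) if len(context) > 1 else 0.0
    return center - width * spread, center + width * spread


def flag_anomalies(
    series: Sequence[float], forecaster: Forecaster, context_len: int = 8
) -> List[int]:
    """Return indices whose value falls outside the band predicted
    from the preceding `context_len` points."""
    flagged: List[int] = []
    for i in range(context_len, len(series)):
        lower, upper = forecaster(series[i - context_len:i])
        if not lower <= series[i] <= upper:
            flagged.append(i)
    return flagged


# Made-up latency-like series with one obvious spike at index 9.
series = [10, 11, 10, 12, 11, 10, 11, 12, 11, 50, 11, 10]
anomalies = flag_anomalies(series, naive_band_forecaster)
```

Passing the forecaster in as a function keeps the detection loop independent of any particular model, so a zero-shot model could be swapped in without per-metric training. Note that the naive stand-in lets the spike widen the band for the windows that follow it, a weakness a stronger forecaster would be expected to reduce.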

The release may also encourage collaboration across companies and open-source projects. We could see community-driven improvements to Toto’s architecture or the emergence of adjacent models for logs and traces, applying a similar philosophy to other observability signals. For decision-makers, Toto exemplifies how AI capabilities can be productively shared: a major vendor contributing a state-of-the-art tool back to the community, which others can adopt and even commercialize in new ways. In a space as critical as reliability engineering, this democratization of advanced AI tools is a welcome development. It suggests that AI-powered observability is entering a new phase – one where open, shared intelligence helps everyone keep systems running smoothly.

The upshot is that Datadog’s Toto model is more than just a one-off release; it’s a signal of things to come. As foundation models continue to spread beyond NLP into IT operations, the observability field stands to benefit immensely. Open-source models like Toto provide a common technological base that practitioners, vendors, and researchers alike can build upon. That could accelerate the evolution of AIOps, making sophisticated AI-driven monitoring a standard part of the engineering toolkit across the industry.

Head over to Hugging Face (🤗) and happy TSFM-ing!

Sources: Datadog blog and paper.