At any moment, thousands of interdependent services are processing workloads, consuming resources, and triggering each other in ways no dashboard fully captures.
Today's tools were built for a simpler era: reactive alerts, static thresholds, heuristic recommendations. They describe what happened. They don't explain why, or predict what comes next.
Incidents are diagnosed after users are already impacted. Capacity decisions are made on intuition, not on a model of how the system will actually behave.
As infrastructure scales, and as autonomous systems take on more operational responsibility, this gap between observation and understanding becomes critical.
Presage trains and deploys world models directly inside cloud environments. Our models learn from production data and build an internal representation of how infrastructure actually behaves.
This makes possible what current tooling cannot deliver:
/ Anticipate cost trajectories before they materialize
/ Simulate infrastructure changes before applying them to production
/ Trace the true root cause of incidents: the upstream origin, not the downstream symptom
/ Plan capacity on a real model of how workloads and resources will evolve
/ Give autonomous agents a model of the environment they operate in
That last point is increasingly important. AI agents are taking on more operational tasks: auto-scaling, cost optimization, incident response. Without a model of the environment, an agent acting rationally on local signals can generate system-wide failures. World models are what make autonomous infrastructure management reliable.
Presage is a frontier AI research lab. We develop novel model architectures, drawing from deep learning, dynamical systems theory, and causal inference, and deploy them directly inside production infrastructure.
We don't fine-tune existing models on cloud data. We build the architectures that the infrastructure AI problem actually requires.
Our research is academically rigorous, rooted in work conducted at ENS. And it runs in production, because research that stays in papers doesn't solve the problem.
Cloud-1 is our first foundational world model for cloud infrastructure.
Cloud-1 learns the structure, dynamics, and causal relationships of AWS environments from production data, deployed inside your infrastructure.
It captures what static tooling cannot: how services depend on each other, how workloads evolve, how resource consumption propagates, how decisions made today shape the system tomorrow.
Most infrastructure tooling operates on observable signals. Cloud-1 goes deeper: it learns a latent representation of the system that encodes the dynamics driving how it evolves:
/ How workloads shift over time and across services
/ How resource consumption in one part of the system propagates to others
/ How service dependencies create fragility
/ How cost structures evolve as a function of infrastructure decisions
/ What early failure signatures look like, before they surface as incidents
By learning in latent space rather than operating on raw metrics, Cloud-1 achieves stable long-horizon predictions across the full complexity of the system.
The Data Factory pipeline
Raw cloud data is heterogeneous: different formats, sampling rates, and schemas across services. Our Data Factory pipeline normalizes and structures this data into training sequences that expose the temporal and causal structure of the environment. It runs continuously as the infrastructure evolves.
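To make the normalization step concrete, here is a minimal sketch of one way to align streams with different sampling rates and cut them into training sequences. It is an illustration under stated assumptions, not the Data Factory pipeline itself: the function names (`resample`, `build_sequences`), the forward-fill strategy, and the sample metrics are all hypothetical.

```python
from bisect import bisect_right

def resample(samples, start, stop, step):
    """Forward-fill irregular (timestamp, value) samples onto a regular grid."""
    samples = sorted(samples)
    times = [t for t, _ in samples]
    grid = []
    for t in range(start, stop, step):
        i = bisect_right(times, t) - 1  # index of the last sample at or before t
        grid.append(samples[i][1] if i >= 0 else None)
    return grid

def build_sequences(streams, start, stop, step, window):
    """Align every stream on one grid, then cut sliding windows of `window` steps."""
    names = sorted(streams)
    columns = [resample(streams[n], start, stop, step) for n in names]
    rows = list(zip(*columns))  # one tuple of metric values per time step
    return names, [rows[i:i + window] for i in range(len(rows) - window + 1)]

# Hypothetical streams: CPU sampled every 60 s, queue depth every 120 s.
streams = {
    "cpu": [(0, 0.2), (60, 0.4), (120, 0.5), (180, 0.9)],
    "queue_depth": [(0, 3), (120, 7)],
}
names, sequences = build_sequences(streams, start=0, stop=240, step=60, window=3)
```

Each element of `sequences` is a short, time-ordered window in which every metric shares the same clock, which is the minimum a sequence model needs before it can learn temporal structure.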
Latent dynamics architecture
Cloud-1 learns to encode system state into a compact latent representation, then models how that representation evolves over time. Raw metrics are high-dimensional and noisy; many dimensions are unpredictable at the surface level but stable in latent space. This is what enables reliable long-horizon prediction at scale.
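The idea of encoding state and then predicting in latent space can be shown with a deliberately tiny toy. This is a sketch under strong assumptions, not Cloud-1's architecture: the encoder is a fixed average rather than a learned network, the dynamics are a one-dimensional linear fit, and the data is synthetic. All names are hypothetical.

```python
import random

def encode(observation):
    """Toy encoder: compress many noisy channels into one latent value (their mean)."""
    return sum(observation) / len(observation)

def fit_linear_dynamics(latents):
    """Least-squares fit of z[t+1] = a * z[t] + b."""
    xs, ys = latents[:-1], latents[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b

def rollout(z0, a, b, steps):
    """Predict forward entirely in latent space, never touching raw channels."""
    trajectory, z = [], z0
    for _ in range(steps):
        z = a * z + b
        trajectory.append(z)
    return trajectory

# Synthetic data: a hidden load level settles toward 10.0 and is only
# observed through 50 noisy channels.
rng = random.Random(0)
hidden, observations = 2.0, []
for _ in range(200):
    observations.append([hidden + rng.gauss(0, 0.5) for _ in range(50)])
    hidden = 0.9 * hidden + 1.0  # true dynamics, unknown to the model

latents = [encode(o) for o in observations]
a, b = fit_linear_dynamics(latents)
forecast = rollout(latents[-1], a, b, steps=20)
```

Any single channel here is dominated by noise, yet the latent trajectory is smooth enough that a very simple dynamics model recovers the underlying behavior. A real system would replace both the averaging encoder and the linear fit with learned components.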
Action-conditioned simulation
Given a candidate change (a deployment, a scaling event, a configuration update), Cloud-1 simulates the trajectory of the system forward in time, with quantified uncertainty. Infrastructure decisions go from reactive to predictive.
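As a rough illustration of what action-conditioned simulation with uncertainty can look like, the sketch below runs Monte Carlo rollouts of a toy stochastic load model with and without a candidate scaling action, and reports percentile bands. Everything in it is assumed for the example: the transition function, the `add_capacity` action key, and the numbers. It does not describe how Cloud-1 simulates.

```python
import random

def step(load, rng):
    """Hypothetical stochastic transition: load drifts upward with noise."""
    return max(0.0, load * 1.02 + rng.gauss(0, 5.0))

def percentile(sorted_values, q):
    """Nearest-rank style percentile on an already sorted list."""
    return sorted_values[int(q * (len(sorted_values) - 1))]

def simulate(load, capacity, action, horizon, n_rollouts=500, seed=0):
    """Monte Carlo rollouts of utilization under a candidate action.

    `action` is a dict such as {"add_capacity": 50}; the key is illustrative.
    Returns one (p10, p50, p90) utilization band per future step.
    """
    rng = random.Random(seed)
    capacity = capacity + action.get("add_capacity", 0)
    paths = []
    for _ in range(n_rollouts):
        current, path = load, []
        for _ in range(horizon):
            current = step(current, rng)
            path.append(current / capacity)
        paths.append(path)
    bands = []
    for t in range(horizon):
        values = sorted(p[t] for p in paths)
        bands.append((percentile(values, 0.10),
                      percentile(values, 0.50),
                      percentile(values, 0.90)))
    return bands

# Compare doing nothing against adding 50 units of capacity.
baseline = simulate(load=80.0, capacity=100.0, action={}, horizon=12)
scaled = simulate(load=80.0, capacity=100.0, action={"add_capacity": 50}, horizon=12)
```

Comparing `baseline` and `scaled` step by step shows both the expected effect of the change and how wide the plausible range is, which is the information a decision needs before the change reaches production. Using the same seed for both runs keeps the comparison paired, so differences come from the action rather than from sampling noise.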
Cloud-1 is the first model in a family. We are expanding coverage beyond AWS, deepening causal reasoning capabilities, and building abstractions for multi-service, multi-region architectures.
Each new deployment extends the base model. Each new system teaches us something new.
The most important AI systems of the next decade will be those that understand the environments they act in, that model cause and effect, simulate future states, and reason about downstream consequences.
Language models transformed what AI can do with text. World models will transform what AI can do with systems.
We started with cloud infrastructure: the environment where autonomous AI is already most active, complexity is highest, and the cost of acting without understanding is greatest.
A world model is an AI system trained to understand how an environment behaves, how it evolves over time, how its parts interact, and what happens next as conditions change.
When you learn to drive, you don't just memorize traffic rules. You develop an intuition for how the road behaves, how other cars will move, what happens if you brake too late, how speed and distance interact. That intuition is a world model.
Large language models never developed that intuition. They learned to read and write about the world, not to understand how it works. In this analogy, they can describe every road in the world. They cannot anticipate what happens at the next turn.