2026 Prediction – Open Source Will Ride AI’s Wave Into Its Next Golden Age
IBM’s $11 billion acquisition of Confluent, announced in early December, is more than a major consolidation play in data infrastructure: it’s a public admission that Artificial Intelligence (AI) is fundamentally event-driven. In other words, this acquisition is proof that enterprises need trusted data-in-motion as much as data-at-rest. As organizations have rushed to deploy AI agents across their operations in 2025, the deal spotlights a critical realization: real-time context is the missing ingredient in making agentic AI work at enterprise scale.
IBM’s own framing reveals the strategic shift. They’re positioning the combined entity as a “smart data platform for AI agents” – infrastructure that can connect, process, and govern data in real time so agents can operate seamlessly across hybrid environments. This isn’t about selling more streaming infrastructure. It’s about acknowledging that AI agents need continuous, fresh context to function reliably, and that streaming data is the plumbing that makes it possible.
The Paradox at the Heart of Enterprise AI
IBM’s acquisition also highlights a fascinating paradox in the data infrastructure landscape. Over the past few years, some infrastructure vendors pulled back from open source, changing licenses and retreating to proprietary models in pursuit of stronger monetization. Yet AI adoption is forcing ecosystems back toward openness. Why? Agents need interoperable pipelines, connectors, and governance across many systems – not a single vendor’s walled stack.
The rise of powerful open-source Large Language Models (LLMs) has pushed the entire AI ecosystem toward transparency and portability. Models like Llama, Mistral, and countless others give enterprises cheaper and better alternatives to closed models. This creates tension: data infrastructure vendors are closing their gardens just as the AI companies consuming their products are opening theirs. IBM’s acquisition of Confluent signals the beginning of a reset, where AI’s pull forces infrastructure back toward openness.
Context Management: The Enterprise Capability AI Demands
To understand why, we need to talk about what I call “context management” – an enterprise capability to deliver the most relevant, reliable, and retained context to model context windows. This isn’t just ad-hoc Retrieval-Augmented Generation (RAG) implementations scattered across different teams. It’s a systematic approach to ensuring AI agents have access to the information they need, when they need it, with proper governance and provenance.
Here’s a simple mental model:
Agents run on context. Context runs on pipelines.
The context pipeline looks like this:
sources → streaming → storage (lakehouse/OLTP) → indexing (vector + lexical + SQL) → policy/governance → serving → observability/evals.
Each layer needs to work reliably, and they need to work together. Streaming sits at the foundation because it provides the continuous freshness that agents require.
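To make the pipeline concrete, here is a deliberately minimal Python sketch. Everything in it is hypothetical – the `ContextRecord` shape, the in-memory stand-ins for storage and indexing, the function names – and a real deployment would back each stage with streaming, lakehouse, and index infrastructure rather than lists and dicts:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ContextRecord:
    source: str                    # originating system (provenance)
    payload: dict                  # the raw event or document
    ingested_at: datetime          # freshness marker set at the streaming layer
    tags: list[str] = field(default_factory=list)  # governance labels, e.g. "pii"

STORAGE: list[ContextRecord] = []           # stand-in for the lakehouse/OLTP layer
INDEX: dict[str, list[ContextRecord]] = {}  # stand-in for vector/lexical/SQL indexes

def ingest(source: str, payload: dict) -> ContextRecord:
    """Streaming layer: wrap each event with provenance and a freshness timestamp."""
    return ContextRecord(source, payload, ingested_at=datetime.now(timezone.utc))

def store_and_index(record: ContextRecord) -> None:
    """Storage + indexing layers: persist the record, then make it retrievable."""
    STORAGE.append(record)
    for term in record.payload.get("text", "").lower().split():
        INDEX.setdefault(term, []).append(record)

def serve_context(query: str, allowed_tags: set[str]) -> list[ContextRecord]:
    """Policy + serving layers: retrieve matches, then filter by governance labels."""
    hits = INDEX.get(query.lower(), [])
    return [r for r in hits if set(r.tags) <= allowed_tags]

# sources -> streaming -> storage/indexing -> policy/serving, end to end:
record = ingest("crm", {"text": "Invoice overdue for account 42"})
store_and_index(record)
print(serve_context("invoice", allowed_tags={"finance"}))
```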
Traditional RAG approaches are often reactive – they fetch context when prompted. But agents also need proactive updates: events continuously refreshing memory, updating retrieval indexes, adjusting permissions, and enforcing policies. Confluent’s acquisition by IBM is fundamentally a bet on that “always-updating context layer” becoming critical infrastructure for enterprise AI.
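Here is roughly what that push-based refresh looks like – a minimal sketch assuming a Kafka topic named `inventory-updates` and the open-source kafka-python client, with a plain dict standing in for whatever retrieval index or memory the agent actually consults:

```python
import json
from kafka import KafkaConsumer  # open-source kafka-python client

# "inventory-updates" and the broker address are illustrative placeholders.
consumer = KafkaConsumer(
    "inventory-updates",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

agent_memory: dict[str, dict] = {}  # stand-in for a retrieval index or agent memory

# Proactive refresh: each event updates the agent's context as it happens,
# instead of waiting for a prompt-time lookup against stale data.
for message in consumer:
    event = message.value
    agent_memory[event["sku"]] = event  # latest state wins; no yesterday's inventory
```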
Why Agents Fail (And Why It Matters)
Enterprise AI teams are discovering this unfortunate truth the hard way: agents don’t fail because the LLM is “dumb.” They fail because the underlying context is broken, stale, incomplete, or ungoverned. In fact, analysts estimate that up to 60% of AI projects will be abandoned due to a lack of AI-ready data. An agent making procurement decisions based on yesterday’s inventory data isn’t helpful. An agent accessing customer records without proper authorization is a compliance nightmare. An agent that can’t explain its reasoning is unusable in regulated industries.
Enterprises can’t audit AI decisions without provenance. They can’t scale AI applications without consistent data freshness. They can’t deploy agents confidently without proper governance guardrails. All of these requirements point to the same conclusion: context management needs to become first-class infrastructure, not an afterthought.
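A context gate is one way to make those requirements tangible. The sketch below reuses the illustrative `ContextRecord` from the pipeline example and assumes a made-up five-minute freshness budget; the point is the shape, not the numbers: nothing reaches the model’s context window without a freshness check, an authorization check, and an audit entry.

```python
from datetime import datetime, timedelta, timezone

MAX_STALENESS = timedelta(minutes=5)  # illustrative freshness budget, not a standard

def admit_context(record, caller_roles: set[str], audit_log: list[dict]) -> bool:
    """Gate a context record before it reaches the model's context window.

    Rejects stale or unauthorized records, and writes an audit entry either
    way so every agent decision can be traced back to the context it used.
    """
    fresh = datetime.now(timezone.utc) - record.ingested_at <= MAX_STALENESS
    authorized = set(record.tags) <= caller_roles   # caller must cover every label
    audit_log.append({
        "source": record.source,                    # provenance
        "ingested_at": record.ingested_at.isoformat(),
        "admitted": fresh and authorized,
    })
    return fresh and authorized
```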
The Open Source Resurgence
This is where the open source renaissance begins. Context spans too many vendors and systems for any single proprietary stack to win. The successful approach will be open interfaces plus portable building blocks: connectors, streaming platforms, metadata management, retrieval systems, and policy enforcement. Closed licensing slows integration – and integration is the entire game in agentic AI.
IBM understands this. Their history with open source (notably through Red Hat) gives them credibility. The combined IBM-Confluent entity is positioned to accelerate what they call “event-driven intelligence” by embracing openness where it matters: at the integration points where different systems need to work together seamlessly.
We’re already seeing this shift play out. Open-source streaming platforms, open table formats like Apache Iceberg and Delta Lake, and open standards for metadata and governance are becoming the connective tissue of enterprise AI infrastructure. Organizations are demanding portability and interoperability because they know they’ll be working with multiple AI models, multiple data stores, and multiple tools. Lock-in is the enemy of the flexibility they need.
A Prediction for 2026
By the end of 2026, I predict “context management” will emerge as a named category in enterprise technology stacks. Buyers will demand three things:
First, open connectors and “bring-your-own” architectures for data stores and indexes. No single vendor will control the entire context pipeline.
Second, standardized context APIs across tools. Teams need to be able to swap components without rebuilding entire systems (one possible shape for such an interface is sketched below).
Third, governed provenance as a default, not a bolt-on. Every piece of context needs a clear lineage, and every agent decision needs an audit trail.
These aren’t nice-to-haves. They’re table stakes for enterprise AI adoption at scale.
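For the second demand, here is one possible shape for a standardized context API – a hypothetical, vendor-neutral interface sketched for illustration, not an existing standard:

```python
from typing import Protocol

class ContextProvider(Protocol):
    """A hypothetical, vendor-neutral context interface.

    Any retrieval backend (vector store, lexical index, SQL view) that
    implements these two methods can be swapped in without touching the
    agent code that consumes it.
    """

    def retrieve(self, query: str, limit: int) -> list[dict]:
        """Return up to `limit` context items relevant to `query`."""
        ...

    def lineage(self, item_id: str) -> list[str]:
        """Return the provenance chain for a previously served item."""
        ...

def answer_with(provider: ContextProvider, question: str) -> list[dict]:
    # Agent code depends only on the interface, never on a concrete vendor.
    return provider.retrieve(question, limit=5)
```

The point is the seam: as long as retrieval and lineage are part of the contract, the components underneath can change without rewriting the agents that depend on them.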
The Reset Begins
IBM buying Confluent marks the start of a fundamental reset in data infrastructure. AI’s momentum, driven by the need for sophisticated agents operating on fresh, reliable context, is forcing the industry back toward openness. Whether that means pure open source or, at minimum, open and enforceable interoperability depends on how the market evolves. But the direction is clear.
The vendors that thrive in this new era won’t be those with the most closed, proprietary stacks. They’ll be the ones that embrace openness at the integration layer, that provide genuine interoperability, and that help enterprises build context management capabilities without artificial constraints.
The next wave of innovation will come from open-source AI infrastructure that enables enterprises to build sophisticated agents and applications without vendor lock-in. That’s not idealism – it’s pragmatism. Because when you’re building mission-critical AI systems that need to span your entire enterprise, openness isn’t a philosophy. It’s a requirement.
