Technology Analysis

AI Infrastructure in 2026: Which Layers Will Capture Lasting Value


Two years ago, the most common question we received from founders was whether there was still room for an AI infrastructure startup to compete with the hyperscalers. Today, the question has inverted: which layers of the AI infrastructure stack will be defensible enough to retain margin as the underlying technology commoditises, and which will be swept away by the same competitive dynamics that have made the hardware and model layers so difficult for independent companies to sustain?

This analysis is our current answer to that question, based on conversations with more than 60 enterprise AI buyers, deep diligence on 30+ AI infrastructure companies over the past 18 months, and the operating experience of the AI-adjacent companies in our portfolio. The conclusions are not comfortable for all parts of the stack, but we believe they reflect the structural dynamics that will determine which AI infrastructure bets generate returns over a five-year horizon.

The Stack, Briefly

To structure the analysis, let us define the five principal layers of the AI infrastructure stack as we map it, from bottom to top:

Layer 1: Hardware and compute. GPUs, specialised AI accelerators (TPUs, custom silicon), and the physical data centre infrastructure that powers model training and inference. Dominated by NVIDIA on the hardware side and the major cloud providers on the infrastructure side.

Layer 2: Foundation models. Large language models, vision models, and multimodal models trained on internet-scale data. The primary providers are OpenAI, Anthropic, Google DeepMind, Meta AI, and a growing set of open-source alternatives led by Meta's Llama family and Mistral.

Layer 3: Model serving and optimisation. Infrastructure for deploying models at production scale: inference optimisation, model compression, caching, load balancing, and the orchestration tools that connect models to enterprise applications. Includes companies like Modal, Replicate, and a growing set of specialised serving platforms.

Layer 4: Data and context. The tools for preparing enterprise data for AI consumption: data pipelines, embedding generation, vector databases, retrieval-augmented generation (RAG) infrastructure, fine-tuning tooling, and the evaluation and monitoring systems that ensure model outputs remain accurate and aligned with enterprise requirements.

Layer 5: Application intelligence. Domain-specific AI applications and the platforms for building them: AI agents, copilots, workflow automation, and vertical AI solutions that package foundation model capabilities into targeted products for specific enterprise functions or industries.

Where Value Is Consolidating: Layers 1 and 2

The clearest structural observation about the AI infrastructure stack is that value in Layers 1 and 2 is consolidating rapidly toward a small number of entrenched players with structural advantages that independent companies cannot replicate.

On the hardware side, NVIDIA's dominance in AI compute has proven more durable than many analysts expected in 2023. The CUDA software ecosystem, built over 15 years and deeply embedded in the toolchains of every major AI research and engineering team, creates switching costs that have proven remarkably sticky despite the commercial availability of competitive hardware from AMD, Google, and increasingly from custom silicon projects at Meta, Amazon, and Microsoft. The emergence of specialised inference chips — from Groq, Cerebras, and others — has created some competitive pressure on specific use cases, but has not fundamentally dislodged NVIDIA's position at the training layer.

On the foundation model side, the economics of training frontier models have produced a structural concentration that was predictable in retrospect but underappreciated in the early years of the current wave. Training a frontier model in 2026 requires capital outlays in the range of $100M–$1B, access to compute clusters that only the hyperscalers and a handful of well-capitalised AI labs can assemble, and research teams of a scale that independent startups cannot afford to build and retain. The open-source movement, led by Meta's Llama releases, has made capable models freely available — but the frontier remains the exclusive domain of organisations with resources that no seed-stage company can contemplate.

The implication for seed-stage investment is clear: Layers 1 and 2 are not investable at the seed stage. The competitive dynamics favour incumbent players too strongly, and the capital requirements are incompatible with seed-stage company building.

The Contested Middle: Layer 3

Layer 3 — model serving and optimisation — presents the most contested competitive dynamics in the current stack. The market is real and growing: as more enterprises move AI applications from pilot to production, the operational complexity of running models at scale has created genuine demand for infrastructure that sits between the raw model APIs and the enterprise application layer.

The challenge for Layer 3 companies is that competitive pressure bears down from both ends of the stack. From below, the foundation model providers are expanding their serving and deployment capabilities, packaging inference optimisation into their API offerings in ways that erode the differentiated value of independent serving infrastructure. From above, the enterprise application layer is increasingly embedding serving optimisation directly into application platforms, reducing the need for standalone serving infrastructure for the majority of enterprise use cases.

We are cautious about seed investments in Layer 3, but not categorically opposed. The companies that have demonstrated durable differentiation in this layer share two characteristics: first, they focus on a specific technical problem — often inference latency, cost optimisation at scale, or model evaluation — rather than trying to provide a general-purpose serving platform; and second, they have built their differentiation on techniques or approaches that are genuinely difficult to replicate, typically rooted in deep research expertise rather than software engineering alone. The latter characteristic is the key screen: Layer 3 companies built on clever software integrations are vulnerable to replication by the hyperscalers; companies built on genuine research breakthroughs in inference optimisation have a more durable position.
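To make the cost-optimisation problem concrete, here is a minimal sketch of one of the simplest techniques in this space: caching repeated prompts so that identical requests never invoke the model twice. The class and function names are illustrative, not drawn from any vendor's API.

```python
import hashlib
import json

class InferenceCache:
    """Exact-match cache: identical (model, prompt, params) requests
    are served from memory instead of re-invoking the model."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, model, prompt, **params):
        # Canonical JSON of the full request, hashed into a stable cache key.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "params": params},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    def complete(self, model, prompt, call_model, **params):
        key = self._key(model, prompt, **params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call_model(model, prompt, **params)  # the expensive call
        self._store[key] = result
        return result

# Stand-in for a real model API, for demonstration only.
fake_model = lambda model, prompt, **p: f"answer to: {prompt}"

cache = InferenceCache()
cache.complete("m1", "What is RAG?", fake_model)  # miss: calls the model
cache.complete("m1", "What is RAG?", fake_model)  # hit: served from memory
```

Production systems in this layer go much further — semantic caching over embedding similarity, batching, speculative decoding — which is precisely why software-only differentiation here is hard to defend; the sketch above is the part anyone can replicate.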

"The AI infrastructure companies that will capture lasting value are not those that wrap foundation models. They are those that solve the problems that arise when enterprises try to put those models to work at production scale."

Where We Are Investing: Layer 4

The data and context layer — Layer 4 — is where we see the strongest combination of genuine enterprise pain, durable competitive dynamics, and seed-stage investability. The core insight driving our thesis here is that foundation models are only as good as the data they are given access to, and the problem of preparing enterprise data for AI consumption is significantly more complex and less automated than the model providers' marketing would suggest.

Enterprise data environments are heterogeneous, poorly documented, and governed by compliance requirements that the general-purpose data pipeline tools developed for the pre-AI era were not designed to address. The combination of these factors creates a set of genuinely hard problems that are not being adequately solved by the existing ecosystem of data infrastructure tools.

Consider the specific case of RAG (retrieval-augmented generation), which has become the dominant architecture for enterprise AI applications that require access to proprietary data. The technical concept is straightforward: when a user submits a query to an AI assistant, the system retrieves relevant documents from a knowledge base and provides them as context to the model. The implementation reality is considerably more complex. Enterprise knowledge bases contain documents in dozens of formats and of varying quality and recency; those documents are governed by access controls that must be respected at retrieval time, and they require chunking and indexing strategies that vary by document type and query pattern. Building a RAG system that works reliably — one that returns the right documents for a given query without hallucinating or surfacing documents the user is not authorised to see — requires significant engineering sophistication that most enterprise AI teams do not have in-house.
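The access-control point in particular is easy to state and easy to get wrong. A minimal sketch of permission-aware retrieval, assuming embeddings are precomputed (the document schema and group names below are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str
    embedding: list                                   # precomputed vector
    allowed_groups: set = field(default_factory=set)  # access-control list

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_embedding, corpus, user_groups, k=3):
    """Return the top-k most similar documents the user is allowed to see.
    Access control is enforced BEFORE ranking, so a highly relevant but
    restricted document can never leak into the model's context."""
    visible = [d for d in corpus if d.allowed_groups & user_groups]
    ranked = sorted(
        visible,
        key=lambda d: cosine(query_embedding, d.embedding),
        reverse=True,
    )
    return ranked[:k]

corpus = [
    Document("hr-1", "Holiday policy ...", [1.0, 0.0], {"hr", "all-staff"}),
    Document("fin-1", "Q3 board deck ...", [0.9, 0.1], {"finance"}),
]
# Despite 'fin-1' scoring highly, a general staff member never retrieves it.
hits = retrieve([1.0, 0.0], corpus, user_groups={"all-staff"})
```

Filtering before ranking is the design choice that matters here: filtering after ranking (or worse, after generation) is a common source of the authorisation leaks described above.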

The market for Layer 4 tools has attracted significant investment and is consequently crowded in some sub-segments. Vector databases, in particular, are a segment where the number of funded companies has likely exceeded the eventual market capacity, and we expect significant consolidation over the next 24 months. Our investment focus within Layer 4 is on the sub-segments where genuine technical differentiation is possible and where the incumbent cloud provider offerings leave meaningful gaps: specifically, data quality and governance tooling for AI applications, evaluation and monitoring infrastructure for production AI systems, and fine-tuning and prompt engineering tooling for enterprise teams.

The Long Game: Layer 5

Layer 5 — application intelligence — is where we believe the largest long-term value in the AI stack will be captured, for a structural reason: enterprise workflow automation at the application layer requires deep domain knowledge that the foundation model providers, the cloud hyperscalers, and the horizontal application platforms do not have and cannot efficiently acquire.

The insight driving this thesis is that the hard part of building an AI application for a specific enterprise vertical is not the AI capability. Foundation models are powerful enough to handle most of the analytical and generative tasks that enterprise workflows require. The hard part is understanding the specific workflow, the data sources that feed it, the exception handling and compliance requirements that govern it, and the integration points with the legacy systems that surround it. This knowledge is domain-specific, tacit, and accumulated over years of working in a specific industry context. It cannot be Googled, and it cannot be automated.

This creates a structural opportunity for founders with deep domain expertise in specific enterprise functions or industries to build AI-native applications that are meaningfully better than what a horizontal platform can provide. The winning seed-stage companies we are evaluating in this layer are those where the founding team has 5–10 years of direct operating experience in the domain they are targeting — where the company's competitive advantage is not the AI capability but the founders' understanding of exactly how the AI capability should be applied to a specific, high-value enterprise workflow.

The Commoditisation Cycle and Its Implications

Underlying our analysis of all five layers is an observation about the pace of commoditisation in AI infrastructure. The historical pattern in infrastructure technology is that capabilities that are differentiated and expensive at one point in the cycle become commoditised and cheap at a later point, as the architecture matures, open-source equivalents emerge, and the hyperscalers integrate the capability into their standard offerings.

This pattern is playing out in AI infrastructure at an unusually rapid pace. The vector database market, which in 2022 contained a handful of purpose-built companies offering genuinely novel technology, now includes open-source alternatives (pgvector, Qdrant, Chroma) that provide adequate performance for the majority of enterprise use cases at no licence cost. The embedding generation market, dominated in 2022 by companies charging premium prices for proprietary embedding models, has been largely commoditised by the open availability of high-quality embedding models from the major foundation model providers.
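The scale of that commoditisation is easy to demonstrate. The core operation a vector database sells — nearest-neighbour search over embeddings — can be done brute-force in a few lines of NumPy, and for corpora up to the low hundreds of thousands of vectors that is often fast enough. The corpus size and dimensionality below are illustrative:

```python
import numpy as np

def top_k(query, index, k=2):
    """Brute-force cosine nearest-neighbour search over an embedding matrix.
    Adequate for many mid-size corpora; dedicated vector databases earn
    their keep at much larger scales or with strict latency budgets."""
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = index_norm @ q          # cosine similarity against every row
    return np.argsort(-scores)[:k]   # indices of the k best matches

rng = np.random.default_rng(0)
index = rng.normal(size=(10_000, 384))   # 10k documents, 384-dim embeddings
query = index[42] + rng.normal(scale=0.01, size=384)  # near-copy of row 42

print(top_k(query, index))  # row 42 ranks first
```

When the baseline is this cheap, a standalone product in the segment has to win on something other than the core operation — filtering, governance, hybrid search, operations at scale — which is consistent with the consolidation we expect.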

Founders building seed-stage AI infrastructure companies need to be explicit with themselves and their investors about their commoditisation timeline. The question is not whether the capability they are building will eventually be commoditised — it will. The question is whether the company can reach a scale and competitive position that allows it to sustain itself through the commoditisation event: either by moving up the stack to higher-value applications, by building data network effects that give the company access to proprietary training data that competitors cannot replicate, or by accumulating enterprise contracts with switching costs that outlast the commoditisation of the underlying technology.

The most compelling AI infrastructure seed investments we evaluate are those where the founding team has a clear, specific answer to the commoditisation question before we ask it. Founders who have thought carefully about how their product's value proposition evolves as the underlying technology gets cheaper demonstrate the strategic sophistication that the AI infrastructure market, more than almost any other technology category, demands.

About the author: Tobias Fleischer is a Principal at KnownWeil Capital. He focuses on AI infrastructure, developer tools, and applied machine learning.