The Floor and Ceiling of AI

The useful question is how much AI capability would still remain if the proprietary frontier disappeared tomorrow.
[Illustration: an open AI floor rising toward a proprietary AI ceiling. The ceiling shows the frontier. The floor shows what the world keeps.]
Most conversations about AI progress focus on the ceiling.

Which proprietary model is strongest right now? Which lab is ahead? Is the best coding model GPT-5.5 Pro, Claude Opus 4.7, or something else depending on the task?

That is useful, but it misses the part that feels more historically important.

The better question is what is happening to the floor.

By floor, I mean the strongest open-weight or open-source AI available at a given point in time. If the proprietary frontier collapsed tomorrow, if the APIs disappeared or access was cut off, what is the minimum capability the world would still have?

That is the floor.

The ceiling tells you what is possible at the top end. The floor tells you what cannot be put back in the bottle.

I built a small companion tracker for this idea here: Floor / Ceiling. It is meant to become a monthly record of the open floor and proprietary ceiling, rather than another loose list of model launches.

The tracker now takes its points and gaps from Artificial Analysis (AA), as of 1 May 2026: the Coding Index for coding, the Agentic Index for agentic work, and the Intelligence Index for general reasoning. The floor and ceiling model choices are still editorial, but the displayed gaps are tied to a public benchmark source. When a side lists multiple models, the side score is the highest published AA score among those models.
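The scoring rule is simple enough to sketch in a few lines. This is a minimal illustration, not the tracker's actual code: the function names, the model labels, and every score below are hypothetical placeholders, not Artificial Analysis figures.

```python
def side_score(scores: dict[str, float]) -> float:
    """Score for one side (floor or ceiling): the highest published score
    among the models listed on that side."""
    if not scores:
        raise ValueError("a side needs at least one scored model")
    return max(scores.values())


def gap(ceiling: dict[str, float], floor: dict[str, float]) -> float:
    """Gap in index points: ceiling side score minus floor side score."""
    return side_score(ceiling) - side_score(floor)


# Placeholder values for illustration only (NOT real benchmark scores).
ceiling_models = {"closed-model-a": 60.0, "closed-model-b": 57.0}
floor_models = {"open-model-a": 48.0, "open-model-b": 52.0, "open-model-c": 41.0}

print(side_score(ceiling_models))         # 60.0
print(side_score(floor_models))           # 52.0
print(gap(ceiling_models, floor_models))  # 8.0
```

Taking the maximum means that adding a weaker model to a side never changes that side's score. Only a new best model on either side moves the gap.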

The ceiling in April 2026

As of late April 2026, I would put two systems in the ceiling conversation.

OpenAI's GPT-5.5 Pro is listed in OpenAI's model docs as a high-compute version of GPT-5.5, available through the Responses API, with a 1,050,000-token context window and 128,000 max output tokens.

Anthropic announced Claude Opus 4.7 on 16 April 2026. Anthropic describes it as available across Claude products, the API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry, with the claude-opus-4-7 model ID.

Those are ceiling models in the obvious sense. They are expensive relative to smaller systems. They sit behind proprietary services. They are the sort of models you reach for when the work is hard enough that reliability, depth, and supervision cost matter more than the token bill.

For coding, the exact winner probably depends on the work. Claude Opus 4.7 looks especially strong for long-running software engineering and agentic workflows. GPT-5.5 Pro is the parallel OpenAI ceiling for hard reasoning, coding, and professional work.

But the ceiling is not the whole story.

The floor is already high

Now look at what is openly available.

DeepSeek V4 Pro is on Hugging Face as a DeepSeek V4 preview model. The model card describes it as a mixture-of-experts model with 1.6 trillion total parameters, 49 billion activated parameters, a 1-million-token context length, and an MIT licence.

Xiaomi MiMo V2.5 Pro is also on Hugging Face. Xiaomi describes it as an open-source MoE model with 1.02 trillion total parameters, 42 billion active parameters, up to a 1-million-token context length, and an MIT licence. The model card is explicit about its target: agentic work, complex software engineering, and long-horizon tasks.

GLM-5.1 belongs in the April floor too. Z.AI lists GLM-5.1 in its 7 April 2026 release notes and frames it around long-horizon agentic engineering. The Hugging Face model is under an MIT licence.

Kimi K2.6 is another late-April floor candidate. Moonshot's model card calls it an open-source, multimodal, agentic model, with 1 trillion total parameters, 32 billion active parameters, a 256K context length, and a modified MIT licence.

Ling-2.6-1T now belongs on the same watchlist. InclusionAI's Hugging Face repo lists an MIT licence and 1.026 trillion parameters, and its config uses a 262K context window. The model card frames it around lower token overhead, tool calling, coding, and reliable multi-step execution. Artificial Analysis currently gives it an Intelligence Index score of 34, so it is more a fresh open-floor signal than the model that sets the April gap.

That list reads strangely if you remember where the field was three years ago.

A consumer electronics company has released a trillion-parameter open model aimed at agentic software work. DeepSeek has released another trillion-scale open model with a million-token context window. Z.AI, Moonshot, and InclusionAI are pushing the open floor directly into long-running engineering agents. All of this is sitting behind normal Hugging Face links.

This is what I mean by the floor.

It is not a weak backup. It is not a small local model that can autocomplete a function if you hold it carefully. It is a serious capability layer. If the proprietary frontier vanished, the world would not fall back to early ChatGPT. It would fall back to open trillion-scale models with long context, permissive licences, and enough capability to keep a lot of AI work moving.

[Diagram: an open AI floor made from model weights, repositories, benchmarks, and agent loops, rising toward a distant ceiling line. The floor rises when public weights, serving stacks, benchmarks, and agent tooling compound together.]

Why the floor matters more than it sounds

The ceiling moves first because frontier labs have more compute, more capital, and more control over the whole stack. A new capability usually appears there first.

But capabilities migrate downward.

Long context used to be a frontier differentiator. Now open models are claiming million-token windows. Strong coding used to sit mostly with closed systems. Now the open floor is clearly targeting complex coding agents. Tool use, reasoning traces, instruction following, agentic loops, and efficient MoE serving have all moved from rare frontier capability into the open ecosystem.

That migration changes the shape of the AI argument.

If a capability only exists in a closed lab, it is fragile. It can be rate-limited, repriced, regulated, withdrawn, or region-locked. Once the weights are public and the licence is permissive, the capability becomes part of the technical environment. It can still be hard to run. It can still be expensive. It can still be weaker than the ceiling. But it is no longer held in one place.

This is why I do not think there is a realistic return to pre-AI times.

You can imagine policy changes, market corrections, lab failures, API bans, or product reversals. None of those remove the models already released. The weights are distributed. The serving stacks improve. Quantisation gets better. Fine-tunes appear. Tooling grows around the models. Teams learn how to use them.

The floor has its own momentum now.

The gap is real

This does not mean the floor and ceiling are the same.

If I had to run a difficult production migration, review a subtle codebase change, or hand off a long research task where mistakes are expensive, I would still start with the ceiling. GPT-5.5 Pro and Claude Opus 4.7 should be more reliable on the hardest work. They will usually have better product scaffolding, safer defaults, stronger tool integration, and more consistent performance.

The gap still exists.

The point is that the gap is no longer the difference between "AI" and "no AI". It is the difference between the best available AI and the best openly available AI.

[Diagram: a proprietary AI ceiling above an open AI floor, with dashed measurement lines marking the gap between them. The gap still matters, but it is no longer a gap between capability and no capability.]

That distinction matters.

Once the floor is high enough, a lot of work becomes durable. A school, small company, solo developer, research group, or country without clean access to proprietary APIs can still build with capable models. Not always at the ceiling. But far above the old baseline.

That is the part that changes the world underneath the model leaderboard.

Track the floor monthly

The useful thing to watch now is not a single snapshot. It is the trajectory.

Some months the ceiling jumps and the gap widens. Then the floor catches up. Sometimes the open model is not broadly better, but it becomes good enough in one domain: coding, long context, maths, search, agents, multimodal work. Those domain-specific jumps are how the floor rises.

That is why I want the Floor / Ceiling tracker to be monthly.
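One way to make the trajectory concrete is to store one record per month and domain, then look at how the gap changes from one month to the next. The sketch below is an assumption about how such a record could be kept, not the tracker's real schema: `MonthlyGap`, `gap_changes`, and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyGap:
    month: str      # ISO year-month, e.g. "2026-04", so string order is date order
    domain: str     # e.g. "coding", "agents", "general reasoning"
    ceiling: float  # best proprietary score that month
    floor: float    # best open score that month

    @property
    def gap(self) -> float:
        return self.ceiling - self.floor


def gap_changes(records):
    """Return {domain: [(month, change in gap versus the previous month), ...]}.

    A positive change means the gap widened; a negative one means the floor
    closed some of the distance.
    """
    by_domain = {}
    for rec in sorted(records, key=lambda r: (r.domain, r.month)):
        by_domain.setdefault(rec.domain, []).append(rec)
    return {
        domain: [(cur.month, cur.gap - prev.gap) for prev, cur in zip(rows, rows[1:])]
        for domain, rows in by_domain.items()
    }


# Placeholder history for illustration only (NOT real benchmark scores).
history = [
    MonthlyGap("2026-03", "coding", ceiling=60.0, floor=50.0),
    MonthlyGap("2026-04", "coding", ceiling=66.0, floor=52.0),  # ceiling jumps
    MonthlyGap("2026-05", "coding", ceiling=66.0, floor=58.0),  # floor catches up
]

print(gap_changes(history))
# {'coding': [('2026-04', 4.0), ('2026-05', -6.0)]}
```

Keeping domains separate matters here, because a floor that catches up in coding can still be far behind in multimodal work during the same month.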

The question each month is simple:

  • What is the current proprietary ceiling?
  • What is the current open floor?
  • Which domain moved: coding, general reasoning, agents, multimodal work?
  • If the ceiling disappeared, what would still be possible?

That last question is the one I care about.

The ceiling tells us where the frontier is. The floor tells us what the world already has.

Right now, for coding and agentic work, the floor looks something like Kimi K2.6, GLM-5.1, DeepSeek V4 Pro, and Xiaomi MiMo V2.5 Pro. The ceiling looks something like GPT-5.5 Pro and Claude Opus 4.7.

That is not a return-to-normal situation. That is a permanently changed baseline.

Try this prompt

Pick one AI capability and estimate its floor and ceiling. Define the proprietary frontier version, the strongest open or widely available alternative, the practical gap between them, and which parts of the workflow would still survive if the frontier disappeared tomorrow. End with what this changes about my dependence on closed models.
