DeepSeek released V4 last week. The numbers are striking enough that they deserve a sober read, not the usual hype framing.
V4-Pro processes 1 million tokens of context, the equivalent of all three Lord of the Rings volumes plus The Hobbit. It uses 27% of V3.2's compute and 10% of its memory at that context length, a reduction achieved through a memory-compression approach that summarizes older information while keeping recent text precise.
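For intuition, here is a minimal sketch of the general "compress old, keep recent" idea, not DeepSeek's actual implementation. The window sizes, the summarizer callback, and the whole structure are assumptions for illustration.

```python
# Illustrative sketch of a generic context-compression scheme: recent tokens stay
# verbatim, older blocks collapse into summaries. Not DeepSeek's architecture.
from dataclasses import dataclass, field

RECENT_WINDOW = 8_000   # tokens kept exact (assumed value)
BLOCK_SIZE = 4_000      # older tokens summarized per block (assumed value)

@dataclass
class CompressedContext:
    summaries: list[str] = field(default_factory=list)  # compact stand-ins for old blocks
    recent: list[str] = field(default_factory=list)     # exact recent tokens

    def append(self, tokens: list[str], summarize) -> None:
        self.recent.extend(tokens)
        # When the exact window overflows, fold the oldest block into a summary.
        while len(self.recent) > RECENT_WINDOW:
            block, self.recent = self.recent[:BLOCK_SIZE], self.recent[BLOCK_SIZE:]
            self.summaries.append(summarize(block))

    def prompt(self) -> str:
        # Old material contributes short summaries; recent text stays precise.
        return "\n".join(self.summaries) + "\n" + " ".join(self.recent)
```

The payoff of any scheme in this family is that memory grows with the number of summaries rather than the raw token count, which is the shape of the compute and memory savings described above.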
Performance benchmarks put V4-Pro in the same band as Claude Opus 4.6, GPT-5.4, and Gemini 3.1 on coding, math, and STEM tasks. In a survey of 85 practitioners, over 90% named V4-Pro among their top model choices for coding work.
And it runs on Huawei Ascend 950 chips. This is DeepSeek's first model fully optimized for Chinese hardware.
MIT Technology Review covered the launch this week with a careful framing: open source has never been further along, and the chip dependency story is inverting faster than most Western analysts expected.
The Pricing Math Forces a Conversation
V4-Pro lists at $1.74 per million input tokens and $3.48 per million output tokens. V4-Flash sits at roughly $0.14 per million input tokens.
For comparison, Anthropic and OpenAI's frontier models are several times more expensive at equivalent capability tiers. The gap was always going to close, but the velocity surprised the labs that had assumed open-source progress would lag by 12 to 18 months on serious capabilities.
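To make the list prices concrete, here is the arithmetic for a hypothetical workload. Only the V4-Pro list prices come from the figures above; the monthly volumes and the "several times" multiplier are illustrative assumptions.

```python
# Back-of-envelope cost comparison. Volumes and the 5x frontier multiplier are
# assumptions for illustration; only the V4-Pro list prices are from the article.
V4_PRO_INPUT = 1.74        # $ per million input tokens
V4_PRO_OUTPUT = 3.48       # $ per million output tokens
FRONTIER_MULTIPLIER = 5.0  # assumed stand-in for "several times more expensive"

monthly_input_mtok = 500   # assumed: 500M input tokens per month
monthly_output_mtok = 100  # assumed: 100M output tokens per month

v4_cost = monthly_input_mtok * V4_PRO_INPUT + monthly_output_mtok * V4_PRO_OUTPUT
frontier_cost = v4_cost * FRONTIER_MULTIPLIER

print(f"V4-Pro:   ${v4_cost:,.0f}/month")      # ~$1,218
print(f"Frontier: ${frontier_cost:,.0f}/month")  # ~$6,090 under the assumed multiplier
```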
The closed-source argument has rested on three things. Better performance. Better safety. Better support and reliability. The first is now contestable for a non-trivial subset of workloads. The second remains valid, but the gap is narrowing. The third is real, but it can be priced.
For any business making AI deployment decisions in 2026, the build-versus-buy calculation is being rewritten quarter by quarter. A year ago, the total cost of ownership for an open-source model deployed on owned infrastructure was roughly nine times that of equivalent API usage. It's now closer to two times. By Q4, it may reach parity for high-volume workloads.
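Why volume is the deciding variable: self-hosting is mostly a fixed monthly cost, while API spend scales with tokens, so there is a crossover point. The sketch below uses entirely hypothetical numbers to show the shape of that break-even calculation.

```python
# Hypothetical break-even model for self-hosting vs. API usage.
# All numbers are illustrative assumptions, not vendor quotes.
def breakeven_mtok(monthly_fixed_cost: float, api_price_per_mtok: float,
                   selfhost_marginal_per_mtok: float) -> float:
    """Monthly volume (millions of tokens) at which self-hosting matches API spend."""
    return monthly_fixed_cost / (api_price_per_mtok - selfhost_marginal_per_mtok)

# Assumed: $20k/month for GPUs and ops, $2.00/Mtok blended API price,
# $0.40/Mtok marginal self-host cost (power, bandwidth).
volume = breakeven_mtok(20_000, 2.00, 0.40)
print(f"Break-even at ~{volume:,.0f}M tokens/month")  # ~12,500M tokens
```

As the fixed cost of running open models falls, that crossover point drops, which is the mechanism behind the nine-times-to-parity trajectory.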
Why the Chip Story Is the Real Headline
The Huawei optimization piece deserves more attention than it's getting. For three years, the AI industry has operated on an implicit assumption: Nvidia chips power frontier AI, US export controls shape who gets them, and Chinese AI companies operate at a structural disadvantage from chip access.
V4 is the first major model release where that assumption looks weaker. DeepSeek didn't just port their model to run on Ascend 950 chips. They optimized the architecture for the specific characteristics of those chips, and the result beats what V3.2 achieved on Nvidia hardware on some compute and memory metrics.
That changes the geopolitical and competitive picture. If Chinese AI labs can train and serve frontier-comparable models on domestically produced silicon, the chip-export-control strategy doesn't slow the capability frontier the way Washington has been planning.
For US technology companies, this matters in two ways. Custom silicon investments (Anthropic on Trainium, OpenAI on Microsoft's accelerators, Google on TPUs) are not just cost optimizations anymore. They're risk mitigation against a world where Nvidia's monopoly on AI training is contested. And the Amazon-Anthropic deal I covered last week, with $100 billion in compute commitments, looks more rational in this context. It's chip lock-in, not just funding.
What This Means for Your Stack Decisions
The practical implication for businesses outside the labs is the one most people are missing. The capability gap between paid and open is narrowing. The pricing gap is widening in open's favor. The deployment complexity has dropped substantially over the past year as the tooling matured.
The case for using API-based frontier models still holds for workloads where you need the absolute capability ceiling, where safety and compliance frameworks are load-bearing, or where engineering resources for self-hosting aren't available. But the share of workloads that meets those criteria is shrinking.
For high-volume routine work (code generation, document processing, customer routing, summarization, internal tooling), the build-on-open option is now operationally viable for most mid-market companies and competitive for enterprises. We're seeing it inside our own stack at difrnt.ai when we deploy custom AI solutions for clients. The question has moved from "can we afford this?" to "do we have the team to operate this?"
The real lock-in moving forward isn't going to be at the model layer. It's going to be at the integration layer: the proprietary data, the workflow architecture, the orchestration that makes models useful inside specific business contexts.
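One way to see that point in code: if the orchestration layer talks to models through a thin, provider-agnostic interface, the model behind it becomes a swappable detail while the workflow logic and proprietary data handling remain the durable asset. The interface and function below are a hypothetical sketch, not a real SDK.

```python
# Hypothetical provider-agnostic interface. The business-specific orchestration
# around it, not the model choice, is where the durable investment sits.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

def triage_ticket(model: ChatModel, ticket: str, routing_rules: dict[str, str]) -> str:
    """Prompt construction, routing rules, and data plumbing stay put even if the
    model behind `complete` switches from a hosted API to a self-hosted open model."""
    categories = ", ".join(routing_rules)
    prompt = f"Classify this support ticket into one of [{categories}]:\n{ticket}"
    label = model.complete(prompt).strip().lower()
    return routing_rules.get(label, "human-review")
```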
DeepSeek just made that the strategic question, faster than anyone expected.
FAQ
How does DeepSeek V4 compare to closed-source frontier models?
DeepSeek V4-Pro performs in the same band as Claude Opus 4.6, GPT-5.4, and Gemini 3.1 on coding, math, and STEM tasks according to standard benchmarks. A developer survey reported by MIT Technology Review found that over 90% of 85 surveyed practitioners included V4-Pro among their top model choices for coding work.
What is significant about DeepSeek V4 running on Chinese chips?
V4 is DeepSeek's first model fully optimized for Huawei Ascend 950 chips rather than Nvidia hardware. This challenges the implicit assumption that frontier AI capability requires Nvidia silicon and that US export controls would slow Chinese AI progress. The chip-export-control strategy looks weaker if Chinese labs can ship frontier-comparable models on domestic hardware.
How should businesses think about open-source versus API-based AI deployment in 2026?
The capability gap between paid frontier models and open-source models has narrowed materially. Pricing favors open by several multiples. For high-volume routine workloads, deploying open-source models is now operationally viable for mid-market companies. The remaining advantages of API access are the absolute capability ceiling, integrated safety frameworks, and lower engineering overhead.
