
All you need to know about DeepSeek's new AI model


This photo illustration shows the DeepSeek app on a mobile phone in Beijing on January 27, 2025. [AFP]

Chinese startup DeepSeek released a new artificial intelligence model with "drastically reduced" costs Friday, more than a year after it stunned the world with a low-cost reasoning model that matched the capabilities of US rivals.

The AI race has intensified the rivalry between China and the United States, with the White House on Thursday accusing Chinese entities of a massive effort to steal artificial intelligence technology. Beijing called the claim "baseless".

Hangzhou-based DeepSeek burst onto the scene in January last year with a generative AI chatbot, powered by its R1 reasoning model, that upended assumptions of US dominance in the strategic sector.

DeepSeek-V4 "features an ultra-long context", the company said in a statement on social media platform WeChat, hailing it as "world-leading... with drastically reduced compute (and) memory costs" in a separate announcement on X.

V4 supports a context length of one million "tokens" -- small components of text including words or punctuation -- putting it on par with Google's Gemini.

Context length determines how much input a model is able to absorb to help it complete tasks.
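As a rough illustration only (not DeepSeek's actual tokenizer, which, like most modern models, uses finer-grained subword splitting), the idea of tokens and a context window can be sketched like this:

```python
# Rough sketch: approximate tokenization by splitting on whitespace.
# Real tokenizers (e.g. byte-pair encoding) split text more finely,
# treating punctuation and word fragments as separate tokens.
def approx_tokens(text):
    return text.split()

def truncate_to_context(tokens, context_length):
    # The context window caps how many tokens the model can attend to;
    # anything beyond the limit is simply not seen.
    return tokens[:context_length]

doc = "A context window caps how much input a model can absorb at once."
tokens = approx_tokens(doc)
print(len(tokens))
print(truncate_to_context(tokens, 4))
```

A one-million-token window, as claimed for V4, would let a model ingest entire books or large codebases in a single pass rather than in truncated fragments.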

The new V4 is released in two versions, DeepSeek-V4-Pro and DeepSeek-V4-Flash, with the latter being "a more efficient and economical choice" because it has fewer parameters.

In terms of "world knowledge", a benchmark for reasoning, V4-Pro trails only the latest Gemini model, DeepSeek said.

A "preview version" of the open source model is now available, the company said, without indicating when a final version would be released.

Experts say V4's arrival marks an "inflection point" in terms of hardware and cost.

"This addresses the long-standing issues of slower performance and higher costs associated with long context lengths, marking a genuine inflection point for the industry," Zhang Yi, the founder of tech research firm iiMedia, told AFP.

"For end users, this will bring widespread, accessible benefits. For instance, if ultra-long context support becomes a standard feature, long-text processing is expected to move beyond high-end research labs and enter mainstream commercial applications," he said.

V4-Pro has 1.6 trillion parameters while V4-Flash has 284 billion; parameters are the internal values, tuned during training, that shape a model's decision-making ability.

The model has also been "optimised" for popular AI Agent products such as Claude Code, OpenClaw, OpenCode and CodeBuddy, the DeepSeek statement said.

DeepSeek's latest release is a "milestone" for Chinese firms, said veteran AI industry analyst Max Liu.

"It's a good thing for the entire domestic AI industry. It can provide better models for domestic users and we can now expect a lot more things -- more products (and a) more competitive market," he told AFP.

"This is no less shocking than when DeepSeek first came out" if its new model indeed matches the performance of leading models from Western labs, he added.