
Chinese artificial intelligence startup DeepSeek is ready with an advanced model, which is expected to be released in the coming days. According to the South China Morning Post (SCMP), DeepSeek-R2, the successor to R1, will be cheaper and better, giving tough competition to ChatGPT maker OpenAI. Notably, the speculation swirling on social media comes amid an intensifying US-China tech war. It also comes months after the startup released two advanced open-source AI models, V3 and R1, which were built at a fraction of the cost and computing power that major tech companies typically require for large language model (LLM) projects.
What to expect?
According to SCMP, the new model, R2, is said to have been developed with a so-called hybrid mixture-of-experts (MoE) architecture, making it 97.3 per cent cheaper than OpenAI's GPT-4o model. MoE is a machine-learning approach that divides an AI model into separate sub-networks, or experts, which jointly perform a task. This greatly reduces computation costs during pre-training and achieves faster performance at inference time, the outlet reported.
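For readers curious how a mixture-of-experts layer saves computation, the idea can be sketched in a few lines of Python. This is a generic top-k routing illustration, not DeepSeek's design: the R2 architecture has not been published, and every name here (`make_expert`, `moe_forward`, `gate_weights`, `top_k`) is hypothetical. A small gate scores each expert for a given input, and only the highest-scoring experts are actually run.

```python
import math


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: a tiny sub-network that just scales its input."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k experts and blend their outputs.

    gate_weights holds one weight row per expert (an assumption made
    for illustration); the gate score is the dot product of row and x.
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)

    # Keep only the top_k experts; the rest are never evaluated,
    # which is where the compute saving comes from.
    ranked = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]

    # Renormalise the gate probabilities over the chosen experts.
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm
        expert_out = experts[i](x)
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out, chosen


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, used = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("experts used:", used)
```

In this toy setup only two of the four experts run for each input, so roughly half of the expert computation is skipped. Production MoE models apply the same principle at a far larger scale and with learned gates.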
Experts have claimed that R2 has "better vision" than R1, which had no vision functionality. Additionally, it is expected to feature 1.2 trillion parameters and to be trained on 5.2 petabytes of data.
With this new model, DeepSeek could position Huawei as the first major challenger to NVIDIA, experts said. The AI startup is also planning to overtake Meta as the dominant player in the open-source AI category by making its own models free to use, they added.
A Reuters report in March said DeepSeek was preparing to launch R2 in April, but the company has yet to confirm the date.
DeepSeek-V3-0324
DeepSeek has rapidly emerged as a notable player in the global AI landscape in recent months, releasing a series of models that compete with Western counterparts while offering lower operational costs.
In March, the company released a major upgrade to its V3 large language model, intensifying competition with US tech leaders like OpenAI and Anthropic. According to Reuters, the new model was made available through AI development platform Hugging Face, marking the company's latest push to establish itself in the rapidly evolving AI market.
At the time, experts said DeepSeek-V3-0324 demonstrated significant improvements in areas such as reasoning and coding compared to its predecessor, with benchmark tests published on Hugging Face showing enhanced performance across multiple technical metrics.