Chinese AI startup DeepSeek has unveiled its new reasoning language model, DeepSeek-R1, which rivals OpenAI’s o1 in performance across tasks like mathematics, coding, and general reasoning.
Built on the DeepSeek-V3 mixture-of-experts model, DeepSeek-R1 is released as open source, demonstrating that openly available systems can compete with closed commercial models. It is also far cheaper to use, with API pricing roughly 90-95% lower than OpenAI’s o1.
DeepSeek-R1’s development highlights significant progress toward artificial general intelligence (AGI). The model refines its reasoning through an iterative combination of reinforcement learning (RL) and supervised fine-tuning (SFT).
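On the RL side, DeepSeek’s technical report credits Group Relative Policy Optimization (GRPO), an algorithm that scores each sampled answer against the other answers in its group rather than training a separate value model. The following is a minimal sketch of the group-relative advantage computation only, with toy rewards and an illustrative helper name, not DeepSeek’s actual training code:

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each sampled answer's reward
    against the mean and std of its own group (no value network)."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mean) / std for r in rewards]

# Toy example: four sampled answers to one prompt, rewarded 1.0 if correct.
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
# -> [1.0, -1.0, -1.0, 1.0]: correct answers are reinforced, wrong ones penalized
```

Because the baseline comes from the group itself, correct answers only earn a large advantage when the model does not already get them right consistently, which pushes training toward problems at the edge of its ability.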
On benchmarks, the model scored 79.8% on the AIME 2024 mathematics competition, 97.3% on the MATH-500 benchmark, and achieved a 2,029 rating on Codeforces, placing it above 96.3% of human participants.
By comparison, OpenAI o1 scored 79.2% on AIME 2024, 96.4% on MATH-500, and reached the 96.6th percentile on Codeforces. DeepSeek-R1 also posted 90.8% on the MMLU general-knowledge benchmark, nearly matching o1’s 91.8%.
DeepSeek-R1 builds on the earlier DeepSeek-R1-Zero, which was trained purely through RL with no supervised fine-tuning at all. That initial model showed impressive emergent reasoning behaviors but suffered from poor readability and frequent language mixing.
To address these issues, the team reintroduced supervised fine-tuning on a small set of curated cold-start data, creating a multi-stage pipeline that improved both performance and readability.
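Per the technical report, that pipeline alternates supervised and RL phases roughly as follows. This is a schematic only; the function names and data labels are illustrative placeholders, not DeepSeek’s actual tooling:

```python
def sft(model, data):
    # Placeholder for a supervised fine-tuning phase on `data`.
    return f"{model} -> SFT[{data}]"

def rl(model, reward):
    # Placeholder for a reinforcement-learning phase with `reward`.
    return f"{model} -> RL[{reward}]"

def train_deepseek_r1(base="DeepSeek-V3-Base"):
    """Schematic of the staged recipe described in the R1 technical report."""
    m = sft(base, "cold-start long-CoT data")       # fixes R1-Zero's readability
    m = rl(m, "rule-based reasoning rewards")       # reasoning-oriented RL
    m = sft(m, "rejection-sampled + general data")  # second SFT round
    m = rl(m, "helpfulness/harmlessness rewards")   # final all-scenario RL
    return m

print(train_deepseek_r1())
```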
The affordability of DeepSeek-R1 is another major milestone. While OpenAI o1 charges $15 per million input tokens and $60 per million output tokens, DeepSeek-R1 reduces these costs to $0.55 and $2.19, respectively.
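Working those list prices through directly, the per-token savings come out to roughly 96% on both input and output; the widely quoted 90-95% range is a more conservative summary of the same gap. A quick check, ignoring cache discounts and volume tiers:

```python
# Per-token list-price comparison, USD per million tokens: (o1, R1).
prices = {"input": (15.00, 0.55), "output": (60.00, 2.19)}

for kind, (o1, r1) in prices.items():
    savings = (1 - r1 / o1) * 100
    print(f"{kind}: {savings:.1f}% cheaper")  # both come out near 96%
```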
This cost advantage, combined with its competitive performance, positions DeepSeek-R1 as a major player in the global AI landscape.