Open Source AI Model Surpasses Meta & OpenAI At Lower Cost
DeepSeek V3, a 671B-parameter Mixture-of-Experts large language model from Chinese startup DeepSeek, outperforms models from Meta and OpenAI. Trained in 2.78M GPU hours at a cost of $5.58M, it is more efficient and cost-effective than its competitors.
A Chinese startup based in Hangzhou, DeepSeek, recently made headlines after launching its new large language model, DeepSeek V3. The model is reported to perform better and more efficiently than existing models from well-established players such as Meta and OpenAI.

Key Features of DeepSeek V3

Parameters and Architecture: DeepSeek V3 has 671 billion parameters and uses a Mixture-of-Experts (MoE) architecture. Under this design, only 37 billion of the model's parameters are activated for any given task, which greatly increases efficiency without sacrificing capacity[2][6]. A sketch of how this kind of routing works appears below.

Training Effici...
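To make the MoE idea concrete, here is a minimal sketch in Python (using PyTorch) of top-k expert routing: a small gating network scores the experts, and each token is processed only by its top-scoring few, so only a fraction of the total parameters is active per token. The sizes used here (4 experts, top-2 routing, 64-dimensional tokens) are toy values for illustration, not DeepSeek V3's actual configuration, and the routing loop is a simplification of production implementations.

# Minimal Mixture-of-Experts routing sketch, for illustration only.
# Toy sizes; not DeepSeek V3's actual architecture or router.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, dim=64, num_experts=4, top_k=2):
        super().__init__()
        # Each "expert" is just a linear layer in this sketch.
        self.experts = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_experts)]
        )
        self.gate = nn.Linear(dim, num_experts)  # router: scores each expert per token
        self.top_k = top_k

    def forward(self, x):
        # x: (tokens, dim). The router selects the top-k experts per token,
        # so only a fraction of the layer's parameters runs for each token.
        scores = F.softmax(self.gate(x), dim=-1)             # (tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)   # top-k experts per token
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 64)
layer = ToyMoELayer()
print(layer(tokens).shape)  # torch.Size([8, 64])

The key property this sketch illustrates is the decoupling of total capacity from per-token compute: adding more experts grows the parameter count, but each token still activates only top_k of them, which is how a 671B-parameter model can engage just 37B parameters per task.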