DeepSeek, an innovative AI lab, has released its latest model, DeepSeek-R1-0528-Qwen3-8B, a distilled version of its updated R1 reasoning model designed to run efficiently on a single GPU. The model is now available on the AI development platform Hugging Face, another example of DeepSeek's effort to make cutting-edge AI accessible while staying mindful of user hardware constraints.
DeepSeek built DeepSeek-R1-0528-Qwen3-8B using text generated by the updated R1 model, which it used to fine-tune Qwen3-8B, the model Alibaba released in May. The result is a model that is smaller and cheaper to run while maintaining strong performance. DeepSeek asserts that this distilled version outperforms models of similar size on several important benchmarks, and in particular beats Google's Gemini 2.5 Flash on the most challenging math questions from the AIME 2025 dataset.
DeepSeek-R1-0528-Qwen3-8B runs on a single GPU with 40GB-80GB of memory, such as Nvidia's powerful H100. By comparison, the full-size new R1 model requires up to a dozen 80GB GPUs to function effectively. This shift to smaller but more efficient models matches a rapidly accelerating industry trend toward efficiency and low-cost accessibility in large language models.
DeepSeek-R1-0528-Qwen3-8B is currently available from several hosts, including LM Studio, which offers the model through an API. This significantly improves the accessibility of the model and encourages adoption by developers and researchers alike.
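For readers curious what querying a locally hosted copy looks like, here is a minimal sketch against an OpenAI-compatible chat-completions endpoint, the interface local hosts such as LM Studio typically expose (by default at http://localhost:1234/v1). The model identifier and temperature below are assumptions for illustration; check the name your host actually reports.

```python
# Hedged sketch: query a locally served DeepSeek-R1-0528-Qwen3-8B through an
# OpenAI-compatible chat-completions endpoint. URL and model name are
# assumptions -- adjust them to match your local server's configuration.
import json
import urllib.request

API_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio's default
MODEL = "deepseek-r1-0528-qwen3-8b"  # hypothetical identifier; verify locally


def build_request(prompt: str) -> dict:
    """Assemble a chat-completions payload for a single user prompt."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,  # within the range DeepSeek suggests for R1-series models
    }


def ask(prompt: str) -> str:
    """POST the payload to the local server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask("What is 17 * 24? Show your reasoning."))
```

Because the endpoint follows the OpenAI wire format, the same payload works with the official `openai` client library by pointing its `base_url` at the local server.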
DeepSeek emphasizes the potential applications of its new model, stating that it is intended "for both academic research on reasoning models and industrial development focused on small-scale models." The dedicated DeepSeek-R1-0528-Qwen3-8B page on Hugging Face provides documentation and tools for those ready to try the model.