pablo-reyes8/deepseek-v4-mini-pytorch
From-scratch, paper-faithful PyTorch implementation of the DeepSeek-V4 architecture for transparent study, testing, ablation, and mini-scale training.
GitHub repository with 5 stars and 0 forks.
Language: Python
Topics: deep-learning, deepseek, deepseek-v4, from-scratch, language-model, llm, long-context, mixture-of-experts, pytorch, research-implementation