Browse Models
DeepSeek-VL2 is a family of three open-weight Mixture-of-Experts (MoE) vision-language models that combine a dynamic tiling strategy for image encoding with a Multi-head Latent Attention (MLA) architecture. The family spans Tiny (1.0B activated parameters), Small (2.8B activated), and the standard variant (4.5B activated). Each model supports a 4096-token context length and targets multimodal reasoning, visual question answering, document analysis, and visual grounding across research and applied scenarios.