MiniMax's frontier model with a 1M-token context window, native multimodality, and MiniMax Sparse Attention (MSA). Scores 59% on SWE-Bench Pro.
Agentic coding, long-context tasks, multimodal workflows
1M token context via MiniMax Sparse Attention
Native multimodal (image, video, computer use)
59% SWE-Bench Pro, 66% Terminal-Bench 2.1
Toggleable thinking mode for complex reasoning
Choose MiniMax M3 for long-horizon agentic coding, complex multi-file engineering, and tasks that need very long context (up to 1M tokens). Native multimodality means it can read screenshots and diagrams alongside code.
Weights not yet open-sourced. For quick, simple tasks, lighter models are faster and cheaper.
42 models available. Switch per message. Start free.