Overview
MiniMax-M3 is an open-weight large language model from MiniMaxAI, the Chinese AI lab behind the MiniMax series. It is a Mixture-of-Experts (MoE) model with about 427B total parameters but only roughly 23B activated per token, so it runs far cheaper than its full size suggests. The architecture pairs MoE routing with MiniMax Sparse Attention (MSA), which is how it sustains a context window of 1,048,576 tokens.
The model is natively multimodal, trained from the start to handle interleaved text and images rather than bolting vision on afterward. On atomic.chat you can pull the weights and run MiniMax-M3 on your own hardware, so prompts, code, and documents stay on your machine, fully offline, with nothing sent to a remote API.
What it is good at
MiniMax-M3 was built around coding and agentic work, with reasoning and vision as first-class capabilities. Three things it handles well:
- Coding and agentic tasks — it posts strong autonomous-agent scores (around 59% on SWE-Bench Pro in MiniMaxAI's reporting), so it can plan, edit across files, and run multi-step tool loops.
- Long-context reasoning — the 1M-token window plus sparse attention lets it read entire codebases or long document sets in one pass and reason over them with the thinking mode.
- Vision and visual understanding — being natively multimodal, it reads screenshots, diagrams, and UI mockups alongside text, which suits debugging from an image or extracting data from a chart.
Running it locally
This is a heavy model. At 427B parameters even quantized builds run large: the smallest GGUF quant lands near 128GB, and a comfortable quant wants 130GB or more of RAM or unified memory. A 512GB Mac Studio M3 Ultra can drive long generations; multi-GPU servers use 8-way tensor parallelism. The full 1,048,576-token context also needs room for the KV cache on top of the weights.
huggingface-cli download MiniMaxAI/MiniMax-M3
Once the weights are local you can serve them through vLLM, SGLang, or llama.cpp, or load the model in Atomic Chat with one click and start chatting without touching a config file.
License
MiniMax-M3 ships under a custom "other" license rather than a standard OSI license, so the exact terms come from MiniMaxAI's own agreement on the model's Hugging Face page. The weights are openly downloadable for local and offline use; check that license text before any commercial deployment to confirm what redistribution and production use it permits.
