Overview
MiniMax-M2.5 is an open-weight large language model released by MiniMax, the Shanghai-based AI lab behind the M2 series. It uses a Mixture-of-Experts design with 229B total parameters and about 10B active per token, routing each token through 8 of its experts. The model ships in fp8 on Hugging Face and arrived in early 2026 as the third release in the M2 family, following M2 and M2.1 within roughly three and a half months. A faster sibling, M2.5-Lightning, has identical capability but higher throughput.
What it's good at
M2.5 was trained with reinforcement learning across more than 200,000 real-world environments, and it targets agentic work rather than chat alone. It reports 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp. Coding spans over ten languages, including Go, Rust, TypeScript, Python, Java, and C++, across web, mobile, and server projects. Before writing code it tends to plan like an architect, decomposing features and structure first. It also handles tool calling, web search, and office deliverables such as Word documents, slides, and Excel financial models. The model thinks step by step using a built-in reasoning trace and runs natively at up to 100 tokens per second.
Running locally
The 10B active parameters keep compute low, but all 229B weights still need to fit in memory, so the fp8 checkpoint is around 230 GB. A single 24 GB GPU cannot hold it; realistic local setups use 96 GB or more of combined VRAM and system RAM, such as 2x H100 80 GB or several consumer GPUs with CPU offload. Quantized GGUF builds help: a 3-bit quant is about 101 GB and runs near 25 tokens per second on an 80 GB H100. MiniMax recommends SGLang, vLLM, Transformers, or KTransformers for serving, with sampling at temperature 1.0, top_p 0.95, and top_k 40.
License
MiniMax-M2.5 is distributed under a Modified MIT license. You can download the weights, run them, fine-tune them, and use the model commercially. It is open-weight rather than fully open-source, since the training data and full pipeline are not published.

