Overview
Kimi-K2-Instruct is the instruction-tuned chat model in Moonshot AI's Kimi K2 series, released in mid-2025. It is a mixture-of-experts (MoE) architecture with 1 trillion total parameters and 32 billion activated per token, drawn from 384 experts with 8 selected per token. The model was pre-trained on 15.5 trillion tokens using Moonshot's MuonClip optimizer, which the team used to scale Muon-based training without the loss instabilities that usually appear at this size. Moonshot calls it a reflex-grade model: it answers directly rather than producing long internal chain-of-thought.
What it's good at
The model is built around agentic work and coding. On SWE-bench Verified it reaches 65.8% pass@1 with bash and editor tools on single-attempt patches, and 47.3% on SWE-bench Multilingual under the same setup. It has native tool calling: you supply the list of available tools in each request and the model decides when and how to invoke them, which makes it a fit for autonomous agents. It also performs well on knowledge, math, and general reasoning benchmarks. Because it skips extended thinking, latency tends to be lower than reasoning-first models, at the cost of step-by-step deliberation on the hardest problems.
Running locally
Self-hosting is heavy. The weights ship in block-fp8 format and the full 1T-parameter model still needs roughly a terabyte of storage and a multi-GPU server with hundreds of gigabytes of combined VRAM. It runs through frameworks such as vLLM, SGLang, and TensorRT-LLM, and the architecture is DeepseekV3-compatible so existing MoE serving stacks work with minor config. For most users a hosted endpoint or Moonshot's OpenAI/Anthropic-compatible API is the practical path rather than local inference.
License
Both the code and the weights are released under a Modified MIT License. It allows commercial use and redistribution. The one added term is an attribution requirement that applies to very large commercial deployments, so most projects can use it freely while large-scale products must display Kimi K2 attribution.


