Overview
Kimi-K2-Instruct-0905 is a large language model from Moonshot AI, the September 2025 refresh of the Kimi K2 line. It uses a mixture-of-experts (MoE) design with about 1 trillion total parameters and roughly 32 billion active per token, so only a slice of the network fires on each request. The instruct variant is tuned for chat, tool use, and agentic coding rather than raw pretraining.
In Atomic Chat the appeal is keeping all of that on your own machine. Once the weights are downloaded the model runs on-device, with no request leaving your hardware and no account or connection required to use it. Your prompts, code, and documents stay local.
What it is good at
The model carries the code, tools, and reasoning capabilities, and its agentic and long-context tags point at where it earns its keep:
- Agentic coding — it can plan a multi-step change, call tools, and work through a task across many turns, which is the use case Moonshot AI pushed hardest in the 0905 update, including frontend work.
- Tool calling — you pass a list of available functions and the model decides when and how to invoke them, so it slots into agent loops and local automation.
- Long-context work — the 256K-token window lets it hold a large codebase, a long transcript, or several documents in one session without losing the thread.
Running it locally
This is a server-grade model. At 1026.5B total parameters even quantized builds are heavy: community GGUF quants land around 250GB+ of combined system RAM and VRAM, and higher-precision versions need far more. The 256K context also adds memory on top of the weights, so plan for a workstation with a lot of RAM rather than a single consumer GPU.
huggingface-cli download moonshotai/Kimi-K2-Instruct-0905
From there you can serve the weights through an inference engine like vLLM or SGLang, or load a quantized build through Atomic Chat to run it on-device without wiring up a server yourself.
License
Kimi-K2-Instruct-0905 ships under a custom license (listed as "other"), which Moonshot AI describes as a Modified MIT License. The weights are openly available to download and run, including local and commercial use, with the modified terms attached by the publisher — check the license text on the model's Hugging Face page before deploying at scale.

