Overview
DeepSeek-Coder-V2-Lite-Instruct is the smaller member of the DeepSeek-Coder-V2 family, released by DeepSeek AI in June 2024. It is a Mixture-of-Experts (MoE) model with 16B total parameters, of which only 2.4B are active for any given token. DeepSeek built the family by continuing the pretraining of an intermediate DeepSeek-V2 checkpoint on an additional 6 trillion tokens, heavily weighted toward source code and math. The Instruct variant is the chat-tuned version meant for interactive coding help; a Base variant ships alongside it for raw completion.
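To make the "active parameters" idea concrete, here is a toy sketch of generic top-k expert routing. It is not DeepSeek's routing code: the expert count, the router scores, and the function name top_k_route are all hypothetical, chosen only to show why most experts are skipped for any one token.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    # Softmax over all experts (subtract the max for numerical stability).
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts; the others do no work for this token.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

# Hypothetical router scores for one token over 8 toy experts.
weights = top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(sorted(weights))  # indices of the experts that run for this token: [1, 4]
```

Only the selected experts' weights are used in the forward pass for that token, which is how a model can have far more total parameters than active ones.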
What it's good at
The model is built for programming. It supports 338 programming languages and handles code completion, fill-in-the-middle insertion, generation, and debugging. On HumanEval it scores around 81% pass@1 and it also does well on MBPP, while the math-heavy pretraining shows in benchmarks such as GSM8K. The 128K context window means it can read large files or several files at once, which helps with repository-level questions. The much larger 236B-parameter sibling reaches GPT-4-Turbo-level results on code tasks; the Lite model gives up some of that accuracy in exchange for running on modest hardware.
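Fill-in-the-middle works by wrapping the code before and after the gap in sentinel tokens and asking the model to generate the missing span. The sketch below only assembles such a prompt as a string. The three sentinel strings are assumptions based on the DeepSeek-Coder family's published examples and should be checked against the tokenizer's special tokens before use; build_fim_prompt is a hypothetical helper name.

```python
# Assumed sentinel strings; verify them against the model's tokenizer.
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in FIM sentinels."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

# The model is expected to generate the body that belongs between
# the function signature and the return statement.
prefix = "def mean(values):\n"
suffix = "\n    return total / len(values)\n"
prompt = build_fim_prompt(prefix, suffix)
print(prompt)
```

The generated text is then spliced between prefix and suffix by the caller; the model itself only returns the middle.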
Running locally
The MoE design keeps per-token compute low, since only the 2.4B active parameters are used for each token, but all 16B parameters must still be held in memory. Full BF16 weights therefore need about 32 GB (16B parameters at 2 bytes each), while 4-bit GGUF quants drop that to roughly 10-12 GB, so the Lite model fits on a single 16 GB GPU or an Apple Silicon Mac with enough unified memory. You can serve it with Hugging Face Transformers (set trust_remote_code=True), vLLM for higher throughput, or llama.cpp and Ollama using community GGUF builds. The chat template uses User:/Assistant: turns delimited by DeepSeek's special begin-of-sentence and end-of-sentence tokens.
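The memory figures above follow from simple arithmetic on the parameter count. The sketch below estimates the size of the weights alone. The 4.5 bits-per-weight figure is an assumption about the effective rate of a typical "4-bit" quant (which also stores scales and keeps some tensors at higher precision), and weight_memory_gb is a hypothetical helper name.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9

TOTAL_PARAMS = 16e9  # total parameter count; all experts must be resident

bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
# Assumption: a "4-bit" GGUF quant averages about 4.5 bits per weight.
q4_gb = weight_memory_gb(TOTAL_PARAMS, 4.5)

print(f"BF16: {bf16_gb:.1f} GB, 4-bit: {q4_gb:.1f} GB")
# prints "BF16: 32.0 GB, 4-bit: 9.0 GB"
```

The KV cache, activations, and runtime overhead come on top of the weights, which is why real-world usage for a 4-bit build lands above this estimate and grows with context length.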
License
The code repository is MIT-licensed, while the weights are covered by the separate DeepSeek Model License. That license permits commercial use, so teams can deploy the model in products, but review its terms before shipping.

