Overview
Phi-4-mini-instruct is a 3.8-billion-parameter open language model released by Microsoft in February 2025 as part of the Phi-4 family. It is a dense, decoder-only Transformer trained on synthetic data and filtered web content, with a deliberate emphasis on reasoning-heavy material. Architecturally it differs from its Phi-3.5-mini predecessor through a larger 200K-token vocabulary, grouped-query attention, and shared input and output embeddings. The model was trained on 512 A100-80G GPUs between November and December 2024, with a data cutoff of June 2024, and supports a 128K-token context window.
What it's good at
For its size, Phi-4-mini-instruct is strong at math and structured reasoning. It scores 88.6 on GSM8K (8-shot CoT) and 64.0 on MATH, beating several larger 7B-9B models on those tasks, and reaches 70.4 on BigBench Hard. It handles 24 languages, follows instructions reliably thanks to supervised fine-tuning plus direct preference optimization, and adds proper function calling, where tools are declared as JSON in the system prompt. Microsoft is candid about the trade-off: a 3.8B model cannot store much factual knowledge, so it can be factually wrong on long-tail topics. Pairing it with retrieval (RAG) is the recommended fix.
Running locally
The small size makes local deployment easy. A 4-bit quantized build needs around 3 to 4 GB of VRAM, so it fits comfortably on an 8 GB consumer GPU, and the full-precision weights sit near 8 to 9 GB. It runs through Hugging Face transformers, vLLM, llama.cpp, and Ollama (as phi4-mini). The default transformers path uses flash attention and expects an Ampere-class or newer card; on V100 or older GPUs, load it with attn_implementation set to eager. Python 3.8 or 3.10 is recommended for the reference setup.
License
Phi-4-mini-instruct is released under the MIT license. That allows commercial and research use, modification, and redistribution with attribution and almost no other restrictions, which makes it one of the more permissive options among small instruct models.
