Overview
Nex-N2-Pro is an open-weight large language model from nex-agi, released in June 2026 under the Apache 2.0 license. It uses a mixture-of-experts (MoE) design built on the Qwen3.5 architecture, with 396.8B total parameters but only about 17B active per token. That sparsity is what makes it tractable to run yourself: you get the capacity of a frontier-scale model while paying inference cost closer to a 17B dense model.
The model handles both text and images as input and produces text. In Atomic Chat it runs fully on-device, so prompts, code, and documents never leave your machine. There is no API key, no usage metering, and no network round-trip once the weights are downloaded, which means it keeps working offline.
What it is good at
Nex-N2-Pro is tuned for agentic work and carries capabilities for reasoning, tool calling, vision, and code. A few concrete things it does well:
- Coding and debugging — it writes, reads, and fixes code across a repository, and is built for the plan-implement-debug loop rather than one-off snippets.
- Tool calling and agent loops — it can call functions and chain multi-step tool use, so you can wire it into local scripts, file operations, or a research workflow.
- Vision and long-context reasoning — it reads images alongside text and reasons over inputs up to 262,144 tokens, enough to hold large codebases or long documents in a single session.
Running it locally
At 396.8B total parameters the full-precision weights are large, but the MoE layout and quantization bring it within reach of high-memory workstations. A 4-bit GGUF build lands around 214-256 GB of combined memory, which runs on a 256 GB Mac Studio or on a 24 GB GPU paired with large system RAM using llama.cpp MoE offloading. The 262,144-token context is available locally, bounded by how much memory you can spare for the KV cache.
huggingface-cli download nex-agi/Nex-N2-Pro
You can load it with Transformers or vLLM for scripted use, run quantized GGUF builds through llama.cpp, or open it in Atomic Chat, which downloads the weights and sets up the runtime in one click.
License
Nex-N2-Pro ships under Apache 2.0. You can use it commercially, modify the weights, fine-tune it, and redistribute your own builds, as long as you keep the license and attribution notices. There is no per-token fee and no separate commercial agreement to sign.
