Overview
diffusiongemma-26B-A4B-it is an open-weight text diffusion model from Google DeepMind, built on the Gemma 4 architecture. It is a Mixture-of-Experts (MoE) design: 25.8B total parameters, but only about 3.8B activate on each forward pass, which keeps per-pass compute far below what the headline size suggests. All 25.8B weights still have to be loaded, so memory use follows the total count rather than the active one. It is multimodal, accepting text, image, and audio input, and ships under the Apache 2.0 license.
What sets it apart is how it writes. Instead of predicting one token at a time, it starts from a block of placeholder tokens and refines them across several denoising passes, generating up to 256 tokens in parallel per pass. Running it inside Atomic Chat keeps every prompt, file, and response on your own machine. No request leaves your device, so it works on a plane, behind a firewall, or anywhere offline.
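The refinement loop described above can be pictured with a toy sketch. This is not the model's actual sampler: the function name `toy_denoise`, the `<mask>` placeholder string, the random confidence scores, and the even reveal schedule are all assumptions made for illustration. The `target` list stands in for what a real model would predict from the partially filled block on each pass.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, chosen for this sketch


def toy_denoise(target, passes=4, seed=0):
    """Fill a fully masked block over several passes.

    `target` stands in for the model's predictions. A real diffusion model
    would re-predict every position from the current block on each pass;
    here we only imitate the schedule of committing a few positions at a time.
    Returns the final block and a snapshot of the block after every pass.
    """
    rng = random.Random(seed)
    block = [MASK] * len(target)
    history = [list(block)]
    for step in range(passes):
        masked = [i for i, tok in enumerate(block) if tok == MASK]
        if not masked:
            break
        # Commit an even share of the remaining positions on each pass
        # (ceiling division), so the last pass clears whatever is left.
        remaining_passes = passes - step
        k = -(-len(masked) // remaining_passes)
        # Hypothetical confidence scores: random here, model-derived in practice.
        scores = {i: rng.random() for i in masked}
        chosen = sorted(masked, key=lambda i: scores[i], reverse=True)[:k]
        for i in chosen:
            block[i] = target[i]
        history.append(list(block))
    return block, history


if __name__ == "__main__":
    words = "the quick brown fox jumps over the lazy dog".split()
    final, history = toy_denoise(words, passes=4)
    for snapshot in history:
        print(" ".join(snapshot))
```

Each printed line shows the block after one pass. Positions are committed in no particular left-to-right order, which is the property that distinguishes this style of decoding from token-by-token generation.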
What it is good at
The parallel, bidirectional approach gives diffusiongemma-26B-A4B-it a structural edge on tasks where the model needs to see the whole output at once rather than guess left to right.
- Code infilling: filling a gap in the middle of a file, where the model can attend to code on both sides of the cursor before it writes.
- Inline editing: rewriting a single sentence in place, where the model returns a local replacement quickly and draws on its code and reasoning capabilities.
- Multimodal and multilingual work: vision and audio inputs handle screenshots or clips, multilingual support covers prompts across many languages, and the 256K context window accommodates long documents.
Running it locally
The model has 25.8B parameters and a 256K context length. Only ~3.8B parameters are active at inference, which cuts compute per pass, but every expert still has to be resident in memory, so the footprint is set by the total parameter count. A 4-bit quant of the 26B-A4B class fits in roughly 18GB of VRAM, which puts it within reach of a 24GB consumer card like an RTX 4090, or a 32GB RTX 5090 with more room to spare. Budget extra headroom for the KV cache, since long contexts grow memory use beyond the weights alone. Pull the weights with:
huggingface-cli download google/diffusiongemma-26B-A4B-it
You can load it through Transformers or vLLM for scripted use, or open it in Atomic Chat with one click and start chatting without touching a config file.
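For a rough sense of the memory budget, the arithmetic can be sketched as below. The weight estimate assumes the footprint scales with the total parameter count at a flat number of bits per weight. Real quantization formats carry extra scale data and runtimes add their own overhead, which is presumably why the ~18GB figure quoted above is higher than the raw 4-bit number. The KV cache helper uses the generic full-attention formula, and the layer, head, and dimension values passed to it are hypothetical placeholders, not this model's configuration.

```python
def weight_footprint_gb(total_params, bits_per_weight):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


def kv_cache_gb(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Generic full-attention KV cache size: keys plus values for every layer."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value / 1e9


TOTAL_PARAMS = 25.8e9  # total parameter count from the model description

for bits in (16, 8, 4):
    size = weight_footprint_gb(TOTAL_PARAMS, bits)
    print(f"{bits}-bit weights: ~{size:.1f} GB")

# Hypothetical architecture values, for illustration only.
cache = kv_cache_gb(layers=30, kv_heads=8, head_dim=128, seq_len=32768)
print(f"KV cache at 32K tokens (hypothetical config): ~{cache:.1f} GB")
```

Because the cache term grows linearly with `seq_len`, a prompt near the 256K limit would need about eight times the cache of a 32K prompt under the same assumptions.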
License
diffusiongemma-26B-A4B-it is released under the Apache 2.0 license. That permits commercial use, modification, and redistribution, so you can run it locally, fine-tune it, and ship it inside products without a usage fee.
