Models — Atomic Chat

DeepSeek-V4-Pro

Run DeepSeek-V4-Pro, a large MoE model, locally with Atomic Chat. Private, offline reasoning and coding — no cloud, no limits. Free.

Thinking

Reasoning

Code

Multilingual

gemma-4-31B-it

Run Gemma 4 31B locally with Atomic Chat — a capable Google model, fully offline and private. No API keys, no limits. Free.

Thinking

Embedding

Vision

Audio

Reasoning

LocateAnything-3B

Run LocateAnything-3B, a compact vision model, locally in Atomic Chat. Offline image understanding, fully private. Download free.

Thinking

Vision

Reasoning

Code

Qwen3.6-35B-A3B

Run Qwen3.6-35B-A3B, a MoE model, locally with Atomic Chat. Private, offline reasoning and coding — no API keys, no limits. Free.

Tools

Thinking

Embedding

Vision

Reasoning

GLM-5.1

Run GLM-5.1, a large MoE model, locally in Atomic Chat. Private, offline reasoning and coding with no cloud and no limits. Free.

Tools

Thinking

Reasoning

Code

Qwen3.6-27B

Run Qwen3.6-27B locally in Atomic Chat — strong reasoning and coding, offline and private. No cloud, no usage limits. Download free.

Tools

Thinking

Embedding

Vision

Reasoning

DeepSeek-V4-Flash

Run DeepSeek-V4-Flash, a fast MoE model, locally in Atomic Chat. Private, offline reasoning and coding — no cloud, no limits. Free.

Thinking

Reasoning

Code

Multilingual

GLM-5.2

Run GLM-5.2, a large MoE model, locally with Atomic Chat. Private, offline reasoning and coding with no cloud and no limits. Free.

Thinking

Reasoning

Code

gemma-4-26B-A4B-it

Run Gemma 4 26B-A4B, a Gemma MoE model, locally in Atomic Chat. Offline and private — no cloud, no usage limits. Download free.

Thinking

Embedding

Vision

Audio

Reasoning

MiniMax-M3

Run MiniMax-M3, a 427B MoE model, locally in Atomic Chat. Private, offline long-context reasoning — no API keys, no limits. Free.

Thinking

Vision

Reasoning

Code

gemma-4-12B-it

Run Gemma 4 12B locally with Atomic Chat — capable and efficient, fully offline and private. No cloud, no usage limits. Free.

Thinking

Embedding

Vision

Audio

Reasoning

diffusiongemma-26B-A4B-it

Run diffusiongemma-26B-A4B, a Gemma-based MoE model, locally with Atomic Chat. Offline and private — no API keys or limits. Free.

Thinking

Vision

Audio

Reasoning

Code

Kimi-K2.7-Code

Run Kimi-K2.7-Code, a 1T-parameter coding model, locally with Atomic Chat. Offline agentic coding, fully private. Download free.

Tools

Thinking

Vision

Reasoning

Code

gemma-4-E2B-it

Run Gemma 4 E2B, a lightweight Google model, locally with Atomic Chat. Fast, offline and private — no API keys. Download free.

Thinking

Embedding

Vision

Audio

Reasoning

Qwen3-8B

An 8.2B dense LLM from Alibaba's Qwen3 series with switchable thinking mode, strong reasoning, coding, and 100+ language support.

Thinking

Tools

Reasoning

Code

Multilingual

Qwen2.5-7B-Instruct

A 7.61B instruction-tuned LLM from Alibaba's Qwen2.5 series with strong coding, math, and multilingual ability across 29+ languages.

Reasoning

Code

Multilingual

Tools

DeepSeek-R1-0528

A 671B-parameter MoE reasoning model from DeepSeek with 37B active params, MIT-licensed, strong at math, code, and long chain-of-thought.

Thinking

Reasoning

Code

Tools

MiMo-V2.5-Pro

Run MiMo-V2.5-Pro, a 1T-parameter model, locally in Atomic Chat. Private, offline reasoning and coding — no cloud, no limits. Free.

Tools

Thinking

Reasoning

Code

Multilingual

Qwen2.5-32B-Instruct

A 32.5B instruction-tuned LLM from Alibaba's Qwen2.5 series with strong coding, math, and 29+ language support.

Reasoning

Code

Multilingual

Tools

Qwen2.5-14B-Instruct

A 14.7B instruction-tuned LLM from Alibaba's Qwen2.5 series with strong coding, math, and 29+ language support.

Reasoning

Code

Multilingual

Tools

SmolLM2-135M-Instruct

A 135M-parameter instruction-tuned LLM from Hugging Face's SmolLM2 family, small enough to run on CPU and on-device.

Reasoning

Qwen2.5-Coder-7B-Instruct

A 7.6B code-specialized LLM from Alibaba's Qwen2.5-Coder series, tuned for code generation, code reasoning, and code fixing.

Code

Reasoning

Tools

OpenELM-1_1B-Instruct

Apple's 1.1B instruction-tuned OpenELM model, built with layer-wise scaling for efficient on-device English text generation.

Reasoning

Qwen2.5-Coder-32B-Instruct

A 32.5B code-specialized LLM from Alibaba's Qwen2.5-Coder series with state-of-the-art coding ability among open models and 128K context.

Code

Reasoning

Tools

Multilingual

DeepSeek-V3-0324

A 671B-parameter Mixture-of-Experts LLM from DeepSeek-AI (37B active) with a 128K context, strong coding and improved function calling.

Reasoning

Code

Multilingual

Tools

DeepSeek-Coder-V2-Lite-Instruct

A 16B Mixture-of-Experts code model from DeepSeek AI with 2.4B active params, 128K context, and support for 338 programming languages.

Code

Reasoning

Tools

Phi-3.5-mini-instruct

A 3.8B dense instruction-tuned LLM from Microsoft's Phi-3.5 family with a 128K context window and multilingual support.

Reasoning

Code

Multilingual

Mistral-7B-Instruct-v0.2

A 7B instruction-tuned LLM from Mistral AI with a 32K context window, released under Apache 2.0.

Reasoning

Code

Phi-4

A 14B open model from Microsoft Research tuned for math, reasoning, and code, competitive with much larger LLMs.

Reasoning

Code

NVIDIA-Nemotron-Nano-9B-v2

A 9B hybrid Mamba2-Transformer reasoning model from NVIDIA with toggleable thinking, 128K context, and tool calling.

Thinking

Reasoning

Code

Tools

Multilingual

Qwen3-235B-A22B

A 235B mixture-of-experts LLM from Alibaba's Qwen3 series that activates 22B parameters and switches between thinking and non-thinking modes.

Thinking

Reasoning

Code

Multilingual

Tools

Qwen3-30B-A3B-Instruct-2507

A 30.5B-parameter (3.3B active) MoE instruct model from Alibaba's Qwen3 series with 256K context and strong reasoning, coding, and tool use.

Reasoning

Code

Multilingual

Tools

Phi-4-mini-instruct

A 3.8B-parameter open instruct model from Microsoft's Phi-4 family with 128K context, strong math and reasoning, and function calling.

Reasoning

Code

Multilingual

Tools

Qwen2.5-72B-Instruct

A 72.7B instruction-tuned LLM from Alibaba's Qwen2.5 series with strong coding, math and multilingual ability across 29+ languages.

Reasoning

Code

Multilingual

Tools

Llama-3.3-Nemotron-Super-49B-v1.5

A 49B reasoning and chat LLM from NVIDIA, distilled from Llama-3.3-70B via Neural Architecture Search with a 128K context.

Thinking

Reasoning

Code

Tools

Multilingual

SmolLM3-3B

A fully open 3B reasoning model from Hugging Face with dual-mode thinking, six native languages, tool calling, and 128K context.

Reasoning

Multilingual

Tools

Thinking

MiniMax-M2.5

A 229B-parameter (10B active) MoE model from MiniMax built for agentic coding, tool use, and search, with a 200K context window.

Thinking

Tools

Reasoning

Code

Web

Qwen3-4B-Thinking-2507

A 4B reasoning-focused LLM from Alibaba's Qwen3 series that always thinks step by step, with a 256K context and strong math, coding, and agentic scores.

Thinking

Reasoning

Code

Tools

Multilingual

Kimi-K2-Instruct

A 1T-parameter MoE chat model from Moonshot AI with 32B active parameters, built for agentic tool use and strong coding.

Tools

Code

Reasoning

Multilingual

Granite-4.0-H-Small

IBM's 32B (9B active) hybrid Mamba-2/MoE instruct model with 128K context, strong tool-calling and multilingual support, under Apache 2.0.

Reasoning

Code

Multilingual

Tools

North-Mini-Code-1.0

Run North-Mini-Code-1.0, a code-focused model, locally with Atomic Chat. Offline coding with no API keys or usage limits. Free.

Tools

Thinking

Embedding

Reasoning

Code

VibeThinker-3B

Run VibeThinker-3B, a small reasoning model, locally with Atomic Chat. Offline step-by-step thinking, fully private. Download free.

Tools

Thinking

Reasoning

Code

Nex-N2-Pro

Run Nex-N2-Pro, a large MoE model, locally in Atomic Chat. Private, offline reasoning and coding — no cloud, no limits. Free.

Tools

Thinking

Vision

Reasoning

Code

Nex-N2-mini

Run Nex-N2-mini, a 35B model, locally with Atomic Chat. Private, offline inference with no API keys, no cloud and no limits. Free.

Tools

Thinking

Vision

Reasoning

Code

WebWorld-8B

An 8B open-weight multimodal model from Qwen built for web-agent and reasoning tasks, with vision input and a 40K context window. Runs fully on local hardware.

Thinking

Vision

Reasoning

Web

FastContext-1.0-4B-RL

Run FastContext-1.0-4B-RL, a compact long-context model, locally in Atomic Chat. Offline and private — no API keys. Free.

Tools

Code

Multilingual

qwen3.6-27b-mtp

Run Qwen3.6-27B-MTP locally with Atomic Chat — offline, private, and free. No cloud, no usage limits.

Audio

Thinking

Tools

dramabox

Run DramaBox locally with Atomic Chat — offline, private, and free. No cloud, no usage limits.

Tools

Thinking

MiniCPM-V 4.6

A compact 1B vision-language model from OpenBMB that runs image-text-to-text tasks fully on-device, tuned for fast inference on consumer laptops.

Embedding

Tools

anima

Run Anima locally with Atomic Chat — offline, private, and free. No cloud, no usage limits.

Audio

supertonic-3

Run Supertonic-3 locally with Atomic Chat — offline, private, and free. No cloud, no usage limits.

Visuals

Tools

Thinking

sulphur-2-base

Run sulphur-2-base locally with Atomic Chat — offline, private, and free. No cloud, no usage limits.

Thinking

Embedding

Kimi-K2-Instruct-0905

Run Kimi-K2-Instruct-0905, a 1T-parameter MoE model, locally in Atomic Chat. Private, offline agentic tool use — no API keys, no cloud. Free.

Tools

Reasoning

Code

GLM-4.7-Flash

Run GLM-4.7-Flash, a fast 31B model, locally with Atomic Chat. Private, offline inference with no API keys, no cloud and no usage limits. Free.

Tools

Thinking

Reasoning

Code

MiniMax-M2.7

Run MiniMax-M2.7, a 229B MoE model, locally in Atomic Chat. Private, offline reasoning and coding — no API keys, no cloud, no limits. Free.

Thinking

Tools

Reasoning

Code

gpt-oss-120b

Run gpt-oss-120b, OpenAI's open 120B model, locally with Atomic Chat. Private, offline, with no API keys or usage limits. Download free.

Tools

Thinking

Reasoning

Code

gpt-oss-20b

Run gpt-oss-20b, OpenAI's open 20B model, locally with Atomic Chat. Private, offline inference on your own hardware. Download free.

Tools

Thinking

Reasoning

Code

gemma-3-270m

Run Gemma 3 270M, an ultra-light Google model, locally in Atomic Chat. Fast on-device inference, fully offline and private. Free.

Reasoning

Multilingual

gemma-3-1b-it

Run Gemma 3 1B, a lightweight Google model, locally with Atomic Chat. Offline, private, no API keys or cloud. Download free.

Reasoning

Code

Multilingual

DeepSeek-V3.2

Run DeepSeek-V3.2, a 685B MoE model, locally in Atomic Chat. Private, offline reasoning and coding — no API keys, no cloud, no limits. Free.

Thinking

Tools

Reasoning

Code

Multilingual

DeepSeek-R1

Run DeepSeek-R1, a 671B reasoning model, locally with Atomic Chat. Private, offline chain-of-thought with no cloud and no limits. Free.

Thinking

Reasoning

Code

Multilingual

Llama-3.2-3B-Instruct

Run Llama 3.2 3B Instruct, Meta's compact model, locally in Atomic Chat. Lightweight, offline and private — no API keys. Download free.

Tools

Reasoning

Code

Multilingual

Llama-3.1-8B-Instruct

Run Llama 3.1 8B Instruct, Meta's popular model, locally with Atomic Chat. Fast, offline and private — no cloud, no limits. Free.

Tools

Reasoning

Code

Multilingual

Qwen3-Coder-30B-A3B-Instruct

Run Qwen3-Coder-30B-A3B, a code-specialized MoE model, locally in Atomic Chat. Offline coding with no API keys or usage limits. Free.

Tools

Reasoning

Code

Multilingual

Qwen3-30B-A3B

Run Qwen3-30B-A3B, a 30B MoE model, locally with Atomic Chat. Private, offline reasoning and coding — no cloud, no limits. Free.

Tools

Thinking

Reasoning

Code

Multilingual

Qwen3-14B

Run Qwen3-14B locally with Atomic Chat — strong reasoning and coding, fully offline and private. No API keys, no limits. Download free.

Tools

Thinking

Reasoning

Code

Multilingual

Qwen3-32B

Run Qwen3-32B locally in Atomic Chat — high-quality reasoning and coding, offline and private. No cloud, no usage limits. Free.

Tools

Thinking

Reasoning

Code

Multilingual

Choosing an open-source model to run locally

Every model in this catalog runs entirely on your own hardware — no API keys, no per-token billing and no data leaving your machine. That makes local models a good fit for private workloads, offline environments and high-volume tasks where a metered cloud API would get expensive. The trade-off is that you pick the hardware, so the right model depends as much on your machine as on the task.

The two numbers that matter most are parameter count and VRAM required. Parameter count is a rough proxy for capability: larger models generally reason and write better, but they need more memory and run slower. VRAM required tells you whether a model fits on your GPU at all; a quantized build lowers that number at a small cost to quality. Use the sidebar filters to narrow the list to models your machine can actually run before comparing anything else.
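
As a rough illustration of how parameter count and quantization translate into memory, the sketch below estimates the footprint of a model's weights. The function name estimate_vram_gb, the 20% overhead factor for context cache and runtime buffers, and the bit widths are assumptions made for illustration, not values taken from the catalog; real requirements depend on the runtime, the context length and the specific build.

```python
def estimate_vram_gb(params_billions: float, bits_per_weight: float,
                     overhead: float = 0.20) -> float:
    """Rough memory estimate in GB: weight storage plus a flat overhead.

    overhead is an assumed allowance for context cache and runtime buffers,
    not a measured figure.
    """
    if params_billions <= 0 or bits_per_weight <= 0:
        raise ValueError("params_billions and bits_per_weight must be positive")
    # 1e9 parameters * (bits / 8) bytes each = that many GB (decimal)
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb * (1 + overhead)


if __name__ == "__main__":
    # Compare an 8B model at three common precisions
    for bits in (16, 8, 4):
        print(f"8B model at {bits}-bit: ~{estimate_vram_gb(8, bits):.1f} GB")
```

Halving the bits per weight halves the estimate, which is why a quantized build can bring a model within reach of a smaller GPU.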

Match the model to the task

A model tuned for code completion behaves differently from one tuned for chat or vision, even at the same size. The Tasks filter groups models by what they were trained to do — text generation, image-to-text, text-to-image and more — so start there, then sort within the task by size or how recently the weights were updated.
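
The filter-then-sort workflow can be sketched in a few lines. This is a minimal illustration, not how the catalog itself is implemented: the Model class and filter_by_tags function are hypothetical names, and the capability tags shown on the model cards stand in for the Tasks filter. The four entries and their sizes are copied from cards in this catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    params_billions: float
    tags: frozenset


# A small hand-copied sample of catalog cards (illustrative subset)
CATALOG = [
    Model("Qwen3-8B", 8.2,
          frozenset({"Thinking", "Tools", "Reasoning", "Code", "Multilingual"})),
    Model("Qwen2.5-Coder-7B-Instruct", 7.6,
          frozenset({"Code", "Reasoning", "Tools"})),
    Model("Phi-4", 14.0, frozenset({"Reasoning", "Code"})),
    Model("SmolLM2-135M-Instruct", 0.135, frozenset({"Reasoning"})),
]


def filter_by_tags(models, required):
    """Return the models that carry every required tag, smallest first."""
    wanted = frozenset(required)
    matches = [m for m in models if wanted <= m.tags]
    return sorted(matches, key=lambda m: m.params_billions)


if __name__ == "__main__":
    for m in filter_by_tags(CATALOG, {"Code", "Tools"}):
        print(f"{m.name}: {m.params_billions}B")
```

Sorting by size after filtering keeps the comparison among models that were built for the same kind of work.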

Best models for general chat and reasoning

These general-purpose models balance answer quality against hardware cost. The 8B models run comfortably on a recent laptop, while the larger ones call for a mid-range or desktop GPU.

Model	Parameters	VRAM required	Best for
Qwen 3 8B	8B	~8 GB	Everyday chat on a laptop
Llama 3.1 8B	8B	~8 GB	Balanced reasoning and writing
Mistral Small 24B	24B	~16 GB	Stronger reasoning, mid-range GPU
Qwen 3 32B	32B	~24 GB	Highest quality, desktop GPU

Best models for low-end hardware

If you're working on a CPU-only machine or a GPU with limited memory, these compact and quantized models stay responsive on modest hardware, and the smallest runs without a discrete graphics card.

Model	Parameters	VRAM required	Best for
Qwen 3 1.7B	1.7B	CPU only	Fast replies on any laptop
Llama 3.2 3B	3B	< 8 GB	Lightweight assistant tasks
Mistral 7B (Q4)	7B	~6 GB	Quantized build for older GPUs

These picks are a starting point, not a ranking — the right model is the largest one that runs smoothly on your hardware for the task you care about. Use the filters above to explore the full catalog.
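
The "largest one that fits" rule of thumb can be written down directly. The sketch below uses the general chat table's figures and treats its approximate VRAM values as exact, which is a simplification; the Pick type and largest_that_fits function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Pick(NamedTuple):
    name: str
    params_billions: float
    vram_gb: float  # approximate figure from the table, treated as exact here


GENERAL_CHAT = [
    Pick("Qwen 3 8B", 8, 8),
    Pick("Llama 3.1 8B", 8, 8),
    Pick("Mistral Small 24B", 24, 16),
    Pick("Qwen 3 32B", 32, 24),
]


def largest_that_fits(models, gpu_vram_gb: float) -> Optional[Pick]:
    """Return the model with the most parameters whose VRAM need fits the GPU."""
    fitting = [m for m in models if m.vram_gb <= gpu_vram_gb]
    return max(fitting, key=lambda m: m.params_billions, default=None)


if __name__ == "__main__":
    for vram in (4, 16, 24):
        pick = largest_that_fits(GENERAL_CHAT, vram)
        print(f"{vram} GB GPU -> {pick.name if pick else 'nothing fits'}")
```

Fitting in memory is only the first check; whether a model runs smoothly also depends on generation speed, so it is worth leaving some headroom below the GPU's total memory.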

Frequently asked questions

Local models run on your own hardware, so there are no API keys, no usage limits and no data sent to a third party. Cloud APIs are easier to scale, but bill per token and require an internet connection — local models trade that convenience for privacy and predictable cost.

To find out whether a model will run on your machine, check its "VRAM required" value against your GPU memory, or filter the catalog by it. If a model needs more memory than your GPU has, a quantized build or a CPU-only model will still work, just more slowly.

Parameters are the learned weights inside a model. More parameters generally means better reasoning and writing, at the cost of more memory and slower generation. It's a rough guide to capability, not an exact score.

Whether you can use a model commercially depends on its license. Many are released under permissive licenses such as Apache 2.0, while others restrict commercial use. Always check the license listed on the model's page before shipping it in a product.

Local models work offline. Once the weights are downloaded, a local model runs without any network connection, which is useful for air-gapped setups and for keeping sensitive data on-device.