Choosing an open-source model to run locally

Every model in this catalog runs entirely on your own hardware — no API keys, no per-token billing and no data leaving your machine. That makes local models a good fit for private workloads, offline environments and high-volume tasks where a metered cloud API would get expensive. The trade-off is that you pick the hardware, so the right model depends as much on your machine as on the task.

The two numbers that matter most are parameter count and VRAM required. Parameter count is a rough proxy for capability: larger models generally reason and write better, but they need more memory and generate text more slowly. VRAM required tells you whether a model fits on your GPU at all. A quantized build stores each weight in fewer bits, which lowers that number at a small cost to quality. Use the sidebar filters to narrow the list to models your machine can actually run before comparing anything else.
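The relationship between parameter count, quantization and memory can be sketched with a common rule of thumb: weight memory is roughly the number of parameters times the bytes stored per weight. The function name and the 1.2 overhead multiplier below are illustrative assumptions, not values this catalog uses, so treat the output as a ballpark rather than a match for the "VRAM required" column.

```python
def estimate_memory_gb(params_billion: float, bits_per_weight: float,
                       overhead: float = 1.2) -> float:
    """Rough memory estimate in GB for running a model.

    overhead is an assumed multiplier covering runtime buffers and context
    cache; real usage varies with the runtime and the context length.
    """
    bytes_per_weight = bits_per_weight / 8
    # 1 billion weights at 1 byte each is about 1 GB (decimal).
    return params_billion * bytes_per_weight * overhead


# Compare an 8B model at full 16-bit precision and two quantized widths.
for bits in (16, 8, 4):
    print(f"8B model at {bits}-bit: ~{estimate_memory_gb(8, bits):.1f} GB")
```

Halving the bits per weight halves the estimate, which is why a 4-bit build of a model can fit on a GPU that the 16-bit build cannot.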

Match the model to the task

A model tuned for code completion behaves differently from one tuned for chat or vision, even at the same size. The Tasks filter groups models by what they were trained to do — text generation, image-to-text, text-to-image and more — so start there, then sort within the task by size or how recently the weights were updated.

Best models for general chat and reasoning

These general-purpose models balance answer quality against hardware cost. The 8B models run comfortably on a recent laptop, while the larger ones need a mid-range or desktop GPU.

Model              Parameters  VRAM required  Best for
Qwen 3 8B          8B          ~8 GB          Everyday chat on a laptop
Llama 3.1 8B       8B          ~8 GB          Balanced reasoning and writing
Mistral Small 24B  24B         ~16 GB         Stronger reasoning, mid-range GPU
Qwen 3 32B         32B         ~24 GB         Highest quality, desktop GPU

Best models for low-end hardware

If you're working on a CPU-only machine or a GPU with limited memory, these compact and quantized models stay responsive on modest hardware, and the smallest runs without a discrete graphics card.

Model            Parameters  VRAM required  Best for
Qwen 3 1.7B      1.7B        CPU only       Fast replies on any laptop
Llama 3.2 3B     3B          < 8 GB         Lightweight assistant tasks
Mistral 7B (Q4)  7B          ~6 GB          Quantized build for older GPUs

These picks are a starting point, not a ranking — the right model is the largest one that runs smoothly on your hardware for the task you care about. Use the filters above to explore the full catalog.
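The "largest one that fits" rule can be sketched as a small filter over the general chat table above. The figures are copied from that table with the "~" dropped. The Model type, the function name and the headroom_gb parameter are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    params_billion: float
    vram_gb: float  # approximate "VRAM required" from the table


# Entries copied from the general chat table; values are approximate.
CATALOG = [
    Model("Qwen 3 8B", 8, 8),
    Model("Llama 3.1 8B", 8, 8),
    Model("Mistral Small 24B", 24, 16),
    Model("Qwen 3 32B", 32, 24),
]


def largest_fitting(catalog: list, gpu_memory_gb: float,
                    headroom_gb: float = 0.0) -> Optional[Model]:
    """Return the model with the most parameters that fits in GPU memory.

    headroom_gb is an assumed safety margin for other programs using the GPU.
    Returns None when nothing in the catalog fits.
    """
    usable = gpu_memory_gb - headroom_gb
    candidates = [m for m in catalog if m.vram_gb <= usable]
    return max(candidates, key=lambda m: m.params_billion, default=None)
```

Calling largest_fitting(CATALOG, 16) selects Mistral Small 24B, while a 4 GB card returns None, which is the cue to look at the low-end table instead. Fitting in memory does not guarantee the model runs smoothly, so this only narrows the shortlist.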

Frequently asked questions

How are local models different from cloud APIs?
Local models run on your own hardware, so there are no API keys, no usage limits and no data sent to a third party. Cloud APIs are easier to scale, but they bill per token and require an internet connection. Local models trade that convenience for privacy and predictable cost.

How do I know whether a model will run on my machine?
Check the "VRAM required" value against your GPU memory, or filter the catalog by it. If a model needs more memory than your GPU has, a quantized build or a CPU-only model will still work, just more slowly.
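If you are unsure how much GPU memory you have, one way to read it on a machine with an NVIDIA GPU is the nvidia-smi tool that ships with the driver. The sketch below is an assumption about your setup rather than part of this catalog: it does not cover Apple Silicon, which shares unified memory between CPU and GPU, or GPUs from other vendors. Both function names are hypothetical.

```python
import shutil
import subprocess


def parse_memory_total(output: str) -> list[float]:
    """Convert nvidia-smi output (one MiB value per GPU line) to GiB."""
    return [int(line.strip()) / 1024
            for line in output.splitlines() if line.strip()]


def gpu_memory_gib() -> list[float]:
    """Return total memory per NVIDIA GPU, or [] if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_memory_total(result.stdout)
```

Calling gpu_memory_gib() returns one value per GPU. The values are in GiB, which are slightly larger units than the decimal GB used on most spec sheets, so leave a little margin when comparing against "VRAM required".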

What does parameter count mean?
Parameters are the learned weights inside a model. More parameters generally means better reasoning and writing, at the cost of more memory and slower generation. It's a rough guide to capability, not an exact score.

Can I use these models commercially?
It depends on each model's license. Many are released under permissive licenses such as Apache 2.0, while others restrict commercial use. Always check the license listed on the model's page before shipping it in a product.

Can I run these models offline?
Yes. Once the weights are downloaded, a local model runs without any network connection, which is useful for air-gapped setups and for keeping sensitive data on-device.