Local AI

Run Local

Find open models that fit your hardware. Filter by tool-use support and context window, and get the install command for Ollama or LM Studio.

Updated March 22, 2026 · New models are released constantly — contributions welcome
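The fit grades below come down to how much memory a quantized model needs versus what you have. A back-of-the-envelope sketch — the bits-per-weight figure, overhead allowance, and grade thresholds here are our assumptions, not this page's exact formula:

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float = 4.5,
                     overhead_gb: float = 1.0) -> float:
    """Rough VRAM needed to load a model.

    params_b: parameter count in billions (e.g. 8 for an 8B model).
    bits_per_weight: ~4.5 for the Q4_K_M quantization Ollama pulls by
    default; ~16 for unquantized fp16.
    overhead_gb: rough allowance for KV cache and runtime buffers.
    """
    weights_gb = params_b * bits_per_weight / 8
    return weights_gb + overhead_gb


def fit_grade(model_vram_gb: float, your_vram_gb: float) -> str:
    """Map headroom to an S..F scale (thresholds are assumed)."""
    if model_vram_gb > your_vram_gb:
        return "F"  # can't run
    ratio = model_vram_gb / your_vram_gb
    if ratio <= 0.5:
        return "S"
    if ratio <= 0.65:
        return "A"
    if ratio <= 0.8:
        return "B"
    if ratio <= 0.9:
        return "C"
    return "D"


print(estimate_vram_gb(8))            # → 5.5 (GB, for an 8B model at ~Q4)
print(fit_grade(estimate_vram_gb(8), 8))
```

So an 8B model at a 4-bit-class quantization wants roughly 5–6 GB, which is why the 7B/8B cards below list 5GB VRAM.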

Your Hardware

GPU VRAM (select to filter models)

System RAM (used when running without a GPU)

Tool calling

Fit ratings: S = Runs great · A = Runs well · B = Runs fine · C = Tight fit · D = Very tight · F = Can't run

Llama 3.2 1B

Meta · 1B

S

Lightest Llama. Runs on anything.

1GB VRAM · 128K ctx · Partial tool calling
ollama run llama3.2:1b

Llama 3.2 3B

Tool use

Meta · 3B

S

Fast and capable for everyday tasks.

Code · Agents
2GB VRAM · 128K ctx · Full tool calling
ollama run llama3.2:3b
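"Full tool calling" means the model can emit structured function calls you then execute and feed back. A minimal sketch of that loop using the OpenAI-style tool schema Ollama's chat API accepts — `get_weather` is a hypothetical tool, and the model's response is simulated here rather than fetched from a running server:

```python
import json

# OpenAI-style tool schema, as accepted by Ollama's chat API.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def dispatch(tool_call: dict) -> str:
    """Execute a tool call returned by the model and format the result
    as a string to feed back into the chat as a tool message."""
    name = tool_call["function"]["name"]
    args = tool_call["function"]["arguments"]
    if name == "get_weather":
        # Stub result; a real implementation would hit a weather API.
        return json.dumps({"city": args["city"], "temp_c": 18})
    raise ValueError(f"unknown tool: {name}")


# Simulated tool call; a live run would get this from e.g.
#   ollama.chat(model="llama3.2:3b", messages=..., tools=[get_weather_tool])
fake_call = {"function": {"name": "get_weather",
                          "arguments": {"city": "Oslo"}}}
print(dispatch(fake_call))
```

Models marked "Partial tool calling" tend to produce this structure less reliably, so the dispatcher should be prepared for malformed calls.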

Qwen 2.5 0.5B

Alibaba · 0.5B

S

Tiny but surprisingly capable.

0.5GB VRAM · 128K ctx · No tool calling
ollama run qwen2.5:0.5b

Qwen 2.5 3B

Tool use

Alibaba · 3B

S

Strong multilingual support.

Code · Agents
2GB VRAM · 128K ctx · Full tool calling
ollama run qwen2.5:3b

Gemma 3 1B

Google · 1B

S

Google's lightest model. Runs on anything.

1GB VRAM · 32K ctx · No tool calling
ollama run gemma3:1b

Phi-3 Mini

Microsoft · 3.8B

A

Punches above its weight. Great for coding.

Code
2.5GB VRAM · 128K ctx · Partial tool calling
ollama run phi3:mini

Gemma 3 4B

Tool use

Google · 4B

B

Strong at 4B. Solid tool use and multilingual support.

Code · Agents
3GB VRAM · 128K ctx · Full tool calling
ollama run gemma3:4b

Llama 3.1 8B

Tool use

Meta · 8B

D

Best quality/size ratio for most use cases.

Code · Agents
5GB VRAM · 128K ctx · Full tool calling
ollama run llama3.1:8b

Qwen 2.5 7B

Tool use

Alibaba · 7B

D

Excellent code and tool use at 7B scale.

Code · Agents
5GB VRAM · 128K ctx · Full tool calling
ollama run qwen2.5:7b

Mistral 7B

Mistral · 7B

D

Fast, efficient, great instruction following.

Code
5GB VRAM · 32K ctx · Partial tool calling
ollama run mistral:7b

DeepSeek R1 7B

DeepSeek · 7B

D

Reasoning model. Chain-of-thought distilled.

Reasoning
5GB VRAM · 128K ctx · Partial tool calling
ollama run deepseek-r1:7b
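R1-family models emit their chain-of-thought inside `<think>...</think>` tags before the final answer, so if you only want the answer you have to strip that block yourself. A small sketch — the tag convention is DeepSeek's, the helper name is ours:

```python
import re


def split_reasoning(text: str) -> tuple[str, str]:
    """Split an R1-style response into (chain-of-thought, final answer)."""
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    thought = m.group(1).strip()
    # Everything outside the <think> block is the visible answer.
    answer = (text[:m.start()] + text[m.end():]).strip()
    return thought, answer


raw = "<think>2 squared is 4, plus 1 is 5.</think>The answer is 5."
thought, answer = split_reasoning(raw)
print(answer)  # → The answer is 5.
```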

CodeLlama 7B

Meta · 7B

D

Solid general code generation.

Code
5GB VRAM · 16K ctx · No tool calling
ollama run codellama:7b

LLaVA 7B

LLaVA Team · 7B

D

Vision + language. Understands images.

Vision
5GB VRAM · 4K ctx · No tool calling
ollama run llava:7b
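Vision models take images through the same chat API: Ollama accepts base64-encoded images in an `images` list on the message. A sketch of building that payload — `vision_message` is a hypothetical helper, and the image bytes here are faked rather than read from disk:

```python
import base64


def vision_message(prompt: str, image_bytes: bytes) -> dict:
    """Build a user message for Ollama's /api/chat endpoint.
    Images go in an `images` list as base64 strings, alongside
    the text prompt."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {"role": "user", "content": prompt, "images": [b64]}


# Would be sent as: POST /api/chat {"model": "llava:7b", "messages": [msg]}
msg = vision_message("What is in this image?", b"\x89PNG...fake bytes")
```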

Llama 3.3 70B

Tool use

Meta · 70B

F

Better than 3.1 70B with the same VRAM. Meta's best open 70B.

Code · Agents
40GB VRAM · 128K ctx · Full tool calling
ollama run llama3.3:70b

Llama 3.1 70B

Tool use

Meta · 70B

F

Near-frontier quality locally.

Code · Agents
40GB VRAM · 128K ctx · Full tool calling
ollama run llama3.1:70b

Llama 3.1 405B

Tool use

Meta · 405B

F

Meta's largest open model. Needs serious hardware.

Code · Agents
230GB VRAM · 128K ctx · Full tool calling
ollama run llama3.1:405b

Qwen 2.5 14B

Tool use

Alibaba · 14B

F

Strong all-rounder, great for coding.

Code · Agents
9GB VRAM · 128K ctx · Full tool calling
ollama run qwen2.5:14b

Qwen 2.5 32B

Tool use

Alibaba · 32B

F

Top open-source quality below 70B.

Code · Agents
20GB VRAM · 128K ctx · Full tool calling
ollama run qwen2.5:32b

Qwen 2.5 72B

Tool use

Alibaba · 72B

F

Flagship Qwen. Best open multilingual model.

Code · Agents
42GB VRAM · 128K ctx · Full tool calling
ollama run qwen2.5:72b

Mixtral 8×7B

Mistral · 47B MoE

F

MoE model with 12.9B active params.

Code
26GB VRAM · 32K ctx · Partial tool calling
ollama run mixtral:8x7b

Mistral Small 22B

Tool use

Mistral · 22B

F

Capable small Mistral for coding and agents.

Code · Agents
14GB VRAM · 32K ctx · Full tool calling
ollama run mistral-small

Mistral Small 3

Tool use

Mistral · 24B

F

Latest Mistral Small. Faster and more accurate than its predecessor.

Code · Agents
15GB VRAM · 32K ctx · Full tool calling
ollama run mistral-small3

Phi-3 Medium

Microsoft · 14B

F

Strong reasoning for a 14B model.

Code
9GB VRAM · 128K ctx · Partial tool calling
ollama run phi3:medium

Phi-4

Tool use

Microsoft · 14B

F

Latest Phi. Excellent at STEM and tool use.

Code · Agents
9GB VRAM · 16K ctx · Full tool calling
ollama run phi4

Gemma 3 12B

Tool use

Google · 12B

F

Best Gemma 3 for everyday tasks. Great instruction following.

Code · Agents
8GB VRAM · 128K ctx · Full tool calling
ollama run gemma3:12b

Gemma 3 27B

Tool use

Google · 27B

F

Google's flagship open model. Rivals much larger models.

Code · Agents
17GB VRAM · 128K ctx · Full tool calling
ollama run gemma3:27b

DeepSeek V3

Tool use

DeepSeek · 671B MoE

F

DeepSeek's flagship general model. Rivals GPT-4 class. Needs serious hardware.

Code · Agents
80GB VRAM · 128K ctx · Full tool calling
ollama run deepseek-v3

DeepSeek R1 14B

DeepSeek · 14B

F

Strong reasoning at 14B.

Reasoning
9GB VRAM · 128K ctx · Partial tool calling
ollama run deepseek-r1:14b

DeepSeek R1 32B

DeepSeek · 32B

F

Top open reasoning model.

Reasoning
20GB VRAM · 128K ctx · Partial tool calling
ollama run deepseek-r1:32b

DeepSeek Coder V2 16B

DeepSeek · 16B MoE

F

Best open coding model.

Code
10GB VRAM · 32K ctx · Partial tool calling
ollama run deepseek-coder-v2:16b

CodeLlama 34B

Meta · 34B

F

Best CodeLlama variant.

Code
21GB VRAM · 16K ctx · No tool calling
ollama run codellama:34b

Llama 3.2 Vision 11B

Tool use

Meta · 11B

F

Meta's multimodal model with tool calling.

Vision · Agents
7GB VRAM · 128K ctx · Full tool calling
ollama run llama3.2-vision:11b