Product guide · Self-hosted

Ollama

Run LLMs locally. Free and open source: no API bills. Your costs are hardware and power.

Use cases

What teams actually use it for

  • Drafting and Q&A
  • Code assistance
  • Document analysis
  • Private or internal workloads
  • Offline or air-gapped environments
  • High-volume workloads where cost predictability matters

Pricing

Pricing model

Free software with no per-token costs. You pay for GPU hardware, electricity, and maintenance time.

Software

Free

Open source, MIT license

Hardware

One-off

GPU required; 8 GB+ VRAM for smaller (~7B) models, 24 GB+ for 14B models, 40 GB+ for 70B models
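The VRAM figures above follow a common rule of thumb: model weights take roughly (parameters × bits per weight ÷ 8) bytes, plus extra headroom for the KV cache and activations. A minimal sketch of that arithmetic (the 4-bit default and the 20% overhead factor are illustrative assumptions, not vendor figures):

```python
def estimated_vram_gb(params_billion: float,
                      bits_per_weight: int = 4,
                      overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB for a quantised model.

    params_billion : model size in billions of parameters
    bits_per_weight: quantisation level (4-bit is a common default)
    overhead       : multiplier for KV cache / activations (assumed ~20%)
    """
    weight_gb = params_billion * bits_per_weight / 8  # GB of weights alone
    return weight_gb * overhead

# 70B at 4-bit comes out around 42 GB, consistent with the 40 GB+ guidance
print(round(estimated_vram_gb(70), 1))
```

Longer contexts and higher-precision quantisations push the overhead up, so treat the result as a floor, not a guarantee.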

Power

Ongoing

See our Self-hosted GPU Comparisons for £/1M token estimates

Business fit

What to know before you commit

Pros

  • No data leaves your premises
  • Predictable cost (hardware + power)
  • No per-token bills
  • Wide model support (Llama, Mistral, Qwen, etc.)
  • Simple local setup

Considerations

  • Upfront hardware cost
  • You manage updates and security
  • Model size, and therefore output quality, is capped by your hardware
  • No managed SLA

When it makes sense: A good fit for data-sensitive workloads, sustained high volume where cumulative cloud costs would exceed the hardware spend, or air-gapped environments. Compare with our GPU cost guide.
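The high-volume case can be made concrete with a break-even calculation: divide the one-off hardware cost by the per-token saving over a cloud API. A sketch with hypothetical numbers (the £2,000 hardware cost, £2.00/1M cloud price, and £0.10/1M power cost are illustrative assumptions, not quotes):

```python
def break_even_million_tokens(hardware_cost: float,
                              cloud_price_per_million: float,
                              power_cost_per_million: float = 0.0) -> float:
    """Million tokens at which self-hosting overtakes a cloud API.

    hardware_cost           : one-off spend on GPU/hardware (GBP)
    cloud_price_per_million : cloud API price per 1M tokens (GBP)
    power_cost_per_million  : marginal electricity cost per 1M tokens (assumed)
    """
    saving_per_million = cloud_price_per_million - power_cost_per_million
    if saving_per_million <= 0:
        raise ValueError("cloud must cost more per token for a break-even to exist")
    return hardware_cost / saving_per_million

# Hypothetical: £2,000 GPU vs £2.00/1M cloud, £0.10/1M power
print(round(break_even_million_tokens(2000, 2.00, 0.10)))  # ~1,053M tokens
```

Below the break-even volume, a cloud API is cheaper; above it, the hardware pays for itself. Maintenance time is deliberately left out of this sketch.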

Data handling

Where your data goes

All data stays on your hardware. No data sent to third parties. Full control.

GDPR / compliance. Data never leaves your infrastructure.

Data sovereignty. Complete: you control where models run and where prompts, outputs, and logs are stored.
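One concrete consequence: by default Ollama serves its API on localhost port 11434, so requests resolve to a loopback address and never cross a network boundary. A small check of that property (assumes the default bind address, which can be changed via the OLLAMA_HOST environment variable):

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Default Ollama endpoint (configurable via OLLAMA_HOST)
OLLAMA_URL = "http://localhost:11434/api/generate"

host = urlparse(OLLAMA_URL).hostname
addr = socket.gethostbyname(host)

# localhost resolves to a loopback address: traffic stays on the machine
print(ipaddress.ip_address(addr).is_loopback)  # True
```

If you deliberately expose the API on a non-loopback address, that guarantee no longer holds and you are responsible for securing the endpoint.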


Want a recommendation for your use case?

Every team's fit is different. We'll model cost and ROI across cloud, self-hosted, and hybrid before recommending anything, including this product.