Benchmark local LLMs on your PC and know exactly which models your GPU can run.

Run a verified real-inference benchmark, detect your hardware, and get recommended models + performance (tokens/s) + VRAM limits.

View GPU Leaderboard
Small .exe + ~670MB first benchmark | ~2-4 min verified run | No account needed | Works offline
Runs on: NVIDIA | AMD | Intel | Apple Silicon
vramcheck v1.1
$ vramcheck detect
[OK] NVIDIA RTX 3060 - 12GB VRAM
[OK] AMD Ryzen 5 5600X - 6 cores
[OK] 32GB DDR4 RAM
$ vramcheck benchmark --model llama3:8b
Running benchmark... ---------- 100%
Generation: 42.3 tokens/sec
VRAM used: 6.2GB / 12GB
Context: 4096 tokens max
Compatible models:
[OK] Llama 3 8B (Q4, Q5, Q8)
[OK] Mistral 7B (Q4)
[X] Llama 3 70B (needs 40GB+)
Real result from an RTX 3060 12GB, a common setup
50,000+ Benchmarks run
2,500+ GitHub stars
150+ GPUs tested
80+ Models supported

Buying an expensive GPU for AI without knowing if it can run the models you need is frustrating.

Uncertainty

"Will my RTX 4060 run Llama 3 70B?" No clear answers from generic specs.

Cost of Mistakes

Buying the wrong GPU and discovering it can't run the models you need.

Time Wasted

Hours comparing contradictory info across Reddit, HuggingFace, Discord...

VRAM Check gives you verified, comparable results from your own hardware.

Benchmark Your Setup in 3 Simple Steps

No account. No signup. Just download, run, and know.

1. Download

Get one .exe for your OS

No installation needed

2. Run Test

A verified run benchmarks real model inference

Tests GPU, CPU & RAM

3. See Results

Get specific recommendations

Know exactly what you can run

You get:

  • Tokens/sec (model-by-model)
  • VRAM headroom & max context tested
  • Recommended quantization (Q4/Q5/Q8)
  • Top comparable systems (leaderboard)
  • Specific model list (runs / won't run)
  • PDF export with full specs
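
For intuition on the VRAM headroom and quantization recommendations above, here is a rough back-of-the-envelope sketch. It is illustrative only: the benchmark measures real VRAM usage rather than estimating it, and the KV-cache and overhead figures below are assumptions for a Llama-3-8B-class model, not vramcheck's actual formula.

def estimate_vram_gb(params_billion, quant_bits, context_tokens=4096,
                     kv_bytes_per_token=131_072,   # assumption: fp16 KV cache, Llama-3-8B-style layout
                     overhead_gb=1.0):             # assumption: runtime/buffer overhead
    # Weights take roughly params x bits / 8 bytes (8B at Q4 -> ~4 GB).
    weights_gb = params_billion * quant_bits / 8
    # The KV cache grows linearly with context length.
    kv_gb = context_tokens * kv_bytes_per_token / 1e9
    return weights_gb + kv_gb + overhead_gb

for name, params, bits in [("Llama 3 8B Q4", 8, 4),
                           ("Llama 3 8B Q8", 8, 8),
                           ("Llama 3 70B Q4", 70, 4)]:
    need = estimate_vram_gb(params, bits)
    print(f"{name}: ~{need:.1f} GB -> {'fits' if need <= 12 else 'does not fit'} in 12 GB")

On a 12 GB card this points the same way as the measured result shown above: the 8B quantizations fit with headroom, the 70B does not.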

How We Benchmark

Transparent methodology you can trust.

Real Workloads

Actual model inference, not synthetic tests

Consistent Metrics

Tokens/sec, VRAM, latency measured uniformly

Reproducible

Same test, same conditions, comparable results

Open Source

Full methodology documented on GitHub
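
The headline metric is simple: tokens generated divided by the wall-clock time spent generating them. As a minimal sketch of the principle (real inference, not synthetic math), here is how you could measure it yourself against a locally running Ollama server. This is an illustration under that assumption, not vramcheck's actual harness, which is what the methodology document describes.

import requests

# Ask a local Ollama server (default port 11434) for one real, non-streamed generation.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3:8b",
          "prompt": "Explain VRAM in one short paragraph.",
          "stream": False},
    timeout=300,
).json()

tokens = resp["eval_count"]               # tokens actually generated
seconds = resp["eval_duration"] / 1e9     # generation time, nanoseconds -> seconds
print(f"{tokens} tokens in {seconds:.1f} s = {tokens / seconds:.1f} tokens/sec")

Run the same model, prompt, and settings on two machines and the numbers are directly comparable, which is what "Reproducible" means here.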

Read the canonical v1.1 methodology | Open source on GitHub

Ready to know what your hardware can run?

Download the CLI, run one verified benchmark, and get your personalized results.

Free & open source | No signup | Works offline