2026 GPU Spec List for Local AI, VR and Image Generation

How to choose a GPU and why VRAM matters so much are covered in separate articles.

Related
For how to choose a GPU and a budget-by-budget guide, see → Running a local AI chatbot at home: a budget-by-budget guide

This article lists the full specs and prices of every GPU you can buy as of April 2026. If you’re weighing your options, bookmark it and use it for comparison.

Full GPU spec list (as of April 2026)

How to read this table
VRAM = the amount of memory on the GPU. Because an AI model has to be loaded into VRAM in its entirety, this is the single most important spec.
Price = street price as of April 2026 (tax included). It changes daily.
Comment = the practical points for AI and VR use.
Rows with a yellow background are the good-value picks.

NVIDIA GeForce RTX 50 series (Blackwell, released 2025–2026)

| GPU | VRAM | New price | Used range | Comment |
| --- | --- | --- | --- | --- |
| RTX 5090 | 32GB | ¥400k–700k (list ~¥400k; inflated on shortages) | - | The full spread. Power draw and price are also top-tier |
| RTX 5080 | 16GB | ¥210k–250k | - | Great for max-setting VR. For LLMs, 16GB is the ceiling |
| RTX 5070 Ti | 16GB | ~¥160k | - | A balanced VR + AI pick. Modest power draw too |
| RTX 5070 | 12GB | ~¥100k | - | Good for VR, but 12GB is limiting for AI |
| RTX 5060 Ti 16GB | 16GB | ~¥90k–100k | - | Cheapest-class 16GB CUDA. Great for AI image generation |
| RTX 5060 Ti 8GB | 8GB | ~¥70k | - | Entry VR. LLMs up to 8B models |
| RTX 5060 | 8GB | ~¥60k | - | Low power. For VR + light AI use |

See detailed RTX 50-series specs (bandwidth, TDP, LLM score)

| GPU | Memory bandwidth | TDP | LLM score |
| --- | --- | --- | --- |
| RTX 5090 | 1792 GB/s | 575W | 120 |
| RTX 5080 | 960 GB/s | 360W | 80 |
| RTX 5070 Ti | 896 GB/s | 300W | 72 |
| RTX 5070 | 672 GB/s | 250W | 60 |
| RTX 5060 Ti 16GB | 448 GB/s | 180W | 45 |
| RTX 5060 Ti 8GB | 448 GB/s | 180W | 40 |
| RTX 5060 | 448 GB/s | 145W | 35 |

This is the generation to buy new in spring 2026. The RTX 5060 Ti 16GB (~¥90k–110k) stands out as the cheapest class of card to run 16GB of VRAM in a CUDA environment. If your budget has room, the RTX 5070 Ti is the recommended balanced pick.
MSI GeForce RTX 5060 Ti 16GB VENTUS 2X OC PLUS

¥102,353 (as of 2026-06-22)

NVIDIA GeForce RTX 40 series (Ada Lovelace, released 2022–2024)

| GPU | VRAM | New price | Used range | Comment |
| --- | --- | --- | --- | --- |
| RTX 4090 | 24GB | ¥460k–550k | ¥300k–380k | The reference. Prices spiking as production ends |
| RTX 4080 SUPER | 16GB | Low stock | ¥150k–180k | Max-setting VR. A bargain used |
| RTX 4080 | 16GB | Low stock | ¥130k–160k | Nearly identical to the 4080 SUPER. Used is a good buy |
| RTX 4070 Ti SUPER | 16GB | Low stock | ¥110k–140k | Tends to be the cheapest 16GB used |
| RTX 4070 Ti | 12GB | Low stock | ¥90k–110k | Good for VR. 12GB is limiting for AI |
| RTX 4070 SUPER | 12GB | Low stock | ¥80k–100k | Low power, comfy VR. Good used value |
| RTX 4070 | 12GB | Low stock | ¥70k–90k | Entry low-power VR. 12GB works for image generation too |
| RTX 4060 Ti 16GB | 16GB | Low stock | ¥70k–100k | Narrow bandwidth, so LLMs are slow. The 16GB version is scarce even used |
| RTX 4060 Ti 8GB | 8GB | Low stock | ¥40k–50k | For lightweight VR titles |
| RTX 4060 | 8GB | Low stock | ¥30k–40k | Lowest power. LLMs only up to 8B models |

See detailed RTX 40-series specs (bandwidth, TDP, LLM score)

| GPU | Memory bandwidth | TDP | LLM score |
| --- | --- | --- | --- |
| RTX 4090 | 1008 GB/s | 450W | 100 |
| RTX 4080 SUPER | 736 GB/s | 320W | 70 |
| RTX 4080 | 716 GB/s | 320W | 68 |
| RTX 4070 Ti SUPER | 672 GB/s | 285W | 62 |
| RTX 4070 Ti | 504 GB/s | 285W | 55 |
| RTX 4070 SUPER | 504 GB/s | 220W | 50 |
| RTX 4070 | 504 GB/s | 200W | 45 |
| RTX 4060 Ti 16GB | 288 GB/s | 165W | 32 |
| RTX 4060 Ti 8GB | 288 GB/s | 160W | 28 |
| RTX 4060 | 272 GB/s | 115W | 22 |

The RTX 40 series is nearly out of stock new. This is a generation to target used. The RTX 4070 Ti SUPER tends to fall into the cheapest 16GB-used band and handles both VR and AI — a well-balanced option. Note that the 4060 Ti 16GB has narrow bandwidth.

NVIDIA GeForce RTX 30 series (Ampere, released 2020–2022)

| GPU | VRAM | New price | Used range | Comment |
| --- | --- | --- | --- | --- |
| RTX 3090 Ti | 24GB | Discontinued | ¥140k–180k | About the same performance as the 3090. High 450W draw. The 3090 is often cheaper |
| RTX 3090 | 24GB | Discontinued | ¥100k–140k | 24GB in the ¥100k range. The used LLM standard |
| RTX 3080 Ti | 12GB | Discontinued | ¥60k–80k | Wide bandwidth but 12GB VRAM. Gaming + light AI |
| RTX 3080 12GB | 12GB | Discontinued | ¥50k–70k | Nearly identical to the 3080 Ti |
| RTX 3080 10GB | 10GB | Discontinued | ¥40k–60k | 10GB is awkward. 8B models will run |
| RTX 3070 Ti | 8GB | Discontinued | ¥30k–50k | Cheap as a used VR card |
| RTX 3070 | 8GB | Discontinued | ¥30k–40k | The used entry-VR standard |
| RTX 3060 Ti | 8GB | Discontinued | ¥25k–40k | Good-value used VR card |
| RTX 3060 | 12GB | Discontinued | ¥20k–30k | 12GB in the low ¥20,000s. Narrow bandwidth, but good for entry-level LLMs |

See detailed RTX 30-series specs (bandwidth, TDP, LLM score)

| GPU | Memory bandwidth | TDP | LLM score |
| --- | --- | --- | --- |
| RTX 3090 Ti | 1008 GB/s | 450W | 75 |
| RTX 3090 | 936 GB/s | 350W | 70 |
| RTX 3080 Ti | 912 GB/s | 350W | 48 |
| RTX 3080 12GB | 912 GB/s | 350W | 45 |
| RTX 3080 10GB | 760 GB/s | 320W | 42 |
| RTX 3070 Ti | 608 GB/s | 290W | 32 |
| RTX 3070 | 448 GB/s | 220W | 30 |
| RTX 3060 Ti | 448 GB/s | 200W | 25 |
| RTX 3060 | 360 GB/s | 170W | 20 |

A mining-era generation, so used stock is plentiful. The RTX 3060 12GB (¥20k–30k) and RTX 3090 24GB (¥130k–180k) are the good-value used picks. The 3060 has narrow bandwidth so LLM generation is slow, but 12GB is enough to load the model. The 3090’s 24GB makes it usable for serious local AI.

[Used] ELSA GeForce RTX 3060 12GB

¥52,800 (as of 2026-06-22)

[Used] MSI GeForce RTX 3090 GAMING X TRIO 24GB

¥148,000 (as of 2026-05-01)

AMD Radeon RX 9000 series (RDNA 4, released 2025)

| GPU | VRAM | New price | Used range | Comment |
| --- | --- | --- | --- | --- |
| RX 9070 XT | 16GB | ~¥90k–100k | - | Gaming performance around the 5070 Ti. Linux recommended for AI |
| RX 9070 | 16GB | ~¥80k | - | 16GB for ¥80k. AI-tool support on Windows is unstable |

AMD Radeon RX 7000 series (RDNA 3, released 2022–2024)

| GPU | VRAM | New price | Used range | Comment |
| --- | --- | --- | --- | --- |
| RX 7900 XTX | 24GB | ~¥180k | ¥120k–150k | 24GB for ¥180k. A strong pick with ROCm + Linux |
| RX 7900 XT | 20GB | ~¥130k | ¥90k–120k | 20GB is an unusual size. Worth considering at the right price |
| RX 7900 GRE | 16GB | ~¥80k | ¥60k–80k | 16GB in the ¥80k range. For AMD fans |
| RX 7800 XT | 16GB | ~¥70k | ¥50k–70k | Good for gaming. AI requires a ROCm environment |
| RX 7700 XT | 12GB | ~¥50k | ¥40k–50k | 12GB for ¥50k. Gaming-oriented |

See detailed AMD Radeon specs (bandwidth, TDP, LLM score)

| GPU | Memory bandwidth | TDP | LLM score |
| --- | --- | --- | --- |
| RX 9070 XT | 640 GB/s | 304W | 45 (ROCm) |
| RX 9070 | 640 GB/s | 220W | 40 (ROCm) |
| RX 7900 XTX | 960 GB/s | 355W | 60 (ROCm) |
| RX 7900 XT | 800 GB/s | 315W | 50 (ROCm) |
| RX 7900 GRE | 576 GB/s | 260W | 38 (ROCm) |
| RX 7800 XT | 624 GB/s | 263W | 35 (ROCm) |
| RX 7700 XT | 432 GB/s | 245W | 25 (ROCm) |

AMD Radeon’s appeal is its low price per GB of VRAM, but AI-tool support (Ollama, ComfyUI, etc.) trails NVIDIA. On Windows, tools that target CUDA often won’t run, so assume a Linux + ROCm environment. For gaming-first use the value is high, but for AI-first use NVIDIA is the safer bet.
A note on AMD’s AI support: the LLM scores for AMD GPUs are reference values for a ROCm (Linux) environment. On Windows, AI-tool support is often less stable than with CUDA, so the scores may not hold there.

Quick recommendation table by VRAM

How to read this table
For each VRAM capacity, it lists one recommended GPU each, new and used.
AI use guide = the AI models you can practically run with that VRAM capacity.
| VRAM | New pick | Used pick | AI use guide |
| --- | --- | --- | --- |
| 8GB | RTX 5060 (~¥60k) | RTX 3060 Ti (¥25k–40k) | 8B LLM, SD 1.5, medium VR settings |
| 12GB | RTX 5070 (~¥100k) | RTX 3060 (¥20k–30k) | 14B LLM, SDXL, high VR settings |
| 16GB | RTX 5060 Ti 16GB (~¥90k–110k) | RTX 4070 Ti SUPER (¥110k–140k) | 14B + long context, complex SDXL workflows |
| 20–24GB | RX 7900 XTX (~¥180k) | RTX 3090 (¥130k–180k) | 32B LLM, FLUX Dev, nearly all image generation |
| 32GB | RTX 5090 (¥400k–700k; list ~¥400k, inflated on shortages) | - | 32B + very long context (32K+) |
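
As a rough illustration, the AI use guide column can be turned into a small lookup. This is only a sketch: the dictionary copies the guide text from the table, keying the 20–24GB row at 20, and the function name `guide_for` is a hypothetical helper rather than part of any tool.

```python
# AI-use guide per VRAM class, copied from the quick recommendation table.
# The "20–24GB" row is keyed at 20 (an assumption made for this sketch).
VRAM_GUIDE = {
    8: "8B LLM, SD 1.5, medium VR settings",
    12: "14B LLM, SDXL, high VR settings",
    16: "14B + long context, complex SDXL workflows",
    20: "32B LLM, FLUX Dev, nearly all image generation",
    32: "32B + very long context (32K+)",
}


def guide_for(vram_gb: int) -> str:
    """Return the guide for the largest VRAM class that does not exceed vram_gb."""
    classes = [c for c in sorted(VRAM_GUIDE) if c <= vram_gb]
    if not classes:
        raise ValueError(f"{vram_gb}GB is below the smallest class in the table")
    return VRAM_GUIDE[classes[-1]]


if __name__ == "__main__":
    for vram in (8, 12, 16, 24, 32):
        print(f"{vram}GB: {guide_for(vram)}")
```

A 10GB card falls back to the 8GB guide here, which matches the table's view that 10GB is an awkward in-between size.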

Notes on the spec tables

Why memory bandwidth matters

LLM generation speed is roughly proportional to memory bandwidth. Even at the same VRAM capacity, a GPU with wider bandwidth generates text faster. For example, the RTX 3060 (12GB / 360 GB/s) and the RTX 3080 Ti (12GB / 912 GB/s) have the same VRAM, but the 3080 Ti has about 2.5 times the bandwidth, and the same model generates more than twice as fast on it.

About the LLM score

The scores in this article are relative values with the RTX 4090 set to 100, based on text-generation speed with Q4_K_M quantized models. Actual speed varies with the model and context length. AMD GPU scores are reference values in a ROCm (Linux) environment.

Cautions when buying a used GPU

Used prices are based on shop stock (with a warranty). Flea-market apps can be cheaper, but there’s a risk of mining history and fan wear, so I recommend buying from a shop.

How generation speed is estimated

The “estimated tok/s” figures in this and related articles are calculated with the following formula.

estimated tok/s ≈ memory bandwidth (GB/s) ÷ model size (GB) × 0.75

What this formula means

Every time a GPU generates one token, it reads all of the model’s parameters from memory, so “memory bandwidth ÷ model size” is the ceiling on tokens per second. In practice, generation reaches about 0.75 of that ceiling. The remaining 0.25 goes to reading and writing the past conversation history (the KV cache) and to decompressing the compressed data (Q4 quantization).
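
The formula can be sketched in Python as follows. The function name `estimate_tok_s` and the example inputs (the RTX 3090's 936 GB/s and a 5.2GB model) are illustrative choices for this sketch, not part of any tool.

```python
def estimate_tok_s(bandwidth_gb_s: float, model_size_gb: float,
                   efficiency: float = 0.75) -> float:
    """Estimated tokens/s = memory bandwidth / model size * efficiency.

    Only meaningful when the whole model fits in VRAM.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb * efficiency


if __name__ == "__main__":
    # 936 GB/s / 5.2 GB = 180 tok/s theoretical ceiling; 180 * 0.75 = 135
    print(round(estimate_tok_s(936, 5.2)))  # 135
```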

Basis for 0.75: measured data

Below are results measured on an RTX 3090 and an RTX 3060 under conditions where the model fully fits in VRAM (April 15, 2026, Ollama 0.20.2, 256-token generation, Linux).

| GPU | Bandwidth | Model | Size | Theoretical tok/s | Measured tok/s | Measured ÷ theoretical |
| --- | --- | --- | --- | --- | --- | --- |
| RTX 3090 | 936 GB/s | qwen3:8b | 5.2GB | 180 | 126.4 | 0.70 |
| RTX 3090 | 936 GB/s | qwen3.5:9b | 6.6GB | 142 | 98.0 | 0.69 |
| RTX 3090 | 936 GB/s | qwen3:14b | 9.3GB | 101 | 76.6 | 0.76 |
| RTX 3060 | 360 GB/s | qwen3:8b | 5.2GB | 69 | 60.1 | 0.87 |
| RTX 3060 | 360 GB/s | qwen3.5:9b | 6.6GB | 55 | 46.6 | 0.85 |
| RTX 3060 | 360 GB/s | qwen3:14b | 9.3GB | 39 | 34.5 | 0.89 |

* Theoretical tok/s = memory bandwidth ÷ model size. The theoretical ceiling, assuming “one token can be generated in the time it takes to read all parameters once.”

“Measured ÷ theoretical” falls in the 0.69–0.89 range. The six ratios have a median of about 0.80, so the 0.75 used in the formula is a slightly conservative round figure. The narrower a GPU’s bandwidth (RTX 3060), the more memory-read waiting dominates, so results land close to the theoretical value. On a wide-bandwidth GPU (RTX 3090), the non-read work (reading/writing conversation history, decompression) stands out relatively more, so the ratio comes out a bit lower.
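
To make the per-GPU pattern easier to check, the ratio column can be recomputed from the measured table. The sketch below copies the six measurements and groups the ratios by GPU; the variable and function names are hypothetical, and the ratios use unrounded theoretical values, so they may differ from the table in the second decimal place.

```python
from statistics import mean

# (GPU, bandwidth GB/s, model size GB, measured tok/s), copied from the measured table.
MEASUREMENTS = [
    ("RTX 3090", 936, 5.2, 126.4),
    ("RTX 3090", 936, 6.6, 98.0),
    ("RTX 3090", 936, 9.3, 76.6),
    ("RTX 3060", 360, 5.2, 60.1),
    ("RTX 3060", 360, 6.6, 46.6),
    ("RTX 3060", 360, 9.3, 34.5),
]


def efficiency(bandwidth_gb_s: float, size_gb: float, measured_tok_s: float) -> float:
    """Measured tok/s divided by the theoretical ceiling (bandwidth / model size)."""
    return measured_tok_s / (bandwidth_gb_s / size_gb)


# Group the ratios by GPU so the narrow- vs wide-bandwidth difference is visible.
ratios: dict[str, list[float]] = {}
for gpu, bw, size, measured in MEASUREMENTS:
    ratios.setdefault(gpu, []).append(efficiency(bw, size, measured))

if __name__ == "__main__":
    for gpu, values in ratios.items():
        print(f"{gpu}: mean ratio {mean(values):.2f} "
              f"(min {min(values):.2f}, max {max(values):.2f})")
```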

Limits of the estimate

This formula is only valid when the model fully fits in VRAM. If VRAM runs short and the model spills into CPU memory, speed drops dramatically (measured, it can fall to a tenth or less).
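
A minimal sketch of that validity check follows. The `reserve_gb` allowance for the KV cache and runtime overhead is a hypothetical value chosen for illustration, not a measured one; real headroom depends on context length and the runtime.

```python
def fits_in_vram(model_size_gb: float, vram_gb: float,
                 reserve_gb: float = 1.5) -> bool:
    """Return True if the model plus an assumed reserve fits in VRAM.

    reserve_gb is an assumption for this sketch (KV cache, runtime overhead).
    """
    return model_size_gb + reserve_gb <= vram_gb


if __name__ == "__main__":
    print(fits_in_vram(9.3, 12))  # 14B-class model (9.3GB) on a 12GB card
    print(fits_in_vram(9.3, 10))  # the same model on a 10GB card
```

When the check fails, the bandwidth formula should not be applied, because part of the model is served from CPU memory.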

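Also, a model can be split across two GPUs so that all of it stays in VRAM, which avoids the spill into CPU memory. In my setup, qwen3.5:27b (17GB) split across an RTX 3090 + RTX 3060 ran at 25.5 tok/s. Note that the two cards’ bandwidths typically do not simply add up: with layer-wise splitting, each card reads its own share of the layers in turn.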

Quick estimated-speed table by GPU (8B and 14B models)

| GPU | Bandwidth | 8B est. (tok/s) | 14B est. (tok/s) | Notes |
| --- | --- | --- | --- | --- |
| RTX 4060 Ti 8GB | 288 GB/s | 42 | - | 14B won’t fit in VRAM |
| RTX 3060 12GB | 360 GB/s | ★ 60 | ★ 35 | Author-measured |
| RTX 5060 Ti 16GB | 448 GB/s | 65 | 36 | |
| RTX 4070 Ti S 16GB | 672 GB/s | 97 | 54 | |
| RTX 5070 12GB | 672 GB/s | 97 | 54 | |
| RTX 3080 10GB | 760 GB/s | 110 | - | 14B won’t fit in VRAM |
| RTX 5070 Ti 16GB | 896 GB/s | 129 | 72 | |
| RTX 3090 24GB | 936 GB/s | ★ 126 | ★ 77 | Author-measured |
| RX 7900 XTX 24GB | 960 GB/s | 138 | 77 | Assumes ROCm (Linux) |
| RTX 4090 24GB | 1008 GB/s | 145 | 81 | |
| RTX 5090 32GB | 1792 GB/s | 258 | 144 | |

★ = author-measured (April 2026, Ollama 0.20.2, Linux). The rest are estimates from the formula (bandwidth ÷ model size × 0.75). Valid only when the model fully fits in VRAM. AMD GPUs are estimated in a ROCm environment.

Note: The estimates are guides only. They vary with the Ollama version, context length, number of concurrent requests, GPU cooling, and more. Cross-reference external benchmark sites (FormulaMod, AwesomeAgents) as well.

Full GPU value ranking

How to read this chart: the longer the bar, the higher the “LLM performance per unit price,” i.e., the better the value. The value metric is “LLM score ÷ GPU price (¥10k units).” Marked [new]/[used].

| GPU | Value (LLM score ÷ price in ¥10k) |
| --- | --- |
| RTX 5090 32GB [New] | 2.4 |
| RTX 4090 24GB [Used] | 2.9 |
| RX 7900XTX 24GB [New] | 3.3 |
| RTX 5080 16GB [New] | 3.5 |
| RTX 4080S 16GB [Used] | 4.2 |
| RTX 5070Ti 16GB [New] | 4.5 |
| RTX 5060Ti 16GB [New] | 4.7 |
| RTX 4070TiS 16GB [Used] | 5 |
| RX 9070 16GB [New] | 5 |
| RTX 4060Ti 16GB [Used] | 3.8 |
| RTX 4070S 12GB [Used] | 5.6 |
| RTX 5060Ti 8GB [New] | 5.7 |
| RTX 5060 8GB [New] | 5.8 |
| RTX 3090 24GB [Used] | 5.8 |
| RTX 5070 12GB [New] | 6 |
| RTX 3080 12GB [Used] | 7.5 |
| RTX 3060 12GB [Used] | 8 |

* The LLM score is a relative value of text-generation speed with Ollama (Q4_K_M quantized), based on RTX 4090 = 100. AMD is a reference value under ROCm (Linux). Prices are as of April 2026. Used prices are the midpoint of the shop range.
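
As a worked example of the value metric, the sketch below computes “LLM score ÷ price in ¥10k units” using the midpoint of a used price range. The function names are hypothetical; the inputs are the RTX 3060 (score 20, ¥20k–30k) and RTX 4090 (score 100, ¥300k–380k) figures from the tables in this article.

```python
def midpoint(low_yen: float, high_yen: float) -> float:
    """Midpoint of a shop price range, as used for used-card prices."""
    return (low_yen + high_yen) / 2


def value_score(llm_score: float, price_yen: float) -> float:
    """LLM score divided by the price expressed in units of ¥10,000."""
    if price_yen <= 0:
        raise ValueError("price_yen must be positive")
    return llm_score / (price_yen / 10_000)


if __name__ == "__main__":
    # Used RTX 3060: score 20, midpoint ¥25,000 -> 20 / 2.5
    print(round(value_score(20, midpoint(20_000, 30_000)), 1))    # 8.0
    # Used RTX 4090: score 100, midpoint ¥340,000 -> 100 / 34
    print(round(value_score(100, midpoint(300_000, 380_000)), 1))  # 2.9
```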

The used RTX 3060 12GB and RTX 3080 12GB stand out on value. Among new cards, the RTX 5070 12GB ranks highest (6.0), followed by the 8GB RTX 5060 and RTX 5060 Ti 8GB; the RTX 5060 Ti 16GB (4.7) is the highest-ranked new NVIDIA card with 16GB. The pricey RTX 5090 and RTX 4090 are high-performance but rank low on the value metric. That said, choosing on value alone biases you toward “cheap GPUs with little VRAM,” so the key is to secure the VRAM your use case needs first, then look at value.
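
The “VRAM first, then value” order can be sketched as a filter followed by a sort. The list below is a subset of the value ranking with VRAM sizes added, and `best_value` is a hypothetical helper for illustration only.

```python
# (card, VRAM in GB, value score) for a subset of the value ranking.
CARDS = [
    ("RTX 3060 12GB [Used]", 12, 8.0),
    ("RTX 3080 12GB [Used]", 12, 7.5),
    ("RTX 5070 12GB [New]", 12, 6.0),
    ("RTX 3090 24GB [Used]", 24, 5.8),
    ("RTX 4070TiS 16GB [Used]", 16, 5.0),
    ("RTX 5060Ti 16GB [New]", 16, 4.7),
    ("RX 7900XTX 24GB [New]", 24, 3.3),
]


def best_value(cards, min_vram_gb: int):
    """Keep only cards with enough VRAM, then rank them by value (highest first)."""
    eligible = [c for c in cards if c[1] >= min_vram_gb]
    return sorted(eligible, key=lambda c: c[2], reverse=True)


if __name__ == "__main__":
    for name, vram, value in best_value(CARDS, min_vram_gb=16):
        print(f"{name}: {value}")
```

With a 16GB minimum, the 12GB cards that lead the overall ranking drop out, and the used RTX 3090 comes first among what remains.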

Conclusion: if you’re torn, these three

| How to choose | Recommended GPU | Price guide | Reason |
| --- | --- | --- | --- |
| One new card | RTX 5060 Ti 16GB | ~¥90k | Cheapest class for 16GB VRAM in a CUDA environment. Runs AI image generation and LLMs practically |
| One used card | RTX 3060 12GB | ¥20k–30k | 12GB in the low ¥20,000s. LLM speed is slow, but cheapest if just loading the model comes first |
| Serious AI | RTX 3090 used / RX 7900 XTX new | ¥130k–180k | 24GB VRAM runs 32B models. The 3090 is a used CUDA setup; the 7900 XTX assumes Linux + ROCm |

As of April 2026, the RX 7900 XTX (24GB) has dropped to around ¥120k. That’s cheaper than a used RTX 3090 (¥130k–200k) and comes with a new-product warranty. If you can run Linux, the RX 7900 XTX is a very attractive option.

Summary

This article is a reference of the full GPU spec list and prices as of April 2026. Prices move, so check the latest street prices before buying.

For how to choose a GPU (recommendations by use case, VRAM guidelines, a budget-by-budget guide), see the related articles below.

Related
– I want to run an AI chatbot at home → budget-by-budget guide
– How to start local AI with a used GPU → used-GPU buying guide
– Getting started with AI image generation → ComfyUI intro



2026 GPU prices — by VRAM

What matters for local AI is the size of the graphics card’s working space, its VRAM: you can run as large a model as fits there. I looked up the main cards on sale as of June 2026 on price-comparison sites (kakaku.com, etc.) and grouped them by VRAM size. Prices move by shop and timing, so treat them as ballpark figures (across 2026, prices have trended high overall due to a memory shortage).

| VRAM | Main GPUs | Street price (new, guide) | Where it sits for local AI |
| --- | --- | --- | --- |
| 16GB | RX 9060 XT 16GB (AMD) | ~¥58k–64k | Cheapest-class 16GB. Affordable entry. Check software support |
| 16GB | RTX 5060 Ti 16GB | ~¥80k–95k | The new-NVIDIA entry point. Room up to 14B class. CUDA support makes it safe |
| 16GB | RX 9070 XT 16GB (AMD) | ~¥92k–115k | A top seller in 2026. Well-priced and high-performance |
| 16GB | RTX 5070 Ti / 5080 | ~¥150k–210k | Fast, but capacity tops out at 16GB |
| 20GB | RX 7900 XT (previous gen) | Mostly stock/used | A rare 20GB. Previous gen, so new stock is dwindling |
| 24GB | Used RTX 3090 | ~¥150k–200k (used) | The shortest path to 24GB. The real pick that runs 30B class. Spiking on AI demand and memory shortage; used only; check condition |
| 24GB | RX 7900 XTX / used RTX 4090 | Mostly previous gen/used | There’s almost no current-gen 24GB new; previous gen or used is realistic |
| 32GB | RTX 5090 | ~¥690k–720k | 70B class in reach. Scarce, spiking, high power |

Prices are ballpark figures looked up on price-comparison sites as of June 2026, not prices I confirmed first-hand, and they will change. Check used prices and condition before buying.

For local AI, the two realistic choices in 2026 are: to start affordably new, 16GB (RTX 5060 Ti or the RX 9060/9070 line); to aim for 24GB, a used RTX 3090 (recently spiking). New 24GB has all but vanished from the current generation (even the RTX 5080 is 16GB), and beyond that you jump straight to the 32GB RTX 5090 (~¥700k). Decide first how large a model you want to run, then choose from the VRAM class that fits it and you won’t go wrong. Note that AMD (RX line) support varies by software, so it’s wise to confirm the tools you want to use actually run before buying.
