2026 GPU Spec List for Local AI, VR and Image Generation

July 5, 2026

How to choose a GPU and why VRAM matters so much are covered in separate articles.

Related
For how to choose a GPU and a budget-by-budget guide, see → Running a local AI chatbot at home: a budget-by-budget guide

This article lists the full specs and prices of every GPU you can buy as of April 2026. If you’re weighing your options, bookmark it and use it for comparison.

Full GPU spec list (as of April 2026)

How to read this table
VRAM = the amount of memory on the GPU. Since an AI model has to load in its entirety, this is the single most important spec.
Price = street price as of April 2026 (tax included). It changes daily.
Comment = the practical points for AI and VR use.
Rows with a yellow background are the good-value picks.

NVIDIA GeForce RTX 50 series (Blackwell, released 2025–2026)

GPU	VRAM	New price	Used range	Comment
RTX 5090	32GB	¥400k–700k (list ~¥400k; inflated on shortages)	–	The everything-included flagship. Power draw and price are top-tier too
RTX 5080	16GB	¥210k–250k	–	Great for max-setting VR. For LLMs, 16GB is the ceiling
RTX 5070 Ti	16GB	~¥160k	–	A balanced VR + AI pick. Modest power draw too
RTX 5070	12GB	~¥100k	–	Good for VR, but 12GB is limiting for AI
RTX 5060 Ti 16GB	16GB	~¥90k–100k	–	Cheapest-class 16GB CUDA. Great for AI image generation
RTX 5060 Ti 8GB	8GB	~¥70k	–	Entry VR. LLMs up to 8B models
RTX 5060	8GB	~¥60k	–	Low power. For VR + light AI use

See detailed RTX 50-series specs (bandwidth, TDP, LLM score)

GPU	Memory bandwidth	TDP	LLM score
RTX 5090	1792 GB/s	575W	120
RTX 5080	960 GB/s	360W	80
RTX 5070 Ti	896 GB/s	300W	72
RTX 5070	672 GB/s	250W	60
RTX 5060 Ti 16GB	448 GB/s	180W	45
RTX 5060 Ti 8GB	448 GB/s	180W	40
RTX 5060	448 GB/s	145W	35

This is the generation to buy new in spring 2026. The RTX 5060 Ti 16GB (~¥90k–110k) stands out as the cheapest class of card to run 16GB of VRAM in a CUDA environment. If your budget has room, the RTX 5070 Ti is the recommended balanced pick.

MSI GeForce RTX 5060 Ti 16GB VENTUS 2X OC PLUS

¥102,353 (as of 2026-06-22)


NVIDIA GeForce RTX 40 series (Ada Lovelace, released 2022–2024)

GPU	VRAM	New price	Used range	Comment
RTX 4090	24GB	¥460k–550k	¥300k–380k	The reference. Prices spiking as production ends
RTX 4080 SUPER	16GB	Low stock	¥150k–180k	Max-setting VR. A bargain used
RTX 4080	16GB	Low stock	¥130k–160k	Nearly identical to the 4080 SUPER. Used is a good buy
RTX 4070 Ti SUPER	16GB	Low stock	¥110k–140k	Tends to be the cheapest 16GB used
RTX 4070 Ti	12GB	Low stock	¥90k–110k	Good for VR. 12GB is limiting for AI
RTX 4070 SUPER	12GB	Low stock	¥80k–100k	Low power, comfy VR. Good used value
RTX 4070	12GB	Low stock	¥70k–90k	Entry low-power VR. 12GB works for image generation too
RTX 4060 Ti 16GB	16GB	Low stock	¥70k–100k	Narrow bandwidth, so LLMs are slow. The 16GB version is scarce even used
RTX 4060 Ti 8GB	8GB	Low stock	¥40k–50k	For lightweight VR titles
RTX 4060	8GB	Low stock	¥30k–40k	Lowest power. LLMs only up to 8B models

See detailed RTX 40-series specs (bandwidth, TDP, LLM score)

GPU	Memory bandwidth	TDP	LLM score
RTX 4090	1008 GB/s	450W	100
RTX 4080 SUPER	736 GB/s	320W	70
RTX 4080	716 GB/s	320W	68
RTX 4070 Ti SUPER	672 GB/s	285W	62
RTX 4070 Ti	504 GB/s	285W	55
RTX 4070 SUPER	504 GB/s	220W	50
RTX 4070	504 GB/s	200W	45
RTX 4060 Ti 16GB	288 GB/s	165W	32
RTX 4060 Ti 8GB	288 GB/s	160W	28
RTX 4060	272 GB/s	115W	22

The RTX 40 series is nearly out of stock new, so this is a generation to buy used. The RTX 4070 Ti SUPER tends to fall into the cheapest band for used 16GB cards and handles both VR and AI, which makes it a well-balanced option. Note that the 4060 Ti 16GB has narrow memory bandwidth (288 GB/s), so LLM generation is slow.

NVIDIA GeForce RTX 30 series (Ampere, released 2020–2022)

GPU	VRAM	New price	Used range	Comment
RTX 3090 Ti	24GB	Discontinued	¥140k–180k	About the same performance as the 3090. High 450W draw. The 3090 is often cheaper
RTX 3090	24GB	Discontinued	¥100k–140k	24GB in the ¥100k range. The used LLM standard
RTX 3080 Ti	12GB	Discontinued	¥60k–80k	Wide bandwidth but 12GB VRAM. Gaming + light AI
RTX 3080 12GB	12GB	Discontinued	¥50k–70k	Nearly identical to the 3080 Ti
RTX 3080 10GB	10GB	Discontinued	¥40k–60k	10GB is awkward. 8B models will run
RTX 3070 Ti	8GB	Discontinued	¥30k–50k	Cheap as a used VR card
RTX 3070	8GB	Discontinued	¥30k–40k	The used entry-VR standard
RTX 3060 Ti	8GB	Discontinued	¥25k–40k	Good-value used VR card
RTX 3060	12GB	Discontinued	¥20k–30k	12GB in the low ¥20,000s. Narrow bandwidth, but good for entry-level LLMs

See detailed RTX 30-series specs (bandwidth, TDP, LLM score)

GPU	Memory bandwidth	TDP	LLM score
RTX 3090 Ti	1008 GB/s	450W	75
RTX 3090	936 GB/s	350W	70
RTX 3080 Ti	912 GB/s	350W	48
RTX 3080 12GB	912 GB/s	350W	45
RTX 3080 10GB	760 GB/s	320W	42
RTX 3070 Ti	608 GB/s	290W	32
RTX 3070	448 GB/s	220W	30
RTX 3060 Ti	448 GB/s	200W	25
RTX 3060	360 GB/s	170W	20

A mining-era generation, so used stock is plentiful. The RTX 3060 12GB (¥20k–30k) and RTX 3090 24GB (¥130k–180k) are the good-value used picks. The 3060 has narrow bandwidth so LLM generation is slow, but 12GB is enough to load the model. The 3090’s 24GB makes it usable for serious local AI.

[Used] ELSA GeForce RTX 3060 12GB

¥52,800 (as of 2026-06-22)

[Used] MSI GeForce RTX 3090 GAMING X TRIO 24GB

¥148,000 (as of 2026-05-01)

AMD Radeon RX 9000 series (RDNA 4, released 2025)

GPU	VRAM	New price	Used range	Comment
RX 9070 XT	16GB	~¥90k–100k	–	Gaming performance around the 5070 Ti. Linux recommended for AI
RX 9070	16GB	~¥80k	–	16GB for ¥80k. AI-tool support on Windows is unstable

AMD Radeon RX 7000 series (RDNA 3, released 2022–2024)

GPU	VRAM	New price	Used range	Comment
RX 7900 XTX	24GB	~¥180k	¥120k–150k	24GB for ¥180k. A strong pick with ROCm + Linux
RX 7900 XT	20GB	~¥130k	¥90k–120k	20GB is an unusual size. Worth considering at the right price
RX 7900 GRE	16GB	~¥80k	¥60k–80k	16GB in the ¥80k range. For AMD fans
RX 7800 XT	16GB	~¥70k	¥50k–70k	Good for gaming. AI requires a ROCm environment
RX 7700 XT	12GB	~¥50k	¥40k–50k	12GB for ¥50k. Gaming-oriented

See detailed AMD Radeon specs (bandwidth, TDP, LLM score)

GPU	Memory bandwidth	TDP	LLM score
RX 9070 XT	640 GB/s	304W	45 (ROCm)
RX 9070	640 GB/s	220W	40 (ROCm)
RX 7900 XTX	960 GB/s	355W	60 (ROCm)
RX 7900 XT	800 GB/s	315W	50 (ROCm)
RX 7900 GRE	576 GB/s	260W	38 (ROCm)
RX 7800 XT	624 GB/s	263W	35 (ROCm)
RX 7700 XT	432 GB/s	245W	25 (ROCm)

AMD Radeon’s appeal is its low price per GB of VRAM, but AI-tool support (Ollama, ComfyUI, etc.) trails NVIDIA. On Windows, CUDA-targeted tools often won’t run, so assume you will need a Linux + ROCm environment. If gaming is your main use, the value is high, but for AI-first use, NVIDIA is the safer bet.

A note on AMD’s AI support: the LLM scores for AMD GPUs are reference values for a ROCm (Linux) environment. On Windows, AI-tool support is often less stable than with CUDA, and the scores may not hold.

Quick recommendation table by VRAM

How to read this table
For each VRAM capacity, the table lists one recommended new GPU and one recommended used GPU.
AI use guide = the AI models you can practically run with that VRAM capacity.

VRAM	New pick	Used pick	AI use guide
8GB	RTX 5060 (~¥60k)	RTX 3060 Ti (¥25k–40k)	8B LLM, SD 1.5, medium VR settings
12GB	RTX 5070 (~¥100k)	RTX 3060 (¥20k–30k)	14B LLM, SDXL, high VR settings
16GB	RTX 5060 Ti 16GB (~¥90k–110k)	RTX 4070 Ti SUPER (¥110k–140k)	14B + long context, complex SDXL workflows
20–24GB	RX 7900 XTX (~¥180k)	RTX 3090 (¥130k–180k)	32B LLM, FLUX Dev, nearly all image generation
32GB	RTX 5090 (¥400k–700k; list ~¥400k, inflated on shortages)	–	32B + very long context (32K+)

Notes on the spec tables

Why memory bandwidth matters

LLM generation speed is roughly proportional to memory bandwidth. Even at the same VRAM capacity, a GPU with wider bandwidth generates text faster. For example, the RTX 3060 (12GB / 360 GB/s) and the RTX 3080 Ti (12GB / 912 GB/s) differ by about 2.5x in bandwidth, and the same model shows a difference of more than 2x in generation speed.
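As a rough check on that comparison, the sketch below divides the two bandwidth figures from the spec tables. It assumes generation speed scales with bandwidth alone, which is a simplification, and `bandwidth_ratio` is a hypothetical helper name used only for illustration.

```python
# Memory bandwidth in GB/s, copied from the spec tables in this article.
BANDWIDTH_GBPS = {
    "RTX 3060": 360,
    "RTX 3080 Ti": 912,
}


def bandwidth_ratio(card_a: str, card_b: str) -> float:
    """Return card_a's memory bandwidth divided by card_b's."""
    return BANDWIDTH_GBPS[card_a] / BANDWIDTH_GBPS[card_b]


ratio = bandwidth_ratio("RTX 3080 Ti", "RTX 3060")
print(f"RTX 3080 Ti has {ratio:.2f}x the bandwidth of the RTX 3060")
```

The ratio is an upper bound on the expected speed-up; real runs also spend time on work that does not depend on bandwidth.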

About the LLM score

The scores in this article are relative values with the RTX 4090 set to 100, based on text-generation speed with Q4_K_M quantized models. Actual speed varies with the model and context length. AMD GPU scores are reference values in a ROCm (Linux) environment.

Cautions when buying a used GPU

Used prices are based on shop stock (with a warranty). Flea-market apps can be cheaper, but there’s a risk of mining history and fan wear, so I recommend buying from a shop.

How generation speed is estimated

The “estimated tok/s” figures in this and related articles are calculated with the following formula.

estimated tok/s ≈ memory bandwidth (GB/s) ÷ model size (GB) × 0.75

What this formula means

Every time a GPU generates one token, it reads all of the model’s parameters from memory. The ratio between “the minimum time to read all of it” and “the time it actually takes to generate one token” is about 0.75. The remaining 0.25 goes to reading and writing the past conversation history (the KV cache) and to dequantizing the compressed weights (Q4 quantization).
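The formula can be written as a small function. This is a minimal sketch rather than the author’s own tooling: the function names are hypothetical, and the model sizes (5.2 GB for qwen3:8b, 9.3 GB for qwen3:14b) are the ones listed in the measured-data table in this article.

```python
EFFICIENCY = 0.75  # empirical coefficient used by the article's formula


def theoretical_tok_s(bandwidth_gbps: float, model_size_gb: float) -> float:
    """Ceiling: one token per full read of the model weights."""
    return bandwidth_gbps / model_size_gb


def estimated_tok_s(bandwidth_gbps: float, model_size_gb: float,
                    efficiency: float = EFFICIENCY) -> float:
    """Estimate tok/s as bandwidth / model size * efficiency."""
    return theoretical_tok_s(bandwidth_gbps, model_size_gb) * efficiency


# Model sizes in GB as listed in this article (Q4_K_M builds).
MODEL_SIZE_GB = {"qwen3:8b": 5.2, "qwen3:14b": 9.3}

# Example: RTX 5060 Ti 16GB at 448 GB/s.
for name, size_gb in MODEL_SIZE_GB.items():
    print(name, round(estimated_tok_s(448, size_gb)))
```

Passing a different `efficiency` makes it easy to see how sensitive the estimate is to the coefficient.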

Basis for 0.75: measured data

Below are results measured on an RTX 3090 and an RTX 3060 under conditions where the model fully fits in VRAM (April 15, 2026, Ollama 0.20.2, 256-token generation, Linux).

GPU	Bandwidth	Model	Size	Theoretical tok/s	Measured tok/s	Measured ÷ theoretical
RTX 3090	936 GB/s	qwen3:8b	5.2GB	180	126.4	0.70
RTX 3090	936 GB/s	qwen3.5:9b	6.6GB	142	98.0	0.69
RTX 3090	936 GB/s	qwen3:14b	9.3GB	101	76.6	0.76
RTX 3060	360 GB/s	qwen3:8b	5.2GB	69	60.1	0.87
RTX 3060	360 GB/s	qwen3.5:9b	6.6GB	55	46.6	0.85
RTX 3060	360 GB/s	qwen3:14b	9.3GB	39	34.5	0.89

* Theoretical tok/s = memory bandwidth ÷ model size. This is the theoretical ceiling, assuming “one token can be generated in the time it takes to read all parameters once.”

“Measured ÷ theoretical” falls in the 0.69–0.89 range. The median of these six runs is about 0.8; the formula uses 0.75, which is close to the RTX 3090 results, so it tends to underestimate narrow-bandwidth cards such as the RTX 3060. The narrower a GPU’s bandwidth (RTX 3060), the more memory-read waiting dominates, so results land close to the theoretical value. On a wide-bandwidth GPU (RTX 3090), the non-read work (reading/writing conversation history, decompression) stands out relatively more, so the ratio comes out a bit lower.
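The last column of the table can be recomputed from the other columns. The sketch below does that with the six measured rows copied from the table; the `efficiency` helper and the tuple layout are illustrative choices, not part of the original measurements.

```python
# (GPU, bandwidth GB/s, model, model size GB, measured tok/s) from the table above.
RUNS = [
    ("RTX 3090", 936, "qwen3:8b", 5.2, 126.4),
    ("RTX 3090", 936, "qwen3.5:9b", 6.6, 98.0),
    ("RTX 3090", 936, "qwen3:14b", 9.3, 76.6),
    ("RTX 3060", 360, "qwen3:8b", 5.2, 60.1),
    ("RTX 3060", 360, "qwen3.5:9b", 6.6, 46.6),
    ("RTX 3060", 360, "qwen3:14b", 9.3, 34.5),
]


def efficiency(bandwidth_gbps: float, size_gb: float, measured_tok_s: float) -> float:
    """Measured tok/s divided by the theoretical ceiling (bandwidth / size)."""
    return measured_tok_s / (bandwidth_gbps / size_gb)


ratios = [round(efficiency(bw, size, tok), 2) for _, bw, _, size, tok in RUNS]

for (gpu, _, model, _, _), ratio in zip(RUNS, ratios):
    print(f"{gpu:9} {model:11} {ratio:.2f}")
print("range:", min(ratios), "to", max(ratios))
```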

Limits of the estimate

This formula is only valid when the model fully fits in VRAM. If VRAM runs short and the model spills into CPU memory, speed drops dramatically (in measured runs, it can fall to a tenth or less).
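One way to respect that limit is to refuse to estimate when the model does not fit. The sketch below is an illustration under a stated assumption: `HEADROOM_GB` is a hypothetical 1.5 GB allowance for the KV cache and runtime buffers. It is not a figure from this article, and real headroom depends on context length and the runtime.

```python
from typing import Optional

EFFICIENCY = 0.75   # coefficient from the article's formula
HEADROOM_GB = 1.5   # ASSUMPTION: hypothetical allowance for KV cache and buffers


def fits_in_vram(model_size_gb: float, vram_gb: float,
                 headroom_gb: float = HEADROOM_GB) -> bool:
    """True if the model plus the assumed headroom fits in VRAM."""
    return model_size_gb + headroom_gb <= vram_gb


def estimate_or_none(bandwidth_gbps: float, model_size_gb: float,
                     vram_gb: float) -> Optional[float]:
    """Return the estimated tok/s, or None when the formula does not apply."""
    if not fits_in_vram(model_size_gb, vram_gb):
        return None  # model would spill into CPU memory; the estimate is invalid
    return bandwidth_gbps / model_size_gb * EFFICIENCY


# A 9.3 GB model (qwen3:14b) on an 8 GB, a 10 GB, and a 12 GB card.
print(estimate_or_none(288, 9.3, 8))    # does not fit
print(estimate_or_none(760, 9.3, 10))   # does not fit with the assumed headroom
print(estimate_or_none(360, 9.3, 12))   # fits
```

Returning `None` instead of a number keeps an invalid estimate from being mistaken for a slow but usable one.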

Also, when a model is split across two GPUs, you can combine the bandwidth of both, so it’s faster than a single card. In my setup, qwen3.5:27b (17GB) split across an RTX 3090 + RTX 3060 ran at 25.5 tok/s.

Quick estimated-speed table by GPU (8B and 14B models)

GPU	Bandwidth	8B est.	14B est.	Notes
RTX 4060 Ti 8GB	288 GB/s	42	—	14B won’t fit in VRAM
RTX 3060 12GB	360 GB/s	★ 60	★ 35	Author-measured
RTX 5060 Ti 16GB	448 GB/s	65	36
RTX 4070 Ti S 16GB	672 GB/s	97	54
RTX 5070 12GB	672 GB/s	97	54
RTX 3080 10GB	760 GB/s	110	—	14B won’t fit in VRAM
RTX 5070 Ti 16GB	896 GB/s	129	72
RTX 3090 24GB	936 GB/s	★ 126	★ 77	Author-measured
RX 7900 XTX 24GB	960 GB/s	138	77	Assumes ROCm (Linux)
RTX 4090 24GB	1008 GB/s	145	81
RTX 5090 32GB	1792 GB/s	258	144

★ = author-measured (April 2026, Ollama 0.20.2, Linux). The rest are estimates from the formula (bandwidth ÷ model size × 0.75). Valid only when the model fully fits in VRAM. AMD GPUs are estimated in a ROCm environment.

Note: The estimates are guides only. They vary with the Ollama version, context length, number of concurrent requests, GPU cooling, and more. Cross-reference external benchmark sites (FormulaMod, AwesomeAgents) as well.

Full GPU value ranking

How to read this table: the higher the value, the higher the “LLM performance per unit price”, i.e., the better the value. The value metric is “LLM score ÷ GPU price (in units of ¥10k)”. Each entry is marked New or Used.

GPU	New/Used	Value
RTX 5090 32GB	New	2.4
RTX 4090 24GB	Used	2.9
RX 7900 XTX 24GB	New	3.3
RTX 5080 16GB	New	3.5
RTX 4080 SUPER 16GB	Used	4.2
RTX 5070 Ti 16GB	New	4.5
RTX 5060 Ti 16GB	New	4.7
RTX 4070 Ti SUPER 16GB	Used	(not shown)
RX 9070 16GB	New	(not shown)
RTX 4060 Ti 16GB	Used	3.8
RTX 4070 SUPER 12GB	Used	5.6
RTX 5060 Ti 8GB	New	5.7
RTX 5060 8GB	New	5.8
RTX 3090 24GB	Used	5.8
RTX 5070 12GB	New	(not shown)
RTX 3080 12GB	Used	7.5
RTX 3060 12GB	Used	(not shown)

* The LLM score is a relative value of text-generation speed with Ollama (Q4_K_M quantized), based on RTX 4090 = 100. AMD is a reference value under ROCm (Linux). Prices are as of April 2026. Used prices are the midpoint of the shop range.
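The value metric described above can be sketched as follows. The LLM scores and prices are copied from the tables in this article, with used prices taken at the midpoint of the shop range; `value_score` and `midpoint` are hypothetical helper names, and the choice of example cards is arbitrary.

```python
def midpoint(low_yen: float, high_yen: float) -> float:
    """Midpoint of a price range in yen."""
    return (low_yen + high_yen) / 2


def value_score(llm_score: float, price_yen: float) -> float:
    """LLM score divided by the price expressed in units of 10,000 yen."""
    return llm_score / (price_yen / 10_000)


# (LLM score, price in yen) taken from the spec tables in this article.
CARDS = {
    "RTX 3060 12GB [Used]": (20, midpoint(20_000, 30_000)),
    "RTX 3080 12GB [Used]": (45, midpoint(50_000, 70_000)),
    "RTX 5070 Ti 16GB [New]": (72, 160_000),
    "RTX 4090 24GB [Used]": (100, midpoint(300_000, 380_000)),
}

# Print the cards from best to worst value.
ranked = sorted(CARDS.items(), key=lambda item: value_score(*item[1]), reverse=True)
for name, (score, price) in ranked:
    print(f"{name:24} {value_score(score, price):.1f}")
```

Because the metric divides by price, it moves whenever street prices do, so the ranking is only as current as the prices fed into it.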

The used RTX 3060 12GB and RTX 3080 12GB stand out on value. Among new cards, the RTX 5070 12GB and RTX 5060 Ti 16GB rank high. The pricey RTX 5090 and RTX 4090 are high-performance but rank low on the value metric. That said, choosing on value alone biases you toward “cheap GPUs with little VRAM,” so the key is to secure the VRAM your use case needs first, then look at value.

Conclusion: if you’re torn, these three

How to choose	Recommended GPU	Price guide	Reason
One new card	RTX 5060 Ti 16GB	~¥90k	Cheapest class for 16GB VRAM in a CUDA environment. Runs AI image generation and LLMs practically
One used card	RTX 3060 12GB	¥20k–30k	12GB in the low ¥20,000s. LLM speed is slow, but cheapest if just loading the model comes first
Serious AI	RTX 3090 used / RX 7900 XTX new	¥130k–180k	24GB VRAM runs 32B models. The 3090 is a used CUDA setup; the 7900 XTX assumes Linux + ROCm


As of April 2026, the RX 7900 XTX (24GB) has dropped to around ¥120k. That’s cheaper than a used RTX 3090 (¥130k–200k) and comes with a new-product warranty. If you can run Linux, the RX 7900 XTX is a very attractive option.

Summary

This article is a reference of the full GPU spec list and prices as of April 2026. Prices move, so check the latest street prices before buying.

For how to choose a GPU (recommendations by use case, VRAM guidelines, a budget-by-budget guide), see the related articles below.

Related
– I want to run an AI chatbot at home → budget-by-budget guide
– How to start local AI with a used GPU → used-GPU buying guide
– Getting started with AI image generation → ComfyUI intro


2026 GPU prices — by VRAM

What matters for local AI is the size of the graphics card’s working memory (VRAM): a model can run only if it fits there. I looked up the main cards on sale as of June 2026 on price-comparison sites (kakaku.com, etc.) and grouped them by VRAM size. Prices move by shop and timing, so treat them as ballpark figures (across 2026, prices have trended high overall due to a memory shortage).

VRAM	Main GPUs	Street price (new, guide)	Where it sits for local AI
16GB	RX 9060 XT 16GB (AMD)	~¥58k–64k	Cheapest-class 16GB. Affordable entry. Check software support
16GB	RTX 5060 Ti 16GB	~¥80k–95k	The new-NVIDIA entry point. Room up to 14B class. CUDA support makes it safe
16GB	RX 9070 XT 16GB (AMD)	~¥92k–115k	A top seller in 2026. Well-priced and high-performance
16GB	RTX 5070 Ti / 5080	~¥150k–210k	Fast, but capacity tops out at 16GB
20GB	RX 7900 XT (previous gen)	Mostly stock/used	A rare 20GB. Previous gen, so new stock is dwindling
24GB	Used RTX 3090	~¥150k–200k (used)	The shortest path to 24GB. The real pick that runs 30B class. Spiking on AI demand and memory shortage; used only; check condition
24GB	RX 7900 XTX / used RTX 4090	Mostly previous gen/used	There’s almost no current-gen 24GB new; previous gen or used is realistic
32GB	RTX 5090	~¥690k–720k	70B class in reach. Scarce, spiking, high power

Prices are ballpark figures looked up on price-comparison sites as of June 2026 and will change (not measured). Check used prices and condition before buying.

For local AI, there are two realistic choices in 2026. To start affordably with a new card, go for 16GB (the RTX 5060 Ti or the RX 9060/9070 line). To aim for 24GB, buy a used RTX 3090 (prices have recently spiked). New 24GB cards have all but vanished from the current generation (even the RTX 5080 is 16GB), and beyond that you jump straight to the 32GB RTX 5090 (~¥700k). Decide first how large a model you want to run, then choose from the VRAM class that fits it. Note that AMD (RX line) support varies by software, so confirm that the tools you want to use actually run before buying.
