The Family Tree of Open-Source LLMs: 8 Major Families

July 5, 2026

The text-generation AI models you can run on your own PC (local LLMs, meaning conversational AI that runs on your own hardware) fall into a handful of distinct “families.” All of them are free and openly published (their weight data, and in many cases their code, can be downloaded by anyone, though license terms vary by family), but the companies behind them, what they’re good at, and the paths they’ve taken all differ. This article rounds up the eight major families worth knowing right now, looking at who built each one, how it grew, and whether you can run it on your own machine.

The short version first: around 2023, Meta’s Llama lit the fuse for the open-source movement; from there Qwen, Gemma, DeepSeek and others competed fiercely, and as of 2026 the pace of updates stays brisk, led mainly by Chinese labs. The finer history, family by family, is below.

Contents

The lineage at a glance
1. Llama (Meta): the family that lit the open-source fuse
2. Qwen (Alibaba): the all-rounder that became the local default
3. Gemma (Google): the light, efficient family
4. Mistral (France): the European family that spread MoE
5. DeepSeek (China): the family that surprised everyone with reasoning and efficiency
6. GLM (Zhipu): the family strong at code and agents
7. gpt-oss (OpenAI): the family OpenAI released after a long gap
8. MiniMax (China): the newcomer large-agent family
By the way: fine-tunes (modified versions) as offshoots
Summary: get a handle on the families first, then choose

The lineage at a glance

Major open-source LLM families (roughly)

US and European labs
Llama (Meta) / Gemma (Google) / gpt-oss (OpenAI) / Mistral (France)

Chinese labs
Qwen (Alibaba) / DeepSeek / GLM (Zhipu) / MiniMax

The big picture: Llama lit the fuse in 2023 → others followed → across 2025–2026, Chinese labs and newcomers (like gpt-oss) keep shipping high-performance open models one after another.

1. Llama (Meta) — the family that lit the open-source fuse

Released in 2023 by Meta (formerly Facebook), this family is the starting point of today’s local-LLM boom. It began as a research-only release, but later versions moved to a license that also allows commercial use (open-weight, meaning the weight data is published), and it became the base for fine-tunes all over the world. It has grown more capable across generations, and recent releases adopt a “run only the experts you need” design (MoE, Mixture of Experts). You can run it locally, though the larger models call for a graphics card with plenty of memory.

2. Qwen (Alibaba) — the all-rounder that became the local default

Built by China’s Alibaba, this family has been updated at short intervals since around 2023 and has become the default choice for local LLMs. It spans a wide range of sizes, from tiny to enormous, and many of its models are “jack-of-all-trades” types that handle prose, math, and code alike. There are code-specialized versions (Qwen-Coder) and a solid lineup of MoE versions that run only the experts you need. On this blog it’s a mainstay of my measurements, standing out for how easy it is to work with and how well-balanced it is.

3. Gemma (Google) — the light, efficient family

Google publishes this family based on the technology behind its large “Gemini” AI. Its selling point is efficiency: it stays capable at smaller sizes, which makes it easy to run even on a modest PC. With each generation the range has widened, adding versions that can handle images and MoE versions that run only the experts you need. It’s a good family for beginners who just want to try things out.

4. Mistral (France) — the European family that spread MoE

Built by France’s Mistral AI. It drew attention with the small but capable “Mistral 7B,” and its follow-up, “Mixtral,” was the driving force that spread the “run only the experts you need” approach (MoE) through the open-source world. As a European effort it has real presence, and it tends to get picked when you want a balance of light weight and performance.
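The “run only the experts you need” idea can be sketched in a few lines of plain Python. This is a toy illustration, not any model's actual implementation: the eight "experts" are simple scaling functions, the router scores are made-up numbers for a single token, and the function names (`route_top_k`, `moe_layer`) are hypothetical. The point it shows is that only the top-scoring experts are ever called, and their outputs are blended by softmax weights.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_scores)),
                   key=lambda i: router_scores[i], reverse=True)
    top = order[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))


def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs."""
    chosen = route_top_k(router_scores, k)
    output = sum(w * experts[i](x) for i, w in chosen)
    return output, [i for i, _ in chosen]


# Eight toy "experts": each one just scales its input by a different factor.
experts = [lambda x, s=s: x * s for s in range(1, 9)]

# Made-up router scores for one token (an assumption for illustration).
router_scores = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 0.2]

output, used = moe_layer(3.0, experts, router_scores, k=2)
print("experts used:", used, "output:", output)
```

Six of the eight experts are never evaluated for this token, which is why an MoE model can hold a very large total parameter count while doing far less work per token than its size suggests.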

5. DeepSeek (China) — the family that surprised everyone with reasoning and efficiency

China’s DeepSeek is a family that aggressively adopted MoE (huge overall, but activating only a small part of the model at a time for efficiency) along with a technique that predicts several tokens (roughly, words) ahead to go faster (MTP, Multi-Token Prediction). Its “reasoning-focused” models, which take their time to think before answering, became a global talking point in particular. The full models are too large to run comfortably on a home machine, but compressed (quantized) versions are in circulation.
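To see why quantized versions matter, a rough back-of-the-envelope estimate helps: the memory needed for the weights alone is about the parameter count times the bits per weight, divided by 8. The sketch below is only that arithmetic; the function name is hypothetical, the 7-billion-parameter figure is just an example size, and real usage is higher because of context cache, runtime overhead, and the extra metadata quantized formats store.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores context cache, activations, and quantization metadata,
    so treat the result as a lower bound.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Example: the same 7-billion-parameter model at three precisions.
for bits in (16, 8, 4):
    print(f"7B parameters at {bits}-bit: about {weight_memory_gb(7, bits):.1f} GB")
```

Going from 16-bit to 4-bit weights cuts the footprint to roughly a quarter, which is often the difference between a model that does not fit on a home machine and one that does.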

6. GLM (Zhipu) — the family strong at code and agents

Built by China’s Zhipu AI (with roots at Tsinghua University), this family began with the conversational model “ChatGLM” and, over successive generations, grew strong at writing code and at “agent” use, meaning working autonomously with tools. As of 2026 its top models are extremely capable, but correspondingly very large, so what you can realistically run at home is mainly the lightweight versions (Flash and the like).

7. gpt-oss (OpenAI) — the family OpenAI released after a long gap

OpenAI, known for ChatGPT, released open weight data again with this family after a long absence. It uses an MoE design that runs only the experts you need, supports reasoning and tool use, and ranks among the stronger open models. It’s on the large side, but it runs at practical speed on mini PCs with plenty of memory. In my own measurements on this blog, it was a model with a good balance of smarts and speed.

8. MiniMax (China) — the newcomer large-agent family

China’s MiniMax is a newcomer family that focuses on handling long text and on autonomous agent use. In 2026 it released a large open model that came within a hair of the frontier (the cutting-edge paid AIs). Its total parameter count is enormous, but thanks to the “run only the experts you need” design, it runs at practical speed on gear with plenty of memory.

By the way: fine-tunes (modified versions) as offshoots

The eight families above are the “base” originals, but there are also many “fine-tunes (modified versions)” that retrain those bases for a specific purpose. For example, there are modified versions reinforced for code, which are offshoots whose parents are originals like Gemma or Qwen. When choosing a local LLM, it’s easiest to first get a handle on the base families, then look for a modified version that fits your use case.

Summary — get a handle on the families first, then choose

Local LLMs grew up like this: Llama lit the fuse, Qwen, Gemma, and Mistral solidified the base, and DeepSeek, GLM, gpt-oss, and MiniMax competed to push performance higher. As of 2026, the momentum of updates clearly favors Chinese labs, now joined by OpenAI’s gpt-oss. Start by pinning down “which family,” then check “is there a size that fits my PC?” and “does it match my use case (prose or code)?” That order makes the choice far less confusing. For how well a given model actually runs on a home machine, the records I’ve measured on this blog should be a useful reference.
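The three questions above (which family, does a size fit, does it match the use case) amount to a simple filter. Here is a minimal sketch of that filter. Everything in it is hypothetical: the `Candidate` fields, the `shortlist` function, the 2 GB headroom default, and the catalog entries, which use placeholder names and made-up sizes rather than real models.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    family: str
    name: str
    weights_gb: float      # size of the (possibly quantized) weights
    good_for: frozenset    # use cases, e.g. {"prose", "code"}


# Placeholder catalog: names and sizes are invented for illustration only.
CATALOG = [
    Candidate("family-a", "family-a-small", 5.0, frozenset({"prose"})),
    Candidate("family-a", "family-a-coder", 9.0, frozenset({"code"})),
    Candidate("family-a", "family-a-large", 40.0, frozenset({"prose", "code"})),
    Candidate("family-b", "family-b-small", 4.0, frozenset({"prose", "code"})),
]


def shortlist(candidates, family, memory_gb, use_case, headroom_gb=2.0):
    """Apply the three checks in order: family, fits in memory, use case."""
    return [
        c.name
        for c in candidates
        if c.family == family
        and c.weights_gb + headroom_gb <= memory_gb  # leave room for context
        and use_case in c.good_for
    ]


print(shortlist(CATALOG, "family-a", memory_gb=16, use_case="code"))
```

In practice you would fill the catalog from the model pages of the family you picked and set `memory_gb` to the graphics or unified memory your machine actually has.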

This article is a summary compiled from each company’s public information and news coverage, not hands-on measurements of my own (the histories and dates are approximate). Technical terms are noted in parentheses.
