Claude Haiku vs Sonnet vs Opus: Which to Use When
Last updated May 2026. We tested all three current Claude tiers (Haiku 4.5, Sonnet 4.6, Opus 4.6) on the same fixed task set. We pay full price for our Claude Pro subscription, and we earn an affiliate commission if you sign up through our links.
Quick verdict: pick Sonnet for almost everything. Pick Haiku when speed and cost matter more than quality (high-volume classification, real-time chat, agent loops). Pick Opus for the hardest reasoning tasks (research, complex code, ambiguous prompts that need careful thinking). The three tiers are not "good, better, best." They are three different products tuned for different workloads, and the cost difference between Haiku and Opus is enough that picking wrong wastes meaningful money.
Below we walk through each tier, what it costs, what it is good at, and a decision tree we use ourselves.
At-a-glance comparison
| Spec | Haiku 4.5 | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| Speed (tokens / sec, our tests) | ~250 | ~85 | ~45 |
| API price (input) | $0.80 / M tokens | $3.00 / M tokens | $15.00 / M tokens |
| API price (output) | $4.00 / M tokens | $15.00 / M tokens | $75.00 / M tokens |
| Context window | 200K | 200K (1M beta) | 200K |
| Vision | Yes | Yes | Yes |
| Tool use | Yes | Yes (best) | Yes |
| Best at | Classification, fast chat, cheap inference | General-purpose work, code, writing | Hard reasoning, ambiguous research |
| Available in | API, Claude.ai (free tier baseline) | API, Claude.ai (Pro default) | API, Claude.ai (Pro on demand) |
| Affiliate link | Try Claude Pro to access all three tiers | | |
Sources: Anthropic pricing and model card pages, retrieved May 2026. Speed numbers are our internal benchmarks across 50 typical prompts.
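To turn the per-million-token prices in the table into a per-request figure, a minimal sketch like the one below is enough. The `PRICES` dictionary and the `request_cost` helper are our own illustrative names, not part of any Anthropic SDK, and the token counts in the example are hypothetical.

```python
# List prices in USD per million tokens, copied from the table above.
PRICES = {
    "haiku": {"input": 0.80, "output": 4.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 15.00, "output": 75.00},
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the list prices above."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical request: 2,000 input tokens and 500 output tokens.
for tier in PRICES:
    print(f"{tier}: ${request_cost(tier, 2_000, 500):.4f}")
```

Multiply the result by your expected daily request count to see whether the tier choice is a rounding error or a budget line.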
Haiku 4.5: the speed and cost tier
Haiku is Anthropic's fast and cheap model. The new 4.5 release is roughly 2x faster than the previous Haiku release at similar quality, and the API price (about $0.80 per million input tokens) makes it viable for workloads where cost dominates the design.
Where Haiku wins:
- Bulk classification. Tagging 100,000 customer feedback rows costs roughly a quarter as much on Haiku as on Sonnet at list prices ($0.80 vs $3.00 per million input tokens). Quality is good enough for most classification tasks where the answer is a category, not a paragraph.
- Real-time chat. The 250-token-per-second throughput means responses feel instant in interactive UIs. Sonnet feels deliberate by comparison; Opus feels slow.
- Agent loops. When an agent needs to make 20 tool calls to complete a task, the per-call latency adds up fast. Haiku as the inner loop with Sonnet as the planner is the configuration we ship most often in 2026.
- Embedded model in cost-sensitive products. If you are building a chat feature for a free-tier user, Haiku is the only economically viable Claude tier in most cases.
Where Haiku loses: anything that requires careful reasoning, multi-step planning, long writing, or judgment under ambiguity. Haiku is fast and competent. It is not deep.
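To put the agent-loop point in numbers, here is a rough estimate of generation time for a chain of sequential calls, using the throughput figures from our benchmark table. The helper name and the 200-tokens-per-call figure are assumptions for illustration; the estimate ignores network round trips, tool execution time, and time to first token, so read it as a lower bound.

```python
# Throughput in output tokens per second, from our benchmark table.
TOKENS_PER_SEC = {"haiku": 250, "sonnet": 85, "opus": 45}


def generation_seconds(tier: str, calls: int, output_tokens_per_call: int) -> float:
    """Estimate total generation time for a loop of sequential calls."""
    return calls * output_tokens_per_call / TOKENS_PER_SEC[tier]


# Hypothetical agent task: 20 tool calls, about 200 output tokens each.
for tier in TOKENS_PER_SEC:
    print(f"{tier}: {generation_seconds(tier, 20, 200):.0f} s")
```

With these assumptions the loop takes about 16 seconds of generation on Haiku, about 47 on Sonnet, and about 89 on Opus, which is why the inner loop is where a faster tier pays off.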
When we use Haiku
Tagging support tickets, generating short product descriptions from structured data, summarizing user feedback into categories, running the inner loop of an agent that does many small steps. We do not use Haiku for editorial work, complex code, or research synthesis.
Sonnet 4.6: the daily driver
Sonnet is the default model on Claude.ai Pro and the model most users interact with most of the time. The 4.6 release in early 2026 closed most of the quality gap with Opus while keeping the price at $3 input / $15 output per million tokens, which is the sweet spot of the lineup.
Where Sonnet wins:
- General-purpose work. Writing, code, research, planning, analysis. Sonnet is the right answer about 85% of the time in our usage logs.
- Code. Sonnet 4.6 is genuinely competitive with Opus on most coding tasks and faster, which makes it the better choice for the iterative work that defines real coding sessions.
- Tool use. Sonnet is, in our testing, the most reliable at sticking to tool schemas, calling the right tool for the job, and recovering from tool errors. This matters for agents.
- Long documents. The 200K context (1M token beta on select plans) plus solid retrieval reasoning makes Sonnet the right tier for "drop in the whole doc and ask questions."
Where Sonnet loses: the genuinely hard reasoning problems. If you are pushing the model to its limit (PhD-level math, novel algorithm design, ambiguous research syntheses), Opus produces noticeably better answers about a third of the time.
When we use Sonnet
Default for everything in our writing, code, and analysis workflows. Default for Claude Code sessions. Default for our agent's planner role.
Opus 4.6: the deep reasoning tier
Opus is the most capable model in the lineup, the slowest, and the most expensive (5x Sonnet's price on both input and output). The 4.6 release in 2026 emphasized reasoning depth over speed, and the gap between Opus and Sonnet is meaningful only on the hardest tasks.
Where Opus wins:
- Genuine research synthesis. Reading a stack of papers and producing a novel synthesis is the canonical Opus task in our usage. Sonnet does this competently. Opus does it better, often by a lot.
- Hard math and algorithms. If your problem genuinely requires reasoning through several layers, Opus is the right tier. Most code and most analysis do not need this depth; some genuinely do.
- Ambiguous prompts. When the user did not specify enough, Opus is more likely to ask the right clarifying question or to produce a defensible interpretation. Sonnet sometimes guesses wrong.
- Editorial review of complex documents. We use Opus to review the toughest pieces (legal memos, technical specs) for logical errors after Sonnet has done the bulk of the writing.
Where Opus loses: anything where speed matters, anything routine, anything where the cost-quality tradeoff is unfavorable. About 90% of the queries we run on Opus, in retrospect, would have been fine on Sonnet.
When we use Opus
Hard research synthesis, the toughest editorial review, ambiguous prompts where we want the model's first reading to be careful. We use Opus for probably 5% of our overall Claude usage in 2026.
Decision tree
Here is the rule we follow:
- Are you running this prompt more than 1000 times today? Yes → Haiku. No → continue.
- Is latency the bottleneck (interactive chat, real-time UI)? Yes → Haiku. No → continue.
- Is this novel research, hard math, or ambiguous synthesis? Yes → Opus. No → continue.
- Default → Sonnet.
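The four rules above can be written as a small routing function. This is a sketch of our rule of thumb, not an official recommendation: the `Task` fields and the `pick_tier` name are hypothetical, and the returned strings are tier labels rather than API model identifiers.

```python
from dataclasses import dataclass


@dataclass
class Task:
    runs_per_day: int        # how many times this prompt runs today
    latency_sensitive: bool  # interactive chat or real-time UI
    hard_reasoning: bool     # novel research, hard math, ambiguous synthesis


def pick_tier(task: Task) -> str:
    """Apply the four rules in order; the first match wins."""
    if task.runs_per_day > 1000:
        return "haiku"
    if task.latency_sensitive:
        return "haiku"
    if task.hard_reasoning:
        return "opus"
    return "sonnet"


print(pick_tier(Task(runs_per_day=50, latency_sensitive=False, hard_reasoning=True)))
```

Order matters here: a hard-reasoning prompt that runs thousands of times a day still routes to Haiku, because the volume rule is checked first. If that is not what you want for your workload, reorder the checks.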
For Claude.ai Pro users, the rule is simpler: stay on Sonnet by default, switch to Haiku for chats where speed matters more than quality, switch to Opus when Sonnet visibly struggles on a hard problem. Don't use Opus for routine work; you will hit your message limit faster and the quality lift is rarely there.
Five real tasks, three tiers each
Task 1: Summarize a 30-page contract
Haiku produced a competent two-paragraph summary in 12 seconds. Sonnet produced a better-structured summary with key clauses called out in 28 seconds. Opus produced a summary of similar quality to Sonnet's, with one extra observation about an unusual clause, in 52 seconds. Winner: Sonnet (best quality-to-speed ratio).
Task 2: Tag 5000 customer feedback rows by sentiment
Haiku tagged the batch in 7 minutes for $1.20 in API costs at our test rates. Sonnet would have cost about $4.50 for marginally better quality. Opus would have cost about $22 with no meaningful gain. Winner: Haiku.
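The Sonnet and Opus figures are projections from the measured Haiku run, scaled by list-price ratio. A minimal sketch of that scaling, assuming the same token counts on every tier (the helper name is ours):

```python
# Input list prices in USD per million tokens. Output prices in the
# comparison table sit in the same ratio, so the input/output mix of
# the job does not change the scaling factor.
INPUT_PRICE = {"haiku": 0.80, "sonnet": 3.00, "opus": 15.00}


def scale_cost(measured_cost: float, measured_tier: str, target_tier: str) -> float:
    """Project a measured cost onto another tier by list-price ratio."""
    return measured_cost * INPUT_PRICE[target_tier] / INPUT_PRICE[measured_tier]


print(f"sonnet: ${scale_cost(1.20, 'haiku', 'sonnet'):.2f}")
print(f"opus:   ${scale_cost(1.20, 'haiku', 'opus'):.2f}")
```

This reproduces the $4.50 Sonnet estimate and gives $22.50 for Opus. It does not account for tiers producing longer or shorter outputs for the same prompt.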
Task 3: Refactor a 1000-line Python module
Haiku produced a refactor that ran but missed two cross-cutting concerns. Sonnet produced a clean refactor in one pass. Opus produced a similar-quality refactor in two passes (with one extra unit test we did not ask for). Winner: Sonnet.
Task 4: Synthesize 12 research papers on a contested topic
Haiku produced a generic summary. Sonnet produced a competent synthesis with one or two thin spots. Opus produced a synthesis that read like an academic literature review, including a section on disagreements among the authors that Sonnet missed. Winner: Opus.
Task 5: Real-time chat with users on a free tier
Haiku is the only viable answer at the cost and latency we needed. Sonnet would have been more pleasant to talk to and would have priced our product out of the market. Winner: Haiku.
API users vs Claude.ai users
If you are using Claude.ai Pro, the model picker hides the cost. Use the decision tree above and don't worry about API rates. The trade-off you feel is rate limits: Opus uses your message budget faster than Sonnet, and Haiku barely uses it at all. If you are hitting your weekly limit on Opus, downgrade routine work to Sonnet.
If you are building on the API, the cost difference between tiers is the dominant constraint. We default to Sonnet, route the bulk classification work to Haiku, and reserve Opus for the small fraction of calls where the quality lift pays for itself. A typical production deployment ends up roughly 60% Sonnet, 35% Haiku, 5% Opus by token count. At the list prices above, and assuming a similar input/output mix on each tier, that works out to roughly 64% Sonnet, 10% Haiku, and 26% Opus by dollar cost: Opus's share of the bill is about five times its share of the tokens. Be aware of which side of that ratio matters for your business.
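To check how a token mix translates into a spending mix for your own traffic, a sketch like this works. It weights each tier's token share by its list price and assumes every tier sees the same input/output mix; `PRICE_WEIGHT` and `cost_share` are illustrative names, and the example mix is the 60/35/5 token split from the paragraph above.

```python
# Relative price per token. Input list prices serve as weights because
# the output prices in the comparison table sit in the same ratio.
PRICE_WEIGHT = {"haiku": 0.80, "sonnet": 3.00, "opus": 15.00}


def cost_share(token_share: dict[str, float]) -> dict[str, float]:
    """Convert each tier's share of tokens into its share of spend."""
    spend = {tier: share * PRICE_WEIGHT[tier] for tier, share in token_share.items()}
    total = sum(spend.values())
    return {tier: amount / total for tier, amount in spend.items()}


mix = {"sonnet": 0.60, "haiku": 0.35, "opus": 0.05}
for tier, share in cost_share(mix).items():
    print(f"{tier}: {share:.1%} of spend")
```

If your Opus calls produce much longer outputs than your Haiku calls, the real Opus share of spend will be higher than this estimate; swap in measured per-tier costs where you have them.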
Bottom line
Sonnet is the answer for most things. Haiku is the answer when you need volume or speed. Opus is the answer when the problem is genuinely hard. The trick is recognizing which tier a given task wants, and the decision tree above does most of the work. Claude Pro gives you all three on Claude.ai for $20 per month; the API is the route if you are building on top.
Frequently asked questions
What happened to Claude 3 (Opus 3, Sonnet 3, Haiku 3)?
The Claude 3 family is the previous generation. The 4 family launched in 2025 with major capability and speed improvements. Anthropic typically keeps older models available on the API for a while after a new generation launches; check the API model list for current availability.
How does Sonnet 4.6 compare to GPT-5?
Roughly competitive on most benchmarks, with Sonnet ahead on writing and reasoning quality and GPT-5 ahead on multimodal tasks (images, voice). See our head-to-head comparison.
Can I switch tiers mid-conversation on Claude.ai?
Yes. The model picker is per message. You can ask Sonnet a routine question, then switch to Opus for a hard follow-up, then back to Sonnet. The conversation context carries through.
Is the 1M-token context on Sonnet worth it?
For most users, no. The 200K window is enough for almost all real-world documents. The 1M beta matters if you work with codebases of 100,000+ lines or need to drop in entire books. Quality slowly degrades past 200K in our tests; structure your prompt carefully.
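A quick way to guess whether a document fits the standard window is to estimate tokens from character count. The sketch below assumes about 4 characters per token, a common rule of thumb for English text rather than an exact tokenizer count, and reserves a hypothetical 8,000 tokens for the response. Both numbers are assumptions you should tune.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough average for English text


def estimated_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(text: str, window_tokens: int = 200_000,
                   reserve_for_output: int = 8_000) -> bool:
    """Return True if the text plus an output reserve fits the window."""
    return estimated_tokens(text) + reserve_for_output <= window_tokens


sample = "word " * 100_000  # about 500,000 characters
print(estimated_tokens(sample), fits_in_window(sample))
```

Code and non-English text often tokenize less efficiently than this heuristic suggests, so leave headroom before concluding that a large codebase fits.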
Which tier should I use for Claude Code?
Sonnet by default. Switch to Opus for an intentionally hard task. Haiku is too shallow for serious code work; we don't recommend it for Claude Code sessions.
What about a paid tier above Opus?
Anthropic has not shipped a tier above Opus as of May 2026. The expected next move is a faster Opus or a new flagship; we will update this article when that happens.