How Claude Actually Works: Models, Memory & Mindset
Understand the engine — so you can get dramatically better results every time
Why Most People Get Mediocre Results
Most users treat Claude like a search engine. Experts treat it like a brilliant colleague who needs the right context. That mental model gap produces wildly different results from the same tool.
This lesson gives you the foundation — model selection, tokens, system prompts, and how Claude thinks — that makes every advanced technique in the rest of Level 3 work better.
The Model Lineup
Claude Opus — Most intelligent. Use for: complex reasoning, long document analysis, research synthesis, difficult architecture decisions. Slower, higher cost. Best when quality matters more than speed.
Claude Sonnet — Best balance of intelligence and speed. Handles 90% of real tasks with high quality. Default choice unless you have a specific reason to switch.
Claude Haiku — Fastest, lowest cost. Best for high-volume automation, simple classification, quick structured extraction. Use in pipelines where you need thousands of calls.
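If you call the API from a pipeline, this guidance can be encoded as a small routing helper. A minimal sketch — the model IDs below are illustrative placeholders, not guaranteed current names; check Anthropic's model list before using them:

```python
# Hypothetical routing helper: map task shape to a model tier.
# Model IDs are placeholders -- consult the current model list for real ones.
MODEL_BY_TIER = {
    "deep": "claude-opus-4",       # complex reasoning, long-document analysis
    "default": "claude-sonnet-4",  # ~90% of real tasks
    "bulk": "claude-haiku-4",      # high-volume classification/extraction
}

def pick_model(task: str, high_volume: bool = False,
               needs_deep_reasoning: bool = False) -> str:
    """Return a model ID using the rule of thumb from this lesson."""
    if high_volume:
        return MODEL_BY_TIER["bulk"]        # thousands of calls -> cheapest tier
    if needs_deep_reasoning:
        return MODEL_BY_TIER["deep"]        # quality over speed
    return MODEL_BY_TIER["default"]         # when in doubt, Sonnet
```

For example, `pick_model("classify support tickets", high_volume=True)` routes to the bulk tier, while an architecture decision with `needs_deep_reasoning=True` routes to the deep tier.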
Real Example: Same Request, Three Models
Here's the same request sent to all three models. Notice the differences in depth, structure, and nuance:
We're deciding whether to build our own auth system or use Auth0. We're a 5-person startup, 6 months from launch, TypeScript/Next.js stack.
Haiku: Use Auth0. Building auth yourself takes 2-4 weeks and introduces security risks. Auth0 handles OAuth, MFA, session management, and compliance (SOC 2, GDPR). For a 5-person team 6 months from launch, your engineering time is better spent on your core product. Auth0 free tier covers up to 7,500 monthly active users — enough to launch. Switch to paid ($240/month) at scale if needed.
Sonnet:

Recommendation: Use Auth0 now, with a clear exit strategy.

Why Auth0 wins for your situation:
- 6 months to launch means ~24 weeks. A solid auth system (SSO, MFA, password reset, session management, token rotation) takes 3-5 weeks to build correctly and 2-3 weeks to harden. That's 20%+ of your runway.
- Auth0 is SOC 2 Type II certified. Your custom solution won't be for years.
- You get social login (Google, GitHub) out of the box — typically +15-25% in signup conversion.

The hidden cost of DIY auth: Every auth bug that reaches production is a reputational event. Session fixation, token leakage, password hashing mistakes — these are career-defining incidents at a startup.

Exit strategy (when to consider leaving Auth0):
- Hitting 50k+ MAU and paying $3k+/month → evaluate Clerk or Supabase Auth
- Need custom auth flows Auth0 can't support → build specific pieces only
- Building an identity product → then auth is your core competency

Implementation path:
1. next-auth v5 + Auth0 adapter (4 hours setup)
2. Add to CLAUDE.md: "Auth uses next-auth with Auth0 adapter in src/auth/"
3. Set up Auth0 tenants with dev / staging / prod separation from day one

The only case for DIY: if your product IS authentication (building an identity provider for enterprises). Otherwise, use the tool.
Opus:

This is a classic build-vs-buy decision with some startup-specific wrinkles worth thinking through carefully. My recommendation: Auth0, but let me explain why the tradeoffs are more nuanced than "just use Auth0." The standard case for Auth0 is clear: [...deeper analysis of security surface area, regulatory trajectory, team velocity math, specific Next.js integration patterns, vendor lock-in risk analysis, migration complexity to alternative providers, pricing model at different growth scenarios, and 3 specific implementation decisions to make before starting...]

The one scenario where I'd reconsider: If your target customer is enterprise, and you expect SOC 2 Type II requirements within 18 months, Auth0's enterprise tier pricing ($2/MAU above 7,500) can become expensive fast. In that case, evaluate Stytch or Clerk — similar capabilities, more predictable pricing at enterprise scale...
Notice: Opus proactively identifies an edge case (enterprise customers) that Sonnet and Haiku both missed. That's when Opus earns its cost.
Context Windows and Tokens
A token is roughly 0.75 words (~4 characters). Claude processes your entire conversation in a single context window. Current Claude models support a 200,000-token context window — approximately 150,000 words.
- Paste whole documents — a 60-page PDF, an entire codebase, 200 emails. Claude handles it. No need to chunk manually.
- Long sessions accumulate context — in a 4-hour session, Claude is reading everything you've said. Run /compact when switching tasks to free space.
- Irrelevant context hurts — pasting 50 unrelated files dilutes focus. Be intentional about what you include.
Your system prompt / CLAUDE.md:     ~1,500 tokens (2 pages)
5 files you pasted for context:     ~8,000 tokens (10-12 pages of code)
Conversation so far:                ~3,000 tokens
Current question:                     ~200 tokens
─────────────────────────────────────────────
Total used:                        ~12,700 tokens
Remaining (200k window):          ~187,300 tokens

You have massive room. In a normal 1-hour session you'll rarely exceed 40,000 tokens even with large file pastes.
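The arithmetic behind a budget like this is simple enough to sketch: estimate tokens as roughly characters divided by 4, sum what's in the window, and subtract from the limit. This is a back-of-envelope helper, not a real tokenizer:

```python
CONTEXT_WINDOW = 200_000  # tokens, per the lesson above

def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token on average."""
    return max(1, len(text) // 4)

def remaining_budget(used_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Tokens still available in the context window."""
    return window - used_tokens

# Reproducing the example budget: system prompt + pasted files + chat + question
used = 1_500 + 8_000 + 3_000 + 200
print(used)                    # 12700
print(remaining_budget(used))  # 187300
```

For real accounting, use an actual token counter; this sketch only shows why a typical session leaves most of the window free.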
System Prompts — The Hidden Layer
A system prompt is a set of instructions Claude reads before your conversation starts. It shapes tone, behavior, focus, and constraints — invisibly from the user's perspective.
- Claude Projects — "Project instructions" field = your system prompt. Every chat starts with it.
- Claude Code — Reads CLAUDE.md as a system prompt at session start.
- Cowork Skills — Each SKILL.md becomes part of the prompt when the skill runs.
- API calls — The system parameter in every API request.
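In an API request, the system prompt travels as a top-level field, separate from the messages list. A minimal sketch of the request shape — the model ID is an illustrative placeholder, and the manager persona is just the example from this lesson:

```python
# Shape of a Messages API request with a system prompt.
# The model ID is illustrative; swap in a current one from the model list.
request = {
    "model": "claude-sonnet-4",
    "max_tokens": 1024,
    "system": (
        "You are a senior engineering manager at a Series B startup. "
        "Be direct and practical, not HR boilerplate. "
        "Flag if something is legally sensitive."
    ),
    "messages": [
        {"role": "user",
         "content": "What should I do about my underperforming team member?"}
    ],
}

# With the official SDK this payload would be passed as:
#   client.messages.create(**request)
# Note: "system" is a top-level parameter, not a message with a system role.
```

The key point is structural: the system prompt sits outside the conversation turns, which is why it shapes every response without appearing in the chat itself.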
Question: "What should I do about my underperforming team member?"

WITHOUT system prompt → Claude gives generic management advice: "Consider having a direct conversation, setting clear expectations, providing regular feedback, and if needed, a performance improvement plan..."

WITH system prompt: "You are a senior engineering manager at a Series B startup. I manage a team of 6. Responses should be direct and practical, not HR boilerplate. Flag if something is legally sensitive."

→ Claude's response: "Direct answer: how long and in what ways are they underperforming? If <3 months, this is a management problem (unclear expectations, wrong fit for current work). If >6 months with documented feedback, this is a PIP conversation — talk to HR first to understand your company's process, because wrongful termination risk is real even at startups. What specifically is not meeting the bar?"

The second response is actionable. The first is useless.
How Claude "Thinks": Constitutional AI
Claude was trained with Constitutional AI (CAI) — a process where Claude learned to evaluate its own responses against principles: be helpful, be harmless, be honest. Claude isn't just pattern-matching; it reasons about whether its responses are actually good.
What this means for you:
- Tell Claude your intent. "I'm building a product for healthcare professionals" gives Claude context to calibrate helpfulness vs. caution appropriately.
- Claude can push back. If you ask for something that conflicts with its values, it will say so — and often offer an alternative approach.
- Be direct. Claude wants to help. Vague requests get vague results — not because Claude can't, but because it doesn't know what "excellent" looks like for your specific situation.
Hands-On Exercise (~25 min)
Pick a real decision you're facing at work. Send the same prompt to Haiku, Sonnet, and Opus (switch models with /model haiku, etc.). Document where Sonnet adds value over Haiku, and where Opus goes deeper than Sonnet. The result is your personal model-selection rule.
Create a Claude Project for your most common work type. Add Project Instructions that include: your role, your company context, how you like answers structured, and 3-5 "do not" rules. Ask a real work question. Compare to asking the same question without the system prompt in a new conversation. Save the version that worked as your template.
Use when: complex multi-step reasoning, nuanced analysis, the difference between good and great matters. Costs more, thinks deeper.
Default for 90% of tasks. Best speed/quality balance. If you're not sure which model, use Sonnet.
Use for: high-volume pipelines, simple classification, fast structured extraction. Fastest and cheapest.
~150,000 words. Paste whole documents, large codebases, many emails. Use /compact in long sessions to free space.
Hidden instructions that shape all of Claude's responses. Set via Project Instructions, CLAUDE.md, or API system parameter.
Claude reasons about whether its responses are good, not just pattern-matches. Give Claude your intent for better calibration.