Analyst Prep Guide

SQL Quest — Everything You'll Learn

23 concepts. Every one used by data analysts on the job, every day. Here's what each lesson covers, why it matters, and how you'll use it in the real world.

▶ Launch the Game
Before you play
Why SQL, and why this curriculum?

SQL is the language every data analyst reaches for first. It's how you pull rows, filter data, join tables, and compute summaries — the moves behind almost every report, dashboard, and data question you'll face. This game teaches the complete entry-level toolkit: the exact queries you'll write in your first week on the job and every week after.

It's universal

SQL runs on every major database — Postgres, Snowflake, BigQuery, MySQL, SQLite. Learn it once, use it everywhere.

It's the first question

Before Python, before dashboards, before anything — SQL is how you get the data you need to answer the question.

It's interviewed

Almost every data analyst role tests SQL. This curriculum covers exactly what those interviews probe.

It compounds

Each concept in this game builds on the last. By the end, you'll combine them all in a single query naturally.

The curriculum
All 23 concepts
Tier 1 — Foundations

Single-table fluency: how to read, filter, sort, and deduplicate data before any joining or grouping.

Wave 1 SELECT *
Pull every row from a table.
SELECT is the first thing in every query. The * wildcard means "every column" — you get the whole table, nothing filtered out. It's how you start exploring any dataset.
A teammate asks: "Can you pull the customers table so we can look at it?" — you'd reach for SELECT * FROM customers.
Wave 2 SELECT columns
Choose exactly which columns come back.
Real tables have dozens of columns. Naming only the ones you need keeps results readable and queries fast. It's also clearer to anyone reading your SQL later.
A teammate asks: "Just give me each customer's name and email" — you'd name those two columns instead of selecting everything.
Wave 3 SELECT DISTINCT
Collapse duplicates — return each unique value once.
Duplicates sneak in through logging, joins, and ETL pipelines. DISTINCT is the quick sanity-check: how many unique values actually exist in this column?
A teammate asks: "How many distinct product categories do we carry?" — SELECT DISTINCT category FROM products.
Wave 4 WHERE (text)
Filter rows by a text value — keep only what matches.
WHERE zooms a whole table down to just the rows a question is about. One region, one status, one customer type — WHERE is how you narrow to the exact slice you need.
A teammate asks: "Which orders are still marked pending?" — WHERE status = 'pending'.
Wave 5 WHERE (numbers)
Filter by numeric comparisons — greater than, less than, equal to.
Numeric thresholds are everywhere: orders over $100, sessions longer than 5 minutes, accounts older than a year. Numbers don't get quotes — that's the main gotcha.
A teammate asks: "Show me every transaction over $1,000" — WHERE amount > 1000 (no quotes, it's a number).
Wave 6 AND
Stack two conditions — a row must pass both.
Real questions stack conditions constantly. AND narrows results — each extra AND can only shrink the set, never grow it. Useful for combining category, threshold, and date filters.
A teammate asks: "California orders over $100 that shipped this week" — that's three conditions joined with AND.
Wave 7 OR
Widen the filter — a row passes if either condition is true.
Where AND narrows, OR grows the set. Pull several categories or values in one pass without writing a separate query for each.
A teammate asks: "Orders from California or New York" — WHERE state = 'CA' OR state = 'NY'.
Wave 8 ORDER BY
Sort the result — highest to lowest, A to Z, newest first.
Sorting turns a pile of rows into a ranking. Top sellers, slowest queries, newest signups — ORDER BY is how you make data tell a story. Add DESC for largest-first.
A teammate asks: "Rank our customers by total spend, biggest first" — ORDER BY total_spend DESC.
Wave 9 LIMIT
Cap the result at N rows — "top 5", "top 10".
"Top N" is one of the most-asked analyst questions. LIMIT also lets you safely peek at a huge table without pulling millions of rows when you just need to see the shape.
A teammate asks: "What are our five best-selling products?" — ORDER BY sales DESC LIMIT 5.
Tier 2 — Joining & Filtering

Connecting multiple tables, handling missing data, and more expressive filters.

Wave 10 JOIN (INNER)
Stitch two tables together on a shared column — keep only matching rows.
Real data lives across many tables. JOIN is how you combine orders with customers, users with events, products with inventory. The ON clause specifies which columns link them.
A teammate asks: "Which customers actually placed an order?" — JOIN customers to orders on their shared customer ID.
Wave 11 LEFT JOIN
Keep ALL rows from the left table — fill NULL where there's no right-side match.
INNER JOIN silently drops unmatched rows. LEFT JOIN keeps them. That's essential when "no match" is itself meaningful — customers with no orders, products with no sales, users with no activity.
A teammate asks: "Which customers have never placed an order?" — LEFT JOIN customers to orders; the unmatched customers have NULL on the orders side.
Wave 12 IS NULL
Find rows where a field has no data at all.
NULL means "unknown" or "not recorded" — not zero, not an empty string. You can't test it with =; SQL requires IS NULL. Missing data is everywhere in real tables.
A teammate asks: "Which customers haven't provided a phone number?" — WHERE phone IS NULL.
Wave 13 IN
Check if a value matches any item in a list — cleaner than chaining OR.
When a question has a list of acceptable values, IN keeps the WHERE clause tidy — whether the list is 2 things or 20. It's shorthand for multiple OR conditions.
A teammate asks: "Pull orders for these five customer IDs" — WHERE customer_id IN (101, 204, 307, 412, 518).
Wave 14 BETWEEN
Filter within a range — includes both endpoints.
Range filters are constant: dates between two days, amounts within a band, ages in a bracket. BETWEEN is the readable shorthand for >= x AND <= y.
A teammate asks: "Show me transactions from last quarter" — WHERE order_date BETWEEN '2026-01-01' AND '2026-03-31'.
Wave 15 LIKE
Match by pattern — "starts with", "contains", "ends with".
Exact matching breaks the moment you need partial matches. The % wildcard means "anything, any length." Searching messy text — names, emails, product codes — is constant in analysis.
A teammate asks: "All customers with a Gmail address" — WHERE email LIKE '%@gmail.com'.
Tier 3 — Aggregates & Grouping

Collapsing many rows into summary numbers — the core of reporting and dashboards.

Wave 16 COUNT
"How many?" — count the rows that match.
COUNT is the first aggregate you'll reach for, every day. "How many sign-ups this week?" "How many failed payments?" "How many active accounts?" COUNT answers them all.
A teammate asks: "How many orders did we ship yesterday?" — SELECT COUNT(*) FROM orders WHERE shipped_date = 'yesterday'.
Wave 17 SUM / AVG
Total a numeric column. Average it.
After "how many?" comes "how much?" — SUM totals a column, AVG gives you its average. Both are headline-number generators: total revenue, average order value, total time on site.
A teammate asks: "What was our total revenue this month?" — SUM(amount). "Average order value?" — AVG(amount).
Wave 18 MIN / MAX
Find the smallest or largest value in a column.
Extremes summarize a dataset fast — biggest customer by spend, slowest page by load time, latest activity timestamp, smallest non-zero balance. Part of the same aggregate family as COUNT/SUM/AVG.
A teammate asks: "When did this account last log in?" — MAX(login_time). "Smallest order we've shipped?" — MIN(amount).
Wave 19 GROUP BY
Run an aggregate once per category — one row per group.
GROUP BY is the unlock for almost every analyst dashboard — sales by region, sign-ups per day, errors per service. Without it, an aggregate collapses everything to one number. With it, you get one number per bucket.
A teammate asks: "How many orders per state?" — SELECT state, COUNT(*) FROM orders GROUP BY state.
Wave 20 HAVING
Filter groups after aggregating — WHERE for buckets.
WHERE filters rows before grouping. HAVING filters the groups themselves. The rule: if you're filtering on an aggregate (COUNT > 5, SUM > 1000), it goes in HAVING, not WHERE.
A teammate asks: "Which customers placed more than five orders?" — GROUP BY customer_id HAVING COUNT(*) > 5.
Wave 21 AS + ORDER BY aggregate
Give a computed column a readable name. Sort by it to build a ranked report.
AS renames any column in your output — especially useful for computed ones that would otherwise show up as COUNT(*). Once named, you can ORDER BY that name to turn a breakdown into a ranking. This is the leaderboard pattern every analyst uses.
A teammate asks: "Rank our regions by total sales, biggest first" — SELECT region, SUM(amount) AS total FROM orders GROUP BY region ORDER BY total DESC.
Phase Encounters — Boss Waves

Two-phase challenges that force you to reach into your toolkit twice in sequence. Each phase requires a different query shape.

Wave 22 WHERE → AND
Phase 1: identify a group. Phase 2: narrow it with a second condition.
Two-step targeting mirrors real analysis — first find the right slice, then filter further with AND to act on the exact subset. By this wave you know both; the challenge is combining them under pressure.
Phase 1: WHERE type = 'Warden' — lock all Wardens to break their shields. Phase 2: WHERE type = 'Warden' AND health > 60 — finish only the dangerous ones.
Wave 23 WHERE → OR
Phase 1: isolate one class. Phase 2: widen to catch both hostile classes.
Sometimes you need to run queries in stages — clear one subset first, then pivot to a broader filter. The two phases reinforce that WHERE (narrow) and OR (wide) are opposite tools for opposite jobs.
Phase 1: WHERE type = 'Scout' — clear the escort. Phase 2: WHERE type = 'Raider' OR type = 'Warden' — finish the exposed heavies.
How it works
What to expect in the game

SQL Quest is a turn-based combat game. Enemies appear on screen — your weapon is a SQL query. Write the right query and you hit. Write the wrong one and they hit back. Here's what you need to know before you start.

⚔️

The combat loop

Write a query that returns exactly the right rows to target the right enemies. A perfect first-try clear triggers a Fatality — everything explodes at once. Wrong answer? The enemies counterattack.

💡

Hints + autopilot

Stuck? The "Show hint" button reveals one hint at a time. After 3 missed attempts, Autopilot fills the answer for you and explains it — but you still press Execute yourself to fire the query.

🎯

Read the objective

The 🎯 line below the briefing tells you the code shape — it hints at what to type without giving it away. Read it carefully before you write anything.

🎼

Typing melody

Turn on the 🎼 melody button — your keystrokes play musical tones. Typing a valid SQL keyword makes the pitch rise as you spell it. The music stays as a steady backing; your typing IS the melody.

💼

On the job

Each wave has a 💼 chip that opens a real-world context panel: the analyst scenario where you'd reach for this concept and an example of what a teammate might ask you.

Phase waves

Waves 22–23 are two-phase encounters. Phase 1 breaks the shields — correct query, enemies explode, "SHIELDS DOWN" flashes. Then Phase 2 loads with new enemies and a new query target.