Single-table fluency: how to read, filter, sort, and deduplicate data before any joining or grouping.
Pull every row from a table.
SELECT is the first thing in every query. The * wildcard means "every column" — you get the whole table, nothing filtered out. It's how you start exploring any dataset.
A teammate asks: "Can you pull the customers table so we can look at it?" — you'd reach for SELECT * FROM customers.
Choose exactly which columns come back.
Real tables have dozens of columns. Naming only the ones you need keeps results readable and queries fast. It's also clearer to anyone reading your SQL later.
A teammate asks: "Just give me each customer's name and email" — you'd name those two columns instead of selecting everything.
Collapse duplicates — return each unique value once.
Duplicates sneak in through logging, joins, and ETL pipelines. DISTINCT is the quick sanity-check: how many unique values actually exist in this column?
A teammate asks: "How many distinct product categories do we carry?" — SELECT DISTINCT category FROM products.
Filter rows by a text value — keep only what matches.
WHERE zooms a whole table down to just the rows a question is about. One region, one status, one customer type — WHERE is how you narrow to the exact slice you need.
A teammate asks: "Which orders are still marked pending?" — WHERE status = 'pending'.
Filter by numeric comparisons — greater than, less than, equal to.
Numeric thresholds are everywhere: orders over $100, sessions longer than 5 minutes, accounts older than a year. Numbers don't get quotes — that's the main gotcha.
A teammate asks: "Show me every transaction over $1,000" — WHERE amount > 1000 (no quotes, it's a number).
Stack two conditions — a row must pass both.
Real questions stack conditions constantly. AND narrows results — each extra AND can only shrink the set, never grow it. Useful for combining category, threshold, and date filters.
A teammate asks: "California orders over $100 that shipped this week" — that's three conditions joined with AND.
Widen the filter — a row passes if either condition is true.
Where AND narrows, OR grows the set. Pull several categories or values in one pass without writing a separate query for each.
A teammate asks: "Orders from California or New York" — WHERE state = 'CA' OR state = 'NY'.
Sort the result — highest to lowest, A to Z, newest first.
Sorting turns a pile of rows into a ranking. Top sellers, slowest queries, newest signups — ORDER BY is how you make data tell a story. Add DESC for largest-first.
A teammate asks: "Rank our customers by total spend, biggest first" — ORDER BY total_spend DESC.
Cap the result at N rows — "top 5", "top 10".
"Top N" is one of the most-asked analyst questions. LIMIT also lets you safely peek at a huge table without pulling millions of rows when you just need to see the shape.
A teammate asks: "What are our five best-selling products?" — ORDER BY sales DESC LIMIT 5.
Connecting multiple tables, handling missing data, and more expressive filters.
Stitch two tables together on a shared column — keep only matching rows.
Real data lives across many tables. JOIN is how you combine orders with customers, users with events, products with inventory. The ON clause specifies which columns link them.
A teammate asks: "Which customers actually placed an order?" — JOIN customers to orders on their shared customer ID.
Keep ALL rows from the left table — fill NULL where there's no right-side match.
INNER JOIN silently drops unmatched rows. LEFT JOIN keeps them. That's essential when "no match" is itself meaningful — customers with no orders, products with no sales, users with no activity.
A teammate asks: "Which customers have never placed an order?" — LEFT JOIN customers to orders; the unmatched customers have NULL on the orders side.
Find rows where a field has no data at all.
NULL means "unknown" or "not recorded" — not zero, not an empty string. You can't test it with =; SQL requires IS NULL. Missing data is everywhere in real tables.
A teammate asks: "Which customers haven't provided a phone number?" — WHERE phone IS NULL.
Check if a value matches any item in a list — cleaner than chaining OR.
When a question has a list of acceptable values, IN keeps the WHERE clause tidy — whether the list is 2 things or 20. It's shorthand for multiple OR conditions.
A teammate asks: "Pull orders for these five customer IDs" — WHERE customer_id IN (101, 204, 307, 412, 518).
Filter within a range — includes both endpoints.
Range filters are constant: dates between two days, amounts within a band, ages in a bracket. BETWEEN is the readable shorthand for >= x AND <= y.
A teammate asks: "Show me transactions from last quarter" — WHERE order_date BETWEEN '2026-01-01' AND '2026-03-31'.
Match by pattern — "starts with", "contains", "ends with".
Exact matching breaks the moment you need partial matches. The % wildcard means "anything, any length." Searching messy text — names, emails, product codes — is constant in analysis.
A teammate asks: "All customers with a Gmail address" — WHERE email LIKE '%@gmail.com'.
Collapsing many rows into summary numbers — the core of reporting and dashboards.
"How many?" — count the rows that match.
COUNT is the first aggregate you'll reach for, every day. "How many sign-ups this week?" "How many failed payments?" "How many active accounts?" COUNT answers them all.
A teammate asks: "How many orders did we ship yesterday?" — SELECT COUNT(*) FROM orders WHERE shipped_date = 'yesterday'.
Total a numeric column. Average it.
After "how many?" comes "how much?" — SUM totals a column, AVG gives you its average. Both are headline-number generators: total revenue, average order value, total time on site.
A teammate asks: "What was our total revenue this month?" — SUM(amount). "Average order value?" — AVG(amount).
Find the smallest or largest value in a column.
Extremes summarize a dataset fast — biggest customer by spend, slowest page by load time, latest activity timestamp, smallest non-zero balance. Part of the same aggregate family as COUNT/SUM/AVG.
A teammate asks: "When did this account last log in?" — MAX(login_time). "Smallest order we've shipped?" — MIN(amount).
Run an aggregate once per category — one row per group.
GROUP BY is the unlock for almost every analyst dashboard — sales by region, sign-ups per day, errors per service. Without it, an aggregate collapses everything to one number. With it, you get one number per bucket.
A teammate asks: "How many orders per state?" — SELECT state, COUNT(*) FROM orders GROUP BY state.
Filter groups after aggregating — WHERE for buckets.
WHERE filters rows before grouping. HAVING filters the groups themselves. The rule: if you're filtering on an aggregate (COUNT > 5, SUM > 1000), it goes in HAVING, not WHERE.
A teammate asks: "Which customers placed more than five orders?" — GROUP BY customer_id HAVING COUNT(*) > 5.
Give a computed column a readable name. Sort by it to build a ranked report.
AS renames any column in your output — especially useful for computed ones that would otherwise show up as COUNT(*). Once named, you can ORDER BY that name to turn a breakdown into a ranking. This is the leaderboard pattern every analyst uses.
A teammate asks: "Rank our regions by total sales, biggest first" — SELECT region, SUM(amount) AS total FROM orders GROUP BY region ORDER BY total DESC.
Two-phase challenges that force you to reach into your toolkit twice in sequence. Each phase requires a different query shape.
Phase 1: identify a group. Phase 2: narrow it with a second condition.
Two-step targeting mirrors real analysis — first find the right slice, then filter further with AND to act on the exact subset. By this wave you know both; the challenge is combining them under pressure.
Phase 1: WHERE type = 'Warden' — lock all Wardens to break their shields. Phase 2: WHERE type = 'Warden' AND health > 60 — finish only the dangerous ones.
Phase 1: isolate one class. Phase 2: widen to catch both hostile classes.
Sometimes you need to run queries in stages — clear one subset first, then pivot to a broader filter. The two phases reinforce that WHERE (narrow) and OR (wide) are opposite tools for opposite jobs.
Phase 1: WHERE type = 'Scout' — clear the escort. Phase 2: WHERE type = 'Raider' OR type = 'Warden' — finish the exposed heavies.