Who this is for: anyone who writes SQL or reviews it — you don’t need to be a DBA. Read top to bottom, or jump to the big table below.
The simple idea (before jargon)
Imagine a sorted list of user IDs in an index. A good filter says: “give me the row where user_id equals this number.” The database finds the spot in the sorted structure and stops.
A bad filter (for that index) says: “for every row, turn user_id into something else, then compare.” That is more work — often every row in the table. In production, “every row” means disk I/O, CPU, and queueing that did not exist when the table was small.
SARGable (Search ARGument-able) means: the column in WHERE is written in a plain form the index was built for. Non-SARGable means something is done to the column first (function, wrong type, math on the column). Index suppression is when the planner picks a plan that does not use your index the way you hoped — very often because of those predicates.
Types of queries: what goes wrong in production
The table below is the heart of the post. Each pattern is a kind of SQL people write every day. The production issue column is what teams actually see when data is big and traffic is concurrent — not what you see on a laptop with 5,000 rows.
| Query type / pattern | What it often looks like | Typical production symptoms |
|---|---|---|
| Function on the column | YEAR(created_at) = 2024, ROUND(amount,2) = ?, LOWER(email) = ? | Reports and dashboards time out or run minutes; CPU high on DB; same query “was fine” last year when the table was small. |
| Wrong type vs column | String from API bound to INT column (or the reverse) | “Random” slowness on lookups by ID; works in tests; p99 explodes in prod; no obvious error in app logs. |
| Math on the column | WHERE price * 1.1 > 100 | Catalog / pricing queries degrade as catalog size grows; hard to spot without EXPLAIN. |
| Leading wildcard search | LIKE '%@gmail.com', LIKE '%error%' | Search boxes hammer the DB; timeouts; full scans on large text columns; need caching or a search engine. |
| OR mixing different access paths | status = 'x' OR updated_at > ? | Unexpected full scans; plans flip after stats change; intermittent slow queries. |
| NOT IN with subquery | WHERE id NOT IN (SELECT …) | Heavy nested loops or bad estimates; minutes-long queries; wrong results if NULLs appear in subquery. |
| NULL handled wrong | WHERE col = NULL instead of IS NULL | Silent wrong results; confusing reports; then “fixes” that still don’t use indexes well until logic is corrected. |
| Filter on expression without index support | WHERE col_a * col_b > 100 with no expression index | Analytics-style filters get slower linearly with table growth; need generated column or index on expression (engine-specific). |
One sentence to remember
If the database cannot turn your WHERE into “seek or narrow range in the index,” it will often do “read almost everything” — and everything gets expensive in production.
Why the index “disappears” (short version)
B-tree indexes keep values in sorted order so the engine can seek and scan a short range. If your condition is f(column) = value and f changes each cell (year, round, upper, cast), the sorted order of column is not the order of f(column), so the btree cannot narrow the search cheaply. The fix is usually: rewrite so column stays bare, and move math, casts, and calendar logic to the parameter side or to application code.
1. Functions and casts on the column
Any one-sided function on the column — date parts, rounding, case folding — tends to block a normal index seek unless you add a matching expression index (possible in PostgreSQL; still explicit design).
| Avoid | Prefer (direction) |
|---|---|
WHERE YEAR(created_at) = 2024 | Range on the timestamp: >= start of year and < start of next year. |
WHERE DATE(ts) = … | Half-open range on ts for that day (watch time zones). |
WHERE ROUND(amount, 2) = ? | Compute a numeric range on amount, or bucket in app and use >= / <. |
WHERE UPPER(email) = ? | Correct collation, or store normalized email + index that column. |
2. String vs number (very common in APIs)
HTTP paths and JSON often give you strings. If the column is INT / BIGINT, binding a string can force conversions and bad plans. Fix at the boundary: parse, validate, then pass a typed value to SQL.
const id = Number.parseInt(req.params.id, 10);
if (!Number.isFinite(id)) return res.status(400).end();
// Bind id as a number to match the SQL column type.
3. Arithmetic on the column
Write WHERE price > 100 / 1.1 instead of WHERE price * 1.1 > 100 — same math, but the column stays untouched.
4. LIKE: prefix vs “contains”
LIKE 'abc%' often behaves like a range (engine-dependent). LIKE '%abc' or '%a%b%' has no fixed prefix — a normal btree on the whole string usually cannot narrow the search. For real search UX, plan for full-text, trigram (Postgres), or an external search service.
5. OR and mixed predicates
One OR can stop the optimizer from using a single nice index. Sometimes UNION ALL of two simpler queries (each selective) is clearer and faster — measure both.
6. NULL and NOT IN
Use IS NULL / IS NOT NULL, not = NULL. For exclusions, NOT EXISTS is often clearer and plans better than huge NOT IN lists — but always EXPLAIN on your engine and data.
What you see in production (symptoms checklist)
- Latency: p95/p99 jumps while p50 stays OK — classic “sometimes we scan a giant table.”
- Database CPU: sustained high CPU on OLTP nodes during business hours.
- Disk / IOPS: read throughput spikes on large tables without matching traffic growth.
- App side: pool timeouts, “too many connections,” retries that make the storm worse.
- Staging blind spots: everything “fast” until row count crosses a threshold — non-SARGable queries scale badly.
Reading EXPLAIN (simple signals)
| You see | Ask |
|---|---|
| Full table scan on a huge table for a “by id” or “by date range” query | Is the column bare? Types matched? Any function on the column? |
key is null (MySQL) or no index condition (many plans) | Was an index expected? Predicate may not be sargable. |
Estimated rows wildly off | Update statistics; check for skew and bad estimates on OR/NOT IN. |
When rewrite is not enough
Suffix search, fuzzy text, arbitrary JSON paths, and very selective filters on expressions may need different index types, materialized columns, or another system (search, analytics DB). That is normal — the goal is to match the tool to the access pattern.
Before merge: a short checklist
- Indexed columns appear plain in
WHERE(no hidden function). - Parameters match column types at the API boundary.
EXPLAINon realistic row counts.- Slow-query logging in dev/staging with a tight threshold.
Something I got wrong once
On a busy read path I bound a string to an INT column and used ROUND(…) on a numeric column inside WHERE. Indexes existed; the predicates did not let the engine use them well. Fixing types and switching to a plain range on the raw column removed the pain without a migration — a good lesson that “we have an index” is not the same as “this query can seek.”
Closing
Non-SARGable patterns are the same across MySQL, PostgreSQL, and SQL Server; only the EXPLAIN output and exact rewrite differ. Write filters so the column stays in its natural indexed shape, push casts and math to literals or app code, and treat the table above as a map from query shape to what breaks in production.