Most data teams know they should be testing their dbt models. Fewer have a clear strategy for what to test, where to add tests, and how to avoid a false sense of security from tests that don’t catch real problems.
This is what I’ve landed on after a few iterations.
The Problem with “Just Add Tests”
Adding `not_null` and `unique` tests to every column feels thorough. It isn’t. These tests catch structural problems but miss business logic errors — the kind where the data is technically valid but numerically wrong.
The goal of a testing strategy isn’t maximum coverage. It’s catching the failures that would actually hurt someone.
Layer-by-Layer Testing
Staging models — test the raw data you don’t control:
- `not_null` on required fields
- `accepted_values` on status/type enums
- `unique` on natural keys you’re treating as unique
These tests catch upstream schema changes early, before they propagate into marts.
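As a sketch, a staging schema file covering these three test types might look like this (the model and column names are illustrative, not from any real project):

```yaml
# models/staging/_stg__models.yml (illustrative names)
version: 2

models:
  - name: stg_orders
    columns:
      - name: order_id
        tests:
          - not_null
          - unique          # natural key we're treating as unique
      - name: customer_id
        tests:
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['pending', 'shipped', 'delivered', 'cancelled']
```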
Intermediate models — test join logic:
- Row count assertions (joining A to B should never multiply rows unless expected)
- `relationships` tests across foreign keys
- Custom tests for business rules that aren’t obvious from column names alone
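A row count assertion can be written as a singular test: a SQL file in `tests/` that returns rows only when the invariant is violated. A minimal sketch, with illustrative model names:

```sql
-- tests/assert_orders_join_no_fanout.sql (illustrative model names)
-- A singular test fails when it returns one or more rows.
with joined as (
    select count(*) as n from {{ ref('int_orders_enriched') }}
),
base as (
    select count(*) as n from {{ ref('stg_orders') }}
)
select joined.n as joined_rows, base.n as base_rows
from joined, base
where joined.n > base.n   -- the join should never multiply rows
```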
Mart models — test business correctness:
- Metric bounds (revenue shouldn’t be negative, conversion rate shouldn’t exceed 100%)
- Reconciliation against known totals when available
- Freshness assertions via `dbt source freshness`
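Metric bounds also work well as singular tests. A sketch of the two bounds mentioned above, assuming an illustrative mart model:

```sql
-- tests/assert_mart_metric_bounds.sql (illustrative names)
-- Returns offending rows, so the test fails if any exist.
select order_date, revenue, conversion_rate
from {{ ref('mart_daily_metrics') }}
where revenue < 0
   or conversion_rate > 1.0   -- conversion rate stored as a fraction
```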
Custom Tests Worth Having

The built-in generic tests (`not_null`, `unique`, `accepted_values`, `relationships`) cover a lot. A few custom macros I’ve found useful:

- Row count comparison between runs — catch unexpected drops in data volume.
- Date range validation — events should fall within a plausible window.
- Metric reconciliation — compare a rolled-up mart total against a source system total.
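The date range check can be made reusable as a custom generic test macro. A sketch; the macro name and default bound are illustrative:

```sql
-- macros/test_within_plausible_window.sql (illustrative)
{% test within_plausible_window(model, column_name, start_date='2015-01-01') %}
-- Fails on rows whose date is before the plausible start or in the future.
select {{ column_name }}
from {{ model }}
where {{ column_name }} < '{{ start_date }}'
   or {{ column_name }} > current_date
{% endtest %}
```

Once defined, it can be applied in a schema file like any built-in test, passing `start_date` as a keyword argument.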
What Not to Test

- Don’t test transformations that dbt itself guarantees (e.g., that a `coalesce` works how it says it does)
- Don’t duplicate tests across layers — if staging validates an enum, intermediate doesn’t need to re-validate it
- Don’t add `severity: error` on tests you’d actually want to warn about. Use `warn` for informational tests, `error` for blocking ones
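Severity is set in the test’s config block. For example, demoting a volume-sensitive check to a warning might look like this (illustrative names):

```yaml
# schema.yml snippet: report the failure without blocking the run
- name: stg_events
  columns:
    - name: event_id
      tests:
        - unique:
            config:
              severity: warn
```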
Making Tests Useful in Practice
Tests are only useful if they run. Set up CI to run `dbt test` on every PR, and set up alerting on production run failures.
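A minimal CI sketch, assuming GitHub Actions, a Postgres adapter, and a dbt profile checked into the repo (the workflow name, adapter, and steps are all assumptions to adapt):

```yaml
# .github/workflows/dbt-ci.yml (illustrative)
name: dbt CI
on: pull_request

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install dbt-core dbt-postgres  # adapter is an assumption
      - run: dbt deps
      - run: dbt build   # runs models and their tests together
        env:
          DBT_PROFILES_DIR: .
```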
Also: when a test fails, the fix should be obvious. If you have to investigate what the test means before you can fix it, the test isn’t well-named or well-documented. Add a description.
The Honest Limitation
dbt tests are great at catching structural and invariant violations. They’re weak at catching subtle logic errors — a miscalculated metric that’s off by 3% will pass every structural test. For that, you need either reconciliation against a source of truth or a human who knows what the numbers should look like.
Tests are a layer of defense, not a proof of correctness.