A Practical dbt Testing Strategy

Albin Cikaj
Building data pipelines and writing about what I learn along the way.

Most data teams know they should be testing their dbt models. Fewer have a clear strategy for what to test, where to add tests, and how to avoid a false sense of security from tests that don’t catch real problems.

This is what I’ve landed on after a few iterations.


The Problem with “Just Add Tests”

Adding not_null and unique tests to every column feels thorough. It isn’t. These tests catch structural problems but miss business logic errors — the kind where the data is technically valid but numerically wrong.

The goal of a testing strategy isn’t maximum coverage. It’s catching the failures that would actually hurt someone.


Layer-by-Layer Testing

Staging models — test the raw data you don’t control:

  • not_null on required fields
  • accepted_values on status/type enums
  • unique on natural keys you’re treating as unique

These tests catch upstream schema changes early, before they propagate into marts.
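A staging test block along these lines might look like the following sketch (the model name, column names, and enum values here are hypothetical, not from the post):

```yaml
# models/staging/_stg_orders.yml — hypothetical staging model
version: 2

models:
  - name: stg_orders
    columns:
      - name: order_id          # natural key we're treating as unique
        tests:
          - unique
          - not_null
      - name: status            # enum from the source system
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
```

If the upstream system adds a new status or starts emitting null IDs, these fail at the staging layer rather than somewhere downstream.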

Intermediate models — test join logic:

  • Row count assertions (joining A to B should never multiply rows unless expected)
  • relationships tests across foreign keys
  • Custom tests for business rules that aren’t obvious from column names alone
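A row-multiplication check can be written as a singular dbt test — a SQL file in `tests/` that fails if it returns any rows. The model and column names below are assumptions for illustration:

```sql
-- tests/assert_orders_join_preserves_grain.sql (hypothetical)
-- Singular test: fails if the join to customers duplicated any order.
select
    order_id,
    count(*) as row_count
from {{ ref('int_orders_joined') }}
group by order_id
having count(*) > 1
```

Because the test asserts the grain of the joined model directly, a bad join condition shows up as a test failure instead of inflated metrics downstream.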

Mart models — test business correctness:

  • Metric bounds (revenue shouldn’t be negative, conversion rate shouldn’t exceed 100%)
  • Reconciliation against known totals when available
  • Freshness assertions via dbt source freshness
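Metric bounds map neatly onto `dbt_utils.accepted_range` (assuming the dbt_utils package is installed; the model and column names are made up):

```yaml
# models/marts/_fct_daily_revenue.yml — assumes dbt_utils is a project dependency
version: 2

models:
  - name: fct_daily_revenue
    columns:
      - name: revenue
        tests:
          - dbt_utils.accepted_range:
              min_value: 0          # revenue shouldn't be negative
      - name: conversion_rate
        tests:
          - dbt_utils.accepted_range:
              min_value: 0
              max_value: 1          # conversion rate shouldn't exceed 100%
```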

Custom Tests Worth Having

The built-in generic tests (not_null, unique, accepted_values, relationships) cover a lot. A few custom macros I’ve found useful:

  • Row count comparison between runs — catch unexpected drops in data volume
  • Date range validation — events should fall within a plausible window
  • Metric reconciliation — compare a rolled-up mart total against a source system total
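The date range validation can be packaged as a custom generic test so it's reusable across models. A minimal sketch (the macro name and default cutoff are my own, not an established convention):

```sql
-- macros/test_plausible_date.sql (hypothetical generic test)
-- Fails on rows whose date falls before a floor or in the future.
{% test plausible_date(model, column_name, earliest='2015-01-01') %}

select {{ column_name }}
from {{ model }}
where {{ column_name }} < '{{ earliest }}'
   or {{ column_name }} > current_date

{% endtest %}
```

Once defined, it attaches to any date column in a schema file like the built-ins do: `- plausible_date: {earliest: '2020-01-01'}`.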


What Not to Test

  • Don’t test behavior that dbt or the warehouse itself guarantees (e.g., that coalesce returns the first non-null argument)
  • Don’t duplicate tests across layers — if staging validates an enum, intermediate doesn’t need to re-validate it
  • Don’t add severity: error on tests you’d actually want to warn about. Use warn for informational tests, error for blocking ones
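The severity split looks like this in a schema file (column names are hypothetical):

```yaml
models:
  - name: stg_events
    columns:
      - name: utm_source
        tests:
          - not_null:
              config:
                severity: warn    # informational: missing UTM tags shouldn't block a run
      - name: event_id
        tests:
          - unique:
              config:
                severity: error   # blocking: duplicate events corrupt downstream counts
```

`dbt test` exits non-zero only on error-severity failures, which is what makes the warn/error split useful in CI.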

Making Tests Useful in Practice

Tests are only useful if they run. Set up CI to run dbt test on every PR. Set up alerting on production run failures.
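A minimal CI sketch, assuming GitHub Actions, a Postgres adapter, and credentials supplied via repository secrets — adjust all of these to your setup:

```yaml
# .github/workflows/dbt-ci.yml — hypothetical minimal pipeline
name: dbt CI
on: pull_request

jobs:
  dbt-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install dbt-postgres
      - run: dbt deps
      # dbt build runs models and their tests together, in DAG order
      - run: dbt build --target ci
        env:
          DBT_ENV_SECRET_PASSWORD: ${{ secrets.DBT_PASSWORD }}
```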

Also: when a test fails, the fix should be obvious. If you have to investigate what the test means before you can fix it, the test isn’t well-named or well-documented. Add a description.


The Honest Limitation

dbt tests are great at catching structural and invariant violations. They’re weak at catching subtle logic errors — a miscalculated metric that’s off by 3% will pass every structural test. For that, you need either reconciliation against a source of truth or a human who knows what the numbers should look like.

Tests are a layer of defense, not a proof of correctness.