A Practical dbt Testing Strategy

Albin Cikaj
Building data pipelines and writing about what I learn along the way.

Most data teams know they should be testing their dbt models. Fewer have a clear strategy for what to test, where to add tests, and how to avoid a false sense of security from tests that don’t catch real problems.

This is what I’ve landed on after a few iterations.


The Problem with “Just Add Tests”

Adding not_null and unique tests to every column feels thorough. It isn’t. These tests catch structural problems but miss business logic errors — the kind where the data is technically valid but numerically wrong.

The goal of a testing strategy isn’t maximum coverage. It’s catching the failures that would actually hurt someone.


Layer-by-Layer Testing

Staging models — test the raw data you don’t control:

  • not_null on required fields
  • accepted_values on status/type enums
  • unique on natural keys you’re treating as unique

These tests catch upstream schema changes early, before they propagate into marts.
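A staging test block along these lines might look like the following sketch (the model name, column names, and enum values here are hypothetical, not from the post):

```yaml
# models/staging/_stg_orders.yml — hypothetical staging model
version: 2

models:
  - name: stg_orders
    columns:
      - name: order_id          # natural key we're treating as unique
        tests:
          - unique
          - not_null
      - name: status            # enum from the source system
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
```

If the upstream system adds a new status or starts emitting null IDs, these fail at the staging layer rather than somewhere downstream.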

Intermediate models — test join logic:

  • Row count assertions (joining A to B should never multiply rows unless expected)
  • relationships tests across foreign keys
  • Custom tests for business rules that aren’t obvious from column names alone
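A row-multiplication check can be written as a singular dbt test — a SQL file in `tests/` that fails if it returns any rows. The model and column names below are assumptions for illustration:

```sql
-- tests/assert_orders_join_preserves_grain.sql (hypothetical)
-- Singular test: fails if the join to customers duplicated any order.
select
    order_id,
    count(*) as row_count
from {{ ref('int_orders_joined') }}
group by order_id
having count(*) > 1
```

Because the test asserts the grain of the joined model directly, a bad join condition shows up as a test failure instead of inflated metrics downstream.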

Mart models — test business correctness:

  • Metric bounds (revenue shouldn’t be negative, conversion rate shouldn’t exceed 100%)
  • Reconciliation against known totals when available
  • Freshness assertions via dbt source freshness
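Metric bounds map neatly onto `dbt_utils.accepted_range` (assuming the dbt_utils package is installed; the model and column names are made up):

```yaml
# models/marts/_fct_daily_revenue.yml — assumes dbt_utils is a project dependency
version: 2

models:
  - name: fct_daily_revenue
    columns:
      - name: revenue
        tests:
          - dbt_utils.accepted_range:
              min_value: 0          # revenue shouldn't be negative
      - name: conversion_rate
        tests:
          - dbt_utils.accepted_range:
              min_value: 0
              max_value: 1          # conversion rate shouldn't exceed 100%
```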

Custom Tests Worth Having

The built-in generic tests (not_null, unique, accepted_values, relationships) cover a lot. A few custom macros I’ve found useful:

  • Row count comparison between runs — catch unexpected drops in data volume
  • Date range validation — events should fall within a plausible window
  • Metric reconciliation — compare a rolled-up mart total against a source system total
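The date range validation can be packaged as a custom generic test so it's reusable across models. A minimal sketch (the macro name and default cutoff are my own, not an established convention):

```sql
-- macros/test_plausible_date.sql (hypothetical generic test)
-- Fails on rows whose date falls before a floor or in the future.
{% test plausible_date(model, column_name, earliest='2015-01-01') %}

select {{ column_name }}
from {{ model }}
where {{ column_name }} < '{{ earliest }}'
   or {{ column_name }} > current_date

{% endtest %}
```

Once defined, it attaches to any date column in a schema file like the built-ins do: `- plausible_date: {earliest: '2020-01-01'}`.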


What Not to Test

  • Don’t test behavior that dbt or the warehouse itself guarantees (e.g., that coalesce returns the first non-null argument)
  • Don’t duplicate tests across layers — if staging validates an enum, intermediate doesn’t need to re-validate it
  • Don’t add severity: error on tests you’d actually want to warn about. Use warn for informational tests, error for blocking ones
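The severity split looks like this in a schema file (column names are hypothetical):

```yaml
models:
  - name: stg_events
    columns:
      - name: utm_source
        tests:
          - not_null:
              config:
                severity: warn    # informational: missing UTM tags shouldn't block a run
      - name: event_id
        tests:
          - unique:
              config:
                severity: error   # blocking: duplicate events corrupt downstream counts
```

`dbt test` exits non-zero only on error-severity failures, which is what makes the warn/error split useful in CI.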

Making Tests Useful in Practice

Tests are only useful if they run. Set up CI to run dbt test on every PR. Set up alerting on production run failures.
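A minimal CI sketch, assuming GitHub Actions, a Postgres adapter, and credentials supplied via repository secrets — adjust all of these to your setup:

```yaml
# .github/workflows/dbt-ci.yml — hypothetical minimal pipeline
name: dbt CI
on: pull_request

jobs:
  dbt-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install dbt-postgres
      - run: dbt deps
      # dbt build runs models and their tests together, in DAG order
      - run: dbt build --target ci
        env:
          DBT_ENV_SECRET_PASSWORD: ${{ secrets.DBT_PASSWORD }}
```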

Also: when a test fails, the fix should be obvious. If you have to investigate what the test means before you can fix it, the test isn’t well-named or well-documented. Add a description.


The Honest Limitation

dbt tests are great at catching structural and invariant violations. They’re weak at catching subtle logic errors — a miscalculated metric that’s off by 3% will pass every structural test. For that, you need either reconciliation against a source of truth or a human who knows what the numbers should look like.

Tests are a layer of defense, not a proof of correctness.