
Analytics Engineering with dbt: Trusted Metrics at Scale

Metasphere Engineering · 13 min read

The quarterly business review is underway. The CFO’s dashboard shows one revenue figure for Q3. The CRO’s dashboard shows a number several percent lower. Both pull from Salesforce. The CFO’s Looker tile recognizes revenue at contract signature. The CRO’s Tableau worksheet recognizes it at first payment. Both are technically correct by their own definitions. Neither analyst knows the other’s logic exists. The executive meeting that was supposed to discuss growth strategy turns into a 40-minute argument about which spreadsheet is right.

You have been in this meeting. Everyone has. It is the meeting that makes you question your career choices.

The problem is not the analysts. They are doing their best with the tools they have. The problem is that revenue recognition logic lives in a Looker calculated field, a Tableau worksheet formula, a Salesforce report filter, and three analysts’ saved queries on their laptops. There is no single source of truth. There is no version history. There is no way to answer “when did this number change and why?”

Analytics engineering is the discipline that prevents this. It applies practices software engineers have used for decades (version control, testing, code review, modular design) to the SQL transformations that produce business metrics. When your revenue definition lives in a versioned, tested dbt model called int_revenue_recognized rather than scattered across five tools, the debate about which number is right becomes structurally impossible. Not discouraged. Impossible.

[Figure: Two conflicting revenue dashboards converge into a single source of truth via dbt layers. The CFO's Looker dashboard recognizes revenue at contract signature (Q3 revenue: 14.2M); the CRO's Tableau dashboard recognizes it at first payment (Q3 revenue: 12.8M). Three dbt layers build bottom-up: a staging layer (stg_salesforce_opportunities, stg_stripe_charges) cleans and casts, an intermediate layer (int_revenue_recognized) holds one tested business-logic definition, and a mart layer (mart_monthly_revenue) serves consumption. The result is a single unified dashboard: one definition, one number, no debate.]

The Modular Layer Architecture

The dbt modeling convention that practitioners have converged on has three layers with distinct responsibilities. This is not theoretical. It is what actually works in production.

Staging models are the cleaning layer. One staging model per source table. Zero business logic. None. The purpose is to standardize naming to snake_case, cast data types correctly, handle null values consistently, and rename columns to meaningful names. When Stripe changes charge_amt to amount in their API response, you update stg_stripe_charges and every downstream transformation continues working. The staging layer is your insulation against the outside world. Without it, a single API change ripples through 30 downstream models and your Monday morning is ruined.
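A staging model in this convention might look like the following sketch. The source and column names are illustrative, not Stripe's actual schema; the point is the shape: rename, cast, standardize, and nothing else.

```sql
-- models/staging/stg_stripe_charges.sql
-- Cleaning layer only: rename to snake_case, cast types, normalize values.
-- Zero business logic. Source and column names are illustrative.
with source as (
    select * from {{ source('stripe', 'charges') }}
)

select
    id                                   as charge_id,
    customer                             as customer_id,
    cast(amount as numeric(18, 2)) / 100 as amount_usd,   -- cents to dollars
    lower(status)                        as charge_status,
    cast(created as timestamp)           as charged_at
from source
```

If Stripe renames a field upstream, the fix is one line in this file; every downstream model keeps referencing charge_id and amount_usd unchanged.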

Intermediate models are the business logic layer. This is where revenue recognition rules live, where customer lifetime value is calculated, where marketing attribution logic is applied, where session construction happens. Keeping business logic in intermediate models rather than Looker calculated fields means the logic is version-controlled, testable, peer-reviewed, and consistent across every tool. When the CFO and CRO both query mart_monthly_revenue, they get the same number because it comes from the same tested int_revenue_recognized model. The argument from that QBR simply cannot happen.
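As a sketch, the intermediate model might encode the "recognize at first payment" rule like this. The rule and the joins are illustrative; what matters is that the definition lives in exactly one tested file.

```sql
-- models/intermediate/int_revenue_recognized.sql
-- The single authoritative revenue-recognition rule: recognize at first payment.
-- Model and column names are illustrative.
with opportunities as (
    select * from {{ ref('stg_salesforce_opportunities') }}
),

first_payment as (
    select
        customer_id,
        min(charged_at) as first_charged_at
    from {{ ref('stg_stripe_charges') }}
    where charge_status = 'succeeded'
    group by customer_id
)

select
    o.opportunity_id,
    o.customer_id,
    o.contract_value,
    f.first_charged_at as revenue_recognized_at
from opportunities o
inner join first_payment f
    on o.customer_id = f.customer_id
```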

Mart models are the consumption layer. Aggregated, denormalized, built for BI tools and analysts who should never need to understand the underlying data model. A typical enterprise dbt project has 15-30 staging models, 20-50 intermediate models, and 10-20 mart models. Pairing this layer structure with CI/CD pipelines means schema changes are caught by tests before they reach production. A PR that breaks the revenue model gets a red check on GitHub, not a panicked Slack thread on Monday morning.
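The mart on top of that is deliberately boring: aggregate, denormalize, serve. A sketch, with illustrative names:

```sql
-- models/marts/mart_monthly_revenue.sql
-- Consumption layer: aggregated, BI-ready, no logic an analyst must decode.
select
    date_trunc('month', revenue_recognized_at) as revenue_month,
    count(distinct customer_id)                as paying_customers,
    sum(contract_value)                        as revenue
from {{ ref('int_revenue_recognized') }}
group by 1
```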

Clean architecture is table stakes. But architecture without tests is just organized chaos.

Tests as Data Contracts

dbt tests transform data quality from "we found the bug in a dashboard last Tuesday" to "the deployment was blocked before anyone saw it." When a test fails, deployment stops, and broken data never reaches analysts or dashboards.

The testing pyramid for analytics mirrors software testing principles. Generic tests (uniqueness, not-null, referential integrity, accepted_values) run on every model and catch structural data quality issues.
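Generic tests are declared in YAML next to the model. A sketch, with illustrative model and column names:

```yaml
# models/staging/_stg_stripe__models.yml
# Generic tests: uniqueness, not-null, referential integrity, accepted values.
# Model and column names are illustrative.
version: 2

models:
  - name: stg_stripe_charges
    columns:
      - name: charge_id
        tests:
          - unique
          - not_null
      - name: customer_id
        tests:
          - not_null
          - relationships:
              to: ref('stg_stripe_customers')
              field: customer_id
      - name: charge_status
        tests:
          - accepted_values:
              values: ['succeeded', 'pending', 'failed', 'refunded']
```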

A primary key uniqueness test that fails means you have duplicate rows that will double-count in every downstream aggregation. A not-null test on a foreign key that fails means your joins will silently drop rows. Production dashboards have underreported revenue by 12% because a join key had 12% nulls and nobody tested for it. That is not a rounding error. That is a board-level problem.

Business logic tests handle the semantic anomalies that generic tests miss. A singular test asserting sum(refunds) <= sum(charges) catches accounting errors before they reach financial reports. A test asserting that daily active users cannot drop more than 30% day-over-day catches data pipeline failures before the executive team asks why the business fell off a cliff. These tests encode business domain knowledge into the data layer and run automatically on every model update. Write them. Maintain them. They will save you.
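A singular test in dbt is a SQL file that selects rows violating the assertion; any returned row fails the test. A sketch of the refunds-versus-charges check, with illustrative names:

```sql
-- tests/assert_refunds_never_exceed_charges.sql
-- Singular test: returns rows only when the invariant is violated.
with totals as (
    select
        sum(case when charge_status = 'refunded'  then amount_usd else 0 end) as total_refunds,
        sum(case when charge_status = 'succeeded' then amount_usd else 0 end) as total_charges
    from {{ ref('stg_stripe_charges') }}
)

select *
from totals
where total_refunds > total_charges
```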

When a test fails in CI, the result is clear: here is the model, here is the test, here is what it found. The analyst does not need to trace through multiple BI tool formulas to understand where a number went wrong. The test output tells them exactly which assertion failed, on which model, with what data.

Tests guarantee that individual models are correct. But correct models producing inconsistent metrics is a different problem entirely.

The Semantic Layer: Single Source of Metric Definitions

The same metric calculated differently across tools. Revenue in Looker uses one formula. Revenue in Tableau uses another. Revenue in an ad-hoc SQL query uses a third. All three query the same mart table, but they apply different filters, different date logic, or different aggregation methods. Sound familiar? The metric layer solves this by centralizing metric definitions in a single place that all tools consume.

A semantic layer (sometimes called a metrics layer or headless BI layer) is a declarative definition of your business metrics that sits between the data warehouse and the BI tools. Instead of each tool implementing its own version of “monthly recurring revenue,” the definition lives once in the semantic layer: the measure is sum(mrr_amount), the time dimension is date_month, the filters are status = 'active' and plan_type != 'trial'. Every tool queries the semantic layer rather than writing its own SQL against the raw tables.
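In the dbt Semantic Layer, that MRR definition might look roughly like the following. This is a sketch in MetricFlow-style YAML; exact keys vary across dbt versions, and the model, entity, and dimension names are illustrative.

```yaml
# models/marts/_semantic_models.yml (sketch; syntax varies by dbt version)
semantic_models:
  - name: subscriptions
    model: ref('mart_subscriptions')
    defaults:
      agg_time_dimension: date_month
    entities:
      - name: subscription
        type: primary
        expr: subscription_id
    dimensions:
      - name: date_month
        type: time
        type_params:
          time_granularity: month
    measures:
      - name: mrr_amount
        agg: sum

metrics:
  - name: monthly_recurring_revenue
    label: Monthly Recurring Revenue
    type: simple
    type_params:
      measure: mrr_amount
```

Every BI tool asks the semantic layer for monthly_recurring_revenue rather than re-implementing the sum, the grain, and the filters on its own.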

The tooling ecosystem for semantic layers has matured rapidly. dbt introduced its metrics specification to define metrics alongside models in the same repository. MetricFlow (now integrated into dbt Cloud) compiles metric definitions into optimized SQL for the target warehouse. Cube provides an open-source semantic layer with a REST and GraphQL API that any BI tool can query. LookML, while proprietary to Looker, pioneered many of these concepts and remains widely used. The choice of tool matters less than the principle: every metric should have exactly one authoritative definition.

The impact is measurable and dramatic. Organizations that implement a semantic layer report 80-90% fewer metric discrepancy incidents within the first quarter. The reason is structural, not procedural. When there is only one definition of revenue, there can only be one revenue number. The debates that consumed executive meetings stop happening. Not because people became more agreeable. Because the disagreement became architecturally impossible.

The semantic layer also changes how metric changes are managed. When the CFO decides that revenue recognition should shift from contract signature to first payment, that change happens in one place. The metric definition is updated, reviewed in a PR, tested in CI, and deployed. Every dashboard, report, and ad-hoc query that references the revenue metric automatically reflects the new logic. Without a semantic layer, that same change requires updating Looker calculated fields, Tableau worksheet formulas, Jupyter notebooks, and whichever analyst scripts happen to calculate revenue. Some of those updates will get missed. They always do. The old number and the new number coexist in production for weeks until someone notices. A well-implemented semantic layer pairs naturally with the data quality testing approach described above. Tests validate that the underlying models are correct. The semantic layer guarantees that the business definitions applied on top of those models are consistent.

Documentation That Stays Current

If you have ever maintained a data dictionary in Confluence, you know how this ends: it is accurate on the day it is written and progressively wrong every day after. Nobody updates it. Nobody trusts it. Analytics engineering documentation is generated from the code itself, which means it stays current automatically. Column descriptions live in YAML files alongside the SQL models. Run dbt docs generate and you get a searchable documentation site with lineage graphs, test coverage, source information, and freshness metrics.
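Those descriptions live in the same YAML files as the tests. A sketch, with illustrative names:

```yaml
# models/marts/_marts__models.yml
# Descriptions sit beside the SQL; `dbt docs generate` renders them into
# the documentation site. Model and column names are illustrative.
version: 2

models:
  - name: mart_monthly_revenue
    description: >
      Monthly revenue aggregated from int_revenue_recognized.
      Revenue is recognized at first payment.
    columns:
      - name: revenue_month
        description: First day of the calendar month.
      - name: revenue
        description: Recognized revenue in USD for the month.
```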

The critical property is that documentation cannot drift from the code because they live in the same files. When a column’s calculation changes, updating the SQL and its YAML description is the same pull request, reviewed together. Your code reviewer can see that the SQL changed but the description did not, and they flag it. Documentation that requires a separate update step after code changes is documentation that is always wrong. Not slightly wrong. Wrong.

Data engineering teams that adopt dbt consistently report that documentation improves not because engineers became more disciplined, but because the tooling made good documentation the path of least resistance. Surfacing this documentation through observability and monitoring tooling gives data consumers a reliable way to understand freshness, lineage, and test coverage before trusting a metric. That is the precondition for data actually being used to make decisions rather than argued about.

When a new analyst joins the team, they can browse the dbt docs site, see every metric definition with its SQL, trace the lineage back to the raw source, and understand what tests protect the data. All before writing their first query. That is onboarding in hours instead of weeks. And it is the kind of institutional knowledge that survives team turnover because it lives in the codebase, not in someone’s head.

Good docs and tested models get you most of the way. The rest depends on how you organize the people writing them.

Team Structure and Workflow

The tooling only works if the organizational structure supports it. Where analytics engineers sit, how they collaborate with data engineers and stakeholders, and how their work is reviewed all determine whether dbt becomes a genuine transformation or an expensive experiment that three people use and everyone else ignores.

Organizational placement is the first decision, and most companies get it wrong on the first try. Some centralize analytics engineers in a single team that serves the whole organization. Others embed them directly within domain teams (marketing, finance, product) alongside the analysts they support. The embedded model produces better business context. An analytics engineer who sits in the finance team understands revenue recognition nuances, accrual timing, and the difference between bookings and billings. A centralized analytics engineer serving finance as one of eight domains rarely develops that depth. The tradeoff is consistency. Centralized teams enforce uniform coding standards, testing practices, and naming conventions more easily. The most effective structure for organizations with more than 100 dbt models is a hybrid: analytics engineers are embedded in domain teams for day-to-day work, but they report into a central analytics engineering function that sets standards, maintains shared macros, and coordinates cross-domain model dependencies.

The PR review workflow for SQL models needs two distinct perspectives. The analytics engineer reviews the technical implementation: is the SQL performant, does it follow the layer architecture, are tests sufficient, is the model materialized correctly. A business stakeholder (often a senior analyst or domain lead) reviews the semantic correctness: does the revenue logic match the company’s revenue recognition policy, does the churn definition align with how the board measures retention. Skip business review and you get technically excellent models that calculate the wrong thing. Skip technical review and you get correct definitions implemented in SQL that scans terabytes when it could scan gigabytes. Both failures are expensive. Both reviews happen on the same pull request in Git. This workflow integrates naturally with CI/CD pipelines that run dbt tests automatically on every PR, so the reviewer already knows whether the tests pass before they begin their review.
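A minimal CI job for that workflow might look like the following GitHub Actions sketch. The warehouse adapter, profile setup, and secret names are assumptions; dbt build runs models and their tests together and fails the check on any test failure.

```yaml
# .github/workflows/dbt_ci.yml (sketch; adapter and secrets are illustrative)
name: dbt CI
on: pull_request

jobs:
  dbt-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-snowflake   # assumes a Snowflake warehouse
      - run: dbt deps
      - run: dbt build                   # runs models + tests; failures block the PR
        env:
          DBT_PROFILES_DIR: .
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
```

Larger projects typically narrow this to modified models and their downstream dependencies (dbt's "slim CI" pattern) to keep PR checks fast.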

The boundary between analytics engineers and data engineers is clear in practice even when it looks blurry on paper. Data engineers own the infrastructure: ingestion pipelines, warehouse configuration, orchestration schedules, compute optimization, and raw data availability. Analytics engineers own the transformation layer: what the data means, how entities relate, how metrics are calculated, and how consumers access the results. The handoff point is the source tables. Data engineers guarantee that source tables are fresh, complete, and structurally stable. Analytics engineers take those source tables and build the staging, intermediate, and mart models that give the data business meaning. When these boundaries are unclear, you get data engineers writing business logic they do not fully understand or analytics engineers debugging Airflow DAGs they should never need to touch. Define the interface. Let each role focus on what they do best.

Scaling the analytics engineering function follows a predictable curve. A single analytics engineer can manage 30-50 dbt models serving 5-10 regular data consumers. Beyond that, model maintenance, test upkeep, stakeholder requests, and PR reviews consume more time than one person has. At 100 models with 20 or more consumers, you need 3 analytics engineers. At 200 models with cross-domain dependencies and a semantic layer, you need 5 or more. The signals that you have outgrown your current team size are consistent: PR review turnaround exceeds 3 business days, test coverage starts declining because nobody has time to write tests for new models, and stakeholder requests sit in a backlog for weeks. These are not problems you solve by asking the existing team to work harder. They are capacity signals that the function needs investment. The organizations that treat analytics engineering as a staffing priority rather than a side responsibility for analysts see 2-3x faster time to insight and dramatically fewer metric trust issues across the business. The ones that treat it as “something the analysts can handle” end up back in that QBR arguing about which spreadsheet is right.

Make Your Business Metrics Trustworthy

When analysts debate which number is right instead of what the number means, analytics is failing. Metasphere builds analytics engineering practices with dbt that give your team version-controlled, tested SQL transformations, so every metric in every dashboard is answerable and auditable.

Transform Your Analytics Stack

Frequently Asked Questions

What is dbt and what problem does it solve?


dbt (data build tool) is a transformation framework that lets data teams write SQL as version-controlled, testable, documented models in Git. It solves the problem where revenue logic lives in a BI tool formula, segmentation logic lives in a saved query, and nobody can answer ‘why did this number change?’ Teams adopting dbt reduce metric discrepancies by 80-90% within the first quarter because every transformation is auditable, testable, and owned.

What are the three standard dbt model layers?


Staging models clean raw source data with one model per source table and no business logic. Intermediate models apply business logic, joins, and calculations like revenue recognition and attribution. Mart models produce final, aggregated, BI-optimized datasets. This separation means changing a business rule updates one intermediate model, not a dozen BI reports with divergent implementations.

What dbt tests should every model have?


Minimum four tests per model: primary key uniqueness, not-null on primary key, referential integrity between related models, and accepted_values on enum columns. Business-critical models add custom tests like ‘monthly revenue cannot drop more than 50% without a known cause.’ Mature projects average 3-5 tests per model, all running in CI to block deployment on failure.

How does dbt lineage help with impact analysis?


dbt builds a directed acyclic graph of all model dependencies from ref() references automatically. When changing a staging model, the lineage graph immediately shows every downstream model, test, and dashboard exposure affected. Without lineage, answering ‘which dashboards break if this source changes?’ requires manually maintained docs that are almost always out of date.

When does a team need an analytics engineer vs. a data engineer?


Data engineers build infrastructure: pipelines, warehouses, orchestration, data movement. Analytics engineers define what data means: metric definitions, entity relationships, segmentation logic, attribution models. Teams with more than 50 dbt models or 10+ data consumers benefit from dedicated analytics engineers. They own the semantic layer between raw data and business decisions, working primarily in SQL and dbt.