
SQLMesh for dbt Users: The Mental Model Shifts That Matter

Scotty Pate · 11 min read

Introduction

If you're reading this, you probably have production dbt experience and you're looking at SQLMesh. Maybe you've read the feature comparisons. Maybe someone on your team ran through the quickstart. This post isn't another feature matrix — it's about the mental model shifts that trip up experienced dbt users when they start working with SQLMesh in practice.

Some of these differences are immediately obvious. Others only surface when you try to apply dbt patterns to SQLMesh and they don't work the way you expect.

This is Part 3 of a three-part series. Part 1 covers infrastructure, CI/CD, and deployment. Part 2 covers state management, scheduling, and observability.


Environments Share Physical Tables

In dbt, an environment is typically a separate schema or database. Running your models in dev creates tables in a dev schema. Running in prod creates tables in a prod schema. Every environment is a full, independent copy of every table.

SQLMesh environments work differently. An environment is a set of pointers to physical tables. When you create alice_dev and you haven't changed any models, alice_dev points at the same physical tables as prod. No data is duplicated. When Alice modifies a model, SQLMesh creates a new physical table for that model only — everything else still shares.

This means:

- analytics__alice_dev.orders is a view, not a copy. You can open your warehouse and query it directly, but it points at a physical table stored in sqlmesh__analytics under an auto-generated, fingerprinted name.
- Those physical tables are shared across every environment that references the same snapshot, so no data is duplicated for unchanged models.
- Creating a new environment is instant, and storage doesn't scale linearly with the number of dev environments.

The naming convention is the key adjustment: in dbt, analytics_dev.orders is a physical copy of the data; in SQLMesh, the environment schema is a layer of views over shared physical tables. sqlmesh evaluate and sqlmesh fetchdf are available for ad-hoc model evaluation, but they're not the primary way to inspect your dev data.
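Querying a dev environment therefore looks like ordinary warehouse SQL — the virtual layer is just views. A sketch of what the warehouse objects look like (the fingerprint suffix below is made up; SQLMesh generates its own names):

```sql
-- The environment layer: one view per model, named after the environment.
-- Querying this is how you inspect dev data.
SELECT * FROM analytics__alice_dev.orders;

-- What that view resolves to: a shared physical table in the sqlmesh
-- schema with a fingerprinted name (the suffix here is illustrative).
-- Unchanged models in alice_dev and prod resolve to the same table.
SELECT * FROM sqlmesh__analytics.analytics__orders__1234567890;
```

Because the view is the stable name and the physical table is content-addressed, "promoting" a change to prod is just repointing views — no data moves.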


There's No dbt run Equivalent

In dbt, dbt run builds everything. It's the core command. You run it locally, you run it in CI, you run it on a schedule.

SQLMesh splits this into two concepts that serve different purposes:

sqlmesh plan computes what has changed and, when confirmed, applies those changes. This is your deployment command — the thing you run when model definitions have changed and you need to update an environment. Plan handles DDL (creating/replacing tables), backfills (re-executing historical intervals for changed models), and environment state updates.

sqlmesh run executes due intervals based on cron schedules. This is your scheduling command — the thing that runs on a timer to process new data. It does not deploy changes. It only processes intervals that are due and haven't been completed yet.

The mental model shift: in dbt, deployment and scheduling are the same command. In SQLMesh, they're deliberately separated. plan answers "what should the world look like?" and makes it so. run answers "is there new data to process?" and processes it.

If you try to use sqlmesh run to deploy model changes, nothing will happen — run only processes intervals for models that are already deployed. If you try to use sqlmesh plan on a schedule to process new data, it will work but it's doing unnecessary computation (re-evaluating model fingerprints every time) and you lose the clean separation between "change the pipeline" and "run the pipeline."
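In practice the split maps cleanly onto where each command runs. A sketch of the two entry points — environment names are illustrative, and flags like --auto-apply are worth verifying against your SQLMesh version:

```shell
# Deployment path — run when model definitions change
# (CI on merge to main, or interactively during development).
# --auto-apply skips the interactive confirmation prompt.
sqlmesh plan prod --auto-apply

# Scheduling path — run on a timer (cron, Airflow, etc.) to process
# intervals that have come due. Deploys no model changes.
sqlmesh run prod
```

A typical setup wires the first command into CI and the second into the scheduler, and never crosses them.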


State Is Persistent and Matters

dbt is stateless by default. Every dbt run looks at your model files, compiles SQL, and executes it. There's no persistent record of what ran before (unless you're using dbt Cloud's metadata or a separate tool like dbt artifacts). The manifest.json and run_results.json are generated fresh each run.

SQLMesh has a persistent state database. It tracks:

- a fingerprinted snapshot of every model version that has been planned
- which time intervals have been processed for each snapshot
- which snapshots each environment currently points to
This state is what makes features like virtual environments, incremental idempotency, and automatic change detection work. It's also something you have to manage — it needs a real database (Postgres, not SQLite), it needs backups, and it needs periodic cleanup via sqlmesh janitor.
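Pointing state at a real database is a one-time configuration change. A sketch of what that looks like in config.yaml, assuming a Postgres instance (hostnames and credentials are placeholders):

```yaml
# config.yaml — a gateway with a dedicated state database.
# The warehouse connection handles model execution; state_connection
# holds SQLMesh's snapshot, interval, and environment records.
gateways:
  prod:
    connection:
      type: snowflake              # your warehouse; details omitted
    state_connection:
      type: postgres
      host: sqlmesh-state.internal # placeholder
      port: 5432
      database: sqlmesh_state
      user: sqlmesh
      password: change_me          # inject from a secret store in practice
```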

If you lose the state database, SQLMesh doesn't know what's been deployed or what intervals have been processed. See Part 2 for disaster recovery details.


Model Kinds vs. Materializations

dbt has materializations: table, view, incremental, ephemeral. SQLMesh has model kinds, which overlap but aren't the same:

| dbt Materialization | SQLMesh Kind | Key Difference |
| --- | --- | --- |
| table | FULL | SQLMesh FULL rebuilds the entire table on every run. Same behavior as dbt table. |
| view | VIEW | Same concept. |
| incremental | INCREMENTAL_BY_TIME_RANGE | SQLMesh requires an explicit time column and manages intervals automatically. No is_incremental() conditional logic. |
| incremental (merge) | INCREMENTAL_BY_UNIQUE_KEY | Similar to dbt's unique_key strategy, but the model definition is declarative. |
| ephemeral | EMBEDDED | SQLMesh inlines the SQL at reference time. |
| (no equivalent) | SEED | Like dbt seeds but defined as a model kind. |
| (no equivalent) | EXTERNAL | References tables managed outside SQLMesh. |

The biggest behavioral difference is in incremental models.

Incremental Models: No More is_incremental()

In dbt, incremental models use the is_incremental() Jinja macro to switch between full-refresh and incremental logic:

-- dbt pattern
SELECT *
FROM source_table
{% if is_incremental() %}
WHERE event_date > (SELECT MAX(event_date) FROM {{ this }})
{% endif %}

This pattern has known problems. The model has two code paths — one for the initial full load and one for incremental loads. The two paths can produce different results if you're not careful, and the SELECT MAX() pattern is fragile when data arrives late or out of order.

SQLMesh eliminates this entirely. You write one query that handles a time range, and SQLMesh manages the intervals:

MODEL (
  name analytics.events_daily,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column event_date
  )
);

SELECT
  event_date,
  event_type,
  COUNT(*) as event_count
FROM source_db.raw_events
WHERE event_date BETWEEN @start_date AND @end_date
GROUP BY event_date, event_type

The @start_date and @end_date macros are injected by SQLMesh based on which intervals need processing. There's no conditional branching, no querying the target table to find the high watermark, and no two-code-path problem. The same query runs for the initial load and every subsequent interval.

This is a better model, but it requires you to stop thinking in terms of "full refresh vs. incremental mode" and start thinking in terms of "this query processes a time range, and SQLMesh decides which ranges to process."
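Concretely, for each due interval SQLMesh renders the macros into literal bounds before execution. What the executed query might look like for a single daily interval (the dates are illustrative):

```sql
-- The events_daily query from above, rendered for one interval.
-- @start_date and @end_date become that interval's boundaries.
SELECT
  event_date,
  event_type,
  COUNT(*) AS event_count
FROM source_db.raw_events
WHERE event_date BETWEEN '2024-01-05' AND '2024-01-05'
GROUP BY event_date, event_type
```

Backfilling a month is just SQLMesh running the same rendered query once per missing interval — no special backfill code path.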


Change Detection Is Automatic

In dbt, if you change a model, you have to know which downstream models need rebuilding and either run them explicitly or run the full DAG. The --select flag and + graph operators help, but the burden of knowing what's affected is on you.

SQLMesh fingerprints every model. When you run sqlmesh plan, it compares the current fingerprints against the state database and determines exactly which models have changed and which downstream models are affected. It classifies changes as:

- Breaking — the change alters the data downstream models would produce, so affected downstream models need to be backfilled.
- Non-breaking — the change (adding a new column, for example) doesn't invalidate downstream data, so only the changed model is backfilled.
This classification happens automatically. You don't need to specify what to rebuild — SQLMesh figures it out. The plan output tells you exactly what it's going to do, including which models will be backfilled and for which date ranges.

The adjustment for dbt users: you don't manually orchestrate partial DAG runs. You change the models you need to change, run sqlmesh plan, and review what SQLMesh proposes. If the scope of the change surprises you, that's useful information — it means you didn't realize how far the change would propagate.


Testing Works Differently

dbt tests are separate YAML-defined assertions that run after your models. You run dbt test (or dbt build) and it executes queries that check for nulls, uniqueness, accepted values, and custom logic.

SQLMesh has two mechanisms:

Unit Tests

SQLMesh unit tests are true unit tests — they validate model logic with mock data without touching the warehouse:

# tests/test_orders.yaml
test_revenue_calculation:
  model: analytics.orders_daily
  inputs:
    source_db.raw_orders:
      rows:
        - order_id: 1
          amount: 100.00
          discount: 10.00
  outputs:
    query:
      rows:
        - order_id: 1
          revenue: 90.00

These run locally and fast. They test that your SQL logic produces the expected output given specific inputs. This is something dbt doesn't have — dbt tests always run against actual warehouse data.
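Unit tests run from the CLI, typically against an in-memory engine, so they fit into pre-commit hooks and CI without warehouse credentials (exact selection syntax is worth checking against your SQLMesh version):

```shell
# Run every test under tests/
sqlmesh test

# Run a single test by name
sqlmesh test tests/test_orders.yaml::test_revenue_calculation
```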

Audits

Audits are SQLMesh's equivalent of dbt's data tests. They run against actual data after model execution and assert conditions like non-null, accepted ranges, and uniqueness:

MODEL (
  name analytics.orders_daily,
  audits (
    not_null(columns := (order_id, order_date, revenue)),
    accepted_range(column := revenue, min_v := 0),
    unique_values(columns := (order_id))
  )
);

Audits can also be defined in separate files for complex assertions, similar to dbt's custom test macros:

AUDIT (
  name assert_orders_have_customers
);

SELECT *
FROM @this_model
WHERE customer_id NOT IN (SELECT customer_id FROM analytics.customers);

The key difference: in dbt, tests are always a separate step. In SQLMesh, audits run as part of model execution — if an audit fails, the interval is marked as failed. There's no gap between "data was loaded" and "data was validated."


Jinja Is Not the Templating Language

dbt uses Jinja2 for SQL templating. SQLMesh uses its own macro system with a Python-like syntax.

dbt:

SELECT *
FROM {{ ref('source_model') }}
WHERE date_col >= '{{ var("start_date") }}'

SQLMesh:

SELECT *
FROM source_model
WHERE date_col >= @start_date

Notable differences:

- Dependencies are inferred from the SQL itself. You reference upstream models by name — no ref() or source() wrappers — and SQLMesh parses the query to build the DAG.
- Macros and variables use the @ prefix (@start_date, @this_model, custom @my_macro calls) instead of Jinja's {{ }} delimiters.
- Custom macros are written as Python functions rather than Jinja templates, which makes them easier to test and debug.
For teams with heavy Jinja macro libraries, this is the most labor-intensive part of a migration. For teams that use Jinja mainly for ref() and simple variable substitution, the conversion is mechanical.
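As a feel for the conversion, here is how a tiny custom macro call changes shape at the call site (the macro name is hypothetical; in SQLMesh its definition would move from a Jinja template to a Python function under macros/):

```sql
-- dbt: Jinja macro call, plus ref() for the upstream model
-- SELECT {{ dollars_to_cents('amount') }} AS amount_cents
-- FROM {{ ref('orders') }}

-- SQLMesh: the same call site with @ syntax; the upstream model is
-- referenced directly by name, and the dependency is inferred.
SELECT @dollars_to_cents(amount) AS amount_cents
FROM analytics.orders
```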


The Deployment Workflow Is Different

dbt deployments are typically: change code, merge, dbt run, done. The tool doesn't tell you what changed or what the impact will be. You know because you wrote the code, and you hope you didn't miss a downstream dependency.

SQLMesh deployments are: change code, sqlmesh plan, review the proposed changes (including which models are affected, what backfills are needed, and whether changes are breaking or non-breaking), then confirm. The plan gives you a preview before anything happens.

This is a genuinely better workflow for production deployments. The tradeoff is that it's a slower feedback loop for quick local iteration. In dbt, you just dbt run --select my_model and see results. In SQLMesh, you sqlmesh plan and the tool evaluates the full environment state before proceeding.

For local development, use sqlmesh plan with --select-model to focus on specific models, or use sqlmesh evaluate model_name to quickly test a single model's output without going through the full plan process.
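sqlmesh evaluate gives the quickest feedback loop — it renders and executes one model's query and prints rows without writing to any table. A sketch, reusing the model name and dates from the earlier incremental example:

```shell
# Execute a single model for a small date range and print the result,
# instead of materializing it into the environment.
sqlmesh evaluate analytics.events_daily --start 2024-01-01 --end 2024-01-02
```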


What Doesn't Change

Some things work the same way and don't require mental model shifts:

- You still write models as SQL SELECT statements, one model per file.
- The DAG is still derived from model-to-model references.
- The warehouse still does the compute — SQLMesh, like dbt, pushes SQL down rather than moving data through its own runtime.
- The version control workflow is unchanged: models live in git, and changes go through normal code review.

Migration Considerations

If you're evaluating a migration from dbt to SQLMesh:

Start with a non-critical project. Don't migrate your production dbt project as the first thing you do. Pick a new project or a lower-stakes existing one.

Budget time for macro conversion. If you have 20+ custom Jinja macros, this is real work. If you're mostly using ref(), source(), and built-in macros, it's fast.

Run both in parallel initially. SQLMesh can read from tables that dbt produces. You can migrate incrementally — move models from dbt to SQLMesh one at a time, using EXTERNAL kinds to reference dbt-managed tables that haven't been migrated yet.
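Declaring the not-yet-migrated dbt tables as external models gives SQLMesh column-level metadata about them. A sketch of an external_models.yaml entry — table and column names here are placeholders, and sqlmesh create_external_models can generate this file from the warehouse:

```yaml
# external_models.yaml — tables still owned by dbt, referenced
# by SQLMesh models during an incremental migration.
- name: analytics.dbt_orders       # placeholder table name
  columns:
    order_id: int
    customer_id: int
    order_date: date
    amount: decimal(18, 2)
```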

Expect the plan workflow to feel slow at first. After years of dbt run, the plan-then-confirm workflow feels like an extra step. Give it two weeks. Once you've had a plan catch an unexpected breaking change before it reaches production, the extra step starts to feel less like friction and more like a safety net.


Have feedback or corrections? Reach out directly — we update these posts when we find things that are wrong or outdated.