Data Engineering, Industrialized

The craft-era tax on your data team

A craftsman builds one thing at a time with custom tools and hard-won intuition. A factory produces at scale with standard processes, automated quality checks, and repeatable output. Data engineering is still in the craftsman era — and your team is paying for it every day.

The tax shows up in three ways.

Artisanal by default. No standard process, no reusable architecture, no way to go faster the twentieth time than the first. You're paying senior engineers to solve the same infrastructure problems they solved last quarter — and the quarter before that. Every new project is a greenfield rebuild of the same foundation.

The backlog never shrinks. Your engineers aren't slow. They're buried in infrastructure work, environment issues, and deployment toil. The bottleneck isn't talent. It's the absence of a production system around the talent. The data work sits in a queue while the plumbing work consumes the sprint.

Production breaks require a hero. When a pipeline fails at 3am, someone has to diagnose it manually, fix it manually, and hope the fix holds. There's no runbook and no automated recovery. You're running production workloads on craft-era tooling, and hoping the person who built it is still reachable.

This is not a SQLMesh problem or a dbt problem. Those tools are excellent. This is what happens when excellent tools are operated without a production system underneath them.

What industrialized looks like

The industrial model isn't about removing engineers from the loop. It's about removing the wrong work from their days.

When manufacturing industrialized, craftsmen didn't disappear — they moved up the stack. They designed better products instead of hand-filing parts. They solved harder problems instead of repeating solved ones. The factory handled the repeatable work. The people handled the work that required judgment.

The same transition is available to data engineering teams right now. The tools exist. SQLMesh and dbt-core are production-grade. The infrastructure to run them reliably is well-understood. The gap is a production system that wraps the tools and handles the operational layer.

Here's what that looks like in practice:

Pipelines deploy to production automatically when code merges. Not "someone runs the deploy script when they remember." The act of merging is the deploy. The system handles the rest.
Every change is validated in a staging environment before promotion. SQLMesh's plan workflow is built around this step: a plan summarizes which models changed and what is affected downstream before anything is applied. The system runs the plan, shows the impact, and blocks promotion if something looks wrong.
Failures are detected, diagnosed, and, where possible, recovered without paging your team. Alerts fire on real failures, not noise. Logs are structured and searchable. Recovery paths are automated where they can be.
New projects go from Git repo to running pipeline in minutes, not sprints. The infrastructure is already there. The patterns are already established. The new project slots into a system that already knows how to run it.
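
To make two of these behaviors concrete, here is a minimal Python sketch of a promotion gate and an automated retry path. It is an illustration under stated assumptions, not how SQLMesh or dbt implement anything: PlanSummary, promotion_blockers, run_with_recovery, and the 25-model threshold are all hypothetical names and values invented for this example.

```python
import json
import logging
import time
from dataclasses import dataclass, field
from typing import Callable, List

logger = logging.getLogger("pipeline")


@dataclass
class PlanSummary:
    """Hypothetical summary of a staging run; not a SQLMesh or dbt type."""
    changed_models: List[str] = field(default_factory=list)
    breaking_changes: List[str] = field(default_factory=list)
    failed_checks: List[str] = field(default_factory=list)


def promotion_blockers(plan: PlanSummary, max_changed_models: int = 25) -> List[str]:
    """Return the reasons to block promotion; an empty list means safe to promote."""
    reasons = []
    if plan.failed_checks:
        reasons.append("failed checks: " + ", ".join(plan.failed_checks))
    if plan.breaking_changes:
        reasons.append("breaking changes: " + ", ".join(plan.breaking_changes))
    if len(plan.changed_models) > max_changed_models:
        # Assumed policy: unusually large changes get a human review.
        reasons.append(
            f"{len(plan.changed_models)} models changed (limit {max_changed_models})"
        )
    return reasons


def run_with_recovery(
    step: Callable[[], None],
    name: str,
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry a step with exponential backoff, logging one JSON record per attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            step()
            logger.info(json.dumps({"step": name, "attempt": attempt, "status": "ok"}))
            return True
        except Exception as exc:
            logger.warning(json.dumps(
                {"step": name, "attempt": attempt, "status": "failed", "error": str(exc)}
            ))
            if attempt < max_attempts:
                # Wait 1x, 2x, 4x ... the base delay between attempts.
                sleep(base_delay_s * 2 ** (attempt - 1))
    # All attempts failed: this is the point where a human should be paged.
    return False
```

In a setup like this, the merge pipeline would promote only when promotion_blockers returns an empty list, and would alert a person only when run_with_recovery returns False. The structured log records are what make the failure searchable afterward.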

The production system gap

The tools are ready. The problem is that most teams still run SQLMesh and dbt-core without a production system underneath.

That means every team independently solves the same set of problems: containerization, CI/CD pipelines, state database management, scheduling, monitoring, alerting, deployment automation, environment isolation. These are solved problems — but they're solved individually, from scratch, at every company.

The industrial model means building that production system once, correctly, and letting every team benefit from it. Your engineers do the work that requires judgment — data modeling, business logic, analytical thinking. The production system handles the operational layer underneath.

The goal is for engineers to stop asking "is the pipeline healthy?" and start asking "what should the pipeline do next?" That's the transition from craft-era to industrial.