The Integration Tax: Why Your AI Project Is Actually a Plumbing Project

When leadership greenlights an AI project, the conversation almost always starts with the model. GPT or Claude? Open source or hosted? Fine-tuned or prompted? These are real questions. They’re also the wrong place to spend most of your planning.

The hard part of enterprise AI isn’t the model. It’s the plumbing.

The 80/20 you didn’t budget for

Pixeltable’s recent analysis put a number on something most data teams already know in their bones: ML engineers spend roughly 80% of their time on data wrangling rather than model work. FlowFuse reported a similar split for edge AI in a 2026 study — 80% pipeline, 20% intelligence. Royal Cyber’s research on enterprise data stacks landed on 40% pure maintenance overhead just to keep existing pipelines functional.

These numbers aren’t outliers. They’re the pattern. Every serious AI project, at scale, becomes a project about how data moves through your organization, where it lives, who owns it, and what shape it’s in when the model finally sees it.

The ratio tends to surprise leadership because the AI conversation is dominated by the visible 20%: model selection, prompt engineering, the hosted-versus-self-hosted decision. The invisible 80% is integration, and it’s where projects actually live or die.

What “integration tax” looks like in practice

The phrase is loose. Let’s break it into the three forms it actually takes.

1. Data integration. Your data lives in twelve places. Some of it is in a CRM that hasn’t been seriously modernized since 2014. Some of it is in a warehouse that was set up for analytics, not for serving live decisions. Some of it is in spreadsheets a department head guards like a state secret. Before any model can run, that data has to be located, normalized, deduplicated, refreshed on a schedule that matches the business question, and made available without breaking three other systems that depend on it. This is the part of the project that takes months and rarely shows up in the deck that got the budget approved.

2. System integration. Once the AI produces an output, that output has to go somewhere — a CRM record, a ticketing tool, an ERP screen, a decision queue, a dashboard. Each of those systems has its own auth model, its own API quirks, its own write-rate limits, its own audit trail expectations. Hooking an AI into a 2007-era ERP is genuinely different engineering than hooking it into a modern SaaS, and “we’ll figure that out later” is the most expensive sentence you can say at the start of a project.

3. Operational integration. The model is running, the data is flowing, the outputs are landing where they should. Now: who monitors it? When it breaks at 3am, who pages whom? When the underlying data shifts and accuracy drifts, who notices, and how? AI systems decay differently than traditional software — the code keeps working perfectly while the outputs slowly become wrong. Building the operational layer that catches that drift is plumbing too, and it’s plumbing that almost no pilot accounts for.
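
To make the drift problem concrete, here is a minimal sketch of one common check, the Population Stability Index (PSI), which compares how a numeric input was distributed at launch against how it is distributed now. The function names, the ten-bin default, and the 0.2 alert threshold are illustrative assumptions (0.2 is a widely used rule of thumb, not a standard), and a real monitoring layer would track many features and route alerts to whoever is on call.

```python
import math
from typing import Sequence


def psi(baseline: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # Small floor so that empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


def drift_alert(baseline: Sequence[float], current: Sequence[float],
                threshold: float = 0.2) -> bool:
    """True when the input distribution has moved past the (assumed) threshold."""
    return psi(baseline, current) > threshold
```

Nothing in this check looks at the model or its code, which is the point: the software can be healthy while the inputs quietly move out from under it.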

Together, these three forms account for the bulk of every successful enterprise AI project we’ve worked on. The model is the easy part.
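
As an illustration of the first form, the sketch below normalizes and deduplicates customer records pulled from three sources. The field names (email, name, updated, source), the sample rows, and the "most recently updated record wins" rule are all hypothetical; real pipelines usually need fuzzier matching and a survivorship policy agreed with the data owners.

```python
from datetime import date


def normalize(record: dict) -> dict:
    """Bring one raw record into a shared shape: trimmed, lower-cased email; tidy name."""
    return {
        "email": record["email"].strip().lower(),
        "name": " ".join(record["name"].split()).title(),
        "updated": record["updated"],
        "source": record["source"],
    }


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the most recently updated record per normalized email."""
    latest: dict[str, dict] = {}
    for rec in map(normalize, records):
        kept = latest.get(rec["email"])
        if kept is None or rec["updated"] > kept["updated"]:
            latest[rec["email"]] = rec
    return list(latest.values())


# Hypothetical rows: the same customer appears in the CRM and the warehouse.
raw = [
    {"email": " Ana@Example.com ", "name": "ana  diaz",
     "updated": date(2023, 1, 5), "source": "crm"},
    {"email": "ana@example.com", "name": "Ana Diaz",
     "updated": date(2024, 3, 1), "source": "warehouse"},
    {"email": "bo@example.com", "name": "Bo Chen",
     "updated": date(2022, 7, 9), "source": "spreadsheet"},
]
clean = deduplicate(raw)  # two records remain; the warehouse copy of Ana wins
```

Even this toy version has to settle a question the model never sees: when two systems disagree, which one is right?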

Why the model is becoming commodity

Here’s the part that makes the integration question more urgent, not less.

The frontier models are converging. The gap between the best-performing model and the second-best, on the kind of business tasks most enterprises actually run, is narrowing every quarter. Pricing is dropping. Hosted inference is improving. Fine-tuning is getting cheaper. The model layer — the part most AI conversations focus on — is increasingly a commodity.

What’s not commoditizing is your integration architecture. That stays bespoke, because your business is bespoke. Your data shape, your system landscape, your workflows, your governance requirements — none of that gets cheaper or more standardized just because GPT got better. If anything, the better the models get, the more the differentiation moves into how well they’re plugged into the systems where work actually happens.

This is why the 5% of pilots that reach production aren’t winning on model selection. They’re winning on integration design. They’ve already mapped the data, the systems, and the operational handoffs before the first prompt gets written.

What this changes about how you scope projects

If roughly 80% of an AI project is integration, two things have to change about how the project is run.

Start the integration conversation first. Before you pick a model, before you scope a pilot, before you write a vendor RFP — map the decision the AI is supposed to support, the systems it has to read from, and the systems it has to write to. The integration architecture isn’t a Phase 3 problem. It’s the project.
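
One lightweight way to run that mapping exercise is to write it down as data before any model work starts. The sketch below is an assumption about what such a map could contain, not a standard artifact: the class names, fields, and the example systems are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SystemLink:
    name: str
    owner: str                      # team accountable for the system
    direction: str                  # "read" or "write"
    access_confirmed: bool = False  # has the owner actually granted access?


@dataclass
class IntegrationMap:
    decision: str                   # the business decision the AI supports
    links: list[SystemLink] = field(default_factory=list)

    def open_questions(self) -> list[str]:
        """Systems the project depends on but cannot yet reach."""
        return [link.name for link in self.links if not link.access_confirmed]

    def ready_to_scope(self) -> bool:
        """At least one read path, one write path, and no unresolved access."""
        reads = any(link.direction == "read" for link in self.links)
        writes = any(link.direction == "write" for link in self.links)
        return reads and writes and not self.open_questions()


# Hypothetical example: a ticket-routing assistant.
plan = IntegrationMap(
    decision="Route inbound support tickets to the right queue",
    links=[
        SystemLink("crm", owner="sales-ops", direction="read", access_confirmed=True),
        SystemLink("ticketing", owner="support", direction="write"),
    ],
)
print(plan.open_questions())   # ['ticketing']
print(plan.ready_to_scope())   # False
```

If the list of open questions is long, that is the project plan, and it is better to learn it in week one than in month four.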

Budget for the tax explicitly. If the timeline assumes 80% of the work is the model and 20% is integration, you’re going to miss your delivery date and blow your budget. Reverse the ratio in your planning and you’ll be closer to reality. Teams that scope honestly ship; teams that scope optimistically end up in the 95% that never reach production.
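
The arithmetic behind reversing the ratio is simple enough to sketch. Assuming the model work really is about 20% of the total (the function name and the six-week figure are illustrative, and your own ratio may differ), a model-only estimate scales up like this:

```python
def planned_total_weeks(model_weeks: float, model_share_pct: int = 20) -> float:
    """Scale a model-only estimate up to a whole-project estimate."""
    if not 0 < model_share_pct <= 100:
        raise ValueError("model_share_pct must be between 1 and 100")
    return model_weeks * 100 / model_share_pct


model_weeks = 6                                   # hypothetical model-work estimate
total_weeks = planned_total_weeks(model_weeks)    # 30.0
integration_weeks = total_weeks - model_weeks     # 24.0
```

A six-week model estimate implies something closer to a thirty-week project, with twenty-four of those weeks spent on plumbing.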

The model isn’t where AI projects fail. The plumbing is. Treat your next AI initiative like the integration project it actually is, and your odds of shipping look very different.

If you’re scoping an AI initiative and trying to figure out what the integration footprint actually looks like in your environment, that’s the conversation we have all day at VitaLink Software. The model is the easy part — let’s talk about the rest.