Case Study: The Racecar Stuck in First Gear

March 06, 2026

How a profitable software company was burning 50% of its engineering capacity — and blaming the wrong thing.


What They Were Experiencing

A fulfillment software company called us in to assess their DevOps and Agile practices. They'd been in business for over 15 years, were profitable, and had a loyal customer base. By any external measure, they were succeeding.

But leadership was frustrated. New features took forever. A major capability their customers were asking for — one that represented a real competitive threat if they couldn't deliver it — looked like it would take years to build. They'd tried adopting Scrum a few years earlier, and the general feeling was that it had made things worse.

The quote that stuck with me from that first meeting: "Scrum has really slowed us down. We used to be able to get things done."

They had five full-time developers, a handful of part-timers, one tester, and a codebase that spanned multiple technology generations. They thought they needed help with their tools and their process. What they actually needed was someone to show them where all the time was going.


What They Thought Was Wrong

Leadership believed the core problems were:

  • Their Scrum adoption wasn't working
  • Their tools (version control, CI/CD, project management) needed to be replaced
  • The development team was too slow
  • They probably needed to hire more developers

They were partially right about all of these. But none of them were the real problem.


What I Actually Found

I spent two days on-site interviewing leadership, developers, testers, product managers, and customer-facing staff. Here's what emerged.

Finding #1: 50% of Developer Effort Was Being Wasted

The team was drowning in context switching. Customer service issues came in as fire drills. Almost every bug — including a literal spelling error in legacy code — was prioritized at the same level as a production outage. Developers were juggling five or more tasks simultaneously.

Gerald Weinberg's widely cited rule of thumb is that juggling five simultaneous tasks loses roughly 75% of working time to context switching alone. Based on what I observed, this company was conservatively losing 50% of its engineering capacity to multitasking. At half capacity, everything took twice as long as it should have, which made the engineering team effectively twice as expensive as it appeared on paper.
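
The arithmetic behind those figures is simple enough to sketch in a few lines of Python. This is an illustration, not a measurement tool: only the five-task figure (75%) comes from this article, the other rows of the table are the commonly quoted values and are an assumption here, and the function names are mine.

```python
# Fraction of working time lost to context switching, by number of
# simultaneous tasks. Only the five-task figure (75%) is quoted in this
# article; the other rows are assumed from the commonly cited table.
WEINBERG_LOSS = {1: 0.00, 2: 0.20, 3: 0.40, 4: 0.60, 5: 0.75}


def effective_capacity(developers: float, loss_fraction: float) -> float:
    """Developer-equivalents of focused work left after switching losses."""
    return developers * (1.0 - loss_fraction)


def slowdown_factor(loss_fraction: float) -> float:
    """How many times longer work takes than it would with full focus."""
    if not 0.0 <= loss_fraction < 1.0:
        raise ValueError("loss_fraction must be in [0, 1)")
    return 1.0 / (1.0 - loss_fraction)


if __name__ == "__main__":
    observed_loss = 0.50  # the conservative estimate for this team
    print(effective_capacity(5, observed_loss))  # 2.5
    print(slowdown_factor(observed_loss))        # 2.0
    print(slowdown_factor(WEINBERG_LOSS[5]))     # 4.0
```

At the conservative 50% estimate, five full-time developers deliver the focused output of two and a half. At the full five-task figure, the slowdown would be fourfold rather than twofold.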

This wasn't a performance problem. It was a structural problem. The developers weren't slow. They were being prevented from focusing.

Finding #2: 156 Workdays Per Year Lost to Branching and Merging

The lead developer estimated spending 24 hours per week on branch and merge activity. At eight hours per workday, that's 156 workdays per year (roughly 60% of a full-time person's working year) consumed by an activity that produces zero customer value.
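
For anyone who wants to check the math or plug in their own numbers, here is the calculation as a short Python sketch. The 24 hours per week is the lead developer's estimate. The 52-week year, 8-hour workday, and 40-hour week are my assumptions.

```python
HOURS_PER_WEEK_ON_MERGES = 24  # the lead developer's estimate
WEEKS_PER_YEAR = 52            # assumption: no allowance for time off
HOURS_PER_WORKDAY = 8          # assumption
HOURS_PER_WORK_WEEK = 40       # assumption


def workdays_lost_per_year(hours_per_week: float) -> float:
    """Convert a weekly time sink into full workdays per year."""
    return hours_per_week * WEEKS_PER_YEAR / HOURS_PER_WORKDAY


def share_of_work_week(hours_per_week: float) -> float:
    """Fraction of one full-time person's week consumed."""
    return hours_per_week / HOURS_PER_WORK_WEEK


if __name__ == "__main__":
    print(workdays_lost_per_year(HOURS_PER_WEEK_ON_MERGES))  # 156.0
    print(share_of_work_week(HOURS_PER_WEEK_ON_MERGES))      # 0.6
```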

The root cause: they'd built their branching strategy around environments rather than releases. They had long-lived branches for "beta" and "prod," and every merge between them was essentially creating a new, untested artifact. Each merge reset testing confidence to near zero, but nobody was accounting for that.

Finding #3: QA Was a Volunteer Program

Their quality assurance process was, to put it plainly, hoping that customers on the beta branch would find the bugs. There was no way to know whether anyone was actually testing. Releases to production were governed by date, not by quality.

The lead developer told me: "Deploy from beta to prod is based on a time, not based on any actual metrics."

I wrote in my notes: THAT'S CRAZY!!! (Three exclamation marks. I don't usually do that.)
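
For contrast, a release decision based on metrics rather than the calendar does not have to be elaborate. The sketch below is hypothetical: the three signals, the field names, and the threshold of 50 beta sessions are illustrative assumptions, not anything this company had in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseMetrics:
    """Hypothetical signals collected for one beta build."""
    automated_tests_passed: bool
    open_blocking_bugs: int
    beta_sessions: int  # sessions that actually exercised this build


def ready_for_prod(metrics: ReleaseMetrics, min_beta_sessions: int = 50):
    """Return (ok, reasons). The build is held back if any check fails."""
    reasons = []
    if not metrics.automated_tests_passed:
        reasons.append("automated tests failing")
    if metrics.open_blocking_bugs > 0:
        reasons.append(f"{metrics.open_blocking_bugs} blocking bug(s) open")
    if metrics.beta_sessions < min_beta_sessions:
        reasons.append(
            f"only {metrics.beta_sessions} beta sessions "
            f"(need {min_beta_sessions})"
        )
    return (not reasons, reasons)


if __name__ == "__main__":
    print(ready_for_prod(ReleaseMetrics(True, 0, 120)))
    print(ready_for_prod(ReleaseMetrics(True, 2, 3)))
```

The third check matters most for a team in this situation: it makes "nobody actually tested this" a visible reason to hold a release instead of a silent risk.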

Finding #4: 10,000 Databases

Their architecture resulted in approximately 10,000 individual SQL Server databases across 270 customers. Each customer had their own set of feature flags and configurations. The team couldn't easily query across databases, regularly hit SQL Server limitations, and lived in constant fear of schema drift between customer instances.

This architecture had made sense when they had a handful of customers. At 270 customers and 10,000 databases, it was a prison.
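
Schema drift is at least detectable. One way to spot it, sketched below in Python, is to fingerprint each database's schema and flag any database that differs from the most common fingerprint. This is an illustration under assumptions: the database names and the dictionary layout are hypothetical, and in practice the dictionaries would have to be populated from each SQL Server instance (for example from its INFORMATION_SCHEMA.COLUMNS view).

```python
import hashlib
from collections import Counter


def schema_fingerprint(schema):
    """Hash a {table: {column: type}} description, ignoring ordering."""
    canonical = sorted(
        (table, column, col_type)
        for table, columns in schema.items()
        for column, col_type in columns.items()
    )
    return hashlib.sha256(repr(canonical).encode("utf-8")).hexdigest()


def find_drift(schemas):
    """Return names of databases whose schema differs from the most common one."""
    fingerprints = {name: schema_fingerprint(s) for name, s in schemas.items()}
    if not fingerprints:
        return []
    baseline, _count = Counter(fingerprints.values()).most_common(1)[0]
    return sorted(name for name, fp in fingerprints.items() if fp != baseline)


if __name__ == "__main__":
    # Hypothetical example data: customer_c has an extra column.
    schemas = {
        "customer_a": {"orders": {"id": "int", "status": "varchar"}},
        "customer_b": {"orders": {"status": "varchar", "id": "int"}},
        "customer_c": {"orders": {"id": "int", "status": "varchar",
                                  "legacy_flag": "bit"}},
    }
    print(find_drift(schemas))  # ['customer_c']
```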

Finding #5: The Team Couldn't Tell the Truth

The lead developer had joined straight out of college, held most of the system knowledge in her head, and was the one who figured out how to route new features through 15 years of accumulated code. She seemed defeated. She wanted to mentor her team but didn't have time. She wanted to build exciting new capabilities but couldn't see how. She was stuck maintaining a system that had grown beyond what anyone could fully understand, with no breathing room.

During interviews, people constantly interrupted each other. It was a cycle: nobody felt heard, so everyone interrupted to make sure their point got through, which meant nobody felt heard.

The tension between management and the development lead was barely contained. They wanted more output. She wanted more focus. Neither side had the vocabulary or the framework to bridge that gap.

Finding #6: A Startup That Never Made the Transition

This was the underlying pattern that connected everything else. The company had been in startup mode for a long time and hadn't made the transition to a mature engineering organization. The informal processes, the "everyone pitches in" culture, the everything-is-a-fire-drill mentality: those things worked when the company was small. Now they were killing it.

The decisions that got them here were correct at the time. Saying yes to every customer request, building quickly, keeping the team lean — all rational choices. The accumulation of those rational choices created an irrational situation.


The Diagnosis

I told them their team was a racecar stuck in first gear. The engine was screaming and the tachometer was redlining, but the car was only going 20 miles per hour.

The core issue wasn't Scrum. Scrum hadn't slowed them down. Scrum had started to make visible how slow they already were — and they shot the messenger.

The core issue wasn't their tools. Replacing version control and project management systems before fixing the human process would only add confusion to an already overwhelmed team.

The core issue was focus. Fix the multitasking problem, and you'd effectively double engineering capacity without hiring a single person. Create a real Definition of Done owned by the team, and you'd stop accumulating invisible technical debt every sprint. Simplify the branching strategy, and you'd reclaim 156 workdays per year. Establish actual QA practices instead of hoping customers would volunteer, and you'd stop releasing prayers into production.


What I Recommended

The recommendations were deliberately sequenced. Fix the humans first, then fix the tools.

  1. Get real Scrum training for the entire company — not just the dev team. Everyone who interacted with the developers needed to understand what focus meant and why interruptions had a quantifiable cost.
  2. Create a team-owned Definition of Done — the existing one had been handed down by management. The team had no buy-in. A real DoD created by the people doing the work would set expectations and prevent the "is it done?" ambiguity that was causing so much friction.
  3. Stop doing branching based on environments — adopt a release snapshot model where the same build artifact moves through all environments. This alone would recover most of that 156-day annual loss.
  4. Redefine what a "bug" is — the current system treated everything as a top-priority fire drill. They needed triage. Not every customer request is an emergency, and treating them all like one was the primary driver of the context-switching waste.
  5. Add feature utilization metrics — they had no idea which features customers actually used. They were maintaining everything equally, including features nobody touched. You can't make rational decisions about where to invest without data.
  6. Delay the tool replacements — replacing the customer service system, version control, and project management tools was a good idea in the long run. But doing it before stabilizing the human process would mean layering new confusion on top of existing chaos.
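
To make recommendation 3 concrete, here is a minimal Python sketch of the "build once, promote the same artifact" idea. It is an illustration, not their pipeline: the class and function names are hypothetical, the "test" environment is my assumption ("beta" and "prod" are theirs), and a real setup would store artifacts in a repository rather than in memory.

```python
import hashlib
from dataclasses import dataclass, field

# "beta" and "prod" come from the case study; "test" is an assumed first stage.
ENVIRONMENTS = ("test", "beta", "prod")


@dataclass(frozen=True)
class Artifact:
    version: str
    sha256: str  # digest of the build output, fixed at build time


def build(version: str, payload: bytes) -> Artifact:
    """Build exactly once and record a digest of the result."""
    return Artifact(version, hashlib.sha256(payload).hexdigest())


@dataclass
class Pipeline:
    deployed: dict = field(default_factory=dict)

    def promote(self, artifact: Artifact, env: str) -> None:
        """Deploy to env only if the identical artifact is in the prior env."""
        position = ENVIRONMENTS.index(env)  # ValueError for an unknown env
        if position > 0:
            previous = ENVIRONMENTS[position - 1]
            if self.deployed.get(previous) != artifact:
                raise RuntimeError(
                    f"{artifact.version} has not been deployed to {previous}"
                )
        self.deployed[env] = artifact


if __name__ == "__main__":
    release = build("1.4.0", b"compiled output")
    pipeline = Pipeline()
    for env in ENVIRONMENTS:
        pipeline.promote(release, env)
    print(pipeline.deployed["prod"] == pipeline.deployed["test"])  # True
```

The point of the digest is that whatever was tested in beta is byte-for-byte what reaches prod. No merge happens between environments, so testing confidence carries forward instead of resetting.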

The Lesson

They didn't call me because they had a DevOps problem. They called me because their engineering organization had outgrown its process and nobody could see why everything felt so hard.

Every individual decision they'd made was defensible. The accumulation of those decisions created an indefensible situation. That's a normal thing that happens to successful companies. It's not failure — it's growing pains.

The diagnosis didn't require exotic tools or frameworks. It required someone from outside the system who could see the patterns that people inside the system couldn't see because they were living in them every day.


If any of this sounds familiar — if your team is working harder than ever but delivering less, if everything takes longer than it should, if you've tried improving your tools and process but nothing seems to stick — it might not be a performance problem. It might be a structural one.

That's the kind of problem I help companies see clearly.
