19 Technical debt

19.1 The notebook graveyard

Every data scientist has one: a folder of dead notebooks, half-finished experiments, a utils.py that everything imports and nobody dares touch, a model in models/ trained by a script that no longer runs. It accumulated without anyone deciding it should. Each piece was a reasonable shortcut at the time — a quick experiment, a value hard-coded to ship before a deadline, a function copied rather than shared because copying was faster.

This is technical debt: the accumulated cost of the shortcuts and deferred cleanups that every real project takes on. The term is not an insult. Debt, used deliberately, is a legitimate tool — you borrow time now and agree to pay it back later. The danger is the debt nobody acknowledged, that compounds quietly until the day a small change takes a week because the code around it is a tangle no one understands. Every project carries debt; the only question is whether it’s managed or merely accreting.

19.2 What technical debt is

The financial metaphor is exact enough to be useful. A shortcut borrows time (you ship faster now) and charges interest, in that every future change to that code is slower and riskier than it would have been. There are two kinds. Deliberate debt is a conscious choice: “I’ll hard-code this for the demo and clean it up after”, taken knowingly and (ideally) recorded. Inadvertent debt is the debt you didn’t know you were taking: a design that seemed fine and turned out wrong, an assumption that the data later violated. Deliberate debt is a tool; inadvertent debt, and deliberate debt you forgot about, are the liability.

What makes technical debt insidious is that its interest is invisible until something has to change: the code itself, or the data flowing into it. A shortcut can run perfectly for months:

import numpy as np

# A shortcut taken under deadline: skip the zero-active-days guard,
# because "the data never has zeros". It runs fine — for now.
def mean_spend_per_day(spend, active_days):
    return np.mean(spend / active_days)

clean = mean_spend_per_day(np.array([100.0, 200.0]), np.array([4, 8]))
print(f"on today's clean data:  {clean:.1f}")

# Months later, the upstream source starts emitting a zero. The shortcut
# doesn't crash: NumPy emits at most a RuntimeWarning and returns inf, which is worse.
with_zero = mean_spend_per_day(np.array([100.0, 200.0]), np.array([4, 0]))
print(f"the day a zero arrives: {with_zero}")

on today's clean data:  25.0
the day a zero arrives: inf

The shortcut took no time to write and worked flawlessly until the input changed — and then it didn’t fail loudly, it returned inf, the kind of silent wrongness that propagates into a report before anyone notices. That delay between taking the shortcut and paying for it is exactly why debt is so easy to accumulate and so dangerous to ignore.
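One way to repay this particular debt is to state the assumption in code, so that a violation fails loudly instead of producing inf. The sketch below is only an illustration of what the skipped guard might look like, not the author's own fix: the name `mean_spend_per_day_guarded` is hypothetical, and raising an error is one policy among several (dropping or flagging the offending rows are others).

```python
import numpy as np


def mean_spend_per_day_guarded(spend, active_days):
    """Mean daily spend that refuses to divide by non-positive active days.

    Hypothetical repayment of the shortcut above; the raise-on-violation
    policy is an assumption, not something the source prescribes.
    """
    spend = np.asarray(spend, dtype=float)
    active_days = np.asarray(active_days, dtype=float)
    if spend.shape != active_days.shape:
        raise ValueError("spend and active_days must have the same shape")
    bad = active_days <= 0
    if bad.any():
        # Fail loudly and say how many rows broke the assumption.
        raise ValueError(
            f"{int(bad.sum())} row(s) have non-positive active_days; "
            "decide how to handle them upstream"
        )
    return float(np.mean(spend / active_days))
```

On the clean inputs from the example it returns the same value as the original function; on the input containing a zero it raises a `ValueError` rather than returning inf.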

Data Science Bridge

Technical debt is the un-cleaned-up analysis you already know intimately. Every data scientist has the notebook that “works” but is a thicket of out-of-order cells, hard-coded paths, and variables named df2 — and you already pay its interest every time you reopen it and have to reconstruct how it works before you can change anything. Refactoring is simply tidying that thicket so the next change is cheap instead of frightening.

Where the analogy to financial debt breaks down: a loan has a known interest rate and a repayment schedule, so you can plan around it. Technical debt’s interest is unpredictable and lumpy — it costs nothing at all until the day you have to touch the code, and then it can cost an enormous amount at once. That irregularity is what makes it so easy to defer: there’s no monthly statement reminding you it’s there, so it stays invisible until a deadline collides with it.

19.3 The debt data science accumulates

Some forms of debt are particular to data science work. Untested transformations (Chapter 7) are debt — every one is a change you can’t make safely. Copy-pasted logic (Chapter 6) is debt that compounds, because a fix to one copy leaves the others wrong. Exploratory notebooks promoted to production without cleanup (Chapter 1) are debt with the principal still outstanding. And then there is the debt of dead experiments and abandoned features cluttering the repository, and the glue code holding a pipeline together that everyone is afraid to remove.

Machine learning adds its own categories, catalogued in a widely cited paper by Sculley and colleagues (Sculley et al. 2015): configuration that sprawls until no one knows which settings are live, data dependencies that change silently upstream, and “pipeline jungles” of accreted transformation steps. Their central point is that in machine learning, debt hides not only in the code but in the data and the configuration, the parts data scientists are least likely to treat as engineering artefacts, and therefore least likely to keep clean.
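A data dependency that changes silently upstream can be made to change noisily by writing down what the pipeline expects and checking each batch against it. The following is a minimal sketch under stated assumptions: the column names, the `EXPECTED_COLUMNS` contract, and the `check_contract` helper are all hypothetical, and a real project would more likely reach for a dedicated validation library.

```python
# Hypothetical contract: the columns and types this pipeline assumes it receives.
EXPECTED_COLUMNS = {"user_id": int, "spend": float, "active_days": int}


def check_contract(rows, expected=EXPECTED_COLUMNS):
    """Return a list of readable violations; an empty list means the batch conforms."""
    problems = []
    for i, row in enumerate(rows):
        missing = expected.keys() - row.keys()
        extra = row.keys() - expected.keys()
        if missing:
            problems.append(f"row {i}: missing {sorted(missing)}")
        if extra:
            problems.append(f"row {i}: unexpected {sorted(extra)}")
        # Type-check only the expected columns that are actually present.
        for name, typ in expected.items():
            if name in row and not isinstance(row[name], typ):
                problems.append(
                    f"row {i}: {name} is {type(row[name]).__name__}, "
                    f"expected {typ.__name__}"
                )
    return problems
```

Run at the point where upstream data enters the pipeline, a check like this turns a renamed column or a number that has become a string into a reported violation on the day it happens.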

It’s worth being concrete about what that looks like, because these are the debts a general book on software engineering will never warn you about. There is the dataset nobody can regenerate: a CSV that someone cleaned semi-manually eighteen months ago, sitting in shared storage, still feeding every model you train, with no script that reproduces it from source. There is the feature whose definition has drifted — “active user” meant one thing in the notebook that trained the model and something subtly different in the pipeline that serves it, and the model has been quietly scoring the wrong population ever since. This is usually discussed as training–serving skew, a correctness problem; it is more useful to see it as debt, because it was created the moment the feature logic was written twice instead of once, and it charges interest on every model you train against it. And there is the model in production that nobody can retrain, because the training code lived in a notebook on a laptop that left with its owner. That model works, right up until the data shifts and you discover the only way forward is to rebuild it from scratch.
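The drifted-feature debt described above comes from writing the feature logic twice, so the repayment is to write it once and import it from both paths. The sketch below illustrates that idea only: the 30-day window, `is_active_user`, `build_training_labels`, and `score_request` are hypothetical names and values, since the source does not say how "active user" is defined.

```python
from datetime import date, timedelta

# Hypothetical threshold; the source does not define "active user".
ACTIVE_WINDOW_DAYS = 30


def is_active_user(last_seen: date, as_of: date,
                   window_days: int = ACTIVE_WINDOW_DAYS) -> bool:
    """Single definition of 'active', shared by training and serving code."""
    return timedelta(0) <= (as_of - last_seen) <= timedelta(days=window_days)


def build_training_labels(users: dict, as_of: date) -> dict:
    # Training path: calls the shared definition rather than re-deriving it.
    return {uid: is_active_user(seen, as_of) for uid, seen in users.items()}


def score_request(last_seen: date, today: date) -> bool:
    # Serving path: the same function, so the logic is not duplicated.
    return is_active_user(last_seen, today)
```

With one definition, a change to what "active" means is a single edit that both paths pick up, instead of two edits that someone has to remember to keep in step.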

Notebooks deserve a specific mention here, not as debt in themselves but as the vehicle through which debt escapes notice. A shortcut taken in a module is visible: it shows up in a diff, a reviewer sees it, the absent test is conspicuous. The same shortcut taken in a notebook is invisible to every mechanism that would normally surface it — diffs are unreadable (Chapter 2), there is nothing to import and therefore nothing to test, and no reviewer is reading cell 40. The debt is identical; only the chance of anyone noticing it has changed.

19.4 Managing debt

You don’t eliminate technical debt; you manage it. Three practices do most of the work. Track it — a debt log, or issues in your tracker, so a known shortcut is visible and deliberate rather than forgotten. The boy-scout rule — leave code a little better than you found it — pays debt down opportunistically, a test added or a constant named while you’re in the file for another reason. And scheduled paydown — deliberate refactoring time, not squeezed in around feature work — keeps the larger debts from growing unboundedly.
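A debt log needs very little structure to be useful. As a hedged illustration (the field names, the `DebtItem` class, and the `due_for` helper are assumptions made for this sketch, and an issue tracker with labels would serve the same purpose), each entry records where the shortcut lives, what it costs to leave, and the event that should trigger repayment:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DebtItem:
    """One entry in a hypothetical debt log."""
    location: str         # file or component carrying the shortcut
    shortcut: str         # what was skipped or hard-coded
    cost_of_leaving: str  # what goes wrong, or gets slower, if it stays
    trigger: str          # event that should prompt repayment
    taken_on: date
    deliberate: bool = True


def due_for(log, event):
    """Return the items whose repayment trigger mentions the given event."""
    return [item for item in log if event.lower() in item.trigger.lower()]


# Illustrative entry only; the file name and date are made up.
log = [
    DebtItem(
        location="metrics.py",
        shortcut="no zero-active-days guard",
        cost_of_leaving="inf propagates into reports",
        trigger="before the pipeline is scheduled in production",
        taken_on=date(2024, 1, 15),
    )
]
```

Calling `due_for(log, "production")` when that milestone approaches lists what is owed before the code becomes load-bearing.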

The judgement is about timing. Taking on debt is right for a prototype that might be discarded, a hypothesis you’re testing, or a genuine deadline — there’s no sense gold-plating code you may throw away tomorrow. Repaying it is right before the code becomes load-bearing: before others depend on it, before it’s scheduled in production, before the shortcut you took for a throwaway becomes a foundation.

Author’s Note

Data science intentionally incurs debt, and that instinct is correct. Exploration is supposed to be fast, messy, and disposable; wrapping a hypothesis you might abandon tomorrow in tests and abstractions is waste, not virtue. The problem is almost never that data scientists take on too much debt during exploration. It’s that the messy exploratory code silently becomes the production system without anyone deciding it should — so debt taken on for a prototype that was meant to live an afternoon is now load-bearing, unacknowledged, and overdue.

The skill, then, is not avoiding debt but tracking it — keeping a record of the shortcuts you’ve taken, so that when a piece of code graduates from scratch to kept, you can see what you owe and choose to repay it deliberately, rather than rediscovering it at 3am when the zero finally arrives. Debt you chose, wrote down, and can repay on your terms is a tool. Debt you forgot you took is the thing that turns a small change into a lost week.

19.5 Making the case for repayment

Scheduled paydown requires something the other two practices don’t: someone else’s agreement to spend time on it. This is where most repayment plans die, and usually because of how the case was made rather than whether it was justified. “This module needs refactoring” asks a manager to fund an aesthetic judgement they have no way to evaluate. It reads as a request to make the code nicer, and against a roadmap of things customers asked for, nicer loses every time.

The case that works is made in the two currencies a manager already budgets in: velocity and risk. Velocity means the cost of future work, and you can usually evidence it from records you already have. If the last three changes touching the feature pipeline each took a week, and two of them shipped a bug that had to be fixed afterwards, that is a measured interest rate, not an opinion about code quality. Set it against the estimate for the repayment and the argument becomes arithmetic: two days now, and the next change to this area is an afternoon rather than a week. Risk is the other half, and it is the stronger one for the data-specific debts. A model nobody can retrain isn’t untidy; it’s a system with no recovery path, and the honest framing is that when the data shifts (not if), the response time is measured in weeks of rebuilding rather than hours of retraining. Managers who are indifferent to elegance are rarely indifferent to that.
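The velocity argument can be written out as a small calculation. The sketch below uses the figures from the paragraph above, with two assumptions of its own: a "week" is taken as five working days and an "afternoon" as half a day. The function names are hypothetical.

```python
import math


def break_even_changes(repay_days, days_per_change_now, days_per_change_after):
    """Number of future changes after which the repayment has paid for itself."""
    saving = days_per_change_now - days_per_change_after
    if saving <= 0:
        raise ValueError("repayment must make changes cheaper to ever break even")
    return math.ceil(repay_days / saving)


def net_days_saved(repay_days, days_per_change_now, days_per_change_after, n_changes):
    """Working days saved over n_changes, net of the one-off repayment cost."""
    saving = days_per_change_now - days_per_change_after
    return n_changes * saving - repay_days


# Assumed units: a week = 5 working days, an afternoon = 0.5 days.
changes_needed = break_even_changes(2, 5, 0.5)
saved_over_three = net_days_saved(2, 5, 0.5, 3)
```

Under those assumptions the repayment breaks even on the first subsequent change, and three further changes at the new rate save 11.5 working days net of the two spent.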

Two tactics make the case land. Attach the repayment to work already on the roadmap, so it is the enabling first step of something wanted rather than a competing request; a refactor justified by the feature it unblocks is far harder to defer than one justified on its own. And name the trigger rather than asking for time in the abstract — “this doesn’t need fixing now, it needs fixing before we onboard the second model” converts an open-ended plea into a scheduled decision with a deadline attached. The corollary is that dedicated “refactoring sprints” are a weak instrument: they are visible, separable, and therefore the first thing cut when a deadline moves. Repayment folded into the estimate of the feature it serves survives the pressure that a standalone cleanup week does not.

19.6 Summary

Technical debt is inevitable; managing it deliberately is the skill:

1. Debt is borrowed time with interest. A shortcut is fast now and costlier on every later change; deliberate, acknowledged debt is a tool, while forgotten debt is the liability.
2. Its interest is invisible until you pay. A shortcut runs fine until the input or the requirement changes, then charges all at once, often as silent wrongness rather than a clean failure.
3. Data science has its own debts. Untested transforms, copy-paste, promoted notebooks, and, uniquely, debt hiding in data and configuration, not just code.
4. Manage it: track, tidy, and schedule. Record known shortcuts, leave code better than you found it, and set aside real time for repayment, repaying before the code becomes load-bearing.

The final chapter of this part turns from the debt within a project to the people across it: cross-discipline collaboration.

19.7 Exercises

1. Audit one of your own projects for technical debt: list the shortcuts, untested code, copy-pasted logic, dead experiments, and hard-coded values you find. Mark each as deliberate (you knew you were taking it) or inadvertent (you’ve only just noticed). Which category was larger, and which item surprised you?
2. Apply the boy-scout rule: while you’re in a file for some other reason, pay down one debt item: add a test, extract a function, or name a constant. How long did it take relative to the change you were originally making?
3. Start a debt log (a file or a set of issues) recording known shortcuts, the cost of leaving each one, and the event that should trigger its repayment. What makes a debt item worth writing down rather than simply fixing on the spot?
4. Conceptual: Take three debt items from your audit in Exercise 1 and put them in the order you would repay them. The financial metaphor says clear the highest-interest debt first, but you cannot read technical debt’s interest rate off a statement, so say what you actually used as a proxy for it. Then find one item on your list that the metaphor would have you repay but which you would be better off deleting outright, and explain what the metaphor missed.
5. Conceptual: Taking on debt is sometimes the right call. Describe a situation where a shortcut is the correct decision and one where it’s reckless, and name the property that distinguishes the two.

Content: CC BY-NC-SA 4.0 | Code: MIT - see /LICENSE.md