The 3 AM Refactor Nobody Asked For
The 3 AM Refactor Nobody Asked For
It started at 9 PM on a Tuesday. A “quick cleanup” before bed.
The service handled webhook deliveries. It worked. Tests passed. Users were happy. But one function bugged me from a code review earlier that day. Not broken — just ugly. A nested conditional that could be a guard clause. Five minutes, tops.
I opened the file. Highlighted the function. Asked Claude to clean it up.
The suggestion was good. Guard clause, early return, readable. I accepted it. The function below used a similar pattern. Might as well fix that too.
Claude agreed. It also flagged inconsistent error handling. Some functions threw. Some returned error objects. “Want me to standardize error handling across this file?”
Yes. Obviously yes.
That was 9:20 PM.
The cascade
By 10 PM I’d touched three files. The error handling refactor changed calling code in file two. File two imported a utility from file three that used the old pattern. Each fix was correct. Each fix revealed the next fix.
By 11 PM the utility file was reorganized. Functions grouped by responsibility. Dead exports removed. Types tightened. Claude found two near-duplicate functions and suggested merging them. Clean merge. Tests still passed. I felt good.
By midnight I was in the database layer. The refactored utility used a query builder with its own inconsistencies. That query builder worked. It had worked for eight months. But the clean layer above made the layer below look messy.
“This query builder could use a more composable API,” Claude offered. “Want me to show you method chaining?”
Of course I did.
Sunk cost in code
Four files deep at midnight. Walking away feels like waste.
You’ve invested three hours. The first three files are cleaner. Stopping now leaves a half-refactored codebase — clean on top, messy underneath. That feels worse than where you started. So you keep going. Not because the work matters. Because stopping would “waste” what you’ve done.
Textbook sunk cost fallacy. The three hours are gone regardless. The code worked before. It works now. Only the formatting changed. But your brain reframes: those three hours are an investment. Investments need to pay off. The payoff is finishing. So you push into file five.
File five is the biggest. Its own problems, its own patterns, its own debt. Claude finds 12 things to improve. Each one is legitimate. Each takes ten minutes. Two hours of work. You think: I’m already here. What’s two more hours after three?
Everything. The difference between six hours of sleep and four. Between showing up sharp and dragging through the morning with coffee as a personality.
But the AI doesn’t know about your morning. It doesn’t model your sleep. It doesn’t factor diminishing returns on refactoring working code. It sees code. It sees improvements. It suggests them.
Dark flow
Behavioral psychologists call it “dark flow.” Regular flow has a clear goal and a natural stopping point. You build a feature, enter flow, ship, stop. Dark flow has no endpoint. You feel productive. Time disappears. But no goal is being achieved. Just activity that resembles work.
The difference is subtle from inside. Both feel the same: focused, competent, engaged. Each solution satisfies. The gap between question and answer is tight — ask Claude, get a response in seconds, evaluate, accept or adjust, move on. The loop is fast. The feedback is immediate. The variable ratio — sometimes perfect, sometimes needs tweaking, sometimes surprising — locks your attention.
Slot machine mechanics in a code editor.
Real flow serves a destination. You know “done” because you defined it before starting. Dark flow has no “done.” There’s always another file. Another inconsistency. Another pattern that could be “more idiomatic.” The AI will never say “this is good enough.” It will never say “go to bed.”
That’s your job. At 2 AM, after five hours of micro-improvements, you’ve lost the ability to do it. Flow captured your executive function. You’re not choosing to continue. Stopping requires a decision. Decisions require the prefrontal cortex. Your prefrontal cortex clocked out two hours ago.
The morning after
I stopped at 3 AM. Not by choice. The build failed on an unrelated dependency issue. The spell broke.
I pushed the branch and went to bed.
Next morning I opened the PR. The diff: 847 lines changed across 9 files. I read through it with coffee, trying to remember why each change mattered. Some were genuine improvements. Most were lateral moves — different, not better. A few added abstraction where none was needed.
Functionality: identical. Every test that passed before still passed. Every test that failed still failed. No bugs fixed. No features added. No performance gained.
Six hours making code prettier for an audience of zero. The service ran on a server. Nobody read it. I was the only developer.
I closed the PR. Reverted the branch.
Four hours of sleep, gone. A sharp morning, gone. For less than nothing — because now I also had the shame of knowing I’d done it to myself. Again.
How to tell the difference
Three questions taped to my monitor. When I feel the pull of “one more improvement,” I stop and answer them.
Can I state my goal in one sentence? “Clean up the code” is a direction, not a goal. “Extract retry logic into a shared utility for the new endpoint” is a goal. If I can’t name a bounded outcome, I’m drifting.
Will stopping now lose meaningful progress? Meaningful means someone besides me will notice. A feature works. A bug affecting users is fixed. A blocked deployment is unblocked. “The code is prettier” is not meaningful. It’s grooming.
Has anyone asked for this work? Not “would someone approve.” Has someone requested it? Is it on a board? In a ticket? If I’m the only person who knows this work exists, that’s a signal. Important work has stakeholders. Midnight refactors have an audience of one.
All three answers “no”? Close the laptop. Not in five minutes. Now. “Five more minutes” is the entry drug. I know where it leads.
The uncomfortable part
The code was fine. It was always fine. The AI didn’t trick me. I chose to start. I chose to continue. The AI just made each step frictionless enough that I never hit a natural stopping point.
That’s the mechanism. Not manipulation — removal of friction. Every barrier that would make you pause — documentation lookup, finding the right pattern, typing boilerplate — gone. The distance between “I could improve this” and “I have improved this” collapsed to seconds. Without those micro-pauses where you’d glance at the clock or feel your eyes burn, the session just continues.
Hygiene isn’t about stopping. It’s about noticing. Noticing three hours passed. Noticing the goal shifted from “fix one function” to “refactor everything.” Noticing the satisfaction is real but the output is hollow.
Our quiz measures two dimensions that map to this pattern. Session Escalation: the “one more thing” spiral. Dark Flow: timeless absorption that feels productive but produces nothing. 14 questions. 3 minutes. Anonymous. Unlike my 3 AM refactor, it might tell you something useful.
OnTilt is a research project studying behavioral addiction mechanisms in AI coding tools. The quiz is a self-check tool, not a diagnostic instrument. Read more on our About page.