For months I had a vague guilt about how much I leaned on AI to ship. Vague, because I had no number. So I did the obvious thing: I built a dashboard.
The problem with vibes
You cannot manage what you cannot see. "I use Claude a lot" is not a metric — it is a vibe. The first job was to turn the vibe into a daily number I could not argue with.
If a tool is shaping how you work, you owe yourself an honest count of what it costs.
I pulled exact token counts where the logs exist (Codex, Claude Code) and honest estimates where they do not (the chat UIs). The trick is to never let an estimate masquerade as exact.
Normalizing the data
Every source gets normalized into one row per local day. Raw exports stay out of the repo; only scrubbed totals get committed.
type BurnDay = {
date: string; // YYYY-MM-DD, local
codex: number; // exact
claudeCode: number; // exact
chatEst: number; // estimated
total: number;
driver: "shipping" | "review" | "research";
};The heatmap that almost lied
My first heatmap was a uniform green smear — every day looked the same. The fix was a log/quantile color scale, so quiet days read light and real spikes read dark.
What I actually learned
The number was higher than my guilt suggested — but concentrated. Two or three shipping days a week account for most of it. That changed how I batch work, not whether I use the tools.