The CPA Firm That Cut Busy Season From 80 Hours a Week to 50
A 9-person tax practice was losing two staff per year to burnout. We mapped the work, automated the document chase, and rebuilt the IRS notice workflow. Here's exactly what we built and what it cost.
Every January this firm did the same thing. Partners pulled 80-hour weeks. Senior staff cried in the bathroom by mid-February. Two people quit in March. The cycle reset in October when recruiting started.
I came in October. Six months later busy season landed and the longest week any partner pulled was 53 hours. Nobody quit. They closed two weeks earlier than the prior year.
This is what we actually did.
The audit
Before I built anything I sat with three staffers for a full day each. I watched their screens. I timed everything. I asked dumb questions about why a thing was done a particular way.
What I found, in order of how much time it ate:
1. Document chase. Roughly 14 hours per week per senior just emailing clients asking for the W-2 they didn't send.
2. IRS notice triage. Each notice took 40 to 90 minutes to read, classify, draft a response, and queue for partner review.
3. Manual data entry from PDFs into Drake Tax.
4. Status update emails to anxious clients.
5. Re-doing returns when the engagement letter scope was misunderstood.
Five categories. About 60% of the senior staff's time. Two of those five are the ones we automated first.
Document chase (the boring one that mattered most)
We built a simple n8n workflow tied to their CRM. When a client engagement is opened, a checklist of expected documents is generated based on prior-year filings. The CRM gets a "missing docs" field that's updated every time a document is uploaded to the secure portal.
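The real logic lives in n8n nodes and CRM fields, but the checklist idea fits in a few lines. This is a minimal Python sketch, not the production workflow. The `Engagement` class, its field names, and the sample document types are all hypothetical, and it assumes this year's checklist simply mirrors last year's filings.

```python
from dataclasses import dataclass, field


@dataclass
class Engagement:
    """Hypothetical stand-in for a CRM engagement record."""
    client_id: str
    # Document types seen in the prior-year filing (illustrative values).
    prior_year_docs: list[str]
    uploaded: set[str] = field(default_factory=set)

    def expected_docs(self) -> list[str]:
        # Assumption: the checklist mirrors last year's documents,
        # de-duplicated while keeping the original order.
        return list(dict.fromkeys(self.prior_year_docs))

    def record_upload(self, doc_type: str) -> None:
        # Called whenever the secure portal reports a new upload.
        self.uploaded.add(doc_type)

    def missing_docs(self) -> list[str]:
        # This is the value that would be written to the "missing docs" field.
        return [d for d in self.expected_docs() if d not in self.uploaded]


eng = Engagement("client-001", ["W-2", "1099-INT", "1098", "W-2"])
eng.record_upload("W-2")
print(eng.missing_docs())  # ['1099-INT', '1098']
```

An empty list from `missing_docs()` is the signal that the chase for that client is over.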
Every 72 hours, if the field is still incomplete, n8n drafts a polite reminder using a Claude prompt that personalizes based on the client's prior-year behavior. Some clients respond to gentle. Some need direct. The prompt knows.
The reminder gets sent automatically for tier-3 clients. Tier-1 and tier-2 (the high-revenue ones) get a draft into the assigned partner's inbox for a human send.
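The 72-hour check and the tier split boil down to one small decision function. Here is a rough sketch of it in Python. The function name, the action labels, and the idea of tracking a `last_reminder` timestamp are my assumptions for illustration, not the actual n8n node configuration.

```python
from datetime import datetime, timedelta

REMINDER_INTERVAL = timedelta(hours=72)


def reminder_action(tier: int, missing: list[str],
                    last_reminder: datetime, now: datetime) -> str:
    """Decide what to do with one client on one scheduled run."""
    if not missing:
        return "none"              # checklist complete, stop chasing
    if now - last_reminder < REMINDER_INTERVAL:
        return "wait"              # reminded less than 72 hours ago
    if tier == 3:
        return "auto_send"         # tier-3: the drafted reminder goes out as-is
    return "draft_to_partner"      # tier-1 and tier-2: a human sends it


t0 = datetime(2024, 2, 1, 9, 0)
print(reminder_action(3, ["W-2"], t0, t0 + timedelta(hours=72)))  # auto_send
print(reminder_action(1, ["W-2"], t0, t0 + timedelta(hours=80)))  # draft_to_partner
```

Keeping the tier check last means a high-revenue client can never fall through to an automatic send by accident.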
That alone saved 11 hours per week per senior. Not 14 because some clients still need a phone call. But the gap closed enough to matter.
IRS notice workflow
This was the harder one. IRS notices come in 70 or so flavors. Some are routine ("we received your return"). Some are scary ("we are auditing your business"). The firm was treating all 70 the same way: open, read, draft, queue.
We built a classifier first. Each notice gets OCR'd, then run through Claude with a strict prompt that returns notice type + severity + recommended action. The classifier is right about 94% of the time on the test set. The 6% it gets wrong it flags as low confidence rather than miscategorizing.
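A strict prompt only helps if the code on the other side is just as strict about what it accepts. The sketch below shows one way to validate the model's reply, assuming it comes back as JSON. The key names, the three severity labels, the `confidence` field, and the 0.8 floor are all assumptions for illustration. The real prompt and its thresholds aren't reproduced here.

```python
import json
from dataclasses import dataclass

SEVERITIES = {"routine", "severe", "audit"}  # assumed label set
CONFIDENCE_FLOOR = 0.8                       # hypothetical threshold


@dataclass
class NoticeResult:
    notice_type: str
    severity: str
    recommended_action: str
    low_confidence: bool


def parse_classifier_reply(raw: str) -> NoticeResult:
    """Turn the model's raw text reply into a validated result."""
    try:
        data = json.loads(raw)
        severity = data["severity"]
        confidence = float(data["confidence"])
        return NoticeResult(
            notice_type=str(data["notice_type"]),
            severity=str(severity),
            recommended_action=str(data["recommended_action"]),
            # Unknown labels and weak scores are flagged, never trusted.
            low_confidence=(severity not in SEVERITIES
                            or confidence < CONFIDENCE_FLOOR),
        )
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Anything malformed is treated as low confidence, not guessed at.
        return NoticeResult("unknown", "unknown", "manual review", True)


reply = ('{"notice_type": "balance due", "severity": "routine", '
         '"recommended_action": "send template", "confidence": 0.95}')
print(parse_classifier_reply(reply).low_confidence)      # False
print(parse_classifier_reply("not json").low_confidence)  # True
```

The point of the design is that every failure mode lands in the same place: a flagged notice a human looks at.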
Routine notices auto-route to a junior with a templated response that's pre-filled with client data. They review, edit, send. 8 minutes instead of 40.
Severe notices route directly to the responsible partner with a generated brief. The partner doesn't read the notice cold. They read the brief, then the notice. Cuts the partner's review time by half.
The audit-trigger notices are the only ones that still take the full 90 minutes, and we left those alone on purpose. You don't automate audit work.
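Put together, the routing is a lookup table with a deliberate hole in it. A sketch in Python, with hypothetical queue and package names. Sending low-confidence notices to the manual queue is my assumption about the fallback; the three tracks themselves are the ones described above.

```python
# Severity label -> (who gets it, what they get with it).
ROUTES = {
    "routine": ("junior", "prefilled_template"),
    "severe": ("partner", "generated_brief"),
}

MANUAL = ("manual_queue", "original_notice_only")


def route_notice(severity: str, low_confidence: bool = False) -> tuple[str, str]:
    """Pick a track for a classified notice."""
    # Audit-trigger notices are intentionally absent from ROUTES, so they
    # fall through to the old manual process along with anything flagged
    # as low confidence or carrying an unrecognized label.
    if low_confidence or severity not in ROUTES:
        return MANUAL
    return ROUTES[severity]


print(route_notice("routine"))  # ('junior', 'prefilled_template')
print(route_notice("audit"))    # ('manual_queue', 'original_notice_only')
```

Making manual the default means a new or misread notice type gets more attention, not less.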
What it cost
n8n self-hosted on a $20/mo Hetzner VPS. Claude API spend ran about $180 in March (highest month), $40-90 the rest of the year. Drake Tax integration was a custom HTTP webhook because Drake doesn't have a real API.
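For anyone budgeting something similar, here is the annual infrastructure math worked out from those figures. It assumes the stack runs all twelve months and that each of the eleven non-March months lands somewhere in the $40-90 range. It leaves out my fee.

```python
VPS_MONTHLY = 20             # Hetzner VPS, dollars per month
CLAUDE_PEAK = 180            # March, the highest month
CLAUDE_OTHER = (40, 90)      # low/high for each of the other 11 months

vps_year = VPS_MONTHLY * 12
claude_low = CLAUDE_PEAK + 11 * CLAUDE_OTHER[0]
claude_high = CLAUDE_PEAK + 11 * CLAUDE_OTHER[1]
total_low = vps_year + claude_low
total_high = vps_year + claude_high

print(f"VPS: ${vps_year}/yr")                          # VPS: $240/yr
print(f"Claude: ${claude_low}-${claude_high}/yr")      # Claude: $620-$1170/yr
print(f"Total: ${total_low}-${total_high}/yr")         # Total: $860-$1410/yr
```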
My engagement was 14 weeks of work over 4 months. The fee came to about a third of one partner's annual recruiting and severance cost.
What I learned
The thing that surprised me was how much of busy season was self-inflicted. The firm had operated this way for 12 years. They had never sat down and counted where the time went. They had a vague sense that "documents are a nightmare" but didn't know it cost them 14 hours per senior per week.
The audit was the real engagement. It was the hard part. The automations were the easy part.
If you run a practice and you're reading this and your busy season is brutal, I'll tell you the same thing I told this partner group at our first meeting. You can't fix what you haven't measured. Spend a day. Watch your staff work. Time the categories. The fix becomes obvious.
What didn't work
The first IRS classifier I built used embeddings against a notice library. Total disaster. The notices have so much identical boilerplate that everything looked similar to everything else. We threw it out and went to a direct Claude prompt with the notice text and a list of categories. Worked first try.
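The direct-prompt approach is about as simple as it sounds: the notice text, the category list, and an instruction to pick exactly one. Below is a minimal sketch of the prompt assembly only. The category names, the wording, and the requested JSON keys are placeholders I made up for illustration, not the firm's actual prompt, and the API call itself is left out.

```python
# Hypothetical category names; the real list covers far more notice types.
CATEGORIES = ["acknowledgment", "balance_due", "information_request", "audit"]


def build_prompt(notice_text: str, categories: list[str]) -> str:
    """Assemble a single classification prompt from OCR'd notice text."""
    bullet_list = "\n".join(f"- {c}" for c in categories)
    return (
        "Classify the IRS notice below into exactly one of these categories:\n"
        f"{bullet_list}\n\n"
        "Reply with JSON only, using the keys notice_type, severity, "
        "recommended_action, and confidence.\n\n"
        f"<notice>\n{notice_text.strip()}\n</notice>"
    )


prompt = build_prompt("  We received your return.  ", CATEGORIES)
print(prompt)
```

Because the model reads the whole notice against a closed list, the shared boilerplate that sank the embedding approach stops mattering. It only has to find the few sentences that differ.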
We also tried to automate the status-update emails to clients. We pulled it back after week three because the auto-emails sounded like a robot wrote them and clients started calling to complain. We kept the draft generation but require human send for all status updates. The drafts save time. The robotic sends cost relationships.
What's still on the punch list
Manual data entry from PDFs to Drake is still mostly manual. The OCR works for clean W-2s but breaks on scanned documents from phone cameras. We're piloting a different OCR engine next quarter.
Engagement-letter scope clarification is also mostly unsolved. That's a sales-process problem, not an AI problem.
I'd rather tell you what didn't work than pretend the whole engagement was a clean win. The 80-to-50 number is real but it took a year to land and we're not done.