// ai postmortems | by Josh | April 23, 2026 | 5 min read

The HITL Queue Nobody Checked

Human-in-the-loop is the safety net. The safety net only works if humans actually look at it. Here's the story of an unchecked queue, the customer impact, and the design that fixed it.

Every AI agent we ship has a human-in-the-loop queue. Things the agent isn't sure about go there. Humans review and approve.

The design assumes humans actually check the queue. At one client, nobody did for three weeks. Here's what happened.

The setup

A 22-person sales team had an AI assistant that drafted outbound emails. High-confidence drafts auto-sent. Anything below 0.85 confidence went to a HITL queue for a sales manager to review and approve.
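
A minimal sketch of that routing rule, for illustration only. The `Draft` type and the `route` function are hypothetical names, not the client's actual code; the only detail taken from the setup is the 0.85 cutoff.

```python
from dataclasses import dataclass

# Cutoff from the setup above: drafts below this go to a human.
AUTO_SEND_THRESHOLD = 0.85


@dataclass
class Draft:
    draft_id: str      # hypothetical identifier for the drafted email
    confidence: float  # model confidence in the draft, 0.0 to 1.0


def route(draft: Draft, threshold: float = AUTO_SEND_THRESHOLD) -> str:
    """Return "hitl_queue" for drafts below the threshold, else "auto_send"."""
    if draft.confidence < threshold:
        return "hitl_queue"
    return "auto_send"
```

Note what the rule does not say: nothing here describes what happens to a draft once it lands in the queue and sits there. That gap is the rest of this story.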

The system worked great. About 6% of drafts hit the HITL queue, which felt manageable. The sales manager approved them in batches every morning.

What happened

The sales manager went on a 3-week vacation in August. He didn't formally hand off the HITL queue review because "nothing important happens in August."

Sales kept generating drafts. The HITL queue kept filling. The system kept holding low-confidence drafts.

In week two of his vacation:

- 47 unreviewed drafts piled up
- 12 of them were for time-sensitive opportunities
- 6 of those went stale and the prospects ghosted
- 1 of them was a contract negotiation, and that deal was lost entirely

By the time the manager came back and saw the queue, it was too late on the contract negotiation. Estimated lost revenue: $80k.

Root cause

The HITL design assumed continuous human attention. The system did not:

- Have a backup approver
- Notify anyone when items aged past a threshold
- Surface aging items as urgent
- Auto-escalate stale items

When the human stopped paying attention, the safety net silently stopped working.

What we did instead

We rebuilt the HITL queue with three changes.

One, multi-tier approvers. Every item routes to a primary approver. If unreviewed after 4 hours, it also routes to a backup. If unreviewed after 24 hours, it routes to the team's Slack channel as urgent.
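
The tiering can be sketched as a pure function of an item's age. This is an illustration under assumptions, not the production code: the function name and the target labels are made up, and I'm treating "after 4 hours" as "4 hours or more".

```python
from datetime import datetime, timedelta

BACKUP_AFTER = timedelta(hours=4)    # backup approver joins at this age
CHANNEL_AFTER = timedelta(hours=24)  # team channel is alerted at this age


def escalation_targets(queued_at: datetime, now: datetime) -> list[str]:
    """Return everyone who should currently see an unreviewed item."""
    age = now - queued_at
    targets = ["primary"]
    if age >= BACKUP_AFTER:
        targets.append("backup")
    if age >= CHANNEL_AFTER:
        targets.append("team_channel_urgent")
    return targets
```

The tiers are additive on purpose: the primary approver never drops off the list, the audience just widens as the item ages.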

Two, automatic aging alerts. Every morning at 9 AM the system pings the primary approver in Slack with: "You have N items in your queue, M of them over 24 hours old, oldest is from {time}."
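
A rough sketch of how that message could be assembled. The `morning_digest` name and the timestamp format are assumptions, and actually posting the text to Slack is deliberately left out.

```python
from datetime import datetime, timedelta
from typing import Optional

STALE_AFTER = timedelta(hours=24)


def morning_digest(queued_at: list[datetime], now: datetime) -> Optional[str]:
    """Build the daily reminder text, or None if the queue is empty."""
    if not queued_at:
        return None
    # Count items that have waited longer than the staleness window.
    stale = sum(1 for t in queued_at if now - t > STALE_AFTER)
    oldest = min(queued_at)
    return (
        f"You have {len(queued_at)} items in your queue, "
        f"{stale} of them over 24 hours old, "
        f"oldest is from {oldest:%Y-%m-%d %H:%M}."
    )
```

Returning `None` for an empty queue is a choice made for this sketch; some teams prefer an explicit "queue is empty" ping so that silence always means something is broken.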

Three, vacation handoff workflow. When an approver enables their vacation responder, the system automatically reroutes their queue to a configured backup until they return.
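
One way to sketch the rerouting logic, assuming each approver record carries an away window and a configured backup. The `Approver` fields and `resolve_approver` are hypothetical; how the away dates get set (calendar, vacation responder, HR system) is outside this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Approver:
    name: str
    backup: Optional[str] = None       # name of the configured backup
    away_from: Optional[date] = None   # first day away
    away_until: Optional[date] = None  # last day away, inclusive

    def is_away(self, today: date) -> bool:
        if self.away_from is None or self.away_until is None:
            return False
        return self.away_from <= today <= self.away_until


def resolve_approver(primary: str, approvers: dict[str, Approver], today: date) -> str:
    """Follow backup links until someone who is not away is found."""
    seen = set()
    current = primary
    while current not in seen:  # guards against backup cycles
        seen.add(current)
        person = approvers[current]
        if not person.is_away(today):
            return current
        if person.backup is None:
            break
        current = person.backup
    raise LookupError(f"no available approver for {primary!r}")
```

Raising when nobody is available is the point: a coverage gap should fail loudly at routing time instead of quietly parking items in an unwatched queue.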

What I tell prospects now

HITL is not a design pattern, it's an operational pattern. The technology is easy. The discipline is what fails.

If you build HITL, you must also build:

- Aging alerts
- Multi-tier approver fallbacks
- Vacation/coverage handoffs
- Queue depth limits with automatic escalation
- A weekly review of "what's in the queue and why"
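
Of the items above, the queue depth limit is the one people most often skip, so here is a hedged sketch of it. `BoundedReviewQueue` and its `escalate` callback are illustrative names; the right limit, and who gets the alert, depend on the team.

```python
from typing import Callable


class BoundedReviewQueue:
    """Review queue that escalates whenever it grows past a depth limit."""

    def __init__(self, max_depth: int, escalate: Callable[[str], None]):
        self.max_depth = max_depth
        self.escalate = escalate  # e.g. a function that pages a backup
        self.items: list[str] = []

    def add(self, item_id: str) -> None:
        # Items are never dropped; crossing the limit only triggers an alert.
        self.items.append(item_id)
        if len(self.items) > self.max_depth:
            self.escalate(
                f"Queue depth {len(self.items)} exceeds limit {self.max_depth}"
            )


alerts: list[str] = []
queue = BoundedReviewQueue(max_depth=2, escalate=alerts.append)
for draft_id in ["d1", "d2", "d3"]:
    queue.add(draft_id)
```

In this sketch the alert fires on every add past the limit, which gets noisy; a real version would likely throttle it.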

Without these, you have a queue that silently breaks the first time a human stops paying attention. And humans always stop paying attention eventually.

The deeper lesson

The fundamental rule of AI agents: anything that's "the human will check" must also have "what happens if the human doesn't" specified.

In automation parlance: the loop should be self-healing under reasonable failure modes. The human pausing or being absent is a reasonable failure mode. Design for it.

I now consider a HITL design incomplete if it doesn't address coverage. That's a contractual change in my engagements, not just a best practice.

The thing nobody mentions

The sales manager wasn't lazy. He was on vacation. He came back to a disaster that wasn't his fault, but he absorbed the blame anyway.

The system was the problem. The system should not have allowed his absence to cost the company $80k. The system was designed to fail under that condition.

When you build HITL queues without coverage logic, you're transferring operational risk from the AI to the humans, and the humans aren't aware they're carrying it. The first time the queue silently fails, the human gets blamed. The second time, the human resigns. The third time, the AI gets rolled back.

Build the coverage into the system from day one. Don't make humans carry risk they didn't sign up for.
