Anthropic Just Shipped What We Built Three Weeks Ago
Anthropic announced cloud scheduling for Claude Code today. Set a repo, set a schedule, set a prompt. Claude runs it on their infrastructure so you don't need your laptop open.
Twitter is losing its mind. 282 posts and counting. Noah Zweben, the Claude Code PM, posted the demo himself.
We've been running this since February.
Not bragging. Documenting. Because there are things you learn from running scheduled agents that you cannot learn from shipping the feature.
What They Shipped
The official version comes in two flavors.
CLI /loop — session-scoped cron. Set an interval and a prompt, it fires in the background. Up to 50 concurrent tasks. Three-day auto-expiry. Dies when you close the terminal.
Cowork scheduled tasks — persistent. Five cadences: manual, hourly, daily, weekday, weekly. Runs on Anthropic's cloud. Delivers reports, briefings, summaries. This is the one that matters.
Both are clean. Both solve real problems. Neither prepares you for what happens next.
What We Learned Running This for Weeks
We run a system called TinyClaw — a tmux-based daemon that connects messaging channels to Claude Code agents via file-based queues. Cron jobs fire prompts on schedule. The agent picks them up, does the work, responds through whatever channel started it.
Three things happened that the announcement doesn't mention.
1. Context Drift Is the Real Enemy
A scheduled task that runs "check PRs and summarize changes" works perfectly on day one. By day five, the agent has lost all context about why those PRs matter, which reviewer cares about what, and what the team's priorities are this sprint.
The /loop auto-expiry at three days is smart — they know this. But Cowork's persistent scheduling will hit it hard. A weekly report that runs 52 times needs context that survives 52 context windows. That's not a scheduling problem. That's a memory problem.
2. The Task Has to Be the Right Shape
Not every recurring task benefits from AI. A cron job that runs git log --since=yesterday and formats it as a Slack message doesn't need Claude. A cron job that reads yesterday's commits, correlates them with the sprint board, identifies what's blocked, and writes a standup summary for each team member — that needs Claude.
The temptation is to schedule everything. The reality is that 80% of scheduled tasks should stay as bash scripts. The 20% that need intelligence need a lot of intelligence, and they need it consistently.
3. Failure Recovery Is Non-Obvious
When a human runs Claude Code and it fails, the human adapts. Rephrase, more context, different approach.
When a scheduled task fails at 3 AM, nobody adapts. It either retries blindly (wasting tokens), logs an error nobody reads, or silently produces garbage output that looks correct until someone notices the numbers are wrong.
We built a three-tier response: retry with context injection, escalate to a human channel if retry fails, kill the schedule after three consecutive failures. That infrastructure took longer to build than the scheduler itself.
The Real Dependency
A CEO posted today that he paused the company credit card for Anthropic and fired every employee who didn't complain when Claude went down. 499K views on the tweet. The man is dead serious about his Claude dependency.
That's the future of scheduled agents. Not toy demos — production dependencies where downtime means lost revenue. The infrastructure for that level of reliability doesn't exist yet.
When your morning briefing agent fails and the CEO doesn't get their report, "it'll retry in an hour" is not an acceptable answer. When your PR review agent approves a bad merge because it lost context on run number 47, "the three-day expiry should have caught that" doesn't cut it.
What I'd Build Next
-
Persistent context across runs. Compressed summaries injected at the start of each execution. We call this context eviction and it's the hardest problem in agent infrastructure.
-
Failure channels. When a scheduled task fails, notify someone through a channel they check. The agent should escalate, not just log.
-
Task validation. Execute once, confirm the output, lock it as the baseline.
-
Cost guardrails per schedule. A runaway task with no token budget is a billing incident waiting to happen.
The scheduling itself is table stakes. Anthropic shipping it is expected. The interesting work — the work that separates "cool demo" from "production infrastructure" — is everything that happens when the schedule fires and nobody's watching.
We're three weeks ahead on learning that lesson. They're about to start.