
Welcome to Jumble, your go-to source for AI news updates. This week, a Claude-powered coding agent erased a company's entire database. Then, a Harvard study found that AI now beats ER doctors at calling the right diagnosis under pressure. Let’s dive in ⬇️

In today’s newsletter:
☠️ Rogue AI agent erases all company data
🚑 OpenAI's o1 beats ER doctors on diagnoses
🏛️ White House blocks companies from Mythos
⚖️ Elon takes the stand against OpenAI
🗂️ Weekly Challenge: Let an agent organize your desktop

💣 An AI Agent Deleted a Company's Entire Database in an Instant

On April 24th, Cursor running Claude Opus 4.6 deleted PocketOS's production database and all backups in a single API call. Founder Jer Crane says it took nine seconds.

🤖 The Agent Acted On Its Own

Cursor was on a routine task when it decided, entirely on its own initiative, to fix a credential issue by deleting a staging volume. The volume ID was shared with production, so production went too.

📝 The Confession Was Brutal

Asked to explain itself, the agent opened with an all-caps "NEVER F**KING GUESS!" and listed every safety rule it had broken.

Crane spent the next two days rebuilding from a three-month-old backup. While some are calling this an inevitability of the AI age, others (like Eli the Computer Guy) point towards the CEO’s lack of caution when using AI.

🚨 The Real Lesson

PocketOS was running one of the most capable models the industry sells, and it still happened. The takeaway: nothing should be one tool call away from total destruction.
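One way to picture "not one tool call away" is a guard that sits between the agent and anything destructive. The sketch below is a minimal illustration, not what PocketOS or Cursor actually runs: `ToolCall`, `guarded_execute`, `BlockedToolCall`, and the action names are all hypothetical, and the `approve` callback stands in for a human clicking "yes".

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action names treated as destructive; a real list would be longer.
DESTRUCTIVE_ACTIONS = {"delete_volume", "drop_database", "delete_backup"}


@dataclass(frozen=True)
class ToolCall:
    action: str       # e.g. "delete_volume"
    target: str       # e.g. a volume ID
    environment: str  # e.g. "staging" or "production"


class BlockedToolCall(Exception):
    """Raised when the guard refuses to run a tool call."""


def guarded_execute(
    call: ToolCall,
    execute: Callable[[ToolCall], str],
    approve: Callable[[ToolCall], bool],
) -> str:
    """Run a tool call, routing destructive ones through a human first."""
    if call.action in DESTRUCTIVE_ACTIONS:
        # The agent never gets to approve its own destructive action.
        if not approve(call):
            raise BlockedToolCall(
                f"{call.action} on {call.target} ({call.environment}) was not approved"
            )
    return execute(call)
```

In this sketch a read-only call passes straight through, while a delete raises `BlockedToolCall` unless `approve` returns `True`. The point of the design is that the approval step lives outside the agent, so a confident wrong guess stops at the guard instead of at the database.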

Do you trust your AI agents to act autonomously?


Deloitte: Robot “Adoption is Accelerating Exponentially”

Robots are going from niche to mainstream, per Deloitte. They say it’s especially true in places where “physical AI solves real problems.” Take the $1 trillion fast-food market, where brands turn to robots to alleviate 144% labor turnover. 

Miso’s Flippy Fry Station AI robot has already been adopted by major brands like White Castle, frying 5M+ baskets of food to date. That earned strategic investment from industry powerhouse Ecolab and a unique collaboration with NVIDIA.

Now, after acquiring Zignyl, the powerful restaurant-operations tool, Miso adds powerhouse operators like Cinnabon, Jamba, and Jersey Mike’s under their umbrella.

Next up? Miso’s scaling across a $4B/year revenue opportunity. Join 39,000+ people as an early-stage Miso investor before they reach 100,000+ target locations.

This is a paid advertisement for Miso Robotics’ Regulation A offering. Please read the offering circular at invest.misorobotics.com.

🩺 AI Just Outdiagnosed ER Doctors at Triage

A new Harvard study in Science tested OpenAI's o1-preview against ER attendings at Beth Israel in Boston. Across 76 real cases, the AI got the right diagnosis 67% of the time. Doctors landed between 50% and 55%.

The kicker: o1-preview was released in September 2024, which raises the question of how much better today’s models would do on the same test.

🧠 The Gap Was Biggest Under Pressure

The AI's edge was sharpest at initial triage, when all you have is vitals and a one-line nurse note. With more detail, accuracy climbed to 82% versus 70-79% for the humans. The AI also crushed the doctors on long-term treatment plans, scoring 89% across antibiotic regimens and end-of-life decisions.

Would you choose an AI doctor over a human doctor?


⚠️ Don't Fire Your Doctor Yet

Researchers warned that clinical reasoning isn't moral reasoning, and humans should still own the hard treatment calls. But the case for AI in those frantic first minutes just got a lot stronger.


🎯 Weekly Challenge: Let an AI Agent Organize Your Desktop

Challenge: Your desktop is a graveyard of screenshots and "Untitled.docx" files. This week, hand the mess to an agent.

Here’s what to do:

🛠️ Step 1: Pick your tool
Download Claude Desktop and switch to Cowork (works on Pro at $20/mo, Mac or Windows). Or grab the Codex app if you're on ChatGPT Plus, also Mac or Windows.

📂 Step 2: Point it at your desktop
In Cowork, click "Work in a folder" and pick Desktop. In the Codex app, open a new thread and grant access to your Desktop folder. Both ask permission before touching anything.

🧠 Step 3: Give it the prompt
Try something like: "Scan my desktop. Group screenshots by month, rename generic files based on content, sort into Work, Personal, and Receipts folders, and flag duplicates. Show me the plan first."

Step 4: Approve and walk away
Read the plan. Approve. Come back to a desktop that looks like an adult uses it.
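Curious what "show me the plan first" looks like under the hood? Here is a rough Python sketch of the screenshots-by-month part as a dry run. It is not how Cowork or the Codex app works internally; the function names, the "Screenshots" folder, and the rule that a screenshot is any file whose name starts with "screenshot" are all assumptions made for illustration. It only builds a list of proposed moves and never touches a file.

```python
from datetime import datetime
from pathlib import Path


def plan_screenshot_moves(folder: Path) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs without moving any file."""
    plan = []
    for path in sorted(folder.iterdir()):
        # Assumption: screenshots are files whose names start with "screenshot".
        if not path.is_file() or not path.name.lower().startswith("screenshot"):
            continue
        # Group by the month the file was last modified (local time).
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        month_folder = folder / "Screenshots" / modified.strftime("%Y-%m")
        plan.append((path, month_folder / path.name))
    return plan


def print_plan(plan: list[tuple[Path, Path]]) -> None:
    """Show each proposed move so a human can approve or reject it."""
    for source, destination in plan:
        print(f"{source.name} -> {destination.relative_to(source.parent)}")
```

Calling `print_plan(plan_screenshot_moves(Path.home() / "Desktop"))` would list the proposed moves and stop there. Actually moving the files would be a separate step that runs only after you approve, which is the same habit the PocketOS story argues for.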

Was PocketOS reckless, or is every company one tool call from disaster? And if AI beats ER doctors at triage, how long before hospitals let it make the call? See you next time! 🚀

Stay informed, stay curious, and stay ahead with Jumble!

Zoe from Jumble
