Jumble
Posts
Claude 3.7 Sonnet Might Have Just Retaken the AI Throne

Claude 3.7 Sonnet Might Have Just Retaken the AI Throne

Zoe Kopidis
February 26, 2025

Welcome to this week’s Jumble! Anthropic just launched its most ambitious model yet, Claude 3.7 Sonnet, and it’s shaking up the AI landscape with hybrid reasoning and extended output capabilities. Meanwhile, xAI’s Grok 3 claims to be the smartest AI yet—but is xAI cherry-picking data? Let’s dive in. ⬇️

In today’s newsletter:
🧠 Claude 3.7 Sonnet pros and cons
🧐 Is xAI inflating Grok 3’s intelligence?
🕺 Watch this humanoid robot dance while dodging a soccer ball
🎨 Challenge: Create your own AI avatar

💫 Claude 3.7 Sonnet Redefines AI Intelligence

Anthropic’s newest model, Claude 3.7 Sonnet, introduces a hybrid reasoning system that lets users switch between standard mode (instant responses) and extended thinking mode (detailed, step-by-step analysis). This approach mirrors human cognitive flexibility—combining rapid intuition with deeper analytical processing.

Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking.
One model, two ways to think.
We’re also releasing an agentic coding tool: Claude Code.
— Anthropic (@AnthropicAI)
6:30 PM • Feb 24, 2025

💡 Strengths and Weaknesses of Claude 3.7 Sonnet

Strengths:
👫 Hybrid Reasoning – Combines instant responses with deep, step-by-step analysis, making it highly adaptable.
🔎 Visible Chain-of-Thought – Unlike other models, Claude 3.7 shows its full reasoning process, helping users understand how it arrives at conclusions.
📏 Extended Output Capacity – Supports up to 128,000 tokens, handling extensive documents and detailed responses with ease.

Weaknesses:

💰 Higher Costs for Deep Thinking – Extended mode consumes more tokens, making long-form tasks expensive.
🖥️ Manual Mode Switching – Users need to toggle between standard and extended modes, which can disrupt workflow.
📉 Performance Gaps in Math – While excelling in coding and reasoning, it lags behind in competitive mathematical problem-solving.
🌐 Can’t Surf the Web - Still doesn’t have the capability to go online for searches, so it’s limited to training data from late 2024.

🛠️ 5 Ways Non-Coders Can Use Claude 3.7 Sonnet

Photo by Anthropic

Anthropic also released Claude Code, which can help you create just about anything you can imagine. But, if you’re not into coding, 3.7 Sonnet can still help you do a lot of things!

📅 Automate Daily Admin Tasks – Schedule meetings, fill out forms, and organize workflows more efficiently.
✈️ Plan Events and Travel – Research destinations, compare flight/hotel prices, and even book reservations.
📚 Study Smarter – Use Claude 3.7 for in-depth learning on complex topics, with clear explanations and step-by-step breakdowns.
📝 Enhance Writing Projects – Whether brainstorming, editing, or generating full drafts, Claude refines ideas with precision.
🔍 Conduct Deep Research – Claude 3.7 analyzes vast amounts of information, summarizing key insights and helping make informed decisions.

You can (and should) also have FUN with it 😜

can't stop, won't stop building cool shit with Claude 3.7.
It's simply amazing.
I've had this idea for years of converting any song to a "rollercoaster" with a continuous track representing the song's energy. It's much harden than it sounds to get a single, continuous, measure.
— joao (@jay_wooow)
5:38 AM • Feb 25, 2025

What’s Next For Anthropic?
We have no idea what’s next, but Anthropic’s team does. Here’s what they envision Claude doing over the next few years:

Source: Anthropic

🧐 Is xAI Lying About Grok 3’s Intelligence?

xAI’s Grok 3 is under fire for allegedly misleading benchmark comparisons. Critics claim xAI omitted the "consensus@64" metric—where OpenAI’s o3-mini-high actually outperforms Grok 3—making xAI’s model look stronger than it really is.

🤷 What’s the issue?

In single-pass tests, Grok 3 trails OpenAI’s o3-mini-high and even struggles against OpenAI’s older o1 model in some categories.
Critics argue xAI’s selective reporting undermines the credibility of AI benchmarking and fuels skepticism about its claims.

🗣️ What does xAI say?

xAI defends its methodology, claiming its approach aligns with industry standards.
While Grok 3 performs well in complex, multi-step reasoning tasks, its inconsistencies across benchmarks raise questions about its supposed superiority.

This Week’s Scoop 🍦

🛰️ US tech firms supply AI to Israeli military

🔬 Google deploys AI "co-scientist" for research

⚽ Watch this humanoid robot boogie while dodging all sorts of hazards

😅 Why is AI unable to be funny?

📂 DeepSeek to open-source code repositories

🤖 This credit card-sized swimming robot might be a game-changer

❓ Quiz of the Week

Which of the following surprising facts about AI is actually true?

Choose an answer 👇

(Scroll down for the answer at the bottom of the newsletter!)

🎨 AI Challenge of the Week

Create or Enhance Your Self-Portrait With AI 🎭

This week’s challenge is all about visual AI creativity! Using an AI image-generation tool like Deep Dream Generator or Artbreeder, create an AI-generated self-portrait—but with a twist! Make it surreal, futuristic, or even turn yourself into a cyberpunk character.

🖌️ Steps to try:

Upload a picture of yourself (or describe yourself in words).
Apply a wild artistic filter—go for neon cyberpunk, medieval fantasy, or AI-enhanced realism.
Share your AI-created self-portrait and tag us! The best ones might get featured in next week’s Jumble. 💌

Here’s Deep Dream Generator’s enhancement of Katt Williams, what do you think?

❓ AI Quiz Answer:

D) An AI model once created a new mathematical theorem that humans hadn’t discovered.

In 2021, AI developed by DeepMind collaborated with mathematicians to uncover entirely new insights in pure mathematics, proving AI can contribute to fields traditionally dominated by human intuition.

Click below ⬇️

That’s it for this week! Do you trust AI benchmark results, or is xAI’s Grok 3 controversy proof that the system needs reform? And what do you think of Claude 3.7 Sonnet’s hybrid reasoning? Let us know your thoughts!

Zoe from Jumble