AI Dev Stack

anthropic

News about Anthropic and Claude

Claude 3.5 Sonnet Hits 49% on SWE-Bench: How Anthropic Built It

Claude 3.5 Sonnet hit 49% on SWE-bench Verified with a minimal agent scaffold: two tools, one prompt, maximum model control. Here's the exact architecture Anthropic used and why tool design matters as much as model capability.

Claude's "Think" Tool: When to Stop and Reason Mid-Task

Anthropic's "think" tool gives Claude structured reasoning checkpoints during execution. It delivered a 54% improvement on policy-heavy tasks and a 1.6% gain on SWE-bench. Here's when to use it and when to skip it.

Claude Code: Anthropic's Full-Stack AI Coding Assistant

Anthropic's Claude Code runs in terminal, IDE, browser, and mobile with session portability across all surfaces. Same context, different environments.

How Anthropic Built Claude's Multi-Agent Research System

Anthropic's multi-agent research system uses Claude Opus 4 as orchestrator with Sonnet 4 subagents, scoring about 90% higher than a single agent on Anthropic's internal research evaluation. Here's how they built it, the eight prompting principles that emerged, and why production was harder than the prototype.

Claude Desktop Extensions: One-Click MCP Server Installation

Anthropic just eliminated the biggest barrier to MCP server adoption. Desktop Extensions let you package local servers as one-click installs—no terminal, no config files, no dependency hell. The spec is open source, so any AI app can support it.

Writing Effective Tools for AI Agents—Using AI Agents

Anthropic's internal playbook for building MCP tools that agents actually use well. Their approach: let Claude Code optimize your tools against real evaluations. In Anthropic's tests, the optimized tools outperformed human-written implementations.

Inside Anthropic's Three-Week Infrastructure Nightmare

Three overlapping infrastructure bugs hit Claude in August 2025, affecting up to 16% of requests at their peak. Anthropic's detailed postmortem reveals why detection took weeks and what they're changing.