Anthropic Introduces Claude Sonnet 4.5
Anthropic has announced Claude Sonnet 4.5, which the company calls its strongest coding model to date and its “safest” system yet. Key claims in the release and press coverage:
- Benchmark performance: Sonnet 4.5 scored a record 61.4% on OSWorld, reportedly 17 percentage points higher than Opus 4.1.
- Long-running autonomy: Anthropic says the model can work autonomously on multi-step projects for more than 30 hours, up from roughly seven hours for Opus 4 at launch.
- Safety: Anthropic says Sonnet 4.5 underwent extensive safety training and is released under its AI Safety Level 3 (ASL-3) protections, with stronger defenses against prompt injection and reduced tendencies toward sycophancy, deception, power-seeking, and encouraging delusional thinking.
- Product updates: Claude Code received UI improvements and a new “checkpoints” feature for save/rollback during coding sessions. File creation in-chat and a Chrome extension (Claude for Chrome) are rolling out.
- API pricing: $3 per 1M input tokens and $15 per 1M output tokens at launch.
Sources: Anthropic announcement; additional coverage by TechCrunch and Tom’s Guide.
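To make the quoted launch pricing concrete, here is a minimal sketch of a cost estimate. It assumes a simple linear model using only the two per-token rates listed above; the helper name `estimate_cost_usd` and the workload sizes are hypothetical, and actual bills may differ.

```python
# Launch prices quoted in the announcement, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate at the quoted launch rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Illustrative workload (made-up numbers): 2M input tokens, 500k output tokens.
example = estimate_cost_usd(2_000_000, 500_000)
print(f"${example:.2f}")  # $13.50
```

Because output tokens cost five times as much as input tokens at these rates, output-heavy workloads such as long code generation will dominate the estimate.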
What this means
If Anthropic’s performance and safety claims hold up in independent testing, Sonnet 4.5 could shift the competitive balance among developer-facing coding agents and long-running AI workflows. The AI Safety Level 3 designation signals a focus on restricting high-risk outputs, which may affect how organizations adopt the model for sensitive tasks.
Questions for readers
- Do you trust vendor safety claims, or do you want independent audits first?
- Would a 30+ hour autonomous agent change your development workflow?
For the full official announcement: Anthropic — Claude Sonnet 4.5
