Claude Opus 4.7: Anthropic's AI Finds Decades-Old OS Flaws

Last week, Anthropic gathered twelve of the world’s largest technology companies to share an uncomfortable finding. Its most powerful AI model had spent several weeks autonomously identifying security flaws in widely used software, including vulnerabilities that had gone undetected for nearly three decades.

That disclosure came alongside the general release of Claude Opus 4.7. Anthropic is using Opus 4.7 to test the security controls it needs before it can responsibly release its more capable model, Claude Mythos Preview. For enterprise buyers, both developments matter.

Research from Gravitee, published in February 2026, found that 81% of enterprise teams have moved past the planning phase for AI agents. Yet only 14.4% have full security or IT approval for the agents they run. That governance gap looks considerably more serious in light of what Anthropic disclosed this week.

What Opus 4.7 changes for enterprise teams

The core problem with running AI agents at scale has always been reliability. Models that drop context between sessions, stall on complex tasks, or need supervising at every step eat up more time than they save.

Opus 4.7 addresses several of those issues. It checks its own outputs before reporting back, retains context across sessions, and follows instructions more precisely than its predecessor. For teams running multi-day workflows, that context retention matters most. Re-establishing background at the start of each session is a real operational cost that most productivity assessments overlook.

Enterprise testers reported measurable gains. Notion saw a 14% improvement on complex multi-step workflows with a third fewer tool errors. Notion also said it was the first model to pass its implicit-need tests, where the model works out requirements without explicit instruction. Ramp found it needed far less step-by-step guidance across tasks spanning multiple tools and codebases.

Image resolution has increased to more than three times that of previous Claude models. That makes document processing and dense interface work more practical. Those running Claude within Microsoft 365 will see that improvement across Teams, Outlook, and OneDrive workflows. Pricing stays at $5 per million input tokens and $25 per million output tokens.
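
For teams budgeting a rollout, the pricing above translates into a simple calculation. The sketch below is illustrative only: the function name and the example token volumes are hypothetical, and it assumes the flat per-million rates quoted above with no discounts or other adjustments.

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 5.00
OUTPUT_USD_PER_MILLION = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate spend for a workload from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens and 400,000 output tokens.
    print(f"${estimate_cost_usd(2_000_000, 400_000):.2f}")  # prints $20.00
```

Because output tokens cost five times as much as input tokens, the 400,000 output tokens in this example account for the same $10 as the 2 million input tokens.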

The security finding every IT leader needs to read

Anthropic’s Claude Mythos Preview, working autonomously, found thousands of critical zero-day vulnerabilities. These spanned every major operating system and web browser. One was a 27-year-old flaw in OpenBSD that let attackers remotely crash machines. Another was a bug in FFmpeg that automated testing tools had passed over five million times without flagging it. Maintainers have now fixed all of them.

As UC Today covered separately this week, the significance is not the individual bugs. It is that a capable AI model can now find serious vulnerabilities at scale, autonomously, and faster than any existing testing process. The average cost of a data breach stands at $4.4 million. Unified communications environments, built on browsers, shared media libraries, APIs, and virtualised infrastructure, sit squarely in scope.

Project Glasswing, Anthropic’s response, brings together AWS, Cisco, CrowdStrike, Google, Microsoft, Palo Alto Networks, and others. The group committed $100M in model credits to scanning and hardening critical software infrastructure. They also directed a further $4M to open-source security organisations. Microsoft, which has been building its own AI security agent infrastructure in parallel, joined as a founding member.

Opus 4.7 is the first Claude model to ship with automated safeguards that block high-risk cybersecurity uses. Anthropic describes it as a test bed for the controls needed before Mythos-class models can reach a wider audience. Security professionals with legitimate requirements can apply through the new Cyber Verification Programme.

Deloitte’s 2026 enterprise AI report found that only one in five companies has a mature governance model for autonomous AI agents. For IT and security leads, that figure and this week’s news belong in the same conversation.
