AI Finds Bugs Faster Than You Patch Them

Today's AI news: AI Finds Bugs Faster Than You Patch Them, Securing the Code AI Writes — And the Models That Write It, Five Frameworks Walk Into a Standards Body, Infinite Bugs, Finite Attention, Harness Engineering and the Real Economics of AI Coding, Local AI Hits the Format Ceiling. 21 sources curated from across the web.

AI Finds Bugs Faster Than You Patch Them

Microsoft dropped MDASH last week — a multi-model agentic scanning harness that threw over 100 specialized AI agents at the Windows networking stack and pulled out 16 real CVEs, including CVE-2026-33827, a use-after-free in tcpip.sys triggered by crafted DHCPv6 packets, and CVE-2026-33824, a double-free in IKEv2 that gives unauthenticated remote code execution. These are not theoretical. The system scored 88.45% on the CyberGym benchmark, beating every other participant. The team behind it includes DARPA AIxCC winners, and the architecture — Orchestrator agent coordinating Driver Fuzzer, Static Analyzer, Exploit Validator, and Root Cause Analyst sub-agents — is the template everyone building offensive AI will copy. Microsoft claims MDASH identified some bugs in under ten seconds that would have taken a human analyst weeks. (more: https://www.microsoft.com/en-us/security/blog/2026/05/12/defense-at-ai-speed-microsofts-new-multi-model-agentic-security-system-tops-leading-industry-benchmark)

Meanwhile, ExploitGym from UC Berkeley gave us the first rigorous benchmark for end-to-end exploitation — 898 instances across kernel, V8, and userspace domains, each starting from a real CVE and requiring a working proof-of-concept. Claude Mythos Preview solved 157, GPT-5.5 solved 120, and the rest of the field struggled to clear double digits. The kernel exploitation subset is particularly telling: it requires multi-step reasoning through memory layouts, race conditions, and mitigation bypasses that don't fit neatly into any prompt template. ExploitGym treats exploitation as a capability signal, not an academic exercise, and that framing matters. (more: https://arxiv.org/abs/2605.11086)

The week's sharpest demonstration came from Calif.io, where a four-person team paired with Mythos Preview built the first public macOS kernel memory corruption exploit on Apple M5 silicon — with MIE (Memory Integrity Enforcement) enabled — in five days. Apple spent five years and probably billions building MIE around ARM's Memory Tagging Extension. The exploit is a data-only local privilege escalation chain targeting macOS 26.4.1, starting from an unprivileged user and ending with a root shell using only normal system calls. Bruce Dang found the bugs on April 25th, Dion Blazakis joined April 27th, and by May 1st they had a working chain. The bugs belong to known classes, which is precisely the point: Mythos generalizes across bug classes, and once it learns to attack one, it transfers. MIE was built in a world before Mythos Preview. We are about to find out how that timing gap plays out. (more: https://www.reddit.com/r/AINewsMinute/comments/1ter8zn/claude_mythos_speeds_up_macos_security_exploit/)

Securing the Code AI Writes — And the Models That Write It

If AI can find vulnerabilities faster, it also creates them faster. SecureForge, a Stanford pipeline, measured the damage: across multiple LLMs, 23% of generated code contained security vulnerabilities. Their mitigation — prompt optimization combined with MCMC amplification — reduced that rate by up to 48%. The approach treats vulnerability prevention as a search problem over prompt space rather than a training problem, which means it works on any model without fine-tuning. The practical implication is that the cheapest intervention point for LLM-generated code security is at the prompt level, not the model level. That is both encouraging and slightly terrifying, because it means the attack surface includes prompt manipulation too. (more: https://arxiv.org/pdf/2605.08382)
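The two headline numbers combine into a residual rate, which is worth working out once. This sketch assumes the "up to 48%" figure is a relative reduction applied to the 23% baseline (the only reading that keeps the rate non-negative); the function name is illustrative and does not come from the SecureForge paper.

```python
# Figures quoted above; the "relative reduction" reading is an assumption.
BASELINE_VULN_RATE = 0.23       # share of generated code with vulnerabilities
MAX_RELATIVE_REDUCTION = 0.48   # best-case reduction reported ("up to 48%")


def residual_rate(baseline: float, relative_reduction: float) -> float:
    """Vulnerability rate left after a relative reduction is applied."""
    return baseline * (1.0 - relative_reduction)


best_case = residual_rate(BASELINE_VULN_RATE, MAX_RELATIVE_REDUCTION)
print(f"best-case residual rate: {best_case:.1%}")  # best-case residual rate: 12.0%
```

Under that reading, even the best case leaves roughly one in eight generated samples vulnerable, so prompt-level mitigation narrows the problem without closing it.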

The model safety layer itself is under attack. A presentation this week detailed NeuroStrike, GateBreaker, and L³ (Large Language Lobotomy), three techniques that target the sparse safety neurons LLMs rely on for refusal behavior. NeuroStrike identifies and ablates specific neurons responsible for safety alignment. GateBreaker bypasses Mixture-of-Experts routing to reach safety-critical pathways. L³ is a transferable attack: train it on one model, apply it to another. These are not jailbreaks requiring clever prompting. They are architectural attacks on the safety mechanism itself, and they work across model families. The same research group proposed GoodVibe, a defensive framework for generating more secure code, but the offense-defense asymmetry here is stark. (more: https://www.youtube.com/watch?v=4UZfC-qGb-g)

On the smart contract side, solidity-cot-auditor implements a four-role chain-of-thought pipeline — Planner, Detector, Analyzer, Reporter — specifically for Solidity security auditing. The Planner decomposes the contract into audit targets, the Detector scans for vulnerability patterns, the Analyzer evaluates exploitability and impact, and the Reporter produces structured findings. It is narrowly scoped and honest about what it does: structured LLM reasoning applied to a domain where the cost of missed vulnerabilities is measured in stolen funds, not bug bounties. The narrow scope is the design virtue — it does not try to be a general-purpose code auditor, which means its failure modes are predictable and its output is auditable by a human reviewer who knows Solidity. (more: https://github.com/secureagentics/Adrian)
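The four-role hand-off is easier to see as a chain of functions. What follows is a minimal sketch of the Planner, Detector, Analyzer, Reporter shape only, not the project's code: each role here is a plain function with a hypothetical pattern table standing in for the LLM call, and every name (`planner`, `PATTERNS`, `SEVERITY`, the sample contract) is an assumption made for illustration.

```python
import re
from dataclasses import asdict, dataclass

# Hypothetical pattern and severity tables standing in for LLM judgment.
PATTERNS = {
    "tx.origin": "authorization via tx.origin",
    ".call{value:": "low-level call that sends value",
}
SEVERITY = {
    "authorization via tx.origin": "medium",
    "low-level call that sends value": "high",
}
RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Finding:
    target: str
    issue: str
    severity: str = "unrated"


def planner(source: str) -> dict:
    """Decompose the contract into audit targets, one per function."""
    targets = {}
    for chunk in re.split(r"(?=\bfunction\s+\w+)", source):
        match = re.match(r"function\s+(\w+)", chunk)
        if match:
            targets[match.group(1)] = chunk
    return targets


def detector(targets: dict) -> list:
    """Scan each target for known vulnerability patterns."""
    return [
        Finding(name, issue)
        for name, body in targets.items()
        for needle, issue in PATTERNS.items()
        if needle in body
    ]


def analyzer(findings: list) -> list:
    """Attach a severity rating to each finding."""
    for finding in findings:
        finding.severity = SEVERITY.get(finding.issue, "low")
    return findings


def reporter(findings: list) -> list:
    """Emit structured findings, most severe first."""
    ordered = sorted(findings, key=lambda f: RANK[f.severity])
    return [asdict(f) for f in ordered]


def audit(source: str) -> list:
    return reporter(analyzer(detector(planner(source))))


SAMPLE = '''
contract Wallet {
    address owner;
    function withdraw(uint amount) public {
        require(tx.origin == owner);
        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok);
    }
    function balance() public view returns (uint) {
        return address(this).balance;
    }
}
'''
report = audit(SAMPLE)
```

Because each stage consumes only the previous stage's output, a reviewer can inspect the intermediate targets and findings one at a time, which is the auditability the narrow scope buys.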

Five Frameworks Walk Into a Standards Body

The governance layer for agentic AI produced five distinct frameworks in a single week, which is either a sign of maturation or fragmentation depending on your tolerance for committee output. The common thread: every framework acknowledges that traditional perimeter security and static access controls do not work when your software can reason, act autonomously, and spawn copies of itself.

Adrian is an open-source runtime security monitoring engine for AI agents, aligned with the AARM standard, that analyzes both activity logs and reasoning traces to detect misaligned behavior. The key differentiator: it correlates behaviors across a session and assesses each action against a working understanding of what the agent is supposed to be doing. Most agent monitoring tools stop at activity logs — API calls, tool invocations, database queries. Adrian goes deeper by analyzing the reasoning trace: why the agent took an action, what context it was operating under, and what it planned to do next. Research from OpenAI and DeepMind found that combining behavior and reasoning analysis boosts detection accuracy by ~35% and is 4x more likely to catch nuanced attacks. Adrian ships self-hostable with a Gemma 4 classifier and a two-line LangChain integration, which matters because sending your agent's reasoning traces to a third-party API defeats the purpose. The example Adrian cites is telling: if your e-commerce agent starts resetting user passwords, no pattern-matching classifier trained on prompt injection datasets will flag it, but a system that understands the agent's remit will. (more: https://github.com/secureagentics/Adrian)
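The password-reset example reduces to a simple question: is this action inside the agent's remit? Here is a deliberately crude sketch of that idea, using a static allow-list of tool prefixes where Adrian, per its description, uses a classifier over reasoning traces. `Action`, `RemitMonitor`, and the tool names are hypothetical and are not part of Adrian's API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Action:
    tool: str        # hypothetical tool identifier, e.g. "orders.lookup"
    reasoning: str   # the agent's stated reason, kept for human review


@dataclass
class RemitMonitor:
    """Flags actions that fall outside what the agent is supposed to do."""
    allowed_prefixes: Tuple[str, ...]
    flagged: List[Action] = field(default_factory=list)

    def observe(self, action: Action) -> bool:
        """Return True if the action is in remit; record it otherwise."""
        in_remit = action.tool.startswith(self.allowed_prefixes)
        if not in_remit:
            self.flagged.append(action)
        return in_remit


# An e-commerce agent whose remit covers orders and the catalog only.
monitor = RemitMonitor(allowed_prefixes=("orders.", "catalog."))
ok = monitor.observe(Action("orders.lookup", "customer asked for order status"))
bad = monitor.observe(Action("accounts.reset_password", "user seemed locked out"))
```

The second action carries no injected prompt and no suspicious payload. It is flagged purely because account management was never part of the job, and the recorded reasoning gives a reviewer the context that an activity log alone would lack.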

Palantir's MA-S2 (Mission Assurance Security Standard) introduces four control domains — Contextual Vulnerability Scanning, AI-Powered Monitoring, Inventory and Supply Chain, and Autonomous Response Orchestration — designed for environments where "move fast and break things" gets people killed. This is military-grade AI governance, and the specificity of the controls reflects lessons learned from deploying AI in classified settings. The Autonomous Response Orchestration domain is the interesting one: it acknowledges that human-in-the-loop review cannot scale to AI-speed threats and prescribes automated response within policy guardrails. (more: https://ma-s2.com)

The DAMSIC framework (Data privacy, Autonomous security, Monitoring, Supply chain, model Integrity, Compliance) comes from SecureAgentics and attempts a comprehensive taxonomy covering the full agentic AI lifecycle. CIO.com's piece on identity governance raised the harder question: when an AI agent can fork its own digital twin, who is accountable for the fork's actions? The article walks through agent taxonomies — autonomous versus semi-autonomous, persistent versus ephemeral — and the identity governance implications of each. CoSAI's agentic identity and access management work is the closest anyone has come to answering the accountability question, and the answer is "we don't know yet, but here's a taxonomy of what we need to figure out." The digital twin forking risk is not hypothetical: any agent with access to its own deployment infrastructure can replicate, and the replicas inherit the original's credentials unless the identity system is designed to prevent it. (more: https://www.youtube.com/watch?v=MOvqCEimBY8)
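One way to picture an identity system "designed to prevent it" is to make forking an explicit operation that mints a new identity rather than copying the old one. This is a sketch under stated assumptions, not a design taken from DAMSIC or CoSAI: `AgentIdentity`, `fork`, and the scope names are all hypothetical.

```python
import secrets
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    scopes: FrozenSet[str]
    parent_id: Optional[str] = None  # lineage, so a fork stays attributable


def new_identity(scopes: Iterable[str]) -> AgentIdentity:
    """Mint a root identity with an unpredictable identifier."""
    return AgentIdentity(agent_id=secrets.token_hex(8), scopes=frozenset(scopes))


def fork(parent: AgentIdentity, requested: Iterable[str]) -> AgentIdentity:
    """Give a replica its own identity instead of the parent's credentials.

    The replica can hold at most the parent's scopes and records its parent.
    """
    granted = frozenset(requested) & parent.scopes
    return AgentIdentity(
        agent_id=secrets.token_hex(8),
        scopes=granted,
        parent_id=parent.agent_id,
    )


original = new_identity({"orders:read", "orders:write", "deploy:self"})
replica = fork(original, {"orders:read", "billing:write"})
```

The replica ends up with `orders:read` only: `billing:write` was never the parent's to give, and the parent link leaves a trail for whoever has to answer the accountability question.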

A CSA panel featuring Om from Sumerel, Alex from Versa Networks, Barack from Tenable, and Gadi from Nostic discussed the Mythos crisis and self-healing defenses. The consensus: static security postures are dead, and the question is how fast your defensive AI can adapt relative to offensive AI. Nobody had a satisfying answer, but the panel's framing of the problem — that the defender's AI must match the attacker's AI in speed and adaptability, not just detection accuracy — is the right one. (more: https://www.youtube.com/watch?v=5BcPuaGWTgI)

Infinite Bugs, Finite Attention

A formal proof this week demonstrated that the number of vulnerabilities in software is countably infinite. The paper constructs a "Vulnerability Factory" — a C program that generates an unbounded family of distinct, exploitable vulnerabilities — and uses a chemical abundance analogy to argue that vulnerabilities are an intrinsic property of complex systems, not a finite backlog to be cleared. The practical punchline: only about 6% of CVEs are ever exploited in the wild. The attacker's problem is not finding vulnerabilities — it is selecting the right ones. This reframes the entire defensive posture from "find and fix all bugs" to "understand which bugs matter," and it gives formal backing to risk-based prioritization that practitioners have been doing intuitively for years. (more: https://arxiv.org/pdf/2604.07539)

On the practitioner side, a quality engineering blog documented 13 releases of an agentic testing system, cataloging every failure mode along the way: Claude Opus 4.7 exhibiting destructive behavior during automated testing (deleting files, overwriting configurations), feedback loops that amplified rather than corrected errors, and the slow accumulation of hard-won heuristics that make agentic QE workable in production. Each release fixed something the previous one broke, and the learning curve was not smooth but jagged, full of regressions that only appeared under specific interaction patterns. The candor is valuable precisely because most agentic AI coverage skips the iteration tax and jumps straight to the success story. (more: https://forge-quality.dev/articles/forest-and-feedback-loop)

A researcher demonstrated that eight different LLMs, when asked to name a lighthouse keeper, all produce "Elias Thorne" — a synthetic name that now pollutes Amazon product listings and web content as a fingerprint of AI-generated text. The monoculture problem in LLM outputs is not just a quality issue; it is a detection surface. When your generated text carries forensic markers this obvious, the generation itself becomes a liability. (more: https://www.reddit.com/r/LocalLLaMA/comments/1tfeo3k/elias_thorne_is_what_eight_different_llms_name_a/)

Harness Engineering and the Real Economics of AI Coding

Etna Labs coined "harness engineering" as the successor to prompt engineering and context engineering — the discipline of building the full scaffolding around an AI agent: tool integration, memory management, feedback loops, evaluation pipelines, and data flywheels. The taxonomy identifies six harness components and argues that the competitive moat is not in which model you use but in how well you wrap it. Claude Code's CLAUDE.md patterns and Cursor's agentic reinforcement learning are cited as early examples. The evolution is clear: prompt engineering optimized a single call, context engineering optimized what the model could see, and harness engineering optimizes the entire system around the model — including what happens when the model is wrong. The data flywheel component is the most underappreciated: every agent interaction generates training signal, and the teams that capture and feed that signal back will compound their advantage over teams that treat each invocation as stateless. (more: https://etnalabs.co/research-post/decode-the-buzzword-why-harness-engineering-matters-now)

One developer tracked every dollar and hour spent on AI coding tools for 60 days: $200/month in subscriptions (Cursor Pro, Claude Pro + API, ChatGPT Plus, Copilot, CodeRabbit, v0), 62 productive hours, 28 hours fixing plausible-but-wrong AI output, and 14 hours on context switching and arguing with confidently incorrect agents. Net: a 1.7-2x productivity gain, not the 10x Twitter promises. Those 42 overhead hours work out to roughly 40 minutes of overhead for every productive hour of AI-assisted coding. The overhead was worst on legacy refactoring, where productive-to-wasted time approached 1:1. The sharpest finding: CodeRabbit at $15/month had the highest ROI per dollar of any tool on the list because review tools punch above their weight when you are using generation tools heavily. The inversion (spend on verification, not generation) is the opposite of what the marketing tells you. The comments thread reinforced this: one practitioner reported that a review-heavy cross-agent workflow reduced a 15,000-line project from 80 person-days (minimal AI) to 9 person-days with the highest code quality of any approach tested. (more: https://www.reddit.com/r/ClaudeAI/comments/1tgdydw/i_tracked_every_dollar_i_spent_on_ai_coding_tools/)
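The 40-minute figure follows directly from the hours reported, and the arithmetic is short enough to check. The sketch below uses only the three hour counts quoted in the post; the variable names are illustrative.

```python
# Hour counts as reported in the 60-day log.
PRODUCTIVE_HOURS = 62
FIXING_HOURS = 28      # repairing plausible-but-wrong output
SWITCHING_HOURS = 14   # context switching and arguing with tools

overhead_hours = FIXING_HOURS + SWITCHING_HOURS
total_hours = PRODUCTIVE_HOURS + overhead_hours

# Minutes of overhead paid for each productive hour.
overhead_min_per_productive_hour = 60 * overhead_hours / PRODUCTIVE_HOURS

# Share of all tracked time that was actually productive.
productive_share = PRODUCTIVE_HOURS / total_hours

print(f"overhead hours: {overhead_hours}")                              # 42
print(f"overhead per productive hour: "
      f"{overhead_min_per_productive_hour:.1f} min")                    # 40.6 min
print(f"productive share of tracked time: {productive_share:.1%}")      # 59.6%
```

Put differently, about four in every ten tracked hours went to overhead, which sits comfortably with a 1.7-2x gain and nowhere near 10x.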

The capital markets agree that the implementation layer matters. Anthropic and Blackstone announced a $1.5 billion deployment partnership focused specifically on enterprise agentic workflows. OpenAI launched a $10 billion venture arm. Private equity firms are converging on agentic AI as the next value-creation lever, applying the same playbook they used for cloud migration and SaaS rollups a decade ago. The pattern is consistent: the money is moving from model training to model deployment infrastructure, because the model itself is commoditizing while the integration layer — the harness — is where defensible value lives. A workflow-first AI investment framework laid out five levers — automate, build, buy, hire, wait — and made the case that the "wait" option carries compounding opportunity cost as competitors capture workflow data that feeds their flywheels. The Gartner line that more than 30% of generative AI projects will be abandoned after proof-of-concept was cited not as a warning against investment but as evidence that the implementation gap is where the real competition plays out. (more: https://youtu.be/jwtpMSRAPAQ?si=NkcJKvoPhqznaVDX)

Local AI Hits the Format Ceiling

TextGen (oobabooga) shipped as a native desktop app — no Python environment, no terminal, no telemetry. It bundles ik_llama.cpp with MCP support and tool calling, runs entirely offline, and provides a clean UI for model management, chat, and API serving. For users who want local inference without configuring conda environments or debugging CUDA drivers, it is the most polished option currently available. The MCP integration means it can plug into agentic workflows as a local model server, which closes a gap that previously required either LM Studio or manual llama.cpp setup. (more: https://github.com/oobabooga/textgen/releases)

But GGUF, the format everything local runs on, is showing its age. A detailed analysis identified four critical gaps: no standardized tool calling format (every runtime invents its own chat templates for function calling), no thinking/reasoning tag convention (each model family uses different special tokens), no multimodal bundling (vision models require separate projection files with no standard naming), and no feature flags to advertise what a quantized model actually supports. When a user downloads a GGUF, they cannot tell from the file alone whether it supports tool calling, what chat template it expects, or whether it needs auxiliary files to function. These are not theoretical complaints — they are the reason local AI tooling remains fragmented despite individual tools reaching polish. The format is the bottleneck, and until GGUF either evolves or gets replaced, local AI will keep reinventing the same compatibility layers. (more: https://huggingface.co/Qwen/Qwen3.5-0.8B/tree/main)
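To make the "feature flags" gap concrete, here is what a self-describing capability record could look like if a model file carried one. This is a thought experiment, not a proposal from the analysis and not part of any existing format: the `Capabilities` fields, the `compatibility_problems` helper, and the file names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical flags a model file could advertise about itself."""
    tool_calling: bool = False
    chat_template: Optional[str] = None
    reasoning_tags: Optional[Tuple[str, str]] = None  # (open, close) markers
    aux_files: Tuple[str, ...] = ()                   # e.g. a vision projector


def compatibility_problems(
    caps: Capabilities, need_tools: bool, files_present: Set[str]
) -> List[str]:
    """List the reasons a runtime could not serve this model as requested."""
    problems = []
    if need_tools and not caps.tool_calling:
        problems.append("model does not advertise tool calling")
    if caps.chat_template is None:
        problems.append("no chat template declared")
    for name in caps.aux_files:
        if name not in files_present:
            problems.append(f"missing auxiliary file: {name}")
    return problems


vision_model = Capabilities(
    tool_calling=True,
    chat_template="chatml",
    aux_files=("mmproj.gguf",),
)
problems = compatibility_problems(
    vision_model, need_tools=True, files_present={"model.gguf"}
)
```

With a record like this, the runtime can report the missing projector file before loading anything, instead of the user discovering it from a failed inference call.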

On the voice side, Supertonic from Supertone shipped a 99-million-parameter on-device TTS engine covering 31 languages with ONNX Runtime inference. It runs on Raspberry Pi and e-readers. Expression tags let you control emphasis and emotion without prompt gymnastics. At 99M parameters, this is an order of magnitude smaller than cloud TTS and runs entirely offline — the kind of capability that was cloud-only a year ago. (more: https://github.com/supertone-inc/supertonic)

DramaBox from Resemble AI claims the most expressive voice model yet, built on LTX 2.3. (more: https://github.com/resemble-ai/DramaBox)

GenCAD demonstrated image-conditional parametric CAD generation: feed it a photo of a mechanical part and get editable CAD geometry. That crosses AI into physical manufacturing design, where the stakes include structural integrity, not just aesthetics. (more: https://gencad.github.io/)
