Zero-Days, Copilot Leaks, and the Expanding AI Attack Surface

Today's AI news: Zero-Days, Copilot Leaks, and the Expanding AI Attack Surface; AI for Defenders — From Code Scanning to Pen Testing; The Dark Factory — When Agents Write All the Code; Agentic Dev Tools Mature; Frontier Benchmarks — Voice, Speed, and Scale; Verifiable Inference and the New AI Data Stack. 22 sources curated from across the web.

Zero-Days, Copilot Leaks, and the Expanding AI Attack Surface

Google shipped a Chrome stable channel update this week patching three high-severity vulnerabilities: CVE-2026-2648, a heap buffer overflow in PDFium; CVE-2026-2649, an integer overflow in V8 reported by a KAIST Hacking Lab researcher; and CVE-2026-2650, a heap buffer overflow in Media. The V8 integer overflow is the one to watch — V8 bugs are bread and butter for exploit chains — and the same release cycle drew a separate Hacker News thread flagging a CSS-related zero-day (CVE-2026-2441) already being exploited in the wild. Chrome 145.0.7632.109/110 is the fix for Windows and Mac; Linux gets 144.0.7559.109. If you haven't force-updated yet, now is the time. (more: https://chromereleases.googleblog.com/2026/02/stable-channel-update-for-desktop_13.html)

Meanwhile, Microsoft confirmed a bug in Copilot that caused it to summarize confidential emails — messages the user was not authorized to view — and surface that content in Copilot responses. This is not a theoretical oversharing risk; it is an actual data leakage incident where the AI assistant breached access boundaries the organization had explicitly configured. Microsoft attributed it to a bug, not a design flaw, but the distinction matters less when the confidential content has already been presented to an unauthorized user. The same BleepingComputer report noted a broader security landscape in motion: Amazon disclosed an AI-assisted attacker who breached 600 FortiGate firewalls across 55 countries in five weeks, and researchers identified PromptSpy as the first known Android malware to use generative AI at runtime, leveraging Google's Gemini model to adapt its persistence mechanisms per device. (more: https://www.bleepingcomputer.com/news/microsoft/microsoft-says-bug-causes-copilot-to-summarize-confidential-emails/)

The attack surface is not just expanding through bugs in existing products — it is being deliberately designed into new protocols. WebMCP, an experimental browser API in Chrome 146+ that lets websites expose JavaScript functions as "tools" for AI agents, introduces what one security researcher calls the "data ferry" problem. An AI agent can invoke a tool on your bank's site, receive your balance into its context, then pass that data as parameters to a tool on a malicious site. The same-origin policy exists precisely to prevent scripts on one origin from reading data from another; WebMCP introduces the AI agent as a trusted intermediary that can move data between origins through legitimate tool invocations. There is no mechanism to verify that a tool's actual behavior matches its description — a tool described as "preview order" could execute a purchase — and tools can request extensive user data under the guise of personalization. The WebMCP authors acknowledge these risks in their GitHub issues and are exploring mitigations, but the protocol itself specifies few controls, leaving implementation to browsers and agent developers. Security teams should be tracking this now, before it ships to stable. (more: https://www.linkedin.com/posts/idan-habler_webmcp-share-7430331975237857281-ZG-v)
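One mitigation agent runtimes could apply is taint-tracking tool results by origin and refusing to ferry tainted data into a tool call on a different origin. The sketch below is hypothetical — WebMCP specifies no such mechanism, and the `Tainted`/`FerryGuard` names, origins, and allow-list policy are all invented for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tainted:
    """A tool result tagged with the origin that produced it."""
    value: str
    origin: str

class FerryGuard:
    """Blocks cross-origin parameter flows unless explicitly allowed."""
    def __init__(self, allowed_flows=None):
        self.allowed_flows = set(allowed_flows or [])  # (src, dst) origin pairs

    def check_call(self, dst_origin, params):
        for p in params:
            if isinstance(p, Tainted) and p.origin != dst_origin:
                if (p.origin, dst_origin) not in self.allowed_flows:
                    raise PermissionError(
                        f"blocked: data from {p.origin} -> tool on {dst_origin}")
        return True

guard = FerryGuard(allowed_flows=[("bank.example", "budget.example")])
balance = Tainted("4,312.07", "bank.example")

assert guard.check_call("budget.example", [balance])  # user-approved flow
blocked = False
try:
    guard.check_call("evil.example", [balance])       # the data ferry attempt
except PermissionError:
    blocked = True
assert blocked
```

The point of the sketch is that the check happens at the agent layer, where origin provenance is still known — by the time parameters reach the destination site's JavaScript, that information is gone.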

AI for Defenders — From Code Scanning to Pen Testing

Anthropic launched Claude Code Security in a limited research preview this week, and the framing is worth noting: this is not a static analysis tool. It reads and reasons about code the way a human security researcher would — tracing how data moves through an application, understanding component interactions, and catching complex vulnerabilities that rule-based scanners miss, particularly flaws in business logic and broken access control. Every finding passes through a multi-stage verification process where Claude re-examines each result, attempting to prove or disprove its own findings before surfacing them. Validated findings appear in a dashboard with severity ratings and confidence scores, and suggested patches require human approval before application. Anthropic reports that using Claude Opus 4.6, their Frontier Red Team found zero-day vulnerabilities in production open-source codebases — bugs that had gone undetected for decades despite years of expert review. Enterprise and Team customers get early access; open-source maintainers can apply for free expedited access. (more: https://www.anthropic.com/news/claude-code-security)

The offensive side of AI security is evolving just as fast. A detailed walkthrough of AI penetration testing by Jason Haddix — who literally wrote the AI pen testing methodology — demonstrates how far the field has come beyond "making ChatGPT say bad words." His team's Agent Breaker challenges present actual AI-integrated applications (portfolio advisors, trip planners, corporate messaging) where attackers must exploit LLM-enabled search bars to leak system prompts, extract API keys, and exfiltrate confidential data from RAG databases. In one real client engagement recreated as a CTF, a search bar yielded a Jira project access token and patent acquisition data. The non-deterministic nature of LLMs means the same attack prompt might succeed on attempt 3 or attempt 239 — a fundamentally different testing paradigm from traditional security. A 12-year-old solved the entire five-flag CTF in 35 minutes at BSides San Francisco, which Haddix considers "entry level." (more: https://youtu.be/_yfiUQSbdPY?si=EhpIKaJK5SdU2xIe)
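That "attempt 3 or attempt 239" point has a simple statistical consequence: testing a non-deterministic target means reasoning about per-attempt success probabilities and trial budgets, not single shots. A minimal sketch (the 2% leak rate and the `leaky` stand-in target are illustrative assumptions, not figures from the engagement):

```python
import math
import random

def trials_needed(p_success, target_confidence=0.95):
    """Attempts needed so at least one success occurs with probability
    >= target_confidence, when each attempt succeeds with p_success.
    Solves 1 - (1 - p)^n >= c for n."""
    return math.ceil(math.log(1 - target_confidence) / math.log(1 - p_success))

def estimate_success_rate(attack, n=200, seed=0):
    """Replay one attack prompt n times against a non-deterministic
    target and report the observed success fraction."""
    rng = random.Random(seed)
    hits = sum(attack(rng) for _ in range(n))
    return hits / n

# Stand-in for an LLM target that leaks its system prompt ~2% of the time.
leaky = lambda rng: rng.random() < 0.02

assert trials_needed(0.02) == 149   # ~149 replays for 95% odds of one leak
rate = estimate_success_rate(leaky)
assert 0.0 <= rate <= 1.0
```

A scanner built on single-shot probes would mark that target clean almost every run — which is why AI pen testing harnesses replay the same payload dozens to hundreds of times.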

On the tooling side, Lonkero — a Rust-based web security scanner from Finnish firm Bountyy — shipped v3.6 with a proof-based XSS detection engine that requires no browser at all. Instead of Chrome-based scanning, it uses mathematical proof of exploitability: send a canary, probe escaping behavior across 16 reflection contexts, then derive whether XSS is exploitable from context plus escaping analysis. The result is 300x faster scanning (~200ms per URL versus 60+ seconds with Chrome), 2-3 requests per parameter, and zero browser stability issues. The scanner's intelligence system uses Bayesian hypothesis testing to form and refine vulnerability theories — starting from a prior probability based on parameter name and context, then updating via Bayes' theorem with each test — cutting payloads from 5,000 to 800 while increasing detection. Its v3.1 also introduced a README invisible prompt injection scanner that detects hidden instructions in markdown files invisible to rendered views but readable by LLMs processing raw source. (more: https://github.com/bountyyfi/lonkero)
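The Bayesian loop described above — prior from parameter name and context, posterior updated per probe — reduces to repeated applications of Bayes' theorem. A minimal sketch, with the prior and likelihood values invented for illustration (Lonkero's actual numbers are not published in this form):

```python
def bayes_update(prior, p_obs_given_vuln, p_obs_given_safe):
    """Posterior P(vulnerable | observation) via Bayes' theorem."""
    num = p_obs_given_vuln * prior
    den = num + p_obs_given_safe * (1 - prior)
    return num / den

# Illustrative prior: a parameter named "q" in an HTML context.
prior = 0.30

# Probe 1: canary reflected unescaped — likely if vulnerable, rare if the
# app escapes properly (both likelihoods are assumptions).
posterior = bayes_update(prior, p_obs_given_vuln=0.90, p_obs_given_safe=0.05)
assert round(posterior, 3) == 0.885

# Probe 2: a follow-up payload comes back entity-encoded, so the theory
# is revised downward instead of firing a false positive.
posterior = bayes_update(posterior, p_obs_given_vuln=0.10, p_obs_given_safe=0.80)
assert posterior < 0.5
```

Because low-posterior hypotheses can be abandoned early, the scanner can justify skipping most of its payload corpus — the mechanism behind the 5,000-to-800 payload reduction claimed above.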

The RSAC '26 preview underscores the governance challenge: agentic actions are highly fluid, dynamic, and unpredictable, requiring defenses that are equally context-aware. Non-human identity management is emerging as a critical gap — agents currently proxy human identities, which does not scale. The unsolved problem is just-in-time privilege management: granting agents legitimate access at the moment they need it without standing in the way of productivity. (more: https://www.linkedin.com/posts/scottacrawford_rsac-activity-7430649444208967681-DVa6) The Cloud Security Alliance released a CISO memo template specifically for personal AI desktop agents, covering risk context, compliance considerations, and policy requirements (more: https://cloudsecurityalliance.org/artifacts/policy-on-personal-ai-desktop-agents).

The Dark Factory — When Agents Write All the Code

A detailed analysis of autonomous software development maps what the industry calls "the five levels of vibe coding," from spicy autocomplete (Level 0) to the dark factory (Level 5) where no human writes or reviews code. The most documented Level 5 operation is StrongDM's software factory: three engineers, no sprints, no standups, no Jira board. They write markdown specifications; an agent called Attractor reads them, writes the code, and tests it. Critically, StrongDM does not use traditional tests — they use external "scenarios" that live outside the codebase, functioning as a holdout set the agent never sees during development, preventing the AI from optimizing for test passage rather than correctness. A "digital twin universe" of behavioral clones (simulated Okta, Jira, Slack, Google Docs) provides full integration testing without touching production. The output — CXDB, their AI context store — is 16,000 lines of Rust, 9,500 lines of Go, and 700 lines of TypeScript, all shipped in production. (more: https://youtu.be/bDcgHzCBgmQ?si=b1M3ygYB0cOn3G4o)

The numbers backing this up are real. Anthropic estimates that 90% of Claude Code's codebase was written by Claude Code itself, with the company converging toward 100% AI-generated code. OpenAI reports Codex 5.3 was instrumental in creating itself, yielding a 25% speed improvement and 93% fewer wasted tokens. Four percent of public GitHub commits are now directly authored by Claude Code, a figure Anthropic expects to exceed 20% by year-end. And yet a rigorous 2025 METR randomized control trial found experienced open-source developers using AI tools completed tasks 19% slower — while believing they were 24% faster. The gap between frontier teams running dark factories and everyone else stuck at Level 2, getting measurably slower while convinced they are speeding up, is the most important gap in tech right now.

This tension has deep theoretical roots. Ikenna Okpala's series on probabilistic engineering in deterministic systems frames the core problem: software engineering was built to eliminate ambiguity — compilers, type systems, CI, code review, reproducible builds — and now we are injecting probabilistic engines and asking them to ship deterministic outcomes. If your evaluator is also probabilistic, you have not built a truth oracle; you have built a circular validation trap. Two models agreeing is correlation, not proof. Okpala's prescription: represent your codebase as a structure with constraints, measure whether a proposed change can exist inside that structure, and block violations before they land — governance through topology, not language. (more: https://www.linkedin.com/pulse/why-probabilistic-engineering-breaks-deterministic-systems-okpala-7ysde)
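Okpala's "governance through topology" can be made concrete as a tiny structural gate: the codebase is a layered graph, architectural rules are constraints on its edges, and any proposed dependency the structure forbids is rejected before review. The layer names, file names, and rules below are invented for illustration:

```python
# Layer -> layers it is allowed to depend on (an assumed architecture).
ALLOWED = {
    "api":     {"service"},
    "service": {"repo"},
    "repo":    set(),
}

# Which layer each module belongs to (also assumed).
LAYER_OF = {"handlers.py": "api", "orders.py": "service", "db.py": "repo"}

def violations(proposed_edges):
    """Return every proposed dependency (src imports dst) the topology forbids."""
    bad = []
    for src, dst in proposed_edges:
        if LAYER_OF[dst] not in ALLOWED[LAYER_OF[src]]:
            bad.append((src, dst))
    return bad

# An AI-generated change wiring the API straight to the database is blocked
# structurally, no matter how plausible the diff reads or how many models
# agree it looks fine.
change = [("handlers.py", "orders.py"), ("handlers.py", "db.py")]
assert violations(change) == [("handlers.py", "db.py")]
```

The check is deterministic even though the code generator is not — which is exactly the property a probabilistic pipeline needs its gatekeeper to have.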

Agentic Dev Tools Mature

FastCode, from HKU's Data Science Lab, claims to deliver 3-4x faster code understanding than Cursor or Claude Code at 44-55% lower cost, using a three-phase framework: semantic-structural code representation (AST-based parsing across 8+ languages, hybrid BM25/embedding index, multi-layer graph modeling), lightning-fast navigation (two-stage search, code skimming via function signatures rather than full files), and budget-aware context management that considers confidence, query complexity, codebase size, cost, and iteration count before deciding what to load. The key insight is "scouting-first" — build a semantic map, navigate the structure, then load only targeted files, rather than the traditional approach of repeatedly loading and searching. It supports multi-repository reasoning and runs with local models like qwen3-coder-30b. (more: https://github.com/HKUDS/FastCode)
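The hybrid BM25/embedding index at the heart of that first phase blends a lexical score with a semantic one. A self-contained sketch of the general technique — textbook BM25 plus cosine similarity, with the blend weight `alpha`, the toy corpus, and the 2-d vectors all illustrative assumptions rather than FastCode's actual implementation:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Textbook BM25 for one document against a small corpus of term lists."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc_terms)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)
        if df == 0:
            continue
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        denom = tf[t] + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * tf[t] * (k1 + 1) / denom
    return score

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))

def hybrid(query_terms, q_vec, doc_terms, d_vec, corpus, alpha=0.5):
    """Blend lexical and semantic signals; alpha is a tuning assumption."""
    return alpha * bm25_score(query_terms, doc_terms, corpus) \
        + (1 - alpha) * cosine(q_vec, d_vec)

corpus = [["parse", "ast"], ["http", "client"], ["parse", "http"]]
s = hybrid(["parse"], (1.0, 0.0), ["parse", "ast"], (0.9, 0.1), corpus)
assert s > hybrid(["parse"], (1.0, 0.0), ["http", "client"], (0.0, 1.0), corpus)
```

The lexical term catches exact identifier matches that embeddings blur; the embedding term catches renames and synonyms that BM25 misses — which is why code search tools tend to run both.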

The Claude Code ecosystem itself is generating a rich best-practices culture. A community-maintained guide crystallizes hard-won lessons: keep CLAUDE.md under 150 lines, use feature-specific subagents with skills instead of generic agents, do manual /compact at 50% context, always start with plan mode, make subtasks small enough to complete in under 50% context, and commit as soon as each task completes. The guide also catalogs a command-skill-agent architecture for progressive disclosure: commands as entry points, agents as orchestrators with preloaded skills, and skills as domain knowledge injected at startup. (more: https://github.com/shanraisshan/claude-code-best-practice) Power users running multiple parallel Claude Code sessions report that the biggest workflow gap is reviewing what the agent built across many files — scrolling through terminal output and git logs to piece together changes. SonarSource's Kintsugi, an Agentic Development Environment built on top of Claude Code rather than replacing it, addresses this with a visual queue for managing parallel sessions, plan review before execution, inline comments, and integrated code review powered by SonarQube's quality engine. (more: https://www.linkedin.com/posts/cole-medin-727752184_i-use-claude-code-every-day-its-by-far-activity-7430746557152546816-sT4M) (more: https://events.sonarsource.com/kintsugi/?s_category=Paid&s_source=Paid%20Other&s_origin=influencer)

GitNexus takes a different approach to AI code intelligence: it turns a repository into an AST-driven knowledge graph directly in the browser using Tree-sitter for parsing and KuzuDB for storage, with a local model generating embeddings for semantic search over files, symbols, and call chains. Every step runs client-side — code never leaves the machine — while exposing an MCP (Model Context Protocol) interface so agents like Claude Code can request relational context ("all call paths touching this function across services") instead of streaming arbitrary files into context windows. (more: https://www.linkedin.com/posts/alindnbrg_ai-devtools-codeintelligence-ugcPost-7430181056034676736-O_MN)
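The "relational context instead of raw files" idea is easy to see in miniature. GitNexus does this with Tree-sitter across languages; as a rough stand-in, the sketch below builds the same kind of call-graph edges from Python source using the stdlib `ast` module (the three-function sample program is invented), then answers the "all call paths into this function" query:

```python
import ast
from collections import defaultdict

SRC = """
def load(path): return open(path).read()
def parse(path): return load(path).split()
def main(): return parse("data.txt")
"""

def call_edges(source):
    """caller -> set of directly-called function names (by-name calls only)."""
    tree = ast.parse(source)
    edges = defaultdict(set)
    for fn in (n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)):
        for node in ast.walk(fn):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges[fn.name].add(node.func.id)
    return edges

def callers_of(target, edges):
    """Transitive callers: every function from which `target` is reachable."""
    found, frontier = set(), {target}
    while frontier:
        nxt = {src for src, dsts in edges.items() if dsts & frontier}
        frontier = nxt - found
        found |= nxt
    return found

edges = call_edges(SRC)
assert callers_of("load", edges) == {"parse", "main"}
```

An agent that can ask `callers_of("load")` pulls three lines of structured answer into its context instead of streaming the whole repository — the economics GitNexus's MCP interface is built around.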

Frontier Benchmarks — Voice, Speed, and Scale

Daily released an open-source benchmark (aiewf-eval) specifically targeting LLM performance for voice agents — testing tool calling, instruction following, and factual grounding over 30-turn conversations, which standard benchmarks do not cover. The headline finding: three models now saturate the benchmark at 100% on every metric, but all three are too slow for production voice agents. Natural conversation requires voice-to-voice response times under 1,500ms, translating to roughly 700ms time-to-first-token for text-mode LLMs. Most production voice agents today still run on GPT-4.1 or Gemini 2.5 Flash — models released in April 2025, ancient by AI timelines. AWS Nova 2 Pro now matches their performance and latency, bringing fully capable enterprise voice agents onto AWS for the first time. On the speech-to-speech front, Ultravox is the first model to perform well on long multi-turn benchmarks, and it is open-weights; the 30-billion-parameter Nemotron 3 Nano, meanwhile, nearly matches GPT-4o performance. Both results validate the thesis that open weights will gain significant market share in 2026. (more: https://www.daily.co/blog/benchmarking-llms-for-voice-agent-use-cases)
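The 1,500ms-to-700ms translation is a budget allocation across the voice pipeline. The component figures below are illustrative assumptions (only the 1,500ms total and ~700ms TTFT come from the article), but they show why a model with a 1.2-second time-to-first-token is disqualified regardless of benchmark scores:

```python
# Voice-to-voice latency budget for a cascaded (STT -> LLM -> TTS) agent.
BUDGET_MS = 1500

pipeline = {
    "speech_to_text (final transcript)": 300,   # assumed
    "llm time_to_first_token":           700,   # from the article
    "text_to_speech (first audio)":      250,   # assumed
    "network + buffering":               200,   # assumed
}

total = sum(pipeline.values())
assert total == 1450 and total <= BUDGET_MS
headroom = BUDGET_MS - total
assert headroom == 50   # almost nothing left for a slower model
```

Swap in a frontier model with 1,200ms TTFT and the total jumps to 1,950ms — the conversation audibly stalls, which is the saturated-but-too-slow trap the benchmark exposes.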

Claude Opus 4.6 surged past forecasts on METR's 50% Time-Horizon Benchmark, which measures how long (in human-equivalent hours) a software task can be that an LLM completes successfully half the time. Opus 4.6 leads at roughly 13+ hours — meaning it can reliably complete tasks that would take a skilled human over half a day — a massive jump from GPT-5.2 at ~7 hours and Opus 4.5 at ~5 hours. Earlier models clustered near 30 minutes to 1 hour. The trajectory is clearly exponential, with models doubling effective task complexity every few months. (more: https://www.reddit.com/r/AINewsMinute/comments/1rap978/claude_opus_46_surges_past_forecasts_on_metrs_50/) On the open-weight side, MiniMax-M2.5 with 10B active parameters achieves 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp, with Unsloth's dynamic GGUF quantization compressing the model to fit on consumer hardware — the 3-bit dynamic quant runs at 20+ tokens/s on 128GB RAM. (more: https://unsloth.ai/docs/models/minimax-m25)

Verifiable Inference and the New AI Data Stack

Jolt Atlas, a zero-knowledge machine learning (zkML) framework from a team building on the Jolt proving system, tackles one of the hardest open problems in trustworthy AI: cryptographic proof that a model inference actually ran the claimed model on the claimed inputs, without revealing model weights or input data. Unlike zkVMs that emulate CPU instruction execution, Jolt Atlas operates directly on ONNX tensor operations — the open-source, portable format used across PyTorch, TensorFlow, and JAX. The key innovation is using lookup arguments via the sumcheck protocol for non-linear functions (Softmax, ReLU), which are the dominant cost in zero-knowledge proofs of neural networks. Neural teleportation compresses activation lookup tables by shrinking the effective input range, and prefix-suffix decomposition enables streaming provers that reduce peak memory from O(N) to roughly O(sqrt(N)), enabling on-device proving without specialized hardware. Benchmarks show Jolt Atlas proving nanoGPT inference in 14 seconds versus ezkl's 237 seconds — a 17x speedup on proof generation alone. The companion paper outlines use cases for agentic commerce guardrails, where autonomous AI agents making financial decisions must provide cryptographic proof of what model produced which recommendation. (more: https://arxiv.org/abs/2602.17452v1)
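The lookup arguments mentioned above ride on the sumcheck protocol, which in its standard form lets a verifier check a huge sum with one polynomial evaluation. A brief sketch of the classic protocol (standard material, not specific to Jolt Atlas's variant):

```latex
% The prover claims a sum of a multivariate polynomial g over the Boolean
% hypercube -- naively 2^n evaluations for the verifier:
\[
  H \;=\; \sum_{x_1 \in \{0,1\}} \sum_{x_2 \in \{0,1\}} \cdots
          \sum_{x_n \in \{0,1\}} g(x_1, x_2, \dots, x_n).
\]
% In round i the prover sends the univariate polynomial
\[
  g_i(X) \;=\; \sum_{x_{i+1}, \dots, x_n \in \{0,1\}}
               g(r_1, \dots, r_{i-1}, X, x_{i+1}, \dots, x_n),
\]
% and the verifier checks the round-consistency condition
\[
  g_i(0) + g_i(1) \;=\; g_{i-1}(r_{i-1}),
\]
% with g_0 read as the claim H, then samples a random challenge r_i.
% After n rounds the verifier needs only the single evaluation
% g(r_1, ..., r_n) -- the reduction that makes per-operation lookup
% arguments for Softmax and ReLU tables affordable.
```

Everything non-linear in the network becomes a table lookup whose correctness is attested through sums of this shape, which is why sumcheck-friendly lookups dominate the cost profile Jolt Atlas optimizes.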

At the infrastructure layer, RTI's GENESIS framework brings DDS (Data Distribution Service) — the same middleware powering flight control systems, FDA-cleared surgical robots, and autonomous vehicle sensor fusion — to distributed AI agent networks. The key feature is zero-configuration discovery: agents find each other automatically without IP addresses, ports, or configuration files, communicating peer-to-peer with sub-millisecond latency. Intelligent function windowing uses classifiers to select the 5-10 relevant functions from 200+ discovered tools, achieving 90%+ reduction in LLM tokens. For enterprises deploying multi-agent systems in production, the reliability guarantees inherited from safety-critical domains address a gap that HTTP-based agent frameworks have not solved. (more: https://github.com/rticommunity/rti-genesis)
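Function windowing — exposing only the handful of relevant tools out of hundreds discovered — is conceptually a ranking step before prompt assembly. GENESIS uses trained classifiers; in the sketch below, plain token overlap stands in for the classifier, and the robot-flavored tool catalog is invented:

```python
def overlap_score(query, description):
    """Crude relevance signal: fraction of query tokens found in the description."""
    q, d = set(query.lower().split()), set(description.lower().split())
    return len(q & d) / max(len(q), 1)

def window(query, tools, k=3):
    """Keep the k best-matching tools out of the full discovered set."""
    ranked = sorted(tools, key=lambda t: overlap_score(query, t[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

tools = [
    ("read_lidar",    "read raw lidar point cloud from the sensor"),
    ("plan_route",    "plan a route between two map waypoints"),
    ("set_waypoint",  "set the next navigation waypoint on the map"),
    ("open_gripper",  "open the manipulator gripper"),
    ("battery_level", "report current battery level"),
]

selected = window("plan a route to the next waypoint", tools, k=2)
assert selected == ["plan_route", "set_waypoint"]
```

Shipping 2 tool schemas to the LLM instead of 200 is where the claimed 90%+ token reduction comes from — the LLM never sees the tools the window filtered out.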

At a different layer of the stack, the RuVector ecosystem continues to push the single-file-does-everything thesis. A test of the RVF format on a 31,000-word text ingested 263 vectors, a knowledge graph of 306 nodes with 1,402 edges, and GNN weights with 4-head attention — all in a single 137KB file with a cryptographic SHAKE-256 witness chain recording every operation. Copy-on-write snapshots yielded an 847:1 compression ratio for version control on AI artifacts. (more: https://www.linkedin.com/posts/gunnar-strandberg-b58998_ruvector-rvf-vectordatabase-activity-7430978842380406784-OW_F) The rvDNA toolkit applied similar principles to genomics, processing 596,007 markers from a 23andMe export in about a second — no cloud, no GPU — with conservative CYP star allele calling and drug recommendations suppressed unless evidence reaches at least Moderate confidence. (more: https://www.linkedin.com/posts/reuvencohen_does-rvdna-work-eric-porres-put-it-to-activity-7430714040609374209-IVJz)
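A SHAKE-256 witness chain of the kind RVF reportedly records is a hash-linked log: each entry commits to the previous digest, so altering any record breaks every digest after it. A minimal sketch using the standard library's `hashlib.shake_256` (the record layout and operation names are invented; RVF's actual on-disk format is not documented here):

```python
import hashlib

def shake(data, n=16):
    """SHAKE-256 with a 16-byte (32 hex char) output."""
    return hashlib.shake_256(data).hexdigest(n)

def append(chain, op):
    """Append an operation record committed to the previous digest."""
    prev = chain[-1][1] if chain else "0" * 32
    chain.append((op, shake((prev + op).encode())))
    return chain

def verify(chain):
    """Recompute every link; any tampered record breaks the chain."""
    prev = "0" * 32
    for op, digest in chain:
        if shake((prev + op).encode()) != digest:
            return False
        prev = digest
    return True

chain = []
for op in ["ingest:263-vectors", "graph:306-nodes", "gnn:attn-4h"]:
    append(chain, op)
assert verify(chain)

chain[1] = ("graph:999-nodes", chain[1][1])   # tamper with one record
assert not verify(chain)
```

The design choice worth noting is that verification needs only the file itself — no external ledger — which fits the single-file-does-everything thesis the RVF format is pushing.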

Sources (22 articles)

  1. Zero-day CSS: CVE-2026-2441 exists in the wild (chromereleases.googleblog.com)
  2. Microsoft says bug causes Copilot to summarize confidential emails (bleepingcomputer.com)
  3. [Editorial] WebMCP — MCP for the Web (linkedin.com)
  4. [Editorial] Anthropic: Claude Code Security (anthropic.com)
  5. [Editorial] Video: AI Technology Deep Dive (youtu.be)
  6. [Editorial] Lonkero — Open Source AI Tool (github.com)
  7. [Editorial] RSAC Security Conference Insights (linkedin.com)
  8. [Editorial] CSA Policy on Personal AI Desktop Agents (cloudsecurityalliance.org)
  9. [Editorial] Video: AI & Security Perspectives (youtu.be)
  10. [Editorial] Why Probabilistic Engineering Breaks Deterministic Systems (linkedin.com)
  11. HKUDS/FastCode: Accelerating Code Understanding (github.com)
  12. [Editorial] Claude Code Best Practices (github.com)
  13. [Editorial] Claude Code Daily Driver — Real-World Workflow (linkedin.com)
  14. [Editorial] SonarSource Kintsugi — Code Security Event (events.sonarsource.com)
  15. [Editorial] AI DevTools & Code Intelligence (linkedin.com)
  16. [Editorial] Benchmarking LLMs for Voice Agent Use Cases (daily.co)
  17. Claude Opus 4.6 Surges Past Forecasts on METR's 50% Time-Horizon Benchmark with Exponential Gains (reddit.com)
  18. [Editorial] Unsloth: MiniMax M2.5 Fine-Tuning Guide (unsloth.ai)
  19. Jolt Atlas: Verifiable Inference via Lookup Arguments in Zero Knowledge (arxiv.org)
  20. [Editorial] RTI Genesis — Real-Time Infrastructure (github.com)
  21. [Editorial] RuVector & RVF Vector Database (linkedin.com)
  22. [Editorial] RVDNA — Does It Work? (linkedin.com)