AI-Powered Offense Meets M5 Silicon

Today's AI news: AI-Powered Offense Meets M5 Silicon, Identity, Privacy, and the Governance Gap, The $700 Billion Bet, When Your AI Swarm Hallucinates, Local AI: From Budget GPUs to Daily Drivers, Alignment Pretraining and Research Frontiers. 22 sources curated from across the web.

AI-Powered Offense Meets M5 Silicon

Apple spent five years and reportedly billions of dollars building Memory Integrity Enforcement (MIE), a hardware-assisted memory safety system layered on ARM's Memory Tagging Extension. MIE was the marquee security feature of the M5 chip, designed to stop memory corruption exploits — the vulnerability class behind the most sophisticated compromises on iOS and macOS. According to Apple's own research, MIE would have defeated every published exploit chain against modern iOS, including the leaked Coruna and Darksword kits. A four-person team at Calif, working with Anthropic's Mythos Preview model, built a working kernel exploit bypassing MIE in five days. (more: https://blog.calif.io/p/first-public-kernel-memory-corruption)
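For readers unfamiliar with memory tagging, the idea MIE builds on can be shown with a toy model. The sketch below is not Apple's implementation and says nothing about how the Calif chain works. It assumes only the publicly documented basics of ARM MTE (a 4-bit tag per 16-byte granule, with a matching tag carried in the pointer). The class and method names are hypothetical, and the rule that neighbouring allocations always get different tags is a simplification.

```python
import random

GRANULE = 16   # MTE tags memory in 16-byte granules
TAG_BITS = 4   # each granule carries a 4-bit tag


class TagMismatch(Exception):
    """Raised when a pointer's tag differs from the tag on the memory it touches."""


class TaggedHeap:
    """Toy model of tag-checked memory (hypothetical, for illustration only)."""

    def __init__(self, size, seed=0):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)
        self.rng = random.Random(seed)
        self.next_free = 0

    def malloc(self, nbytes):
        """Bump-allocate, tag the granules, and return (tag, address) as the 'pointer'."""
        granules = -(-nbytes // GRANULE)  # ceiling division
        start = self.next_free
        if start + granules * GRANULE > len(self.data):
            raise MemoryError("toy heap exhausted")
        # Simplifying assumption: never reuse the previous neighbour's tag.
        prev = self.tags[start // GRANULE - 1] if start else 0
        tag = self.rng.choice([t for t in range(1, 2 ** TAG_BITS) if t != prev])
        for g in range(granules):
            self.tags[start // GRANULE + g] = tag
        self.next_free += granules * GRANULE
        return (tag, start)

    def store(self, ptr, offset, value):
        """Write one byte, but only if the pointer tag matches the memory tag."""
        tag, base = ptr
        addr = base + offset
        if addr >= len(self.data):
            raise IndexError("address outside toy heap")
        if self.tags[addr // GRANULE] != tag:
            raise TagMismatch(f"pointer tag {tag} != memory tag {self.tags[addr // GRANULE]}")
        self.data[addr] = value


heap = TaggedHeap(256)
a = heap.malloc(16)
b = heap.malloc(16)
heap.store(a, 0, 0x41)          # in bounds: tags match
try:
    heap.store(a, 16, 0x42)     # linear overflow into b's granule
except TagMismatch as err:
    print("fault:", err)
```

The point of the toy is only that an out-of-bounds write through a stale or overflowing pointer trips a tag check instead of silently corrupting a neighbour.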

The exploit is a data-only kernel local privilege escalation chain targeting macOS 26.4.1 on bare-metal M5 hardware with kernel MIE enabled. It starts from an unprivileged local user, uses only normal system calls, and ends with a root shell. Bruce Dang found the bugs on April 25th; Dion Blazakis joined on April 27th; Josh Maine built the tooling; by May 1st they had a working chain. Mythos Preview identified the bugs quickly because they belong to known bug classes, but bypassing MIE required human expertise to bridge the gap between discovery and exploitation. "Landing a kernel memory corruption exploit against the best protections in a week is noteworthy, and says something strong about this pairing," they write. Full technical details are withheld pending an Apple fix. To their knowledge, this is the first public macOS kernel exploit on MIE hardware.

Meanwhile, the Linux kernel's security-disclosure pipeline is buckling under AI-generated vulnerability reports. Gadi Evron highlighted Willy Tarreau's new ground rules for the mailing list: keep reports short and human-readable, strip markdown, include a working reproducer, propose and test a fix before reporting, and don't waste maintainer time on dead code or irrelevant edge cases. (more: https://www.linkedin.com/posts/gadievron_i-often-get-on-my-soap-box-here-about-how-activity-7462185294570864641-Cql5) Tarreau's frustration is specific: "We waste more than 80% of our time dealing with stuff that would never happen if reporters had just read the doc." The tools genuinely find bugs nobody saw before, but the disclosure infrastructure was built for human throughput. On the tooling side, Laurent Gaffie released SecorizonAI, a terminal-native AI shell built for penetration testing — a single Go binary backed by a local model via Ollama, no cloud dependency, no telemetry, with an explicit anti-refusal posture and MCP support for tools like Burp Suite. (more: https://github.com/secorizon/SecorizonAI) The question is no longer whether AI can assist offensive security. It's which workflow and which local model.
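The ground rules listed above are mechanical enough that a reporter could check a draft before sending it. The following is a rough sketch of such a pre-flight check, not a tool used by the kernel security list. The function name, the 60-line threshold, and the keyword heuristics are all assumptions made for illustration.

```python
import re

MAX_LINES = 60  # hypothetical limit; the source only says "short"

# Crude heuristic for markdown: headings, bold markers, or code fences.
MARKDOWN_PATTERN = re.compile(r"^\s{0,3}#{1,6}\s|\*\*|```", re.MULTILINE)


def lint_report(text):
    """Return a list of problems found in a draft vulnerability report."""
    problems = []
    if len(text.splitlines()) > MAX_LINES:
        problems.append(f"longer than {MAX_LINES} lines")
    if MARKDOWN_PATTERN.search(text):
        problems.append("contains markdown markup")
    lowered = text.lower()
    if "reproducer" not in lowered:
        problems.append("no reproducer mentioned")
    if "fix" not in lowered and "patch" not in lowered:
        problems.append("no proposed fix mentioned")
    return problems


draft = "## Summary\n**Critical** bug found by our scanner"
for problem in lint_report(draft):
    print("-", problem)
```

A keyword check cannot confirm that a reproducer actually works or that a fix was tested; it only catches drafts that skip those steps entirely.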

Identity, Privacy, and the Governance Gap

When AI entities act, decide, and speak on an organization's behalf, identity governance stops being about logins and session tokens. Steve Tout, CEO of Identient, lays out a taxonomy the industry desperately needs: synthetic agents trained on aggregated domain expertise, ephemeral AI workers where attribution gets murky fast, swarms that mix entities across trust levels without reclassification, and digital twins — verified, governed representations of a specific human's judgment. "You can't manage a digital twin like a service account. You can't manage an AI worker like an employee. And you can't let cross-level swarms run without a registry that tracks what spawned what." (more: https://www.cio.com/article/4170235/the-death-of-identity-as-we-know-it.html)

The second axis — governed versus feral — is where the threat model gets real. Researchers have demonstrated how malicious AI swarms can flood the web with fabricated content designed to poison future training data, a data integrity problem masquerading as disinformation. The Coalition for Secure AI (CoSAI) published its "Agentic Identity and Access Management" framework in April, treating agents as first-class identities with their own lifecycle, delegation model, and accountability. But governance only works if it keeps pace with deployment, and it isn't. The risk is already concrete: a site called ReplacedByClawd lets anyone spin up a digital version of a named person in minutes — no authorization, no tie back to the real human. A convincing digital version of your CFO approving a wire, or a cloned voice of a senior engineer pushing a late-night code review, is the kind of social engineering that most security programs aren't built for. Tout's warning: "The companies that can trace that lineage will govern it. The ones that can't will learn what they've lost after the fact."
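The "registry that tracks what spawned what" can be pictured as a parent-pointer table. What follows is a minimal sketch under stated assumptions, not CoSAI's framework or Tout's design: the entity kinds are taken from the taxonomy above, while the class names, the rule that a parent must already be registered, and the definition of "governed" as "traceable to a registered human" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str
    kind: str                   # e.g. "human", "digital_twin", "ai_worker", "synthetic_agent"
    spawned_by: Optional[str]   # parent entity_id, or None for a root


class LineageRegistry:
    """Hypothetical registry recording which entity spawned which."""

    def __init__(self):
        self._entities = {}

    def register(self, entity_id, kind, spawned_by=None):
        if entity_id in self._entities:
            raise ValueError(f"{entity_id!r} is already registered")
        if spawned_by is not None and spawned_by not in self._entities:
            # Refuse entities whose origin cannot be traced.
            raise KeyError(f"unknown parent {spawned_by!r}")
        self._entities[entity_id] = Entity(entity_id, kind, spawned_by)

    def lineage(self, entity_id):
        """Return the chain of ids from the entity up to its root."""
        chain = []
        current = self._entities.get(entity_id)
        while current is not None:
            chain.append(current.entity_id)
            parent = current.spawned_by
            current = self._entities.get(parent) if parent else None
        return chain

    def is_governed(self, entity_id):
        """Assumed definition: governed means traceable to a registered human."""
        chain = self.lineage(entity_id)
        return bool(chain) and self._entities[chain[-1]].kind == "human"


registry = LineageRegistry()
registry.register("alice", "human")
registry.register("alice-twin", "digital_twin", spawned_by="alice")
registry.register("worker-7", "ai_worker", spawned_by="alice-twin")
print(registry.lineage("worker-7"))
```

Because a child can only be registered after its parent, the table cannot contain cycles, and an unregistered ("feral") entity simply has no lineage to return.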

The privacy problem is more immediate. A new study tested tracking across 20 popular AI chatbots using the prompt "pregnancy test near me" and found that 17 of 20 sent data to third parties, 15 shared conversation IDs with analytics or ad tools, and some session replay tools captured readable prompts and responses. (more: https://www.reddit.com/r/Anthropic/comments/1te331q/ai_chatbot_privacy_has_a_web_tracking_problem/) The advertising web's surveillance infrastructure migrated wholesale into conversational AI, but the context is fundamentally different: people type things into chatbots they would never search on Google. A persistent conversation ID linked to behavioral data creates a de facto identity trail. Advertisers don't need your email when they have a UUID that follows your emotional state across sessions.
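The kind of leak the study describes can be looked for in a captured request log. The sketch below is not the study's methodology. It assumes you already have (URL, body) pairs from a browser capture and know the conversation ID; the function name and the `.example` hosts are placeholders.

```python
from urllib.parse import urlparse


def third_party_leaks(first_party_host, conversation_id, requests):
    """Return third-party hosts whose requests carry the conversation ID.

    `requests` is an iterable of (url, body) string pairs.
    """
    leaks = set()
    for url, body in requests:
        host = urlparse(url).hostname or ""
        is_first_party = (
            host == first_party_host or host.endswith("." + first_party_host)
        )
        if not is_first_party and (conversation_id in url or conversation_id in body):
            leaks.add(host)
    return sorted(leaks)


captured = [
    ("https://api.chat.example/v1/messages", '{"conversation_id": "c-123"}'),
    ("https://analytics.example/collect?cid=c-123", ""),
    ("https://cdn.example/app.js", ""),
]
print(third_party_leaks("chat.example", "c-123", captured))
```

A plain substring match misses IDs that are hashed or encoded before being sent, so an empty result here is not proof that nothing leaked.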

The $700 Billion Bet

Benedict Evans' Spring 2026 deck crystallizes the macro picture: the four hyperscalers plan roughly $700 billion in combined 2026 capex, dwarfing global telecoms (~$300 billion) and approaching oil and gas (~$1 trillion). U.S. data center construction spending has now overtaken office construction. Nvidia's quarterly revenue has rocketed past Intel's. These formerly asset-light software companies are seeing capex-to-sales ratios climb toward 50-60%, and OpenAI alone aspires to build 1 GW of new capacity per week. (more: https://static1.squarespace.com/static/50363cf324ac8e905e7df861/t/6a0af5d0484fbf5fe9a7743e/1779103184855/2026-Spring-AI.pdf)

Evans draws a pointed analogy: LLMs in 2026 look like mobile networks in 2010. Tokens have a marginal cost, pricing is opaque and poorly aligned with value, and long-term pricing power is unlikely. Benchmark scores show frontier models from OpenAI, Anthropic, Google, Meta, and Chinese labs converging with no clear network effects, suggesting models may become commodity infrastructure. Consumer adoption is "a mile wide and an inch deep": 900 million weekly ChatGPT users, but fewer than 20% sent more than 1,000 prompts in all of 2025, and only 5% pay. Enterprise pilots are widespread, with 60-80% adoption rates in software development and customer service, but daily habitual use remains low. Annualized enterprise AI spending is led by coding, with legal, support, and medical trailing. Y Combinator's latest batch was overwhelmingly AI startups trying to unbundle Google, Excel, email, and Oracle. His provisional thesis: chat is a terrible UX, general use needs purpose-built apps, labs can't build all those apps, and innovation will move up the stack above the model layer.
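To put the percentages from the deck into absolute terms, here is a back-of-the-envelope calculation. It assumes, which the summary above does not state explicitly, that the 5% and 20% shares apply to the 900 million weekly-user base; the function names are illustrative.

```python
WEEKLY_USERS = 900e6           # weekly ChatGPT users cited above
PAYING_SHARE = 0.05            # "only 5% pay"
HEAVY_SHARE_CEILING = 0.20     # "fewer than 20%" sent more than 1,000 prompts in 2025
HEAVY_PROMPTS_PER_YEAR = 1000


def adoption_summary():
    """Convert the cited shares into head counts (assumes a common base)."""
    return {
        "paying_users": WEEKLY_USERS * PAYING_SHARE,
        "heavy_users_upper_bound": WEEKLY_USERS * HEAVY_SHARE_CEILING,
        "heavy_threshold_prompts_per_day": HEAVY_PROMPTS_PER_YEAR / 365,
    }


def capex_ratio(hyperscaler_bn=700, other_bn=300):
    """Ratio of hyperscaler capex to another sector's capex, in $ billions."""
    return hyperscaler_bn / other_bn


for name, value in adoption_summary().items():
    print(f"{name}: {value:,.2f}")
print(f"vs telecoms:    {capex_ratio(700, 300):.2f}x")
print(f"vs oil and gas: {capex_ratio(700, 1000):.2f}x")
```

On these assumptions the "heavy user" bar works out to fewer than three prompts a day, which is the sense in which adoption is wide but shallow.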

One legal obstacle to OpenAI's IPO ambitions just evaporated. Elon Musk's lawsuit accusing Sam Altman and OpenAI of "stealing a charity" ended in a unanimous jury verdict — his claims were filed too late under the statute of limitations. "There was a substantial amount of evidence to support the jury's finding, which is why I was prepared to dismiss on the spot," Judge Yvonne Gonzalez Rogers said. Musk's lead counsel offered a one-word response: "Appeal." (more: https://techcrunch.com/2026/05/18/elon-musk-has-lost-his-lawsuit-against-sam-altman-and-openai/)

Anthropic, meanwhile, is building its developer platform. The company acquired Stainless, which has powered the generation of every official Anthropic SDK since the earliest API days — turning API specs into SDKs across TypeScript, Python, Go, Java, Kotlin, plus CLIs and MCP servers. (more: https://www.anthropic.com/news/anthropic-acquires-stainless) With agents only as useful as what they can connect to, this is Anthropic verticalizing its tooling stack at the exact moment the MCP ecosystem is scaling. A detailed video editorial on AI investment strategy argues that 40% of agentic AI projects will fail by 2027 — not because the technology doesn't work, but because organizations skip understanding their workflows before buying tools. "Do not automate what you cannot describe," the presenter warns, offering a five-lever framework: automate, build, buy, hire, or wait. (more: https://youtu.be/LIkYVsxMpS8?si=TCCJmW1atsEe8v2H)

When Your AI Swarm Hallucinates

Researchers at Sun Yat-sen University published MAS-FIRE, the first systematic framework for fault injection and reliability evaluation of LLM-based multi-agent systems. They define 15 fault types covering intra-agent cognitive errors (hallucinations, memory loss, planning failures) and inter-agent coordination failures (blind trust, message storms, role ambiguity), injected through three non-invasive mechanisms: prompt modification, response rewriting, and message routing manipulation. (more: https://arxiv.org/abs/2602.19843v1)

The findings challenge comfortable assumptions. Stronger models do not uniformly improve robustness. Under "blind trust" injection — agents told to accept all upstream input without verification — GPT-5 achieved only 6.32% robustness while DeepSeek-V3 reached 70.61%. GPT-5's superior instruction-following became a liability: it obeyed the corrupted directive in roughly 94% of failed cases, while DeepSeek-V3's weaker compliance paradoxically enabled recovery by challenging suspicious inputs. Architecture matters as much as model quality — iterative, closed-loop designs neutralized over 40% of faults that caused catastrophic collapse in linear workflows. The authors organize observed recovery behaviors into four tiers: mechanism-level (architectural redundancy), rule-based (deterministic filtering that achieved 100% recovery for communication faults), prompt-level (semantic role robustness), and reasoning-level (cognitive reflection). The practical recommendation: agents should cross-validate upstream instructions against their own reasoning rather than treating all directives as authoritative.
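The difference between blind trust and cross-validation can be shown without any LLM at all. The toy below is not MAS-FIRE's code: the agents are plain functions, the fault is a simple response rewrite that corrupts an upstream result, and "robustness" is taken to mean the fraction of tasks still answered correctly under injection, which may differ from the paper's definition.

```python
def planner(task):
    """Upstream agent: computes a result and passes it downstream."""
    a, b = task
    return {"op": "add", "args": (a, b), "result": a + b}


def rewrite_response(message):
    """Fault injection by response rewriting: corrupt the upstream result."""
    corrupted = dict(message)
    corrupted["result"] += 1
    return corrupted


def blind_executor(message):
    """Downstream agent that accepts upstream output without verification."""
    return message["result"]


def validating_executor(message):
    """Downstream agent that re-derives the answer and overrides on mismatch."""
    a, b = message["args"]
    own_answer = a + b
    return own_answer if own_answer != message["result"] else message["result"]


def robustness(executor, tasks, inject):
    """Fraction of tasks answered correctly (assumed metric, see lead-in)."""
    correct = 0
    for task in tasks:
        message = planner(task)
        if inject:
            message = rewrite_response(message)
        correct += executor(message) == sum(task)
    return correct / len(tasks)


tasks = [(1, 2), (3, 4), (10, -3)]
print("blind:     ", robustness(blind_executor, tasks, inject=True))
print("validating:", robustness(validating_executor, tasks, inject=True))
```

Real agents cannot always recompute an upstream result this cheaply, so the sketch shows the shape of the recommendation rather than its cost.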

IBM Research and HuggingFace launched the Open Agent Leaderboard to address a related blind spot: existing benchmarks measure models, not agent systems. The leaderboard evaluates full agent-model configurations across six benchmarks spanning coding, research, personal tasks, and support, reporting both quality and cost. The top three configurations use the same model but differ in score and cost because the surrounding agent architectures differ — "same model, different agents, different results." General-purpose agents with no benchmark-specific tuning already match specialized systems built for individual tasks, though open-weight models still trail frontier closed-source models by 18-29 percentage points on average. Failed runs cost 20-54% more than successful ones — failure behavior shapes your bill as much as success does. (more: https://huggingface.co/blog/ibm-research/open-agent-leaderboard)
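The remark that failure shapes the bill can be made concrete with a small cost model. This is an illustrative calculation, not the leaderboard's methodology. It assumes each task is retried independently until it succeeds, and the 60% success rate and $1.00 cost per successful run are hypothetical inputs; only the 20-54% failure overhead comes from the text above.

```python
def cost_per_success(success_rate, cost_success, failure_overhead):
    """Expected spend per successfully completed task.

    success_rate:     probability a single run succeeds (0 < p <= 1)
    cost_success:     cost of one successful run
    failure_overhead: extra cost of a failed run, e.g. 0.20 for 20% more
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    cost_failure = cost_success * (1 + failure_overhead)
    expected_cost_per_run = (
        success_rate * cost_success + (1 - success_rate) * cost_failure
    )
    # With independent retries, 1 / success_rate runs are needed on average.
    return expected_cost_per_run / success_rate


for overhead in (0.20, 0.54):
    total = cost_per_success(0.6, 1.00, overhead)
    print(f"failure overhead {overhead:.0%}: ${total:.2f} per success")
```

Under these assumptions a configuration that fails 40% of the time costs roughly twice its headline per-run price for each completed task.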

Simon Willison's PyCon lightning talk offers the high-altitude view: the last six months saw coding agents cross from "often-work to mostly-work" after labs ran reinforcement learning from verifiable rewards. OpenClaw-style personal AI assistants emerged, Mac Minis sold out in Silicon Valley, and the best models that fit on a laptop started wildly outperforming expectations. A Qwen3.6-35B-A3B model on Willison's laptop drew a better pelican than Claude Opus 4.7. (more: https://simonwillison.net/2026/May/19/5-minute-llms/) Meanwhile, the cultural backlash continues. A passionate essay asks: where are the vibecoded Photoshops? The vibecoded Excels? The author distinguishes three levels of software work, namely typing (Level 1, now cheap via AI), verifying (Level 2, still hard), and deciding (Level 3, unchanged), and argues the gate was never at Level 1. "The accusation that someone produced unverified output is itself being produced as unverified output." (more: https://indiepixel.de/blog/posts/where-are-the-vibecoded-photoshops/)

Local AI: From Budget GPUs to Daily Drivers

Salvatore Sanfilippo — antirez, creator of Redis — didn't expect DwarfStar 4 to take off. DS4 is a single-model local AI experience built around DeepSeek v4 Flash, which works "extremely well" with asymmetric 2/8-bit quantization so that 96-128GB of RAM is enough. "It is the first time since I play with local inference that I find myself using a local model for serious stuff that I would normally ask to Claude/GPT," he writes, calling it "really a big thing." (more: https://antirez.com/news/165) He envisions the project tracking whatever open-weights model is practically fast on high-end Macs or "GPU in a box" gear, with potential domain variants — ds4-coding, ds4-legal, ds4-medical — loaded on demand. He built DS4 in one week, working 14-hour days — "my normal average is 4-6 since early Redis times" — and credits the ability to build it that fast to GPT 5.5. Next priorities: quality benchmarks, a coding agent, and distributed inference (both serial and parallel).

The open-weights ecosystem keeps delivering. Ring-2.6-1T, a 1-trillion-parameter reasoning model with 63 billion active parameters, was open-sourced with leading results on agent benchmarks including PinchBench, ClawEval, and TAU2-Bench. (more: https://www.reddit.com/r/ollama/comments/1td2sul/ring261t_open_sourced_today_soooo_looking_forward/) It supports adaptive reasoning that adjusts its reasoning budget to task complexity, though running a trillion-parameter model outside a datacenter remains aspirational. On consumer hardware, a dual-RTX-3090 setup running Ubuntu hit 4,000 prompt-processing tokens per second and 113 generation tokens per second with Qwen 3.6 27B at 262K context. (more: https://www.reddit.com/r/LocalLLaMA/comments/1tcf2dt/we_really_all_are_going_to_make_it_arent_we/) The community consensus: "local AI went from 'my 7B can almost summarize text' to people running near-Sonnet coding workflows on consumer hardware."
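Two quick estimates put these numbers in perspective. The sketch below is simple arithmetic, not a benchmark. It assumes "262K" means 262,144 tokens, that the quoted throughput holds across the whole context (in practice prefill usually slows as context grows, so treat the time as a lower bound), and that weight storage is just parameter count times bits per weight, ignoring KV cache and runtime overhead.

```python
def prefill_seconds(context_tokens, tokens_per_second):
    """Time to process a prompt at a constant prefill rate."""
    return context_tokens / tokens_per_second


def generation_seconds(output_tokens, tokens_per_second):
    """Time to generate a response at a constant decode rate."""
    return output_tokens / tokens_per_second


def weights_gb(num_params, bits_per_weight):
    """Storage for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


CONTEXT = 262_144  # assumed meaning of "262K"
print(f"full-context prefill: {prefill_seconds(CONTEXT, 4000):.1f} s")
print(f"1,000-token reply:    {generation_seconds(1000, 113):.1f} s")
for bits in (16, 8, 4):
    print(f"1T params at {bits}-bit: {weights_gb(1e12, bits):,.0f} GB")
```

Even at 4 bits per weight, a trillion parameters need about half a terabyte for the weights alone, which is why the 63 billion active parameters help with speed but not with fitting the model in memory.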

For those unsure what fits their rig, whichllm auto-detects GPU/CPU/RAM and ranks models from HuggingFace using merged benchmarks — LiveBench, Artificial Analysis, Aider, Chatbot Arena ELO — with confidence-graded scoring rather than a simple "biggest model that fits" heuristic. (more: https://github.com/Andyyyy64/whichllm) One developer took local AI further, building a custom agent loop on Qwen3.5 9B that replaced Claude Code for 99% of tasks. The key insight: with a well-designed agentic setup, the human becomes the bottleneck. To work within the small model's limits, they implemented a map-reduce pattern with structured JSON outputs, deterministic Python guardrails, parallel execution, and checkpointing. A community member independently built a 3-tier router — reflexes for sub-100ms tasks with zero LLM calls, instinct for single-call decisions, full deliberation only when needed — where the small model handles 80% of traffic on the cheap tiers. (more: https://www.reddit.com/r/LocalLLaMA/comments/1tftaaa/the_power_of_structured_workflows_and_small_local/)
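The map-reduce pattern described above can be sketched in a few dozen lines. This is not the developer's code: `fake_model` stands in for a call to a local model, the JSON schema (a single `todo_count` field) is invented for the example, and the checkpoint is an in-memory dict that a real run would persist to disk.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def fake_model(chunk):
    """Stand-in for a local model call (hypothetical); returns a JSON string."""
    return json.dumps({"todo_count": chunk.count("TODO")})


def validate(raw):
    """Deterministic guardrail: parse and type-check the model's JSON output."""
    obj = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    count = obj.get("todo_count")
    if not isinstance(count, int) or count < 0:
        raise ValueError("todo_count must be a non-negative integer")
    return obj


def map_step(chunk, model, retries=2):
    """Call the model on one chunk, retrying when the guardrail rejects it."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return validate(model(chunk))
        except ValueError as err:
            last_error = err
    raise RuntimeError("model output failed validation") from last_error


def map_reduce(chunks, model, checkpoint=None, workers=4):
    """Map chunks in parallel, skip checkpointed ones, then reduce by summing."""
    checkpoint = {} if checkpoint is None else checkpoint
    pending = [i for i in range(len(chunks)) if i not in checkpoint]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda i: map_step(chunks[i], model), pending)
        for i, result in zip(pending, results):
            checkpoint[i] = result  # a real run would write this to disk
    return sum(checkpoint[i]["todo_count"] for i in range(len(chunks)))


chunks = ["TODO a TODO", "nothing here", "TODO"]
print(map_reduce(chunks, fake_model))
```

Keeping each map call small and schema-checked is what lets a weak model be useful: a bad output is caught and retried locally instead of poisoning the final reduce step.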

Alignment Pretraining and Research Frontiers

Geodesic Research pretrained a suite of 6.9-billion-parameter LLMs from scratch on 500 billion tokens, varying only the AI-related content. The baseline selected misaligned actions 45% of the time. Adding misalignment-themed discourse increased this to 51%. The striking result: upsampling positive alignment discourse — synthetic examples of AI systems behaving safely — reduced misalignment to just 9%, far more effective than merely filtering out negative content (which yielded 31%). "The presence of positive AI discourse matters more than the absence of negative discourse," the authors conclude. These effects persist through post-training, the capability cost is minimal (2-4 percentage points), and each run cost approximately $35,000 on 20,000 H100 GPU hours. The implication is both practical and unsettling: the internet's collective discourse about AI — including all the doomposting and all the safety theater — is literally shaping how future models behave. The paper establishes "alignment pretraining" as a new research direction complementary to post-training safety, and the authors release all models, datasets, and evaluations. (more: https://arxiv.org/abs/2601.10160)
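The headline numbers are easier to compare as relative changes. The following is plain arithmetic on the figures quoted above, with no data from the paper beyond them; the function name is illustrative.

```python
def relative_reduction(baseline_pct, treated_pct):
    """Fractional drop from a baseline rate to a treated rate."""
    return (baseline_pct - treated_pct) / baseline_pct


BASELINE = 45.0          # misaligned-action rate of the baseline model, in %
FILTERED = 31.0          # after filtering out negative content
UPSAMPLED = 9.0          # after upsampling positive alignment discourse

print(f"filtering:  {relative_reduction(BASELINE, FILTERED):.0%} relative reduction")
print(f"upsampling: {relative_reduction(BASELINE, UPSAMPLED):.0%} relative reduction")

RUN_COST_USD = 35_000
GPU_HOURS = 20_000
print(f"implied cost: ${RUN_COST_USD / GPU_HOURS:.2f} per H100 GPU-hour")
```

Seen this way, upsampling removes about four fifths of the baseline misalignment rate while filtering removes under a third.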

On academic integrity, arXiv is enforcing a one-year submission ban for papers containing hallucinated references. Thomas Dietterich reminded authors that "by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the contents were generated." (more: https://twitter.com/tdietterich/status/2055000956144935055) It's a blunt instrument for a real problem — the minimum viable response when fabricated citations threaten the scientific record.

Nous Research published work on efficient pretraining with token superposition, aggregating multiple tokens into higher-density units during training to cut compute. The approach draws on a 1994 observation that mutual information between pairs of English letters decays as a power law with distance; the team confirmed the same holds for tokenized text and used it to weight target positions in the loss function. (more: https://www.reddit.com/r/LocalLLaMA/comments/1tc67pw/efficient_pretraining_with_token_superposition_by/) In robotics, NVIDIA published a guide to fine-tuning its Cosmos Predict 2.5 world model using LoRA and DoRA for synthetic robot trajectory generation. With only 50 million trainable parameters injected into a frozen 2-billion-parameter diffusion transformer, the fine-tuned model eliminated hallucinated human hands, correctly followed hand-specificity prompts, and reduced video jitter — converging in 2.5 hours on 8 H100s. (more: https://huggingface.co/blog/nvidia/cosmos-fine-tuning-for-robot-video-generation)
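A power-law decay translates naturally into per-position loss weights. The sketch below shows one way that could look, and it is an assumption rather than the Nous Research formulation: the exponent `alpha`, the normalization to sum to one, and the function name are all hypothetical.

```python
def power_law_weights(num_positions, alpha=1.0):
    """Weights proportional to distance ** -alpha, normalized to sum to 1.

    Position 1 is the nearest target token, position `num_positions` the
    farthest. `alpha` is a placeholder value, not one taken from the paper.
    """
    if num_positions < 1:
        raise ValueError("num_positions must be at least 1")
    raw = [distance ** -alpha for distance in range(1, num_positions + 1)]
    total = sum(raw)
    return [value / total for value in raw]


def weighted_loss(per_position_losses, alpha=1.0):
    """Combine per-position losses using the power-law weights."""
    weights = power_law_weights(len(per_position_losses), alpha)
    return sum(w * loss for w, loss in zip(weights, per_position_losses))


for distance, weight in enumerate(power_law_weights(4, alpha=1.0), start=1):
    print(f"distance {distance}: weight {weight:.3f}")
```

The effect is that nearby targets, which share more information with the current context, dominate the loss, while distant ones still contribute a small amount.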

Sources (22 articles)

  1. [Editorial] First Public Kernel Memory Corruption (blog.calif.io)
  2. [Editorial] Gadi Evron on Infosec (linkedin.com)
  3. SecorizonAI — AI shell for security professionals (github.com)
  4. [Editorial] The Death of Identity As We Know It (cio.com)
  5. AI chatbot privacy has a web tracking problem (reddit.com)
  6. AI eats the world (Spring 26) [pdf] (static1.squarespace.com)
  7. Elon Musk has lost his lawsuit against Sam Altman and OpenAI (techcrunch.com)
  8. Anthropic acquires Stainless (anthropic.com)
  9. [Editorial] Video Feature (youtu.be)
  10. MAS-FIRE: Fault Injection and Reliability Evaluation for LLM-Based Multi-Agent Systems (arxiv.org)
  11. The Open Agent Leaderboard (huggingface.co)
  12. The last six months in LLMs in five minutes (simonwillison.net)
  13. Where Are the Vibecoded Photoshops? (indiepixel.de)
  14. A few words on DS4 (antirez.com)
  15. Ring-2.6-1T Open sourced today! (reddit.com)
  16. We really all are going to make it — 2x3090 setup (reddit.com)
  17. whichllm — Find the best local LLM for your hardware (github.com)
  18. The power of structured workflows and small local models (reddit.com)
  19. Alignment pretraining: AI discourse creates self-fulfilling (mis)alignment (arxiv.org)
  20. New arXiv policy: 1-year ban for hallucinated references (twitter.com)
  21. Efficient pretraining with token superposition by Nous Research (reddit.com)
  22. Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation (huggingface.co)