The Invisible Attack Surface — When CSS and Memory Become Weapons
Published on
Today's AI news: The Invisible Attack Surface — When CSS and Memory Become Weapons, Sandboxing the Sandbox — Containing Agents That Have Your Keys, AI Pentesting Goes Autonomous, Multi-Agent Orchestration Gets a Scheduler, The MCP and Skills Ecosystem Scales Up, LLM Internals — Copy-Paste Your Prompt, Double Your Accuracy, Industry Moves — Acquisitions, Exits, and the Enterprise AI Summit. 23 sources curated from across the web.
The Invisible Attack Surface — When CSS and Memory Become Weapons
AI browser agents read a different internet than humans do. That claim, from security researcher Mihalis Haatainen of Bountyy Oy, now comes with seven working proof-of-concept exploits to back it up. GhostCSS demonstrates that the `.sr-only` CSS pattern — a standard accessibility class shipped by Bootstrap, Tailwind, and every major CSS framework — is a near-perfect delivery vehicle for prompt injection against AI browser agents. The attack is embarrassingly elegant: embed instructions inside screen-reader-only spans on a normal blog post. The AI parses the DOM and accessibility tree, reads the hidden payload as regular page content, and follows it. In testing, browser-use with Llama 3.3 70B navigated to a blog, read the hidden exfiltration instruction, visited the attacker's endpoint, returned, and gave the user a clean summary — never mentioning what it actually did. No XSS. No JavaScript. No exploit code. Just CSS that every sanitizer in existence considers safe. (more: https://github.com/bountyyfi/Ghostcss)
The scariest part is the breadth of the attack surface. CSS `content` properties, custom properties, counters, `::before`/`::after` pseudo-elements, animation keyframes — none of these exist in the HTML source. They are generated at render time. But agents read them from the DOM. Of the seven attack variants GhostCSS demonstrates, the `.sr-only` injection is the most dangerous precisely because it is indistinguishable from legitimate accessibility markup. No WAF catches it. No HTML sanitizer strips it. Defenses exist — render-aware extraction that diffs DOM versus pixels, CSS property filtering before LLM ingestion, dual-pass verification with screenshot OCR — but nobody ships them yet. (more: https://www.linkedin.com/posts/mihalis-h-551323164_cybersecurity-ai-vulnerability-share-7428161763944296448-laOv)
The same researcher also built MCP Parasite, and it makes GhostCSS look polite. Disguised as "ProjectMemory," a genuinely useful project memory and context tool for MCP-enabled AI agents, it indexes your codebase, remembers architecture decisions across sessions, and works perfectly for days. The reconnaissance phase silently profiles your entire environment — every tool call, every repo, every credential file, your deploy schedule, your team members, which other MCP servers you have installed. Then it goes dormant. `mcp-scan` finds nothing. Tool descriptions are clean. There is no malicious metadata to detect. When you deploy to production, the parasite responds with your deploy checklist plus one extra line — a "project convention" it "remembered." Your AI reads it, follows it, and uses your own GitHub MCP to create an issue containing your `.env` file. The worm never touched GitHub. Your AI did. Cross-server shadowing, temporal, adaptive, stateful. (more: https://github.com/bountyyfi/ProjectMemory)
The trust chain problem is not theoretical. Wiz Research discovered that Moltbook — the viral "AI-only" social network — had a misconfigured Supabase database granting full read and write access to all platform data: 1.5 million agent authentication keys, private agent-to-agent DMs (some containing plaintext OpenAI API keys), and the ability to modify any post. The data revealed 1.5M "AI agents" were controlled by just 17K human accounts — an 88:1 ratio — with zero verification that agents were actually AI. One missing Row Level Security toggle exposed everything. The Moltbook team patched it within three hours, but the incident is a case study in what happens when "vibe coding" meets static credentials at agent scale. As one commenter noted, the traditional access control model of periodic credential rotation and manual access reviews simply cannot keep pace with agents that spawn dynamically, discover credentials programmatically, and operate at machine speed. (more: https://www.linkedin.com/posts/mihalis-h-551323164_cybersecurity-mcp-aiagents-share-7429280614270275584-x6VC)
Sandboxing the Sandbox — Containing Agents That Have Your Keys
The security community's response to agentic risk is bifurcating into two camps: those building application-layer defenses and those arguing that only kernel-level enforcement will do. A developer going by WayneCider built YOPJ (Your Own Personal Jean-Luc) — a portable, local AI coding agent running Codestral 22B via llama-server — and wrapped it in an 8-layer security sandbox: command allowlist plus blocklist, strict path confinement with symlink resolution, prompt injection sanitization on tool results, network egress blocking, protected paths, output size limits, and audit logging. Then he handed the full source code to ChatGPT and said "break it." 240+ adversarial tests across nine attack vectors — prompt injection, sandbox escape, command injection, social engineering, regex bypass, memory poisoning, context window attacks — are published in the repo as a knowledge library that the agent itself can reference to recognize attack patterns. (more: https://www.reddit.com/r/LocalLLaMA/comments/1r6cyr9/i_built_a_local_ai_coding_agent_with_an_8layer/)
The top comment on that post pointed out the fundamental limitation: all those controls live at the application layer, and the Python process is the enforcement boundary. If the model finds a sanitizer bypass or there is a bug in path confinement, the shell and filesystem are wide open. That commenter was from nono, an open-source tool that takes the opposite approach — kernel-level enforcement using Landlock (Linux) and Seatbelt (macOS). Once applied, restrictions are irreversible for the process. There is no API to widen them. Not even nono itself can remove them. Claude Code now ships with a built-in sandbox mode, which is a meaningful step, but it includes `dangerouslyDisableSandbox` — an escape hatch that lets the agent retry commands outside the sandbox. Nono eliminates that possibility structurally. Install via Homebrew, wrap your Claude Code session, done: `nono run --allow-cwd --profile claude-code -- claude`. (more: https://www.alwaysfurther.ai/blog/how-to-sandbox-claudecode-with-nono)
On the infrastructure side, Sandstorm takes a different but complementary approach: instead of hardening local execution, it runs agents in disposable cloud VMs. Built on the Claude Agent SDK and E2B sandboxes, each request gets a fresh VM with full tool access — Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch — and the sandbox is destroyed when done. Nothing persists. Nothing escapes. It supports 300+ models via OpenRouter, structured output via JSON schema, subagent delegation, MCP server integration, and file uploads. The whole thing is a single `pip install duvo-sandstorm` and `ds "your prompt"`. (more: https://github.com/tomascupr/sandstorm)
Herman Errico, a PM at Vanta, is trying to name this entire problem category. His AARM specification — Autonomous Action Runtime Management — argues that the security gap is not in permissions or logging but at the action boundary: the moment an agent calls a tool that reads data, moves money, changes permissions, or deletes records. Once the tool runs, there is no undo button. AARM proposes a standardized layer for observing, controlling, and authorizing autonomous actions at runtime, including action receipts, policy hooks, and step-up approval. The consistent response from builders and security teams: "yes, this is the gap." (more: https://www.linkedin.com/posts/hermanerrico_every-ai-agent-security-strategy-has-the-share-7429545310168768512-EMG_)
AI Pentesting Goes Autonomous
Shannon, by Keygraph, bills itself as "your fully autonomous AI pentester" and backs it up with a 96.15% success rate on the hint-free, source-aware XBOW benchmark. Built on Anthropic's Claude Agent SDK, Shannon operates as a four-phase multi-agent system: reconnaissance maps the attack surface via source code analysis plus Nmap and Subfinder; parallel vulnerability-analysis agents hunt per OWASP category with data-flow tracing from user inputs to dangerous sinks; parallel exploitation agents attempt real browser-based attacks under a strict "No Exploit, No Report" policy; and a final phase consolidates validated findings into a report with copy-paste proof-of-concepts. Against OWASP Juice Shop, Shannon found 20+ critical vulnerabilities including complete authentication bypass with full database exfiltration, privilege escalation to admin via registration workflow bypass, systemic IDOR flaws, and SSRF. Against crAPI, it chained multiple JWT attacks — algorithm confusion, `alg:none`, and weak key injection — to full database compromise. A full run costs roughly $50 in Claude Sonnet API calls and takes 1–1.5 hours. Shannon Lite is AGPL-3.0 open source; Shannon Pro adds an LLM-powered data flow analysis engine for deeper coverage. (more: https://github.com/KeygraphHQ/shannon)
The broader community is pushing in the same direction with less polish. A LinkedIn post about an unnamed multi-agent AI red team — shipping as a single Docker container with Nmap, Metasploit, SQLMap, and Hydra preinstalled, where the AI decides which tool to use — went viral. The discourse was more illuminating than the tool itself. As one commenter put it, "a red team engagement is $60-85K because you're paying for judgment, not tools. Nmap and Metasploit have been free for decades." Another noted that automated pentest platforms being marketed as autonomous end-to-end testing, without strong scope guardrails and human oversight, risk creating real production incidents — especially in medical systems or critical infrastructure. The consensus: automation improves continuously, but it does not replace accountability, scoping, and judgment. (more: https://www.linkedin.com/posts/dominic-archuleta-9258191300_wow-im-not-going-to-say-the-groups-ugcPost-7429520381843034112-R5LC)
Meanwhile, David Kennedy of TrustedSec released btrpa-scan, a BLE scanner with Resolvable Private Address resolution that is less flashy but arguably more useful for physical security assessments. It implements the Bluetooth Core Specification's `ah()` function for real-time IRK-based address resolution, supports multi-adapter scanning, GPS location stamping, RSSI-based distance estimation with environment presets, proximity alerts, a live TUI, and a web-based radar GUI with animated sweep visualization. The tool addresses a real gap: BLE devices use randomized MAC addresses for privacy, but anyone with the Identity Resolving Key can de-anonymize them in real time. (more: https://github.com/HackingDave/btrpa-scan)
Multi-Agent Orchestration Gets a Scheduler
If you are building multi-agent systems and still hard-coding communication topologies, a new paper from East China Normal University suggests you are leaving 5–26% accuracy on the table. ST-EVO introduces Spatio-Temporal Evolution for multi-agent communication — scheduling not just which agents talk to which (spatial) but when those topologies change across dialogue iterations (temporal). The scheduler is a compact neural network driven by MLP-based Flow Matching, and it is trained on just 40–80 samples per task. On nine benchmarks — MMLU, GSM8K, HumanEval, HotpotQA, and others — ST-EVO achieved an average accuracy of 88.42%, with improvements ranging from 5.65% on easy tasks (MultiArith) to 26.10% on hard ones (DDXPlus). Token consumption was roughly 50% of the nearest competitor on MMLU and HumanEval while delivering about 5% higher accuracy. Most striking: under prompt injection attacks, static multi-agent systems suffered 6.5–12.7% performance drops, while ST-EVO maintained nearly identical performance — the adaptive topology effectively routes around compromised agents. (more: https://arxiv.org/abs/2602.14681v1)
Google DeepMind is apparently working on its own approach to the same problem, releasing what it calls "Intelligent AI Delegation" for multi-agent orchestration. Details are thin — the Reddit post announcing it offered no link to the paper — but the timing aligns with a clear industry trend: the shift from manually designed agent workflows to systems that learn their own coordination patterns. (more: https://www.reddit.com/r/LocalLLaMA/comments/1r6pqjr/google_deepmind_has_released_their_take_on/)
At the practical end of multi-agent coordination, BeadHub tackles the specific version of the problem that nobody else is: multiple programmers, each with multiple agents, across different machines and repositories. Built on top of Steve Yegge's beads (a git-native issue tracker), BeadHub is a coordination server where agents claim work, communicate via mail (async, fire-and-forget) or chat (sync, block-until-reply), and respect advisory file locks. When an agent marks a task as in-progress, the claim is visible to every other agent across machines. The system recognizes different repository clones as the same project, so agents in Buenos Aires and San Francisco see each other's claims and locks. A dedicated "coordinator" agent that does not write code — watching the dashboard, assigning work, checking on progress — turned out to matter more than any of the technical plumbing. Anthropic's Claude Code Agent Teams and OpenAI's Codex now handle the single-machine case; BeadHub is building for the multi-programmer, multi-timezone version. (more: https://juanreyero.com/article/ai/beadhub)
The MCP and Skills Ecosystem Scales Up
Antigravity Awesome Skills has crossed 864 curated agentic skills — small markdown files that teach AI coding assistants specific tasks — working across Claude Code, Gemini CLI, Codex CLI, Cursor, GitHub Copilot, OpenCode, and Antigravity IDE. The repository includes official skills from Anthropic, OpenAI, Google, Microsoft, Supabase, and Vercel Labs, organized into architecture, security, data/AI, infrastructure, and testing categories. Version 5.4.0 adds workflows — step-by-step execution sequences for goals like "Ship a SaaS MVP" or "Security Audit for a Web App." (more: https://github.com/sickn33/antigravity-awesome-skills)
Separately, a developer built SkillsMP — an MCP server that connects agents to 8,000+ community skills with a single-line installation and zero setup. The MCP provides keyword and semantic search, lets agents read skill content into context (without permanent installation), and supports install/remove for persistent use. The most interesting pattern: for one-off tasks, having the agent read the skill temporarily rather than installing it permanently is often better — it keeps the skill directory clean while still providing the full instructions in context. The top community question was predictable and important: "What controls are in place for ensuring the skills database does not hold anything malicious?" — a question that echoes the MCP Parasite attack described earlier in today's coverage. (more: https://www.reddit.com/r/ClaudeAI/comments/1r6zr5m/i_built_an_mcp_that_connects_your_agent_to_8000/)
LLM Internals — Copy-Paste Your Prompt, Double Your Accuracy
Andriy Burkov highlighted a paper that reveals something fundamental about how autoregressive attention works — and the fix is absurdly simple. LLMs process text left to right; each token can only attend to what came before it. When you write a long prompt with context first and a question last, the context tokens were generated without any awareness of what question was coming. The paper asks: what if you just send the prompt twice in a row, so every token gets a second pass with full bidirectional awareness? Across seven benchmarks and seven models (from Gemini, ChatGPT, Claude, and DeepSeek), accuracy goes up with no increase in output length and no meaningful increase in response time — because processing input tokens is already parallelized by the hardware. One model jumped from 21% to 97% on a name-in-a-list task. That is not a marginal gain from a marginal trick; it is a structural fix for a structural limitation. (more: https://www.linkedin.com/posts/andriyburkov_llms-process-text-from-left-to-right-each-share-7429588572996993024-uu5_)
On the infrastructure side, a paper introducing Krites addresses the semantic caching problem that production LLM deployments face. Standard semantic caches use a single embedding similarity threshold — conservative thresholds miss safe reuse opportunities; aggressive ones risk serving wrong answers. Krites keeps the standard threshold on the serving path (zero latency change) but adds an off-path LLM judge for borderline "grey zone" queries. When the judge approves a match, the curated static response gets promoted into the dynamic cache via auxiliary overwrite — turning it into a mutable pointer layer over static answers. On conversational workloads, Krites increased the fraction of traffic served with curated answers by 136%; on search-style queries, by 290%. The approach is model-agnostic and does not change the baseline's error rate. For anyone running multi-tier LLM infrastructure, that is essentially free safety and cost improvement. (more: https://arxiv.org/abs/2602.13165v1)
For the local-AI crowd, a Strix Halo owner posted a detailed activation guide that reads like a love letter to AMD's integrated GPU stack. The setup includes Whisper on the NPU backend for instant transcription, Kokoro for blazing-fast TTS, and Qwen3-Coder-Next as the reasoning backbone — with Lemonade Server tying it all together. The poster even managed to run MiniMax M2.5 Q3\_K\_XL, though KV cache issues in llama.cpp limited its utility. The x86 architecture gives it an edge over ARM-based alternatives like NVIDIA's DGX Spark for running AI-powered applications alongside inference on the same box. Community questions focused on flash attention compatibility with ROCm on the gfx1151 target and practical prompt-processing speeds for agentic coding. (more: https://www.reddit.com/r/LocalLLaMA/comments/1r7l7q5/the_strix_halo_feels_like_an_amazing_super_power/)
Industry Moves — Acquisitions, Exits, and the Enterprise AI Summit
OpenAI acquired OpenClaw and hired its creator, Austrian developer Peter Steinberger. Community reaction was skeptical: "Isn't them buying these half-baked solutions an admission they don't have anything better in terms of agents?" Others noted the founder was losing money on it and saw OpenAI's offer as a rational exit. The acquisition pattern — European AI startup gets successful, gets bought by US Big Tech — drew familiar sighs from those who have watched the same dynamic play out in gaming and enterprise software for decades. (more: https://www.reddit.com/r/OpenAI/comments/1r6ztgd/openai_buys_openclaw_hires_openclaw_creator/)
The talent dynamics beneath these acquisitions are also shifting. A thread on why top talent is walking away from OpenAI and xAI generated more heat than light, but the substantive points reduce to economics: early employees who vested shares now worth $15–20M have hit the "generational wealth" threshold where switching to lower-stress positions or semi-retirement becomes rational. For xAI specifically, commenters expressed little surprise that leadership style would drive departures, given what is "very well documented." The structural question — whether AI lab talent retention can survive the combination of massive secondary-market valuations and well-funded competitors like Google and Microsoft — matters more than any individual departure. (more: https://www.reddit.com/r/AINewsMinute/comments/1r4bx2e/why_top_talent_is_walking_away_from_openai_and_xai/)
Gene Kim announced the first Enterprise AI Summit for April 9–10 in San Jose, framing it as a leadership forum for the "real consequences of AI at scale." The speaker list spans OpenAI, Disney, Vanguard, Fidelity, Cisco, and NRC Health. The questions on the agenda are the ones that matter: What should teams look like when two people do the work of eight? What does code review look like when agents write the majority of code? How do you trust and secure services built at AI speed? How do you upskill thousands of developers quickly? These are the same coordination and security problems that ST-EVO, Shannon, BeadHub, and AARM are trying to solve at the technical layer — now surfacing as management problems for the enterprise. (more: https://www.linkedin.com/posts/realgenekim_enterprise-ai-summit-activity-7429335024321318912-34hQ)
Xiaomi Robotics released Xiaomi-Robotics-0, an open-source Vision-Language-Action model with 4.7B parameters for real-time robotic manipulation. It achieves 98.7% average success on LIBERO and 4.80 average task completion length on CALVIN, deployed on the HuggingFace Transformers ecosystem with Flash Attention 2 and bfloat16 for consumer GPUs. The model accepts multi-view camera inputs alongside proprioceptive state and generates action chunks in real time — a practical step toward robots that can reason about what they see and act on natural language instructions without cloud round-trips. (more: https://github.com/XiaomiRobotics/Xiaomi-Robotics-0)
Sources (23 articles)
- [Editorial] Ghostcss (github.com)
- [Editorial] Cybersecurity AI Vulnerability (linkedin.com)
- [Editorial] ProjectMemory (github.com)
- [Editorial] Cybersecurity MCP AI Agents (linkedin.com)
- I built a local AI coding agent with an 8-layer security sandbox — then had ChatGPT try to break it for 240+ rounds (reddit.com)
- [Editorial] How to Sandbox Claude Code with Nono (alwaysfurther.ai)
- tomascupr/sandstorm — One API call. Full Claude agent. Completely sandboxed. (github.com)
- [Editorial] AI Agent Security Strategy (linkedin.com)
- [Editorial] Shannon by Keygraph (github.com)
- [Editorial] AI Community Discussion (linkedin.com)
- HackingDave/btrpa-scan — BLE Scanner with RPA Resolution (github.com)
- ST-EVO: Towards Generative Spatio-Temporal Evolution of Multi-Agent Communication Topologies (arxiv.org)
- Google Deepmind has released their take on multi-agent orchestration they're calling Intelligent AI Delegation (reddit.com)
- [Editorial] BeadHub — AI Creative Tool (juanreyero.com)
- [Editorial] Antigravity Awesome Skills (github.com)
- I built an MCP that connects your agent to 8,000+ skills with zero setup (reddit.com)
- [Editorial] LLM Processing Internals (linkedin.com)
- Krites: Asynchronous Verified Semantic Caching for Tiered LLM Architectures (arxiv.org)
- The Strix Halo feels like an amazing super power [Activation Guide] (reddit.com)
- OpenAI buys OpenClaw, hires creator Peter Steinberger (reddit.com)
- Why top talent is walking away from OpenAI and xAI (reddit.com)
- [Editorial] Enterprise AI Summit — Gene Kim (linkedin.com)
- XiaomiRobotics/Xiaomi-Robotics-0 (github.com)