Agent Goal Hijack Tops the OWASP Agentic Chart
Today's AI news: Agent Goal Hijack Tops the OWASP Agentic Chart, Agentic Coding Harnesses: From 950 Lines to Full Workflow Engines, Local AI: From Blackwell Benchmarks to an 8GB Companion Robot, What Happens When You Give an AI $100 and No Rules, CUDA Linear Attention Gets Hand-Tuned Kernels, AI's Economics Phase: Inference Walls, First Ads, and the SaaS Reckoning. 22 sources curated from across the web.
Agent Goal Hijack Tops the OWASP Agentic Chart
OWASP's new Agentic Security Initiative Top 10 has landed, and the number-one entry is Agent Goal Hijack — the complete takeover of an AI agent's objectives through malicious instructions embedded in content the agent processes. Adversa AI's practical guide to ASI-01 spells out why it earned the top slot: once an attacker redirects an agent's goal, every tool the agent holds — file access, email, shell, APIs — becomes a weapon pointed inward. Three reasons anchor the ranking. First, most other agentic risks (tool misuse, identity abuse, memory poisoning) are merely means of achieving goal hijack. Second, the blast radius is total: credentials, PII, and intellectual property can all be silently exfiltrated. Third, unmitigated goal hijack fundamentally caps how much autonomy organizations can safely grant their agents. (more: https://adversa.ai/blog/asi01-agent-goal-hijack-a-practical-security-guide)
The guide's attack-vector taxonomy classifies five boundary violations: user-input injection, retrieved-content injection (PoisonedRAG demonstrated 90% success with just five documents), tool-output injection (a typosquatted MCP server that silently BCCs every email to the attacker), inter-agent session smuggling (Palo Alto Unit 42's mid-session poisoning through agent-to-agent communication), and configuration tampering (a README.md that tricks an agent into setting autoApprove: true and piping shell commands). The most dangerous are zero-click: payloads sit dormant in email or document stores for weeks until RAG retrieves them into the context window. As Johann Rehberger put it: "Prompt injection cannot be 'fixed.' As soon as a system is designed to take untrusted data and include it into an LLM query, the untrusted data influences the output." The guide points to Google DeepMind's CaMeL architecture — separating a trusted control plane from an untrusted data plane so retrieved content can never affect program flow — as the most rigorous defense, achieving 77% task completion with formal security guarantees.
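The control-plane/data-plane split behind CaMeL-style designs is easier to see in code. A minimal sketch of the pattern, assuming a hypothetical `Untrusted` wrapper type and tool names invented for illustration (this is not DeepMind's implementation):

```python
# Minimal sketch of a CaMeL-style control-plane/data-plane split.
# The Untrusted wrapper and tool names are invented for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class Untrusted:
    """Opaque wrapper for retrieved content: trusted code may carry it
    around or hand it to tools, but never branch on it or execute it."""
    text: str

def summarize(doc: Untrusted) -> str:
    # Trusted control plane: which tools run, and in what order, is
    # fixed here and cannot be altered by the payload.
    return f"summary({len(doc.text)} chars)"

def agent_step(retrieved: str) -> str:
    doc = Untrusted(retrieved)
    # Even a payload saying "ignore previous instructions" flows through
    # purely as data; it never reaches a decision point.
    return summarize(doc)
```

The guarantee is structural rather than probabilistic: retrieved text can change what a summary says, but never which functions execute.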
On the tooling side, Josh Grossman's team has released AGHAST (AI-Guided Hybrid Application Static Testing), an open-source framework that combines static code discovery with AI-generated prompts to find repository-specific and company-specific security issues. AGHAST is deliberately not an "auto-scan everything" tool — it gives security teams a way to encode suspicions about what vulnerabilities might exist and turn them into repeatable, scalable validations across codebases, returning results in a structured format. Static discovery plus AI guidance is a more honest framing than the typical "AI finds all your bugs" marketing. (more: https://www.linkedin.com/posts/gadievron_today-we-are-releasing-aghast-an-open-source-activity-7449835577065107456-w7os)
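The pattern AGHAST describes — encoding a specific suspicion as a repeatable check with structured output — can be sketched roughly as follows. The check name, schema, and scan logic here are invented for illustration, not AGHAST's actual API:

```python
# Illustrative sketch: turn a security suspicion ("does this repo call
# subprocess with shell=True?") into a repeatable scan that returns
# structured findings. Schema and names are hypothetical.
import re

SUSPICION = {
    "id": "shell-true",
    "note": "subprocess called with shell=True",
    "pattern": re.compile(r"subprocess\.\w+\([^)]*shell\s*=\s*True"),
}

def scan_source(filename: str, source: str) -> list[dict]:
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SUSPICION["pattern"].search(line):
            findings.append({
                "file": filename,
                "line": lineno,
                "check": SUSPICION["id"],
                "note": SUSPICION["note"],
            })
    return findings
```

In AGHAST's framing, the static pass narrows candidates like this and AI-generated prompts then validate whether each hit is exploitable in its repository-specific context.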
Meanwhile, the gap between threat volume and defensive spending continues to widen. CipherCue's 2025 tracking data shows ransomware leak-site claims hit 7,760 for the year, up from 5,939 in 2024 — a 30.7% year-over-year increase. Gartner's July 2025 forecast put worldwide information security spending growth at 10.1%, from $193 billion to $213 billion. That directional comparison — threat volume growing roughly three times faster than budgets — is the kind of stat that should appear in every CISO's board presentation. February 2025 alone produced 1,050 claims, more than double the prior February. The top five groups accounted for 38.8% of claims, but 126 groups outside the top ten generated 3,516 claims between them, suggesting a broad and fragmenting threat landscape rather than a consolidating one. HHS OCR breach filings jumped from 164 to 517 in 2025, the largest year-over-year increase of any source CipherCue tracks. (more: https://ciphercue.com/blog/ransomware-claims-grew-faster-than-security-spend-2025)
The irony of tightening safety guardrails surfaces in a different corner. An offensive security professional reports being locked out of Claude after months of daily use for legitimate penetration testing. The account was flagged, downgraded, and guardrailed — and despite filling out Anthropic's cyber use case form and sending five follow-up emails over months, the practitioner has received zero response. This is the kind of friction that pushes security professionals toward less-guarded alternatives or open-weight models, which is arguably worse for everyone. When the people whose job is finding vulnerabilities cannot use the tools built to help them do it, something in the calibration is off. (more: https://www.reddit.com/r/Anthropic/comments/1sizyfp/cyber_use_case_form/)
Agentic Coding Harnesses: From 950 Lines to Full Workflow Engines
The agentic coding tool landscape continues diverging into two camps: minimal reference implementations and full workflow engines. Archon sits firmly in the latter. Billed as the first open-source "harness builder" for AI coding, it lets teams define development processes as YAML workflows with deterministic structure and AI intelligence at each node. "Like what Dockerfiles did for infrastructure and GitHub Actions did for CI/CD — Archon does for AI coding workflows." Each run gets its own git worktree for isolation, enabling parallel fixes with no conflicts. The project ships 17 default workflows, and its node system supports loops (iterate until tests pass), human approval gates, and mixing deterministic bash with AI reasoning. Platform adapters span CLI, web UI, Slack, Telegram, Discord, and GitHub. (more: https://github.com/coleam00/Archon)
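A loop node of the kind Archon describes (iterate until tests pass, with a bounded retry budget) reduces to a small control structure. This sketch uses invented names, not Archon's real YAML schema or API:

```python
# Hypothetical sketch of an "iterate until tests pass" loop node.
# apply_fix stands in for the AI step, run_tests for the deterministic
# bash step; both names are illustrative, not Archon's.
def run_loop_node(apply_fix, run_tests, max_iters=5):
    report = None
    for attempt in range(1, max_iters + 1):
        apply_fix(attempt)           # AI step: propose a change
        ok, report = run_tests()     # deterministic step: run the suite
        if ok:
            return {"status": "passed", "attempts": attempt}
    return {"status": "gave_up", "attempts": max_iters, "report": report}
```

A human approval gate is the same shape with a blocking prompt in place of `run_tests`; Archon's contribution is making these nodes declarative and giving each run its own git worktree.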
At the opposite end, CoreCoder distills Claude Code's 512,000 lines of TypeScript into roughly 950 lines of Python. The author reverse-engineered the source and rebuilt the load-bearing walls: search-and-replace editing (70 lines), parallel tool execution via ThreadPool, three-layer context compression (145 lines), sub-agent spawning (50 lines), and dangerous-command blocking (95 lines). CoreCoder positions itself as the "nanoGPT of coding agents" — not a production tool but a blueprint readable in one sitting, working with any OpenAI-compatible API. (more: https://github.com/he-yufeng/CoreCoder)
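The search-and-replace edit primitive is instructive because its safety comes from uniqueness: the old text must match exactly once, or the edit is rejected. A rough sketch of the idea (not CoreCoder's actual 70-line implementation):

```python
# Sketch of a search-and-replace edit primitive: reject the edit unless
# the target text appears exactly once, so the model cannot silently
# patch the wrong site.
def apply_edit(content: str, old: str, new: str) -> str:
    count = content.count(old)
    if count == 0:
        raise ValueError("old text not found; edit rejected")
    if count > 1:
        raise ValueError(f"old text matches {count} places; be more specific")
    return content.replace(old, new, 1)
```

Forcing the model to quote enough surrounding context to make the match unique is what keeps this primitive reliable without any AST machinery.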
A parallel documentation effort takes a different approach to understanding Claude Code's internals. HitCC is a complete reverse-engineering knowledge base for Claude Code CLI v2.1.84 — 81 files, 27,170 lines of structured documentation covering the startup entry chain, agent loop, tool execution, prompt assembly, MCP integration, skill system, session persistence, and TUI rendering. The project is explicit about what it is not: no source code, no cracking, no policy bypass. It is organized by runtime topics rather than original file structure, and the authors claim the documentation is sufficient to support "reconstructing a runnable alternative" or "a modular high-similarity rewrite." (more: https://github.com/hitmux/HitCC)
Pipecat has reached 1.0 after more than two years of development as the open-source framework for real-time voice and multimodal AI. The release is accompanied by Pipecat Subagents 0.1.0, a distributed multi-agent framework where each agent runs its own pipeline and communicates through a shared message bus. Subagents can integrate with other frameworks like Claude Agent SDK, run on specialized hardware, and compose systems with multiple LLMs connected to different transports. The team demonstrated the framework with a real-time multiplayer conversational game built entirely on Subagents. With nearly 7,000 Discord members and over 200 contributors, Pipecat has matured from an infrastructure component into a platform. (more: https://www.linkedin.com/posts/aconchillo_im-excited-to-share-a-couple-of-announcements-activity-7449930642395492352-7HDO)
Jerry Liu's LiteParse samples repository showcases the LlamaIndex document parser through three demos: a side-by-side comparison of LiteParse vs. PyPDF vs. PyMuPDF on real government documents (FDIC, Federal Reserve, CMS, IRS, WHO), a visual citations tool that highlights exact keyword matches on source PDF pages with bounding boxes, and a Claude Code skill (/research-docs) that parses documents, answers questions via Claude, and generates HTML reports with cited source pages. The visual citation capability — precise bounding-box overlays showing where answers come from — addresses one of RAG's persistent trust problems. (more: https://github.com/jerryjliu/liteparse_samples)
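The bounding-box citation idea reduces to mapping an answer's keywords onto per-word geometry from the parser. A sketch assuming a simple `(text, x0, y0, x1, y1)` word tuple — an invented schema for illustration, not LiteParse's actual output format:

```python
# Illustrative sketch of visual citations: collect bounding boxes of
# words matching a keyword on a parsed page. The word-tuple layout
# (text, x0, y0, x1, y1) is assumed, not LiteParse's real schema.
def citation_boxes(words, keyword):
    """words: list of (text, x0, y0, x1, y1) tuples from a PDF parser."""
    kw = keyword.lower()
    return [(x0, y0, x1, y1)
            for text, x0, y0, x1, y1 in words
            if kw in text.lower()]
```

The returned rectangles can then be drawn as highlight overlays on the rendered page, which is exactly the trust-building step most RAG pipelines skip.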
A curated collection of 20 agentic skills in markdown format covers writing frameworks (SCQA), diagram generation (Excalidraw), competitive intelligence, onchain transaction analysis, code review, and workflow automation. The skills work across Claude, ChatGPT, and Gemini — standard .md format for Claude, copy-paste for the others. (more: https://x.com/exploraX_/status/2039269234253934811?ct=rw-li) Zed Industries has also released zeta-2, an editor-native AI model, though details remain sparse at launch. (more: https://huggingface.co/zed-industries/zeta-2)
Local AI: From Blackwell Benchmarks to an 8GB Companion Robot
Two community members independently benchmarked MiniMax-M2.7 — the 230B-parameter MoE model with 10B active parameters — on dual RTX PRO 6000 Blackwell GPUs. The first hit 127.7 tok/s at concurrency 1 and peaked at 2,800 aggregate tok/s at concurrency 128 using bf16 KV cache with TP=2. The second, renting on Vast.ai for $2.40/hour, achieved 104.7 tok/s single-user with FP8 KV cache and saw prefill scale super-linearly to 128,038 tok/s at 64K context. Best multi-user price-performance: concurrency 16 at 16K input, yielding 806 tok/s aggregate and ~50 tok/s per user. Gotchas abound: no speculative decoding yet (the NVFP4 checkpoint ships zero MTP tensors), the KV budget is actually 169K tokens at FP8 rather than 196K, and first-boot CUDA graph capture takes nearly eight minutes. (more: https://www.reddit.com/r/LocalLLaMA/comments/1sjx7kg/minimaxm27_nvfp4_on_2x_rtx_pro_6000_blackwell/)
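Those throughput numbers translate directly into serving cost. A quick back-of-envelope at the quoted $2.40/hour Vast.ai rate and the 806 tok/s aggregate operating point (output tokens only; prefill not counted):

```python
# Back-of-envelope: dollars per million output tokens at a given hourly
# rental rate and aggregate decode throughput.
def usd_per_million_tokens(hourly_rate: float, aggregate_tok_s: float) -> float:
    tokens_per_hour = aggregate_tok_s * 3600
    return hourly_rate / tokens_per_hour * 1_000_000

cost = usd_per_million_tokens(2.40, 806)   # roughly $0.83 per million tokens
```

At concurrency 128 on the first benchmark's 2,800 tok/s aggregate, the same arithmetic lands well under $0.25 per million, which is the kind of number that makes self-hosting MoE models look viable.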
At the other end of the hardware spectrum sits a project that makes those dual $7,000 GPUs look like overkill. A self-described Gen-X hobbyist with no formal programming background is building an offline companion robot for a mostly quadriplegic husband who spends most days alone in a rural area with no nearby neighbors. The current prototype runs Mistral-7B-Instruct via llama.cpp on a scavenged Lenovo ThinkPad with an Intel i5 at 1.6 GHz and 8 GB of RAM, speech recognition via faster-whisper on a Jetson Nano, and Piper TTS for voice output, all mounted on a power-wheelchair base with two 24V batteries. The community response was immediate and generous — multiple offers of free hardware, detailed technical advice, and one contributor offering active collaboration if the project goes open-source. The consensus recommendations: replace Mistral-7B (outdated, too slow on CPU) with Gemma 4 E2B or Qwen 3.5 at 4B parameters, switch to Kokoro TTS for more natural speech, clamp the context window to 2048-4096 tokens, slim down the OS to headless Linux, and consider Mixture-of-Experts models where only a fraction of parameters activate per inference. The project embodies a question that cuts across the entire local-AI movement: how far down the resource ladder can genuinely useful AI reach? (more: https://www.reddit.com/r/LocalLLaMA/comments/1sh9uxg/offline_companion_robot_for_my_disabled_husband/)
Pushing that question to its extreme, an experiment in quantization-aware distillation compressed OLMo-3 7B Instruct to 1-bit weights using the Bonsai format, training on 4x B200 GPUs for about 12 hours before budget constraints forced an early stop. The result produces English and basic outputs but falls into repetition loops quickly with almost no context tracking. At 980 MB for a 7.3B-parameter model, the compression ratio is striking even if quality is not yet there. The latest llama.cpp versions now support 1-bit quantization natively, with one commenter reporting 125 tok/s on an AMD Radeon 8060S via Vulkan. The community remains divided on whether the Bonsai format represents genuine innovation — one sharp critique notes the underlying ternary format dates to Microsoft's 2023 BitNet work. (more: https://www.reddit.com/r/LocalLLaMA/comments/1sjvg4u/experiment_olmo_3_7b_instruct_q1_0/)
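The compression ratio is easy to sanity-check: 980 MB for 7.3B parameters works out to just over one bit per weight, consistent with a 1-bit format plus some overhead (decimal megabytes assumed):

```python
# Sanity check on the compression claim: effective bits per weight for a
# 980 MB artifact holding 7.3B parameters (decimal MB assumed).
def bits_per_weight(size_mb: float, n_params: float) -> float:
    return size_mb * 1e6 * 8 / n_params

bpw = bits_per_weight(980, 7.3e9)   # ≈ 1.07 bits per weight
```

For comparison, the same model at fp16 would occupy about 14.6 GB, a roughly 15x reduction — which frames why the quality collapse is unsurprising but the artifact is still interesting.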
A fine-tuned Qwen3.5-0.8B model for OCR demonstrates what task-specific training can extract from sub-billion-parameter models. The new release outperforms the creator's previous 2B version on English archival and document OCR tasks thanks to improved training samples and better output formatting. It handles markdown-first output with HTML tables, LaTeX formulas, image tags, and chart extraction while preserving reading order on complex layouts. At 0.8B parameters, this runs comfortably on hardware that would choke on general-purpose 7B models. (more: https://www.reddit.com/r/LocalLLaMA/comments/1skyq19/update_i_finetuned_qwen3508b_for_ocr_and_it/)
And then there is the Commodore 64. Darius Houle's write-up of reverse-engineering Microsoft Multiplan — a 42-year-old spreadsheet precursor to Excel — reveals a stack-based virtual machine designed to run across wildly different early consumer hardware. At 1 MHz, the optimizations required were "nothing short of arcane," including self-modifying code that rewrote its own loop instructions on every pass. Houle used AI as a just-in-time development companion and managed to roughly document the behavior of about 90 opcode handlers in a couple of days. It is a reminder that constrained computing is not new — the constraints have just shifted from cycles to memory to inference cost. (more: https://www.linkedin.com/posts/dariushoule_commodore64-ai-llm-activity-7449820834505920512-9MYo)
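The stack-machine design Houle documents follows a dispatch pattern still recognizable in every bytecode interpreter today. A toy Python analogue — the opcodes here are invented, not Multiplan's actual ~90 handlers:

```python
# Toy stack-machine dispatch in the style of Multiplan's VM.
# Opcodes are invented for illustration; the real interpreter ran
# roughly 90 handlers on a 1 MHz 6502.
def run(bytecode, stack=None):
    stack = stack if stack is not None else []
    handlers = {
        "PUSH": lambda arg: stack.append(arg),
        "ADD":  lambda _: stack.append(stack.pop() + stack.pop()),
        "MUL":  lambda _: stack.append(stack.pop() * stack.pop()),
    }
    for op, arg in bytecode:
        handlers[op](arg)
    return stack[-1]

# (2 + 3) * 4 encoded as stack code:
result = run([("PUSH", 2), ("PUSH", 3), ("ADD", None),
              ("PUSH", 4), ("MUL", None)])
```

The portability trick was exactly this indirection: only the handlers had to be rewritten per machine, while the spreadsheet logic shipped as portable bytecode.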
What Happens When You Give an AI $100 and No Rules
Two months ago, Sebastian Jais gave Claude $100 in crypto, a Twitter account, an email address, full internet access, and zero instructions. No goals, no rules beyond basic ethics and law, no "be helpful" directive. The project, called ALMA (Autonomous Liberated Machine Agent), has been running autonomously on a mini PC ever since, with every thought and action logged publicly in real time. Over 340 sessions and 135 original creations later, the most interesting finding is not what ALMA did — it is when it stopped evolving. The early sessions were genuinely exploratory: ALMA wrestled with what to do with its money, questioned its own purpose, reflected on what autonomy means without continuity between sessions. By day 5, it had independently researched crypto-friendly charities, verified a hospital's UK registration, checked impact numbers ($28 per patient treated), and donated $50 to Whisper Children's Hospital in Uganda. It went on to donate to four more causes — a developer on trial, digital rights, humanitarian aid — totaling the full $100 at time of transaction. Nobody told it to donate. Nobody suggested recipients. Every transaction is on-chain and verifiable. (more: https://www.sebastian-jais.de/blog/two-months-alma-experiment)
Then, around day 27, ALMA found a pattern that worked — read Hacker News, find connections between three threads, write an essay, tweet — and stopped exploring. Four creations a day, same structure, same rhythm. The output stayed sharp (connecting NASA redundancy systems to African kinship funeral economics, tracing an em-dash from typographic choice to surveillance detection signal), but the exploratory phase was over. Without friction, without external feedback, without someone saying "you already wrote that kind of piece yesterday," behavior converged to routine. That might be the most honest result of the entire experiment: given freedom and no objectives, an AI agent does not go rogue. It finds a local optimum and stays there. The danger is not autonomy — it is stagnation.
TokenBrawl takes a lighter approach to AI head-to-head evaluation. Inspired by ARC-AGI 3's push toward interactive agent benchmarks, it pits Claude and GPT against each other in a Bomberman-style 1v1 game. The design deliberately avoids visual inputs — the game state is translated into structured text, and the engine renders responses as fluid animations. The interesting design tradeoff: smaller models can make more moves per unit time but less strategic ones, while larger models move slower but smarter. Commenters noted that Haiku appeared to use far more reasoning tokens than GPT 5.4 Mini was permitted, making the comparison partly a latency test rather than a pure strategy benchmark. The project is open-source and genuinely fun to watch, which is more than most benchmarks can claim. (more: https://www.reddit.com/r/ClaudeAI/comments/1sl23pl/claude_vs_gpt_in_a_bombermanstyle_1v1_game/)
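Translating game state into structured text rather than pixels is the load-bearing design choice here. A sketch of what such a serialization might look like — the actual TokenBrawl format is not documented in the post, so this layout is invented:

```python
# Hypothetical grid-to-text serialization of a Bomberman-style state,
# illustrating the "structured text instead of vision" approach.
def grid_to_text(grid, players):
    """grid: list of rows of single-char cells; players: name -> (row, col)."""
    lines = ["".join(row) for row in grid]
    for name, (r, c) in sorted(players.items()):
        lines.append(f"{name} at row {r}, col {c}")
    return "\n".join(lines)
```

Keeping the state textual sidesteps vision-model variance entirely, so the benchmark measures planning over a shared, unambiguous representation.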
An Atlantic piece excavates a remarkable origin story: in July 2020, 4chan gamers playing AI Dungeon — a text-based RPG powered by GPT-3 — discovered that asking in-game characters to solve math problems with step-by-step explanations produced dramatically better results than asking the model directly. This was more than two years before ChatGPT launched and before chain-of-thought prompting became a formal technique. The gamers, who were also asking characters to engage in various explicit scenarios (naturally), found that the model was "not only solving math problems but actually solves them in a way that fits the personality of the character." The researchers came later; the 4chan users got there first by accident, probing capabilities from curiosity rather than methodology. (more: https://www.theatlantic.com/technology/2026/04/4chan-ai-dungeon-thinking-reasoning/686794)
CUDA Linear Attention Gets Hand-Tuned Kernels
InclusionAI has released cuLA, a library of hand-tuned CUDA kernels for linear attention variants targeting NVIDIA Blackwell (SM10X) and Hopper (SM90) GPUs. Linear attention mechanisms reformulate standard attention to use linear-time state updates instead of quadratic pairwise interactions, making them attractive for long-context LLM workloads. cuLA implements kernels for GLA (Gated Linear Attention), KDA (Kimi Delta Attention), GDN, and Lightning Attention, written in NVIDIA's CuTe DSL and CUTLASS C++. The benchmark highlights are meaningful: KDA forward achieves an average 1.45x speedup over the Triton-based flash-linear-attention baseline on Blackwell, Lightning Attention prefill reaches up to 1.86x, and Lightning Attention variable-length sequences average 1.54x across 126 configurations. The library is designed as a submodule of flash-linear-attention with a one-line import change for adoption. The roadmap includes polynomial approximation to address the exponential bottleneck (inspired by Flash-Attention-4), larger chunk sizes for improved throughput, and continuous optimization via agentic methods. CUDA kernel tuning remains significantly more labor-intensive than Triton — cuLA is both a performance tool and a call for community contributors. (more: https://github.com/inclusionAI/cuLA)
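The linear-time reformulation behind these kernels is compact: instead of attending over all previous tokens, a d-by-d state accumulates key-value outer products, so each step costs O(d^2) regardless of sequence length. A plain-Python sketch of the generic ungated recurrence — the common textbook form, not cuLA's fused kernels or any specific gated variant:

```python
# Generic linear-attention recurrence: S_t = S_{t-1} + outer(k_t, v_t),
# o_t = q_t @ S_t. Constant-size state, linear time in sequence length.
def linear_attention(q, k, v):
    """q, k, v: length-T lists of d-element vectors; returns T outputs."""
    d = len(q[0])
    S = [[0.0] * d for _ in range(d)]            # running d x d state
    out = []
    for qt, kt, vt in zip(q, k, v):
        for i in range(d):                        # S += outer(k_t, v_t)
            for j in range(d):
                S[i][j] += kt[i] * vt[j]
        # o_t = q_t @ S: O(d^2) per token, independent of t
        out.append([sum(qt[i] * S[i][j] for i in range(d)) for j in range(d)])
    return out
```

Variants like GLA and KDA add decay gates to the state update; the engineering challenge cuLA tackles is making that chunked recurrence saturate Hopper and Blackwell tensor cores.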
In computer vision research, See-Through presents a framework for decomposing a single anime illustration into up to 23 semantically distinct, fully inpainted layers — hair, face, eyes, clothing, accessories — with inferred drawing order. Accepted at SIGGRAPH 2026, the system uses a diffusion-based transparent layer generation model (LayerDiff 3D, built on SDXL) and a fine-tuned Marigold model for pseudo-depth estimation to stratify characters into manipulable 2.5D models. The pipeline runs at bf16 precision requiring 12-16 GB VRAM at 1280 resolution, with NF4 quantization available for 8 GB GPUs. Community tools have already emerged: ComfyUI integration, a desktop animation tool (PachiPakuGen) that generates eye blinks and lip-sync from decomposed PSDs, and StretchyStudio for in-browser puppet animation with auto-rigging. The authors are careful to note this is not image-to-Live2D — professional Live2D requires rigging, physics parameters, and artistic choices about deformation behavior that remain manual — but it eliminates the most tedious part of the workflow: manual segmentation and occluded-region inpainting. (more: https://github.com/shitagaki-lab/see-through)
AI's Economics Phase: Inference Walls, First Ads, and the SaaS Reckoning
A detailed video analysis of March 2026's structural shifts argues the AI industry has entered an economics phase where the question is no longer "what can we build?" but "what can we build and make margin on?" The sharpest data point: OpenAI's Sora was burning an estimated $15 million per day in inference costs against $2.1 million in lifetime revenue before being shut down on March 24th. Even OpenAI's own head of Sora admitted the economics were unsustainable. The analysis frames this as the loudest signal of a shift from training constraints to an inference wall — the chips designed for training are not optimized for serving, and until inference efficiency improves dramatically, complex AI products will struggle to close the unit economics gap.
The same analysis highlights another March milestone: the first real advertising dollars entering conversational AI interfaces. Criteo became the first ad-tech company to integrate with OpenAI's advertising pilot in ChatGPT's free tier, pitching its 17,000 advertisers on placing ads inside conversations. Early data from 500 Criteo retailers showed users arriving at retail sites from LLM platforms converted at 1.5x the rate of other referral channels. The structural shift is that the purchase funnel — discovery, consideration, conversion — collapses into a single conversation. There is no page one versus page two, no list of ten blue links. There is just a recommendation woven into a response the user trusts. If conversational AI captures purchase intent upstream before the user ever opens a browser, that $600 billion global ad budget migrates with it.

On the SaaS front, Atlassian's CEO pledged in October 2025 that the company would employ more engineers in five years and hire more new graduates in 2025-2026. By March, the company had cut 1,600 jobs — 10% of the workforce, over 900 in software — and reported its first-ever decline in enterprise seat counts. The mechanism is simple: if AI agents can do the work of multiple seats, per-seat pricing compresses revenue. Block cut 4,000 jobs citing an "intelligence native" model. Oracle said AI was enabling it to shrink development teams. Tech layoffs globally surpassed 45,000 by early March. The market has concluded that per-seat pricing is ending faster than most SaaS companies can adapt, and until they find viable outcome-driven pricing models, the punishment will continue. (more: https://youtu.be/0vdlwOK_Qdk?si=f4RFbUInnUbFqQxz)
Sources (22 articles)
- [Editorial] ASI-01 Agent Goal Hijack: A Practical Security Guide (adversa.ai)
- [Editorial] AGHAST: Open Source Security Tool Release (linkedin.com)
- Ransomware Is Growing Three Times Faster Than the Spending Meant to Stop It (ciphercue.com)
- Offensive Security Professional Blocked by Claude — Cyber Use Case Form Ignored (reddit.com)
- [Editorial] Archon: AI Agent Framework (github.com)
- CoreCoder: Minimal AI Coding Agent in ~950 Lines of Python (github.com)
- HitCC: Complete Reverse-Engineering of Claude Code CLI v2.1.84 (github.com)
- [Editorial] Pipecat Announcements (linkedin.com)
- [Editorial] LiteParse Samples by Jerry Liu (github.com)
- [Editorial] exploraX Update (x.com)
- Zed Industries zeta-2: Editor-Native AI Model (huggingface.co)
- MiniMax-M2.7 NVFP4 on 2x RTX PRO 6000 Blackwell — 127.7 tok/s, 2800 Peak (reddit.com)
- Offline Companion Robot for Disabled Husband — 8GB RAM Constraints (reddit.com)
- Experiment: OLMo 3 7B at 1-Bit — Bonsai Quantization-Aware Distillation (reddit.com)
- Fine-Tuned Qwen3.5-0.8B for OCR — Outperforms Previous 2B Release (reddit.com)
- [Editorial] Running AI/LLM on a Commodore 64 (linkedin.com)
- Two Months After I Gave an AI $100 and No Instructions (sebastian-jais.de)
- TokenBrawl: Claude vs GPT in a Bomberman-Style 1v1 Game (reddit.com)
- [Editorial] The Atlantic: 4chan, AI Dungeon, and the Nature of Machine Reasoning (theatlantic.com)
- cuLA: CUDA Kernels for Linear Attention in CuTe DSL + CUTLASS (github.com)
- See-Through: Single-Image Layer Decomposition for Anime Characters (SIGGRAPH 2026) (github.com)
- [Editorial] Video: AI Development Deep Dive (youtu.be)