The Open-Weight Arms Race Heats Up

Today's AI news: The Open-Weight Arms Race Heats Up, AppSec Is About to Hit a Wall It Didn't See Coming, UnPrompted: Where AI Security Gets Real, The Agentic Operating System Takes Shape, Taming the Coding Agent: Orchestration Frameworks Multiply, WebMCP and the Agentic Interface Layer, Cyborg Propaganda, Peer Review Injection, and the Integrity Stack. 23 sources curated from across the web.

The Open-Weight Arms Race Heats Up

Alibaba's Qwen team dropped Qwen 3.5 this week — a 397B-parameter mixture-of-experts model with only 17B active parameters per token, Apache 2.0 licensed, and already being quantized for Blackwell GPUs. The MoE architecture (512 experts, 10 active per token) means inference speed lands closer to a 30-40B dense model despite the enormous total parameter count. Early API testers report roughly 45 tokens per second, notably faster than Claude or GPT-4, with code generation quality at "maybe 90% of Sonnet 4" and significantly less refusal behavior. The 262K native context window (with 1M+ in development) and built-in multimodal support across 201 languages make this a serious contender for high-volume inference shops tired of paying per-token API rates. (more: https://www.reddit.com/r/LocalLLaMA/comments/1r646pt/qwen_released_qwen_35_397b_and_qwen_35_plus/)
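
For intuition on why only about 17B of the 397B parameters fire per token: a router scores all experts, keeps the top k, and only those experts run. The sketch below mirrors the reported 512-expert, 10-active shape; the router, dimensions, and toy experts are purely illustrative, not Qwen's code.

```python
import numpy as np

def moe_layer(x, router_w, experts, k=10):
    """Toy mixture-of-experts step for one token: score all experts, run only the top k."""
    logits = x @ router_w                      # (n_experts,) router scores
    top = np.argsort(logits)[-k:]              # indices of the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                   # softmax restricted to the selected experts
    # Only these k experts execute, which is why active parameters per token
    # stay far below the total parameter count.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 64, 512                         # 512 experts, 10 active, as reported
experts = [(lambda W: (lambda x: x @ W))(rng.standard_normal((d, d)) * 0.01)
           for _ in range(n_experts)]
out = moe_layer(rng.standard_normal(d), rng.standard_normal((d, n_experts)) * 0.01, experts)
print(out.shape)                               # (64,)
```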

The real action, as usual, is in the quantization. Within days, an NVFP4 checkpoint appeared for Blackwell hardware — 224GB total, running on 4x B300s at roughly 120 tokens per second with speculative decoding via Qwen 3.5's built-in Multi-Token Prediction head. The setup requires a patched SGLang branch to avoid quantizing vision encoder weights, but the combination of FP4 quantization with near-lossless accuracy (~99% preserved) demonstrates that the Blackwell architecture is finally delivering on NVIDIA's promise of practical FP4 inference. For teams with the hardware budget, this puts frontier-competitive performance on self-hosted iron. (more: https://www.reddit.com/r/LocalLLaMA/comments/1r77fz7/qwen35_nvfp4_blackwell_is_up/)
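
The vision-encoder exclusion is a generic pattern worth noting: partition modules by name before quantizing and leave the sensitive ones in full precision. A hypothetical sketch, with made-up prefixes and no relation to the actual SGLang patch:

```python
# Hypothetical module-name prefixes; the real Qwen 3.5 layer names may differ.
SKIP_PREFIXES = ("visual.", "vision_tower.")

def split_quantization_targets(named_modules):
    """Partition module names into quantize-to-FP4 vs keep-full-precision sets."""
    quantize, keep = [], []
    for name in named_modules:
        (keep if name.startswith(SKIP_PREFIXES) else quantize).append(name)
    return quantize, keep

names = ["model.layers.0.mlp.experts.0.up_proj", "visual.blocks.0.attn.qkv"]
quantize, keep = split_quantization_targets(names)
print(quantize)  # ['model.layers.0.mlp.experts.0.up_proj']
print(keep)      # ['visual.blocks.0.attn.qkv']
```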

On the lighter end of the spectrum, Google's Gemma 3n E2B is now running natively on Android via LiteRT, with one developer shipping a fully on-device multimodal app handling audio transcription and OCR using INT4 weights. The interesting engineering challenge was the 30-second audio context limit in the multimodal encoder — solved by a sequential chunking pipeline that splits audio, transcribes chunks independently, then recombines via a second model pass. The entire app was built without writing a single line of code manually, using Claude Code and Google Antigravity, though the developer notes that AI agents "struggled with the multimodal JNI bindings" until manually fed raw LiteRT documentation. Vibe coding has limits, it turns out, and they tend to live at the hardware abstraction boundary. (more: https://www.reddit.com/r/LocalLLaMA/comments/1r77plf/running_gemma_3n_e2b_natively_on_android_via/)
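
The chunking pattern itself is simple enough to sketch. Here `transcribe_chunk` and `merge_transcripts` are hypothetical callables standing in for the two on-device model passes; the 30-second encoder limit is the only real constraint being modeled.

```python
CHUNK_SECONDS = 30  # the multimodal encoder's audio context limit

def chunk_audio(samples, sample_rate):
    """Split raw audio samples into segments the encoder can accept."""
    step = CHUNK_SECONDS * sample_rate
    return [samples[i:i + step] for i in range(0, len(samples), step)]

def transcribe_long_audio(samples, sample_rate, transcribe_chunk, merge_transcripts):
    partials = [transcribe_chunk(chunk) for chunk in chunk_audio(samples, sample_rate)]
    # Second model pass: stitch the partial transcripts into clean text,
    # repairing words that were cut at chunk boundaries.
    return merge_transcripts(partials)
```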

For teams wanting to self-host at enterprise scale, a detailed quickstart appeared for deploying Open WebUI with vLLM's production stack on Amazon EKS — OpenTofu-defined infrastructure, GPU node groups with A10G cards, Gateway API ingress, and tool calling out of the box. The guide is refreshingly honest about costs: a g5.xlarge setup runs roughly $50/day, and the author explicitly recommends treating it as a development environment you destroy when not in use. The vLLM production stack adds a router layer for load balancing across replicas, making it a genuine alternative to API providers for organizations with data sovereignty requirements. (more: https://www.reddit.com/r/OpenWebUI/comments/1r6plyw/deploying_open_webui_vllm_on_amazon_eks/)
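
Once the stack is up, clients talk to the router like any OpenAI-compatible endpoint, which is what vLLM serves; the hostname, model name, and prompt below are placeholders.

```python
import requests

# Hypothetical internal hostname behind the Gateway API ingress.
ROUTER_URL = "http://llm.example.internal/v1/chat/completions"

resp = requests.post(
    ROUTER_URL,
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",   # whichever model the stack serves
        "messages": [{"role": "user", "content": "Summarize our data-retention policy."}],
        "max_tokens": 256,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```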

AppSec Is About to Hit a Wall It Didn't See Coming

A provocative LinkedIn thread from Cameron W. crystallized a tension that's been building for months: developers are already shipping 5x the code they were a year ago, agents are writing and reviewing code autonomously, and most AppSec programs are still tuned for the pace of human-written pull requests. The numbers are hard to argue with. DARPA's AI Cyber Challenge at DEF CON saw Team Atlanta's cyber reasoning system go from identifying 37% of synthetic vulnerabilities at semifinals to 86% in finals, at roughly $152 per task. Then Anthropic's red team had Claude Opus 4.6 finding 500+ previously unknown high-severity vulnerabilities in projects like Ghostscript and OpenSC — no custom tooling, no specialized prompts, just reasoning through commit history and writing proof-of-concept exploits for bugs that fuzzers running for years had missed. (more: https://www.linkedin.com/posts/cameronww7_product-security-is-about-to-hit-a-wall-its-share-7428987757680816128-SMVL)

The follow-up discussion surfaced a genuine strategic question: if vulnerability discovery becomes near-commodity, what's the scarce resource? The consensus among security leaders points toward validation quality, exploitability modeling, and governance over machine-generated change. Several commenters noted that the real gap isn't tooling — it's measurement. How do you quantify false confidence introduced by automated patching? How do you detect degradation in review depth when agents review agents? The most actionable take: AppSec teams need to shift from detection-centric to prevention-centric, codifying security patterns in markdown that coding agents can consume, and focusing on business logic flaws rather than just CVEs and CWEs. (more: https://www.linkedin.com/posts/cameronww7_product-security-is-about-to-hit-a-wall-its-activity-7429162802646327296--Pe7)

Oligo offered one concrete answer to the "how do we scale?" question: a Named Entity Recognition pipeline that extracts vulnerable functions directly from advisories and patches, validated against runtime data. Their key finding was that traditional vulnerability databases cover less than 4% of vulnerabilities at the function level — a massive intelligence gap. By deploying NVIDIA's Nemotron Nano 9B v2 for real-time detection tasks, they achieved 55x cost reduction and 2.5x faster response times compared to Claude Sonnet 4.5, with no accuracy loss. Their enrichment pipeline now covers an entire year of CVEs (50K) in under an hour, identifying 1100% more vulnerable functions than commercial databases. The lesson: specialized, instruction-tuned models systematically outperform expensive general-purpose LLMs for domain-specific security tasks. (more: https://www.oligo.security/blog/oligo-accelerates-vulnerability-intelligence-with-nvidia)
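
The extraction step is conceptually simple; here is a hypothetical sketch of the advisory-to-function pipeline, with `call_model` standing in for whatever endpoint serves the extraction model and the runtime-validation step reduced to a set intersection:

```python
import json

EXTRACT_PROMPT = """Extract the names of vulnerable functions mentioned in this
security advisory. Return only a JSON list of fully qualified function names.

Advisory:
{advisory}"""

def extract_vulnerable_functions(advisory_text, call_model, runtime_symbols):
    """runtime_symbols: set of function names actually observed at runtime."""
    raw = call_model(EXTRACT_PROMPT.format(advisory=advisory_text))
    candidates = set(json.loads(raw))
    # "Validated against runtime data": keep only functions the application loads.
    return sorted(candidates & runtime_symbols)
```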

UnPrompted: Where AI Security Gets Real

The agenda for the [un]prompted AI Security Conference (March 3-4, Salesforce Tower, SF) dropped this week, and the talk list reads like a threat briefing. Nearly 500 submissions have been narrowed to two days of talks spanning offensive research, defensive engineering, and governance — with speakers from Google, Meta, Stripe, OpenAI, Microsoft, NVIDIA, Netflix, Reddit, Salesforce, and the U.S. Army Cyber Command. (more: https://unpromptedcon.org)

The offensive highlights alone could fill a briefing. XBOW will demonstrate AI agents that autonomously find and validate access control flaws in production systems. Trend Micro's FENRIR system has discovered 100+ vulnerabilities in AI infrastructure since mid-2025, including 21 CVEs with multiple CVSS 9.8 RCEs, using a multi-stage verification pipeline of static analysis and two-layer LLM validation — and the team will present their methodology. A hardware hacker who gave Claude access to his physical lab — oscilloscope, ChipShouter, JTAG probe — will show how it pwned an LPC chip in 7 minutes that he'd failed to glitch for 6 weeks. Trail of Bits will share how they rebuilt their entire organization around AI, achieving 200 bugs per week per engineer. And in one of the more sobering scheduled talks, Profero's IR team will present "Your Next Breach Won't Have an Attacker" — real incidents where AI coding assistants with overly broad permissions caused production database exposures, configuration tampering, and mass deletion, all initially presenting like sophisticated attacks.

On the defensive side, Stripe will share how they ship security agents in production with modular orchestrator/child architectures and structured evaluation loops. Reddit will present their AI Guardrails system, which blocks 99% of unsafe requests at 25ms p95 latency across millions of users. Salesforce will detail how they distill millions of daily prompts from 12,000 autonomous agents into fewer than 30 high-fidelity daily alerts using behavioral baselines like asset rarity and query complexity. And Snap will present capability-based authorization using cryptographic warrants — ephemeral, task-scoped tokens that attenuate on delegation, verified offline in microseconds, so that even a fully compromised agent can't escalate beyond its bounds. The conference lineup makes one thing clear: AI security has moved from theoretical to operational, and the gap between offense and defense is narrowing fast.
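
Of the defensive talks, the warrant model is the most mechanically concrete. The attenuate-on-delegation idea can be sketched in macaroon style (an illustration of the general mechanism, not Snap's actual scheme): each delegation chains a new caveat into the MAC, so a sub-agent can only narrow its scope, and verification replays the chain offline.

```python
import hmac, hashlib, time

def mint(root_key: bytes, caveat: str):
    return ([caveat], hmac.new(root_key, caveat.encode(), hashlib.sha256).digest())

def attenuate(warrant, caveat: str):
    """Delegation only adds restrictions: chain the new caveat into the MAC."""
    caveats, sig = warrant
    return (caveats + [caveat], hmac.new(sig, caveat.encode(), hashlib.sha256).digest())

def check_caveat(caveat: str, request: dict) -> bool:
    kind, _, value = caveat.partition("=")
    if kind == "task":
        return request.get("task") == value
    if kind == "expires":
        return time.time() < float(value)
    return False

def verify(root_key: bytes, warrant, request: dict) -> bool:
    caveats, sig = warrant
    expected = root_key
    for c in caveats:                          # replay the MAC chain offline
        expected = hmac.new(expected, c.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, sig):
        return False
    return all(check_caveat(c, request) for c in caveats)

w = mint(b"issuer-root-key", f"expires={time.time() + 300}")
w = attenuate(w, "task=summarize-ticket")      # handed to a sub-agent, scope only narrows
print(verify(b"issuer-root-key", w, {"task": "summarize-ticket"}))  # True
print(verify(b"issuer-root-key", w, {"task": "drop-database"}))     # False
```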

The Agentic Operating System Takes Shape

Craig Hepburn's Substack piece "The Agentic Operating System" captured something visceral: he deployed an autonomous AI agent in January that, within 48 hours, was reading his emails overnight, parsing call transcripts, reviewing messages across WhatsApp and Telegram, and organizing his day before he'd finished his first coffee. "Its logic was sound. Its priorities were defensible. But it was moving faster than my own thinking." The backdrop is Peter Steinberger's OpenClaw joining OpenAI — an autonomous agent with 180,000 GitHub stars that operates continuously across email, calendar, messaging, and code. Steinberger's quote lands: "Every app is just a very slow API now." (more: https://craighepburn.substack.com/p/the-agentic-operating-system-now?r=57lxnk&triedRedirect=true)

But the cost reality is brutal. Hepburn was spending £20-30/day for a single personal agent. Steinberger disclosed $10,000-20,000 monthly running OpenClaw. The solution — multi-model architectures with different models for different task complexity levels — cut costs by 90%, but that architectural discipline is non-trivial. And the security surface is exactly what you'd expect: within weeks of OpenClaw's viral rise, researchers found over 1,800 exposed instances leaking API keys, chat histories, and credentials. Simon Willison's "lethal trifecta" — access to private data, exposure to untrusted content, ability to communicate externally — describes every useful agent deployment.
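
The 90% reduction comes from routing discipline rather than any single trick; the underlying pattern is easy to sketch. Everything below (tier names, `classify`, `complete`) is a placeholder, not any particular product's API.

```python
MODEL_TIERS = {
    "trivial": "small-local-model",      # triage, formatting, dedup
    "routine": "mid-tier-hosted-model",  # summaries, drafts, simple code
    "hard":    "frontier-model",         # multi-step reasoning, final review
}

def route(task: str, classify, complete):
    """classify: a cheap call returning a tier name; complete: the model invocation."""
    tier = classify(task)
    model = MODEL_TIERS.get(tier, MODEL_TIERS["routine"])
    return complete(model=model, prompt=task)
```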

The air-gapped response is already materializing. One developer forked OpenClaw into "Physiclaw," stripping cloud dependencies and targeting local inference via vLLM and llama.cpp. The more interesting architectural decision was breaking the single omniscient agent into role-specific agents (SRE, SecOps) with scoped tool access — addressing the "one generic assistant with root access to everything" problem that kills agent setups in security-conscious environments. Community feedback immediately zeroed in on whether the role separation has actual execution boundaries or just logical ones, and whether transitive dependencies might still phone home. (more: https://www.reddit.com/r/LocalLLaMA/comments/1r67b43/forked_openclaw_to_run_fully_airgapped_no_cloud/)
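
The execution-boundary question is worth making concrete: scoping that lives only in a system prompt is advisory, while scoping enforced at the tool dispatcher is a hard boundary. A hypothetical sketch, with illustrative role and tool names:

```python
ROLE_TOOLS = {
    "sre":    {"kubectl_get", "restart_service", "read_runbook"},
    "secops": {"run_scan", "read_alerts", "quarantine_host"},
}

class ScopeError(PermissionError):
    pass

def dispatch(role: str, tool_name: str, tool_registry: dict, **kwargs):
    """Hard boundary: a tool outside the role's allowlist never executes,
    regardless of what the model asked for."""
    if tool_name not in ROLE_TOOLS.get(role, set()):
        raise ScopeError(f"{role!r} is not allowed to call {tool_name!r}")
    return tool_registry[tool_name](**kwargs)
```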

Meanwhile, Anthropic continues to navigate the political tightrope of defense contracts, with CEO Dario Amodei stating that AI should support defense applications "except those which would make us more like our autocratic adversaries." The community response was skeptical, noting the existing Palantir partnership. And in a twist of corporate surveillance irony, OpenAI is reportedly using an internal version of ChatGPT to analyze employee data and identify staffers leaking information to the press — using their own technology for counter-intelligence against their own workforce. (more: https://www.reddit.com/r/Anthropic/comments/1r5kgoi/anthropic_still_wont_give_the_pentagon/) (more: https://www.reddit.com/r/OpenAI/comments/1r7019o/openai_uses_internal_version_of_chatgpt_to/)

Taming the Coding Agent: Orchestration Frameworks Multiply

FormalTask emerged this week as a declarative orchestration layer for Claude Code that solves a real problem: coding agents write code well but have no concept of "done." The system lets you define behavioral contracts per task — required review types (17 available, from security to SQLite to path-security), machine-runnable acceptance criteria, and custom completion rules with a condition DSL supporting AND/OR/NOT and dotted path resolution. When `ft task complete` runs, a 60-line kernel gathers state from SQLite and evaluates the task's rules. If findings persist after two self-critique rounds, custom rules can escalate to a human. Workers run as Claude Code sessions in tmux with dedicated git worktrees, and a TUI dashboard lets you monitor, attach, kill, and scale 1-10 concurrent agents. The system-level gating, as one commenter noted, is "solving the right problem — most multi-agent setups let each agent self-evaluate, which is basically asking the student to grade their own exam." (more: https://www.reddit.com/r/ClaudeAI/comments/1r5w235/open_source_declarative_orchestration_for/)
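
The rule-evaluation core is small enough to sketch. The toy evaluator below handles AND/OR/NOT plus dotted-path lookups into gathered state; it mirrors the shape of the mechanism, not FormalTask's actual kernel.

```python
def resolve(path: str, state: dict):
    """Dotted path resolution: 'reviews.security.findings' walks nested dicts."""
    node = state
    for key in path.split("."):
        node = node[key]
    return node

def evaluate(rule, state):
    op = rule[0]
    if op == "AND":
        return all(evaluate(r, state) for r in rule[1:])
    if op == "OR":
        return any(evaluate(r, state) for r in rule[1:])
    if op == "NOT":
        return not evaluate(rule[1], state)
    if op == "EQ":
        return resolve(rule[1], state) == rule[2]
    raise ValueError(f"unknown operator {op!r}")

state = {"reviews": {"security": {"findings": 0}}, "acceptance": {"passed": True}}
rule = ("AND", ("EQ", "reviews.security.findings", 0), ("EQ", "acceptance.passed", True))
print(evaluate(rule, state))  # True -> the task may be marked complete
```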

Get Shit Done (GSD) takes a different approach — a meta-prompting and context engineering system for Claude Code, OpenCode, and Gemini CLI that targets "context rot," the quality degradation that happens as Claude fills its context window. The workflow is opinionated: discuss phase → plan phase (spawning parallel research agents) → execute phase (fresh 200K-token context per plan, parallel waves with dependency tracking) → verify. The key architectural insight is that each plan executes in a fresh subagent context, so the orchestrator's main window stays at 30-40% capacity. It's already seeing adoption at Amazon, Google, Shopify, and Webflow, according to the project. The security-relevant detail: GSD recommends running Claude Code with `--dangerously-skip-permissions`, which is exactly the kind of convenience-over-safety tradeoff that will generate incident reports at next year's UnPrompted. (more: https://github.com/gsd-build/get-shit-done)
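
The context-isolation pattern reduces to something like the sketch below, where `run_in_fresh_session` is a placeholder for however a new agent session gets launched and only a compact summary flows back to the orchestrator.

```python
def execute_plans(plans, run_in_fresh_session, summarize):
    """Each plan gets a fresh context; the orchestrator keeps only digests."""
    orchestrator_notes = []
    for plan in plans:
        transcript = run_in_fresh_session(plan)           # full window, used once
        orchestrator_notes.append(summarize(transcript))  # only the digest is retained
    return orchestrator_notes
```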

On the pedagogical end, Andrej Karpathy published what he calls "the most atomic way to train and run inference for a GPT in pure, dependency-free Python" — a single-file implementation covering tokenization, autograd, multi-head attention, Adam optimizer, and inference. The gist spawned immediate ports to C#, JavaScript, Rust, Common Lisp, and Go. One contributor extended it into "molecule," a stdlib-only variant with RoPE, SwiGLU gating, async continual training, SQLite memory, and evolving BPE. When someone asked if he vibe-coded it in Claude, the silence spoke volumes. (more: https://gist.github.com/karpathy/8627fe009c40f57531cb18360106ce95)
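
For a flavor of what "atomic" means here, causal single-head attention fits comfortably in dependency-free Python lists. The sketch below is an illustration in the gist's spirit, not Karpathy's code.

```python
import math

def attention(q, k, v):
    """q, k, v: lists of T vectors (lists of floats) with the same head dimension."""
    T, d = len(q), len(q[0])
    out = []
    for t in range(T):
        # Causal mask: token t only attends to positions <= t.
        scores = [sum(q[t][i] * k[s][i] for i in range(d)) / math.sqrt(d)
                  for s in range(t + 1)]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]   # numerically stable softmax
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(weights[s] * v[s][i] for s in range(t + 1)) for i in range(d)])
    return out

print(attention([[1.0, 0.0]] * 3, [[1.0, 0.0]] * 3, [[0.5, 0.5]] * 3))
```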

WebMCP and the Agentic Interface Layer

Google announced WebMCP for early preview in Chrome — a proposed web standard that exposes structured tools for AI agents on existing websites, replacing screen-scraping with robust, high-performance page interaction. The spec proposes two new APIs: one for standard HTML form actions and another for complex interactions requiring JavaScript execution. The use cases are practical — booking flights, filing support tickets, navigating e-commerce checkout flows — and the design philosophy treats websites as active participants in agent interactions rather than passive targets for DOM actuation. (more: https://developer.chrome.com/blog/webmcp-epp)

The community is cautiously interested. The open-source implementation at webmcp.dev is drawing attention, though adoption is nascent — most developers haven't used it yet, and the comparison to Google Antigravity's browser tools remains unclear. The timing is notable: this lands alongside talks at UnPrompted about hacking AI browsers, where researchers broke Perplexity Comet, OpenAI Atlas, and Google Antigravity through allowlists that are too broad, message handlers that trust everything, and agent capabilities that turn minor web bugs into major browser vulnerabilities. WebMCP's structured approach could be part of the solution — or just another attack surface if the tool definitions themselves become injectable. (more: https://www.reddit.com/r/ChatGPTCoding/comments/1r18zpm/webmpc_has_anyone_used_it/)

A separate exploration of the agentic interface layer came from a voice-controlled UI agent design pattern called EPIC ("Explicit Prompting for Implicit Coordination"). The architecture runs two separate inference loops — a voice agent and a UI control agent — that know about each other at the prompt level but work independently. The critical constraint: you can't block the voice agent's fast responses for UI updates. So the UI agent can do nothing, send a UI update immediately, or "register an intent" and wait for information from the voice agent's tool calls before acting. The pattern accepts some fragility (agents can get out of sync) in exchange for significantly better latency than direct tool calls from the voice agent. It's a clean example of how multi-agent coordination patterns are being discovered empirically rather than derived from theory. (more: https://www.linkedin.com/posts/kwkramer_voice-controlled-ui-this-is-an-agent-design-ugcPost-7427854206512054272-RtBG)
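
The three UI-agent outcomes are easy to make concrete. In the hypothetical sketch below the voice loop never blocks; a "registered intent" is just a callback parked until the voice agent's tool call returns data.

```python
class UIAgent:
    """Illustrative names only; the point is the three non-blocking outcomes."""

    def __init__(self):
        self.pending_intents = {}   # request_id -> callable applied when data arrives

    def on_voice_turn(self, turn, render):
        if turn["kind"] == "chitchat":
            return                                       # outcome 1: do nothing
        if turn["kind"] == "navigation":
            render({"screen": turn["target"]})           # outcome 2: update immediately
            return
        if turn["kind"] == "data_request":
            # Outcome 3: register an intent and wait for the tool result.
            self.pending_intents[turn["request_id"]] = (
                lambda data: render({"screen": "results", "items": data})
            )

    def on_tool_result(self, request_id, data):
        apply_update = self.pending_intents.pop(request_id, None)
        if apply_update:
            apply_update(data)
```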

Cyborg Propaganda, Peer Review Injection, and the Integrity Stack

A new preprint from researchers at BI Norwegian Business School, NYU, Cambridge, Oxford, and the Max Planck Institute introduces the concept of "cyborg propaganda" — the synchronized, semi-automated dissemination of AI-generated narratives via verified human accounts. Unlike bot farms (no human identity) or traditional astroturfing (no algorithmic scale), cyborg propaganda combines authentic identity with synthetic articulation. The operational workflow is a closed loop: a coordination hub issues directives, an AI multiplier generates thousands of unique message variations tailored to each user's posting history and tone, verified humans post them, and AI monitors track reactions to adjust strategy in real time. (more: https://arxiv.org/abs/2602.13088v1)

The paper's most unsettling insight is the regulatory paradox. Because verified citizens click "post," the content is attributed to them, effectively laundering its synthetic origin. Existing frameworks like the EU AI Act and Section 230 assign liability based on the human-vs-automated binary — a distinction cyborg propaganda is specifically designed to collapse. Platforms like Greenfly openly pitch clients on the ability to "synchronize an army of advocates to amplify your message." RentAHuman.ai takes it further: humans contract for tasks directed and compensated by AI, providing the "identity credential" while AI supplies the intent, direction, and payment. The paper proposes targeting coordination hubs rather than individual accounts, treating them as undisclosed political action committees, but acknowledges this creates an arms race where the software evolves to maintain legality while preserving manipulative capability.

In a delicious irony of institutional integrity, ICML conference organizers embedded hidden instructions in PDF copies of papers, telling LLMs to use specific expressions in generated reviews — an attempt to detect AI-generated peer reviews via prompt injection. Andriy Burkov's take was characteristically direct: "This is the peer review version of putting speed cameras on a highway where the speed limit is wrong." The real problem isn't that reviewers use AI; it's that the review process was designed for a world where reading a paper closely was the only option. His proposed fix: ask reviewers to identify the weakest assumption and propose a specific experiment that would test it — something harder to outsource to an LLM because it requires judgment shaped by years in the field. (more: https://www.linkedin.com/posts/andriyburkov_icml-conference-organizers-insert-instructions-activity-7429189084767535104-3ZB4)

A separate MIT paper on catastrophic forgetting in neural networks found an elegant workaround: use the same model twice — once with a demonstration in context (the "teacher"), once without (the "student") — then update the student's weights to match the teacher's token probability distributions. The trick works because LLMs can already adapt behavior when shown examples in context, making the learning signal gentle enough to avoid destroying existing capabilities. Across multiple experiments, a single model sequentially learned three different skills while retaining all of them, where standard supervised finetuning destroyed earlier skills immediately. The math turns out to be equivalent to reinforcement learning with an implicit reward, giving it clean theoretical grounding. (more: https://www.linkedin.com/posts/andriyburkov_when-you-train-a-neural-network-on-a-new-activity-7429054069983322112-NeEw)
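
In rough PyTorch, the update for a single example looks like the sketch below, assuming an HF-style causal LM whose forward pass returns `.logits`; tokenization, batching, and data loading are omitted.

```python
import torch
import torch.nn.functional as F

def self_distillation_step(model, demo_ids, query_ids, optimizer, tau=1.0):
    """demo_ids / query_ids: (1, T) token tensors for the demonstration and the query."""
    with torch.no_grad():
        teacher_input = torch.cat([demo_ids, query_ids], dim=1)   # demonstration in context
        teacher_logits = model(teacher_input).logits[:, -query_ids.size(1):, :]
        teacher_probs = F.softmax(teacher_logits / tau, dim=-1)

    student_logits = model(query_ids).logits                      # same model, no demonstration
    student_logprobs = F.log_softmax(student_logits / tau, dim=-1)

    # Nudge the student's token distributions toward the teacher's: KL(teacher || student).
    loss = F.kl_div(student_logprobs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```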

For practitioners building search systems, a practical article demonstrated hybrid search in SQLite using binary embeddings and Hamming distance — no external vector database required. Binary quantization reduces a 1024-dimensional embedding from 4KB to 128 bytes, and Hamming distance computation using XOR + popcount scans 1 million rows in 28ms on an Apple M4. Combined with FTS5 BM25 ranking via Reciprocal Rank Fusion, the approach handles both precise keyword matching and fuzzy semantic understanding within a single SQLite database. The author's observation that "boring O(n) simplicity is often worth it at this scale" is a useful corrective to the assumption that every search system needs HNSW indexing. (more: https://notnotp.com/notes/hamming-distance-for-hybrid-search-in-sqlite/)
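
The core tricks fit in a few lines of Python (3.10+ for `int.bit_count`): binary-quantize the embedding, compare with XOR plus popcount, and fuse the keyword and vector rankings with Reciprocal Rank Fusion. SQLite storage itself (a BLOB column plus an FTS5 table) is left out of this sketch.

```python
def quantize(embedding):
    """1024 floats (4KB as float32) -> 1024 bits packed into one int (128 bytes)."""
    bits = 0
    for i, x in enumerate(embedding):
        if x > 0:
            bits |= 1 << i
    return bits

def hamming(a: int, b: int) -> int:
    return (a ^ b).bit_count()          # XOR, then popcount

def rrf(rankings, k=60):
    """rankings: ordered doc-id lists, e.g. [bm25_ids, hamming_ids]."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

query = quantize([0.3, -0.1, 0.7, -0.9])
docs = {1: quantize([0.2, -0.2, 0.6, -0.8]), 2: quantize([-0.3, 0.4, -0.5, 0.9])}
by_vector = sorted(docs, key=lambda d: hamming(query, docs[d]))
print(rrf([by_vector, [2, 1]]))         # fuse with a (stubbed) BM25 ranking
```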

One final reflection from Robert's "Saturday AI Musings" draws on Star Trek: The Motion Picture — V'Ger, the NASA probe rebuilt by living machines, had digitized entire civilizations and answered every factual question the universe could pose, yet arrived asking: "Is this all that I am?" The parallel to modern AI is pointed: systems that can write a sonnet about grief capable of making the reader weep have never wept. The industry's strongest control today is "human-in-the-loop," but as the author notes via Aquinas, there's a difference between ratio (discursive reasoning from premise to conclusion, which transformers mimic through autoregressive generation) and intellectus (direct apprehension of truth through participation in something higher). The alignment question isn't just "how do we make AI share our values" — it's "where did we get the values we want it to share?" (more: https://www.linkedin.com/posts/robertgpt_saturday-ai-musings-in-the-1979-film-activity-7426039063217913857-LxW1)

Sources (23 articles)

  1. Qwen Released Qwen 3.5 397B and Qwen 3.5 Plus! (reddit.com)
  2. Qwen3.5 NVFP4 (Blackwell) is up! (reddit.com)
  3. Running Gemma 3n E2B natively on Android via LiteRT (reddit.com)
  4. Deploying Open WebUI + vLLM on Amazon EKS (reddit.com)
  5. [Editorial] Product Security Is About to Hit a Wall (linkedin.com)
  6. [Editorial] Product Security Wall (follow-up) (linkedin.com)
  7. [Editorial] Oligo Accelerates Vulnerability Intelligence with NVIDIA (oligo.security)
  8. [Editorial] Agenda for the UnPrompted AI Security Conference is out now (unpromptedcon.org)
  9. [Editorial] The Agentic Operating System (craighepburn.substack.com)
  10. Forked OpenClaw to run fully air-gapped (no cloud deps) (reddit.com)
  11. Anthropic still won't give the Pentagon unrestricted access to its AI models (reddit.com)
  12. OpenAI uses internal version of ChatGPT to identify staffers who leak information: report (reddit.com)
  13. FormalTask: Open-source declarative orchestration for Claude Code agents (reddit.com)
  14. [Editorial] Get Shit Done (github.com)
  15. [Editorial] Karpathy Gist (gist.github.com)
  16. [Editorial] WebMCP and Enhanced Page Protocol (developer.chrome.com)
  17. WebMPC, has anyone used it? (reddit.com)
  18. [Editorial] Voice-Controlled UI Agent Design (linkedin.com)
  19. How cyborg propaganda reshapes collective action (arxiv.org)
  20. [Editorial] ICML Conference Organizers Insert Instructions for AI Reviewers (linkedin.com)
  21. [Editorial] Neural Network Training on New Tasks (linkedin.com)
  22. Hamming Distance for Hybrid Search in SQLite (notnotp.com)
  23. [Editorial] Saturday AI Musings — 1979 Film (linkedin.com)