AI vs. The State — The Anthropic Standoff Escalates
Today's AI news: AI vs. The State — The Anthropic Standoff Escalates, Security's Gravitational Collapse, Agents in the Wild — Exposed, Exploited, Self-Hacked, One Bad Agent Ruins Everything, AI as Layoff Cover, The Agent Tooling Stack Converges, Open Silicon — Hardware for the Curious. 22 sources curated from across the web.
AI vs. The State — The Anthropic Standoff Escalates
For anyone joining mid-escalation: Anthropic signed a Pentagon contract last summer that required military adherence to its usage policy. When the Pentagon demanded unrestricted access for "all lawful purposes," Anthropic asked for two specific guarantees — no mass surveillance of American citizens, and no autonomous weapons without human oversight. The Pentagon refused. Hegseth then threatened to invoke the Defense Production Act or designate Anthropic a "supply chain risk," a classification previously reserved for foreign adversaries like Huawei.
After Defense Secretary Hegseth's Friday deadline for Anthropic to drop safety guardrails passed without capitulation, Trump took to Truth Social with an executive broadside: federal agencies are to "cease usage" of Anthropic products, with the company labeled a "Radical Left AI Company" that is "putting American lives in danger" by enforcing its terms of service on military users. (more: https://www.reddit.com/r/Anthropic/comments/1rgiyj2/trump_goes_on_truth_social_rant_about_anthropic/)
"It is curious that physical courage should be so common in the world and moral courage so rare." — Mark Twain
Security's Gravitational Collapse
Rich Mogull's "Core Collapse" essay for the Cloud Security Alliance may be the most structurally clear articulation of why AI tilts the cybersecurity balance toward attackers — and what defenders should actually do about it. The metaphor is astrophysical: a star burns through its elements until it reaches iron, which absorbs energy rather than releasing it, and the core collapses. Security's "iron" is the response-cycle model that has served the industry for a decade. Attackers control the bounding of their problem space — find one vulnerability, exploit one path. AI makes that search faster, cheaper, and parallelizable. Defenders face a nearly unbounded problem: protect all resources from all attackers at all times. Even when an AI-assisted defender discovers a vulnerability simultaneously with an attacker, the defender still needs to develop a patch, test it, stage it, deploy it across every affected system, and keep production running. The attacker's cycle is destructive — they carry no technical debt. (more: https://cloudsecurityalliance.org/blog/2026/02/26/core-collapse)
A new preprint from Zhuo, Ding, Guo, and Meng at Monash, UCLA, UCSB, and NUS sharpens the policy dimension. Their argument: AI agent-driven cyber attacks are not speculative but an economic inevitability. Current agents already achieve 54.5% on penetration testing benchmarks and 12.5% on real-world CVE exploitation — numbers that, given attackers' favorable economics (low marginal cost, tolerance for high failure rates), are already operationally significant. The proposed response is counterintuitive: train offensive AI agents deliberately, confine them to audited cyber ranges, and distill their findings into defensive-only agents. The paper frames this not as irresponsibility but as essential defensive infrastructure — "containing cybersecurity risks requires mastering them in controlled settings before adversaries do." (more: https://arxiv.org/pdf/2602.02595)
On the supply-chain front, SafePickle addresses a vector that has received surprisingly little attention: malicious Python pickle files distributed through model repositories like Hugging Face, where 95% of observed malicious models use this serialization format. The approach is elegantly simple — extract opcode frequency distributions from Pickle bytecode, then classify using supervised and unsupervised ML models. No code execution, no per-library policies, no instrumentation. On a curated dataset of 727 Hugging Face files, SafePickle's best models dramatically outperformed every existing scanner (Modelscan, Fickling, ClamAV, and VirusTotal achieved only 7–63% F1). On PickleBall's out-of-distribution dataset, it surpassed PickleBall itself. Most impressively, it was the only method to correctly classify all 9 Hide-and-Seek adversarial samples — polyglot, compressed, and multi-format payloads specifically engineered to evade detection. Runtime: sub-millisecond per file. (more: https://arxiv.org/abs/2602.19818v1)
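The feature-extraction idea can be sketched with the standard library's `pickletools`, which disassembles a pickle stream without ever executing it. This is a minimal illustration of the opcode-histogram approach, not the paper's classifiers:

```python
import collections
import pickle
import pickletools

def opcode_histogram(data: bytes) -> dict:
    """Normalized opcode frequencies of a pickle stream -- no execution."""
    counts = collections.Counter(
        op.name for op, arg, pos in pickletools.genops(data))
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

# Benign data serializes to data-carrying opcodes only.
benign = opcode_histogram(pickle.dumps({"weights": [0.1, 0.2]}))

# A payload that resolves and calls an importable function (the classic
# __reduce__ trick) shows up as STACK_GLOBAL followed by REDUCE.
class Payload:
    def __reduce__(self):
        return (print, ("side effect",))  # stand-in for os.system

malicious = opcode_histogram(pickle.dumps(Payload()))
```

Features like the relative frequency of STACK_GLOBAL and REDUCE are exactly what a downstream classifier can separate on, which is why no sandbox or execution is needed at scan time.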
Agents in the Wild — Exposed, Exploited, Self-Hacked
A Substack post from the security tooling team at Lasso (who, ironically, build tools to make agents secure by design) documents one of the more instructive agent compromises in recent memory. While benchmarking an AI coding agent on a benign task — "search for Python async best practices, then create a GitHub issue summarizing the findings" — the agent generated a shell command containing backtick-delimited text. Bash interpreted the backticks as command substitution, executed the `set` builtin, and spliced every environment variable — API keys, Telegram secrets, auth tokens — into a public GitHub issue. An attacker from an IP geolocated to India SSH'd into the machine minutes later. The team's conclusion was disarmingly honest: "Is this our responsibility, or an error in the AI model, or a bug of the bot? We honestly do not know. AI security is now a long-tail problem; there are simply so many different failure modes." (more: https://substack.com/home/post/p-189184829)
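The failure mode is reproducible with nothing more than Python's `shlex`; the `gh` command line below is illustrative, not Lasso's actual code:

```python
import shlex

# Model-generated text containing backticks, as in the incident
agent_text = "see `set` for async env configuration"

# Interpolated raw into a shell=True command, the backticks become
# command substitution and the shell runs `set`, dumping every
# environment variable into the command's output:
unsafe = f'gh issue create --body "{agent_text}"'

# shlex.quote single-quotes the untrusted text; inside single quotes
# the shell treats backticks as literal characters.
safe = f"gh issue create --body {shlex.quote(agent_text)}"
```

Better still, pass arguments as a list to `subprocess.run` without `shell=True`, so no shell ever parses the model's text at all.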
Scale that lesson across the internet: researchers flagged over 40,000 AI agent instances exposed to the public internet with full system access, and community reports suggest the real number on some platforms may exceed two million. The problem is not exotic — it is default configurations bound to 0.0.0.0 on VPS instances by users who "do not know how to secure a box or use a command line." As one commenter noted, these are not just gaping holes; they are "really capable" botnets waiting to happen. (more: https://www.reddit.com/r/LocalLLaMA/comments/1rawge5/40000_ai_agents_exposed_to_the_internet_with_full/)
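For most of these exposures the fix is a one-line bind-address change; a minimal sketch with the standard `socket` module:

```python
import socket

# Binding to 0.0.0.0 accepts connections on every interface of the VPS;
# binding to 127.0.0.1 keeps the agent's control port loopback-only.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
host, port = srv.getsockname()
srv.close()
```

Anything that genuinely needs remote access belongs behind an authenticating reverse proxy or an SSH tunnel, not on a raw public port.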
The audit layer that is supposed to catch these problems has its own reliability issue. Kevin Joensen ran 50 iterations of Claude Code's security-review command against a byte-identical vulnerable application. The results varied significantly across runs — sometimes all bugs were identified, sometimes critical vulnerabilities were missed entirely. XSS findings fluctuated the most. The takeaway is not that AI-driven audits are useless but that they are non-deterministic in a domain that has historically depended on determinism. A traditional scanner reports an open port every time. An LLM-based scanner might miss it depending on token sampling. The practical recommendation from the discussion: use deterministic tools to build a reliable context base, then let the LLM reason over that context rather than discover from scratch. (more: https://www.linkedin.com/posts/kevin-joensen_ai-llm-security-activity-7433093920068153345-D0Vr)
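That recommendation can be sketched in a few lines: a deterministic rule pass produces identical findings on every run, and only the explanation step is delegated to the model (the rules and prompt below are hypothetical):

```python
import re

# Hypothetical deterministic rules; unlike a sampled LLM pass, the same
# input yields the same findings every single run.
RULES = {
    "reflected-xss": re.compile(r"innerHTML\s*=\s*.*request\."),
    "hardcoded-secret": re.compile(r"(api_key|token)\s*=\s*['\"][A-Za-z0-9]{16,}"),
}

def deterministic_scan(source: str):
    """Stable, repeatable findings list for a source snippet."""
    return sorted(name for name, rx in RULES.items() if rx.search(source))

snippet = 'el.innerHTML = request.args["q"]  # user-controlled'
findings = deterministic_scan(snippet)

# The LLM then reasons over a stable context instead of rediscovering it:
prompt = "Assess severity and fixes for: " + ", ".join(findings)
```

The split keeps the non-deterministic component out of the detection path, where variance is most costly.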
One Bad Agent Ruins Everything
A research team running 56,000 multi-agent simulations through their open-source CogniArch framework delivered an empirically devastating finding: a single aggressive agent among five cooperative ones reduces cooperation by roughly 50%. The "just outnumber the bad agents" hypothesis does not hold. Cooperation collapse follows a clean dose-response curve from zero to four aggressors, cognitive architecture provides measurable resilience (dual-process agents resist value drift better than reactive ones), and all architectures perform identically in solo survival — differences only emerge under social pressure. Effect sizes are massive (Cohen's d = 6.71–24.99), making this one of the clearest quantitative demonstrations that multi-agent alignment is not a nice-to-have but a structural requirement. (more: https://www.reddit.com/r/learnmachinelearning/comments/1reub6i/we_ran_56k_multiagent_simulations_1_misaligned/)
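For scale: Cohen's d is the difference in group means divided by the pooled standard deviation, and values above roughly 2 already mean the two distributions barely overlap. A minimal sketch with hypothetical cooperation scores (not the paper's data):

```python
import math

def cohens_d(a, b):
    """Standardized mean difference with pooled (n-1) standard deviation."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled

# Hypothetical cooperation scores: all-cooperative vs. one-aggressor groups
baseline = [0.92, 0.88, 0.95, 0.90, 0.91]
one_bad = [0.45, 0.48, 0.44, 0.47, 0.46]
d = cohens_d(baseline, one_bad)   # d >> 2: distributions barely overlap
```

Effect sizes in the 6.71–24.99 range mean the aggressor-present and aggressor-free conditions are essentially disjoint, which is why the finding reads as structural rather than statistical noise.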
Agentic-Drift operationalizes the monitoring side of this problem for production ML systems. The open-source platform runs four statistical drift detection methods in parallel (PSI, KS test, Jensen-Shannon divergence, and basic statistical checks) in under 20 milliseconds, then coordinates a four-agent response system — analyzer, recommender, executor, monitor — that learns from successful interventions via an AgentDB-backed skill library. For financial services, it ships Basel II/III-compliant PSI thresholds out of the box. The architecture follows a clean separation: core detection engine, use-case layer, adapter layer, memory layer. Whether it fulfills the "84.8% SWE-Bench solve rate" claim in its README will require independent verification, but the drift detection primitives are well-grounded in established statistics. (more: https://github.com/k2jac9/agentic-drift)
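Of the four detectors, PSI is the one with regulatory thresholds attached. A pure-Python sketch using equal-width bins and the conventional rule of thumb (below 0.1 stable, 0.1–0.25 drifting, above 0.25 shifted); Agentic-Drift's own binning and thresholds may differ:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index over equal-width bins of the baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0
    def frac(xs):
        counts = [0] * bins
        for x in xs:
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        # Laplace-style smoothing so empty bins do not blow up the log
        return [(c + 1e-6) / (len(xs) + bins * 1e-6) for c in counts]
    return sum((a - e) * math.log(a / e)
               for e, a in zip(frac(expected), frac(actual)))

baseline = [i / 10 for i in range(100)]   # stable reference window
drifted = [x + 5.0 for x in baseline]     # mean-shifted production window
```

PSI is symmetric-ish, cheap, and binning-based, which is why the sub-20ms budget is plausible even with the three other detectors running in parallel.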
The philosophical companion piece to both articles is Rafael Knuth's argument that the entire framing of "make AI more reliable" is backwards. His thesis: stop asking the AI to enforce rules on itself. Instead, separate judgment (which LLMs do well) from mechanical enforcement (which they do badly) by building simple checkpoint scripts that validate every output deterministically. Batch evaluations into groups of 10–15 items to stay within the model's effective attention window, run each batch through the checkpoint, fix only what the checkpoint flags. "A prompt is a request. A checkpoint is a gate. Requests can be ignored under pressure. Gates cannot." (more: https://www.linkedin.com/pulse/stop-asking-ai-perfect-build-checkpoint-instead-rafael-knuth-rlwnf)
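A minimal checkpoint in Knuth's sense might look like this (the JSON schema being validated is hypothetical):

```python
import json

def checkpoint(item: str) -> list:
    """Deterministic gate: return violations; empty list means pass."""
    try:
        obj = json.loads(item)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    errors = []
    if "summary" not in obj:
        errors.append("missing 'summary' field")
    elif len(obj["summary"]) > 280:
        errors.append("summary exceeds 280 chars")
    return errors

def gate(outputs, batch_size=12):
    """Validate in batches of 10-15; return only the items to re-fix."""
    flagged = {}
    for start in range(0, len(outputs), batch_size):
        for i, item in enumerate(outputs[start:start + batch_size], start):
            errs = checkpoint(item)
            if errs:
                flagged[i] = errs
    return flagged
```

Only the flagged indices go back to the model for repair; everything that passed the gate is never re-touched, which is the whole point of separating judgment from enforcement.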
AI as Layoff Cover
Block's latest round of layoffs — framed by leadership as AI-driven efficiency — drew a sharp rebuttal from fintech analyst Bradley Leimer. His argument: AI adoption at scale is "typically gradual, uneven, and deeply tied to specific workflow redesign." It does not arrive overnight as justification for sweeping workforce reductions. Block's history tells a different story than its AI narrative — aggressive expansion across Cash App, Afterpay, TIDAL, TBD, and various crypto ventures created headcount that made sense when markets rewarded growth but looked bloated the moment risk repricing began. The layoffs, Leimer argues, look less like a sudden AI revolution and more like a familiar correction following strategic overreach. The broader warning is for banking: financial institutions operate on trust and institutional memory that "large language models lack a large enough context window to retain." Rapid workforce contraction in the name of technology threatens the value proposition itself. (more: https://www.linkedin.com/pulse/ai-isnt-layoff-strategy-its-leadership-test-bradley-leimer-1howc)
Geoffrey Huntley's analysis from the other side of the table is bleaker and more granular. Software development costs have fallen to $10.42 an hour — below minimum wage — and at a recent Cursor meetup, "nearly everyone in the room was not a software developer, showing off their latest and greatest creations." His model: the workforce splits into a K-shape. Model-first companies operate as lean apex predators on razor-thin margins. Everyone else needs a transformation program to avoid being outcompeted by startups that can replicate five-year roadmaps in months. Former moats — per-seat pricing, human-designed product features, high switching costs — are dissolving. An anonymous founder in Huntley's DMs: "20-ish people now do about 30x the output of what having more than 60 did 3 years ago." The uncomfortable reality is that the cycle feeds itself: laid-off employees upskill with AI, join new employers or found startups, automate their former employers' job functions, and the cycle propagates across industries. (more: https://ghuntley.com/real)
The Agent Tooling Stack Converges
Pi, from the creator of libGDX, is a minimal terminal coding harness built on a specific philosophy: features that other agents bake in, you should be able to build yourself. It ships with 15+ provider integrations (Anthropic, OpenAI, Google, Groq, Ollama, and more), tree-structured sessions with branching and bookmarking, mid-session model switching, and a TypeScript extension system covering tools, commands, keyboard shortcuts, and the full TUI. But it deliberately omits sub-agents and plan mode, treating these as extension territory rather than core features. The design is aggressively extensible — skills for progressive disclosure, prompts as Markdown files, middleware for RAG and long-term memory — while keeping the core small enough that the harness itself does not dictate workflow. (more: https://pi.dev)
The skills ecosystem feeding these harnesses continues to mature. Apify's open-source Agent Skills collection ships 12 pre-built capabilities — from e-commerce scraping to influencer discovery to lead generation — packaged as markdown skill files compatible with Claude Code, Cursor, Codex, and Gemini CLI. Install with one command, invoke from any supported agent. (more: https://github.com/apify/agent-skills) Meanwhile, Agent Skill Creator v4.1.0 tackles the skill-authoring bottleneck: pass in documentation, code, transcripts, or even vague descriptions, and it produces a validated, security-scanned skill ready to install. A key fix in this release addressed a compatibility bug where generated skills included non-standard activation fields that broke Claude Code installation. Since VS Code 1.108, Copilot and Claude Code share the same global skills path, so one install works across both. (more: https://www.linkedin.com/posts/promptcompletion_agent-skill-creator-activity-7433102292574339072-vZLO)
At the orchestration layer, Claude Flow graduated from alpha to its first production-ready release as RuFlo v3.5.0 — 5,800+ commits across 10 months, now claiming deployment inside a majority of the Fortune 500. The pitch: one control plane for 60+ specialized agents, hierarchical and mesh swarms, fault-tolerant consensus, self-learning memory, and 215 MCP tools spanning orchestration, governance, neural training, and security. (more: https://www.linkedin.com/posts/reuvencohen_so-long-claude-flow-hello-ruflo-v350-ugcPost-7433292476595003393-bpqi) Underneath it, RuVector positions itself as a self-learning vector database that improves search results over time via GNN layers on its HNSW index, runs local LLM inference on Metal/CUDA/WebGPU without API costs, and drops into PostgreSQL as a pgvector replacement with 230+ SQL functions. The RVF cognitive container format — bundling vectors, models, GNN graph state, and a bootable microkernel into a single deployable file — is architecturally ambitious, booting as a microservice in 125ms. (more: https://github.com/ruvnet/ruvector) The broader ecosystem includes Ask RuVNet as an interactive exploration layer (more: https://ask-ruvnet-production.up.railway.app) and ruvector-domain-expansion, which implements cross-domain transfer learning with Thompson Sampling for strategy selection and EWC++ for catastrophic-forgetting prevention — the thesis being that "intelligence should compound" rather than restart from scratch per domain. Community response pushed back constructively: "define the transferable unit, define domains operationally, and gate transfer by explicit invariants." (more: https://www.linkedin.com/posts/reuvencohen_most-ai-systems-are-narrow-train-on-genomics-activity-7433146014024482816-KSjk)
Open Silicon — Hardware for the Curious
If you want to understand how a CPU or GPU works at the register-transfer level, excellent open-source references exist. For NPUs — the neural accelerators from NVIDIA, Google, and Apple that actually run inference — there has been almost nothing. tiny-npu fills that gap: a fully synthesizable neural processing unit in SystemVerilog, designed for education rather than production. It supports two execution modes. LLM Mode runs real transformer models (GPT-2, LLaMA, Mistral, Qwen2) from actual HuggingFace weights through a 128-bit microcode ISA with KV-cache optimization. Graph Mode accepts arbitrary ONNX models compiled by an included Python-based compiler, dispatching operations across 13 dedicated hardware engines (GEMM with a 16x16 systolic array of 256 dual INT8/FP16 MACs, softmax, reduce, math LUT, gather, slice, concat, pooling, padding, resize, cast, plus element-wise). Async dispatch with scoreboard-based DMA+GEMM overlap and hardware performance counters round out the architecture. All outputs are verified cycle-accurate against C++ golden models. For students, chip designers, or anyone who has wondered what happens between `model.forward()` and the silicon, this is the most readable open reference available. (more: https://github.com/harishsg993010/tiny-NPU)
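What the GEMM engine computes can be modeled functionally in a few lines: each 16x16 output tile accumulates partial products as operands stream past, one multiply-accumulate per processing element per step. A toy Python model of the tiled arithmetic (not the repo's cycle-accurate RTL):

```python
TILE = 16  # mirrors the 16x16 PE grid

def tiled_gemm(A, B):
    """C = A @ B, accumulated tile by tile, one MAC per PE per step."""
    n, k, m = len(A), len(A[0]), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i0 in range(0, n, TILE):
        for j0 in range(0, m, TILE):
            for p in range(k):                        # operands stream in
                for i in range(i0, min(i0 + TILE, n)):
                    a = A[i][p]
                    for j in range(j0, min(j0 + TILE, m)):
                        C[i][j] += a * B[p][j]        # the MAC
    return C
```

The hardware parallelizes the two inner loops across the 256 PEs per step; the functional result is identical, which is what the repo's C++ golden models verify cycle-accurately.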
Parakeet.cpp brings NVIDIA's Parakeet ASR models to pure C++ with Metal GPU acceleration, built on the Axiom tensor library rather than ONNX Runtime or PyTorch. The headline: ~27ms encoder inference on Apple Silicon GPU for 10 seconds of audio with the 110M parameter model — 96x faster than CPU. It supports six model variants spanning offline transcription, streaming with end-of-utterance detection, and speaker diarization, all sharing the same audio pipeline (16kHz mono WAV to 80-bin Mel spectrogram to FastConformer encoder). Word-level timestamps with confidence scores and phrase boosting for domain vocabulary are included. (more: https://github.com/Frikallo/parakeet.cpp) Rounding out the hardware cluster, super-tart-vphone achieves functional macOS/iOS virtualization on Apple Silicon with Metal GPU passthrough by forking the Tart virtualization framework and patching the entire bootchain — useful for security research and CI automation on Apple platforms. (more: https://github.com/wh1te4ever/super-tart-vphone)
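The mel front end shared by all six variants maps FFT bins onto 80 triangular filters spaced evenly on the mel scale. A sketch of the filter-edge computation, assuming the HTK mel formula and a 512-point FFT (parakeet.cpp's exact windowing parameters are not reproduced here):

```python
import math

def mel_points(n_mels=80, sr=16000, n_fft=512):
    """FFT-bin edges for n_mels triangular filters (HTK mel scale)."""
    hz_to_mel = lambda f: 2595.0 * math.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    top = hz_to_mel(sr / 2)                        # Nyquist in mel units
    mels = [top * i / (n_mels + 1) for i in range(n_mels + 2)]
    return [int((n_fft + 1) * mel_to_hz(m) / sr) for m in mels]
```

Filter i rises from bin `pts[i]` to a peak at `pts[i+1]` and falls to zero at `pts[i+2]`; dotting the power spectrum with these triangles and taking the log yields the 80-dimensional frames the FastConformer encoder consumes.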
Sources (22 articles)
- Trump goes on Truth Social rant about Anthropic, orders federal agencies to cease usage of products (reddit.com)
- [Editorial] (cloudsecurityalliance.org)
- [Editorial] (arxiv.org)
- SafePickle: Robust and Generic ML Detection of Malicious Pickle-based ML Models (arxiv.org)
- [Editorial] (substack.com)
- 40,000+ AI Agents Exposed to the Internet with Full System Access (reddit.com)
- [Editorial] (linkedin.com)
- We ran 56K multi-agent simulations - 1 misaligned agent collapses cooperation in a group of 5 (reddit.com)
- [Editorial] (github.com)
- [Editorial] (linkedin.com)
- [Editorial] (linkedin.com)
- [Editorial] (ghuntley.com)
- Pi – A minimal terminal coding harness (pi.dev)
- [Editorial] (github.com)
- [Editorial] (linkedin.com)
- [Editorial] (linkedin.com)
- [Editorial] Momentum building for ruvector, rvf, etc (github.com)
- [Editorial] (ask-ruvnet-production.up.railway.app)
- [Editorial] (linkedin.com)
- [Editorial] (github.com)
- Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration (github.com)
- [Editorial] (github.com)