Security and Trust Under Fire
Today's AI news: Security and Trust Under Fire, The Agentic Coding Arms Race, Big Models, Small Iron, Agent Memory and Developer Infrastructure, Research Frontiers, AI Meets the Physical World. 22 sources curated from across the web.
Security and Trust Under Fire
Vercel disclosed a security incident this week demonstrating how third-party AI tool integrations have become a soft underbelly of cloud infrastructure. The breach originated not from a Vercel system but from Context.ai, an AI platform whose compromised Google Workspace OAuth application gave attackers a foothold in a Vercel employee's account. From there, the attacker escalated into Vercel environments and enumerated environment variables that were not marked as "sensitive" and were therefore not encrypted at rest. Vercel CEO Guillermo Rauch confirmed the chain: Context.ai breach → employee Google Workspace → Vercel environments → unencrypted env vars → further access. A threat actor claiming to be ShinyHunters (though actual ShinyHunters members deny involvement) is now offering 580 employee records, API keys, NPM tokens, and internal deployment access on hacking forums, with an alleged $2 million ransom demand. Vercel says Next.js and its open-source projects remain safe, and has rolled out dashboard improvements for managing sensitive variables: the kind of feature you ship after the horse has left the barn. (more: https://www.bleepingcomputer.com/news/security/vercel-confirms-breach-as-hackers-claim-to-be-selling-stolen-data/)
The European Commission's week was arguably worse. Ursula von der Leyen presented the EU's new age-verification app on Wednesday, declaring it "technically ready" and "fully open source." Security consultant Paul Moore broke it in under two minutes. The app stored sensitive data unprotected on-device, and French white-hat Baptiste Robert confirmed that biometric authentication was bypassable, meaning your nephew could borrow your phone and prove he's over 18. The Commission's defense evolved from "it's technically ready" to "it's a demo version" to "the code will be constantly updated" within 48 hours, while both researchers confirmed they were testing the latest published code. Over 400 security experts had already sent the Commission a letter in March calling for a moratorium on age-verification deployment until the science settled. Polish MEP Piotr Müller called it "step-by-step creation of a Chinese-style internet in Europe." The deeper issue isn't the bugs; it's the political pressure to ship identity-linked verification infrastructure before anyone has figured out how to do it safely. (more: https://www.politico.eu/article/eu-brussels-launched-age-checking-app-hackers-say-took-them-2-minutes-break-it/)
Trust manipulation extends well beyond government apps. A peer-reviewed Carnegie Mellon study (ICSE 2026) analyzed 20 terabytes of GitHub metadata and found approximately 6 million suspected fake stars distributed across 18,617 repositories by roughly 301,000 accounts. Stars sell for $0.06 apiece on at least a dozen websites; no dark web required. The investigation maps the full pipeline from star farms to venture capital: Redpoint Ventures' Jordan Segall published data showing the median GitHub star count at seed financing is 2,850, with firms running automated scrapers to identify fast-growing repos. For $85 to $285 in purchased stars, a startup can manufacture that seed-stage median. The fork-to-star ratio emerges as the strongest detection heuristic: organic projects like Flask show 235 forks per 1,000 stars, while FreeDomain (157,000 stars) manages just 11. One AI project topped Runa Capital's ROSS Index with 74,300 stars and 54x growth; the study flagged nearly half its stars as artificial. The FTC's 2024 rule banning fake social influence metrics carries penalties up to $51,744 per violation, and the SEC has already charged founders for inflating traction during fundraising. Nobody has been charged specifically for fake GitHub stars yet. Given the math (a $50 problem with a $50 million consequence), it may only be a matter of time. (more: https://awesomeagents.ai/news/github-fake-stars-investigation/)
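The fork-to-star heuristic is simple enough to sketch. The function below is a minimal illustration of the idea, not the study's actual classifier; the flagging threshold is an assumption chosen between the two ratios the study reports.

```python
# Illustrative sketch of the fork-to-star detection heuristic. The 30.0
# threshold is an assumption, not a value from the Carnegie Mellon study.
def forks_per_1000_stars(forks: int, stars: int) -> float:
    """Organic projects tend to accumulate forks alongside their stars."""
    if stars == 0:
        return 0.0
    return forks / stars * 1000

def looks_inflated(forks: int, stars: int, threshold: float = 30.0) -> bool:
    # Per the study: Flask-like organic repos show ~235 forks per 1,000 stars,
    # while suspected star-farmed repos can show ~11.
    return forks_per_1000_stars(forks, stars) < threshold

print(forks_per_1000_stars(235, 1000))   # Flask-like ratio: 235.0
print(looks_inflated(1727, 157000))      # FreeDomain-like ratio (~11): True
```

A real detector would combine this with account-age and star-burst timing signals; the ratio alone is just the study's strongest single feature.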
The Agentic Coding Arms Race
OpenAI released what amounts to a "Codex does everything now" update. Background computer use lets multiple agents operate your Mac in parallel (seeing, clicking, and typing with their own cursors) without interfering with your work. An in-app browser enables direct annotation of web pages for frontend iteration. Memory lets Codex remember preferences and corrections across sessions. Automations can reuse conversation threads, schedule future work, and wake up autonomously days later to continue tasks. Over 90 new plugins ship, spanning Atlassian Rovo, CircleCI, GitLab, and Microsoft Suite. The framing is explicit: Codex is no longer a code assistant but an autonomous software development partner that operates across the full lifecycle. (more: https://openai.com/index/codex-for-almost-everything/)
The logical endpoint of this trajectory is the dark factory: a codebase where AI handles planning, implementation, review, and production deployment with no human in the loop. Cole Medin is running the first public experiment, built on Archon (his open-source harness builder) driving Claude Code sessions routed to MiniMax M2.7 for cost reasons. The architecture has three governance documents (mission, factory rules, CLAUDE.md) loaded into every workflow, plus a triage → implement → validate loop running on a cron schedule. The critical design decision, borrowed from StrongDM's production dark factory, is the holdout pattern: the validation agent receives the user journey and code diffs but zero context from the development process, preventing the sycophancy that lets AI agents paper over their own bugs. Issues come in via GitHub, get triaged against the mission scope, implemented in parallel worktrees, validated with browser automation, and merged without human approval. (more: https://www.youtube.com/watch?v=6woc6ii-zoE)
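The holdout pattern boils down to information hygiene at prompt-construction time. Here is a minimal sketch of that idea; the function name, field layout, and prompt wording are all illustrative, not taken from Archon or StrongDM's implementation.

```python
# Minimal sketch of the "holdout" validation pattern: the validator sees
# only the user journey and the diff, never the development transcript,
# so it cannot rationalize the implementer's choices.

def build_validator_context(user_journey: str, code_diff: str,
                            dev_transcript: str) -> str:
    # dev_transcript is accepted but deliberately discarded: the whole point
    # of the holdout pattern is that it never reaches the validation agent.
    del dev_transcript
    return (
        "You are a validation agent. Judge ONLY whether the diff satisfies "
        "the user journey.\n\n"
        f"## User journey\n{user_journey}\n\n"
        f"## Code diff\n{code_diff}\n"
    )

prompt = build_validator_context(
    user_journey="User can reset their password via an email link.",
    code_diff="+ def send_reset_email(user): ...",
    dev_transcript="I skipped rate limiting because it was hard.",
)
assert "rate limiting" not in prompt  # dev context never leaks through
```

The asymmetry is the design: the implementer works with full context, while the judge works blind, which is what keeps the judge from inheriting the implementer's blind spots.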
The question of what responsible agent-assisted open source looks like gets a thoughtful answer from Hugging Face. Their transformers-to-MLX Skill is a recipe for porting language models from the transformers library to Apple's MLX framework: not a fire-and-forget automation but a contributor tool that produces PRs designed to look like careful human submissions. The Skill runs per-layer comparisons against the transformers baseline, detects undeclared dtypes from safetensors metadata, and generates a separate non-agentic test harness for reproducibility. Critically, it will not open a PR until the contributor has accepted the results, and it always discloses agent assistance. The post is equally notable for its diagnosis of the problem: PR volume to transformers has gone up tenfold since code agents started working, but the number of maintainers has not, and agents "break implicit contracts between the library and its users" by following "best practices" that violate undocumented design decisions. (more: https://huggingface.co/blog/transformers-to-mlx)
On the tooling front, Anthropic confirmed that OpenClaw-style Claude CLI usage is sanctioned again, with OpenClaw supporting API keys, Claude CLI reuse, prompt caching, and a 1M context window beta for Opus/Sonnet models (more: https://docs.openclaw.ai/providers/anthropic). Meanwhile, the Pi coding agent continues gaining traction as a minimalist Claude Code alternative β fighting the feature bloat that has made Claude Code's system prompt unpredictable and its behavior harder to control. Pi's philosophy: build the smallest extensible core, then let the agent modify itself to add any capability you need. (more: https://youtu.be/XSmI7OYd7iM)
Big Models, Small Iron
The local inference community is working through the Kimi K2.6 deployment question, and the numbers are sobering. This 1.1 trillion-parameter MoE model needs roughly 768GB of VRAM to hit 25-30 tokens per second at full precision with full context. One user got it running on a 512GB M3 Ultra at a "glorious" 6 tokens per second using an IQ2_KL quantization, functional but hardly production speed. The realistic hardware estimate for target throughput ranges from $28,000 (AMD Genoa + 2-4 RTX Pro 6000s) to $80,000 (8x RTX Pro 6000, 96GB each). The model is natively quantization-aware trained (QAT) at INT4, which helps the footprint, but full 265K context at F16 KV cache still demands around 48GB on top of the ~548GB of model weights. (more: https://www.reddit.com/r/LocalLLaMA/comments/1srd9pc/anyone_deployed_kimi_k26_on_their_local_hardware/)
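The KV-cache figure comes from straightforward arithmetic, which is easy to reproduce for any model. The calculator below uses the standard formula; the 60-layer configuration is a made-up example, not Kimi K2.6's actual architecture, which the thread does not spell out.

```python
# Back-of-envelope KV-cache sizing, the kind of math behind the thread's figures.
# Formula: 2 (K and V) x layers x kv_heads x head_dim x seq_len x bytes/elem.
# The hyperparameters below are ILLUSTRATIVE, not Kimi K2.6's real config.

def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed for a dense F16 KV cache (bytes_per_elem=2) by default."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# A hypothetical 60-layer model with 8 KV heads of dim 128, F16 cache,
# at the 265K context mentioned above:
gib = kv_cache_bytes(60, 8, 128, 265_000) / 2**30
print(f"{gib:.1f} GiB")
```

Note that attention variants like GQA and MLA shrink this substantially, which is why the thread's ~48GB figure for a 1.1T-parameter model is plausible despite the model's size.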
At the opposite end of the cost spectrum, a detailed build guide turns a Xiaomi 12 Pro into a 24/7 headless AI server. The recipe: flash LineageOS, kill the Android UI entirely to reclaim RAM, disable the phantom process killer via ADB, and run Ollama in Termux with a wake lock. A custom script suite ("Zeus") handles Wi-Fi recovery after UI shutdown, battery cycling between 40-80% with Telegram alerts, and automated fan control via a SONOFF smart plug triggered at 45°C. Benchmarks show Gemma 4 E2B at Q8 generating 2.16 tok/s and Qwen2.5 7B at Q4 managing 1.54 tok/s. An update after compiling llama.cpp with ARM NEON intrinsics and Q4_0 quantization reportedly improved speed dramatically. (more: https://www.reddit.com/r/LocalLLaMA/comments/1smedrp/247_headless_ai_server_on_xiaomi_12_pro_guide/)
Gemma 4's smallest variant, the e4B-IT, is drawing attention for a different reason: it follows instructions so eagerly that it essentially has no refusal behavior out of the box. Users report the vanilla model handles "unhinged and dark" content without pushback, with community-produced uncensored quantizations pushing the envelope further. (more: https://www.reddit.com/r/LocalLLaMA/comments/1srafun/gemm4e4bit_good_at_instructions_following_no/)
Agent Memory and Developer Infrastructure
A benchmark of four agent memory systems delivers the kind of results that should make anyone relying on these tools pause. Mem0 scores 49% recall, worse than a coin flip. Zep burns 340x more tokens for a 15-point improvement. The community response was blunt: "most agent memory systems are out of vibe coding hell, a plague like Todo-Apps, LLM Routers and Flappy-Bird Games." The deeper critique is architectural: memory is token-expensive and fundamentally misaligned with how LLMs work. These aren't real people who benefit from remembering; they're stateless inference engines where context window management matters more than persistent storage. The most practical suggestion in the thread: "chronologically stacked, meticulously prompted, unpackable and repackable compaction." Several commenters admitted to manually pruning JSON files after each session because nothing automated worked without destroying their token budget. (more: https://www.reddit.com/r/LocalLLaMA/comments/1sm6lt6/benchmarked_4_agent_memory_systems_mem0_scores_49/)
A single-file OpenWebUI filter function tackles these problems pragmatically: semantic routing classifies queries into five categories with different system prompts, multi-provider failover switches from Groq to Fireworks mid-request, citation verification checks that factual responses cite real URLs from search results rather than hallucinated ones, and per-chat memory stores conversation turns with BM25+cosine hybrid retrieval plus compression of the oldest turns (more: https://www.reddit.com/r/OpenWebUI/comments/1snvwwx/opensourced_my_openwebui_router_semantic_routing/). Separately, the Inline Visualizer v2 plugin enables live streaming visualization rendering inside OpenWebUI chats: SVGs assembling token by token, Chart.js bars populating column by column, D3 force graphs settling in real time. A custom safe-cut HTML parser plus an incremental DOM reconciler ensures existing elements never re-mount and animations don't retrigger. (more: https://www.reddit.com/r/OpenWebUI/comments/1sqzwzh/this_should_not_be_possible_in_open_webui_live/)
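BM25+cosine hybrid retrieval is a common pattern worth sketching. The snippet below is a generic, self-contained illustration (with a deliberately tiny corpus and toy embeddings), not the filter's actual code; the 0.5 blending weight and max-normalization are assumptions.

```python
# A minimal sketch of BM25 + cosine hybrid retrieval. Blend weight (alpha)
# and score normalization are assumptions; real implementations vary.
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))                       # document frequency per term
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(s)
    return scores

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)); nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_rank(query, docs, query_vec, doc_vecs, alpha=0.5):
    bm = bm25_scores(query, docs)
    top = max(bm) or 1.0                           # normalize BM25 into [0, 1]
    scores = [alpha * (s / top) + (1 - alpha) * cosine(query_vec, dv)
              for s, dv in zip(bm, doc_vecs)]
    return sorted(range(len(docs)), key=lambda i: -scores[i])

docs = ["the cat sat", "dogs bark loudly", "cat videos"]
order = hybrid_rank("cat", docs, [1.0, 0.0],
                    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
```

The lexical side catches exact-term matches the embedding model might blur together, while the cosine side catches paraphrases BM25 misses; blending the two is the whole trick.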
Cloudflare launched a scanner that evaluates how ready websites are for AI agent interaction, checking categories that include discoverability (robots.txt, sitemaps), AI bot rules, agent infrastructure (server cards, agent skills, API catalogs, OAuth discovery), and Model Context Protocol (MCP) support. The message is clear: the web is being restructured around agent access, and sites that don't publish machine-readable capability descriptions will be invisible to the next generation of automated consumers. (more: https://isitagentready.com)
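One of those checks (AI bot rules) is easy to illustrate: parse robots.txt and report how a site treats known AI crawlers. This is a naive sketch, not Cloudflare's scanner; the bot list is illustrative and the parse ignores spec subtleties like rule groups and Allow precedence.

```python
# A rough sketch of an "AI bot rules" check: naively parse robots.txt for
# rules aimed at known AI crawlers. Bot list is illustrative, not exhaustive.
AI_BOTS = {"gptbot", "claudebot", "google-extended", "perplexitybot"}

def ai_bot_rules(robots_txt: str) -> dict:
    """Map each AI bot user-agent to 'disallowed' or 'partially allowed'."""
    rules, current = {}, None
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()       # drop comments
        if ":" not in line:
            continue
        key, value = (p.strip() for p in line.split(":", 1))
        if key.lower() == "user-agent":
            current = value.lower()
        elif key.lower() == "disallow" and current in AI_BOTS and value:
            rules[current] = "disallowed" if value == "/" else "partially allowed"
    return rules

example = """
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/
"""
print(ai_bot_rules(example))
# {'gptbot': 'disallowed', 'claudebot': 'partially allowed'}
```

A full agent-readiness check would also probe sitemaps, OAuth discovery endpoints, and MCP metadata, but robots.txt is where most sites' agent policy currently lives.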
Research Frontiers
As LLMs push toward million-token context windows, the KV cache becomes the bottleneck: VRAM scales linearly with sequence length, and existing Top-K eviction strategies delete tokens that may become critical later. A new approach reframes the problem entirely: instead of deciding which tokens to remove, mathematically summarize them. The SRC (Selection-Reconstruction-Compression) pipeline uses Shannon entropy of attention weights to identify "diffuse" tokens whose information is less specific, moves them to a recycle bin, solves an ordinary least squares problem to reconstruct their functional contribution as a compact weight matrix, then compresses via SVD to retain only the dominant singular values. Under equal memory budgets, this entropy-guided approach (HAE) achieves near-zero reconstruction error at a 30% keep ratio while Top-K degrades rapidly. It actually uses less memory than Top-K across all ratios because redundant information is mathematically removed rather than explicitly stored. The primary trade-off is computation time for the OLS and SVD steps; the authors plan custom Triton kernels to amortize that cost. (more: https://jchandra.com/posts/hae-ols/)
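The three SRC stages can be sketched in a few lines of linear algebra. This is a stylized toy under loose assumptions (random "attention" and "values," a simple least-squares map from kept to binned values); the paper's actual formulation of the reconstruction objective differs in detail.

```python
# A stylized sketch of the Selection-Reconstruction-Compression idea.
# NOT the paper's exact formulation; it only illustrates the three stages.
import numpy as np

rng = np.random.default_rng(0)
attn = rng.dirichlet(np.ones(16), size=32)   # 32 queries attending over 16 keys
V = rng.normal(size=(16, 8))                 # value vector per key

# 1) Selection: Shannon entropy of each key's attention profile.
#    High entropy = "diffuse" token, candidate for the recycle bin.
p = attn / attn.sum(axis=0, keepdims=True)
entropy = -(p * np.log(p + 1e-12)).sum(axis=0)
keep = np.argsort(entropy)[: int(16 * 0.3)]          # 30% keep ratio
binned = np.setdiff1d(np.arange(16), keep)

# 2) Reconstruction: OLS map expressing binned values via kept values.
W, *_ = np.linalg.lstsq(V[keep].T, V[binned].T, rcond=None)

# 3) Compression: truncate the map to its dominant singular values.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
r = 2
W_lowrank = U[:, :r] @ np.diag(S[:r]) @ Vt[:r]
err = np.linalg.norm(V[keep].T @ W_lowrank - V[binned].T)
```

The memory win comes from stage 3: instead of storing the binned KV entries, you store only the small low-rank factors of W.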
A USC paper introduces Density-Guided Response Optimization (DGRO), which aligns language models to community norms without any explicit preference labels. The key observation: responses that communities consistently accept cluster in high-density regions of embedding space, forming a measurable "acceptance manifold." DGRO uses local kernel density estimation conditioned on nearby conversation contexts to rank candidate responses, then feeds those density-derived rankings into standard DPO training. Tested on eating disorder support communities and Russian-language conflict documentation forums β domains where explicit preference annotation would be ethically fraught or culturally misaligned β DGRO-aligned models consistently outperformed supervised fine-tuning baselines in head-to-head evaluation. The paper is careful about limitations: DGRO reproduces whatever norms exist, including harmful ones, and is vulnerable to coordinated manipulation of acceptance signals. It is positioned as a descriptive tool, not a normative authority. (more: https://arxiv.org/abs/2603.03242v1)
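The density-ranking step at DGRO's core can be sketched with a plain Gaussian kernel density estimate. This is a hedged illustration only: the bandwidth, embedding dimensions, and clustering setup below are invented, and the paper's KDE is additionally conditioned on nearby conversation contexts, which this toy omits.

```python
# A hedged sketch of DGRO's density-ranking step: score candidate response
# embeddings by Gaussian kernel density over embeddings of community-accepted
# responses, then use the ranking to form DPO-style (chosen, rejected) pairs.
# Bandwidth and embeddings are illustrative; the paper also conditions on
# nearby conversation contexts, which this toy version omits.
import numpy as np

def kde_score(x, accepted, bandwidth=0.5):
    """Mean Gaussian kernel between x and each accepted-response embedding."""
    d2 = ((accepted - x) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * bandwidth**2)).mean()

rng = np.random.default_rng(1)
accepted = rng.normal(0.0, 0.3, size=(50, 4))         # acceptance "manifold"
candidates = np.stack([np.zeros(4), np.full(4, 3.0)]) # on-manifold vs off

scores = [kde_score(c, accepted) for c in candidates]
chosen = candidates[int(np.argmax(scores))]
rejected = candidates[int(np.argmin(scores))]
# (chosen, rejected) pairs would then feed a standard DPO training loss.
```

The point is that no human ever labeled a preference: the "preference" is read off the density of what the community already accepted.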
NVIDIA's Nemotron OCR v2 demonstrates the power of synthetic data at scale. A modified SynthDoG pipeline generates 12 million training images across six languages with pixel-perfect ground truth at word, line, and paragraph levels, including reading-order relation graphs that most real-world OCR datasets lack. The result: normalized edit distance (NED) scores drop from 0.56-0.92 (unusable) to 0.035-0.069 on non-English languages. A single unified model handles all languages simultaneously at 34.7 pages per second on one A100, outperforming even language-specialized PaddleOCR variants while running 28x faster (more: https://huggingface.co/blog/nvidia/nemotron-ocr-v2). In video generation, a Max Planck team introduces Physical Simulator In-the-Loop Video Generation (PSIVG), which integrates an MPM-based physics simulator directly into the diffusion process. The pipeline reconstructs 4D scenes from template videos, initializes them in a physics engine, and uses simulated trajectories to guide generation toward physically coherent motion. A test-time texture consistency optimization (TTCO) technique adapts text embeddings based on pixel correspondences from the simulator. In user studies, PSIVG was preferred in 82.3% of comparisons against baseline methods. (more: https://arxiv.org/abs/2603.06408v1)
AI Meets the Physical World
AutoProber is the kind of project that makes you wonder what garage security research will look like in five years. Built from a $300 CNC machine, a USB microscope, a Siglent oscilloscope, and 3D-printed parts, it gives an AI agent everything it needs to autonomously probe hardware targets. The workflow: the agent identifies the target on the CNC bed, takes microscope frames while tracking XYZ coordinates, stitches them into an annotated map identifying pads, pins, and chips, proposes probe targets for human approval, then physically probes the approved points. Safety is taken seriously: an independent optical endstop on oscilloscope Channel 4 provides continuous monitoring during any motion, with any anomaly triggering an immediate feed hold. The entire stack is Python-controlled with a Flask dashboard, and the agent handles everything from homing and calibration to target identification. (more: https://github.com/gainsec/autoprober)
Meanwhile, a Chinese-language GitHub repository called "awesome-human-distillation" catalogs 204 projects that turn real people (colleagues, mentors, ex-partners, parents, deceased family members) into callable Claude Code Agent Skills. Categories span workplace (distill your boss), academia (distill your thesis advisor), intimate relationships (distill your ex, your crush, your first love), and public figures (from Confucius to Trump to Jack Ma). The repository's disclaimer is characteristically self-aware: "We have no intention of becoming traitors to humanity β€” down with AI-centrism!" Anti-distillation skills also exist, designed to make your distilled skill file look complete while keeping the real knowledge to yourself. (more: https://github.com/mliu98/awesome-human-distillation)
On the evaluation side, Charles Martin's analysis of LLM-as-a-Verifier techniques highlights a shift in how we extract judgment from models. Rather than taking the single score token a model outputs, the system examines logprobs across all possible score tokens, computes expected scores from the full distribution, repeats verification multiple times, and decomposes evaluation across multiple criteria. Results include 77.8% on SWE-Bench Verified and 78.9% pairwise verification accuracy on Terminal-Bench. The implication: some older best practices for LLM judges need revision as frontier models improve at nuanced evaluation, but the gains appear tied to the specific verifier setup, not a blanket "switch from 1-5 to 1-100" recommendation. (more: https://www.linkedin.com/posts/charlesmartin14_%F0%9D%97%9F%F0%9D%97%9F%F0%9D%97%A0-%F0%9D%97%AE%F0%9D%98%80-%F0%9D%97%AE-%F0%9D%97%A9%F0%9D%97%B2%F0%9D%97%BF%F0%9D%97%B6%F0%9D%97%B3%F0%9D%97%B6%F0%9D%97%B2%F0%9D%97%BF-%F0%9D%97%AA%F0%9D%97%B5%F0%9D%98%86-%F0%9D%97%99%F0%9D%97%B6%F0%9D%97%BB%F0%9D%97%B2-%F0%9D%97%9A%F0%9D%97%BF%F0%9D%97%AE%F0%9D%97%B6%F0%9D%97%BB%F0%9D%97%B2%F0%9D%97%B1-activity-7450366817715515393-S_Qu)
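The expected-score trick is compact enough to show directly. This sketch illustrates the general technique under the assumption of a 1-5 score scale; the token set and logprob values are made up, and a real implementation would read them from an inference API's logprobs output.

```python
# A minimal sketch of distribution-based verification scoring: instead of
# taking the single sampled score token, read the model's log-probabilities
# over all score tokens and compute the expectation. Values are illustrative.
import math

def expected_score(score_logprobs: dict[str, float]) -> float:
    """score_logprobs maps score-token strings ('1'..'5') to logprobs."""
    probs = {s: math.exp(lp) for s, lp in score_logprobs.items()}
    z = sum(probs.values())  # renormalize over the score-token subset
    return sum(int(s) * p / z for s, p in probs.items())

# A judge that would greedily emit "4", but with real mass on "3" and "5":
lp = {"1": -8.0, "2": -6.0, "3": -2.5, "4": -0.4, "5": -1.5}
print(round(expected_score(lp), 2))  # 4.14, not a flat 4
```

The payoff is a continuous score that reflects the judge's uncertainty, which is exactly the signal that repeated verification and per-criterion decomposition then aggregate.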
Sources (22 articles)
- Vercel April 2026 security incident (bleepingcomputer.com)
- Brussels launched an age checking app. Hackers took 2 minutes to break it (politico.eu)
- GitHub's Fake Star Economy (awesomeagents.ai)
- Codex for almost everything (openai.com)
- I'm Building an AI Dark Factory That Ships Its Own Code (Public Experiment) (youtube.com)
- The PR you would have opened yourself (huggingface.co)
- Anthropic says OpenClaw-style Claude CLI usage is allowed again (docs.openclaw.ai)
- [Editorial] (youtu.be)
- Anyone deployed Kimi K2.6 on their local hardware? (reddit.com)
- 24/7 Headless AI Server on Xiaomi 12 Pro (Guide & Benchmarks) Gemma4 VS Qwen2.5 (reddit.com)
- Gemm4:e4B-IT good at instructions following no refusals. (reddit.com)
- Benchmarked 4 agent memory systems: Mem0 scores 49% recall (worse than a coin flip), Zep uses 340x more tokens for 15 points improvement. Here's what's actually going on. (reddit.com)
- Open-sourced my OpenWebUI router β semantic routing, citation verification, and per-chat memory for any LLM. (reddit.com)
- THIS SHOULD NOT BE POSSIBLE IN OPEN WEBUI: LIVE VISUALIZATION RENDERING - Inline Visualizer v2 is HERE! (reddit.com)
- Is Your Site Agent-Ready? (By Cloudflare) (isitagentready.com)
- High-Fidelity KV Cache Summarization Using Entropy and Low-Rank Reconstruction (jchandra.com)
- Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals (arxiv.org)
- Building a Fast Multilingual OCR Model with Synthetic Data (huggingface.co)
- Physical Simulator In-the-Loop Video Generation (arxiv.org)
- Guy builds AI driven hardware hacker arm from duct tape, old cam and CNC machine (github.com)
- mliu98/awesome-human-distillation (github.com)
- [Editorial] (linkedin.com)