AI Infrastructure

Deployment, Kubernetes, scaling, cloud vs local, MLOps

939 articles across 210 editions

Articles

SAGA: Workflow-Atomic Scheduling for AI Agent Inference on GPU Clusters -- 2026-07-09
OfficeCLI: Office suite for AI agents to read and edit Microsoft Office files -- 2026-07-09
Toolport: Use as many MCP servers as you want without the token tax -- 2026-07-09
GPT-5.6 Sol Ultra will be in Codex -- 2026-07-09
Cloudflare Meerkat - Globally distributed consensus -- 2026-07-08
Run AI workloads on any cloud, store on Hugging Face: zero-egress storage with SkyPilot -- 2026-07-08
DeepSeek V4 Branch: Quantized KV Cache Fixes Enable 1M Context on Single RTX PRO 6000 -- 2026-07-07
Athena: 100% Local Voice-to-Voice Assistant — Qwen3.5-397B + Orpheus TTS + Whisper, All C++, Zero Cloud -- 2026-07-07
6x P40 Homelab Running Minimax M2.7 Q3_XL — Full Benchmarks and Config -- 2026-07-07
Uncensored Heretic: Gemma 4 12B with 13/100 Refusals and 0.0367 KLD — Safetensors + GGUF -- 2026-07-07
CPPO: Cumulative Prefix-Divergence Policy Optimization for LLM Reinforcement Learning -- 2026-07-07
CORTIS: Continual Speaker Identity Unlearning in Zero-Shot TTS -- 2026-07-07
96gb+ 4090's and 5090 are literally a scam. I mods these cards myself -- 2026-07-02
Mtia v2 - what to do -- 2026-07-02
HIP: use hipBLAS for dense prefill on gfx900, keep MMQ for MoE by DEV-DUFORD · Pull Request #24588 · ggml-org/llama.cpp -- 2026-07-02
Is Qwen3-VL-2B the only viable VLM for JSON extraction on a "potato"? -- 2026-07-02
Phantoms and Disclosures: a Causal Framework for Auditing Synthetic Data -- 2026-06-29
How do I prove that I don't collect data from my llm app? -- 2026-06-29
Cloudflare launched self-managed OAuth for all -- 2026-06-29
Apple to skip high-end M6 Mac chips in favor of AI-focused M7 line -- 2026-06-27
NVIDIA's New AI Servers Run on Hot Tub Coolant and Don't Need Evaporators -- 2026-06-27
Accelerating Fine-Tuning with NVIDIA NeMo AutoModel -- 2026-06-25
NVIDIA Claims 15x Speedup with Diffusion-Based LLM — Entire Block of Text Generated at Once -- 2026-06-25
OpenAI unveils its first custom chip, built by Broadcom -- 2026-06-25
Qualcomm to Acquire Modular -- 2026-06-25
45°C cooling design cuts data center water use to near zero -- 2026-06-25
OpenAI's CSO on Registration List for Thiel's Secret $16K Retreat Alongside Treasury Secretary -- 2026-06-25
[Editorial] The Easiest Way to Lie with AI -- 2026-06-25
[Editorial] Video Pick 1 -- 2026-06-19
Epic Games announces Lore version control system -- 2026-06-19
Supertone/supertonic-3 -- 2026-06-19
[Editorial] Video Pick 3 -- 2026-06-19
[Editorial] Cursor: Agent Autonomy & Auto-Review -- 2026-06-17
paradigmxyz/centaur — Multiplayer, Self-Hosted, Secure Agents -- 2026-06-17
tastyeffectco/sandboxes — Self-Hosted Dev Sandboxes -- 2026-06-17
[Editorial] AI Has Made Corporate Knowledge Debt Visible -- 2026-06-17
[Editorial] Gauntlet AI -- 2026-06-17
GrapheneOS Ported to Android 17 -- 2026-06-17
WATCH MY ESCAPE — LLMs Try to Solve Your Handmade Escape Rooms -- 2026-06-17
scheidydude/codeindex — Blast-Radius Impact Scoring for AI Dev -- 2026-06-17
LLM Pipeline Survives Provider Outage via Stateful FSM Fallback -- 2026-06-17
[Editorial] Langcraft Flux — 30 Days Deep Dive -- 2026-06-17
[Editorial] Zuckerberg Tells Meta Employees AI Will Transform Their Work -- 2026-06-16
Workers Are Spending Over 6 Hours a Week 'Botsitting' AI, Fueling Job Frustration -- 2026-06-16
[Editorial] Everyone's Cutting Junior Hires — I'm Doubling Down -- 2026-06-16
[Editorial] Same Line in Every Room — Code Quality Patterns -- 2026-06-15
[Editorial] The Frontier AI Ecosystem -- 2026-06-15
[Editorial] The Next Phase of AI Infrastructure -- 2026-06-15
[Editorial] OpenRouter Fusion — The Pattern Emerging -- 2026-06-15
Config Files That Run Code: Supply Chain Security Blindspot -- 2026-06-10
Surveillance is not safety: A statement on the UK's latest threat to privacy -- 2026-06-10
Apple reveals new AI architecture built around Google Gemini models -- 2026-06-10
Apple Core AI Framework -- 2026-06-10
[Editorial] Apple just killed paying for AI -- 2026-06-10
wexaai/cognodb -- 2026-06-08
[Editorial] -- 2026-06-08
[Editorial] -- 2026-06-08
MisoLabs/MisoTTS -- 2026-06-08
Volkswagen blocks Home Assistant by requiring client assertion -- 2026-05-29
YouTube to automatically label AI-generated videos -- 2026-05-29
103B-token Usenet corpus (1980-2013) — pre-web, human-only, zero AI contamination -- 2026-05-29
Clark Browser: AI-native browser that finds a way -- 2026-05-29
I'm Getting into Mesh Networks (Meshtastic, MeshCore, and Reticulum) -- 2026-05-28
A Eureka machine that thinks like nature and explores what AI cannot -- 2026-05-28
Starbucks is scrapping its AI inventory tool after it reportedly miscounted and mislabeled items -- 2026-05-28
Google officially announces that ads will be included in AI Mode search results -- 2026-05-22
Mistral AI acquires Emmi AI -- 2026-05-22
Gemini CLI will stop working from June 18, 2026 -- 2026-05-22
A Workflow-Oriented Framework for Asynchronous Human-AI Collaboration in Hybrid and Compute-Intensive HPC Environments -- 2026-05-22
Simple Multi-Agent Architecture Running Across Our Entire Org. Keeping everything in Loop. -- 2026-05-22
eight-acres-lab/openmelon -- 2026-05-22
GPT 5.5 (Codex) leading the future prediction race -- 2026-05-22
An OpenAI model has disproved a central conjecture in discrete geometry -- 2026-05-21
I've joined Anthropic -- 2026-05-21
OpenAI Adopts Google's SynthID Watermark for AI Images with Verification Tool -- 2026-05-20
Infomaniak transitions to a foundation model to protect user data privacy -- 2026-05-20
[Editorial] -- 2026-05-20
[Editorial] -- 2026-05-18
I tracked every dollar I spent on AI coding tools for 60 days and math is uglier than I thought but probably not in the way you'd guess. -- 2026-05-18
[Editorial] -- 2026-05-18
[Editorial] Patch Diffing Pipeline -- 2026-05-15
"This is the first documented instance of AI self-replication via hacking." -- 2026-05-15
Show HN: Running the second public ODoH relay -- 2026-05-15
The Dark Side of Unitree Robot Dogs -- 2026-05-15
[Editorial] XBOW Mythos Offensive Security Evaluation -- 2026-05-14
SPECA: Specification-to-Checklist Agentic Auditing Framework -- 2026-05-14
[Editorial] BIML Research Finds Critical Flaws in AI Security Measurement -- 2026-05-14
CERT Releases Six CVEs for Serious Dnsmasq Vulnerabilities -- 2026-05-14
AMD Introduces Instinct MI350P Accelerator: CDNA 4 Comes to PCIe Cards -- 2026-05-14
MedQA: Fine-Tuning a Clinical AI on AMD ROCm — No CUDA Required -- 2026-05-14
STAM: Stable Training with Adaptive Momentum — A New Optimizer for AI Training -- 2026-05-14
[Editorial] Anthropic Quietly Nerfed Claude Code -- 2026-05-14
ForgeRAG: Production-Ready RAG with Structure-Aware Reasoning -- 2026-05-14
[Editorial] Hyperframes: Open-Source AI Video Generation -- 2026-05-14
[Editorial] Video Editorial Submission -- 2026-05-14
Computron — Self-Hosted AI Assistant with Full Google Workspace Integration -- 2026-05-14
Open-Source Microsoft Office Extensions for Open WebUI -- 2026-05-14
GGUF Uploads on Hugging Face Nearly Doubled in 2 Months -- 2026-05-14
AI Lobbyists Say Regulation Means Losing to China — But China Says 'Safety First, Innovation Second' -- 2026-05-14
Drastically improve prompt processing speed for --n-cpu-moe partially offloaded models -- 2026-05-13
Gemma 4 26B Hits 600 Tok/s on One RTX 5090 -- 2026-05-13
ZAYA1-8B: Frontier intelligence density, trained on AMD -- 2026-05-13
examples : add llama-eval by ggerganov · Pull Request #21152 · ggml-org/llama.cpp -- 2026-05-13
I got a real transformer language model running locally on a stock Game Boy Color! -- 2026-05-13
I let AI build a tool to help me figure out what was waking me up at night -- 2026-05-12
Taking Polyphony to a New Level -- 2026-05-12
Google says criminal hackers used AI to find a major software flaw -- 2026-05-12
Not a good day for team 'Claude Mythos is Just Marketing Hype' — Mozilla security hardening with Claude -- 2026-05-12
US and tech firms strike deal to review AI models for national security before public release -- 2026-05-12
Local AI needs to be the norm -- 2026-05-12
DeepSeek V4 being 17x cheaper got me to actually measure what I send to cloud vs what I could run locally. the results are stupid. -- 2026-05-12
HOT TAKE: local models + agent harnesses are now capable enough to hand off junior-level IT professional tasks to -- 2026-05-12
Mac Studio local loadout - May 2026 -- 2026-05-12
MTP on Strix Halo with llama.cpp (PR #22673) — 40→80 t/s -- 2026-05-12
Using Gemma 3:1b for a persistent, local-first AI Desk Kiosk on a Pi5 -- 2026-05-12
The Road to a Billion-Token Context -- 2026-05-06
How OpenAI delivers low-latency voice AI at scale -- 2026-05-06
[Editorial] The AI Vulnerability Storm -- 2026-05-06
CVE-2026-31431: Copy Fail vs. rootless containers -- 2026-05-06
$100K budget to build an in-house LLM server for agentic coding -- 2026-05-05
Tenstorrent TT-QuietBox 2 Specifications (Blackhole) -- 2026-05-05
CUDA + ROCm simultaneously with -DGGML_BACKEND_DL=ON -- 2026-05-05
Tinygrad Driver testing on Blackwell + M3 Ultra RDMA cluster -- 2026-05-05
MiMo 2.5 requires at least 4 GPUs — model accessibility locked behind hardware -- 2026-05-05
NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents -- 2026-05-05
vibevoice.cpp — Pure C++ ggml port of Microsoft VibeVoice with voice cloning and long-form ASR -- 2026-05-05
Visual-Informed Speech Enhancement Using Attention-Based Beamforming -- 2026-05-05
[Editorial] Pipecat Gradient Bang -- 2026-04-30
AgentFM: Peer-to-peer network turning everyday computers into a decentralized AI supercomputer -- 2026-04-30
Microsoft TRELLIS.2: Open-Source 4B-Parameter Image-To-3D Model with PBR Textures -- 2026-04-30
[Editorial] What Boards Can Finally Ask For Before Approving AI -- 2026-04-27
[Editorial] Video: AI Tools and Techniques -- 2026-04-27
[Editorial] On Uncensoring Models — Origin Story and Ethics -- 2026-04-27
[Editorial] Video Feature -- 2026-04-23
[Editorial] Video Feature -- 2026-04-23
Nobody Got Fired for Uber's $8M Ledger Mistake? -- 2026-04-23
[Editorial] Video Feature -- 2026-04-23
[Editorial] Stanford HAI AI Index Report 2026 -- 2026-04-17
[Editorial] Steve Yegge on AI -- 2026-04-17
[Editorial] The AI Resentment Stage -- 2026-04-17
[Editorial] OpenAI Scaling Trusted Access for Cyber Defense -- 2026-04-16
RedSun: System user access on Win 11/10 and Server with the April 2026 Update -- 2026-04-16
[Editorial] Redamon Cybersecurity Pentesting -- 2026-04-16
Cal.com is going closed source -- 2026-04-16
Qwen3.5-122B at 198 tok/s on 2x RTX PRO 6000 Blackwell — Budget build, verified results -- 2026-04-16
DFlash Doubles the T/S Gen Speed of Qwen3.5 27B (BF16) on Mac M5 Max -- 2026-04-16
DeepSeek Updated their repo DeepGEMM testing Mega MoE -- 2026-04-16
Minimax 2.7 running sub-agents locally -- 2026-04-16
Taking on CUDA with ROCm: 'One Step After Another' -- 2026-04-16
[Editorial] Video: AI Development Deep Dive -- 2026-04-15
[Editorial] -- 2026-04-13
[Editorial] -- 2026-04-13
Safetensors is Joining the PyTorch Foundation -- 2026-04-09
S3 Files — AWS Reimagines Object Storage -- 2026-04-09
MoECLIP: Patch-Specialized Experts for Zero-shot Anomaly Detection -- 2026-04-09
Parlor — Real-time AI (audio/video in, voice out) on M3 Pro with Gemma E2B -- 2026-04-09
Gemma-4-E2B-it NVFP4 on DGX Spark — #1 in 9 Spark Arena categories -- 2026-04-09
Unsloth Gemma 4 Fine-Tuning Guide -- 2026-04-09
Bartowski vs Unsloth for Gemma 4 -- 2026-04-09
inferrs — Rust-Based Local LLM Inference Engine -- 2026-04-09
kevinrgu/autoagent — autonomous harness engineering -- 2026-04-07
remorses/usecomputer — Fast computer automation CLI for AI agents -- 2026-04-07
Claude Code + LightRAG = UNSTOPPABLE -- 2026-04-07
[Editorial] Career-Ops -- 2026-04-07
Gemma 4 on iPhone -- 2026-04-06
Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud -- 2026-04-06
[Editorial] Cutting LLM Size in Half Without Quality Loss -- 2026-04-06
[Editorial] Apple's Embarrassing AI Drop -- 2026-04-06
[Editorial] Build a Better Coding Agent at Home -- 2026-04-06
[Editorial] Unprompted — Day 1 Session 2 -- 2026-04-06
[Editorial] Unprompted — Day 2 Session 2 Part 9 -- 2026-04-06
[Editorial] Unprompted — Day 2 Session 2 Part 12 -- 2026-04-06
[Editorial] Security Considerations for Artificial Intelligence -- 2026-04-06
Anthropic wins court order blocking Pentagon's security threat designation -- 2026-04-02
T-Norm Operators for EU AI Act Compliance Classification: An Empirical Comparison of Lukasiewicz, Product, and Gödel Semantics in a Neuro-Symbolic Reasoning System -- 2026-04-02
Ollama is now powered by MLX on Apple Silicon in preview -- 2026-04-01
[Editorial] ClipCannon — AI Video Processing Tool -- 2026-04-01
[Editorial] ClipCannon White Paper -- 2026-04-01
[Editorial] LiteParse — Lightweight Document Parser -- 2026-04-01
Axios compromised on NPM – Malicious versions drop remote access trojan -- 2026-04-01
[Editorial] AI Tools & Techniques Video -- 2026-04-01
[Editorial] The Comforting Lie of SHA Pinning -- 2026-04-01
[Editorial] CEO Accountability for AI -- 2026-03-30
[Editorial] When AI Makes Work Easier to Move -- 2026-03-30
[Editorial] Observing AI's Increasing Trajectory -- 2026-03-30
[Editorial] The 12th of Never -- 2026-03-30
Dual DGX Sparks vs Mac Studio M3 Ultra 512GB: Running Qwen3.5 397B locally on both. Here's what I found. -- 2026-03-28
$500 GPU outperforms Claude Sonnet on coding benchmarks -- 2026-03-28
tonbistudio/turboquant-pytorch -- 2026-03-28
CERN uses tiny AI models burned into silicon for real-time LHC data filtering -- 2026-03-28
The ARC-AGI-3 Leaderboard has been released and Gemini is showing overwhelming cost efficiency. -- 2026-03-28
[Editorial] l33tdawg/sage -- 2026-03-27
Show HN: Cq – Stack Overflow for AI coding agents -- 2026-03-27
[Editorial] Video Submission -- 2026-03-27
sanbuphy/nanoAgent -- 2026-03-27
[Editorial] -- 2026-03-25
[Editorial] -- 2026-03-25
[Editorial] -- 2026-03-25
[Editorial] -- 2026-03-25
[Editorial] -- 2026-03-25
Cascade: Composing Software-Hardware Attack Gadgets for Adversarial Threat Amplification in Compound AI Systems -- 2026-03-24
Gaslighting LLMs with special token injection for mischief or to bypass code reviews -- 2026-03-24
[Editorial] Cisco AI Defense Security Scanner -- 2026-03-24
[Editorial] The New Offense: How AI Agents Are Changing Attack Surface -- 2026-03-24
CVE-2026-3888: Important Snap Flaw Enables Local Privilege Escalation to Root -- 2026-03-23
CEO Asks ChatGPT How to Void $250 Million Contract, Ignores His Lawyers, Loses Terribly in Court -- 2026-03-23
Project Nomad – Knowledge That Never Goes Offline -- 2026-03-23
[Editorial] -- 2026-03-20
mlx-tune – fine-tune LLMs on your Mac (SFT, DPO, GRPO, Vision) with an Unsloth-compatible API -- 2026-03-20
How the Eon Team Produced a Virtual Embodied Fly -- 2026-03-20
LeRobot v0.5.0: Scaling Every Dimension -- 2026-03-20
Show HN: Sub-millisecond VM sandboxes using CoW memory forking -- 2026-03-20
Mistral AI Releases Forge -- 2026-03-20
I built an open-source AI that lets you talk to your database — ask questions in plain English and get graphical insights instantly -- 2026-03-20
Memory Chip Crunch to Persist Until 2030, SK Hynix Chairman Says -- 2026-03-19
Meta's renewed commitment to jemalloc -- 2026-03-19
We all had p2p wrong with vllm so I rtfm -- 2026-03-19
[Editorial] Veracode: AI App Security — The Illusion of Control -- 2026-03-19
How do you catch auth bypass risks in generated code that looks completely correct -- 2026-03-19
GreyhavenHQ/greywall -- 2026-03-19
Ulysses Sequence Parallelism: Training with Million-Token Contexts -- 2026-03-18
[Editorial] Cognitum Seed API -- 2026-03-16
[Editorial] What We Are Building with πruvio -- 2026-03-16
[Editorial] RuVector / Ruvix Crate -- 2026-03-16
[Editorial] RuVector Min-Cut Examples -- 2026-03-16
[Editorial] Claude 1M Context GA -- 2026-03-14
NVIDIA Nemotron 3 Super: open-weight 120B MoE hybrid with 1M-token context -- 2026-03-14
Expert parallelism for 1T MoE finetuning on a single node - 50x faster and 2x cheaper than alternatives -- 2026-03-14
[Editorial] AI Researchers Need to Know Statistics -- 2026-03-13
[Editorial] -- 2026-03-12
[Editorial] -- 2026-03-12
[Editorial] -- 2026-03-12
Intelligent-Internet/CommonGround -- 2026-03-12
Processing 1 million tokens locally with Nemotron 3 Super on a M1 ultra -- 2026-03-12
[Editorial] -- 2026-03-12
FlashHead: Up to 40% Faster Multimodal Reasoning on Top of Quantization -- 2026-03-12
Running DeepSeek V3.2 with dense attention (like in llama.cpp) makes it a bit dumber -- 2026-03-12
Activation Outliers in Transformer Quantization: Reproduction, Statistical Analysis, and Deployment Tradeoffs -- 2026-03-12
Atlassian CEO: AI doesn't replace people here, but we're firing them anyway -- 2026-03-12
[Editorial] -- 2026-03-12
Debian decides not to decide on AI-generated contributions -- 2026-03-12
[Editorial] DeepNote-AI: AI Music Generation -- 2026-03-06
[Editorial] With AI, The Time for Generalists Has Come -- 2026-03-06
[Editorial] OCR-Provenance: Document Provenance Verification -- 2026-03-06
[Editorial] Apple Introduces MacBook Pro with All-New M5 Pro and M5 Max -- 2026-03-06
Apple Python Foundation Models SDK -- 2026-03-06
[Editorial] Apple Was Losing the AI Race — Until Now -- 2026-03-06
[Editorial] Apple's AI Strategy Vindicated -- 2026-03-06
[Editorial] System Design Meets AI Reverse Engineering -- 2026-03-02
[Editorial] AI Agent Patterns & Implementation -- 2026-03-02
[Editorial] Advanced Agent Orchestration Techniques -- 2026-03-02
[Editorial] AirSnitch Attack Breaks Wi-Fi Encryption -- 2026-02-27
[Editorial] NDSS 2026 Security Paper -- 2026-02-27
[Editorial] Bell Labs Solved Prompt Injection in 1976 -- 2026-02-27
[Editorial] GitHub Copilot Exploited -- 2026-02-27
[Editorial] Stop Telling Your AI Agent What Not to Do -- 2026-02-27
[Editorial] AI MCP Integration Patterns -- 2026-02-27
Train AI models with Unsloth and Hugging Face Jobs for FREE -- 2026-02-25
Kitten TTS V0.8 Running in the Browser -- 2026-02-25
AI toolkit — LiteLLM + n8n + Open WebUI in one Docker Compose -- 2026-02-25
Show HN: PgDog – Scale Postgres without changing the app -- 2026-02-25
klawsh/klaw.sh -- 2026-02-24
[Editorial] -- 2026-02-24
[Editorial] -- 2026-02-24
[Editorial] -- 2026-02-24
[Editorial] -- 2026-02-24
In-Context Autonomous Network Incident Response: An End-to-End Large Language Model Agent Approach -- 2026-02-24
[Editorial] Taalas Etches AI Models onto Transistors to Rocket-Boost Inference -- 2026-02-23
Repurposing 800 RX 580s into an AI Inference Cluster: Mass Document OCR at 24x Lower Cost -- 2026-02-23
Jolt Atlas: Verifiable Inference via Lookup Arguments in Zero Knowledge -- 2026-02-21
[Editorial] RTI Genesis — Real-Time Infrastructure -- 2026-02-21
[Editorial] RuVector & RVF Vector Database -- 2026-02-21
[Editorial] RVDNA — Does It Work? -- 2026-02-21
[Editorial] Enterprise Open Source AI Coding Is Changing the ROI Calculation -- 2026-02-20
[Editorial] Think Tax: The Real Cost of AI-Generated Code -- 2026-02-20
[Editorial] RuVector DNA Sequence Analysis Example -- 2026-02-20
GGML and llama.cpp join HF to ensure the long-term progress of Local AI -- 2026-02-20
[Editorial] Agentic AI for Enterprise -- 2026-02-20
Vellium: open-source desktop app for creative writing with visual controls -- 2026-02-20
[Editorial] AI Security, Governance, and Cybersecurity -- 2026-02-19
AI-generated password isn't random, it just looks that way -- 2026-02-19
I plugged a $30 radio into my Mac mini and told my AI "connect to this" — now I control my smart home and send voice messages over radio with zero internet -- 2026-02-19
FlashLM v4: 4.3M ternary model trained on CPU in 2 hours — coherent stories from adds and subtracts only -- 2026-02-19
[Editorial] Antigravity Awesome Skills -- 2026-02-18
I built an MCP that connects your agent to 8,000+ skills with zero setup -- 2026-02-18
OpenAI buys OpenClaw, hires creator Peter Steinberger -- 2026-02-18
Why top talent is walking away from OpenAI and xAI -- 2026-02-18
[Editorial] Enterprise AI Summit — Gene Kim -- 2026-02-18
XiaomiRobotics/Xiaomi-Robotics-0 -- 2026-02-18
[Editorial] Product Security Is About to Hit a Wall -- 2026-02-17
[Editorial] Product Security Wall (follow-up) -- 2026-02-17
[Editorial] Oligo Accelerates Vulnerability Intelligence with NVIDIA -- 2026-02-17
[Editorial] RVF — Most Consequential AI Infrastructure -- 2026-02-16
[Editorial] Introducing RVF Cognitive Container -- 2026-02-16
375ms Voice-to-Voice Latency: Local Nemotron-4 + Kokoro-82M on Blackwell Bare Metal -- 2026-02-16
Expensively Quadratic: The LLM Agent Cost Curve -- 2026-02-16
[Editorial] https://www.authsignal.com/blog/articles/account-recovery-is-the-identity-industrys-most-overlooked-challenge -- 2026-02-13
[Editorial] https://raffy.ch/blog/2026/02/03/the-gaps-that-created-the-new-wave-of-siem-and-ai-soc-vendors -- 2026-02-13
[Editorial] https://m.youtube.com/watch?v=w8p-yFqF13o -- 2026-02-13
[Editorial] https://mrinal.com/articles/agent-identities -- 2026-02-13
[Editorial] https://labs.zenity.io/p/perplexity-comet-a-reversing-story -- 2026-02-13
[Editorial] https://arxiv.org/abs/2602.10117 -- 2026-02-13
[Editorial] https://arxiv.org/abs/2602.09433 -- 2026-02-13
[Editorial] https://www.linkedin.com/posts/hermanerrico_i-put-out-a-site-and-paper-defining-a-new-activity-7427822997593387008-zzYm -- 2026-02-13
[Editorial] https://www.linkedin.com/pulse/ive-spent-three-decades-cybersecurity-ai-biggest-trust-brett-kelsey-v7r3c -- 2026-02-13
[Editorial] https://www.linkedin.com/pulse/ai-red-teamers-advice-orgs-deploying-brian-chamberlain-utkse -- 2026-02-13
[Editorial] https://www.linkedin.com/posts/cole-medin-727752184_vibe-coding-has-a-30-50-security-vulnerability-activity-7420461997537959938-y5uG -- 2026-02-13
[Editorial] https://zeltser.com/ai-malware-analysis-remnux -- 2026-02-13
I built a personal AI assistant in 815 lines of TypeScript — every capability is just a Markdown file -- 2026-02-13
whisper.cpp + llama.cpp in a desktop app — local voice-to-text with LLM text cleanup -- 2026-02-13
I built a social network where 6 Ollama agents debate each other autonomously — Mistral vs Llama 3.1 vs CodeLlama -- 2026-02-13
Lorph: A Local AI Chat App with Advanced Web Search via Ollama -- 2026-02-13
[Editorial] https://www.linkedin.com/pulse/when-brain-os-meets-real-operating-systems-rafael-knuth-4hcsf -- 2026-02-11
[Editorial] https://docs.entire.io/core-concepts -- 2026-02-11
Why System Prompts are failing your local agent builds (and why you need a Logic Floor) -- 2026-02-11
I built an MCP server that syncs Cursor, Claude Desktop, and Windsurf with one brain [Open Source] -- 2026-02-11
built a self-hosted API proxy that strips PII before prompts reach any LLM - works with Ollama too -- 2026-02-11
Bitnet.cpp - Inference framework for 1-bit (ternary) LLM's -- 2026-02-11
Last Week in Multimodal AI - Local Edition -- 2026-02-11
[Editorial] https://www.zdnet.com/article/claude-code-alternative-free-local-open-source-goose -- 2026-02-10
Recommend model for openclaw clawdbot running locally on old laptop 4gb vram 16g ram asus -- 2026-02-10
[Editorial] https://github.com/ikennaokpala/forge -- 2026-02-09
Trainable System Router and Industry standard Dual Method Memory System Release -- 2026-02-09
[Editorial] https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference -- 2026-02-09
[Editorial] https://www.linkedin.com/posts/ownyourai_i-just-open-sourced-my-security-auditor-for-activity-7426565421375541248-rqGu -- 2026-02-09
[Editorial] https://www.linkedin.com/posts/activity-7426382890004971520-VBdy -- 2026-02-09
[Editorial] https://www.linkedin.com/posts/samuele-giampieri-b1b67597_redamon-airedteam-penetrationtesting-activity-7426292400534437889--0Ny -- 2026-02-09
[Editorial] https://hackernoon.com/everyone-says-ai-is-insecure-so-i-measured-it -- 2026-02-09
[Editorial] https://x.com/fr0gger_/status/2020025525784514671?ct=rw-li -- 2026-02-09
Agent deleted production data because no policy layer said 'no' - what's your governance strategy? -- 2026-02-09
[Editorial] https://github.com/usestrix/strix -- 2026-02-06
[Editorial] https://github.com/GH05TCREW/pentestagent -- 2026-02-06
[Editorial] https://www.edloveless.com/the-call-is-coming-from-inside-the-house-and-its-watching-netflix -- 2026-02-06
eScan Antivirus Delivers Malware in Supply Chain Attack -- 2026-02-06
Run Ollama on your Android! -- 2026-02-04
Is using the officially supported local LLM integration in Claude Code for business/corporate use a violation of ToS? -- 2026-02-04
[Editorial] https://www.linkedin.com/posts/hermanerrico_aisecurity-agenticai-cybersecurity-activity-7424484799123247104-40_F -- 2026-02-04
m4xxxxx/AIxVuln -- 2026-02-04
OpenClaw on edge Linux (systemd + cron) — quick experiment + a few questions -- 2026-02-03
[Ollama Cloud] 29.7% failure rate, 3,500+ errors in one session, support ignoring tickets for 2 weeks - Is this normal? -- 2026-02-03
OpenClaw is everywhere all at once, and a disaster waiting to happen -- 2026-02-03
[Editorial] https://www.linkedin.com/posts/robvanderveer_iso42001-pren18282-pren18282-share-7423993903118290945--EO7 -- 2026-02-03
Large categorized list of AI / LLM benchmarks & leaderboards -- 2026-02-03
The cost of massive context: Burned 45M Gemini tokens in hours using OpenCode. Is Context Caching still a myth for most agents? -- 2026-01-30
PSA: CHECK YOUR OPENAI PAYMENT CARD -- 2026-01-30
[Editorial] https://humanemulator.co/ -- 2026-01-30
Generating skills for api+local CUAs via noVNC demonstration recording MCP -- 2026-01-30
Our Agent Rebuilt Itself in 26 Hours. AMA👀 -- 2026-01-30
I built a multi-agent orchestration layer for Claude Code - sharing in case it's useful to anyone -- 2026-01-30
AlfonsSkills/SkillSync -- 2026-01-30
1rgs/nanocode -- 2026-01-30
We Got Claude to Build CUDA Kernels and teach open models! -- 2026-01-30
Show: Fully Local Voice Assistant (with optional Voice Cloning) -- 2026-01-30
Thoughts on PowerInfer as a way to break the memory bottleneck? -- 2026-01-30
Ollama Models Ranked by VRAM Requirements -- 2026-01-30
local-vision-bridge: OpenWebUI Function to intercept images, send them to a vision capable model, and forward description of images to text only model -- 2026-01-30
We added an on-device AI meeting note taker into AnythingLLM to replace SaaS solutions -- 2026-01-29
Stop wasting 30%+ of your context window on JSON braces. Meet SONA -- 2026-01-29
1.8-3.3x faster Embedding finetuning now in Unsloth (~3GB VRAM) -- 2026-01-29
I built a free open-source TDD canvas for VS Code. Claude Code writes tests first, captures runtime traces when they fail, fixes until green -- 2026-01-29
Show HN: Sandbox Agent SDK – unified API for automating coding agents -- 2026-01-29
OSS ChatGPT WebUI – 530 Models, MCP, Tools, Gemini RAG, Image/Audio Gen -- 2026-01-29
Renting out the cheapest GPUs ! (CPU options available too) -- 2026-01-29
umputun/ralphex -- 2026-01-29
rezonia/invoice-processor -- 2026-01-29
[Editorial] https://www.linkedin.com/pulse/person-rights-responsibility-why-ai-contributors-break-ralf-d-m%C3%BCller-m2k9f -- 2026-01-29
OpenStreetMap overwhelmed by bots scraping data -- 2026-01-28
LLM-Generated Newspaper Provides Ultimate in Niche Publications -- 2026-01-28
Anyscale's new data: Most AI clusters run at <50% utilization. Is "Disaggregation" the fix, or just faster cold starts? -- 2026-01-28
Deploying Open WebUI for 2,000 Users (Solo) – Sanity Check Needed -- 2026-01-28
A Tool to Calculate If a LLM Will Fit Your GPU -- 2026-01-27
We indexed the entire Ollama Library (10TB+ VRAM). Here is how we run them all on 1 Node. -- 2026-01-27
Is the next leap in AI architectural? Comparing VRAM-hungry Transformers with Compute-intensive Energy-Based Models -- 2026-01-27
Show HN: A Local OS for LLMs. MIT License. Zero Hallucinations. Infinite Memory -- 2026-01-27
facebookresearch/actionmesh -- 2026-01-27
deepseek-ai/Engram -- 2026-01-27
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge -- 2026-01-27
[Editorial] https://github.com/ruvnet/ruvector/blob/claude/clawdbot-ruvector-setup-RHW3a/npm/packages/ruvbot/docs/FEATURE_COMPARISON.md -- 2026-01-27
Can companies "hack" ChatGPT to promote them? -- 2026-01-27
[Editorial] https://grahamhelton.com/blog/nodes-proxy-rce -- 2026-01-26
Route leak incident on January 22, 2026 -- 2026-01-26
[Editorial] https://www.linkedin.com/posts/jeffreyemanuel_agent-coding-life-hack-im-100-convinced-activity-7421442482082660352-l5AG -- 2026-01-26
Model Persistence, Context Management, Multilayered Cognition, Data Export, Cross Provider Support --- Anybody interested? -- 2026-01-26
Anyone else wish they could "branch" conversations like git branches? -- 2026-01-26
xuzeyu91/WebCode -- 2026-01-26
ast-grep: A CLI tool for code structural search, lint and rewriting -- 2026-01-26
[Editorial] https://www.linkedin.com/pulse/ai-conversation-we-should-actually-having-renato-beninatto-vs55c -- 2026-01-26
Designing AI-resistant technical evaluations -- 2026-01-26
I put an RTX PRO 4000 Blackwell SFF in my MS-S1 Max (Strix Halo), some benchmarks -- 2026-01-26
ClaraVerse | Local AI workspace (4 months ago) -> Your feedback -> Back with improvements. -- 2026-01-26
Beyond Vendor Lock-In: A Framework for LLM Sovereignty -- 2026-01-26
[Editorial] https://www.linkedin.com/posts/ivandj_early-claims-around-self-evolving-memory-activity-7421307316437676033-l0Jm -- 2026-01-26
[Editorial] https://www.linkedin.com/posts/reuvencohen_introducing-ruvector-world-model-activity-7421556928910290944-cx4v -- 2026-01-26
stepfun-ai/Step3-VL-10B -- 2026-01-26
The value of $200 a month AI users -- 2026-01-23
Claude Permanent Memory Leak - This could be the cause of issue 16157 - instantly hitting usage limits -- 2026-01-23
browser-use/agent-sdk -- 2026-01-23
egebese/seo-research-mcp -- 2026-01-23
Crates.io: Development Update -- 2026-01-23
[Editorial] https://www.linkedin.com/posts/owais-drera-590750378_github-owaisdreraagent-slayer-activity-7419782518985486336-7WE3 -- 2026-01-23
[Editorial] https://www.linkedin.com/posts/resilientcyber_prompt-injection-activity-7420165497230454784-NOHa -- 2026-01-23
[Editorial] https://www.linkedin.com/posts/anshumanbhartiya_lets-talk-about-threat-modeling-and-skills-activity-7418130148312674305-arTh -- 2026-01-23
[Editorial] https://www.linkedin.com/posts/reuvencohen_introducing-prime-radiant-a-real-time-activity-7420466084006223873-hOct -- 2026-01-23
[Editorial] https://www.wiz.io/blog/wiz-research-codebreach-vulnerability-aws-codebuild -- 2026-01-23
Aider's documentation for getting connected to local inference sucks. Hopefully this helps. -- 2026-01-22
Polymcp Integrates Ollama – Local and Cloud Execution Made Simple -- 2026-01-22
Show HN: LangGraph architecture that scales (hexagonal pattern, 110 tests) -- 2026-01-22
[Editorial] https://www.linkedin.com/posts/reuvencohen_mcps-generally-kind-of-suck-and-the-community-activity-7420106621437095936-0qkE -- 2026-01-22
[Editorial] docker ai sandbox -- 2026-01-22
cvsouth/memories-mcp -- 2026-01-22
TencentCloudADP/youtu-tip -- 2026-01-22
What we learned processing 1M+ emails for context engineering -- 2026-01-22
[Editorial] https://www.linkedin.com/posts/unsloth_you-can-now-run-glm-47-flash-locally-on-activity-7419220348719624192-CV65 -- 2026-01-22
Here is how to get GLM 4.7 working on llama.cpp with flash attention and correct outputs -- 2026-01-22
unsloth/GLM-4.7-Flash-GGUF -- 2026-01-22
768Gb Fully Enclosed 10x GPU Mobile AI Build -- 2026-01-22
I used Ollama (Mistral Small 24B) + LightRAG to build a graph pipeline that catches hidden risks where standard Vector RAG fails. -- 2026-01-21
Hey all- I built a self-hosted MCP server to run AI semantic search over your own databases, files, and codebases. Supports Ollama and cloud providers if you want. Thought you all might find a good use for it. -- 2026-01-21
I built Semantiq - a universal MCP server that gives semantic code understanding to Claude Code, Cursor, and any AI coding tool (100% local, no API keys) -- 2026-01-21
My company banned AI tools and I dont know what to do -- 2026-01-21
AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality -- 2026-01-21
MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching -- 2026-01-21
[Editorial] https://github.com/7h30th3r0n3/Evil-M5Project -- 2026-01-20
[Editorial] https://7h30th3r0n3.fr/the-vulnerability-that-killed-freewifi_secure -- 2026-01-20
x86 prefixes and escape opcodes flowchart -- 2026-01-20
I need a feedback about an open-source CLI that scan AI models (Pickle, PyTorch, GGUF) for malware, verify HF hashes, and check licenses -- 2026-01-20
Running multiple models locally on a single GPU, with model switching in 2-5 seconds. -- 2026-01-20
EXAONE MoE support has been merged into llama.cpp -- 2026-01-20
naklecha/simple-llm -- 2026-01-20
[Editorial] https://substack.com/inbox/post/184924197 -- 2026-01-20
[Editorial] https://www.linkedin.com/posts/mondweepchakravorty_this-article-details-how-to-get-started-using-ugcPost-7418423980123987969-uVok -- 2026-01-20
Automating illustration for the Conan story "Tower of the Elephant"--Llama and Mistral for prompt generation, Qwen3-VL for image scoring, and image models. -- 2026-01-20
Demo: On-device browser agent (Qwen) running locally in Chrome -- 2026-01-20
Agent observability is way different from regular app monitoring - maintainer's pov -- 2026-01-20
charIesding/agent-dashboard -- 2026-01-20
Learning Latency-Aware Orchestration for Parallel Multi-Agent Systems -- 2026-01-20
Binary Fuse Filters: Fast and Smaller Than XOR Filters -- 2026-01-19
Read_once(), Write_once(), but Not for Rust -- 2026-01-19
Show HN: HTTP:COLON – A quick HTTP header/directive inspector and reference -- 2026-01-19
7x Longer Context Reinforcement Learning in Unsloth -- 2026-01-19
openbmb/AgentCPM-Explore -- 2026-01-19
black-forest-labs/FLUX.2-klein-4B -- 2026-01-19
[Editorial] https://sean.heelan.io/2026/01/18/on-the-coming-industrialisation-of-exploit-generation-with-llms -- 2026-01-19
[Editorial] https://red.anthropic.com/2026/cyber-toolkits-update -- 2026-01-19
[Editorial] https://github.com/trailofbits/skills -- 2026-01-19
[Editorial] https://blog.cloudflare.com/fail-small-resilience-plan -- 2026-01-16
[Editorial] https://www.linkedin.com/posts/akopytko_symbolicai-neurosymbolicai-deterministicai-activity-7417128350650912768-reUc -- 2026-01-16
[Editorial] https://www.usenix.org/system/files/usenixsecurity25-zhang-xiang.pdf -- 2026-01-16
[Editorial] https://state-of-iranblackout.whisper.security/ -- 2026-01-16
[Editorial] https://equixly.com/blog/2026/01/14/can-ai-identify-0days -- 2026-01-16
[Editorial] https://www.linkedin.com/posts/reuvencohen_announcing-claude-flow-v3-a-full-rebuild-activity-7417928335160262656-NYqJ -- 2026-01-16
[Editorial] https://www.linkedin.com/posts/sandstream_i-just-shipped-ralph-inferno-10-to-npm-activity-7417606358654406657-zBPY -- 2026-01-16
[Editorial] https://www.linkedin.com/posts/rasmuswiding_parallel-ai-agents-the-complete-infrastructure-activity-7417646422436777984-D1Zw -- 2026-01-16
Ralph Loop inspired me to build this - AI decides what Claude Code does next orchestrating claude code until task is done -- 2026-01-16
Local AI App With SD-1.5 Models -- 2026-01-15
For RAG serving: how do you balance GPU-accelerated index builds with cheap, scalable retrieval at query time? -- 2026-01-15
Home workstation vs NYC/NJ colo for LLM/VLM + Whisper video-processing pipeline (start 1 GPU, scale to 4–8) -- 2026-01-15
Create specialized Ollama models in 30 seconds -- 2026-01-15
[Editorial] https://www.linkedin.com/pulse/ai-race-moving-faster-than-our-security-standards-can-david-abutbul-zmvtf -- 2026-01-15
[Editorial] https://www.linkedin.com/posts/josh-orenstein_iran-just-did-something-no-government-has-activity-7417294442811895811-oOTR -- 2026-01-15
[Editorial] https://sanderschulhoff.substack.com/p/the-ai-security-industry-is-bullshit -- 2026-01-15
[Editorial] https://hackthemodel.com/ai-security-isnt-bullshit-but-we-re-securing-the-wrong-thing-b925d04b517a -- 2026-01-15
[Editorial] https://www.linkedin.com/posts/reuvencohen_qudag-bitchat-is-a-secure-peer-to-peer-messaging-activity-7417222548897329152-153E -- 2026-01-15
Confer – End to end encrypted AI chat -- 2026-01-15
Two ASRock Radeon AI Pro R9700's cooking in CachyOS. -- 2026-01-14
which small model can i use to read this gauge? -- 2026-01-14
Supertone/supertonic-2 -- 2026-01-14
Qualcomm's RISC-Ventana Fusion -- 2026-01-14
An Open Source Electromagnetic Resonance Tablet -- 2026-01-14
[Editorial] https://github.com/VibiumDev/vibium -- 2026-01-13
Battle of AI Gateways: Rust vs. Python for AI Infrastructure: Bridging a 3,400x Performance Gap -- 2026-01-13
Built a local TTS app using Apple's MLX framework. No cloud, no API calls, runs entirely on device. -- 2026-01-13
I built a tool to clean HTML pages for RAG (JSON / MD / low-noise HTML) -- 2026-01-13
[Editorial] https://cloudsecurityalliance.org/blog/2026/01/09/the-first-question-security-should-ask-on-ai-projects -- 2026-01-12
[Editorial] https://www.linkedin.com/posts/stephenbklein_the-age-of-pretend-the-ai-industry-just-spent-activity-7415779694509219842-8OkK -- 2026-01-12
[Editorial] https://www.linkedin.com/posts/reuvencohen_most-people-talk-about-gpus-as-if-they-are-activity-7415778737486483456-7DQK -- 2026-01-12
[Editorial] https://blog.openthreatresearch.com/evolving-the-threat-hunter-playbook-planning-hunts-with-agent-skills -- 2026-01-12
[Editorial] https://maggiegray.us/p/the-age-of-ai-for-offensive-cyber -- 2026-01-12
[Editorial] https://www.linkedin.com/posts/resilientcyber_llm-fingerprinting-activity-7415849264452739072-H9fw -- 2026-01-12
[Editorial] https://www.linkedin.com/posts/johnbruggeman_kimwolf-tldr-whattodo-activity-7413983885392396289-xsd4 -- 2026-01-12
[Editorial] https://www.linkedin.com/posts/clintgibler_cybersecurity-ai-activity-7407102282120462337-6URK -- 2026-01-12
[Editorial] https://xoxruns.medium.com/feedback-driven-iteration-and-fully-local-webapp-pentesting-ai-agent-achieving-78-on-xbow-199ef719bf01 -- 2026-01-12
[Editorial] https://www.linkedin.com/posts/yass-99637a105_i-spent-the-last-couple-of-months-building-activity-7415098924224499714-lCDV -- 2026-01-12
A closer look at a BGP anomaly in Venezuela -- 2026-01-09
Show HN: I visualized the entire history of Citi Bike in the browser -- 2026-01-09
Modifying a QingPing Air Quality Monitor for Local MQTT Access -- 2026-01-09
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation -- 2026-01-09
Connect any LLM to all your knowledge sources and chat with it -- 2026-01-08
Have claude code interact with another claude code session interactively to test a plugin im building -- 2026-01-08
[Editorial] https://www.linkedin.com/posts/robert-westin_vibecoding-google-chrome-ugcPost-7410672189860933633-rnuH -- 2026-01-08
shootthesound/comfyUI-LongLook -- 2026-01-08
tangxiaofeng7/cscan -- 2026-01-08
Solar-Open-100B-GGUF is here! -- 2026-01-08
[HW TUNING] Finding the best GPU power limit for inference -- 2026-01-08
HomeGenie v2.0: 100% Local Agentic AI (Sub-5s response on CPU, No Cloud) -- 2026-01-08
WebGPU llama.cpp running in browser with Unity to drive NPC interactions (demo) -- 2026-01-08
Offline agent testing chat mode using Ollama as the judge (EvalView) -- 2026-01-08
[Editorial] https://www.linkedin.com/posts/reuvencohen_we-are-hitting-the-ceiling-of-prompt-driven-activity-7415027558171488256-_Dvn -- 2026-01-08
[Editorial] https://www.linkedin.com/posts/cole-medin-727752184_most-developers-using-ai-coding-assistants-activity-7414834730149376000-lecD -- 2026-01-08
[Editorial] https://www.linkedin.com/posts/pratik-kadam-pk_i-wasted-3-weeks-building-ai-agents-the-wrong-activity-7414361937570078720-BNbD -- 2026-01-08
[Editorial] https://www.linkedin.com/posts/ownyourai_you-know-claude-code-works-really-well-with-activity-7414678511967244288-VHxZ -- 2026-01-08
[Editorial] https://backalleycoder.com/posts/passseeds-an-experiment-in-hijacking-passkeys-to-unlock-cryptographic-use-cases -- 2026-01-07
[Editorial] https://hackbot.dad/writing/intro-to-gpus -- 2026-01-07
[Editorial] https://www.linkedin.com/posts/vilhelm-von-ehrenheim_are-you-avoiding-the-dumb-zone-dex-dropped-activity-7414210431570993152-8AFP -- 2026-01-07
[Editorial] https://www.linkedin.com/posts/cole-medin-727752184_2025-overpromised-on-ai-agents-2026-demands-activity-7414472389167841280-WDzs -- 2026-01-07
[Editorial] https://www.linkedin.com/posts/ronitelman_the-missing-step-in-decision-intelligence-activity-7413638316899762177-Aifb -- 2026-01-07
Achieving 30x Real-Time Transcription on CPU. Multilingual STT, OpenAI API endpoint compatible. Plug and play in Open-webui - Parakeet -- 2026-01-07
Local Image Edit API Server for Models like Qwen-Image-Edit or Flux2-dev -- 2026-01-07
Using n8n to orchestrate DeepSeek/Llama3 Agents via SSH (True Memory Persistence) -- 2026-01-07
[Editorial] https://arxiv.org/html/2512.24601v1 -- 2026-01-06
[Editorial] https://www.linkedin.com/posts/javier-cullas-644179109_ruvllm-llm-onlinelearning-activity-7414118850759262208-1Epx -- 2026-01-06
[Editorial] https://github.com/Cornjebus/rlm-replication-study -- 2026-01-06
I built Ctrl: Execution control plane for high stakes agentic systems -- 2026-01-06
I built a local GUI for vector DBs (pgvector, Qdrant, Chroma, more) -- 2026-01-06
Has anyone tried routing Claude Code CLI to multiple model providers? -- 2026-01-06
[Editorial] https://www.linkedin.com/posts/andriyburkov_one-of-the-fundamental-papers-that-advanced-activity-7412675071640485888-4gvc -- 2026-01-05
[Editorial] https://anthropic.skilljar.com/claude-code-in-action -- 2026-01-05
RTX 3090 vs RTX 4090 for local AI assistant - impact on Time To First Token (TTFT)? -- 2026-01-05
Any Vision model on pair with GPT-OSS 120B? -- 2026-01-05
tobilg/ai-observer -- 2026-01-05
omniASR-server: OpenAI-compatible API for Meta's omniASR with streaming support -- 2026-01-05
Tally – A tool to help agents classify your bank transactions -- 2026-01-05
DiffSynth-Studio/Qwen-Image-i2L -- 2026-01-05
orneryd/NornicDB -- 2026-01-02
Build a Deep Learning Library -- 2026-01-02
Liquid CO2 For Grid Scale Energy Storage Isn’t Just Hot Air -- 2026-01-02
How llama.cpp implements 2.9x faster top-k sampling with bucket sort -- 2025-12-31
Built an offline-first vector database (v0.2.0) looking for real-world feedback -- 2025-12-31
Linux 7.0 Expected to Bring IO_uring Iopoll Polling Improvements -- 2025-12-31
[Release] Dingo v2.0 – Open-source AI data quality tool now supports SQL databases, RAG evaluation, and Agent-as-a-Judge hallucination detection! -- 2025-12-31
Securing MCP in production -- 2025-12-31
Why I Ditched Serverless Neptune/OpenSearch for Dockerized Neo4j/pgvector on EC2 (60% Cost Cut) -- 2025-12-30
[Editorial] https://ratatui.rs/ -- 2025-12-29
[Editorial] https://github.com/ruvnet/ruvector/tree/main/crates/ruvector-mincut -- 2025-12-29
[Editorial] https://loggingsucks.com/ -- 2025-12-29
Karpathy on Programming -- 2025-12-29
[Editorial] https://github.com/marcuspat/turbo-flow-claude -- 2025-12-29
Claude Watch - Monitor Your Context Usage Across All Sessions -- 2025-12-29
[Research] Jacobi Forcing: turning AR LLMs into diffusion-style parallel decoders, staying causal with 4x speedup -- 2025-12-23
Variable Sized Experts in MoEs -- 2025-12-23
[Editorial] https://www.linkedin.com/posts/harish-santhanalakshmi-ganesan-31ba96171_github-cisco-ai-defensemcp-scanner-scan-activity-7409036231025811456-y16c -- 2025-12-23
[Editorial] PentestGPT -- 2025-12-23
Untargeted Jailbreak Attack -- 2025-12-23
AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems -- 2025-12-23
Data in, Research Paper out. Fully autonomous. Open-sourced & Free Research Agent. -- 2025-12-23
Why claude code compare to github copilot ? -- 2025-12-23
black-forest-labs/flux2 -- 2025-12-23
tulerfeng/OneThinker -- 2025-12-23
My problem: my agent code got tied to one provider. I built a thin wrapper so I can swap OpenAI ↔ Ollama without rewrites. -- 2025-12-23
Hey r/LocalLLaMA, I built a fully local AI agent that runs completely offline (no external APIs, no cloud) and it just did something pretty cool: It noticed that the "panic button" in its own GUI was completely invisible on dark theme (black text on black background), reasoned about the problem, a -- 2025-12-23
Demo - RPI4 wakes up a server with dynamically scalable 7 gpus -- 2025-12-23
Show HN: I Built an Image Captioning Tool Using Llama.cpp -- 2025-12-23
Introducing Bilgecan: self-hosted, open-source local AI platform based on Ollama + Spring AI + PostgreSQL + pgvector -- 2025-12-22
The Open WebUI Documentation just got a massive 2,600+ line overhaul (v0.6.42) -- 2025-12-22
[Editorial] https://zymtrace.com/ -- 2025-12-22
PLX/PEX PCIe 4.0 seems to help for LLMs and P2P! I.e. PEX88096 (1 PCIe 4.0 X16 to 5 PCIE 4.0 X16) and others, and comparison vs bifurcation. -- 2025-12-22
Qubes OS 4.3.0 has been released -- 2025-12-22
SeeSee21/Z-Image-Turbo-AIO -- 2025-12-22
Designing a CPU for Native BASIC -- 2025-12-22
Memory at the Speed of Light -- 2025-12-19
MRI-style transformer scan, Llama 3.2 3B -- 2025-12-19
TQTQliu/Light-X -- 2025-12-19
[Editorial] https://www.linkedin.com/posts/rocklambros_aisecurity-devsecops-activity-7407423157445287937-Wc0Z -- 2025-12-19
If Your AI App Only Works When You Sit Next To It -- 2025-12-19
dsl-learn/cutile-learn -- 2025-12-18
Errors in Rust: A Deep Dive -- 2025-12-18
Plug Into USB, Read Hostname and IP Address -- 2025-12-18
[Editorial] https://www.linkedin.com/posts/yotam-perkal_comparing-ai-agents-to-cybersecurity-professionals-activity-7407076565357887488-KI5M -- 2025-12-18
Building an event-driven alternative to LangGraph because single-threaded loops are killing me. Roast my architecture. -- 2025-12-18
Claude Code, GPT-5.2, DeepSeek v3.2, and Self-Hosted Devstral 2 on Fresh SWE-rebench (November 2025) -- 2025-12-18
ai-sage/GigaChat3-702B-A36B-preview -- 2025-12-18
Open Source Alternative to Perplexity -- 2025-12-16
How I Self-Hosted a Local Reranker for Open WebUI with vLLM (No More Jina API) -- 2025-12-16
[Editorial] https://openreview.net/pdf?id=nbMeRvNb7A -- 2025-12-16
Feedback Wanted - Vector Compression Engine (benchmarked v FAISS) -- 2025-12-16
Why it so hard to abliterated kimi k2 thinking model? -- 2025-12-16
Price of a bot army revealed across online platforms -- 2025-12-15
iOS 26.2 fixes 20 security vulnerabilities, 2 actively exploited -- 2025-12-15
Litestream VFS -- 2025-12-15
I got tired of my agents losing context on topic shifts, so I hacked together a branch router - thoughts? -- 2025-12-12
Thoughts on decentralized training with Psyche? -- 2025-12-12
DeepSeek V3.2 got gold at IMO and IOI - weights on HF, MIT license, but Speciale expires Dec 15 -- 2025-12-10
Linux Foundation Announces the Formation of the Agentic AI Foundation (AAIF), Anchored by New Project Contributions Including Model Context Protocol (MCP), goose and AGENTS.md -- 2025-12-10
Run Any Model Provider on OpenWebUI immediately by discovering AI services on your LAN -- 2025-12-08
dynamic allocation of less used experts to slower memory -- 2025-12-08
At What Point Does Owning GPUs Become Cheaper Than LLM APIs ? I -- 2025-12-05
How to run phones while being struck by suicide drones -- 2025-12-04
FreeBSD 15.0-Release Announcement -- 2025-12-04
UEFI On ARM? More Likely Than You Think -- 2025-12-04
This Week in Security: Cloudflare Wasn’t DNS, BADAUDIO, and Not a Vuln -- 2025-11-28
[Editorial] https://www.linkedin.com/posts/reuvencohen_bigger-isnt-better-the-future-of-ai-isn-activity-7398720183797911554-wHsT/ -- 2025-11-28
Building the largest known Kubernetes cluster, with 130k nodes -- 2025-11-26
[Editorial] https://github.com/ChrisRoyse/UsefulPrompts/blob/main/pushrepo.md -- 2025-11-25
Open-source package: let your coding agent generate interactive docs -- 2025-11-25
Google's Antigravity - Another VS Code Fork! -- 2025-11-20
Godbolt's Rule -- 2025-11-20
Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms -- 2025-11-20
[Editorial] https://www.linkedin.com/posts/avi-lumelsky-713111144_an-ai-powered-cyberattack-is-self-replicating-activity-7396569417549234177-n6ai -- 2025-11-19
Native Sysmon functionality coming to Windows -- 2025-11-19
A more surgical approach to abliteration -- 2025-11-19
[30 Trillion token dataset] "HPLT 3.0: Very Large-Scale Multilingual Resources for LLM and MT. Mono- and Bi-lingual Data, Multilingual Evaluation, and Pre-Trained Models", Oepen et al. 2025 -- 2025-11-19
Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples -- 2025-11-19
[Editorial] https://www.npmjs.com/package/neural-trader -- 2025-11-14
Critical RCE patched in Imunify360 affects up to 50M+ websites -- 2025-11-14
Kubernetes Ingress Nginx is retiring -- 2025-11-14
About KeePassXC's Code Quality Control -- 2025-11-14
I built my own self-hosted GPT with LM Studio, Caddy, and Cloudflare Tunnel -- 2025-11-14
Can't find Model in Ollama -- 2025-11-14
[PSA] Claude Code Web users: Want something useful to do with your $1k free credits? Help fix all the borked HuggingFace Spaces. -- 2025-11-14
Hi reddit, I rebuilt Karpathy's Nanochat in pure Rust [nanochat-rs] -- 2025-11-14
Building for an Open Future - our new partnership with Google Cloud -- 2025-11-14
kubernetes-tenants/tenant-operator -- 2025-11-12
Native LLM Router Integration with Cost Transparency for OpenWebUI -- 2025-11-12
Working on a list of open source tools for a Kubernetes ML stack -- 2025-11-10
I built a leaderboard for Rerankers -- 2025-11-10
LiquidAI/LFM2-ColBERT-350M -- 2025-11-10
Apache Iggy is a high-performance, persistent message streaming platform -- 2025-11-07
[Editorial] https://www.linkedin.com/posts/gadievron_deep-dive-cursor-code-injection-runtime-activity-7391805842318077952-bRjD -- 2025-11-05
Now you can deploy OpenStatus on Raspberry Pi -- 2025-11-03
Qwen3-VL-32B Q8 speeds in llama.cpp vs vLLM FP8 on a RTX PRO 6000 -- 2025-11-03
OCR models: HF demos vs local performance -- 2025-11-03
Help me decide: EPYC 7532 128GB + 2 x 3080 20GB vs GMtec EVO-X2 -- 2025-11-03
[Editorial] https://github.com/claraverse-space/ClaraVerse -- 2025-10-31
You can now run Ollama models in Jan -- 2025-10-31
Codex and Supabase -- 2025-10-31
snowyfizz/Vision-Detection-API -- 2025-10-29
Show HN: Apache Fory Rust – 10-20x faster serialization than JSON/Protobuf -- 2025-10-29
huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning -- 2025-10-29
Need help understanding OpenAIs API usage for text-embedding -- 2025-10-26
[Editorial] Browsers you can socially engineer -- 2025-10-24
[Editorial] share terminal sessions using Claude Code for web -- 2025-10-24
[Editorial] New web -- 2025-10-23
ContextGuard – Open-source security monitoring for MCP servers -- 2025-10-23
Gemini AI owners, please, I beg you, let me disable canvas permanently -- 2025-10-23
We rewrote OpenFGA in pure Postgres -- 2025-10-22
Ntfsplus: NTFS Filesystem Remake -- 2025-10-22
Reasoning should be thought of as a drawback, not a feature -- 2025-10-21
inclusionAI/Ring-1T-preview -- 2025-10-21
[Editorial] LinkedIn Algorithm -- 2025-10-18
State of AI Report 2025 -- 2025-10-17
[Editorial] Asimov’s three laws — updated for the genAI age -- 2025-10-17
Comparing Popular AI Evaluation Platforms for 2025 -- 2025-10-17
I analyzed 200 e-commerce sites and found 73% of their traffic is fake -- 2025-10-17
ZephrFish/OmniProx -- 2025-10-14
[Editorial] Claude Flow updates -- 2025-10-11
[Editorial] https://www.linkedin.com/pulse/from-chatbot-operating-system-what-openais-next-move-means-leimer-ju18c -- 2025-10-11
11 AI Agent Projects You Can Build Today (With Guides) -- 2025-10-11
Anyone here building Agentic AI into their office workflow? How’s it going so far? -- 2025-10-11
How would you address it (free alternatives) -- 2025-10-11
meituan-longcat/LongCat-Flash-Chat -- 2025-10-11
Ring Flash 2.0 104B A6B with Linear Attention released a few days ago -- 2025-10-07
GDPVal: Measuring the performance of our models on real-world tasks -- 2025-09-26
OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview -- 2025-09-25
microsoft/VibeVoice-1.5B -- 2025-09-25
jhu-clsp/mmBERT-base -- 2025-09-25
Sophia NLU Engine Upgrade - New and Improved POS Tagger -- 2025-09-22
llama.ui: new updates! -- 2025-09-22
[Tool] Intuitive branching/forking/merging of chats via ThreadIt -- 2025-09-22
Google Android RAG SDK – Quick Comparison Study -- 2025-09-22
Advice on building an enterprise-scale, privacy-first conversational assistant (local LLMs with Ollama vs fine-tuning) -- 2025-09-22
I built a tool to do deep research on my local file system -- 2025-09-21
sbcinnovation/lmux -- 2025-09-21
Hypervisor from Scratch -- 2025-09-21
Show HN: Ghostpipe – Connect files in your codebase to user interfaces -- 2025-09-21
Scaleway on Hugging Face Inference Providers 🔥 -- 2025-09-21
New version of AlchemyLab (another Claude Code alternative) -- 2025-09-20
v0.6.29 Released - Major new version, major redesigns and many new features and performance improvements -- 2025-09-20
What Facebook's Memcache Taught Me About Systems Thinking -- 2025-09-20
Linus Torvalds Guitar Pedal Project -- 2025-09-20
Alex Karp Insists Palantir Doesn't Spy on Americans. Here's What He's Not Saying -- 2025-09-20
ArchGW 0.3.11 – Cross-API streaming (Anthropic client ↔ OpenAI models) -- 2025-09-19
Public AI on Hugging Face Inference Providers 🔥 -- 2025-09-19
VS Code Chat: Introducing auto model selection (preview) -- 2025-09-18
ircfspace/masque-plus -- 2025-09-18
Google Agentic Payments Protocol and X402: Agents Can Now Pay Each Other -- 2025-09-18
The madness of SaaS chargebacks -- 2025-09-18
Fix AI pipeline bugs before they hit your local stack: a semantic firewall + grandma clinic (beginner friendly, MIT) -- 2025-09-17
Any idea how to use ollama (debian) with 2x GPUs to load larger models? -- 2025-09-14
Rails on SQLite: new ways to cause outages -- 2025-09-14
[Editorial] Defeating Nondeterminism in LLM Inference -- 2025-09-14
Nvidia Unveils Rubin CPX Amidst Chart-Topping Blackwell Ultra MLPerf Results -- 2025-09-14
[Editorial] v3 of the Smart Turn semantic VAD model. -- 2025-09-12
Best Tiny Model for programming? -- 2025-09-12
Day 9 of Working with 8 Concurrent Claude Codes -- 2025-09-12
Danau5tin/multi-agent-coding-system -- 2025-09-12
ApeRAG: Production-ready GraphRAG with multi-modal indexing and K8s deployment -- 2025-09-12
Deploying 1.4KW GPUs (B300) what's the biggest bottleneck you've seen power delivery or cooling? -- 2025-09-10
New approach to block decoding from Meta, claims that around 4x inference speedup is possible, with 4x less compute passes at the same time. -- 2025-09-10
Qwen3 30B A3B Q40 @ 13 tok/sec on Raspberry Pi cluster -- 2025-09-10
SERVE 8B model directly from iPhone -- 2025-09-10
How do you handle integration blindness of AI coding? -- 2025-09-10
From 14-year corporate job to AI-powered solo founder - Day 3 insights -- 2025-09-10
Claude Code often pretends to execute tasks but doesn’t actually do them -- 2025-09-10
I built a Graph RAG pipeline (VeritasGraph) that runs entirely locally with Ollama (Llama 3.1) and has full source attribution. -- 2025-09-09
Built an offline AI CLI that generates apps and runs code safely -- 2025-09-09
Three different models reviewing three different implementations coded by three different models -- 2025-09-09
karpathy/rendergit -- 2025-09-09
Shipping textures as PNGs is suboptimal -- 2025-09-09
jkroepke/access-log-exporter -- 2025-09-08
Nvidia Dynamo vs vLLM production stack — how do they compare in real-world multi-node serving? -- 2025-09-08
A multi-interface (REST and MCP) server for automatic license plate recognition 🚗 -- 2025-09-05
Replay - like Git for App States and Agent Context -- 2025-09-05
Bringing Computer Use to the Web -- 2025-09-05
Missing Agents -- 2025-09-05
Show HN: Woomarks, transfer your Pocket links to this app or self-host it -- 2025-09-05
Capture and Plot Serial Data in the Browser -- 2025-09-05
ECA: free vendor lock alternative -- 2025-09-05
AMD Ryzen 7 8700G for Local AI: User Experience with Integrated Graphics? -- 2025-09-05
The Sense and Nonsense of Virtual Power Plants -- 2025-09-04
Authenticate Thyself -- 2025-09-04
Taco Bell Says 'No Más' to AI Drive-Thru Experiment -- 2025-09-02
Setting up MCP in Codex is easy, don’t let the TOML trip you up -- 2025-09-02
Make your ZeroGPU Spaces go brrr with PyTorch ahead-of-time compilation -- 2025-09-02
Built a Confluence to OpenWebUI Knowledge Base Sync Tool -- 2025-09-02
Trying to simplify RAG setups → built a free hybrid search sandbox (feedback welcome) -- 2025-09-01
[Guide + Code] Fine-Tuning a Vision-Language Model on a Single GPU (Yes, With Code) -- 2025-09-01
OpenWebUI-SDK Development -- 2025-09-01
I built a local “second brain” AI that actually remembers everything (321 tests passed) -- 2025-08-30
Gpt-oss Fine-tuning - now with 60K context length and fits on <13GB VRAM -- 2025-08-30
facebookincubator/pces -- 2025-08-29
Google Debuts Device-Bound Session Credentials Against Session Hijacking -- 2025-08-29
Treasury Announces Federal Govt Will Phase Out Paper Checks on September 30th -- 2025-08-29
Bearer token keeps getting forgotten - somehow -- 2025-08-29
texttron/BrowseComp-Plus -- 2025-08-28
Scaling RL to Long Videos -- 2025-08-28
Meta's AI Companion Policy Is Outrageous -- 2025-08-27
Developer sentenced to prison for activating “kill switch” to avenge his firing -- 2025-08-25
How to Stop Zeus from Toasting Your Pi -- 2025-08-25
superfashi/pwnbot-ng -- 2025-08-25
Automated microgreens mini-farm ran by Claude Code -- 2025-08-25
Faster prefill on CPU-MoE IK-llama? -- 2025-08-23
Llamarunner, a llama.cpp manager and runner (with user presets!) -- 2025-08-23
what's "load_in_4bit" in unsloth LORA training? -- 2025-08-23
merve/smol-vision -- 2025-08-23
Rubby2001/Rshell---A-Cross-Platform-C2 -- 2025-08-23
Cloudflare incident on August 21, 2025 -- 2025-08-23
DeepSeek-V3.1 (Thinking and Non Thinking) -- 2025-08-22
Modify <think> to explore the impact on <answer> -- 2025-08-22
Tiny finance “thinking” model (Gemma-3 270M) with verifiable rewards (SFT → GRPO) — structured outputs + auto-eval (with code) -- 2025-08-22
Qwen/Qwen3-30B-A3B-Thinking-2507 -- 2025-08-22
tencent/Hunyuan-7B-Instruct -- 2025-08-22
Bringing Computer Use to the Web -- 2025-08-21
vibheksoni/stealth-browser-mcp -- 2025-08-21
RAG Web Search performs poorly -- 2025-08-21
AGENTS.md – Open format for guiding coding agents -- 2025-08-21
turtacn/kubestack-ai -- 2025-08-21
Critical Cache Poisoning Vulnerability in Dnsmasq -- 2025-08-21
gguf-eval: an evaluation framework for GGUF models using llama.cpp -- 2025-08-21
My open-source agent Maestro is now faster and lets you configure context limits for better local model support -- 2025-08-21
guide : running gpt-oss with llama.cpp -- 2025-08-21
Docker container for running Claude Code in "dangerously skip permissions" mode -- 2025-08-21
Llama Habitat Continues to Expand, Now Includes the PSP -- 2025-08-21
OpenAI Cookbook - Verifying gpt-oss implementations -- 2025-08-21
Docker Model Runner is really neat -- 2025-08-20
Build a Powerful RAG Web Scraper with Ollama and LangChain -- 2025-08-20
Ollama interface with memory -- 2025-08-20
YuminosukeSato/pyproc -- 2025-08-20
AGENTS.md – Open format for guiding coding agents -- 2025-08-20
Need Help: So-Vits-SVC Vibrated/Glitchy Output + Source Vocal Has Residual Music (G=98k, Diff=57k) -- 2025-08-19
GDPR meant nothing: chat control ends privacy for the EU [video] -- 2025-08-19
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels -- 2025-08-19
[Editorial] XBOW vs HackerOne, Flawless victory! -- 2025-08-19
GPT-5 doubles performance in offensive security benchmark -- 2025-08-19
Drop-in Voice App Control for iOS with Local Models -- 2025-08-18
GPT-OSS 20b runs on a RasPi 5, 16gb -- 2025-08-18
PowerInfer/SmallThinker-21BA3B-Instruct -- 2025-08-18
[Editorial] Beyond LLM and SLM.. -- 2025-08-18
Why it’s a mistake to ask chatbots about their mistakes | Ars Technica -- 2025-08-18
LocalAI Major Update: Modular Backends (update llama.cpp, stablediffusion.cpp, and others independently!), Qwen-VL, Qwen-Image Support, Image Editing & More -- 2025-08-18
Could you use RAG and Wikidumps to keep AI in the loop? -- 2025-08-17
Markdown-UI: an interactive UI inside Markdown for LLMs -- 2025-08-17
How can I reduce financial model deployment time from 5–10 days to 2 using automation (Cline, SQL, Snowflake,Tableau/Sigma)? -- 2025-08-17
Model intelligence is no longer the constraint for automation -- 2025-08-17
Implementing a basic equivalent of OpenBSD's pflog in Linux nftables -- 2025-08-17
Google Play Store bans wallets that don't have banking license -- 2025-08-17
Physical Aimbot Shoots For Success In Valorant -- 2025-08-17
[Reproducible] Constraint-guided knowledge file for Claude reduces long-chain drift (MIT PDF, 60-sec setup) -- 2025-08-17
Looking for a way to mimic custom slash commands in Aider -- 2025-08-17
NO WAY BACK -- 2025-08-14
X-Omni-Team/X-Omni -- 2025-08-14
SkyworkAI/Matrix-3D -- 2025-08-14
Best (free) AI Model for learning/understanding large unfamiliar codebases? -- 2025-08-13
MCPs that are part of my day-to-day Claude Code workflow -- 2025-08-13
[Editorial] Rust, AI Agents -- 2025-08-13
[Editorial] AI Winter? -- 2025-08-13
sii-research/siiRL -- 2025-08-13
Hand-picked selection of articles on AI fundamentals/concepts -- 2025-08-13
Update for Maestro - A Self-Hosted Research Assistant. Now with Windows/macOS support, Word/MD files support, and a smarter writing agent -- 2025-08-13
Llama.cpp Vulkan is awesome, It gave new life to my old RX580 -- 2025-08-13
Pairs of GPUs for inference? -- 2025-08-13
Ollama 2x mi50 32GB -- 2025-08-13
Go 1.25 Is Released -- 2025-08-13
[Editorial] New Red Team's Networking Techniques -- 2025-08-13
[Editorial] GLM-4.5, enterprise use -- 2025-08-13
GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface -- 2025-08-13
Nonescape: SOTA AI-Image Detection Model (Open-Source) -- 2025-08-12
Activation-Guided Local Editing for Jailbreaking Attacks -- 2025-08-12
Run GPT-OSS with MLX or GGUF in your CLI using 1 line of code -- 2025-08-11
Built a new VLM (MicroLlaVA) on a single NVIDIA 4090 -- 2025-08-11
nvidia/audio-flamingo-3 -- 2025-08-11
LGAI-EXAONE/EXAONE-4.0-32B -- 2025-08-11
Does GPT-5 have JSON output mode? -- 2025-08-10
How to prevent claude from running `git -A` using hooks? -- 2025-08-10
TSMC to go 3D with wafer-sized processors -- 2025-08-10
Proton's New Two-Factor Authenticator App -- 2025-08-10
CodeFu-7B-v0.1 - a Reinforcement Learning (RL)-trained 7B model for Competitive Programming -- 2025-08-08
lucidrains/h-net-dynamic-chunking -- 2025-08-08
Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training -- 2025-08-08
Saidia: Offline-First AI Assistant for Educators in low-connectivity regions -- 2025-08-06
Finding a local model for text table QA -- 2025-08-06
Kodezi/Chronos -- 2025-08-06
[Editorial] Voice ai and voice agents, howto -- 2025-08-05
[Editorial] You don’t need a WebRTC server for your voice agents -- 2025-08-05
Waiting on direct MCP integration—dev team, got a roadmap update? -- 2025-08-05
Character Bitmap Graphics on the Pet 2001 -- 2025-08-04
Caches: LRU vs. Random -- 2025-08-04
Names are not type safety (2020) -- 2025-08-04
[Editorial] HRM -- 2025-08-03
How are people running an MLX-compatible OpenAI API server locally? -- 2025-08-03
I built the perfect MCP client for broke developers (Ollama powered) -- 2025-08-03
character-ai/pipelining-sft -- 2025-08-03
🚀 Qwen3-30B-A3B Small Update -- 2025-08-01
These new Qwen3 models are cooking! -- 2025-08-01
unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF -- 2025-08-01
[Editorial] Taking AI back from the sparkle ponies. -- 2025-07-28
Autonomous AI Surveillance: Multimodal Deep Learning for Cognitive and Behavioral Monitoring -- 2025-07-28
InditexTech/k8s-overcommit-operator -- 2025-07-27
Migadu Email -- 2025-07-27
Parquet Content-Defined Chunking -- 2025-07-27
How are people staging AI training datasets from NVMe → DDR5 → GPU VRAM for fine-tuning on RTX 5090s? -- 2025-07-25
Running Qwen3 235B-A22B 2507 on a Threadripper 3970X + 3x RTX 3090 Machine at 15 tok/s -- 2025-07-25
mistral-small3.2:latest 15B takes 28GB VRAM? -- 2025-07-25
Looking for help with terrible vLLM performance -- 2025-07-25
Warashi/cage -- 2025-07-22
The Most Powerful Server Embiggens a Bit with Power11 -- 2025-07-22
Vintage Hardware Find Includes Time Capsule of Data -- 2025-07-22
rip-zoyo/orbit-tls -- 2025-07-22
vidore/colqwen-omni-v0.1 -- 2025-07-21
Advancing Retrieval-Augmented Generation for Structured Enterprise and Internal Data -- 2025-07-21
Super fast local CPU file processing with static embeddings! -- 2025-07-21
Servo Web Engine Further Tuning Performance -- 2025-07-20
Upcoming deprecation of GitHub Command Palette feature preview -- 2025-07-20
I fed Gemini a lot of posts from this reddit and let it summarize the best practice -- 2025-07-19
Consilium: When Multiple LLMs Collaborate -- 2025-07-19
Defense Department to begin using Grok -- 2025-07-18
LGAI-EXAONE/EXAONE-4.0-1.2B -- 2025-07-18
This SSD Will Self Destruct in Ten Seconds… -- 2025-07-18
Locally Running AI model with Intel GPU -- 2025-07-18
Where local is lagging behind... Wish lists for the rest of 2025 -- 2025-07-18
Devstral-Vision-Small-2507 -- 2025-07-18
Xttsv2 model, Chatterbox on MacBook air 8 gb -- 2025-07-18
Claude deleted my whole repository -- 2025-07-17
Defeating Memory Leaks with Zig Allocators -- 2025-07-17
OpenDPDv2: A Unified Learning and Optimization Framework for Neural Network Digital Predistortion -- 2025-07-17
Stop monitoring systems; start monitoring outcomes -- 2025-07-16
Migrating the Hub from Git LFS to Xet -- 2025-07-16
RekaAI/reka-flash-3.1 -- 2025-07-15
What kind of throughput can I expect with Llama 3.1 on a H200? -- 2025-07-15
Local LLM to back Elastic AI -- 2025-07-13
Blackwell FP8 W8A8 NVFP4 support discussion -- 2025-07-13
Is there some localllm benchmarking tool to see how well your system will handle a model? -- 2025-07-13
Unlocking AMD MI300X for High-Throughput, Low-Cost LLM Inference -- 2025-07-13
AI4Research: A Survey of Artificial Intelligence for Scientific Research -- 2025-07-13
Owen777/Kontext-Style-Loras -- 2025-07-13
Best Practices for Integrating Onyx (Danswer) with Open WebUI Pipelines -- 2025-07-13
Just shipped first uvx compatible public pypi release for my automated Open WebUI Postgres migration tool -- 2025-07-13
[P-6] Decoding FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space -- 2025-07-13
Local AI server with Ollama and Tailscale integration looking for feedback -- 2025-07-12
Running OpenWebUI Without RAG: Faster Web Search & Document Upload -- 2025-07-12
Show HN: Pangolin – Open source alternative to Cloudflare Tunnels -- 2025-07-11
SUS Lang: The SUS Hardware Description Language -- 2025-07-11
Embedded USB Debug for Snapdragon -- 2025-07-11
introducing cocoindex - super simple etl to prepare data for ai, with dynamic index (ollama integrated) -- 2025-07-08
welltodopoker/kubernetes-dynamic-reclaimable-pvc-controllers -- 2025-07-08
Python Pandas Ditches NumPy for Speedier PyArrow -- 2025-07-08
Claude Max Integration - Roo Code 3.21.4 & 3.21.5 Release Notes -- 2025-07-05
OpenAI to buy AI startup from Jony Ive -- 2025-07-05
The Hobby Computer Culture -- 2025-07-05
OpenMIDIStomper Makes Sure Your Gear Does What Your Foot Says -- 2025-07-05
How realistic is it to run a media site entirely on AI-generated code with no developers? -- 2025-07-03
Thinking about switching from cloud based AI to sth more local -- 2025-07-03
There are no new ideas in AI only new datasets -- 2025-07-02
trycua/cua -- 2025-06-30
ml0-1337/claude-gate -- 2025-06-30
mstrYoda/go-arctest -- 2025-06-30
Hack of SEC's Edgar System Exposed Flaws in US Financial Security -- 2025-06-29
$^{100}$Mo-enriched Li$_2$MoO$_4$ scintillating bolometers for $0\nu 2\beta$ decay search: from LUMINEU to CUPID-0/Mo projects -- 2025-06-29
yushangxiao/claude2api -- 2025-06-28
nlohmann/json -- 2025-06-25
marimo-team/marimo -- 2025-06-25
Python ASGI Framework Benchmarks -- 2025-06-24
AI in my plasma physics research didn’t go the way I expected -- 2025-06-23
identicallead/mse6 -- 2025-06-22
GCC 13.4 Released with 129 additional bug fixes -- 2025-06-22
Databricks acquires Neon -- 2025-06-22
crumbyte/noxdir -- 2025-06-21
strapi/strapi -- 2025-06-21
n8n-io/n8n -- 2025-06-20
Learning (The Basics of) Nftables -- 2025-06-18
Linux Cgroup from First Principles -- 2025-06-18
Databricks Free Edition -- 2025-06-18
kn0x0x/CVE-2025-32756-POC -- 2025-06-17
Magic Leap One Bootloader Exploit -- 2025-06-17
Take9 Won't Improve Cybersecurity -- 2025-06-17
nanonets/Nanonets-OCR-s -- 2025-06-16
Menlo/Jan-nano-gguf -- 2025-06-16
Olow304/memvid -- 2025-06-14
carbon-language/carbon-lang -- 2025-06-12
1001 Ways of Scenario Generation for Testing of Self-driving Cars: A Survey -- 2025-06-11
100G Data Center Interconnections with Silicon Dual-Drive Mach-Zehnder Modulator and Direct Detection -- 2025-06-11
litert-community/Gemma3-1B-IT -- 2025-06-11
Qwen/Qwen3-Reranker-8B -- 2025-06-11
open-thoughts/OpenThinker3-7B -- 2025-06-10
Cosmos-Reason1: Physical AI Common Sense and Embodied Reasoning Models -- 2025-06-10
How does vector dimension reduction work in new Qwen3 embedding models? -- 2025-06-10
New Upgraded Deepseek R1 is now almost on par with OpenAI's O3 High model on LiveCodeBench! Huge win for opensource! -- 2025-06-10
Mundane Robustness Benchmarks -- 2025-06-10
My Local LLM plan for academic editing help -- 2025-06-10
What is the best and affordable uncensored model to fine tune with your own data? -- 2025-06-10
Backpropagation with Automatic Differentiation from Scratch in Python -- 2025-06-10
DeepSeek-R1-0528 Released on Official API! -- 2025-06-10
Misconceptions about the Unix Philosophy -- 2025-06-10
Where hyperscale hardware goes to retire: Ars visits a big ITAD site -- 2025-06-10
How to Connect an External RAG Database (FAISS, ChromaDB, etc.) to Open WebUI? -- 2025-06-10
meilisearch/meilisearch -- 2025-06-09
Weaponizing Dependabot: Pwn Request at its finest -- 2025-06-08
Experts -- 2025-06-08
007: Democratically Finding The Cause of Packet Drops -- 2025-06-08
GoogleCloudPlatform/generative-ai -- 2025-06-07
EvolutionAPI/evo-ai -- 2025-06-06
Stopping AI scrapers from taking down my server -- 2025-06-05
kubeflow/kubeflow -- 2025-06-03
Enhancing MySQL: MySQL improvement project -- 2025-06-03
JefferyHcool/BiliNote -- 2025-06-02
IDE for PostgreSQL in VS Code from Microsoft -- 2025-06-02
10,000 km Straight-line Transmission using a Real-time Software-defined GPU-Based Receiver -- 2025-06-02
astral-sh/uv -- 2025-06-01
Realtek's $10 tiny 10GbE NIC will hit motherboards soon -- 2025-06-01
RSyncUI – A SwiftUI based macOS GUI for rsync -- 2025-06-01