AI Hardware

AI news coverage

702 articles across 197 editions

Articles

Pine64 launch $50 smart speaker for Home Assistant tinkerers -- 2026-07-03
Tiny Jetson Nano Orin Super Benchmarking of 1B and sub 1B LLMs — llama.cpp vs Ollama -- 2026-07-03
nvidia/Cosmos3-Nano -- 2026-07-03
96gb+ 4090's and 5090 are literally a scam. I mods these cards myself -- 2026-07-02
Mtia v2 - what to do -- 2026-07-02
HIP: use hipBLAS for dense prefill on gfx900, keep MMQ for MoE by DEV-DUFORD · Pull Request #24588 · ggml-org/llama.cpp -- 2026-07-02
Is Qwen3-VL-2B the only viable VLM for JSON extraction on a "potato"? -- 2026-07-02
Apple Neural Engine: Architecture, Programming, and Performance -- 2026-06-30
OpenRouter pricing implies heavier quantization than advertised -- 2026-06-30
llama.cpp updates: granite-speech, LFM2.5 embeddings, Vulkan backend improvements -- 2026-06-30
deepseek-ai/DeepSeek-V4-Pro-DSpark • Huggingface -- 2026-06-29
High-quality GLM-5.2 Quant on 4x DGX Spark - Guide, Results, and Comps -- 2026-06-29
Ornith 35B is great so far -- 2026-06-29
Book Review: Domain-Specific Small Language Models by Guglielmo Iozzia -- 2026-06-29
Apple to skip high-end M6 Mac chips in favor of AI-focused M7 line -- 2026-06-27
NVIDIA's New AI Servers Run on Hot Tub Coolant and Don't Need Evaporators -- 2026-06-27
Apple raises prices of MacBooks, iPads -- 2026-06-26
IBM debuts sub-1 nanometer chip technology -- 2026-06-26
OpenAI and Broadcom unveil LLM-optimized inference chip -- 2026-06-26
OpenAI unveils its first custom chip, built by Broadcom -- 2026-06-25
Qualcomm to Acquire Modular -- 2026-06-25
45°C cooling design cuts data center water use to near zero -- 2026-06-25
OpenAI's CSO on Registration List for Thiel's Secret $16K Retreat Alongside Treasury Secretary -- 2026-06-25
[Editorial] The Easiest Way to Lie with AI -- 2026-06-25
Long-Theorized GPS Weakness Exploited on Large Scale -- 2026-06-25
Breaking Into a Prison Tablet -- 2026-06-25
[Editorial] Video Feature -- 2026-06-25
Leaky Player Piano Gets MIDI Upgrade in YouTube Restomod -- 2026-06-25
[Editorial] Apple Core AI Developer Framework -- 2026-06-25
[Editorial] Apple Container — Open-Source Container Runtime -- 2026-06-25
Accelerating Fine-Tuning with NVIDIA NeMo AutoModel -- 2026-06-25
NVIDIA Claims 15x Speedup with Diffusion-Based LLM — Entire Block of Text Generated at Once -- 2026-06-25
[Editorial] Video Pick 1 -- 2026-06-19
Epic Games announces Lore version control system -- 2026-06-19
Supertone/supertonic-3 -- 2026-06-19
[Editorial] Video Pick 3 -- 2026-06-19
I Could've Rickrolled the FIFA World Cup. All I Needed Was My ID -- 2026-06-19
MIT researchers built their own OS to study how chips really work -- 2026-06-19
Finally - 4xRTX 5060TI -- 2026-06-18
You can run Deepseek 4 flash on mac (M3 Max, 96gb) -- 2026-06-18
Built a local AI assistant because I always knew this day would come, yesterday just made it feel very real -- 2026-06-18
Local streaming copilot: live transcription + real-time suggestions, on a Jetson -- 2026-06-18
[Editorial] AMD CEO Lisa Su Challenges NVIDIA's GPU Dominance -- 2026-06-16
PonyExl3: EXL3 Quantization Ported to Apple Silicon — 2700 tok/s Prefill, 68.5 tok/s Decode on M5 Max -- 2026-06-16
My Homelab AI Dev Platform -- 2026-06-16
[Editorial] Chainguard Scans Source Code for Malware and Greyware -- 2026-06-12
Are insecure code completions in PyCharm a vulnerability? -- 2026-06-12
ShieldNet-360/prompt-gate -- 2026-06-12
[Editorial] Staris Tech -- 2026-06-12
huawei-csl/KVarN -- 2026-06-12
chiennv2000/orthrus -- 2026-06-12
Qwen/Qwen3.5-122B-A10B -- 2026-06-12
zai-org/GLM-OCR -- 2026-06-12
Apple reveals new AI architecture built around Google Gemini models -- 2026-06-10
Apple Core AI Framework -- 2026-06-10
[Editorial] Apple just killed paying for AI -- 2026-06-10
MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second -- 2026-06-09
Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States -- 2026-06-09
Gooey: A GPU-accelerated UI framework for Zig -- 2026-06-05
Branchless Quicksort faster than std::sort and pdqsort with C and C++ API -- 2026-06-05
HP re-releases classic computer science calculator: The HP-16C -- 2026-06-05
Hacking your PC using your speaker without ever touching it -- 2026-06-04
[Editorial] -- 2026-06-04
I built a vulnerable app and spent $1,500 seeing if LLMs could hack it -- 2026-06-04
Use your Nvidia GPU's VRAM as swap space on Linux -- 2026-06-03
[Editorial] chipotlai-max -- 2026-06-03
[Editorial] Video Submission -- 2026-06-03
It is an amazing time for programmers -- 2026-06-03
Claude Code – Everything You Can Configure That the Docs Don't Tell You -- 2026-06-02
[Editorial] -- 2026-06-02
Turning local agents into self-optimizing agents -- 2026-06-02
1-Bit Bonsai Image 4B Image Generation for Local Devices -- 2026-06-01
Old Mac Pro still proving its worth -- 2026-06-01
Heterogeneous GPU Weighting & Layer Splitting -- 2026-06-01
I finally put my NPU (Intel Arrow Lake) to use doing ASR for my smart home -- 2026-06-01
OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face -- 2026-06-01
hipEngine: Fast native Qwen 3.6 inference for RDNA3 — ROCm-native open-source engine -- 2026-05-29
Custom C++ engine for MiniCPM-V 4.6 on Orange Pi AIPro — 2× speedup via custom AscendC kernels -- 2026-05-29
Krasis update: Qwen3.6-35B-A3B (Q4) at reading speed, 1x 8GB 3070 Mobile laptop (32GB RAM) -- 2026-05-28
Benchmarked Needle 26M vs Qwen3-0.6B on CPU function calling, 50 queries across 5 difficulty tiers. The 23x smaller model wins on accuracy and is 4.4x faster. -- 2026-05-28
Small comparison on full compute performance (Anima) of 5090 vs 6000 PRO MaxQ vs 6000 PRO WS/SE -- 2026-05-28
CXMT started selling ram to corsair -- 2026-05-28
TTS Benchmark Comparison (all known TTS up until May 2026) -- 2026-05-28
I'm Getting into Mesh Networks (Meshtastic, MeshCore, and Reticulum) -- 2026-05-28
A Eureka machine that thinks like nature and explores what AI cannot -- 2026-05-28
Starbucks is scrapping its AI inventory tool after it reportedly miscounted and mislabeled items -- 2026-05-28
$400 Qwen 3.6-27B Setup — Dual RTX 3060 — 30-50 t/s -- 2026-05-27
Qwen3.6 27B made me a believer — single-shot game development with local model -- 2026-05-27
Qwen3.5 35B-A3B Uncensored Heretic — Native MTP Preserved, multiple formats -- 2026-05-27
[Editorial] MSI NVIDIA DGX Station -- 2026-05-27
Have we passed the peak of inflated expectations? -- 2026-05-27
AMD Ryzen AI Halo Developer Platform and Ryzen AI Max PRO 400 Series -- 2026-05-21
May 2026 Updated Strix Halo Mini PC Comparison Chart -- 2026-05-21
Two VRAM-modded RTX 2080 Ti (22GB each) running Qwen3.6-27B at 38 t/s -- 2026-05-21
Anyone holding out for M5 Ultra? -- 2026-05-21
Cloud GPU prices creeping up everywhere — RunPod, Vast, hidden fees -- 2026-05-21
Tesla Wall Connector bootloader bypasses the firmware downgrade ratchet -- 2026-05-19
Regex Chess: A 2-ply minimax chess engine in 84,688 regular expressions -- 2026-05-19
Dual GPU llama.cpp speedup -- 2026-05-19
Glia – Local-first shared memory layer (SQLite-vec + FTS5 + Offline Knowledge Graph) -- 2026-05-19
A VERY lightweight open web-search tool for smaller local LLMs -- 2026-05-19
Anyone else following Q.ANT's photonic GPU advancements? Tech shifting point -- 2026-05-19
Tired of memory leakage between projects? I built a Folder-Scoped Memory Isolation filter for Open WebUI! -- 2026-05-19
Using Intel Arc Pro series, any thoughts ? -- 2026-05-19
AMD Introduces Instinct MI350P Accelerator: CDNA 4 Comes to PCIe Cards -- 2026-05-14
MedQA: Fine-Tuning a Clinical AI on AMD ROCm — No CUDA Required -- 2026-05-14
STAM: Stable Training with Adaptive Momentum — A New Optimizer for AI Training -- 2026-05-14
AMD to release slottable GPU -- 2026-05-11
Taiwanese company Skymizer announces HTX301 - PCIE inference card with 384GB of Memory at ~240 Watts -- 2026-05-11
Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store -- 2026-05-11
Anthropic Just Secured a Reserve — 220,000 GPUs via SpaceX Colossus -- 2026-05-08
[Editorial] The Four Layers of AI Measurement — A CFO's Framework -- 2026-05-08
[Editorial] Cognitum — AI Knowledge Platform -- 2026-05-08
[Editorial] Behind the Scenes: Hardening Firefox -- 2026-05-08
[Editorial] Moak AI — AI Security Tool -- 2026-05-08
[Editorial] Thousands of Vibe-Coded Apps Expose Corporate and Personal Data -- 2026-05-08
Bleeding Llama: Critical Unauthenticated Memory Leak in Ollama -- 2026-05-08
Maybe You Shouldn't Install New Software for a Bit -- 2026-05-08
$100K budget to build an in-house LLM server for agentic coding -- 2026-05-05
Tenstorrent TT-QuietBox 2 Specifications (Blackhole) -- 2026-05-05
CUDA + ROCm simultaneously with -DGGML_BACKEND_DL=ON -- 2026-05-05
Tinygrad Driver testing on Blackwell + M3 Ultra RDMA cluster -- 2026-05-05
MiMo 2.5 requires at least 4 GPUs — model accessibility locked behind hardware -- 2026-05-05
Writing an LLM Compiler from Scratch: PyTorch to CUDA -- 2026-05-04
Karpathy's MicroGPT Running at 50,000 Tokens/Second on an FPGA -- 2026-05-04
Hipfire: Full AMD Architecture Validation Across RDNA 1–4, Strix Halo, and BC250 -- 2026-05-04
Qwen 3.6-35B KV Cache Benchmark: f16 vs q8_0 vs turbo3 vs turbo4 from 0 to 1M Context -- 2026-05-04
Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 -- 2026-04-30
First direct side by side MoE vs Dense comparison -- 2026-04-30
GPT-6 Confirmed -- 2026-04-30
Google to invest up to $40 billion in Anthropic as search giant spreads its AI bets -- 2026-04-30
convert : add support for Nemotron Nano 3 Omni by danbev - llama.cpp PR #22481 -- 2026-04-30
TorchTPU: Running PyTorch Natively on TPUs at Google Scale -- 2026-04-28
Fully Featured Audio DSP Firmware for the Raspberry Pi Pico -- 2026-04-28
Running Hunyuan Image-to-3D Texture 3x Faster with MLX at half the VRAM on Apple Silicon -- 2026-04-28
[Editorial] -- 2026-04-28
[Editorial] -- 2026-04-28
sparkyniner/DRISH-X-Satellite-powered-freight-intelligence- -- 2026-04-28
Jensen Huang: US chip export controls might be creating the problem they're trying to solve -- 2026-04-24
To Beat China, Embrace Open-Source AI (WSJ) -- 2026-04-24
Meta to start capturing employee mouse movements, keystrokes for AI training -- 2026-04-24
[Editorial] Token Is All You Need: Finding 0days with LLMs -- 2026-04-20
[Editorial] Decepticon — AI Adversarial Testing Framework -- 2026-04-20
[Editorial] RedAI — Red Team AI Tools -- 2026-04-20
[Editorial] Redamon Agentic System Technical Whitepaper -- 2026-04-20
[Editorial] Talon by CarbeneAI — AI Agent Tool -- 2026-04-20
[Editorial] The Mother of All AI Supply Chains: Critical Systemic Vulnerability at the Core of MCP -- 2026-04-20
€54k spike in 13h from unrestricted Firebase browser key accessing Gemini APIs -- 2026-04-20
[Editorial] SecWest: AMD VVI — Hardware-Level Vulnerability Research -- 2026-04-20
[Editorial] Phenoelit/Halvar — Legendary Security Research -- 2026-04-20
AI-Enabled Covert Channel Detection in RF Receiver Architectures -- 2026-04-17
SPICE simulation to oscilloscope verification with Claude Code MCP -- 2026-04-17
pyre-code: Self-hosted ML coding practice platform (68 problems) -- 2026-04-17
[Editorial] Video Submission -- 2026-04-17
Two Months After I Gave an AI $100 and No Instructions -- 2026-04-15
TokenBrawl: Claude vs GPT in a Bomberman-Style 1v1 Game -- 2026-04-15
[Editorial] The Atlantic: 4chan, AI Dungeon, and the Nature of Machine Reasoning -- 2026-04-15
cuLA: CUDA Kernels for Linear Attention in CuTe DSL + CUTLASS -- 2026-04-15
See-Through: Single-Image Layer Decomposition for Anime Characters (SIGGRAPH 2026) -- 2026-04-15
MiniMax-M2.7 NVFP4 on 2x RTX PRO 6000 Blackwell — 127.7 tok/s, 2800 Peak -- 2026-04-15
Offline Companion Robot for Disabled Husband — 8GB RAM Constraints -- 2026-04-15
Experiment: OLMo 3 7B at 1-Bit — Bonsai Quantization-Aware Distillation -- 2026-04-15
Fine-Tuned Qwen3.5-0.8B for OCR — Outperforms Previous 2B Release -- 2026-04-15
[Editorial] Running AI/LLM on a Commodore 64 -- 2026-04-15
Parlor — Real-time AI (audio/video in, voice out) on M3 Pro with Gemma E2B -- 2026-04-09
Gemma-4-E2B-it NVFP4 on DGX Spark — #1 in 9 Spark Arena categories -- 2026-04-09
Unsloth Gemma 4 Fine-Tuning Guide -- 2026-04-09
Bartowski vs Unsloth for Gemma 4 -- 2026-04-09
inferrs — Rust-Based Local LLM Inference Engine -- 2026-04-09
System Card: Claude Mythos Preview [pdf] -- 2026-04-08
[Editorial] -- 2026-04-08
[Editorial] -- 2026-04-08
1-bit llms on device?! -- 2026-04-07
Running SmolLM2-360M on a Samsung Galaxy Watch 4 (380MB RAM) – 74% RAM reduction in llama.cpp -- 2026-04-07
Built my 10x NVidia V100 AI Server - 320gb vram - vLLM Testing Linux Headless -- 2026-04-07
Gemma 4 on iPhone -- 2026-04-06
Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud -- 2026-04-06
[Editorial] Cutting LLM Size in Half Without Quality Loss -- 2026-04-06
[Editorial] Apple's Embarrassing AI Drop -- 2026-04-06
[Editorial] Build a Better Coding Agent at Home -- 2026-04-06
[Editorial] TDX Ray — CPU Trusted Execution Security Research -- 2026-04-06
Rowhammer Attacks via CUDA Kernels Can Root NVIDIA GPU Machines -- 2026-04-06
M5 Max vs M3 Max Inference Benchmarks (Qwen3.5, oMLX, 128GB, 40 GPU cores) -- 2026-04-02
Show HN: 1-Bit Bonsai, the First Commercially Viable 1-Bit LLMs -- 2026-04-02
Lemonade by AMD: a fast and open source local LLM server using GPU and NPU -- 2026-04-02
I spent 48 hours saturating Qwen 3.5 with 2,000,000 tokens to kill 'Quantization-Slop'. Here is the Sovereign Series (0.8B to 27B). -- 2026-04-02
I'm building a medieval RPG where every significant NPC runs on a local uncensored LLM — no cloud, no filters, no hand-holding. Here's the concept. -- 2026-04-02
mac-code: Claude Code Clone Running Free on Apple Silicon (35B at 30 tok/s) -- 2026-03-31
autoresearch-mlx: Karpathy's Autonomous Research Loops on Apple Silicon -- 2026-03-31
TurboQuant for Weights: Near-Optimal 4-bit LLM Quantization with Lossless 8-bit Residual -- 2026-03-31
Copaw-9B: Alibaba's Official Qwen3.5 9B Agentic Finetune -- 2026-03-31
[Editorial] Unsloth: #1 Trending Model for 3 Weeks -- 2026-03-31
[Editorial] MSPM0G3507 Correlation Power Analysis -- 2026-03-24
[Editorial] Latest Red Amon — Recon Tooling Update -- 2026-03-24
Flash-KMeans: Fast and Memory-Efficient Exact K-Means -- 2026-03-20
Autoresearch for SAT Solvers -- 2026-03-20
Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation -- 2026-03-20
Hunter Alpha was a stealth model revealed on March 18th as an early testing version of MiMo-V2-Pro. -- 2026-03-19
mistralai/Mistral-Small-4-119B-2603 -- 2026-03-19
LiquidAI/LFM2-24B-A2B -- 2026-03-19
The First Healthcare Robotics Dataset and Foundational Physical AI Models for Healthcare Robotics -- 2026-03-18
phai-lab/InstantSplatPP -- 2026-03-18
[Editorial] The First Multi-Behavior Brain Upload -- 2026-03-13
MiroThinker-1.7: Open Deep Research Agent (30B and 235B) -- 2026-03-13
[Editorial] Replacing Alexa with a Local Voice Assistant -- 2026-03-11
RunAnywhere (YC W26) — Fastest AI Inference on Apple Silicon -- 2026-03-11
TripoSR Image-to-3D Running Fully On-Device on iPhone via ONNX Runtime -- 2026-03-11
TADA: Fast, Reliable Open-Source Speech Generation via Text-Acoustic Synchronization -- 2026-03-11
[Editorial] -- 2026-03-09
[Editorial] -- 2026-03-09
[Editorial] Apple Introduces MacBook Pro with All-New M5 Pro and M5 Max -- 2026-03-06
Apple Python Foundation Models SDK -- 2026-03-06
[Editorial] Apple Was Losing the AI Race — Until Now -- 2026-03-06
[Editorial] Apple's AI Strategy Vindicated -- 2026-03-06
Inside the M4 Apple Neural Engine, Part 1: Reverse Engineering -- 2026-03-03
Hydroph0bia – fixed SecureBoot bypass for UEFI firmware from Insyde H2O (2025) -- 2026-03-03
[Editorial] -- 2026-02-28
Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration -- 2026-02-28
[Editorial] -- 2026-02-28
Free ASIC Llama 3.1 8B inference at 16,000 tok/s - no, not a joke -- 2026-02-25
[Editorial] Cognitum -- 2026-02-25
Hetzner Prices increase 30-40% -- 2026-02-25
[Editorial] Taalas Etches AI Models onto Transistors to Rocket-Boost Inference -- 2026-02-23
Repurposing 800 RX 580s into an AI Inference Cluster: Mass Document OCR at 24x Lower Cost -- 2026-02-23
[Editorial] Enterprise Open Source AI Coding Is Changing the ROI Calculation -- 2026-02-20
[Editorial] Think Tax: The Real Cost of AI-Generated Code -- 2026-02-20
[Editorial] RuVector DNA Sequence Analysis Example -- 2026-02-20
15 years of FP64 segmentation, and why the Blackwell Ultra breaks the pattern -- 2026-02-20
Nvidia and OpenAI abandon unfinished $100B deal in favour of $30B investment -- 2026-02-20
Google releases Gemini 3.1 Pro with Benchmarks -- 2026-02-20
[Editorial] RVF — Most Consequential AI Infrastructure -- 2026-02-16
[Editorial] Introducing RVF Cognitive Container -- 2026-02-16
375ms Voice-to-Voice Latency: Local Nemotron-4 + Kokoro-82M on Blackwell Bare Metal -- 2026-02-16
Expensively Quadratic: The LLM Agent Cost Curve -- 2026-02-16
Is the Nvidia T4 actually viable for 70B (EXL2) daily driving, or is it just pure cope compared to dual 3090s? -- 2026-02-13
Open weight kimi k2.5 overtakes opus 4.5 non thinking on arena -- 2026-02-13
When did we go from 400k to 256k? -- 2026-02-13
[Editorial] https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference -- 2026-02-09
OpenClaw on edge Linux (systemd + cron) — quick experiment + a few questions -- 2026-02-03
[Ollama Cloud] 29.7% failure rate, 3,500+ errors in one session, support ignoring tickets for 2 weeks - Is this normal? -- 2026-02-03
OpenClaw is everywhere all at once, and a disaster waiting to happen -- 2026-02-03
Companion MIDI Pedal Helps Roland Groovebox Along -- 2026-01-30
Renting out the cheapest GPUs ! (CPU options available too) -- 2026-01-29
What secondary GPU should I get, mainly for local prompting? -- 2026-01-28
On-device tool calling with Llama 3.2 3B on iPhone - made it suggest sushi restaurants [Open Source, React Native] -- 2026-01-28
I have written gemma3 inference in pure C -- 2026-01-28
There's a hidden Android setting that spots fake cell towers -- 2026-01-23
TerabyteDeals – Compare storage prices by $/TB -- 2026-01-23
768Gb Fully Enclosed 10x GPU Mobile AI Build -- 2026-01-22
Drone Hacking Part 1: Dumping Firmware and Bruteforcing ECC -- 2026-01-21
Looking at a Real Fake Raspberry Pi RP2040 Board -- 2026-01-21
3x3090 + 3060 in a mid tower case -- 2026-01-19
Built an 8× RTX 3090 monster… considering nuking it for 2× Pro 6000 Max-Q -- 2026-01-19
vLLM on 2x/4x Tesla v100 32GB -- 2026-01-16
M.2 to 4x Pcie for extra GPU Power Question -- 2026-01-16
New version of Raspberry Pi Generative AI card (HAT+ 2) -- 2026-01-16
Qualcomm's RISC-Ventana Fusion -- 2026-01-14
An Open Source Electromagnetic Resonance Tablet -- 2026-01-14
sardanioss/httpcloak -- 2026-01-13
Making a CRT Spin Right Round, Round, Round -- 2026-01-13
[Editorial] https://www.linkedin.com/posts/stephenbklein_the-age-of-pretend-the-ai-industry-just-spent-activity-7415779694509219842-8OkK -- 2026-01-12
[Editorial] https://www.linkedin.com/posts/reuvencohen_most-people-talk-about-gpus-as-if-they-are-activity-7415778737486483456-7DQK -- 2026-01-12
Dual rx 9070 for LLMs? -- 2026-01-09
Opus 4.5 head-to-head against Codex 5.2 xhigh on a real task. Neither won. -- 2026-01-09
ARCANGEL0/EVA -- 2026-01-07
k2-fsa/Flow2GAN -- 2026-01-07
GNU Ddrescue 1.30 Released -- 2026-01-07
Debunking the AI food delivery hoax that fooled Reddit -- 2026-01-07
Who Cares about the Baltic Jammer? Terrestrial Navigation in Baltic Sea Region [video] -- 2025-12-30
Streaming Music to Cassette -- 2025-12-30
[Editorial] https://zymtrace.com/ -- 2025-12-22
PLX/PEX PCIe 4.0 seems to help for LLMs and P2P! I.e. PEX88096 (1 PCIe 4.0 X16 to 5 PCIE 4.0 X16) and others, and comparison vs bifurcation. -- 2025-12-22
Qubes OS 4.3.0 has been released -- 2025-12-22
SeeSee21/Z-Image-Turbo-AIO -- 2025-12-22
Designing a CPU for Native BASIC -- 2025-12-22
Memory at the Speed of Light -- 2025-12-19
Key Highlights of NVIDIA’s New Model: Nemotron 3 -- 2025-12-17
The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator -- 2025-12-17
[Editorial] https://docs.unsloth.ai/new/deploy-llms-phone -- 2025-12-17
Full AI Voice Agent (Whisper + 700M LLM + NeuTTS) running entirely on an Nvidia Jetson Orin Nano ($250 hardware) with no internet access -- 2025-12-17
8x Radeon 7900 XTX Build for Longer Context Local Inference - Performance Results & Build Details -- 2025-12-17
mistralai/Ministral-3-8B-Instruct-2512 -- 2025-12-12
Benchmarked A100 vs H100 local storage for Multi-GPU loading. The Gen4 bottleneck is brutal for cold starts. -- 2025-12-11
My first OSS project! Observability & Replay for AI agents -- 2025-12-11
Ollama + OpenVINO -- 2025-12-11
OpenTelemetry Distribution Builder -- 2025-12-11
Miles + FSDP2 = Megatron-Level Performance with More Flexibility -- 2025-12-10
[help] RTX pro 6000 - llama.cpp Qwen3-Next-80B maxes out at 70% gpu? -- 2025-12-09
Rate/roast my setup -- 2025-12-09
dynamic allocation of less used experts to slower memory -- 2025-12-08
I built a personal assistant script, and the CPU inference speed beats my Llama setup. -- 2025-12-08
At What Point Does Owning GPUs Become Cheaper Than LLM APIs ? I -- 2025-12-05
A Deep Dive into Using PIO and DMA on the RP2350 -- 2025-12-05
CUA Local Opensource -- 2025-12-05
AMD PRO 395 Radeon 8060S Graphics - Any recent Benchmarks -- 2025-12-04
LoRa Repeater Lasts 5 Years on PVC Pipe and D Cells -- 2025-12-03
LM Studio beta supports Qwen3 80b Next. -- 2025-12-03
4xRTX 4000 Pro Blackwell vs 1x6000 RTX Pro -- 2025-12-02
moonshotai/Kimi-Linear-48B-A3B-Instruct -- 2025-12-02
orabazes/FLUX.2-dev-GGUF -- 2025-12-02
Transformers v5: Simple model definitions powering the AI ecosystem -- 2025-12-02
You can now do FP8 reinforcement learning locally! (<5GB VRAM) -- 2025-12-01
stepfun-ai/Step-Audio-R1 -- 2025-11-28
Strix Halo batching with tensor parallel and pipeline parallel using vllm benchmarked -- 2025-11-28
RTX 3090 vs RX 7900 with ROCm, also Vulkan -- 2025-11-26
moonshotai/Kimi-K2-Thinking -- 2025-11-26
Can an expert chime in and explain what is holding Vulkan back from becoming the standard API for ML? -- 2025-11-25
Ollama Not Using GPU on RTX 5070 Ti (Blackwell) -- 2025-11-25
Microsoft makes Zork open-source -- 2025-11-25
Possibly-Smallest ESP32 Board Uses Smallest-Footprint Parts -- 2025-11-25
The Qtile Window Manager: A Python-Powered Tiling Experience -- 2025-11-25
Most Stable Raspberry Pi? Better NTP with Thermal Management -- 2025-11-25
zai-org/Glyph -- 2025-11-21
tlennon-ie/qwen-edit-skin -- 2025-11-21
Mating Cycles: Engineering Connectors to Last -- 2025-11-21
Communication-Efficient and Accurate Approach for Aggregation in Federated Low-Rank Adaptation -- 2025-11-21
Gemini 2.5 Flash Image / Nano Banana Tutorial -- 2025-11-21
Built a tool to solve the "how much GPU do I actually need?" problem for LLM deployment -- 2025-11-20
New Parameter Browser added to Llamacpp Model Launcher! experimental model parameter tuning(window/cuda only) -- 2025-11-20
cuda device list mismatch - ggml_cuda_init / ubuntu - significance to using --main-gpu flag -- 2025-11-20
What Size of LLM Can 4x RTX 5090 Handle? (96GB VRAM) -- 2025-11-20
DOE gives Microsoft partner $1B loan to restart Three Mile Island reactor -- 2025-11-20
PCIE Bifurcation - More than 4 GPUs on a consumer motherboard -- 2025-11-18
What is the best GPU for Llama 3 (.1 or .3)? -- 2025-11-18
PyTorch 2.10.0a0 w/ Blackwell (sm_120) Support — Patched & Packaged for One-Command Install -- 2025-11-17
Half-trillion parameter model on a machine with 128 GB RAM + 24 GB VRAM -- 2025-11-17
Real-Time BART in a Box Smaller Than Your Coffee Mug -- 2025-11-17
Tiny386 on an Espressif ESP32-S3 -- 2025-11-14
[Editorial] Balancing order, freedom, and technology -- 2025-11-12
AMD warns the Intel and Nvidia partnership is a risk to its business -- 2025-11-12
A Pentium In Your Hand -- 2025-11-12
Hardware recommendations for Ollama for homelab -- 2025-11-10
When Your Hash Becomes a String: Hunting Ruby's Million-to-One Memory Bug -- 2025-11-07
Maude 3 Manual -- 2025-11-07
[Editorial] https://www.suffsyed.com/futurememo/the-design-leaders-are-lying-to-you -- 2025-11-07
Adding a RTX 5080 into a 2U server with OcuLink -- 2025-11-06
Why does Image Recognition work in llama-server but not through Open WebUI? -- 2025-11-06
2025 Component Abuse Challenge: A Piezo Disk Powers A Transmitter -- 2025-11-06
[D] It turns out WDDM driver mode is making our RAM - GPU transfer extremely slower compared to TCC or MCDM mode. Anyone has figured out the bypass NVIDIA software level restrictions? -- 2025-11-05
[Editorial] https://blog.peerllm.com/2025/11/02/announcing-v0.7.6.html -- 2025-11-04
Faster llama.cpp ROCm performance for AMD RDNA3 (tested on Strix Halo/Ryzen AI Max 395) -- 2025-11-04
KTransformers Open Source New Era: Local Fine-tuning of Kimi K2 and DeepSeek V3 -- 2025-11-04
ZOZO's Contact Solver for physics-based simulations -- 2025-11-01
Need advice on building a GPU-based render/AI compute setup: Unsure about hardware direction -- 2025-11-01
Flamingo 3 released in safetensors -- 2025-10-31
My LLM-powered text adventure needed a dynamic soundtrack, so I'm training a MIDI generation model to compose it on the fly. Here's a video of its progress so far. -- 2025-10-31
Optimizing gpt-oss-120B on AMD RX 6900 XT 16GB: Achieving 19 tokens/sec -- 2025-10-31
[Editorial] https://tee.fail/ -- 2025-10-29
Satellite Snooping Reveals Sensitive Unencrypted Data -- 2025-10-29
GPT-OSS-20b TAKE THE HELM! Further experiments in autopilot. -- 2025-10-28
5060ti chads... ram overclocking, the phantom menace -- 2025-10-28
Batch inference locally on 4080 -- 2025-10-28
Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 -- 2025-10-28
Esonhugh/go-rex-java -- 2025-10-27
SuperSonic – SuperCollider's audio engine in a Web AudioWorklet -- 2025-10-27
3-way FTP: Pushing files around with silly and unusual methods -- 2025-10-27
HRV Gets Home Automation Upgrades -- 2025-10-27
Looking for some advice/input for LLM and more -- 2025-10-26
AlpinDale/ssh-dashboard -- 2025-10-26
GPU 101 and Triton kernels -- 2025-10-26
Llama.cpp is looking for M5 Neural Accelerator performance testers -- 2025-10-24
NVIDIA sent me a 5090 so I can demo Qwen3-VL GGUF -- 2025-10-24
AMD ROCm 7.9 and dwindling GPU support -- 2025-10-24
I got Kokoro TTS running natively on iOS! 🎉 Natural-sounding speech synthesis entirely on-device -- 2025-10-22
Mobile fully on device inference AI chat app with RAG support -- 2025-10-22
I am generally impressed by iPhone 17 GPU -- 2025-10-22
⚡ Gemma 3 1B Smart Q4 — Bilingual (IT/EN) Offline AI for Raspberry Pi 4/5 -- 2025-10-22
Valve Developer Contributes Major Improvement To RADV Vulkan For Llama.cpp AI -- 2025-10-22
I want to build an AI inference server for 72B models...what should I do? -- 2025-10-22
PlayDiffusion finetune for audio inpainting non-verbal tags -- 2025-10-21
Nvidia has produced the first Blackwell wafer on US soil -- 2025-10-21
Show HN: Syna – Minimal ML and RL Framework Built from Scratch with NumPy -- 2025-10-21
C Project Turns Into Full-Fledged OS -- 2025-10-21
Local VLLM Accelerated Evolution Framework -- 2025-10-19
Nvidia DGX Spark, is it worth ? -- 2025-10-19
Open source streaming STT (Parakeet + Silero + Pipecat Smart Turn) -- 2025-10-19
Turn ChatGPT into a real-time meeting assistant (via MCP + Apps SDK) -- 2025-10-19
[Editorial] Claude Skills are awesome, maybe a bigger deal than MCP -- 2025-10-18
NVIDIA DGX Spark Benchmarks -- 2025-10-18
Should I add another 5060 Ti 16GB or two? Already had 1 x 5070 Ti and 3 x 5060 Ti 16G -- 2025-10-18
Last week in Multimodal AI - Local Edition -- 2025-10-16
BosonAI's Higgs-Llama-3-70B AWQ Quantized (140GB → 37GB) -- 2025-10-16
Worthwhile using Ollama without nVidia? -- 2025-10-16
LiquidAI/LFM2-8B-A1B-GGUF -- 2025-10-16
The Entire Process of Building an Open Source Analog ASIC -- 2025-10-15
Smart Bulbs Are Turning Into Motion Sensors -- 2025-10-11
Qwen3-VL-30B-A3B-Thinking GGUF with llama.cpp patch to run it -- 2025-10-10
What and when 7900xtx is boosted? -- 2025-10-10
Script to install a bunch of AI or Dev tools automatically.. what can I add to it or improve? -- 2025-10-10
Qwen/Qwen3-VL-30B-A3B-Instruct -- 2025-10-10
BenchVolt PD: USB PD Meets Benchtop Precision -- 2025-10-10
Sneak Preview: Ollama Bench -- 2025-10-08
When Curl Works but IntelliJ Doesn't: The Ollama Connection Mystery -- 2025-10-08
XiangShan Vector Floating-Point Unit Design -- 2025-10-07
google/timesfm-2.5-200m-pytorch -- 2025-10-07
svg-project/flash-kmeans -- 2025-10-07
[Editorial] Agentic Tribe -- 2025-10-06
AlexanderYastrebov/onion-vanity-address -- 2025-10-06
Open Printer is an open-source inkjet with DRM-free ink and no subscriptions -- 2025-10-06
Yes, Gemini, A Wii Server Is Possible -- 2025-10-06
Running Qwen3-VL-235B (Thinking & Instruct) AWQ on vLLM -- 2025-10-06
Granite 4 H Tiny Q8 in RTX 3090, It's a context king. -- 2025-10-06
Video2X 6.x — open-source upscaler + frame interpolation (Anime4K v4 / Real-ESRGAN / Real-CUGAN / RIFE) 🚀 -- 2025-10-06
For llama.cpp/ggml AMD MI50s are now universally faster than NVIDIA P40s -- 2025-10-03
MSI EdgeXpert Compact AI Supercomputer Based on NVIDIA DGX Spark -- 2025-10-03
Kairos: Immutable Distro for K8s at the Edge -- 2025-10-03
Nvidia Has Been Supplying NDA'ed Docs to Red Hat for Helping NVK Driver -- 2025-10-03
Mini Laptop Needs Custom Kernel -- 2025-10-03
K2-Think 32B - Reasoning model from UAE -- 2025-10-03
MoonshotAI/checkpoint-engine -- 2025-10-03
Whither the Chip Shortage? -- 2025-10-02
[Editorial] https://github.com/emcie-co/parlant -- 2025-10-02
Built a persistent memory system for LLMs - 3 months testing with Claude/Llama -- 2025-10-02
Do I need to run /init on a repo if I already have AGENTS.md? -- 2025-10-02
Upgrade to Kernel 6.16.9 solves 15.5GB Strix Halo memory limitation -- 2025-09-30
Seeking Advice: Best Model + Framework for Max Tokens/sec on Dual L40S (Testing Rig) -- 2025-09-30
OpenHelix-Team/VLA-Adapter -- 2025-09-29
Reviving a Scrapped Sound Blaster 2.0 ISA Soundcard -- 2025-09-29
kijai/ComfyUI-WanAnimatePreprocess -- 2025-09-29
More money than brains... building a workstation for local LLM. -- 2025-09-28
Fully-Local AI Agent Runs on Raspberry Pi, With a Little Patience -- 2025-09-28
PC memory costs to climb as fabs chase filthy lucre in servers and HBM -- 2025-09-27
Qwen3 235b Q2 with Celeron, 2x8gb of 2400 RAM, 96GB VRAM @ 18.71 t/s -- 2025-09-27
This $5,999 RTX PRO 6000 Ebay listing is a scam, right? -- 2025-09-26
Accelerating Local AI on Consumer GPUs: A Hardware-Aware Dynamic Strategy for YOLOv10s -- 2025-09-26
How bad to have RTX Pro 6000 run at PCIE x8? -- 2025-09-24
I Upgrade 4090's to have 48gb VRAM: Comparative LLM Performance -- 2025-09-23
Some things I learned about installing flash-attn -- 2025-09-23
Comparison H100 vs RTX 6000 PRO with VLLM and GPT-OSS-120B -- 2025-09-23
My self-hosted app uses local Whisper for transcription and a local LLM for summaries & event extraction -- 2025-09-20
I open-sourced a text2SQL RAG for all your databases and local models -- 2025-09-20
firstbatchxyz/mem-agent-mcp -- 2025-09-20
yangzhou24/OmniWorld -- 2025-09-18
Analog Optical Computer for Inference and Combinatorial Optimization -- 2025-09-18
Why is the name of a wireless mouse hard-coded into Windows Bluetooth drivers? -- 2025-09-17
Chesars/whatsapp-mcp -- 2025-09-15
Claude’s memory architecture is the opposite of ChatGPT’s -- 2025-09-15
The Internet Will Be More Dead Than Alive Within 3 Years, Trend Shows | All signs point to a future internet where bot-driven interactions far outnumber human ones. -- 2025-09-15
New "speech" mode in Imagine... -- 2025-09-15
[vllm] Hints to run Qwen3-235B MoE on 8x AMD mixed cards! -- 2025-09-12
Inference for 24 people with a 5000€ budget -- 2025-09-12
$142 upgrade kit and spare modules turn Nvidia RTX 4090 24GB to 48GB AI card -- 2025-09-12
Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers -- 2025-09-12
Finally: 3090 Successor: 5070 Ti super 24Gb 800$ -- 2025-09-08
Fantastic pretraining optimizers and where to find them -- 2025-09-08
Qwen/Qwen3-4B-Instruct-2507 -- 2025-09-08
moonshotai/Kimi-K2-Instruct-0905 -- 2025-09-08
Tenstorrent p150a tested against RTX5090, RTX3090, A100, H100 by Russian blogger -- 2025-09-08
unsloth/gemma-3-270m-it-GGUF -- 2025-09-08
LiquidGEMM: Seems interesting -- 2025-09-07
How to use a Hugging Face embedding model in Ollama -- 2025-09-07
Wal3: A Write-Ahead Log for Chroma, Built on Object Storage -- 2025-09-07
Running LLM Locally with Ollama + RAG -- 2025-09-07
yaof20/Flash-RL -- 2025-09-07
Vulkan back ends, what do you use? -- 2025-09-06
Relaxed-System-Lab/Flash-Sparse-Attention -- 2025-09-06
Intel Files Patent for "Software Defined Super Cores" -- 2025-09-04
The Sense and Nonsense of Virtual Power Plants -- 2025-09-04
Configurable Stereo Preamp from Matrix Switch -- 2025-09-03
Raspberry Pi 5 support (OpenBSD) -- 2025-09-03
Measuring Nanoparticles by Scattering a Laser -- 2025-09-03
gpt-oss:120b running on an AMD 7800X3D CPU and a 7900XTX GPU -- 2025-09-03
unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF -- 2025-09-03
I tried almost every tts model on my ryzen 7 5000 series 16gb ram rtx 3060 laptop 6-8GB Vram -- 2025-09-02
devnen/Kitten-TTS-Server -- 2025-09-02
TheDrummer is on fire!!! -- 2025-09-01
NVIDIA-Nemotron-Nano-12B-v2 -- 2025-09-01
Sparrow: Custom language model architecture for microcontrollers like the ESP32 -- 2025-08-30
Claude Memory Lazy Method: The Graduation Path (From 4 Prompts to 1) -- 2025-08-30
Lynx-R1 Headset Makers Release 6DoF SLAM Solution As Open Source -- 2025-08-30
RTX PRO 6000 MAX-Q Blackwell for LLM -- 2025-08-28
A PLL For Perfect Pitch -- 2025-08-27
Local Inference for Very Large Models - a Look at Current Options -- 2025-08-27
Is there any way to run 100-120B MoE models at >32k context at 30 tokens/second without spending a lot? -- 2025-08-26
Right GPU for AI research -- 2025-08-26
Help me decide between these two pc builds -- 2025-08-25
Faster prefill on CPU-MoE IK-llama? -- 2025-08-23
Llamarunner, a llama.cpp manager and runner (with user presets!) -- 2025-08-23
what's "load_in_4bit" in unsloth LORA training? -- 2025-08-23
merve/smol-vision -- 2025-08-23
city96/Qwen-Image-gguf -- 2025-08-23
Menlo/Lucy-128k -- 2025-08-23
NVIDIA just accelerated output of OpenAI’s gpt-oss-120B by nearly 2x -- 2025-08-23
Is openrouters tokens per second reading super bugged? -- 2025-08-22
It’s a Pi, But it’s not Quite a Raspberry Pi -- 2025-08-22
Security Researchers Find XZ Utils Backdoored Debian Images on Docker Hub -- 2025-08-20
Open Source Lithium-Titanate Battery Management System -- 2025-08-20
Prospect Theory Fails for LLMs: Revealing Instability of Decision-Making under Epistemic Uncertainty -- 2025-08-20
Need Help: So-Vits-SVC Vibrated/Glitchy Output + Source Vocal Has Residual Music (G=98k, Diff=57k) -- 2025-08-19
GDPR meant nothing: chat control ends privacy for the EU [video] -- 2025-08-19
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels -- 2025-08-19
PSA: Don't waste time trying Gemma 3 27B on V100s - it's architecturally impossible -- 2025-08-16
People with MacBook Pro with 36gb of memory, which models you are running for coding? -- 2025-08-16
SparcStation 1+ Finally Gets Attention -- 2025-08-15
llamacpp+ROCm7 beta is now supported on Lemonade -- 2025-08-10
gpt-oss 120B runs ~13tps on laptop with igpu -- 2025-08-10
Throwing a MI50 32Gb in a gaming pc -- 2025-08-10
Suggestion for upgrading hardware for MOE inference and fine-tuning. -- 2025-08-09
Best models under 16GB?? -- 2025-08-09
Explicit tail calls are now available on Rust Nightly (become keyword) -- 2025-08-09
HuggingFaceTB/SmolLM3-3B -- 2025-08-09
mistralai/Magistral-Small-2507 -- 2025-08-09
What to do with a NVIDIA Tesla V100S 32GB GPU -- 2025-08-07
dsekz/chrome-x-browser-validation-header -- 2025-08-07
MorDavid/BruteForceAI -- 2025-08-07
Show HN: Aura – Like robots.txt, but for AI actions -- 2025-08-07
I built a GitHub scanner that automatically discovers AI tools using a new .awesome-ai.md standard I created -- 2025-08-07
Show HN: Tambo – build generative UX web apps -- 2025-08-06
Brilliant Labs Has New Smart Glasses, With a New Display -- 2025-08-06
Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ -- 2025-08-06
[Editorial] Voice ai and voice agents, howto -- 2025-08-05
[Editorial] You don’t need a WebRTC server for your voice agents -- 2025-08-05
Waiting on direct MCP integration—dev team, got a roadmap update? -- 2025-08-05
GLM-4.5 llama.cpp PR is nearing completion -- 2025-08-05
glm-4.5-Air appreciation post - if you have not done so already, give this model a try -- 2025-08-05
peteromallet/Flux-Kontext-InScene -- 2025-08-02
A Dual-Screen Cyberdeck To Rule Them All -- 2025-08-02
NVIDIA RTX PRO 4000 Blackwell - 24GB GDDR7 -- 2025-08-02
Help for new LLM Rig -- 2025-08-02
bytillo/spyder-osint -- 2025-08-01
Secure boot certificate rollover is real but probably won't hurt you -- 2025-08-01
2025 One Hertz Challenge: RPI TinynumberHat9 -- 2025-08-01
Need help deciding on GPU options for inference -- 2025-07-31
Hey AI, Generate Me a Hardware Code! Agentic AI-based Hardware Design & Verification -- 2025-07-30
Building a quiet LLM machine for 24/7 use, is this setup overkill or smart? -- 2025-07-30
Best Local LLM + Hardware Build for Coding With a $15k Budget (2025) -- 2025-07-29
Optimizing inference on GPU + CPU -- 2025-07-29
I got Ollama models running locally and exposed them via a public API with one command -- 2025-07-29
I want to use llama 7b to check if a 5-7 sentence paragraph contains a given subject, what's the minimum GPU I need? -- 2025-07-29
Teufel Introduces an Open Source Bluetooth Speaker -- 2025-07-29
Build advice: Consumer AI workstation with RTX 3090 + dual MI50s for LLM inference and Stable Diffusion (~$5k budget) -- 2025-07-26
RTX 5090 (32GB VRAM) - Full Fine-Tuning: What Can I Expect? -- 2025-07-26
Is there a reason to prefer Nvidia over AMD for programming use cases? -- 2025-07-26
Entry GPU options - 5060 8GB enough to play with? -- 2025-07-26
How open-source models like Mistral, Devstral, and DeepSeek R1 compare for coding -- 2025-07-26
mistralai/Voxtral-Mini-3B-2507 -- 2025-07-26
Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ -- 2025-07-26
Playtron's Linux-Based GameOS Hits the Road with 1.0 -- 2025-07-26
Remembering Chiptunes, the Demoscene and the Illegal Music of Keygens -- 2025-07-26
How are people staging AI training datasets from NVMe → DDR5 → GPU VRAM for fine-tuning on RTX 5090s? -- 2025-07-25
Running Qwen3 235B-A22B 2507 on a Threadripper 3970X + 3x RTX 3090 Machine at 15 tok/s -- 2025-07-25
mistral-small3.2:latest 15B takes 28GB VRAM? -- 2025-07-25
Looking for help with terrible vLLM performance -- 2025-07-25
What upgrade option is better with $2000 available for my configuration? -- 2025-07-24
Foundry competition heats up as Japan's Rapidus says 2nm tech on track for 2027 -- 2025-07-23
AI Model Juggler automatically and transparently switches between LLM and image generation backends and models -- 2025-07-22
Looking to possibly replace my ChatGPT subscription with running a local LLM. What local models match/rival 4o? -- 2025-07-22
Nvidia GTX-1080Ti 11GB Vram -- 2025-07-22
I messed up my brother's Llama AI workstation.. looking for advice -- 2025-07-22
How do Claude Code token counts translate to “prompts” for usage limits? -- 2025-07-22
Claude is IN the files. -- 2025-07-21
Bitcoin Devs Float Proposal to Freeze Quantum-Vulnerable Addresses -- 2025-07-21
OpenSCAD: The Programmers Solid 3D CAD Modeller -- 2025-07-21
Software Defined Retro ROMs -- 2025-07-21
Arc Virtual Cell Challenge: A Primer -- 2025-07-21
Recommend hardware for my use case? -- 2025-07-20
Best Hardware Setup to Run DeepSeek-V3 670B Locally on $40K–$80K? -- 2025-07-20
e6a5/flow -- 2025-07-20
Improve Your KiCad Productivity With These Considered Shortcut Keys -- 2025-07-20
LGAI-EXAONE/EXAONE-4.0-1.2B -- 2025-07-18
This SSD Will Self Destruct in Ten Seconds… -- 2025-07-18
Locally Running AI model with Intel GPU -- 2025-07-18
Defeating Memory Leaks with Zig Allocators -- 2025-07-17
OpenDPDv2: A Unified Learning and Optimization Framework for Neural Network Digital Predistortion -- 2025-07-17
An Open-Concept 3D Printer Using Cantilever Arms -- 2025-07-17
What kind of rig would you build with a 5k budget for local LLM? -- 2025-07-16
What is your "perfect" £10,000 for Local LLM, Gaming, plex with the following conditional and context. -- 2025-07-16
How to use Claude code -- 2025-07-16
unsloth/Kimi-K2-Instruct-GGUF -- 2025-07-16
moonshotai/Kimi-K2-Base -- 2025-07-16
It's been a while, I'm out of date, suggest me a model -- 2025-07-16
i need the best local llm i can run on my gaming pc -- 2025-07-16
Arduino Saves Heat Pump -- 2025-07-16
Japan Achieves World Record 1.02 Petabits per Second Internet Speed -- 2025-07-15
Jcorp Nomad: ESP32-S3 Offline Media Server in a Thumbdrive -- 2025-07-15
Qwen3-235B-A22B @ 0.7t/s. Hardware or configuration bottleneck? -- 2025-07-15
Building a silent, budget 4-GPU LLM workstation—1×3090 + 3×P40, need advice -- 2025-07-15
Enough resources for light AI workloads? -- 2025-07-15
What can I expect from current amd igpu performance? -- 2025-07-15
Nvidia RTX Pro 6000 (96 Gb) vs Apple M3 Ultra (512 Gb) -- 2025-07-14
How fast is inference when utilizing DDR5 and PCIe 5.0x16? -- 2025-07-14
DIY Navigation System Floats this Boat -- 2025-07-12
Show HN: Pangolin – Open source alternative to Cloudflare Tunnels -- 2025-07-11
SUS Lang: The SUS Hardware Description Language -- 2025-07-11
Embedded USB Debug for Snapdragon -- 2025-07-11
Creating custom kernels for the AMD MI300 -- 2025-07-10
How are you selecting LLMs? -- 2025-07-10
Getting started with local AI -- 2025-07-10
How are commercial dense models so much faster? -- 2025-07-09
Best model for a RX 6950xt? -- 2025-07-07
SSD Upgrade for Mac Mini M4 -- 2025-07-06
Nvidia DGX Spark - what's the catch? -- 2025-07-06
Kyutai TTS is here: Real-time, voice-cloning, ultra-low-latency TTS, Robust Longform generation -- 2025-07-04
Privacy preserving ChatGPT/Claude voice mode alternative -- 2025-07-04
Subpixel Rendering For Impossibly Small Terminal Text -- 2025-07-04
Apple Intelligence on device model available to developers -- 2025-07-03
Is AMD Ryzen AI Max+ 395 really the only consumer option for running Llama 70B locally? -- 2025-07-02
Ollama - Windows 11 > LXC Docker - Openwebui = constant BSOD with RTX 5090 Ventus on driver 576.80 -- 2025-07-02
Running Open WebUI with NVIDIA GPU Support? -- 2025-07-02
martinbowling/thinkchain -- 2025-06-25
WireGuard vanity keygen -- 2025-06-25
zeptoforth: A not-so-small Forth for ARM Cortex-M -- 2025-06-25
Voyager: Real-Time Splatting City-Scale 3D Gaussians on Your Phone -- 2025-06-21
Optimized Chatterbox TTS (Up to 2-4x non-batched speedup) -- 2025-06-20
[Discussion] Thinking Without Words: Continuous latent reasoning for local LLaMA inference – feedback? -- 2025-06-20
Created a more accurate local speech-to-text tool for your Mac -- 2025-06-20
Attention by Hand - Practice attention mechanism on an interactive webpage -- 2025-06-20
Major update to my voice extractor (speech dataset creation program) -- 2025-06-20
Building a Text Adventure Game with Persistent AI Agents Using Ollama -- 2025-06-20
Azure OpenAI with latest version of NVIDIA'S Nemo Guardrails throwing error -- 2025-06-20
GitHub RAG MCP Server - A GitIngest alternative for any IDE -- 2025-06-20
Ruby on Rails Audit Complete -- 2025-06-20
largest context window model for 24GB VRAM? -- 2025-06-20
100ps time resolution with thin silicon pixel detectors and a SiGe HBT amplifier -- 2025-06-18
0-$\pi$ quantum transition in a carbon nanotube Josephson junction: universal phase dependence and orbital degeneracy -- 2025-06-18
Learning (The Basics of) Nftables -- 2025-06-18
Linux Cgroup from First Principles -- 2025-06-18
0.82 um 105 W diode-pumped thulium-doped all silica fiber laser -- 2025-06-17
Chinese AI firms smuggling suitcases full of hard drives to dodge US chip curbs -- 2025-06-17
UPDATE: Inference needs nontrivial amount of PCIe bandwidth (8x RTX 3090 rig, tensor parallelism) -- 2025-06-16
IQ1_Smol_Boi -- 2025-06-16
Qwen releases official MLX quants for Qwen3 models in 4 quantization levels: 4bit, 6bit, 8bit, and BF16 -- 2025-06-16
Seeking Help Setting Up a Local LLM Assistant for TTRPG Worldbuilding + RAG on Windows 11 -- 2025-06-16
New VS Code Pair Programming Extension, Need Help Testing -- 2025-06-16
Claude-Trace -- 2025-06-16
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training -- 2025-06-16
best fine tuned local LLM for Github Copilot Agent specifically -- 2025-06-16
A Simulation in C++ of Joseph Weizenbaum's 1966 Eliza -- 2025-06-16
0D-2D Heterostructure for making very Large Quantum Registers using itinerant Bose-Einstein Condensate of Excitons -- 2025-06-16
100-mJ class, sub-two-cycle, carrier-envelope phase-stable dual-chirped optical parametric amplification -- 2025-06-16
Olow304/memvid -- 2025-06-14
mistralai/Magistral-Small-2506_gguf -- 2025-06-14
Sublime Text Build 4200 and Future Plugin Changes -- 2025-06-14
Carimbo: Minimal 2D game engine in modern C++20 with SDL, scriptable in Lua -- 2025-06-14
1000 Days to First Light: Construction of the Perth-Lowell Telescope Facility 1968-71 -- 2025-06-14
Semantic Search Demo Using Qwen3 0.6B Embedding (w/o reranker) in-browser Using transformers.js -- 2025-06-13
2x Instinct MI50 32G running vLLM results -- 2025-06-13
Self-hosted GitHub Copilot via Ollama – Dual RTX 4090 vs. Chained M4 Mac Minis -- 2025-06-13
Testing Claude, OpenAI and AI21 Studio for long context RAG assistant in enterprise -- 2025-06-13
Deepseek-R1-0528 MLX 4 bit quant up -- 2025-06-13
Open Source iOS OLLAMA Client -- 2025-06-13
OpenPOWER Foundation – Open-Source / Open Hardware PowerPC CPU ISA -- 2025-06-13
TIL: timeout in Bash scripts -- 2025-06-13
3x Modded 4090 48GB or RTX Pro 6000? -- 2025-06-13
KwaiCoder-AutoThink-preview is a Good Model for Creative Writing! Any Idea about Coding and Math? Your Thoughts? -- 2025-06-13
Faulty 120W charger analysis (Anker GAN Prime) [video] -- 2025-06-13
maomaocun/dLLM-cache -- 2025-06-13
AIR-THU/Asyncdriver-Tensorrt -- 2025-06-13
System Prompt Learning: Teaching your local LLMs to learn problem-solving strategies from experience (optillm plugin) -- 2025-06-11
GitHub - som1tokmynam/FusionQuant: FusionQuant Model Merge & GGUF Conversion Pipeline - Your Free Toolkit for Custom LLMs! -- 2025-06-11
Semantic Search PoC for Hugging Face – Now with Parameter Size Filters (0-1B to 70B+) -- 2025-06-11
Has anyone had success implementing a local FIM model? -- 2025-06-11
Which agent-like terminal do you guys use? Something like Warp but free. -- 2025-06-11
Rocm or vulkan support for AMD Radeon 780M? -- 2025-06-11
What are the most important stages to learn ML properly, step by step? -- 2025-06-11
What's the best open source coding agent as of now that can be run locally and can even test the created APIs by running the application and calling the endpoints with various payloads? -- 2025-06-11
Gemini 2.5: Our most intelligent models are getting even better -- 2025-06-11
Hugging Face unveils two new humanoid robots -- 2025-06-11
Quick reference: Configure Ollama, Open WebUI installation paths in Windows 11 -- 2025-06-11
Is there an alternative to LM Studio with first class support for MLX models? -- 2025-06-11
0.52 V-mm ITO-based Mach-Zehnder Modulator in Silicon Photonics -- 2025-06-10
100 GHz Micrometer compact broadband Monolithic ITO Mach Zehnder Interferometer Modulator enabling 3500 times higher Packing Density -- 2025-06-06
0-$\pi$ phase-controllable $thermal$ Josephson junction -- 2025-06-06
ByteDance Bagel 14B MOE (7B active) Multimodal with image generation (open source, apache license) -- 2025-06-05
VLLM with 4x7900xtx with Qwen3-235B-A22B-UD-Q2_K_XL -- 2025-06-05
I would really like to start digging deeper into LLMs. If I have $1500-$2000 to spend, what hardware setup would you recommend assuming I have nothing currently. -- 2025-06-05
Having trouble getting to 1-2req/s with vllm and Qwen3 30B-A3B -- 2025-06-05
Trying to get to 24gb of vram - what are some sane options? -- 2025-06-05
Locally downloading Qwen pretrained weights for finetuning -- 2025-06-05
Web Application Frameworks Best Suited for AI Coding Assistants - putting the chicken before the egg. -- 2025-06-05
Debian AI General Resolution Withdrawn -- 2025-06-05
Authors Are Accidentally Leaving AI Prompts in Their Novels -- 2025-06-05
Cancelling internet & switching to a LLM: what is the optimal model? -- 2025-06-05
Yappus. Your Terminal Just Started Talking Back (The Fuck, but Better) -- 2025-06-03
GPU consideration: AMD Pro W7800 -- 2025-06-03
Thoughts on which open source is best for what use-cases -- 2025-06-03
I accidentally too many P100 -- 2025-06-03
Has anyone come across a good (open source) -- 2025-06-03
Need Suggestions regarding ML Laptop Configuration -- 2025-06-03
Web search tool - bing decommissioning -- 2025-06-03
LLM function calls don't scale; code orchestration is simpler, more effective -- 2025-06-03
GitHub issues is almost the best notebook in the world -- 2025-06-03
Strengths and limitations of diffusion language models -- 2025-06-03
How LLM uses MCP tools setup in OpenWebUI ? -- 2025-06-03
Database_url string for mysql -- 2025-06-03
10,000 km Straight-line Transmission using a Real-time Software-defined GPU-Based Receiver -- 2025-06-02
An Almost Pointless Exercise in GPU Optimization -- 2025-05-31
The Windows Registry Adventure #7: Attack surface analysis -- 2025-05-31
DuckLake: SQL as a Lakehouse Format -- 2025-05-31
1000x Faster Camera and Machine Vision with Ordinary Devices -- 2025-05-31
0.75 Gbit/s high-speed classical key distribution with mode-shift keying chaos synchronization of Fabry-Perot lasers -- 2025-05-31
Building a plug-and-play vector store for any data stream (text, audio, video, etc.)—searchable by your LLM via MCP -- 2025-05-29
Building a real-world LLM agent with open-source models—structure > prompt engineering -- 2025-05-29
New LocalLLM Hardware complete -- 2025-05-29
Parameter-Efficient Fine-Tuning (PEFT) Explained -- 2025-05-29
LLM help for recovering deleted data? -- 2025-05-29
AI Runner v4.10.0 Release Notes -- 2025-05-29
Unpopular opinion: RAG is actively hurting your coding agents -- 2025-05-29
Teal – A statically-typed dialect of Lua -- 2025-05-29
I think it's time to give Nix a chance -- 2025-05-29
deepseek-ai/DeepSeek-R1-0528 -- 2025-05-29
AM5 or TRX4 for local LLMs? -- 2025-05-29
Catalog of Novel Operating Systems -- 2025-05-28