Model Architecture

Transformers, mixture of experts, attention mechanisms, model design

576 articles across 159 editions

Articles

  1. lordx64/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled -- 2026-05-22
  2. Jackrong/Qwopus3.5-9B-Coder-GGUF · Hugging Face -- 2026-05-22
  3. Qwen 3.6 35B GGUF: NTP vs MTP quantization results across GPUs and CPUs -- 2026-05-22
  4. [WIP] Gemma 4 MTP -- 2026-05-22
  5. Lemonade v10.5.1: an MTP + ROCm 7.13 quick start for Strix Halo -- 2026-05-22
  6. gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic is Out Now -- 2026-05-22
  7. Gemma-4-Gembrain-31B-it-uncensored-heretic Is Out Now -- 2026-05-22
  8. Newbie vibe coding experience: Shifting from Claude Sonnet 4.6 to Qwen3.6-35B-A3B-UD-Q6_K -- 2026-05-22
  9. HalBench: Open Sycophancy and Hallucination Benchmark — 12,800 Graded Responses -- 2026-05-21
  10. DystopiaBench: 42 LLMs tested on willingness to build dystopian systems -- 2026-05-21
  11. Guardrails take an 8B model from 53% to 99% on agentic tasks [ACM CAIS '26 preprint] -- 2026-05-21
  12. bytedance released an open source model that attempts to do just about anything with only 3b parameters -- 2026-05-19
  13. Sapient Intelligence releases HRM-Text 1B: 40B tokens, ~$1k pretrain, beats Llama3.2 3B on MATH and DROP -- 2026-05-19
  14. PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend -- 2026-05-19
  15. [Editorial] -- 2026-05-18
  16. [Editorial] -- 2026-05-18
  17. [Editorial] -- 2026-05-18
  18. [Editorial] -- 2026-05-18
  19. EMO: Pretraining mixture of experts for emergent modularity -- 2026-05-13
  20. [Editorial] -- 2026-05-13
  21. iai-mcp: Persistent Claude Memory Daemon — 5 Months of Daily Use, Now Open Source -- 2026-05-08
  22. [Editorial] Claude Skins — Custom Claude Code Personas -- 2026-05-08
  23. [Editorial] AI Agents Projects & Tutorials Collection -- 2026-05-08
  24. "Second Thoughts" — A small transformer reads output and feeds it back as a refinement loop, drastically improving a 1.7B model's coding -- 2026-05-06
  25. A plug-n-play open-source pruning tool that is workload-aware (Sculpt) -- 2026-05-06
  26. "I" is not singular — 4 LLM agents with per-agent LoRA on a single RTX 3070 8GB -- 2026-05-06
  27. I made a visualizer for Hugging Face models (hfviewer.com) -- 2026-05-06
  28. The Road to a Billion-Token Context -- 2026-05-06
  29. How OpenAI delivers low-latency voice AI at scale -- 2026-05-06
  30. ArmSSL: Adversarial Robust Black-Box Watermarking for Self-Supervised Learning Pre-trained Encoders -- 2026-04-28
  31. GTFOBins -- 2026-04-28
  32. Anthropic admits to have made hosted models more stupid, proving the importance of open weight, local models -- 2026-04-24
  33. Switching from Opus 4.7 to Qwen-35B-A3B -- 2026-04-24
  34. Forgive my ignorance but how is a 27B model better than 397B? -- 2026-04-24
  35. NSA is using Anthropic's Mythos despite blacklist -- 2026-04-22
  36. [Editorial] Mad Bugs: Reverse Engineering Deep Dive -- 2026-04-22
  37. Anyone deployed Kimi K2.6 on their local hardware? -- 2026-04-21
  38. 24/7 Headless AI Server on Xiaomi 12 Pro (Guide & Benchmarks) Gemma4 VS Qwen2.5 -- 2026-04-21
  39. Gemm4:e4B-IT good at instructions following no refusals. -- 2026-04-21
  40. [Editorial] arxiv:2506.02153 — AI Research -- 2026-04-20
  41. [Editorial] arxiv:1503.02531 — Classic ML/AI Paper -- 2026-04-20
  42. [Editorial] arxiv:2604.06169 — Recent AI Research -- 2026-04-20
  43. [Editorial] Why Your LLM Is Slow (and the Eight Fixes) -- 2026-04-20
  44. Hot Experts in your VRAM! Dynamic expert cache in llama.cpp for 27% faster token generation -- 2026-04-20
  45. [Editorial] Schmidhuber: The World Model Boom -- 2026-04-20
  46. [Editorial] Schmidhuber: World Models, Planning & Curiosity (1990 Origins) -- 2026-04-20
  47. [Editorial] The Ontology Problem (Technical) — Kurt Cagle -- 2026-04-20
  48. Gemma 4 26B fabricated an entire code audit. I have the forensic evidence from the database. -- 2026-04-16
  49. [Editorial] Looped LLMs Are the Nuclear Fusion of AI -- 2026-04-16
  50. Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems -- 2026-04-14
  51. elastic/supply-chain-monitor -- 2026-04-14
  52. [Editorial] UK AI Security Institute + Claude -- 2026-04-14
  53. [Editorial] -- 2026-04-13
  54. [Editorial] -- 2026-04-13
  55. Muse Spark: Scaling towards personal superintelligence -- 2026-04-13
  56. [Editorial] -- 2026-04-13
  57. [Editorial] -- 2026-04-13
  58. Abliterating Qwen3.5-397B on a Mac Studio revealed that MoE models encode refusal differently than dense models — safety refusals route through expert selection and survive weight-baking -- 2026-04-09
  59. Finally Abliterated Sarvam 30B and 105B! -- 2026-04-09
  60. Qwen3.6-Plus -- 2026-04-07
  61. Omnivoice - 600+ Language Open-Source TTS with Voice Cloning and Design -- 2026-04-07
  62. [Editorial] Generative Models and High-Quality Output -- 2026-04-07
  63. Falcon Perception -- 2026-04-02
  64. TRL v1.0: Post-Training Library Built to Move with the Field -- 2026-04-02
  65. lucas-maes/le-wm -- 2026-04-02
  66. Toward explaining why traditional ablation/abliteration works -- 2026-04-02
  67. [Editorial] ramimac/teampcp -- 2026-03-27
  68. This Week in Security: Second Verse, Worse Than the First -- 2026-03-27
  69. Towards a Neural Debugger for Python -- 2026-03-26
  70. Mathematics behind extreme quantization of Microsoft's BitNet -- 2026-03-26
  71. A.T.L.A.S - Adaptive Test-time Learning and Autonomous Specialization -- 2026-03-26
  72. Show HN: Three new Kitten TTS models – smallest less than 25MB -- 2026-03-23
  73. Flash-MoE: Running a 397B Parameter Model on a Laptop -- 2026-03-23
  74. [Editorial] Distillation Techniques -- 2026-03-23
  75. Jackrong/Qwen3.5-2B-Claude-4.6-Opus-Reasoning-Distilled-GGUF -- 2026-03-23
  76. Local Qwen 8B + 4B completes browser automation by replanning one step at a time -- 2026-03-23
  77. Does imatrix calibration data affect writing style? I ran a blind-scored experiment -- 2026-03-23
  78. Memory Chip Crunch to Persist Until 2030, SK Hynix Chairman Says -- 2026-03-19
  79. Meta's renewed commitment to jemalloc -- 2026-03-19
  80. We all had p2p wrong with vllm so I rtfm -- 2026-03-19
  81. [Editorial] LLM Architecture Gallery -- 2026-03-17
  82. Mistral small 4 PR on transformers. -- 2026-03-17
  83. I spent a weekend doing layer surgery on 6 different model architectures. There's a "danger zone" at ~50-56% depth. -- 2026-03-17
  84. Nemotron 3 Super and the no free lunch problem -- 2026-03-17
  85. Timber: AOT Compiler Turns XGBoost/sklearn/ONNX into Native C99 — 336x Faster -- 2026-03-13
  86. [Editorial] Living on the Edge: Advancements in Model Deployment -- 2026-03-13
  87. [Editorial] Translating a PDF While Preserving Layout -- 2026-03-13
  88. Heretic Defeats GPT-OSS with Arbitrary-Rank Ablation (ARA) Decensoring -- 2026-03-11
  89. Fat Fish — A Proper Upscale and Prune of Mistral Nemo -- 2026-03-11
  90. Optimizing Qwen3 Coder for RTX 5090 and PRO 6000 — Community Benchmarking Infrastructure -- 2026-03-11
  91. Yann LeCun's AMI Labs Raises $1.03 Billion to Build World Models -- 2026-03-11
  92. [Editorial] Bell Labs Solved Prompt Injection in 1976 -- 2026-02-27
  93. [Editorial] GitHub Copilot Exploited -- 2026-02-27
  94. [Editorial] Stop Telling Your AI Agent What Not to Do -- 2026-02-27
  95. [Editorial] AI MCP Integration Patterns -- 2026-02-27
  96. Turbo Connection: Reasoning as Information Flow from Higher to Lower Layers -- 2026-02-25
  97. O(1) Inference and Causal Monoid State Compression in Spartacus-1B -- 2026-02-25
  98. Sink-Aware Pruning for Diffusion Language Models -- 2026-02-25
  99. Liquid AI releases LFM2-24B-A2B -- 2026-02-24
  100. Qwen3's most underrated feature: Voice embeddings -- 2026-02-24
  101. After many contributions craft, Crane now officially supports Qwen3-TTS! -- 2026-02-24
  102. We tested the same INT8 model on 5 Snapdragon chipsets. Accuracy ranged from 93% to 71%. Same weights, same ONNX file. -- 2026-02-24
  103. [Editorial] https://www-cdn.anthropic.com/f21d93f21602ead5cdbecb8c8e1c765759d9e232.pdf -- 2026-02-12
  104. [Editorial] https://d3lm.medium.com/overly-agentic-why-anthropic-is-worried-about-opus-4-6-17eee0f8e5cd -- 2026-02-12
  105. [Editorial] https://www.linkedin.com/posts/avipil_i-got-my-first-bill-after-switching-to-claude-activity-7427320523870629889-vM5K -- 2026-02-12
  106. Pros/Cons and use case for bypassing permissions -- 2026-02-12
  107. [Editorial] https://www.kylerush.org/posts/opus-4-5-really-changed-things -- 2026-02-10
  108. [Editorial] https://www.linkedin.com/posts/ownyourai_i-taught-my-claude-code-to-swallow-a-32m-activity-7426902541868728321-2Z36 -- 2026-02-10
  109. [Editorial] https://docs.google.com/document/d/1I9r21TyQuAO1y2ecztBU0PSCpjHSL_vZJiA5v276Wro/mobilebasic -- 2026-02-10
  110. [Editorial] https://www.linkedin.com/posts/cole-medin-727752184_claude-codes-new-agent-teams-feature-is-share-7426633806792609792-pT3s -- 2026-02-10
  111. [Tool] claude-config-sync: Sync your Claude Code configuration across machines using GitHub Gists -- 2026-02-10
  112. [Editorial] https://www.linkedin.com/posts/ownyourai_i-just-woke-up-to-qwen3-coder-next-80b-activity-7424703876240695297-Nlqf -- 2026-02-04
  113. LiquidAI/LFM2.5-1.2B-Thinking -- 2026-02-04
  114. ByteDance-Seed/Stable-DiffCoder-8B-Instruct · Hugging Face -- 2026-02-03
  115. tencent/Youtu-VL-4B-Instruct -- 2026-02-03
  116. NousResearch/NousCoder-14B -- 2026-02-03
  117. transformers v5 final is out 🔥 -- 2026-01-30
  118. PaddlePaddle/PaddleOCR-VL-1.5 -- 2026-01-30
  119. zai-org/GLM-4.7 -- 2026-01-30
  120. MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models -- 2026-01-30
  121. Sharing my set of distilled small language models (3B) + training data in more than 50 low-resource languages -- 2026-01-30
  122. Introducing Kimi K2.5, Open-Source Visual Agentic Intelligence -- 2026-01-28
  123. ~60GB models on coding: GLM 4.7 Flash vs. GPT OSS 120B vs. Qwen3 Coder 30B -- your comparisons? -- 2026-01-28
  124. openbmb/AgentCPM-Report -- 2026-01-28
  125. Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice -- 2026-01-28
  126. Mixture-of-Models: Unifying Heterogeneous Agents via N-Way Self-Evaluating Deliberation -- 2026-01-28
  127. One Year Since the “DeepSeek Moment” -- 2026-01-28
  128. Qwen/Qwen-Image-2512 -- 2026-01-28
  129. [Editorial] https://www.linkedin.com/posts/ownyourai_deepseek-just-released-the-first-vision-ai-activity-7421818927657385987-V1yo -- 2026-01-27
  130. Unsloth announces support for finetuning embedding models -- 2026-01-27
  131. Qwen/Qwen3-TTS-12Hz-0.6B-Base -- 2026-01-27
  132. Qwen/Qwen3-VL-Reranker-2B -- 2026-01-27
  133. [Editorial] https://www.linkedin.com/posts/ivandj_early-claims-around-self-evolving-memory-activity-7421307316437676033-l0Jm -- 2026-01-26
  134. [Editorial] https://www.linkedin.com/posts/reuvencohen_introducing-ruvector-world-model-activity-7421556928910290944-cx4v -- 2026-01-26
  135. stepfun-ai/Step3-VL-10B -- 2026-01-26
  136. Qwen/Qwen3-VL-Embedding-2B -- 2026-01-21
  137. Phr00t/Qwen-Image-Edit-Rapid-AIO -- 2026-01-21
  138. Qwen/Qwen3-VL-Reranker-8B -- 2026-01-21
  139. Bartowski comes through again. GLM 4.7 flash GGUF -- 2026-01-21
  140. Step-Audio-R1.1 (Open Weight) by StepFun just set a new SOTA on the Artificial Analysis Speech Reasoning leaderboard -- 2026-01-20
  141. GLM-4.7-Flash -- 2026-01-20
  142. stepfun-ai/Step-Audio-R1.1 -- 2026-01-20
  143. tencent/HY-MT1.5-1.8B -- 2026-01-20
  144. Introducing GLM-Image -- 2026-01-14
  145. GPT-OSS -> MLA conversion breakthrough (20B), still looking for compute + collaborators -- 2026-01-14
  146. FrogBoss 32B and FrogMini 14B from Microsoft -- 2026-01-14
  147. Qwen/Qwen3-VL-Embedding-8B -- 2026-01-14
  148. MCP for Financial Ontology! -- 2026-01-09
  149. A community index for MCPs that don’t disappear after the thread ends -- 2026-01-09
  150. tencent/HY-WorldPlay -- 2026-01-09
  151. meituan-longcat/LongCat-Image -- 2026-01-09
  152. Introducing Falcon H1R 7B -- 2026-01-09
  153. [Editorial] https://www.linkedin.com/posts/ernst-van-gassen-9196a7b5_we-spend-a-lot-of-time-trying-to-make-prompts-ugcPost-7414966440023482368-dgh5 -- 2026-01-09
  154. [Editorial] https://leakhub.ai/ -- 2026-01-09
  155. [Editorial] https://www.linkedin.com/posts/hardmaru_survival-of-the-fittest-code-blog-https-activity-7415068590485458944-3tqP -- 2026-01-09
  156. A closer look at a BGP anomaly in Venezuela -- 2026-01-09
  157. Show HN: I visualized the entire history of Citi Bike in the browser -- 2026-01-09
  158. Modifying a QingPing Air Quality Monitor for Local MQTT Access -- 2026-01-09
  159. [Editorial] https://github.com/hiyouga/LlamaFactory -- 2026-01-07
  160. Tongyi-MAI/MAI-UI-8B · Hugging Face -- 2026-01-07
  161. MultiverseComputingCAI/HyperNova-60B · Hugging Face -- 2026-01-07
  162. [Editorial] https://www.alwaysfurther.ai/blog/train-4b-model-to-beat-claude-sonnet-gemini -- 2026-01-05
  163. [Experimental] Gemma 3 4B - Dark CoT: Pushing 4B Reasoning to 33%+ on GPQA Diamond -- 2026-01-05
  164. Youtu-LLM-2B-GGUF is here! -- 2026-01-05
  165. upstage/Solar-Open-100B -- 2026-01-05
  166. EditMGT — fast, localized image editing with Masked Generative Transformers -- 2025-12-30
  167. Francis-Rings/FlashPortrait -- 2025-12-30
  168. zai-org/GLM-TTS -- 2025-12-30
  169. Flowception: Temporally Expansive Flow Matching for Video Generation -- 2025-12-30
  170. I've been experimenting with SLM's a lot recently. My goal was to prove even SLMs can be accurate with the right architecture behind it. -- 2025-12-22
  171. MBZUAI releases K2-V2 - 70B fully open model. -- 2025-12-22
  172. microsoft/Fara-7B -- 2025-12-22
  173. Alibaba Tongyi Open Sources Two Audio Models: Fun-CosyVoice 3.0 (TTS) and Fun-ASR-Nano-2512 (ASR) -- 2025-12-19
  174. My professor lent me an A6000, so I tried to build a coding model. Here is Anni! (Qwen3-14B Fine-tune) -- 2025-12-19
  175. Two years ago, I was just a math major. Now I've built the 1.5B router model used by HuggingFace. Can I bring it to Cursor? -- 2025-12-19
  176. Independent evaluation of GPT5.2 on SWE-bench: 5.2 high is #3 behind Gemini, 5.2 medium behind Sonnet 4.5 -- 2025-12-16
  177. Did I overhype Claude Code? GPT + Comet are quietly beating it for me -- 2025-12-16
  178. 🚀 New: Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B -- 2025-12-16
  179. ByteDance/Dolphin-v2 -- 2025-12-16
  180. zai-org/AutoGLM-Phone-9B-Multilingual -- 2025-12-16
  181. [Editorial] https://www.linkedin.com/posts/eric-vyacheslav-156273169_a-7m-model-just-surpassed-deepseek-r1-gemini-activity-7405985266043297792-s1Jn -- 2025-12-15
  182. Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models -- 2025-12-15
  183. New in llama.cpp: Model Management -- 2025-12-12
  184. Successful prototype component prompt - For big projects -- 2025-12-12
  185. Building RNJ-1: What makes It different from Gemma 3? -- 2025-12-11
  186. EssentialAI/rnj-1 -- 2025-12-11
  187. From Azure Functions to FreeBSD -- 2025-12-10
  188. [Editorial] https://www.linkedin.com/posts/dakharlamov_too-slow-thats-what-they-called-engineers-activity-7403964370390646784-J27P -- 2025-12-09
  189. [Editorial] https://arxiv.org/abs/2511.20920 -- 2025-12-09
  190. [Editorial] https://www.linkedin.com/posts/anthony-alcaraz-b80763155_your-ai-agents-context-window-is-not-a-database-activity-7403394612893024256-9S87 -- 2025-12-09
  191. [Editorial] https://www.linkedin.com/posts/reuvencohen_introducing-ruvector-postgres-a-self-learning-activity-7403841311029837824-jogp -- 2025-12-09
  192. Emacs is my new window manager -- 2025-12-08
  193. shubh-io/DockMate -- 2025-12-08
  194. Comfy-Org/HunyuanVideo_1.5_repackaged -- 2025-12-08
  195. We were tired of guessing which local model to use for which query. built a speculative execution lib that figures it out (github) -- 2025-12-05
  196. Nimony (eventually Nim 3.0) Design Principles -- 2025-12-03
  197. nvidia/Orchestrator-8B · Hugging Face -- 2025-12-03
  198. allenai/Olmo-3-1125-32B -- 2025-12-03
  199. moonshotai/Kimi-Linear-48B-A3B-Instruct -- 2025-12-02
  200. orabazes/FLUX.2-dev-GGUF -- 2025-12-02
  201. Transformers v5: Simple model definitions powering the AI ecosystem -- 2025-12-02
  202. Pocketbase – open-source realtime back end in 1 file -- 2025-11-28
  203. [Editorial] https://www.rand.org/pubs/commentary/2025/11/electromagnetic-warfare-natos-blind-spot-could-decide.html -- 2025-11-28
  204. In depth analysis of Nvidia's Jet Nemotron models -- 2025-11-26
  205. Hidden causes of LLM latency, it's not just the model size -- 2025-11-26
  206. Benchmark: Self-Hosted Qwen-30B (LoRA) vs. Llama-3.1-8B vs. GPT-4.1-nano. Comparison of parsing success rates and negative constraints. -- 2025-11-25
  207. [Editorial] https://arxiv.org/html/2510.04871v1 -- 2025-11-24
  208. [Editorial] https://www.linkedin.com/posts/ingason_agenticai-erp-aiagents-activity-7396551539538141185-PxNU -- 2025-11-24
  209. facebook/sam3 -- 2025-11-24
  210. rl-research/DR-Tulu-8B -- 2025-11-24
  211. Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning -- 2025-11-24
  212. marinero4972/Open-o3-Video -- 2025-11-17
  213. nanonets/Nanonets-OCR2-3B -- 2025-11-17
  214. nvidia/ChronoEdit-14B-Diffusers-Upscaler-Lora -- 2025-11-17
  215. ibm-granite/granite-4.0-h-350m -- 2025-11-14
  216. tencent/HunyuanWorld-Mirror -- 2025-11-14
  217. [Editorial] https://www.npmjs.com/package/neural-trader -- 2025-11-14
  218. Critical RCE patched in Imunify360 affects up to 50M+ websites -- 2025-11-14
  219. Kubernetes Ingress Nginx is retiring -- 2025-11-14
  220. About KeePassXC's Code Quality Control -- 2025-11-14
  221. Visualizing Quantization Types -- 2025-11-11
  222. Co-authored a book called "Build DeepSeek from Scratch" | Live Now -- 2025-11-11
  223. Writing your own BEAM -- 2025-11-11
  224. DIY Powerwall Blows Clouds, Competition Out of the Water -- 2025-11-11
  225. BAAI/Emu3.5-Image -- 2025-11-07
  226. Riemannian Optimization for LoRA on the Stiefel Manifold -- 2025-11-07
  227. "On-the-fly" code reviews with ollama. It kinda works.. -- 2025-11-07
  228. I Built an "AI Art Director" Agent to Orchestrate Image and Video Models. -- 2025-11-07
  229. What's your most unexpected Claude workflow discovery? -- 2025-11-07
  230. [Research] Cross-Stage Vulnerabilities in Large Language Model Architectures -- 2025-11-07
  231. runZeroInc/runZeroHound -- 2025-11-07
  232. openai/gpt-oss-safeguard-120b -- 2025-11-07
  233. Why does Image Recognition work in llama-server but not through Open WebUI? -- 2025-11-06
  234. Has anyone tested ollama on Whisplay HAT with Raspberry pi zero 2W? -- 2025-11-06
  235. allenai/olmOCR-2-7B-1025 -- 2025-11-06
  236. is there simple way like .bat to compress to q4-q8 like Unsloth, Qwen3-VL-30B-A3B-Thinking-abliterated model -- 2025-11-06
  237. MiniMax M2 Llama.cpp support merged -- 2025-11-05
  238. Which AI IDE should I use under $20/month? -- 2025-11-05
  239. lzA6/video-to-txt -- 2025-11-05
  240. xaviviro/python-toon -- 2025-11-05
  241. [Editorial] https://www.evokesecurity.com/blogs/prompt-injection-is-for-everyone -- 2025-11-04
  242. Found a remote file inclusion vulnerability in an AI-generated app before launch -- 2025-11-04
  243. Launch HN: Propolis (YC X25) – Browser agents that QA your web app autonomously -- 2025-11-04
  244. Attacking macOS XPC Helpers: Protocol Reverse Engineering and Interface Analysis -- 2025-11-04
  245. [Research] Unvalidated Trust: Cross-Stage Failure Modes in LLM/agent pipelines arXiv -- 2025-11-04
  246. [Editorial] Context Engineering Handbook -- 2025-11-03
  247. Qwen3-VL-32B Q8 speeds in llama.cpp vs vLLM FP8 on a RTX PRO 6000 -- 2025-11-03
  248. OSS alternative to Open WebUI - ChatGPT-like UI, API and CLI -- 2025-11-03
  249. Looking for a RAG UI manager to meet our needs to replace Zapier -- 2025-11-03
  250. LiquidAI/LFM2-1.2B-Extract -- 2025-11-03
  251. DeepSeek may have found a new way to improve AI’s ability to remember -- 2025-11-02
  252. Reliable Unlearning Harmful Information in LLMs with Metamorphosis Representation Projection -- 2025-11-02
  253. OpenAI: gpt-oss-safeguard: two open-weight reasoning models built for safety classification (Now on Hugging Face) -- 2025-10-31
  254. briaai/FIBO -- 2025-10-31
  255. snowyfizz/Vision-Detection-API -- 2025-10-29
  256. Show HN: Apache Fory Rust – 10-20x faster serialization than JSON/Protobuf -- 2025-10-29
  257. huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning -- 2025-10-29
  258. AlphaXiv, Compare the Deepseek-OCR and Mistral-OCR OCR models -- 2025-10-26
  259. Open-Bee/Bee-8B-RL -- 2025-10-26
  260. datalab-to/chandra -- 2025-10-26
  261. Unlock the power of images with AI Sheets -- 2025-10-26
  262. Picture in Picture / Webcam detect model on HuggingFace -- 2025-10-25
  263. Show HN: Story Keeper – AI agents with narrative continuity instead of memory -- 2025-10-25
  264. Show HN: Deta Surf – An open source and local-first AI notebook -- 2025-10-25
  265. Show HN: Tommy – Turn ESP32 devices into through-wall motion sensors -- 2025-10-25
  266. 20x Max Plan (€216) takes 2% of weekly Opus usage for a single Deep Research Query. That equals 50 per week if you use it ONLY for this and never continue or respond -- 2025-10-24
  267. I got tired of OpenAI dependency. Built a multi-LLM control center instead. -- 2025-10-19
  268. Turn ChatGPT into a real-time meeting assistant (via MCP + Apps SDK) -- 2025-10-19
  269. Claude Code taking a coffee break 🤔 -- 2025-10-19
  270. Show HN: Cmux – Coding Agent Multiplexer -- 2025-10-19
  271. [Editorial] Agentic Orchestration -- 2025-10-19
  272. Chicken Squisher 3000: Squish-Proof Security -- 2025-10-19
  273. Show HN: Largest open-source multimodal AI dataset -- 2025-10-18
  274. KORMo-Team/KORMo-10B-sft -- 2025-10-18
  275. ByteDance/FaceCLIP -- 2025-10-18
  276. Do you use multiple AI models for coding? Trying to validate a workflow problem -- 2025-10-17
  277. The “Compounding Engineering” mindset changed how I think about AI coding tools -- 2025-10-17
  278. Show HN: Rebuilt Bible search app to run 100% client-side with Transformers.js -- 2025-10-13
  279. swiss-ai/Apertus-8B-Instruct-2509 -- 2025-10-13
  280. A Childhood Dream, Created and Open Sourced -- 2025-10-13
  281. Some small tools for you - Ollama Management UI, Passkey authentication proxy -- 2025-10-12
  282. Single prompt I run after git commit (before push) for AI diff/commit review -- 2025-10-12
  283. Show HN: Gitcasso – Syntax Highlighting and Draft Recovery for GitHub Comments -- 2025-10-12
  284. simonw/claude-skills -- 2025-10-12
  285. Custom models don't work after v0.6.33 update - Anyone else? -- 2025-10-12
  286. Pardus AI: Open source AI Assistant thanks for the help with Ollama -- 2025-10-09
  287. What to use for refactoring -- 2025-10-09
  288. Claude Code finally has a planning partner — I built an AI backlog manager for solo devs -- 2025-10-09
  289. Granite4 Small-h 32b-A9b (Q4_K_M) at FULL 1M context window is using only 73GB of VRAM - Life is good! -- 2025-10-09
  290. Run Open AI GPT-OSS on a mobile phone (Demo) -- 2025-10-09
  291. AI21 releases Jamba 3B, the tiny model outperforming Qwen 3 4B and IBM Granite 4 Micro! -- 2025-10-09
  292. inclusionAI/Ling-mini-2.0 -- 2025-10-09
  293. [Editorial] The Tiny Recursive Mode -- 2025-10-08
  294. deepseek-ai/DeepSeek-V3.1-Terminus -- 2025-10-08
  295. Behavioral Modification Systems in Large Language Models: A Methodological Analysis of Long Conversation Reminders -- 2025-10-08
  296. Ring Flash 2.0 104B A6B with Linear Attention released a few days ago -- 2025-10-07
  297. Bring Your Own Data (BYOD) -- 2025-09-30
  298. CohereLabs/command-a-reasoning-08-2025 -- 2025-09-30
  299. Built an MCP server for Claude Desktop to browse Reddit in real-time -- 2025-09-30
  300. MetalQwen3: Full GPU-Accelerated Qwen3 Inference on Apple Silicon with Metal Shaders – Built on qwen3.c - WORK IN PROGRESS -- 2025-09-28
  301. yangdongchao/UniAudio2 -- 2025-09-28
  302. jimsweb/aiMIDI -- 2025-09-28
  303. Handy – Free open-source speech-to-text app written in Rust -- 2025-09-28
  304. IndexTeam/IndexTTS-2 -- 2025-09-28
  305. beankeji-cloud/SLiteIO -- 2025-09-28
  306. GitHub - shantur/jarvis-mcp: Bring your AI to life—talk to assistants instantly in your browser. Zero hassle, No API keys, No Whisper -- 2025-09-27
  307. PC memory costs to climb as fabs chase filthy lucre in servers and HBM -- 2025-09-27
  308. AI and licensing (commercial use) -- 2025-09-26
  309. I trained an LLM from scratch AMA! -- 2025-09-26
  310. pengzhangzhi/Open-dLLM -- 2025-09-26
  311. SWE-Bench Pro -- 2025-09-23
  312. Investigating Training Data Detection in AI Coders -- 2025-09-23
  313. A first stab at packaging llama.cpp in a performance-optimized manner -- 2025-09-23
  314. Model: Qwen3 Next Pull Request llama.cpp -- 2025-09-23
  315. Unobtanium No More; Perhaps We Already Have All The Elements We Need -- 2025-09-21
  316. support for the upcoming Olmo3 model has been merged into llama.cpp -- 2025-09-21
  317. Running Nvidia CUDA Pytorch/vLLM projects and pipelines on AMD with no modifications -- 2025-09-21
  318. A Quick Look At The AMD Instinct MI355X With ROCm 7.0 -- 2025-09-21
  319. Uncensored AI model for from 4b Max 8b -- 2025-09-21
  320. facebook/MobileLLM-R1-950M -- 2025-09-20
  321. OpenGVLab/InternVL3_5-241B-A28B -- 2025-09-20
  322. KBlueLeaf/HDM-xut-340M-anime -- 2025-09-20
  323. GPT-OSS:20b & Qwen 4b are a match made in heaven for 24GB VRAM builds -- 2025-09-18
  324. Was working in RAG recently got to know how well Gemma3 4B performs -- 2025-09-18
  325. ROCm 6.4.3 -> 7.0-rc1 after updating got +13.5% at 2xR9700 -- 2025-09-18
  326. Is it possible for different brand GPUs to work together? -- 2025-09-18
  327. Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts -- 2025-09-18
  328. LFM2-1.2B safety benchmark -- 2025-09-18
  329. Has anyone successfully gotten Ollama models (or any models) to execute SQL queries through natural language in Openwebui? -- 2025-09-18
  330. yangzhou24/OmniWorld -- 2025-09-18
  331. Analog Optical Computer for Inference and Combinatorial Optimization -- 2025-09-18
  332. Running Qwen-Next (Instruct and Thinking) MLX BF16 with MLX-LM on Macs -- 2025-09-17
  333. [Editorial] REFRAG: Rethinking RAG based Decoding -- 2025-09-17
  334. The Practicality Of Solar Powered Meshtastic -- 2025-09-17
  335. Kwai-Klear/Klear-46B-A2.5B-Instruct -- 2025-09-16
  336. WestZhang/VibeVoice-Large-pt -- 2025-09-16
  337. FlowSpec: Continuous Pipelined Speculative Decoding for Efficient Distributed LLM Inference -- 2025-09-16
  338. Qwen 3 Next Series – Qwen/Qwen3 Next 80B A3B Instruct Detected -- 2025-09-16
  339. Test-time Prompt Intervention -- 2025-09-16
  340. [Editorial] Tricks from OpenAI gpt-oss YOU can use with transformers -- 2025-09-15
  341. openbmb/MiniCPM4.1-8B -- 2025-09-15
  342. nunchaku-tech/nunchaku-qwen-image -- 2025-09-15
  343. ggml-org/gpt-oss-20b-GGUF -- 2025-09-15
  344. MBZUAI releases K2 Think. 32B reasoning model based on Qwen 2.5 32B backbone, focusing on high performance in math, coding and science. -- 2025-09-14
  345. unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF -- 2025-09-14
  346. Efficient hot-swappable LoRA variant supported in llama.cpp -- 2025-09-12
  347. Qwen/Qwen3-Next-80B-A3B-Instruct -- 2025-09-12
  348. swiss-ai/Apertus-70B-Instruct-2509 -- 2025-09-12
  349. GRASPED: Graph Anomaly Detection using Autoencoder with Spectral Encoder and Decoder (Full Version) -- 2025-09-12
  350. stepfun-ai/step3 -- 2025-09-12
  351. Exploring State-Space-Model based Language Model in Music Generation -- 2025-09-12
  352. HuggingFaceModelDownloader v2.0 — fast resume, a slick TUI, and powerful filters for GGUF/variants -- 2025-09-12
  353. Roo Code Cloud is here with Task Sync & Roomote Control || Roo Code 3.28.0 Release Notes -- 2025-09-12
  354. Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers -- 2025-09-12
  355. Made a one-click SearXNG fork with Redis, plus Dockerized Tika+OCR, and soon: local TTS/STT on Intel iGPU + AMD NPU -- 2025-09-12
  356. Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search -- 2025-09-10
  357. Introducing FineVision: a huge open-source dataset for training SOTA Vision Language Models -- 2025-09-10
  358. wildminder/ComfyUI-VibeVoice -- 2025-09-10
  359. bytedance/USO -- 2025-09-10
  360. Wan-AI/Wan2.2-I2V-A14B -- 2025-09-10
  361. YanoljaNEXT-Rosetta: A Collection of Translation Models in Different Sizes -- 2025-09-07
  362. nasa-ibm-ai4science/Surya-1.0 -- 2025-09-06
  363. Vulkan back ends, what do you use? -- 2025-09-06
  364. A new OpenAI model? Could this be 5.1 or 5o? What do you think? -- 2025-09-06
  365. haasonsaas/dspy-0to1-guide -- 2025-09-06
  366. 16 reproducible failures → upgraded into a 300+ page Global Fix Map. one link inside, feedback wanted -- 2025-09-06
  367. VibeVoice RIP? What do you think? -- 2025-09-05
  368. lodestones/Chroma1-HD -- 2025-09-05
  369. Welcome EmbeddingGemma, Google's new efficient embedding model -- 2025-09-05
  370. NousResearch/Hermes-4-405B -- 2025-09-05
  371. After deepseekv3 I feel like other MoE architectures are old or outdated. Why did Qwen chose a simple MoE architecture with softmax routing and aux loss for their Qwen3 models when there’s been better architectures for a while? -- 2025-09-02
  372. The Hacker's Guide to Building an AI Supercluster -- 2025-09-02
  373. CAD, From Scratch: MakerCAD -- 2025-09-02
  374. PSO-Merging: Merging Models Based on Particle Swarm Optimization -- 2025-09-02
  375. Coral-Protocol/Anemoi -- 2025-09-01
  376. After researchers unmasked a prolific SMS scammer, a new operation has emerged -- 2025-09-01
  377. Silent No More: Open-Source Fix for Mic Mishaps -- 2025-09-01
  378. How to reliably detect cross-listed job ads across multiple sites? -- 2025-09-01
  379. GPT OSS Fine-tuning QAT -- 2025-08-31
  380. Semantic Structure in Large Language Model Embeddings -- 2025-08-31
  381. facebook/dinov3-vit7b16-pretrain-lvd1689m -- 2025-08-31
  382. MizzenAI/HPSv3 -- 2025-08-31
  383. Some thoughts on LLMs and software development -- 2025-08-30
  384. Show HN: Sideko – Hybrid deterministic/LLM generator for API SDKs and docs -- 2025-08-30
  385. [Editorial] AI interfaces for future -- 2025-08-29
  386. I’ve Debugged 100+ RAG/LLM Pipelines. These 16 Bugs Always Come Back. (70 days, 800 stars) -- 2025-08-29
  387. Updates to Consumer Terms and Privacy Policy -- 2025-08-29
  388. Hierarchical Reasoning Model (HRM) implementation for text generation -- 2025-08-27
  389. SQLite-Vector adds support for float16 and bfloat16 (CPU, NEON, AVX2 and SSE2) -- 2025-08-26
  390. zai-org/GLM-4.5 -- 2025-08-26
  391. rednote-hilab/dots.vlm1.inst -- 2025-08-26
  392. black-forest-labs/FLUX.1-Krea-dev -- 2025-08-26
  393. Datarus-R1-14B-Preview, an adaptive multi-step reasoning LLM for automated data analysis -- 2025-08-24
  394. Fully Open source, serverless, community-driven MCP alternative built in Python, TS and Go -- 2025-08-24
  395. unsloth/Kimi-K2-Instruct-GGUF -- 2025-08-24
  396. DeepSeek V3.1 Reasoner improves over DeepSeek R1 on the Extended NYT Connections benchmark -- 2025-08-24
  397. Qwen3-30B-A3B-Instruct 2507 vs Qwen3-Coder Flash -- 2025-08-23
  398. Your model zoo for Software dev / webdev -- 2025-08-23
  399. sriniously/go-boilerplate -- 2025-08-23
  400. Design Patterns in MCP: Literate Reasoning -- 2025-08-22
  401. FlyMyAI/flymyai-lora-trainer -- 2025-08-22
  402. Zedless: Zed fork focused on privacy and being local-first -- 2025-08-22
  403. Show HN: Rucat – Cat for Prompt Engineers -- 2025-08-22
  404. NEW VERSION: 0.6.23 Has Just Released! - Many fixes and new features, huge changelog -- 2025-08-22
  405. Docker Model Runner is really neat -- 2025-08-20
  406. Build a Powerful RAG Web Scraper with Ollama and LangChain -- 2025-08-20
  407. Ollama interface with memory -- 2025-08-20
  408. YuminosukeSato/pyproc -- 2025-08-20
  409. AGENTS.md – Open format for guiding coding agents -- 2025-08-20
  410. NVIDIA Nemotron Nano 2 and the Nemotron Pretraining Dataset v1 -- 2025-08-20
  411. deepseek-ai/DeepSeek-V3.1-Base · Hugging Face -- 2025-08-20
  412. mistralai/Devstral-Small-2507 -- 2025-08-20
  413. zai-org/GLM-4.5-Air -- 2025-08-20
  414. microsoft/Phi-4-mini-flash-reasoning -- 2025-08-20
  415. ModelTC/Qwen-Image-Lightning -- 2025-08-20
  416. Fast Type-Aware Linting in Oxlint -- 2025-08-20
  417. GPT-5, where does it shine for you? -- 2025-08-19
  418. vidore/colqwen-omni-v0.1 -- 2025-08-19
  419. Tutorial: Open WebUI and llama-swap works great together! Demo of setup, model swapping and activity monitoring. -- 2025-08-18
  420. Concurrency in open-weight/open-source models? -- 2025-08-18
  421. [Editorial] Claude Flow, Alpha 90 release -- 2025-08-17
  422. Optimizing Text gen webui (oobabooga) for MOE models (Qwen3-235b, GLM 4.5) -- 2025-08-17
  423. Chen-zexi/vllm-cli -- 2025-08-17
  424. zyfoxx/subhunter -- 2025-08-17
  425. HuggingFaceTB/SmolLM3-3B-Base -- 2025-08-16
  426. mistralai/Voxtral-Small-24B-2507 -- 2025-08-16
  427. TiTan - a tiny model for tags and titles -- 2025-08-16
  428. baidu/ERNIE-4.5-VL-424B-A47B-PT -- 2025-08-16
  429. MiniLM (BERT) embeddings in C from scratch -- 2025-08-16
  430. GENNAI CLI - A ReAct-based agent CLI -- 2025-08-16
  431. gptme v0.28.0 major release - agent CLI with local model support -- 2025-08-16
  432. WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling -- 2025-08-16
  433. character-ai/pipelining-sft -- 2025-08-15
  434. Compass-Thinker-7B Technical Report -- 2025-08-15
  435. Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning -- 2025-08-15
  436. SaaS Is Dead -- 2025-08-15
  437. PYX: The next step in Python packaging -- 2025-08-15
  438. [Editorial] GLM-4.5, enterprise use -- 2025-08-13
  439. Best local model with function calling? -- 2025-08-13
  440. agentica-org/DeepSWE-Preview -- 2025-08-13
  441. janhq/Jan-v1-4B-GGUF -- 2025-08-13
  442. gpt-oss jailbreak workflow -- 2025-08-11
  443. GPT-5 removed logprob support from the API - technical breakdown and implications -- 2025-08-11
  444. A model for pure text continuation (not chirpy little Q&A assistant)? -- 2025-08-11
  445. Suggestion for upgrading hardware for MOE inference and fine-tuning. -- 2025-08-09
  446. Best models under 16GB?? -- 2025-08-09
  447. Explicit tail calls are now available on Rust Nightly (become keyword) -- 2025-08-09
  448. HuggingFaceTB/SmolLM3-3B -- 2025-08-09
  449. mistralai/Magistral-Small-2507 -- 2025-08-09
  450. N8N + OpenWebUI -- 2025-08-08
  451. Create space-saving clones on macOS with Python -- 2025-08-08
  452. Experience with GLM-4.5-Air + claude code? -- 2025-08-08
  453. Just when you thought Qwen was done... -- 2025-08-08
  454. Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs -- 2025-08-08
  455. Context Management by Trimming Conversation -- 2025-08-06
  456. Exploiting Primacy Effect To Improve Large Language Models -- 2025-08-06
  457. Help: Qwen3-Coder + LM Studio + Continue.dev (VSCode) + Mac 64GB M3 Max — 500 Internal Server Error, Even After Unsloth Fix -- 2025-08-04
  458. CWC now supports kimi.com (K2) and chat.z.ai (GLM-4.5) to enable coding with top tier models at no cost -- 2025-08-04
  459. 🧠 ICM+DPO: Used Qwen3's coherent understanding to improve Gemma3 at math - cross-model capability transfer with zero supervision -- 2025-08-03
  460. NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining -- 2025-08-03
  461. Build an AI Shopping Assistant with Gradio MCP Servers -- 2025-08-01
  462. [Editorial] Alternative to vector db rag -- 2025-07-30
  463. This year’s best open-source models and most cost-effective models -- 2025-07-29
  464. I’m looking for multimodal image input support and uncensored LLM -- 2025-07-29
  465. nvidia/audio-flamingo-3 -- 2025-07-29
  466. mistralai/Voxtral-Mini-3B-2507 -- 2025-07-29
  467. ziangcao0312/PhysX-3D -- 2025-07-29
  468. UI/UX benchmark update 7/22: Newest Qwen models added, Qwen3 takes the lead in terms of win rate (though still early) -- 2025-07-28
  469. unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF -- 2025-07-28
  470. zai-org/GLM-4.5 -- 2025-07-28
  471. Tesslate/UIGEN-X-32B-0727 -- 2025-07-28
  472. had to fine-tune qwen since llama sucks at summarizing -- 2025-07-28
  473. orchestre-dev/ccproxy -- 2025-07-28
  474. [Editorial] neural networks don’t need to be giant to be powerful -- 2025-07-27
  475. Qwen/Qwen3-235B-A22B-Thinking-2507 -- 2025-07-27
  476. mistralai/Magistral-Small-2507 -- 2025-07-27
  477. cmdaltctr/claude-gemini-mcp-slim -- 2025-07-26
  478. RichardAtCT/claude-code-openai-wrapper -- 2025-07-26
  479. Freigeist - The new Vibe Coding Platform -- 2025-07-26
  480. From chaotic prompting to structured workflow: My Claude evolution -- 2025-07-24
  481. A Request for Comments (RFC) for MCP-alternative Universal Tool Calling Protocol (UTCP) was created -- 2025-07-22
  482. How to use the same context across LLMs and Agents -- 2025-07-22
  483. Has anyone actually ran VLAs locally and how good are they? -- 2025-07-22
  484. google/medsiglip-448 -- 2025-07-22
  485. Questions about AI for translation -- 2025-07-22
  486. microsoft/Phi-4-mini-flash-reasoning -- 2025-07-22
  487. LGAI-EXAONE/EXAONE-4.0-32B -- 2025-07-22
  488. Replacing thinking with tool usage enables reasoning in small language models -- 2025-07-22
  489. Struggling to Generate Polished UI with Claude Code -- 2025-07-20
  490. Skywork/Skywork-R1V3-38B -- 2025-07-20
  491. ByteDance-Seed/Seed-X-PPO-7B -- 2025-07-20
  492. Diffusion model support in llama.cpp. -- 2025-07-16
  493. GLM-4 MoE incoming -- 2025-07-16
  494. GeoArrow and GeoParquet, and the Future of Geospatial Data Analysis -- 2025-07-15
  495. Programming Affordances That Invite Mistakes -- 2025-07-14
  496. HuggingFaceTB/SmolLM3-3B-Base -- 2025-07-14
  497. OLMo 2 - a family of fully-open language models -- 2025-07-12
  498. google/videoprism -- 2025-07-12
  499. Why don’t we have a big torrent repo for open-source LLMs? -- 2025-07-12
  500. Local PDF Database searchable with ollama - best setup? -- 2025-07-12
  501. Tinyllama on old Mediatek G80 android device -- 2025-07-12
  502. I used Ollama to build a Cursor for PDFs -- 2025-07-12
  503. Advice on switching to LLM -- 2025-07-12
  504. Building the Hugging Face MCP Server -- 2025-07-11
  505. Support for the upcoming IBM Granite 4.0 has been merged into llama.cpp -- 2025-07-11
  506. support for Falcon-H1 model family has been merged into llama.cpp -- 2025-07-11
  507. fsndzomga/metadspy -- 2025-07-10
  508. osmosis-ai/Osmosis-Apply-1.7B -- 2025-07-10
  509. IntervitensInc/pangu-pro-moe-model -- 2025-07-10
  510. Continual Gradient Low-Rank Projection Fine-Tuning for LLMs -- 2025-07-10
  511. pola-rs/polars -- 2025-07-09
  512. Introducing an open source cross-platform graphical interface LLM client -- 2025-07-08
  513. llama-server vs llama python binding -- 2025-07-08
  514. Llama server completion not working correctly -- 2025-07-08
  515. introducing cocoindex - super simple etl to prepare data for ai, with dynamic index (ollama integrated) -- 2025-07-08
  516. Higher topk and num_ctx or map/reduce ? -- 2025-07-08
  517. Looking for an upgrade from Meta-Llama-3.1-8B-Instruct-Q4_K_L.gguf, especially for letter parsing. Last time I looked into this was a very long time ago (7 months!) What are the best models nowadays? -- 2025-07-08
  518. Best models by size? -- 2025-07-08
  519. Planning a 7–8B Model Benchmark on 8GB GPU — What Should I Test & Measure? -- 2025-07-08
  520. DLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching -- 2025-07-08
  521. Efficient MultiModal Data Pipeline -- 2025-07-08
  522. Are non-autoregressive models really faster than autoregressive ones after all the denoising steps? -- 2025-07-06
  523. Yuan-ManX/ComfyUI-OmniGen2 -- 2025-07-06
  524. Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 -- 2025-07-06
  525. Using local models with Void -- 2025-07-05
  526. Intel GPU vLLM Docker Compose Bootstrap with Phi-lthy4 on A770 -- 2025-07-05
  527. Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm -- 2025-07-05
  528. Gemma 3n fully available in the open-source ecosystem! -- 2025-07-05
  529. 5060ti 16gb or 9060xt 16gb for small llm server -- 2025-07-05
  530. Qwen3 models in MLX format! -- 2025-07-05
  531. Best local coding model right now? -- 2025-07-05
  532. Found a Web3 LLM That Actually Gets DeFi Right -- 2025-07-03
  533. apple/DiffuCoder-7B-cpGRPO -- 2025-07-03
  534. Hoshinonyaruko/Gensokyo-MCP -- 2025-07-01
  535. ahmadallobani/BaldHead -- 2025-06-29
  536. modelcontextprotocol/registry -- 2025-06-27
  537. 0-concordance of knotted surfaces and Alexander ideals -- 2025-06-26
  538. unfinishedtr/progressbar -- 2025-06-24
  539. wearyfurnitur/confiq -- 2025-06-24
  540. Show HN: Controlling 3D models with voice and hand gestures -- 2025-06-24
  541. open-webui/mcpo -- 2025-06-23
  542. Built a fully local Whisper + pyannote stack to replace Otter. Full diarisation, transcripts & summaries on GPU. -- 2025-06-23
  543. MiniMax latest open-sourcing LLM, MiniMax-M1 — setting new standards in long-context reasoning -- 2025-06-23
  544. Run qwen 30b-a3b on Android local with Alibaba MNN Chat -- 2025-06-23
  545. A new PDF translation tool -- 2025-06-23
  546. What Really Happens When You Ask a Cursor a Question with GitHub MCP Integrated -- 2025-06-23
  547. [Q] How to Speed Up Mistral 7B Inference in LM Studio? 31s/Chunk on RTX 3070 -- 2025-06-23
  548. Cyber security guys are about to become very on demand in the coming few years -- 2025-06-23
  549. The first big AI disaster is yet to happen -- 2025-06-23
  550. Trading with Claude, and writing your own MCP server -- 2025-06-23
  551. Ecne AI Podcast Generator - Update -- 2025-06-23
  552. Help me decide on hardware for LLMs -- 2025-06-23
  553. chungmin99/pyroki -- 2025-06-23
  554. nvidia/GR00T-N1.5-3B -- 2025-06-23
  555. nvidia/Cosmos-Predict2-2B-Text2Image -- 2025-06-22
  556. trendmicro/vision-one-mcp-server -- 2025-06-20
  557. Minidoracat/mcp-feedback-enhanced -- 2025-06-20
  558. meta-llama/Llama-3.1-8B-Instruct -- 2025-06-19
  559. MiniMaxAI/MiniMax-M1-80k -- 2025-06-19
  560. ckanthony/openapi-mcp -- 2025-06-18
  561. Ta0ing/MCP-SecurityTools -- 2025-06-18
  562. fluxions/vui -- 2025-06-13
  563. carbon-language/carbon-lang -- 2025-06-12
  564. CJackHwang/AIstudioProxyAPI -- 2025-06-11
  565. News publishers call Google's AI Mode 'theft' -- 2025-06-11
  566. typelevel/cats -- 2025-06-11
  567. wesm/pydata-book -- 2025-06-11
  568. jedisct1/openapi-mcp -- 2025-06-09
  569. arcee-ai/Homunculus -- 2025-06-04
  570. GitHub MCP exploited: Accessing private repositories via MCP -- 2025-06-01
  571. dipampaul17/KVSplit -- 2025-06-01
  572. facebook/OMol25 -- 2025-05-30
  573. Is Microsoft’s new Foundry Local going to be the “easy button” for running newer transformers models locally? -- 2025-05-28
  574. Cobolt is now available on Linux! 🎉 -- 2025-05-28
  575. Round Up: Current Best Local Models under 40B for Code & Tool Calling, General Chatting, Vision, and Creative Story Writing. -- 2025-05-28
  576. Best local model for M2 16gb MacBook Air for Analyzing Transcripts -- 2025-05-28
  577. Prompt Debugging -- 2025-05-28
  578. Not so Smart Agent (Ollama, Spring AI, MCP) -- 2025-05-28
  579. [Q] How can one get better at fixing models,training etc.? -- 2025-05-28
  580. FOSS - MCP Server generator from OpenAPI specification files (swagger/etapi) -- 2025-05-28
  581. Digital Payment System GNU Taler Gets Green Light to Operate in Switzerland -- 2025-05-28