Model Architecture

Transformers, mixture of experts, attention mechanisms, model design

478 articles across 132 editions

Articles

  1. Liquid AI releases LFM2-24B-A2B -- 2026-02-24
  2. Qwen3's most underrated feature: Voice embeddings -- 2026-02-24
  3. After many contributions craft, Crane now officially supports Qwen3-TTS! -- 2026-02-24
  4. We tested the same INT8 model on 5 Snapdragon chipsets. Accuracy ranged from 93% to 71%. Same weights, same ONNX file. -- 2026-02-24
  5. [Editorial] https://www-cdn.anthropic.com/f21d93f21602ead5cdbecb8c8e1c765759d9e232.pdf -- 2026-02-12
  6. [Editorial] https://d3lm.medium.com/overly-agentic-why-anthropic-is-worried-about-opus-4-6-17eee0f8e5cd -- 2026-02-12
  7. [Editorial] https://www.linkedin.com/posts/avipil_i-got-my-first-bill-after-switching-to-claude-activity-7427320523870629889-vM5K -- 2026-02-12
  8. Pros/Cons and use case for bypassing permissions -- 2026-02-12
  9. [Editorial] https://www.kylerush.org/posts/opus-4-5-really-changed-things -- 2026-02-10
  10. [Editorial] https://www.linkedin.com/posts/ownyourai_i-taught-my-claude-code-to-swallow-a-32m-activity-7426902541868728321-2Z36 -- 2026-02-10
  11. [Editorial] https://docs.google.com/document/d/1I9r21TyQuAO1y2ecztBU0PSCpjHSL_vZJiA5v276Wro/mobilebasic -- 2026-02-10
  12. [Editorial] https://www.linkedin.com/posts/cole-medin-727752184_claude-codes-new-agent-teams-feature-is-share-7426633806792609792-pT3s -- 2026-02-10
  13. [Tool] claude-config-sync: Sync your Claude Code configuration across machines using GitHub Gists -- 2026-02-10
  14. [Editorial] https://www.linkedin.com/posts/ownyourai_i-just-woke-up-to-qwen3-coder-next-80b-activity-7424703876240695297-Nlqf -- 2026-02-04
  15. LiquidAI/LFM2.5-1.2B-Thinking -- 2026-02-04
  16. ByteDance-Seed/Stable-DiffCoder-8B-Instruct · Hugging Face -- 2026-02-03
  17. tencent/Youtu-VL-4B-Instruct -- 2026-02-03
  18. NousResearch/NousCoder-14B -- 2026-02-03
  19. transformers v5 final is out 🔥 -- 2026-01-30
  20. PaddlePaddle/PaddleOCR-VL-1.5 -- 2026-01-30
  21. zai-org/GLM-4.7 -- 2026-01-30
  22. MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models -- 2026-01-30
  23. Sharing my set of distilled small language models (3B) + training data in more than 50 low-resource languages -- 2026-01-30
  24. Introducing Kimi K2.5, Open-Source Visual Agentic Intelligence -- 2026-01-28
  25. ~60GB models on coding: GLM 4.7 Flash vs. GPT OSS 120B vs. Qwen3 Coder 30B -- your comparisons? -- 2026-01-28
  26. openbmb/AgentCPM-Report -- 2026-01-28
  27. Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice -- 2026-01-28
  28. Mixture-of-Models: Unifying Heterogeneous Agents via N-Way Self-Evaluating Deliberation -- 2026-01-28
  29. One Year Since the “DeepSeek Moment” -- 2026-01-28
  30. Qwen/Qwen-Image-2512 -- 2026-01-28
  31. [Editorial] https://www.linkedin.com/posts/ownyourai_deepseek-just-released-the-first-vision-ai-activity-7421818927657385987-V1yo -- 2026-01-27
  32. Unsloth announces support for finetuning embedding models -- 2026-01-27
  33. Qwen/Qwen3-TTS-12Hz-0.6B-Base -- 2026-01-27
  34. Qwen/Qwen3-VL-Reranker-2B -- 2026-01-27
  35. [Editorial] https://www.linkedin.com/posts/ivandj_early-claims-around-self-evolving-memory-activity-7421307316437676033-l0Jm -- 2026-01-26
  36. [Editorial] https://www.linkedin.com/posts/reuvencohen_introducing-ruvector-world-model-activity-7421556928910290944-cx4v -- 2026-01-26
  37. stepfun-ai/Step3-VL-10B -- 2026-01-26
  38. Qwen/Qwen3-VL-Embedding-2B -- 2026-01-21
  39. Phr00t/Qwen-Image-Edit-Rapid-AIO -- 2026-01-21
  40. Qwen/Qwen3-VL-Reranker-8B -- 2026-01-21
  41. Bartowski comes through again. GLM 4.7 flash GGUF -- 2026-01-21
  42. Step-Audio-R1.1 (Open Weight) by StepFun just set a new SOTA on the Artificial Analysis Speech Reasoning leaderboard -- 2026-01-20
  43. GLM-4.7-Flash -- 2026-01-20
  44. stepfun-ai/Step-Audio-R1.1 -- 2026-01-20
  45. tencent/HY-MT1.5-1.8B -- 2026-01-20
  46. Introducing GLM-Image -- 2026-01-14
  47. GPT-OSS -> MLA conversion breakthrough (20B), still looking for compute + collaborators -- 2026-01-14
  48. FrogBoss 32B and FrogMini 14B from Microsoft -- 2026-01-14
  49. Qwen/Qwen3-VL-Embedding-8B -- 2026-01-14
  50. MCP for Financial Ontology! -- 2026-01-09
  51. A community index for MCPs that don’t disappear after the thread ends -- 2026-01-09
  52. tencent/HY-WorldPlay -- 2026-01-09
  53. meituan-longcat/LongCat-Image -- 2026-01-09
  54. Introducing Falcon H1R 7B -- 2026-01-09
  55. [Editorial] https://www.linkedin.com/posts/ernst-van-gassen-9196a7b5_we-spend-a-lot-of-time-trying-to-make-prompts-ugcPost-7414966440023482368-dgh5 -- 2026-01-09
  56. [Editorial] https://leakhub.ai/ -- 2026-01-09
  57. [Editorial] https://www.linkedin.com/posts/hardmaru_survival-of-the-fittest-code-blog-https-activity-7415068590485458944-3tqP -- 2026-01-09
  58. A closer look at a BGP anomaly in Venezuela -- 2026-01-09
  59. Show HN: I visualized the entire history of Citi Bike in the browser -- 2026-01-09
  60. Modifying a QingPing Air Quality Monitor for Local MQTT Access -- 2026-01-09
  61. [Editorial] https://github.com/hiyouga/LlamaFactory -- 2026-01-07
  62. Tongyi-MAI/MAI-UI-8B · Hugging Face -- 2026-01-07
  63. MultiverseComputingCAI/HyperNova-60B · Hugging Face -- 2026-01-07
  64. [Editorial] https://www.alwaysfurther.ai/blog/train-4b-model-to-beat-claude-sonnet-gemini -- 2026-01-05
  65. [Experimental] Gemma 3 4B - Dark CoT: Pushing 4B Reasoning to 33%+ on GPQA Diamond -- 2026-01-05
  66. Youtu-LLM-2B-GGUF is here! -- 2026-01-05
  67. upstage/Solar-Open-100B -- 2026-01-05
  68. EditMGT — fast, localized image editing with Masked Generative Transformers -- 2025-12-30
  69. Francis-Rings/FlashPortrait -- 2025-12-30
  70. zai-org/GLM-TTS -- 2025-12-30
  71. Flowception: Temporally Expansive Flow Matching for Video Generation -- 2025-12-30
  72. I've been experimenting with SLM's a lot recently. My goal was to prove even SLMs can be accurate with the right architecture behind it. -- 2025-12-22
  73. MBZUAI releases K2-V2 - 70B fully open model. -- 2025-12-22
  74. microsoft/Fara-7B -- 2025-12-22
  75. Alibaba Tongyi Open Sources Two Audio Models: Fun-CosyVoice 3.0 (TTS) and Fun-ASR-Nano-2512 (ASR) -- 2025-12-19
  76. My professor lent me an A6000, so I tried to build a coding model. Here is Anni! (Qwen3-14B Fine-tune) -- 2025-12-19
  77. Two years ago, I was just a math major. Now I've built the 1.5B router model used by HuggingFace. Can I bring it to Cursor? -- 2025-12-19
  78. Independent evaluation of GPT5.2 on SWE-bench: 5.2 high is #3 behind Gemini, 5.2 medium behind Sonnet 4.5 -- 2025-12-16
  79. Did I overhype Claude Code? GPT + Comet are quietly beating it for me -- 2025-12-16
  80. 🚀 New: Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B -- 2025-12-16
  81. ByteDance/Dolphin-v2 -- 2025-12-16
  82. zai-org/AutoGLM-Phone-9B-Multilingual -- 2025-12-16
  83. [Editorial] https://www.linkedin.com/posts/eric-vyacheslav-156273169_a-7m-model-just-surpassed-deepseek-r1-gemini-activity-7405985266043297792-s1Jn -- 2025-12-15
  84. Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models -- 2025-12-15
  85. New in llama.cpp: Model Management -- 2025-12-12
  86. Successful prototype component prompt - For big projects -- 2025-12-12
  87. Building RNJ-1: What makes It different from Gemma 3? -- 2025-12-11
  88. EssentialAI/rnj-1 -- 2025-12-11
  89. From Azure Functions to FreeBSD -- 2025-12-10
  90. [Editorial] https://www.linkedin.com/posts/dakharlamov_too-slow-thats-what-they-called-engineers-activity-7403964370390646784-J27P -- 2025-12-09
  91. [Editorial] https://arxiv.org/abs/2511.20920 -- 2025-12-09
  92. [Editorial] https://www.linkedin.com/posts/anthony-alcaraz-b80763155_your-ai-agents-context-window-is-not-a-database-activity-7403394612893024256-9S87 -- 2025-12-09
  93. [Editorial] https://www.linkedin.com/posts/reuvencohen_introducing-ruvector-postgres-a-self-learning-activity-7403841311029837824-jogp -- 2025-12-09
  94. Emacs is my new window manager -- 2025-12-08
  95. shubh-io/DockMate -- 2025-12-08
  96. Comfy-Org/HunyuanVideo_1.5_repackaged -- 2025-12-08
  97. We were tired of guessing which local model to use for which query. built a speculative execution lib that figures it out (github) -- 2025-12-05
  98. Nimony (eventually Nim 3.0) Design Principles -- 2025-12-03
  99. nvidia/Orchestrator-8B · Hugging Face -- 2025-12-03
  100. allenai/Olmo-3-1125-32B -- 2025-12-03
  101. moonshotai/Kimi-Linear-48B-A3B-Instruct -- 2025-12-02
  102. orabazes/FLUX.2-dev-GGUF -- 2025-12-02
  103. Transformers v5: Simple model definitions powering the AI ecosystem -- 2025-12-02
  104. Pocketbase – open-source realtime back end in 1 file -- 2025-11-28
  105. [Editorial] https://www.rand.org/pubs/commentary/2025/11/electromagnetic-warfare-natos-blind-spot-could-decide.html -- 2025-11-28
  106. In depth analysis of Nvidia's Jet Nemotron models -- 2025-11-26
  107. Hidden causes of LLM latency, its not just the model size -- 2025-11-26
  108. Benchmark: Self-Hosted Qwen-30B (LoRA) vs. Llama-3.1-8B vs. GPT-4.1-nano. Comparison of parsing success rates and negative constraints. -- 2025-11-25
  109. [Editorial] https://arxiv.org/html/2510.04871v1 -- 2025-11-24
  110. [Editorial] https://www.linkedin.com/posts/ingason_agenticai-erp-aiagents-activity-7396551539538141185-PxNU -- 2025-11-24
  111. facebook/sam3 -- 2025-11-24
  112. rl-research/DR-Tulu-8B -- 2025-11-24
  113. Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning -- 2025-11-24
  114. marinero4972/Open-o3-Video -- 2025-11-17
  115. nanonets/Nanonets-OCR2-3B -- 2025-11-17
  116. nvidia/ChronoEdit-14B-Diffusers-Upscaler-Lora -- 2025-11-17
  117. ibm-granite/granite-4.0-h-350m -- 2025-11-14
  118. tencent/HunyuanWorld-Mirror -- 2025-11-14
  119. [Editorial] https://www.npmjs.com/package/neural-trader -- 2025-11-14
  120. Critical RCE patched in Imunify360 affects up to 50M+ websites -- 2025-11-14
  121. Kubernetes Ingress Nginx is retiring -- 2025-11-14
  122. About KeePassXC's Code Quality Control -- 2025-11-14
  123. Visualizing Quantization Types -- 2025-11-11
  124. Co-authored a book called "Build DeepSeek from Scratch" | Live Now -- 2025-11-11
  125. Writing your own BEAM -- 2025-11-11
  126. DIY Powerwall Blows Clouds, Competition Out of the Water -- 2025-11-11
  127. BAAI/Emu3.5-Image -- 2025-11-07
  128. Riemannian Optimization for LoRA on the Stiefel Manifold -- 2025-11-07
  129. "On-the-fly" code reviews with ollama. It kinda works.. -- 2025-11-07
  130. I Built an "AI Art Director" Agent to Orchestrate Image and Video Models. -- 2025-11-07
  131. What's your most unexpected Claude workflow discovery? -- 2025-11-07
  132. [Research] Cross-Stage Vulnerabilities in Large Language Model Architectures -- 2025-11-07
  133. runZeroInc/runZeroHound -- 2025-11-07
  134. openai/gpt-oss-safeguard-120b -- 2025-11-07
  135. Why does Image Recognition work in llama-server but not through Open WebUI? -- 2025-11-06
  136. Has anyone tested ollama on Whisplay HAT with Raspberry pi zero 2W? -- 2025-11-06
  137. allenai/olmOCR-2-7B-1025 -- 2025-11-06
  138. is there simple way like .bat to compress to q4-q8 like Unsloth, Qwen3-VL-30B-A3B-Thinking-abliterated model -- 2025-11-06
  139. MiniMax M2 Llama.cpp support merged -- 2025-11-05
  140. Which AI IDE should I use under $20/month? -- 2025-11-05
  141. lzA6/video-to-txt -- 2025-11-05
  142. xaviviro/python-toon -- 2025-11-05
  143. [Editorial] https://www.evokesecurity.com/blogs/prompt-injection-is-for-everyone -- 2025-11-04
  144. Found a remote file inclusion vulnerability in an AI-generated app before launch -- 2025-11-04
  145. Launch HN: Propolis (YC X25) – Browser agents that QA your web app autonomously -- 2025-11-04
  146. Attacking macOS XPC Helpers: Protocol Reverse Engineering and Interface Analysis -- 2025-11-04
  147. [Research] Unvalidated Trust: Cross-Stage Failure Modes in LLM/agent pipelines arXiv -- 2025-11-04
  148. [Editorial] Context Engineering Handbook -- 2025-11-03
  149. Qwen3-VL-32B Q8 speeds in llama.cpp vs vLLM FP8 on a RTX PRO 6000 -- 2025-11-03
  150. OSS alternative to Open WebUI - ChatGPT-like UI, API and CLI -- 2025-11-03
  151. Looking for a RAG UI manager to meet our needs to replace Zapier -- 2025-11-03
  152. LiquidAI/LFM2-1.2B-Extract -- 2025-11-03
  153. DeepSeek may have found a new way to improve AI’s ability to remember -- 2025-11-02
  154. Reliable Unlearning Harmful Information in LLMs with Metamorphosis Representation Projection -- 2025-11-02
  155. OpenAI: gpt-oss-safeguard: two open-weight reasoning models built for safety classification (Now on Hugging Face) -- 2025-10-31
  156. briaai/FIBO -- 2025-10-31
  157. snowyfizz/Vision-Detection-API -- 2025-10-29
  158. Show HN: Apache Fory Rust – 10-20x faster serialization than JSON/Protobuf -- 2025-10-29
  159. huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning -- 2025-10-29
  160. AlphaXiv, Compare the Deepseek-OCR and Mistral-OCR OCR models -- 2025-10-26
  161. Open-Bee/Bee-8B-RL -- 2025-10-26
  162. datalab-to/chandra -- 2025-10-26
  163. Unlock the power of images with AI Sheets -- 2025-10-26
  164. Picture in Picture / Webcam detect model on HuggingFace -- 2025-10-25
  165. Show HN: Story Keeper – AI agents with narrative continuity instead of memory -- 2025-10-25
  166. Show HN: Deta Surf – An open source and local-first AI notebook -- 2025-10-25
  167. Show HN: Tommy – Turn ESP32 devices into through-wall motion sensors -- 2025-10-25
  168. 20x Max Plan (€216) takes 2% of weekly Opus usage for a single Deep Research Query. That equals 50 per week if you use it ONLY for this and never continue or respond -- 2025-10-24
  169. I got tired of OpenAI dependency. Built a multi-LLM control center instead. -- 2025-10-19
  170. Turn ChatGPT into a real-time meeting assistant (via MCP + Apps SDK) -- 2025-10-19
  171. Claude Code taking a coffee break 🤔 -- 2025-10-19
  172. Show HN: Cmux – Coding Agent Multiplexer -- 2025-10-19
  173. [Editorial] Agentic Orchestration -- 2025-10-19
  174. Chicken Squisher 3000: Squish-Proof Security -- 2025-10-19
  175. Show HN: Largest open-source multimodal AI dataset -- 2025-10-18
  176. KORMo-Team/KORMo-10B-sft -- 2025-10-18
  177. ByteDance/FaceCLIP -- 2025-10-18
  178. Do you use multiple AI models for coding? Trying to validate a workflow problem -- 2025-10-17
  179. The “Compounding Engineering” mindset changed how I think about AI coding tools -- 2025-10-17
  180. Show HN: Rebuilt Bible search app to run 100% client-side with Transformers.js -- 2025-10-13
  181. swiss-ai/Apertus-8B-Instruct-2509 -- 2025-10-13
  182. A Childhood Dream, Created and Open Sourced -- 2025-10-13
  183. Some small tools for you - Ollama Managment UI, Passkey authentication proxy -- 2025-10-12
  184. Single prompt I run after git commit (before push) for AI diff/commit review -- 2025-10-12
  185. Show HN: Gitcasso – Syntax Highlighting and Draft Recovery for GitHub Comments -- 2025-10-12
  186. simonw/claude-skills -- 2025-10-12
  187. Custom models don't work after v0.6.33 update - Anyone else? -- 2025-10-12
  188. Pardus AI: Open source AI Assistant thanks for the help with Ollama -- 2025-10-09
  189. What to use for refactoring -- 2025-10-09
  190. Claude Code finally has a planning partner — I built an AI backlog manager for solo devs -- 2025-10-09
  191. Granite4 Small-h 32b-A9b (Q4_K_M) at FULL 1M context window is using only 73GB of VRAM - Life is good! -- 2025-10-09
  192. Run Open AI GPT-OSS on a mobile phone (Demo) -- 2025-10-09
  193. AI21 releases Jamba 3B, the tiny model outperforming Qwen 3 4B and IBM Granite 4 Micro! -- 2025-10-09
  194. inclusionAI/Ling-mini-2.0 -- 2025-10-09
  195. [Editorial] The Tiny Recursive Mode -- 2025-10-08
  196. deepseek-ai/DeepSeek-V3.1-Terminus -- 2025-10-08
  197. Behavioral Modification Systems in Large Language Models: A Methodological Analysis of Long Conversation Reminders -- 2025-10-08
  198. Ring Flash 2.0 104B A6B with Linear Attention released a few days ago -- 2025-10-07
  199. Bring Your Own Data (BYOD) -- 2025-09-30
  200. CohereLabs/command-a-reasoning-08-2025 -- 2025-09-30
  201. Built an MCP server for Claude Desktop to browse Reddit in real-time -- 2025-09-30
  202. MetalQwen3: Full GPU-Accelerated Qwen3 Inference on Apple Silicon with Metal Shaders – Built on qwen3.c - WORK IN PROGRESS -- 2025-09-28
  203. yangdongchao/UniAudio2 -- 2025-09-28
  204. jimsweb/aiMIDI -- 2025-09-28
  205. Handy – Free open-source speech-to-text app written in Rust -- 2025-09-28
  206. IndexTeam/IndexTTS-2 -- 2025-09-28
  207. beankeji-cloud/SLiteIO -- 2025-09-28
  208. GitHub - shantur/jarvis-mcp: Bring your AI to life—talk to assistants instantly in your browser. Zero hassle, No API keys, No Whisper -- 2025-09-27
  209. PC memory costs to climb as fabs chase filthy lucre in servers and HBM -- 2025-09-27
  210. AI and licensing (commercial use) -- 2025-09-26
  211. I trained an LLM from scratch AMA! -- 2025-09-26
  212. pengzhangzhi/Open-dLLM -- 2025-09-26
  213. SWE-Bench Pro -- 2025-09-23
  214. Investigating Training Data Detection in AI Coders -- 2025-09-23
  215. A first stab at packaging llama.cpp in a performance-optimized manner -- 2025-09-23
  216. Model: Qwen3 Next Pull Request llama.cpp -- 2025-09-23
  217. Unobtanium No More; Perhaps We Already Have All The Elements We Need -- 2025-09-21
  218. support for the upcoming Olmo3 model has been merged into llama.cpp -- 2025-09-21
  219. Running Nvidia CUDA Pytorch/vLLM projects and pipelines on AMD with no modifications -- 2025-09-21
  220. A Quick Look At The AMD Instinct MI355X With ROCm 7.0 -- 2025-09-21
  221. Uncensored AI model for from 4b Max 8b -- 2025-09-21
  222. facebook/MobileLLM-R1-950M -- 2025-09-20
  223. OpenGVLab/InternVL3_5-241B-A28B -- 2025-09-20
  224. KBlueLeaf/HDM-xut-340M-anime -- 2025-09-20
  225. GPT-OSS:20b & Qwen 4b are a match made in heaven for 24GB VRAM builds -- 2025-09-18
  226. Was working in RAG recently got to know how well Gemma3 4B performs -- 2025-09-18
  227. ROCm 6.4.3 -> 7.0-rc1 after updating got +13.5% at 2xR9700 -- 2025-09-18
  228. Is it possible for different brand GPUs to work together? -- 2025-09-18
  229. Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts -- 2025-09-18
  230. LFM2-1.2B safety benchmark -- 2025-09-18
  231. Has anyone successfully gotten Ollama models (or any models) to execute SQL queries through natural language in Openwebui? -- 2025-09-18
  232. yangzhou24/OmniWorld -- 2025-09-18
  233. Analog Optical Computer for Inference and Combinatorial Optimization -- 2025-09-18
  234. Running Qwen-Next (Instruct and Thinking) MLX BF16 with MLX-LM on Macs -- 2025-09-17
  235. [Editorial] REFRAG: Rethinking RAG based Decoding -- 2025-09-17
  236. The Practicality Of Solar Powered Meshtastic -- 2025-09-17
  237. Kwai-Klear/Klear-46B-A2.5B-Instruct -- 2025-09-16
  238. WestZhang/VibeVoice-Large-pt -- 2025-09-16
  239. FlowSpec: Continuous Pipelined Speculative Decoding for Efficient Distributed LLM Inference -- 2025-09-16
  240. Qwen 3 Next Series – Qwen/Qwen3 Next 80B A3B Instruct Detected -- 2025-09-16
  241. Test-time Prompt Intervention -- 2025-09-16
  242. [Editorial] Tricks from OpenAI gpt-oss YOU can use with transformers -- 2025-09-15
  243. openbmb/MiniCPM4.1-8B -- 2025-09-15
  244. nunchaku-tech/nunchaku-qwen-image -- 2025-09-15
  245. ggml-org/gpt-oss-20b-GGUF -- 2025-09-15
  246. MBZUAI releases K2 Think. 32B reasoning model based on Qwen 2.5 32B backbone, focusing on high performance in math, coding and science. -- 2025-09-14
  247. unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF -- 2025-09-14
  248. Efficient hot-swappable LoRA variant supported in llama.cpp -- 2025-09-12
  249. Qwen/Qwen3-Next-80B-A3B-Instruct -- 2025-09-12
  250. swiss-ai/Apertus-70B-Instruct-2509 -- 2025-09-12
  251. GRASPED: Graph Anomaly Detection using Autoencoder with Spectral Encoder and Decoder (Full Version) -- 2025-09-12
  252. stepfun-ai/step3 -- 2025-09-12
  253. Exploring State-Space-Model based Language Model in Music Generation -- 2025-09-12
  254. HuggingFaceModelDownloader v2.0 — fast resume, a slick TUI, and powerful filters for GGUF/variants -- 2025-09-12
  255. Roo Code Cloud is here with Task Sync & Roomote Control || Roo Code 3.28.0 Release Notes -- 2025-09-12
  256. Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers -- 2025-09-12
  257. Made a one-click SearXNG fork with Redis, plus Dockerized Tika+OCR, and soon: local TTS/STT on Intel iGPU + AMD NPU -- 2025-09-12
  258. Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search -- 2025-09-10
  259. Introducing FineVision: a huge open-source dataset for training SOTA Vision Language Models -- 2025-09-10
  260. wildminder/ComfyUI-VibeVoice -- 2025-09-10
  261. bytedance/USO -- 2025-09-10
  262. Wan-AI/Wan2.2-I2V-A14B -- 2025-09-10
  263. YanoljaNEXT-Rosetta: A Collection of Translation Models in Different Sizes -- 2025-09-07
  264. nasa-ibm-ai4science/Surya-1.0 -- 2025-09-06
  265. Vulkan back ends, what do you use? -- 2025-09-06
  266. A new OpenAI model? Could this be 5.1 or 5o? What do you think? -- 2025-09-06
  267. haasonsaas/dspy-0to1-guide -- 2025-09-06
  268. 16 reproducible failures → upgraded into a 300+ page Global Fix Map. one link inside, feedback wanted -- 2025-09-06
  269. VibeVoice RIP? What do you think? -- 2025-09-05
  270. lodestones/Chroma1-HD -- 2025-09-05
  271. Welcome EmbeddingGemma, Google's new efficient embedding model -- 2025-09-05
  272. NousResearch/Hermes-4-405B -- 2025-09-05
  273. After deepseekv3 I feel like other MoE architectures are old or outdated. Why did Qwen chose a simple MoE architecture with softmax routing and aux loss for their Qwen3 models when there’s been better architectures for a while? -- 2025-09-02
  274. The Hacker's Guide to Building an AI Supercluster -- 2025-09-02
  275. CAD, From Scratch: MakerCAD -- 2025-09-02
  276. PSO-Merging: Merging Models Based on Particle Swarm Optimization -- 2025-09-02
  277. Coral-Protocol/Anemoi -- 2025-09-01
  278. After researchers unmasked a prolific SMS scammer, a new operation has emerged -- 2025-09-01
  279. Silent No More: Open-Source Fix for Mic Mishaps -- 2025-09-01
  280. How to reliably detect cross-listed job ads across multiple sites? -- 2025-09-01
  281. GPT OSS Fine-tuning QAT -- 2025-08-31
  282. Semantic Structure in Large Language Model Embeddings -- 2025-08-31
  283. facebook/dinov3-vit7b16-pretrain-lvd1689m -- 2025-08-31
  284. MizzenAI/HPSv3 -- 2025-08-31
  285. Some thoughts on LLMs and software development -- 2025-08-30
  286. Show HN: Sideko – Hybrid deterministic/LLM generator for API SDKs and docs -- 2025-08-30
  287. [Editorial] AI interfaces for future -- 2025-08-29
  288. I’ve Debugged 100+ RAG/LLM Pipelines. These 16 Bugs Always Come Back. (70 days, 800 stars) -- 2025-08-29
  289. Updates to Consumer Terms and Privacy Policy -- 2025-08-29
  290. Hierarchical Reasoning Model (HRM) implementation for text generation -- 2025-08-27
  291. SQLite-Vector adds support for float16 and bfloat16 (CPU, NEON, AVX2 and SSE2) -- 2025-08-26
  292. zai-org/GLM-4.5 -- 2025-08-26
  293. rednote-hilab/dots.vlm1.inst -- 2025-08-26
  294. black-forest-labs/FLUX.1-Krea-dev -- 2025-08-26
  295. Datarus-R1-14B-Preview, an adaptive multi-step reasoning LLM for automated data analysis -- 2025-08-24
  296. Fully Open source, serverless, community-driven MCP alternative built in Python, TS and Go -- 2025-08-24
  297. unsloth/Kimi-K2-Instruct-GGUF -- 2025-08-24
  298. DeepSeek V3.1 Reasoner improves over DeepSeek R1 on the Extended NYT Connections benchmark -- 2025-08-24
  299. Qwen3-30B-A3B-Instruct 2507 vs Qwen3-Coder Flash -- 2025-08-23
  300. Your model zoo for Software dev / webdev -- 2025-08-23
  301. sriniously/go-boilerplate -- 2025-08-23
  302. Design Patterns in MCP: Literate Reasoning -- 2025-08-22
  303. FlyMyAI/flymyai-lora-trainer -- 2025-08-22
  304. Zedless: Zed fork focused on privacy and being local-first -- 2025-08-22
  305. Show HN: Rucat – Cat for Prompt Engineers -- 2025-08-22
  306. NEW VERSION: 0.6.23 Has Just Released! - Many fixes and new features, huge changelog -- 2025-08-22
  307. Docker Model Runner is really neat -- 2025-08-20
  308. Build a Powerful RAG Web Scraper with Ollama and LangChain -- 2025-08-20
  309. Ollama interface with memory -- 2025-08-20
  310. YuminosukeSato/pyproc -- 2025-08-20
  311. AGENTS.md – Open format for guiding coding agents -- 2025-08-20
  312. NVIDIA Nemotron Nano 2 and the Nemotron Pretraining Dataset v1 -- 2025-08-20
  313. deepseek-ai/DeepSeek-V3.1-Base · Hugging Face -- 2025-08-20
  314. mistralai/Devstral-Small-2507 -- 2025-08-20
  315. zai-org/GLM-4.5-Air -- 2025-08-20
  316. microsoft/Phi-4-mini-flash-reasoning -- 2025-08-20
  317. ModelTC/Qwen-Image-Lightning -- 2025-08-20
  318. Fast Type-Aware Linting in Oxlint -- 2025-08-20
  319. GPT-5, where does it shine for you? -- 2025-08-19
  320. vidore/colqwen-omni-v0.1 -- 2025-08-19
  321. Tutorial: Open WebUI and llama-swap works great together! Demo of setup, model swapping and activity monitoring. -- 2025-08-18
  322. Concurrency in open-weight/open-source models? -- 2025-08-18
  323. [Editorial] Claude Flow, Alpha 90 release -- 2025-08-17
  324. Optimizing Text gen webui (oobabooga) for MOE models (Qwen3-235b, GLM 4.5) -- 2025-08-17
  325. Chen-zexi/vllm-cli -- 2025-08-17
  326. zyfoxx/subhunter -- 2025-08-17
  327. HuggingFaceTB/SmolLM3-3B-Base -- 2025-08-16
  328. mistralai/Voxtral-Small-24B-2507 -- 2025-08-16
  329. TiTan - a tiny model for tags and titles -- 2025-08-16
  330. baidu/ERNIE-4.5-VL-424B-A47B-PT -- 2025-08-16
  331. MiniLM (BERT) embeddings in C from scratch -- 2025-08-16
  332. GENNAI CLI - A ReAct-based agent CLI -- 2025-08-16
  333. gptme v0.28.0 major release - agent CLI with local model support -- 2025-08-16
  334. WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling -- 2025-08-16
  335. character-ai/pipelining-sft -- 2025-08-15
  336. Compass-Thinker-7B Technical Report -- 2025-08-15
  337. Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning -- 2025-08-15
  338. SaaS Is Dead -- 2025-08-15
  339. PYX: The next step in Python packaging -- 2025-08-15
  340. [Editorial] GLM-4.5, enterprise use -- 2025-08-13
  341. Best local model with function calling? -- 2025-08-13
  342. agentica-org/DeepSWE-Preview -- 2025-08-13
  343. janhq/Jan-v1-4B-GGUF -- 2025-08-13
  344. gpt-oss jailbreak workflow -- 2025-08-11
  345. GPT-5 removed logprob support from the API - technical breakdown and implications -- 2025-08-11
  346. A model for pure text continuation (not chirpy little Q&A assistant)? -- 2025-08-11
  347. Suggestion for upgrading hardware for MOE inference and fine-tuning. -- 2025-08-09
  348. Best models under 16GB?? -- 2025-08-09
  349. Explicit tail calls are now available on Rust Nightly (become keyword) -- 2025-08-09
  350. HuggingFaceTB/SmolLM3-3B -- 2025-08-09
  351. mistralai/Magistral-Small-2507 -- 2025-08-09
  352. N8N + OpenWebUI -- 2025-08-08
  353. Create space-saving clones on macOS with Python -- 2025-08-08
  354. Experience with GLM-4.5-Air + claude code? -- 2025-08-08
  355. Just when you thought Qwen was done... -- 2025-08-08
  356. Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs -- 2025-08-08
  357. Context Management by Trimming Conversation -- 2025-08-06
  358. Exploiting Primacy Effect To Improve Large Language Models -- 2025-08-06
  359. Help: Qwen3-Coder + LM Studio + Continue.dev (VSCode) + Mac 64GB M3 Max — 500 Internal Server Error, Even After Unsloth Fix -- 2025-08-04
  360. CWC now supports kimi.com (K2) and chat.z.ai (GLM-4.5) to enable coding with top tier models at no cost -- 2025-08-04
  361. 🧠 ICM+DPO: Used Qwen3's coherent understanding to improve Gemma3 at math - cross-model capability transfer with zero supervision -- 2025-08-03
  362. NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining -- 2025-08-03
  363. Build an AI Shopping Assistant with Gradio MCP Servers -- 2025-08-01
  364. [Editorial] Alternative to vector db rag -- 2025-07-30
  365. This year’s best open-source models and most cost-effective models -- 2025-07-29
  366. I’m looking for multimodal image input support and uncensored LLM -- 2025-07-29
  367. nvidia/audio-flamingo-3 -- 2025-07-29
  368. mistralai/Voxtral-Mini-3B-2507 -- 2025-07-29
  369. ziangcao0312/PhysX-3D -- 2025-07-29
  370. UI/UX benchmark update 7/22: Newest Qwen models added, Qwen3 takes the lead in terms of win rate (though still early) -- 2025-07-28
  371. unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF -- 2025-07-28
  372. zai-org/GLM-4.5 -- 2025-07-28
  373. Tesslate/UIGEN-X-32B-0727 -- 2025-07-28
  374. had to fine-tune qwen since llama sucks at summarizing -- 2025-07-28
  375. orchestre-dev/ccproxy -- 2025-07-28
  376. [Editorial] neural networks don’t need to be giant to be powerful -- 2025-07-27
  377. Qwen/Qwen3-235B-A22B-Thinking-2507 -- 2025-07-27
  378. mistralai/Magistral-Small-2507 -- 2025-07-27
  379. cmdaltctr/claude-gemini-mcp-slim -- 2025-07-26
  380. RichardAtCT/claude-code-openai-wrapper -- 2025-07-26
  381. Freigeist - The new Vibe Coding Platform -- 2025-07-26
  382. From chaotic prompting to structured workflow: My Claude evolution -- 2025-07-24
  383. A Request for Comments (RFC) for MCP-alternative Universal Tool Calling Protocol (UTCP) was created -- 2025-07-22
  384. How to use the same context across LLMs and Agents -- 2025-07-22
  385. Has anyone actually ran VLAs locally and how good are they? -- 2025-07-22
  386. google/medsiglip-448 -- 2025-07-22
  387. Questions about AI for translation -- 2025-07-22
  388. microsoft/Phi-4-mini-flash-reasoning -- 2025-07-22
  389. LGAI-EXAONE/EXAONE-4.0-32B -- 2025-07-22
  390. Replacing thinking with tool usage enables reasoning in small language models -- 2025-07-22
  391. Struggling to Generate Polished UI with Claude Code -- 2025-07-20
  392. Skywork/Skywork-R1V3-38B -- 2025-07-20
  393. ByteDance-Seed/Seed-X-PPO-7B -- 2025-07-20
  394. Diffusion model support in llama.cpp. -- 2025-07-16
  395. GLM-4 MoE incoming -- 2025-07-16
  396. GeoArrow and GeoParquet, and the Future of Geospatial Data Analysis -- 2025-07-15
  397. Programming Affordances That Invite Mistakes -- 2025-07-14
  398. HuggingFaceTB/SmolLM3-3B-Base -- 2025-07-14
  399. OLMo 2 - a family of fully-open language models -- 2025-07-12
  400. google/videoprism -- 2025-07-12
  401. Why don’t we have a big torrent repo for open-source LLMs? -- 2025-07-12
  402. Local PDF Database searchable with ollama - best setup? -- 2025-07-12
  403. Tinyllama on old Mediatek G80 android device -- 2025-07-12
  404. I used Ollama to build a Cursor for PDFs -- 2025-07-12
  405. Advice on switching to LLM -- 2025-07-12
  406. Building the Hugging Face MCP Server -- 2025-07-11
  407. Support for the upcoming IBM Granite 4.0 has been merged into llama.cpp -- 2025-07-11
  408. support for Falcon-H1 model family has been merged into llama.cpp -- 2025-07-11
  409. fsndzomga/metadspy -- 2025-07-10
  410. osmosis-ai/Osmosis-Apply-1.7B -- 2025-07-10
  411. IntervitensInc/pangu-pro-moe-model -- 2025-07-10
  412. Continual Gradient Low-Rank Projection Fine-Tuning for LLMs -- 2025-07-10
  413. pola-rs/polars -- 2025-07-09
  414. Introducing an open source cross-platform graphical interface LLM client -- 2025-07-08
  415. llama-server vs llama python binding -- 2025-07-08
  416. Llama server completion not working correctly -- 2025-07-08
  417. introducing cocoindex - super simple etl to prepare data for ai, with dynamic index (ollama integrated) -- 2025-07-08
  418. Higher topk and num_ctx or map/reduce ? -- 2025-07-08
  419. Looking for an upgrade from Meta-Llama-3.1-8B-Instruct-Q4_K_L.gguf, especially for letter parsing. Last time I looked into this was a very long time ago (7 months!) What are the best models nowadays? -- 2025-07-08
  420. Best models by size? -- 2025-07-08
  421. Planning a 7–8B Model Benchmark on 8GB GPU — What Should I Test & Measure? -- 2025-07-08
  422. DLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching -- 2025-07-08
  423. Efficient MultiModal Data Pipeline -- 2025-07-08
  424. Are non-autoregressive models really faster than autoregressive ones after all the denoising steps? -- 2025-07-06
  425. Yuan-ManX/ComfyUI-OmniGen2 -- 2025-07-06
  426. Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 -- 2025-07-06
  427. Using local models with Void -- 2025-07-05
  428. Intel GPU vLLM Docker Compose Bootstrap with Phi-lthy4 on A770 -- 2025-07-05
  429. Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm -- 2025-07-05
  430. Gemma 3n fully available in the open-source ecosystem! -- 2025-07-05
  431. 5060ti 16gb or 9060xt 16gb for small llm server -- 2025-07-05
  432. Qwen3 models in MLX format! -- 2025-07-05
  433. Best local coding model right now? -- 2025-07-05
  434. Found a Web3 LLM That Actually Gets DeFi Right -- 2025-07-03
  435. apple/DiffuCoder-7B-cpGRPO -- 2025-07-03
  436. Hoshinonyaruko/Gensokyo-MCP -- 2025-07-01
  437. ahmadallobani/BaldHead -- 2025-06-29
  438. modelcontextprotocol/registry -- 2025-06-27
  439. 0-concordance of knotted surfaces and Alexander ideals -- 2025-06-26
  440. unfinishedtr/progressbar -- 2025-06-24
  441. wearyfurnitur/confiq -- 2025-06-24
  442. Show HN: Controlling 3D models with voice and hand gestures -- 2025-06-24
  443. open-webui/mcpo -- 2025-06-23
  444. Built a fully local Whisper + pyannote stack to replace Otter. Full diarisation, transcripts & summaries on GPU. -- 2025-06-23
  445. MiniMax latest open-sourcing LLM, MiniMax-M1 — setting new standards in long-context reasoning -- 2025-06-23
  446. Run qwen 30b-a3b on Android local with Alibaba MNN Chat -- 2025-06-23
  447. A new PDF translation tool -- 2025-06-23
  448. What Really Happens When You Ask a Cursor a Question with GitHub MCP Integrated -- 2025-06-23
  449. [Q] How to Speed Up Mistral 7B Inference in LM Studio? 31s/Chunk on RTX 3070 -- 2025-06-23
  450. Cyber security guys are about to become very on demand in the coming few years -- 2025-06-23
  451. The first big AI disaster is yet to happen -- 2025-06-23
  452. Trading with Claude, and writing your own MCP server -- 2025-06-23
  453. Ecne AI Podcast Generator - Update -- 2025-06-23
  454. Help me decide on hardware for LLMs -- 2025-06-23
  455. chungmin99/pyroki -- 2025-06-23
  456. nvidia/GR00T-N1.5-3B -- 2025-06-23
  457. nvidia/Cosmos-Predict2-2B-Text2Image -- 2025-06-22
  458. trendmicro/vision-one-mcp-server -- 2025-06-20
  459. Minidoracat/mcp-feedback-enhanced -- 2025-06-20
  460. meta-llama/Llama-3.1-8B-Instruct -- 2025-06-19
  461. MiniMaxAI/MiniMax-M1-80k -- 2025-06-19
  462. ckanthony/openapi-mcp -- 2025-06-18
  463. Ta0ing/MCP-SecurityTools -- 2025-06-18
  464. fluxions/vui -- 2025-06-13
  465. carbon-language/carbon-lang -- 2025-06-12
  466. CJackHwang/AIstudioProxyAPI -- 2025-06-11
  467. News publishers call Google's AI Mode 'theft' -- 2025-06-11
  468. typelevel/cats -- 2025-06-11
  469. wesm/pydata-book -- 2025-06-11
  470. jedisct1/openapi-mcp -- 2025-06-09
  471. arcee-ai/Homunculus -- 2025-06-04
  472. GitHub MCP exploited: Accessing private repositories via MCP -- 2025-06-01
  473. dipampaul17/KVSplit -- 2025-06-01
  474. facebook/OMol25 -- 2025-05-30
  475. Is Microsoft’s new Foundry Local going to be the “easy button” for running newer transformers models locally? -- 2025-05-28
  476. Cobolt is now available on Linux! 🎉 -- 2025-05-28
  477. Round Up: Current Best Local Models under 40B for Code & Tool Calling, General Chatting, Vision, and Creative Story Writing. -- 2025-05-28
  478. Best local model for M2 16gb MacBook Air for Analyzing Transcripts -- 2025-05-28
  479. Prompt Debugging -- 2025-05-28
  480. Not so Smart Agent (Ollama, Spring AI, MCP) -- 2025-05-28
  481. [Q] How can one get better at fixing models,training etc.? -- 2025-05-28
  482. FOSS - MCP Server generator from OpenAPI specification files (swagger/etapi) -- 2025-05-28
  483. Digital Payment System GNU Taler Gets Green Light to Operate in Switzerland -- 2025-05-28