RAG & Retrieval

Retrieval-augmented generation, embeddings, vector search, memory systems

102 articles across 41 editions

Articles

  1. [Editorial] https://github.com/GMaN1911/claude-cognitive -- 2026-01-02
  2. SA-RAG: Using spreading activation to improve multi-hop retrieval in RAG systems -- 2026-01-02
  3. Is there a way to see what is trashing my context? -- 2026-01-02
  4. [Tool] imesde: Zero-GPU, In-Memory Vector Engine for Real-Time Local RAG -- 2025-12-22
  5. I built a Rust-based HTML-to-Markdown converter to save RAG tokens (Self-Hosted / API) -- 2025-12-22
  6. This Week in Security: Hornet, Gogs, and Blinkenlights -- 2025-12-12
  7. timwhitez/MDTCred -- 2025-12-10
  8. [Tool] Tiny MCP server for local FAISS-based RAG (no external DB) -- 2025-12-09
  9. P4nda0s/IDA-NO-MCP -- 2025-12-09
  10. Code Embeddings vs Documentation Embeddings for RAG in Large-Scale Codebase Analysis -- 2025-12-08
  11. smallevals - Tiny 0.6B Evaluation Models and a Local LLM Evaluation Framework -- 2025-12-05
  12. I built a real-time RAG visualizer for pgvector because debugging invisible chunks is a nightmare -- 2025-12-03
  13. Built a local MCP Hub + Memory Engine for Ollama — looking for testers -- 2025-12-03
  14. RAG follow-ups not working — Qwen2.5 ignores previous context and gives unrelated answers -- 2025-11-26
  15. Turns out LLM's can be consistent ..! -- 2025-11-19
  16. AI Safety Evaluation! -- 2025-11-19
  17. messkan/rag-chunk -- 2025-11-19
  18. I was tired of guessing my RAG chunking strategy, so I built rag-chunk, a CLI to test it. -- 2025-11-17
  19. vaspike/agtok -- 2025-11-17
  20. Heartbeats in Distributed Systems -- 2025-11-17
  21. Migrating from Gitea to Codeberg -- 2025-11-17
  22. I built a RAG as a Service orchestrator for local models -- 2025-11-14
  23. Fine-Tuning SLMs and Running Them Securely in Your Web Browser -- 2025-11-14
  24. I took a deep dive into ChatGPT's web_search API to learn how to get my content cited. Here's what I found. -- 2025-11-14
  25. [Editorial] https://arxiv.org/pdf/2506.21734 -- 2025-11-11
  26. OPERA: A Reinforcement Learning-Enhanced Orchestrated Planner-Executor Architecture for Reasoning-Oriented Multi-Hop Retrieval -- 2025-11-11
  27. Working on a list of open source tools for a Kubernetes ML stack -- 2025-11-10
  28. I built a leaderboard for Rerankers -- 2025-11-10
  29. LiquidAI/LFM2-ColBERT-350M -- 2025-11-10
  30. [Editorial] Context Engineering Handbook -- 2025-11-03
  31. Qwen3-VL-32B Q8 speeds in llama.cpp vs vLLM FP8 on a RTX PRO 6000 -- 2025-11-03
  32. OSS alternative to Open WebUI - ChatGPT-like UI, API and CLI -- 2025-11-03
  33. Looking for a RAG UI manager to meet our needs to replace Zapier -- 2025-11-03
  34. LiquidAI/LFM2-1.2B-Extract -- 2025-11-03
  35. Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. -- 2025-10-24
  36. LiquidAI/LFM2-1.2B-RAG -- 2025-10-24
  37. zai-org/GLM-4.6 -- 2025-10-24
  38. [Editorial] Browsers you can socially engineer -- 2025-10-24
  39. Update on Plans for Privacy Sandbox Technologies -- 2025-10-24
  40. 1r0BIT/TaskHound -- 2025-10-18
  41. armai92/goauth -- 2025-10-18
  42. Chinese gang used ArcGIS as a backdoor for a year – and no one noticed -- 2025-10-18
  43. [Editorial] Sqlite vector -- 2025-10-17
  44. Meta Superintelligence group publishes paper on new RAG technique -- 2025-10-17
  45. Dolphin — analyze-then-parse document image model (open-source, ByteDance) -- 2025-10-06
  46. Eclaire – Open-source, privacy-focused AI assistant for your data -- 2025-10-06
  47. KeyKnowledgeRAG (K^2RAG): An Enhanced RAG method for improved LLM question-answering capabilities -- 2025-10-06
  48. Granite 4 H Tiny Q8 in RTX 3090, It's a context king. -- 2025-10-06
  49. A tiny receipt per AI run: κ (stress), Δhol (drift), and guards—in plain JSON. -- 2025-10-02
  50. Microsoft Agent Framework (Preview): Making AI Agents Simple for Every Developer -- 2025-10-02
  51. Stress-Testing RAG in Production: Retrieval Quality, Drift, and Hidden Costs -- 2025-09-30
  52. Uncensored LLM -- 2025-09-23
  53. Building RAG Systems at Enterprise Scale: Our Lessons and Challenges -- 2025-09-23
  54. HyST: LLM-Powered Hybrid Retrieval over Semi-Structured Tabular Data -- 2025-09-19
  55. TencentCloudADP/youtu-graphrag -- 2025-09-17
  56. [Editorial] REFRAG: Rethinking RAG based Decoding -- 2025-09-17
  57. I'm building local, open-source, fast, efficient, minimal, and extendible RAG library I always wanted to use -- 2025-09-03
  58. Creating the brain behind dumb models -- 2025-09-03
  59. Would a “Knowledge Coverage Audit” tool be useful for RAG/chatbot builders? -- 2025-08-30
  60. Anyone have the deets on ROCM 7.0's 3x perf claims? -- 2025-08-19
  61. Rust in 2025: Targeting foundational software -- 2025-08-19
  62. [UPDATE] DocStrange - Structured data extraction from images/pdfs/docs -- 2025-08-15
  63. Local RAG with 97% smaller index and Claude Code–compatible semantic search -- 2025-08-10
  64. Best way to manage context/notes locally for API usage while optimizing token costs? -- 2025-07-26
  65. Advice on choice of model -- 2025-07-26
  66. Document processing -- 2025-07-26
  67. Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support -- 2025-07-24
  68. MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models -- 2025-07-24
  69. Building an MCP Server and Client with FastMCP 2.0 -- 2025-07-24
  70. muyuanlove/sensitive_info_extractor -- 2025-07-24
  71. Wordle-like game using your photos and on-device Small Language Models (SLMs) -- 2025-07-24
  72. Model to retrieve information from Knowledge. -- 2025-07-23
  73. merve/smol-vision -- 2025-07-23
  74. WhisPad (Note app, transcription, speaker diarization, AI style enhancements, mindmaps, chat with notes, etc) -- 2025-07-23
  75. vidore/colqwen-omni-v0.1 -- 2025-07-21
  76. Advancing Retrieval-Augmented Generation for Structured Enterprise and Internal Data -- 2025-07-21
  77. Super fast local CPU file processing with static embeddings! -- 2025-07-21
  78. Deep Research with local LLM and local documents -- 2025-07-10
  79. WikipeQA: An evaluation dataset for both web-browsing agents and vector DB RAG systems -- 2025-07-10
  80. Looking for advice. -- 2025-07-10
  81. If NotebookLM were Agentic -- 2025-07-09
  82. run-llama/notebookllama -- 2025-07-09
  83. Tips for running a local RAG and llm? -- 2025-07-04
  84. I made an LLM tool to let you search offline Wikipedia/StackExchange/DevDocs ZIM files (llm-tools-kiwix, works with Python & LLM cli) -- 2025-07-04
  85. Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective -- 2025-07-04
  86. automated debugging using Ollama -- 2025-06-27
  87. Knowledge Database Advise needed/ Local RAG for IT Asset Discovery - Best approach for varied data? -- 2025-06-27
  88. Need feedback for a RAG using Ollama as background. -- 2025-06-27
  89. Best tutorial for installing a local llm with GUI setup? -- 2025-06-27
  90. Create 2 and 3-bit GPTQ quantization for Qwen3-235B-A22B? -- 2025-06-27
  91. Ollama Frontend/GUI -- 2025-06-27
  92. Newtonian Formulation of Attention: Treating Tokens as Interacting Masses? -- 2025-06-27
  93. Gemini CLI: Open-source AI agent. Write code, debug, and automate tasks with Gemini 2.5 Pro with industry-leading high usage limits at no cost. -- 2025-06-27
  94. A Plan for SIMD -- 2025-06-27
  95. Squiggle: A simple programming language for intuitive probabilistic estimation -- 2025-06-27
  96. I Built a Symbolic Cognitive System to Fix AI Drift — It’s Now Public (SCS 2.0) -- 2025-06-27
  97. trendmicro/vision-one-mcp-server -- 2025-06-20
  98. Minidoracat/mcp-feedback-enhanced -- 2025-06-20
  99. showlab/OmniConsistency -- 2025-06-15
  100. echo840/MonkeyOCR -- 2025-06-15
  101. Qwen/Qwen3-Embedding-8B -- 2025-06-06
  102. Sriram-PR/doc-scraper -- 2025-06-04