RAG & Retrieval
Retrieval-augmented generation, embeddings, vector search, memory systems
102 articles across 41 editions
Articles
- [Editorial] https://github.com/GMaN1911/claude-cognitive -- 2026-01-02
- SA-RAG: Using spreading activation to improve multi-hop retrieval in RAG systems -- 2026-01-02
- Is there a way to see what is trashing my context? -- 2026-01-02
- [Tool] imesde: Zero-GPU, In-Memory Vector Engine for Real-Time Local RAG -- 2025-12-22
- I built a Rust-based HTML-to-Markdown converter to save RAG tokens (Self-Hosted / API) -- 2025-12-22
- This Week in Security: Hornet, Gogs, and Blinkenlights -- 2025-12-12
- timwhitez/MDTCred -- 2025-12-10
- [Tool] Tiny MCP server for local FAISS-based RAG (no external DB) -- 2025-12-09
- P4nda0s/IDA-NO-MCP -- 2025-12-09
- Code Embeddings vs Documentation Embeddings for RAG in Large-Scale Codebase Analysis -- 2025-12-08
- smallevals - Tiny 0.6B Evaluation Models and a Local LLM Evaluation Framework -- 2025-12-05
- I built a real-time RAG visualizer for pgvector because debugging invisible chunks is a nightmare -- 2025-12-03
- Built a local MCP Hub + Memory Engine for Ollama — looking for testers -- 2025-12-03
- RAG follow-ups not working — Qwen2.5 ignores previous context and gives unrelated answers -- 2025-11-26
- Turns out LLM's can be consistent ..! -- 2025-11-19
- AI Safety Evaluation! -- 2025-11-19
- messkan/rag-chunk -- 2025-11-19
- I was tired of guessing my RAG chunking strategy, so I built rag-chunk, a CLI to test it. -- 2025-11-17
- vaspike/agtok -- 2025-11-17
- Heartbeats in Distributed Systems -- 2025-11-17
- Migrating from Gitea to Codeberg -- 2025-11-17
- I built a RAG as a Service orchestrator for local models -- 2025-11-14
- Fine-Tuning SLMs and Running Them Securely in Your Web Browser -- 2025-11-14
- I took a deep dive into ChatGPT's web_search API to learn how to get my content cited. Here's what I found. -- 2025-11-14
- [Editorial] https://arxiv.org/pdf/2506.21734 -- 2025-11-11
- OPERA: A Reinforcement Learning-Enhanced Orchestrated Planner-Executor Architecture for Reasoning-Oriented Multi-Hop Retrieval -- 2025-11-11
- Working on a list of open source tools for a Kubernetes ML stack -- 2025-11-10
- I built a leaderboard for Rerankers -- 2025-11-10
- LiquidAI/LFM2-ColBERT-350M -- 2025-11-10
- [Editorial] Context Engineering Handbook -- 2025-11-03
- Qwen3-VL-32B Q8 speeds in llama.cpp vs vLLM FP8 on a RTX PRO 6000 -- 2025-11-03
- OSS alternative to Open WebUI - ChatGPT-like UI, API and CLI -- 2025-11-03
- Looking for a RAG UI manager to meet our needs to replace Zapier -- 2025-11-03
- LiquidAI/LFM2-1.2B-Extract -- 2025-11-03
- Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. -- 2025-10-24
- LiquidAI/LFM2-1.2B-RAG -- 2025-10-24
- zai-org/GLM-4.6 -- 2025-10-24
- [Editorial] Browsers you can socially engineer -- 2025-10-24
- Update on Plans for Privacy Sandbox Technologies -- 2025-10-24
- 1r0BIT/TaskHound -- 2025-10-18
- armai92/goauth -- 2025-10-18
- Chinese gang used ArcGIS as a backdoor for a year – and no one noticed -- 2025-10-18
- [Editorial] Sqlite vector -- 2025-10-17
- Meta Superintelligence group publishes paper on new RAG technique -- 2025-10-17
- Dolphin — analyze-then-parse document image model (open-source, ByteDance) -- 2025-10-06
- Eclaire – Open-source, privacy-focused AI assistant for your data -- 2025-10-06
- KeyKnowledgeRAG (K^2RAG): An Enhanced RAG method for improved LLM question-answering capabilities -- 2025-10-06
- Granite 4 H Tiny Q8 in RTX 3090, It's a context king. -- 2025-10-06
- A tiny receipt per AI run: κ (stress), Δhol (drift), and guards—in plain JSON. -- 2025-10-02
- Microsoft Agent Framework (Preview): Making AI Agents Simple for Every Developer -- 2025-10-02
- Stress-Testing RAG in Production: Retrieval Quality, Drift, and Hidden Costs -- 2025-09-30
- Uncensored LLM -- 2025-09-23
- Building RAG Systems at Enterprise Scale: Our Lessons and Challenges -- 2025-09-23
- HyST: LLM-Powered Hybrid Retrieval over Semi-Structured Tabular Data -- 2025-09-19
- TencentCloudADP/youtu-graphrag -- 2025-09-17
- [Editorial] REFRAG: Rethinking RAG based Decoding -- 2025-09-17
- I'm building local, open-source, fast, efficient, minimal, and extendible RAG library I always wanted to use -- 2025-09-03
- Creating the brain behind dumb models -- 2025-09-03
- Would a “Knowledge Coverage Audit” tool be useful for RAG/chatbot builders? -- 2025-08-30
- Anyone have the deets on ROCM 7.0's 3x perf claims? -- 2025-08-19
- Rust in 2025: Targeting foundational software -- 2025-08-19
- [UPDATE] DocStrange - Structured data extraction from images/pdfs/docs -- 2025-08-15
- Local RAG with 97% smaller index and Claude Code–compatible semantic search -- 2025-08-10
- Best way to manage context/notes locally for API usage while optimizing token costs? -- 2025-07-26
- Advice on choice of model -- 2025-07-26
- Document processing -- 2025-07-26
- Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support -- 2025-07-24
- MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models -- 2025-07-24
- Building an MCP Server and Client with FastMCP 2.0 -- 2025-07-24
- muyuanlove/sensitive_info_extractor -- 2025-07-24
- Wordle-like game using your photos and on-device Small Language Models (SLMs) -- 2025-07-24
- Model to retrieve information from Knowledge. -- 2025-07-23
- merve/smol-vision -- 2025-07-23
- WhisPad (Note app, transcription, speaker diarization, AI style enhancements, mindmaps, chat with notes, etc) -- 2025-07-23
- vidore/colqwen-omni-v0.1 -- 2025-07-21
- Advancing Retrieval-Augmented Generation for Structured Enterprise and Internal Data -- 2025-07-21
- Super fast local CPU file processing with static embeddings! -- 2025-07-21
- Deep Research with local LLM and local documents -- 2025-07-10
- WikipeQA: An evaluation dataset for both web-browsing agents and vector DB RAG systems -- 2025-07-10
- Looking for advice. -- 2025-07-10
- If NotebookLM were Agentic -- 2025-07-09
- run-llama/notebookllama -- 2025-07-09
- Tips for running a local RAG and llm? -- 2025-07-04
- I made an LLM tool to let you search offline Wikipedia/StackExchange/DevDocs ZIM files (llm-tools-kiwix, works with Python & LLM cli) -- 2025-07-04
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective -- 2025-07-04
- automated debugging using Ollama -- 2025-06-27
- Knowledge Database Advise needed/ Local RAG for IT Asset Discovery - Best approach for varied data? -- 2025-06-27
- Need feedback for a RAG using Ollama as background. -- 2025-06-27
- Best tutorial for installing a local llm with GUI setup? -- 2025-06-27
- Create 2 and 3-bit GPTQ quantization for Qwen3-235B-A22B? -- 2025-06-27
- Ollama Frontend/GUI -- 2025-06-27
- Newtonian Formulation of Attention: Treating Tokens as Interacting Masses? -- 2025-06-27
- Gemini CLI: Open-source AI agent. Write code, debug, and automate tasks with Gemini 2.5 Pro with industry-leading high usage limits at no cost. -- 2025-06-27
- A Plan for SIMD -- 2025-06-27
- Squiggle: A simple programming language for intuitive probabilistic estimation -- 2025-06-27
- I Built a Symbolic Cognitive System to Fix AI Drift — It’s Now Public (SCS 2.0) -- 2025-06-27
- trendmicro/vision-one-mcp-server -- 2025-06-20
- Minidoracat/mcp-feedback-enhanced -- 2025-06-20
- showlab/OmniConsistency -- 2025-06-15
- echo840/MonkeyOCR -- 2025-06-15
- Qwen/Qwen3-Embedding-8B -- 2025-06-06
- Sriram-PR/doc-scraper -- 2025-06-04