RAG & Retrieval

Retrieval-augmented generation, embeddings, vector search, memory systems

143 articles across 54 editions

Articles

The 'papers, please' era of the internet will decimate your privacy -- 2026-06-27
The Swiss Federal Supreme Court is evaluating Heretic -- 2026-06-27
Heretic Grimoire 1.4: Takedown-Resilient Model Backup — 9KB Reproducible Manifests + IPFS Distribution -- 2026-06-16
z.ai Poll: MIT-Licensed Open Weights Are Losing -- 2026-06-16
How much of Thermo Fisher's antibody data has been manipulated? -- 2026-06-09
News Sites are Blocking Internet Archive over AI Scraping Fears -- 2026-06-09
OneDrive data now has an expiry date -- 2026-06-09
Dopamine Fracking -- 2026-06-09
HiDream-ai/HiDream-O1-Image -- 2026-06-03
How we index images for RAG -- 2026-06-03
The $500K AI Film That "Premiered at Cannes" Was Not in the Official Festival -- 2026-06-02
A 10 year old Xeon is all you need -- 2026-06-02
Inferencing at 10.33 t/s on Qwen 3.5 35B on a $300 laptop -- 2026-06-02
OSC: Hardware Efficient W4A4 Quantization via Outlier Separation in Channel Dimension -- 2026-06-02
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation -- 2026-06-02
Qwen3.6 huge quality gain from Q4 to Q6 for coding agent -- 2026-06-02
[Editorial] -- 2026-05-26
Codex Just Became THE BEST Long Running Agentic Harness -- 2026-05-26
Harnesses -- 2026-05-26
DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost -- 2026-05-26
Archon + Jira: Drag a Ticket, Get a Pull Request (Live Build) -- 2026-05-26
server: fix checkpoints creation by jacekpoplawski · Pull Request #22929 · ggml-org/llama.cpp -- 2026-05-26
[Editorial] -- 2026-05-26
[Editorial] Abliterix — Model Abliteration Tool -- 2026-04-24
[Editorial] Qwen3.6-35B-A3B Abliterix EGA Abliterated -- 2026-04-24
[Editorial] Abliteration Research Paper -- 2026-04-24
FBI used iPhone notification data to retrieve deleted Signal messages -- 2026-04-14
boratanrikulu/gecit - DPI bypass tool (eBPF on Linux, proxy on macOS) -- 2026-04-14
[Editorial] The Witness Stand: Code Quality Under AI -- 2026-03-31
Go Hard on Agents, Not on Your Filesystem (Stanford) -- 2026-03-31
Copilot edited an ad into my PR -- 2026-03-31
[Editorial] -- 2026-03-25
[Editorial] -- 2026-03-25
[Editorial] Your AI Forgot Everything Again — There's a Fix for That -- 2026-03-18
ylytdeng/wechat-decrypt -- 2026-03-18
Source code of Swedish e-government services has been leaked -- 2026-03-14
Bucketsquatting is (finally) dead -- 2026-03-14
E2E encrypted messaging on Instagram will no longer be supported after 8 May -- 2026-03-14
[Editorial] -- 2026-02-26
[Editorial] -- 2026-02-26
[Editorial] -- 2026-02-26
[Editorial] https://github.com/GMaN1911/claude-cognitive -- 2026-01-02
SA-RAG: Using spreading activation to improve multi-hop retrieval in RAG systems -- 2026-01-02
Is there a way to see what is trashing my context? -- 2026-01-02
[Tool] imesde: Zero-GPU, In-Memory Vector Engine for Real-Time Local RAG -- 2025-12-22
I built a Rust-based HTML-to-Markdown converter to save RAG tokens (Self-Hosted / API) -- 2025-12-22
This Week in Security: Hornet, Gogs, and Blinkenlights -- 2025-12-12
timwhitez/MDTCred -- 2025-12-10
[Tool] Tiny MCP server for local FAISS-based RAG (no external DB) -- 2025-12-09
P4nda0s/IDA-NO-MCP -- 2025-12-09
Code Embeddings vs Documentation Embeddings for RAG in Large-Scale Codebase Analysis -- 2025-12-08
smallevals - Tiny 0.6B Evaluation Models and a Local LLM Evaluation Framework -- 2025-12-05
I built a real-time RAG visualizer for pgvector because debugging invisible chunks is a nightmare -- 2025-12-03
Built a local MCP Hub + Memory Engine for Ollama — looking for testers -- 2025-12-03
RAG follow-ups not working — Qwen2.5 ignores previous context and gives unrelated answers -- 2025-11-26
Turns out LLM's can be consistent ..! -- 2025-11-19
AI Safety Evaluation! -- 2025-11-19
messkan/rag-chunk -- 2025-11-19
I was tired of guessing my RAG chunking strategy, so I built rag-chunk, a CLI to test it. -- 2025-11-17
vaspike/agtok -- 2025-11-17
Heartbeats in Distributed Systems -- 2025-11-17
Migrating from Gitea to Codeberg -- 2025-11-17
I built a RAG as a Service orchestrator for local models -- 2025-11-14
Fine-Tuning SLMs and Running Them Securely in Your Web Browser -- 2025-11-14
I took a deep dive into ChatGPT's web_search API to learn how to get my content cited. Here's what I found. -- 2025-11-14
[Editorial] https://arxiv.org/pdf/2506.21734 -- 2025-11-11
OPERA: A Reinforcement Learning-Enhanced Orchestrated Planner-Executor Architecture for Reasoning-Oriented Multi-Hop Retrieval -- 2025-11-11
Working on a list of open source tools for a Kubernetes ML stack -- 2025-11-10
I built a leaderboard for Rerankers -- 2025-11-10
LiquidAI/LFM2-ColBERT-350M -- 2025-11-10
[Editorial] Context Engineering Handbook -- 2025-11-03
Qwen3-VL-32B Q8 speeds in llama.cpp vs vLLM FP8 on a RTX PRO 6000 -- 2025-11-03
OSS alternative to Open WebUI - ChatGPT-like UI, API and CLI -- 2025-11-03
Looking for a RAG UI manager to meet our needs to replace Zapier -- 2025-11-03
LiquidAI/LFM2-1.2B-Extract -- 2025-11-03
Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. -- 2025-10-24
LiquidAI/LFM2-1.2B-RAG -- 2025-10-24
zai-org/GLM-4.6 -- 2025-10-24
[Editorial] Browsers you can socially engineer -- 2025-10-24
Update on Plans for Privacy Sandbox Technologies -- 2025-10-24
1r0BIT/TaskHound -- 2025-10-18
armai92/goauth -- 2025-10-18
Chinese gang used ArcGIS as a backdoor for a year – and no one noticed -- 2025-10-18
[Editorial] Sqlite vector -- 2025-10-17
Meta Superintelligence group publishes paper on new RAG technique -- 2025-10-17
Dolphin — analyze-then-parse document image model (open-source, ByteDance) -- 2025-10-06
Eclaire – Open-source, privacy-focused AI assistant for your data -- 2025-10-06
KeyKnowledgeRAG (K^2RAG): An Enhanced RAG method for improved LLM question-answering capabilities -- 2025-10-06
Granite 4 H Tiny Q8 in RTX 3090, It's a context king. -- 2025-10-06
A tiny receipt per AI run: κ (stress), Δhol (drift), and guards—in plain JSON. -- 2025-10-02
Microsoft Agent Framework (Preview): Making AI Agents Simple for Every Developer -- 2025-10-02
Stress-Testing RAG in Production: Retrieval Quality, Drift, and Hidden Costs -- 2025-09-30
Uncensored LLM -- 2025-09-23
Building RAG Systems at Enterprise Scale: Our Lessons and Challenges -- 2025-09-23
HyST: LLM-Powered Hybrid Retrieval over Semi-Structured Tabular Data -- 2025-09-19
TencentCloudADP/youtu-graphrag -- 2025-09-17
[Editorial] REFRAG: Rethinking RAG based Decoding -- 2025-09-17
I'm building local, open-source, fast, efficient, minimal, and extendible RAG library I always wanted to use -- 2025-09-03
Creating the brain behind dumb models -- 2025-09-03
Would a “Knowledge Coverage Audit” tool be useful for RAG/chatbot builders? -- 2025-08-30
Anyone have the deets on ROCM 7.0's 3x perf claims? -- 2025-08-19
Rust in 2025: Targeting foundational software -- 2025-08-19
[UPDATE] DocStrange - Structured data extraction from images/pdfs/docs -- 2025-08-15
Local RAG with 97% smaller index and Claude Code–compatible semantic search -- 2025-08-10
Best way to manage context/notes locally for API usage while optimizing token costs? -- 2025-07-26
Advice on choice of model -- 2025-07-26
Document processing -- 2025-07-26
Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support -- 2025-07-24
MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models -- 2025-07-24
Building an MCP Server and Client with FastMCP 2.0 -- 2025-07-24
muyuanlove/sensitive_info_extractor -- 2025-07-24
Wordle-like game using your photos and on-device Small Language Models (SLMs) -- 2025-07-24
Model to retrieve information from Knowledge. -- 2025-07-23
merve/smol-vision -- 2025-07-23
WhisPad (Note app, transcription, speaker diarization, AI style enhancements, mindmaps, chat with notes, etc) -- 2025-07-23
vidore/colqwen-omni-v0.1 -- 2025-07-21
Advancing Retrieval-Augmented Generation for Structured Enterprise and Internal Data -- 2025-07-21
Super fast local CPU file processing with static embeddings! -- 2025-07-21
Deep Research with local LLM and local documents -- 2025-07-10
WikipeQA: An evaluation dataset for both web-browsing agents and vector DB RAG systems -- 2025-07-10
Looking for advice. -- 2025-07-10
If NotebookLM were Agentic -- 2025-07-09
run-llama/notebookllama -- 2025-07-09
Tips for running a local RAG and llm? -- 2025-07-04
I made an LLM tool to let you search offline Wikipedia/StackExchange/DevDocs ZIM files (llm-tools-kiwix, works with Python & LLM cli) -- 2025-07-04
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective -- 2025-07-04
automated debugging using Ollama -- 2025-06-27
Knowledge Database Advise needed/ Local RAG for IT Asset Discovery - Best approach for varied data? -- 2025-06-27
Need feedback for a RAG using Ollama as background. -- 2025-06-27
Best tutorial for installing a local llm with GUI setup? -- 2025-06-27
Create 2 and 3-bit GPTQ quantization for Qwen3-235B-A22B? -- 2025-06-27
Ollama Frontend/GUI -- 2025-06-27
Newtonian Formulation of Attention: Treating Tokens as Interacting Masses? -- 2025-06-27
Gemini CLI: Open-source AI agent. Write code, debug, and automate tasks with Gemini 2.5 Pro with industry-leading high usage limits at no cost. -- 2025-06-27
A Plan for SIMD -- 2025-06-27
Squiggle: A simple programming language for intuitive probabilistic estimation -- 2025-06-27
I Built a Symbolic Cognitive System to Fix AI Drift — It’s Now Public (SCS 2.0) -- 2025-06-27
trendmicro/vision-one-mcp-server -- 2025-06-20
Minidoracat/mcp-feedback-enhanced -- 2025-06-20
showlab/OmniConsistency -- 2025-06-15
echo840/MonkeyOCR -- 2025-06-15
Qwen/Qwen3-Embedding-8B -- 2025-06-06
Sriram-PR/doc-scraper -- 2025-06-04