Riffle - signposting for AI agents
I use a local Obsidian vault organised using PARA for storing all of the context and artefacts from my work. As the vault gets larger, I’ve started wondering how effective Claude or Gemini will be in finding the right context or related files for a specific question or project.
This led me to build riffle, a self-contained, single-binary CLI and MCP server
that watches, indexes and searches a vault of plain text files. I was curious
to see whether Claude or Gemini could use a hierarchical semantic (vector) index
of my files to quickly (and cheaply) find relevant content.
Before I get into the detail of this, there are some things worth noting. This is essentially RAG, which is very well-trodden territory. There are countless options available that do a far better job of this. That being said, I learn by experimenting, and this is a non-contrived, meaningful (to me) project in which to play with some interesting (again, to me) areas and ideas.
The shape of this project is guided by an interest in AI tooling - not the AI tools themselves (like Gemini or Claude), but the tooling that can be packed in and around them to enhance and extend them. Most of the interest seems to be around MCP servers, but I am still more interested in CLI tooling, especially if these tools can serve both human and agent needs equally well.
Then, onto riffle itself. The implementation relies on two core ideas. The first
is that in wiki-style documentation, knowledge bases or AI second brains,
semantically related information is typically stored together in folders rather
than in individual files. The second is that an LLM agent can reason more
efficiently with multiple low-token features than with a single large input.
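To make those two ideas a little more concrete, here is a minimal sketch of a folder-first search. It is an illustration under my own assumptions, not riffle's actual code: the folder vector is assumed to be a plain average of its file vectors, brute-force cosine similarity stands in for a real vector index, the three-dimensional "embeddings" are made up by hand, and names like `folder_centroids` and `search` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def parent(path):
    """Folder part of a vault-relative path ('.' for top-level files)."""
    return path.rsplit("/", 1)[0] if "/" in path else "."

def folder_centroids(file_vectors):
    """Average the file vectors in each folder into one folder vector (an assumption)."""
    grouped = {}
    for path, vec in file_vectors.items():
        grouped.setdefault(parent(path), []).append(vec)
    return {
        folder: [sum(col) / len(vecs) for col in zip(*vecs)]
        for folder, vecs in grouped.items()
    }

def search(query, file_vectors, top_folders=1, top_files=2):
    """Rank folders first, then rank only the files inside the best folders."""
    centroids = folder_centroids(file_vectors)
    best = sorted(
        centroids, key=lambda f: cosine(query, centroids[f]), reverse=True
    )[:top_folders]
    # Each hit is a handful of small fields rather than the full file content.
    hits = [
        {"path": p, "folder": parent(p), "score": round(cosine(query, v), 3)}
        for p, v in file_vectors.items()
        if parent(p) in best
    ]
    return sorted(hits, key=lambda h: h["score"], reverse=True)[:top_files]

# Hand-made toy vectors; a real index would get these from an embedding model.
VAULT = {
    "Projects/riffle/design.md": [0.9, 0.1, 0.0],
    "Projects/riffle/notes.md": [0.8, 0.2, 0.1],
    "Areas/health/running.md": [0.0, 0.1, 0.9],
    "Areas/health/sleep.md": [0.1, 0.0, 0.8],
}

results = search([1.0, 0.0, 0.0], VAULT)
for hit in results:
    print(hit)
```

The point of the sketch is the shape of the result: the agent gets a few short fields per hit (path, folder, score) and can decide which files are worth opening, rather than being handed whole documents up front.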
Practically, this means I get to experiment with HNSW, Merkle trees and, to keep everything local, ONNX.
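As a rough illustration of why a Merkle tree is useful for watching a vault, the sketch below hashes every file, derives each folder's hash from its children, and then compares two snapshots while skipping any folder whose hash is unchanged. This is my own minimal version, not riffle's implementation: the vault is assumed to be an in-memory nested dict of folder names to file bytes, the functions `merkle` and `changed_paths` are hypothetical, and deleted files are not reported.

```python
import hashlib

def merkle(node):
    """Return (hash, children) for a nested dict; children is None for a file."""
    if isinstance(node, bytes):
        return hashlib.sha256(b"file:" + node).hexdigest(), None
    children = {name: merkle(child) for name, child in node.items()}
    h = hashlib.sha256(b"dir:")
    for name in sorted(children):
        # A folder's hash depends on each child's name and hash.
        h.update(name.encode() + b"\0" + children[name][0].encode())
    return h.hexdigest(), children

def changed_paths(old, new, prefix=""):
    """List new or modified files, skipping subtrees whose hashes match."""
    if old is not None and old[0] == new[0]:
        return []  # identical subtree: nothing below needs re-indexing
    if new[1] is None:
        return [prefix]  # a file that is new or has different content
    old_children = old[1] if old is not None and old[1] is not None else {}
    found = []
    for name in sorted(new[1]):
        path = f"{prefix}/{name}" if prefix else name
        found += changed_paths(old_children.get(name), new[1][name], path)
    return found

before = {
    "Projects": {"riffle": {"design.md": b"v1", "notes.md": b"v1"}},
    "Areas": {"health": {"sleep.md": b"v1"}},
}
after = {
    "Projects": {"riffle": {"design.md": b"v1", "notes.md": b"v2"}},
    "Areas": {"health": {"sleep.md": b"v1", "running.md": b"v1"}},
}

print(changed_paths(merkle(before), merkle(after)))
```

Only the files returned here would need to be re-embedded, which is what makes this attractive for keeping an index in step with a vault that changes a little at a time.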
Follow along or use at your peril: github.com/allank/riffle.