PATTERN · 02 · LAB · IN PROGRESS

A personal knowledge base, built with an LLM.

Our version of the pattern Andrej Karpathy sketched, with the choices we made on retrieval, chunking, and the daily-use loop, plus what we'd carry into a client engagement (and what we wouldn't).

RAG · Personal tooling · After Karpathy
LAB · IN PROGRESS
This pattern is being written as we use it. The studio's own knowledge base is being rebuilt on the approach below; the full write-up will land here when we've used it daily for a quarter and can tell you what actually held up. Subscribe to the Monday Brief to get the full version when it ships.
THE SHAPE
Notes in. Embeddings out. An LLM as the search-and-synthesis layer.

The basic shape Karpathy sketched: your own writing (meeting notes, reading highlights, half-finished essays) gets chunked, embedded, and stored. A retrieval layer surfaces the relevant chunks for any prompt. An LLM synthesizes across them. The result is a personal Claude / GPT that knows what you've already thought about, not just what the public internet thinks.
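The shape above can be sketched in a few lines of Python. This is a minimal illustration, not the studio's actual build: the function names (`chunk`, `embed`, `build_index`, `retrieve`, `build_prompt`) are hypothetical, the "embedding" is a toy word-count vector standing in for a real embedding model, and the final LLM call is left out so the example runs with the standard library alone.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 60) -> list[str]:
    """Split on blank lines, then cap each chunk at max_words words."""
    chunks = []
    for para in re.split(r"\n\s*\n", text.strip()):
        words = para.split()
        for i in range(0, len(words), max_words):
            chunks.append(" ".join(words[i:i + max_words]))
    return chunks


def embed(text: str) -> Counter:
    """ASSUMPTION: toy stand-in for an embedding model (lowercase word counts)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(notes: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Chunk and embed every note; store (source, chunk, vector) rows."""
    index = []
    for source, text in notes.items():
        for piece in chunk(text):
            index.append((source, piece, embed(piece)))
    return index


def retrieve(index, query: str, k: int = 3) -> list[tuple[str, str]]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda row: cosine(q, row[2]), reverse=True)
    return [(source, piece) for source, piece, _ in ranked[:k]]


def build_prompt(query: str, hits: list[tuple[str, str]]) -> str:
    """Assemble the context an LLM would synthesize across (LLM call omitted)."""
    context = "\n\n".join(f"[{source}] {piece}" for source, piece in hits)
    return f"Answer using only these notes:\n\n{context}\n\nQuestion: {query}"


notes = {
    "meeting.md": "We agreed to ship the retrieval prototype in March.\n\n"
                  "Budget review moved to April.",
    "reading.md": "Chunking by paragraph keeps highlights readable.",
}
index = build_index(notes)
hits = retrieve(index, "when does the retrieval prototype ship", k=1)
prompt = build_prompt("when does the retrieval prototype ship", hits)
```

In a real build, `embed` would call whichever embedding model you use and the index would live in a vector store; the control flow (chunk, embed, store, retrieve, assemble a prompt) stays the same.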

Most personal-KB attempts fail at one of three places: the ingest is too painful to be daily; the retrieval pulls too much noise; the daily-use loop has no actual habit attached to it. The full write-up will walk through the choices we made at each of those three points and which ones held up after three months of daily use.

WHAT WE'D BRING TO A CLIENT ENGAGEMENT
The same three failure modes apply to client internal-KB builds. The ingest has to be where the team already writes (Slack, Notion, email). The retrieval has to be tuned per query type, not "one RAG fits all". The daily-use loop has to attach to a real workflow people already do (search, briefing, onboarding), not a new habit they have to learn.
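One way to picture "tuned per query type" is a small table of retrieval settings keyed by the workflow the question belongs to. The sketch below is an assumption-heavy illustration: the `RetrievalProfile` fields, the numeric values, and the keyword-based `classify` router are all hypothetical placeholders, not settings we have validated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalProfile:
    top_k: int                 # how many chunks to pass to the LLM
    min_score: float           # similarity floor, to cut noise
    sources: tuple[str, ...]   # which ingest sources to search


# ASSUMPTION: illustrative values only; real ones would come from evaluation.
PROFILES = {
    "search": RetrievalProfile(top_k=5, min_score=0.35,
                               sources=("slack", "notion", "email")),
    "briefing": RetrievalProfile(top_k=20, min_score=0.20,
                                 sources=("notion", "email")),
    "onboarding": RetrievalProfile(top_k=10, min_score=0.25,
                                   sources=("notion",)),
}


def classify(query: str) -> str:
    """Hypothetical keyword router; a real system might use a classifier."""
    q = query.lower()
    if any(w in q for w in ("brief", "summarize", "summary", "catch me up")):
        return "briefing"
    if any(w in q for w in ("onboard", "new hire", "getting started")):
        return "onboarding"
    return "search"


def profile_for(query: str) -> RetrievalProfile:
    """Pick retrieval settings for a query based on its inferred type."""
    return PROFILES[classify(query)]
```

For example, `profile_for("Summarize this week's client threads")` selects the wide, low-threshold briefing profile, while a plain lookup falls through to the narrower search profile.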
PRIOR ART · STANDING ON SHOULDERS
  • Andrej Karpathy: public posts on personal LLM tooling
  • Jerry Liu / LlamaIndex: retrieval patterns & chunking strategies
  • Hamel Husain: evaluation-driven development for RAG