Engram · Fast path
The fast path in front of your expensive calls.
Engram is a lightweight retrieval layer that sits between a query and the expensive work behind it: an LLM call, a database hit, an external API. When a query matches something Engram already holds, it returns that answer immediately and the costly call never happens. It is keyword-indexed, learns which answers actually get used, and is small and transparent by design.
The problem
Conversational systems ask the same things over and over, and every repeat pays full price: another inference, another query, another round trip.
The usual fix is a cache, but a naive cache matches exact strings and breaks on the first reworded question. A vector store answers every near-miss with an opaque similarity score you cannot read and cannot tune by hand.
You want the speed of a cache with retrieval you can actually reason about.
What it does
Engram sits in front of expensive computation. A sufficient match returns stored knowledge and skips the downstream call entirely.
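To make the pattern concrete, here is a minimal sketch of a keyword-matched fast path. It is an illustration only, not Engram's actual API: the `FastPath` class, the `keywords` helper, the stopword list, the Jaccard overlap score, and the 0.6 threshold are all assumptions chosen for the example.

```python
from __future__ import annotations

import re
from typing import Callable, Optional

# Assumed stopword list for the example; a real setup would tune this by hand.
STOPWORDS = {"a", "an", "the", "is", "of", "what", "in", "to"}


def keywords(text: str) -> set[str]:
    """Lowercase the text, split on non-alphanumerics, drop stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


class FastPath:
    """Hypothetical fast path: answer from stored keywords or fall through."""

    def __init__(self, threshold: float = 0.6) -> None:
        self.threshold = threshold  # assumed cutoff for a "sufficient match"
        self.entries: list[tuple[set[str], str]] = []

    def store(self, query: str, answer: str) -> None:
        self.entries.append((keywords(query), answer))

    def lookup(self, query: str) -> Optional[str]:
        q = keywords(query)
        best_score, best_answer = 0.0, None
        for kws, answer in self.entries:
            union = q | kws
            # Jaccard overlap: shared keywords divided by all keywords.
            score = len(q & kws) / len(union) if union else 0.0
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def answer(self, query: str, expensive: Callable[[str], str]) -> str:
        hit = self.lookup(query)
        if hit is not None:
            return hit  # sufficient match: the costly call never happens
        result = expensive(query)  # miss: pay once, then remember the answer
        self.store(query, result)
        return result


calls: list[str] = []


def slow_call(query: str) -> str:
    """Stand-in for an LLM call, database hit, or external API."""
    calls.append(query)
    return f"answer to: {query}"


cache = FastPath()
first = cache.answer("What is the capital of France?", slow_call)
second = cache.answer("capital of France", slow_call)  # reworded, still a hit
third = cache.answer("What is the population of Japan?", slow_call)  # a miss
```

In this sketch the reworded second query reduces to the same keywords as the first, so `slow_call` runs only for the first and third queries. An exact-string cache would have missed the second one.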
A follow-up such as "what is its population" resolves against what came before.
Why it’s different
A vector store hides its judgment inside an embedding distance you have to trust. Engram is deliberately legible: the reason it returned an answer is something you can read, predict, and tune by hand.
Keyword retrieval you can read and tune, not a black-box embedding distance.
Sharpens on what actually gets retrieved, rather than guessing at relevance.
A library and a command-line tool you run yourself, not a service to stand up.
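One way to picture a score you can read, predict, and tune is a per-keyword weight plus a usage counter. The sketch below is a guess at the general shape, not Engram's real data model: `Entry`, `explain`, `record_use`, and the 0.1-per-use boost are hypothetical names and values picked for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stored answer with hand-tunable keyword weights."""

    answer: str
    weights: dict[str, float]  # keyword -> weight you can edit by hand
    uses: int = 0  # how many times this answer was actually returned


def explain(entry: Entry, query_terms: set[str]) -> dict:
    """Return every part of the score so the match can be read, not trusted."""
    matched = {k: w for k, w in entry.weights.items() if k in query_terms}
    base = sum(matched.values())
    boost = 1.0 + 0.1 * entry.uses  # assumed reinforcement rule
    return {"matched": matched, "base": base, "boost": boost, "score": base * boost}


def record_use(entry: Entry) -> None:
    """Sharpen on real retrievals: each returned answer counts as one use."""
    entry.uses += 1


entry = Entry(answer="Paris", weights={"capital": 1.0, "france": 2.0})
before = explain(entry, {"capital", "france", "please"})

record_use(entry)
after = explain(entry, {"capital", "france", "please"})

entry.weights["capital"] = 0.5  # tune by hand
tuned = explain(entry, {"capital", "france", "please"})
```

Each report lists which keywords matched and how much each contributed, so a surprising result can be traced to a specific weight and changed directly.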
You own it
Engram is a library and a CLI, not a service you have to host. State persists as plain JSON you can inspect, version, and move, and nothing leaves your environment. It is Apache-2.0 licensed, so you can read the source, fork it, and ship it.
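Plain JSON state is easy to picture. The sketch below shows one way such state could be written and read back using only the Python standard library. The `save_state` and `load_state` helpers and the `entries` schema are assumptions for illustration, not Engram's actual file format.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(state: dict, path: Path) -> None:
    """Write state as indented, key-sorted JSON so diffs stay readable."""
    # Write to a temp file in the same directory, then rename over the
    # target, so a crash mid-write cannot leave a half-written file.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2, sort_keys=True)
        f.write("\n")
    os.replace(tmp, path)


def load_state(path: Path) -> dict:
    """Load state, or start empty if the file does not exist yet."""
    if not path.exists():
        return {"entries": []}  # hypothetical schema
    with path.open(encoding="utf-8") as f:
        return json.load(f)
```

Sorted keys and fixed indentation keep the file stable between writes, which is what makes it practical to version alongside your code.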
Available now · Apache-2.0
Engram is open source and usable as a library or from the command line. Read the source, run it yourself, and skip the calls you have already paid for.