Step 7: The Session-15 Test — Give Your Agent Long-Term Memory

The test

This is the payoff. session15_test.py seeds six facts in session 1, moves the user in session 8, and interrogates the store in session 15 — no LLM in the assertions, just the memory layer, so the result is deterministic.

# session15_test.py
from memory import connect, write_fact, recall
 
con = connect("test.db")
 
# --- Session 1: six things about the user ---
session1 = [
    ("name", "Maya"),
    ("role", "backend engineer"),
    ("city", "Toronto"),
    ("project", "the billing rewrite"),
    ("preference", "terse answers, no preamble"),
    ("constraint", "no cloud vendors"),
]
for pred, val in session1:
    write_fact(con, pred, val, session=1)
 
# --- Session 8: one fact changes ---
write_fact(con, "city", "Berlin", session=8)
 
# --- Session 15: ask ---
print("Q: Where do I live?")
print("   ->", recall(con, "where do I live?", k=1))
 
print("\nQ: What do you remember about me?")
for s, p, v in recall(con, "tell me everything about the user", k=6):
    print(f"   - {p}: {v}")

$ python session15_test.py
Q: Where do I live?
   -> [('user', 'city', 'Berlin')]
 
Q: What do you remember about me?
   - name: Maya
   - role: backend engineer
   - city: Berlin
   - project: the billing rewrite
   - preference: terse answers, no preamble
   - constraint: no cloud vendors

Both questions pass. Six facts recalled, and "where do I live?" returns Berlin, not the stale Toronto — because the write path in Step 4 closed the Toronto row when the move came in. That single UPDATE ... SET valid_to is the difference between a memory that searches the past and one that knows the present.

What you built

A temporal fact store in one SQLite file — facts with validity intervals, so change is a first-class operation.
An extract → embed → upsert write path that keeps only durable facts and retires stale ones.
A valid-only recall that can't surface a fact that's no longer true.
A rolling-summary layer for the fuzzy context the facts don't hold.
A memory manager that runs all four operations around a normal agent turn — fully offline via Ollama.

Run the session-15 test against your own code and it passes the exact scenario the companion drip uses to separate the strategies.

Where to take it next

Skip identical writes. Guard write_fact so re-stating the same value doesn't churn a new row.
Confidence & decay. Add a confidence column; let low-confidence facts expire on a schedule rather than living forever.
Scale the recall. Past ~10k facts, replace the Python cosine scan with sqlite-vec — same recall() signature, an indexed ANN search underneath.
Learn the write policy. The frontier direction from the drip: instead of a hand-written extractor, train when to write and what to keep with RL, à la MemAgent. Overkill for a personal agent, essential at scale.
Share it as a tool. Wrap recall and write_fact behind an MCP server so any agent — Claude Code, your own harness — can read and write the same memory.

The memory layer is the seam everything else hangs off. Build it once, and every session after the first starts warm.

Reference: Agent Long-Term Memory (drip) · sqlite-vec · MemAgent (RL-trained memory) · Build Your Own MCP Server (blueprint)