Step 4: The Write Path — Give Your Agent Long-Term Memory

Extract durable facts

Not every sentence is worth remembering. "What's the weather?" is transient; "I'm a backend engineer who can't use cloud vendors" is durable. We ask the local model to do the triage and return structured triples. Forcing JSON output (format="json") keeps it parseable.

Add to memory.py:

import json
 
CHAT_MODEL = "llama3.2"
 
EXTRACT_PROMPT = """Extract only DURABLE facts about the user from the message.
Durable = stable identity, location, role, preferences, constraints, projects.
Ignore transient chit-chat, questions, and one-off requests.
Return JSON: {{"facts": [{{"predicate": "...", "value": "..."}}]}}
Use short, reusable predicate names (name, city, role, preference, constraint, project).
 
Message: {msg}"""
 
def extract_facts(message: str) -> list[dict]:
    r = ollama.chat(
        model=CHAT_MODEL,
        messages=[{"role": "user", "content": EXTRACT_PROMPT.format(msg=message)}],
        format="json",
    )
    try:
        return json.loads(r["message"]["content"]).get("facts", [])
    except (json.JSONDecodeError, KeyError):
        return []

Upsert with invalidation

This is the heart of the build. Before inserting a new value for a predicate, we close the current one — stamp its valid_to. The old fact stays in the table (history), but it's no longer valid_to IS NULL, so recall from Step 3 will never surface it again.

from datetime import datetime, timezone
 
def now_iso() -> str:
    return datetime.now(timezone.utc).isoformat()
 
def write_fact(con, predicate: str, value: str,
               session: int, subject: str = "user") -> None:
    ts = now_iso()
    # 1. retire the current value for this (subject, predicate), if any
    con.execute(
        "UPDATE facts SET valid_to = ?"
        " WHERE subject = ? AND predicate = ? AND valid_to IS NULL",
        (ts, subject, predicate),
    )
    # 2. insert the new current value, with its embedding
    emb = to_blob(embed(f"{subject} {predicate} {value}"))
    con.execute(
        "INSERT INTO facts(subject,predicate,value,embedding,valid_from,valid_to,session)"
        " VALUES(?,?,?,?,?,NULL,?)",
        (subject, predicate, value, emb, ts, session),
    )
    con.commit()

Note we don't guard against writing an identical value — in practice you'd skip the write if value matches the current one, to avoid churning identical rows. Left out here for clarity; it's a two-line check.

Watch the move happen

if __name__ == "__main__":
    con = connect()
    write_fact(con, "city", "Toronto", session=1)
    write_fact(con, "city", "Berlin", session=8)   # the move
    print("current:", recall(con, "where do I live?", k=1))
    print("history:")
    for row in con.execute(
        "SELECT value, valid_to FROM facts WHERE predicate='city' ORDER BY id"
    ):
        print("  ", row)

$ python memory.py
current: [('user', 'city', 'Berlin')]
history:
   ('Toronto', '2026-...T...')   # closed
   ('Berlin', None)              # current

Toronto isn't gone — it's closed. You could answer "where did she used to live?" from history if you wanted. But for "where do I live?", recall only sees Berlin. That's the session-15 update test, passing at the data layer.

Reference: Ollama structured outputs (JSON mode) · ollama-python · Agent Long-Term Memory — §03 Writing it down