TencentDB‑Agent‑Memory: From a Plugin Function to a MEMORY.md Document

Below is a comprehensive MEMORY.md that translates the design philosophy of TencentDB‑Agent‑Memory into a universal, file‑based constraint specification. This MEMORY.md is meant to be placed in the root of any project once and then read at the start of each session by any AI coding agent (Claude Code, Cline, Windsurf, DeepSeek TUI, Cursor, etc.). It borrows the layered, write‑once/trace‑forever approach from the official plugin, with the aim of making cross‑session memory durable, recoverable, and contextually accurate.

Why `MEMORY.md`: the plugin is not enough for most agents

The official TencentDB‑Agent‑Memory (GitHub, Introduction) is a runtime plugin that executes context offloading and automatic memory extraction. A MEMORY.md file cannot execute code, but it can instruct any agent to simulate the same layered memory behavior by:

Following a fixed file layout
Writing new facts/decisions into the right file
Reading only the lightweight map before acting, then drilling down when needed

This approach works for all agents that can read/write project files and respect a MEMORY.md directive. DeepSeek TUI already supports user-level memory via ~/.deepseek/memory.md [4†L7-L8]; Claude Code uses CLAUDE.md documents, Cursor uses .cursorrules documents, and Cline uses .clinerules documents [11†L31-L35].

File layout: mapping TencentDB‑Agent‑Memory layers to files

The official agent memory architecture uses L0–L3 layers [5†L5-L7][6†L11-L12][8†L19-L21]:

L0 – raw dialogue / tool outputs – everything kept as‑is
L1 – atomic facts (preferences, constraints, decisions)
L2 – scene chunks (grouped by task/project)
L3 – user profile (stable long‑term traits)

We map these layers to a small directory structure that any agent can follow:

memory-bank/
├── MEMORY.md              (L1+L3 combined – the “always‑read” file)
├── scene_<task_id>.md     (L2 – one file per active task, created/deleted)
└── raw/                   (L0 – optional, for long tool outputs)
    └── turn_<timestamp>.md
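The layout above could be created once per project with a small script. The following is a minimal sketch, not part of the official plugin: the function name `scaffold_memory_bank` and the empty section template are assumptions chosen to match the sample MEMORY.md in this document.

```python
from pathlib import Path

# Assumption: an empty skeleton with the three sections used in this document.
MEMORY_TEMPLATE = """# User Profile (L3)

# Atomic Facts (L1)

# Active Scenes (L2 pointers)
"""


def scaffold_memory_bank(project_root: Path) -> Path:
    """Create memory-bank/, memory-bank/raw/ and an empty MEMORY.md if missing."""
    bank = project_root / "memory-bank"
    (bank / "raw").mkdir(parents=True, exist_ok=True)
    memory_file = bank / "MEMORY.md"
    if not memory_file.exists():  # never overwrite memory that already exists
        memory_file.write_text(MEMORY_TEMPLATE, encoding="utf-8")
    return bank
```

Calling `scaffold_memory_bank(Path("."))` a second time leaves an existing MEMORY.md untouched, which matches the append-only spirit of the design.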

L0 on disk: when `raw/` is populated

Tool outputs longer than ~800 tokens or any piece of data that must be kept bit‑perfect go into a dedicated file under raw/. The main MEMORY.md only stores a short ref: pointer to that file. This is the exact “context offloading” idea of the plugin: the context window stays lean, but nothing is ever thrown away [9†L16-L24].
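The offloading rule can be sketched as a single helper. This is an illustration under stated assumptions, not the plugin's implementation: token counts are approximated as four characters per token (a crude heuristic, not a real tokenizer), and `offload_if_long` is a hypothetical name.

```python
from pathlib import Path

TOKEN_THRESHOLD = 800  # the "~800 tokens" rule from the text
CHARS_PER_TOKEN = 4    # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN


def offload_if_long(text: str, bank: Path, name: str) -> str:
    """Return short text unchanged; write long text to raw/ and return a pointer."""
    if estimate_tokens(text) <= TOKEN_THRESHOLD:
        return text
    raw_dir = bank / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)
    (raw_dir / name).write_text(text, encoding="utf-8")  # full output is kept verbatim
    return f"ref: raw/{name}"
```

Only the returned string would go into the agent's context or a scene file; the complete output stays on disk.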

`MEMORY.md` (L1+L3) – the always‑read file

This is the only file the agent is required to read on every session start. It contains durable facts (L1) and your user profile (L3). Keep it short (30–60 lines)—pre‑seed only what changes how the agent behaves [2†L18-L21].

# User Profile (L3)


- Primary language: Python 3.11+, TypeScript 5.x

- Code style: Black + isort (Python), Prettier + ESLint (TypeScript)

- Documentation: docstring‑driven, mkdocs for external docs

- Testing: pytest + coverage (min 85%)
# Atomic Facts (L1) – Do not modify this section manually.

# The agent appends new facts at the end of each session.
- 2026‑04‑15: We switched from FastAPI to Litestar because of better lifespan hooks.

- 2026‑04‑18: User strongly prefers `async` for all I/O – verified with actual performance profile.

- 2026‑04‑20: Redis is the chosen state backend; memcached is explicitly rejected.

- 2026‑05‑01: The `/metrics` endpoint must be public (no auth) for Prometheus scraping.

- 2026‑05‑05: DO NOT use `pydantic` <2.5 – the `model_config` bug breaks nested models.
# Active Scenes (L2 pointers)

- active_scene: 2026‑05‑15_auth_refactor → see memory-bank/scene_2026-05-15_auth_refactor.md
- active_scene: 2026‑05‑10_logging_upgrade → see memory-bank/scene_2026-05-10_logging_upgrade.md

Agent rule: Every time the agent learns a new fact or decision (a constraint, a rejected approach, a confirmed working pattern), it must append a new line to the “Atomic Facts” section with the current date. No overwriting, no deleting – durable traceability.
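The append-only rule could be automated with a small helper. The sketch below assumes the section order of the sample file (Atomic Facts followed by Active Scenes); `append_fact` and `NEXT_SECTION` are hypothetical names introduced for illustration.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Assumption: this heading follows the Atomic Facts section, as in the sample file.
NEXT_SECTION = "# Active Scenes"


def append_fact(memory_file: Path, fact: str, today: Optional[date] = None) -> str:
    """Insert a dated fact as the last entry of the Atomic Facts section."""
    today = today or date.today()
    entry = f"- {today.isoformat()}: {fact}"
    lines = memory_file.read_text(encoding="utf-8").splitlines()
    # Insert just before the next section heading, or at the end if it is absent.
    idx = next(
        (i for i, line in enumerate(lines) if line.startswith(NEXT_SECTION)),
        len(lines),
    )
    # Step back over blank lines so the entry sits right below the previous fact.
    while idx > 0 and not lines[idx - 1].strip():
        idx -= 1
    lines.insert(idx, entry)  # existing lines are never edited or removed
    memory_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return entry
```

The helper only ever inserts a line, so earlier facts remain in place and the history stays traceable.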

Scene files (L2) – one per active task

Each active task (or project) gets its own scene_<id>.md under memory-bank/. It acts as the Mermaid task canvas—a lightweight, structured map of the task state, dependencies, and progress exactly as the official plugin does [9†L11-L15].

# Scene: Authentication Refactor
**Created:** 2026‑05‑15
**Scope:** `auth/` and `middleware/`
**Target:** Replace JWT with OAuth2 + refresh rotation


## Task Canvas (Mermaid flowchart)
```mermaid
flowchart TD
    A[Analyze current jwt.py] --> B[Design OAuth2 schema]
    B --> C[Implement refresh table in PostgreSQL]
    B --> D[Rewrite middleware]
    C --> E[Integration test]
    D --> E
    E --> F[Performance comparison]
```
## Current execution status
| Node | Status | ref |
|------|--------|-----|
| A | Done | raw/2026-05-15_analysis.md |
| B | In progress | — |
| C | Blocked (needs table permissions) | raw/pg_request.md |
| D | Pending | — |
| E | Pending | — |
| F | Pending | — |
## Decisions made within this scene
- Use `httpx` for OAuth2 discovery endpoints (the synchronous `requests` library stalls under concurrency).

- Store refresh tokens in a separate `refresh_tokens` table with `ON DELETE CASCADE`.

- Keep existing session‑id logic; only replace the credential layer.
## Open questions

- Should we rotate refresh tokens on every access, or only after reaching 80% of lifetime?

Agent rule: Before executing any step, the agent must read the current scene’s status table and the Mermaid canvas. It updates the status after completing a node, appends new decisions to “Decisions made,” and creates a ref: pointer to any raw output file that exceeds ~800 tokens.
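Updating the status table after a node completes can be sketched as a pure text transformation. This is an assumption-laden illustration: it expects the three-column table shown in the scene sample, and `update_status` is a hypothetical helper, not a plugin API.

```python
from typing import List, Optional


def _cells(line: str) -> List[str]:
    """Split a Markdown table row into stripped cells (empty list if not a row)."""
    stripped = line.strip()
    if not stripped.startswith("|"):
        return []
    return [cell.strip() for cell in stripped.strip("|").split("|")]


def update_status(scene_text: str, node: str, status: str,
                  ref: Optional[str] = None) -> str:
    """Rewrite the row for `node`; all other lines are returned unchanged."""
    out, found = [], False
    for line in scene_text.splitlines():
        cells = _cells(line)
        if len(cells) == 3 and cells[0] == node:
            new_ref = ref if ref is not None else cells[2]  # keep the old pointer by default
            line = f"| {node} | {status} | {new_ref} |"
            found = True
        out.append(line)
    if not found:
        raise ValueError(f"node {node!r} is not in the status table")
    return "\n".join(out)
```

For example, `update_status(text, "B", "Done", "raw/oauth2_schema.md")` would mark node B as done and attach a pointer (the file name here is made up for the example).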

Raw files (L0) – full traceability when needed

Whenever a tool returns a long result (e.g., a long API response or a full test failure log), the agent writes the complete output to a file under raw/ and places only a short ref: pointer in the scene status table. This is the direct application of “context offloading” [9†L16-L24].

memory-bank/raw/
├── 2026-05-15_analysis.md
├── pg_request.md
└── oauth2_debug_log.txt

Example ref: line in a scene status table:

| A | Done | raw/2026-05-15_analysis.md |

Agent rule: If a scene decision needs to be revisited, the agent first checks the “Decisions made” section. If the summary is insufficient, it follows the ref: pointer to the full raw file. This ensures every compressed decision remains fully reversible.
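Following a pointer back to the full output can be sketched as a lookup in the status table followed by a file read. The helper names `find_ref` and `load_raw` are hypothetical, and the sketch assumes pointers are stored as `raw/<file>` paths relative to memory-bank/, as in the samples.

```python
from pathlib import Path
from typing import Optional


def find_ref(scene_text: str, node: str) -> Optional[str]:
    """Return the raw/ pointer stored for `node`, or None if the row has none."""
    for line in scene_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if len(cells) == 3 and cells[0] == node:
            return cells[2] if cells[2].startswith("raw/") else None
    return None


def load_raw(bank: Path, scene_text: str, node: str) -> Optional[str]:
    """Load the full offloaded output for `node`, if a pointer and file exist."""
    ref = find_ref(scene_text, node)
    if ref is None:
        return None
    path = bank / ref
    return path.read_text(encoding="utf-8") if path.is_file() else None
```

The agent would call `load_raw` only when the summary in "Decisions made" is not enough, so the raw file costs no context until it is actually needed.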

How different agents use this `MEMORY.md`

DeepSeek TUI

DeepSeek TUI can read and write any project file. At session start, the user instructs it: “Read the entire MEMORY.md under memory-bank/ and follow its rules.” The agent then respects the L1 facts, the L2 scene pointers, and the L3 user profile. It also appends new atomic facts at the end of each session. Because DeepSeek TUI has its own ~/.deepseek/memory.md for user‑level cross‑project memory, the project‑level MEMORY.md is project‑specific and does not leak facts into unrelated projects.

Claude Code (or Cline, Cursor)

Claude Code automatically looks for CLAUDE.md, but the agent can be pointed to memory-bank/MEMORY.md via a short custom instruction. Many teams create a memory-bank/ folder that contains multiple .md files, each for a different memory concern, exactly as the official TencentDB‑Agent‑Memory does with its L0‑L3 separation [11†L46-L48].

Any agent that reads project files

Because the entire memory system is file‑based and human‑readable, any agent that can cat and echo into files can implement this system. There is no vendor lock‑in, no plugin requirement, and no per‑project re‑installation – the same memory-bank/ works across all of them [12†L45-L48].

Workflow summary for any agent

Session start: Read memory-bank/MEMORY.md. Note the L3 user profile and the L1 atomic facts.
Identify active scene: Look at “Active Scenes” in MEMORY.md. Open the corresponding scene file.
Plan the next action: Read the Mermaid canvas + status table. Check the “ref:” pointers if more detail is needed.
Execute and log: After completing a node, update the status table. If a decision was made, append a new line to “Atomic Facts” in MEMORY.md.
Offload long outputs: If a tool returns a long result, write the full content to raw/<timestamp>.md and store only a ref: in the scene status.
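The first two steps of the workflow (read MEMORY.md, then open the active scenes) can be sketched as follows. This assumes the pointer line format used in the sample (`active_scene: <id> → see <path>`, with paths relative to the project root); `active_scenes` and `session_start` are hypothetical names.

```python
import re
from pathlib import Path
from typing import Dict, List, Tuple

# Assumption: pointer lines look like "active_scene: <id> → see <path>"
# (an ASCII "->" arrow is also accepted).
SCENE_LINE = re.compile(r"active_scene:\s*(\S+)\s*(?:→|->)\s*see\s+(\S+)")


def active_scenes(memory_text: str) -> List[Tuple[str, str]]:
    """Return (scene_id, path) pairs listed under Active Scenes."""
    return SCENE_LINE.findall(memory_text)


def session_start(project_root: Path) -> Dict[str, object]:
    """Load MEMORY.md plus every active scene file that exists on disk."""
    memory = (project_root / "memory-bank" / "MEMORY.md").read_text(encoding="utf-8")
    scenes = {}
    for scene_id, rel_path in active_scenes(memory):
        path = project_root / rel_path
        if path.is_file():  # skip pointers whose scene file was deleted
            scenes[scene_id] = path.read_text(encoding="utf-8")
    return {"memory": memory, "scenes": scenes}
```

Raw files are deliberately not loaded here; they are read later and only on demand, which keeps the initial context small.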

Benchmark reference – why this design works

The official TencentDB‑Agent‑Memory plugin (as an executable plugin for OpenClaw) achieved the following gains over a flat memory approach [8†L10-L13][10†L14-L15]:

Token consumption reduction: up to 61.38% (WideSearch: 221.31M → 85.64M)
Success rate increase: relative +51.52% (33% → 50%)
Long‑term memory accuracy (PersonaMem): 48% → 76% (+59%)
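The relative changes can be recomputed from the rounded figures quoted above. In this short sketch the inputs are exactly the numbers in the list; the results come out near 61.3%, 51.5%, and 58.3%, so the small gaps to the quoted 61.38% and +59% presumably reflect rounding of the published inputs (an assumption, not something the sources state).

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# Inputs are the rounded figures quoted in the list above.
token_reduction = -relative_change(221.31, 85.64)  # WideSearch tokens, in millions
success_gain = relative_change(33, 50)             # success rate, percent
personamem_gain = relative_change(48, 76)          # PersonaMem accuracy, percent

print(f"token reduction: {token_reduction:.2f}%")
print(f"success gain:    {success_gain:.2f}%")
print(f"PersonaMem gain: {personamem_gain:.2f}%")
```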

These gains come from layered memory (L0–L3), context offloading (only pointers stay in the context), and the Mermaid task canvas (structure + traceability). A well‑written MEMORY.md following the same layered principles aims to give any agent similar benefits without the runtime plugin, although the figures above were measured with the plugin itself.

References

Official GitHub repository: Tencent/TencentDB-Agent-Memory—MIT‑licensed, local‑first memory engine for AI agents [1†L16-L18].
Official introduction (English): Tencent Open Sources Memory System, OpenClaw Saves Up to 61% on Token—explains L0‑L3 long‑term memory and short‑term task canvas with Mermaid [7†L2-L23].
Official introduction (Chinese, more technical): TencentDB Agent Memory 正式开源：让 Agent 沉淀经验，让人专注创造—detailed breakdown of the four layers, benchmark tables, and plugin usage [10†L3-L27].
How to Write a Memory Bank for Your AI Coding Agent: kinde.com guide—covers multi‑file memory banks [11†L4-L8].
DeepSeek TUI user memory documentation: shows that ~/.deepseek/memory.md is injected into the system prompt as a <user_memory> block, the same principle applied here at the project level [4†L7-L8].
Context offloading + Mermaid canvas details (QQ news): explains how tool logs are written to refs/*.md and only a short index stays in the context, reducing token usage while preserving full traceability [9†L16-L24].
Tencent Cloud announcement (Phoenix New Media): covers the L0‑L3 four‑layer progressive memory architecture and the simple one‑click plugin installation [5†L2-L9].