
Memory & Task Systems: Giving Your AI Agent a Brain

Published on March 05, 2026

Every morning, my AI agent Alfred wakes up with amnesia.

Not total amnesia. He remembers who he is, what he's doing, what happened yesterday. But he doesn't wake up knowing those things naturally. He has to read about them first.

That's the fundamental problem with AI agents: context windows are finite, and sessions don't persist.

When you close ChatGPT and reopen it, you get a fresh conversation. When Alfred restarts for an update or a crash recovery, he comes back with an empty context window. Everything he "knew" from our last conversation? Gone.

So the first real problem you solve when building an always-on AI agent isn't picking the right model or connecting the right tools. It's giving it a memory system.

I didn't build this from scratch. The foundation comes from OpenClaw, the agent framework Alfred runs on. OpenClaw provides the runtime, the memory tools, and the workspace conventions. But the memory architecture — how files are structured, what gets stored where, how things compound — that's the part I had to figure out by doing it badly for a few weeks first and learning from others.

If you want your agent to get better over time, to actually learn from every interaction and remember your preferences and track what's working, you need to write things down.

The rule: Text > Brain. If you want the agent to remember something, put it in a file.

The Three-Tier Memory System

After a month of running Alfred 24/7, here's what we built. Three layers, each serving a different purpose.

Layer 1: CORE.md + CURRENT.md — The Executive Summary

I originally had a single MEMORY.md file, but it kept growing until it was 300+ lines and burning context on things that rarely changed. So I split it.

CORE.md is permanent facts that almost never change: infrastructure, hard rules, tech stack, patterns. Under 50 lines. Things like "workspace lives at ~/clawd, pushes to GitHub" and "never send external messages without approval."

CURRENT.md is the active state: what projects are in flight, what shipped recently, what's next. Updated weekly, kept under 100 lines.

Example from my actual CORE.md:

Infrastructure
Workspace: ~/clawd/ (linked with GitHub)
Coding: ~/Coding/{project}/ (each project has .claude/tasks.md)
Primary models: Opus 4.6 (main), Codex 5.3 (coding), Haiku 4.5 (crons)
Gateway: MacBook Air M1, local OpenClaw
Crons: 23 jobs, all 6-part format

Hard Rules
Nothing external without approval (draft, don't send)
trash > rm
Security changes need explicit approval

And from CURRENT.md:

SEOTakeoff (PRIMARY)
Location: ~/Coding/seo-saas-platform/
Pricing: $9 first month → $69/mo locked
Current MRR: $97 (2 customers)
Shipped Mar 4: Ahrefs health score 7 → 100

When Alfred starts a new session, he reads both files. CORE.md gives him the rules of the world. CURRENT.md tells him what's happening in it.
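That session-start read is simple enough to sketch. Here's a minimal illustration in Python; the `workspace` layout and the `## filename` headers are my assumptions for the example, not OpenClaw's actual bootstrap code:

```python
from pathlib import Path

def session_context(workspace: Path) -> str:
    """Concatenate the executive-summary files read at session start.

    CORE.md holds the permanent rules of the world; CURRENT.md holds
    the active state. Missing files are skipped rather than raising,
    so a fresh workspace still boots.
    """
    parts = []
    for name in ("CORE.md", "CURRENT.md"):
        f = workspace / name
        if f.exists():
            parts.append(f"## {name}\n{f.read_text().strip()}")
    return "\n\n".join(parts)
```

Keeping CORE.md under 50 lines and CURRENT.md under 100 means this whole bootstrap costs a few hundred tokens, not thousands.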

Layer 2: Daily Notes — The Timeline

Every day gets a file: `memory/2026-03-05.md`, `memory/2026-03-04.md`, etc.

These are raw, chronological logs of what happened: conversations we had, decisions made, things we learned, tasks completed, and mistakes I want him to remember.

A real daily note from this week:

Blog Publishing (10:47 AST)
Hero images: Dev tool series uses Pillow compositing with official logos (not AI gen)
Published Stripe post to Sanity, live at grahammann.net
Distribution cadence: 10 published posts = 3-4 weeks of content

Lesson Learned: Check Sanity Before Publishing (11:03 AST)
Mistake: Ran import script without checking if post already existed
Rule: Always query Sanity for existing posts before import
Added pre-publish check to PUBLISHING-WORKFLOW.md

Alfred writes to today's daily note in real time. When we have an important conversation in the evening, he doesn't wait for the nightly batch job. He writes key points down immediately.

Why? Because the compound nightly review runs in a separate session. If he doesn't write it down during our conversation, it might not get captured.

Learned that the hard way. We had a detailed evening recap one night, and the batch review missed it entirely. Now the rule is: write it down when it happens.

Layer 3: Knowledge Graph — The Structured Brain

This is the layer I wish I'd built first.

The knowledge graph lives in `life/` and follows the PARA structure: Projects, Areas, Resources, Archives. The structure is inspired by Tiago Forte's PARA method and adapted from Nat Eliason's personal knowledge management approach for use with an AI agent instead of a human note-taking system.

Each entity gets a directory with two files:

  • `summary.md` — Quick human-readable context
  • `items.json` — Atomic facts with metadata

Example: `life/projects/seotakeoff/`

The summary.md gives Alfred quick context when he needs it. The items.json contains granular facts like:

{
  "id": "seotakeoff-001",
  "fact": "SEOTakeoff is an AI-powered SEO content generation platform at seotakeoff.com",
  "category": "context",
  "timestamp": "2026-01-09",
  "status": "active",
  "supersededBy": null,
  "lastAccessed": "2026-02-07",
  "accessCount": 16
}

When a fact changes, we don't delete the old one. We mark it `status: "superseded"` and link it to the new fact. That way Alfred knows the history — not just what's true now, but what we believed before and why it changed.

The schema tracks decay. Facts that aren't accessed regularly lose importance over time. High-importance facts that get referenced often stay fresh. It's a living memory, not an append-only log.
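Both mechanics fit in a few lines. This sketch uses the field names from the items.json example above; the exponential half-life in `relevance` is an illustrative choice of mine, not OpenClaw's actual decay formula:

```python
import math
from datetime import date

def supersede(items: list[dict], old_id: str, new_fact: dict) -> None:
    """Mark an old fact superseded and link it to its replacement.

    The old fact stays in the list, so the agent keeps the history:
    not just what's true now, but what was believed before.
    """
    for item in items:
        if item["id"] == old_id:
            item["status"] = "superseded"
            item["supersededBy"] = new_fact["id"]
    items.append(new_fact)

def relevance(item: dict, today: date, half_life_days: float = 30.0) -> float:
    """Decay score: facts accessed recently and often stay fresh.

    Recency decays exponentially with a half-life; frequent access
    (accessCount) multiplies the score back up.
    """
    last = date.fromisoformat(item["lastAccessed"])
    age_days = (today - last).days
    recency = math.exp(-age_days * math.log(2) / half_life_days)
    return recency * math.log1p(item.get("accessCount", 0))
```

A periodic job can then sort by `relevance` and archive anything below a threshold, which is what keeps this a living memory rather than an append-only log.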

The structure looks like this:

life/
├── index.md (master entity list)
├── projects/
│   ├── seotakeoff/
│   ├── grahams-blog/
│   ├── sensei/
│   ├── twitter-growth/
│   ├── meta-ads/
│   └── building/
├── areas/
│   ├── people/
│   │   ├── sam/
│   │   ├── bryan/
│   │   ├── elias/
│   │   └── fred/
│   ├── health-fitness/
│   └── content-creation/
├── resources/
│   ├── seo-knowledge/
│   ├── marketing-tactics/
│   ├── ai-tools/
│   └── x-insights/
└── archives/

When Alfred needs context on a person, a project, or a topic, he loads the summary first. Only opens items.json if he needs granular detail.

Vector Search: Remembering by Meaning

The three tiers give us structure. But structure alone isn't enough.

Let's say I mentioned a broken link problem on SEOTakeoff three weeks ago. Alfred's not going to grep through 21 daily note files to find it. And even if he did, what if I said "LLM hallucinating external URLs" then and "broken links" now? Text search fails.

That's why I added vector search via LanceDB.

OpenClaw has this built in — every memory gets embedded using OpenAI's text-embedding-3-small model. When Alfred needs to remember something, he doesn't just search for keywords. He searches by meaning.

I say "what was that issue with bad URLs on the site?" and the vector search pulls up the content pipeline hallucination problem, even though I never said "hallucination" or "pipeline" in my question.

Semantic memory, persistent across sessions, searchable by meaning instead of just recency.

Pair this with full-text search for when you know exactly what you're looking for, and you've got both precision and fuzzy recall covered. I use a local indexer that watches my knowledge graph, daily notes, and workspace files. Between keyword search and semantic search, Alfred can find almost anything we've discussed.
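OpenClaw's LanceDB layer handles this for real, but the core idea of hybrid recall is small enough to show with the standard library. In this sketch the embeddings are toy precomputed vectors standing in for text-embedding-3-small output, and the keyword boost is a simplification I chose for illustration:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; zero vectors score 0 instead of dividing by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec: list[float], query_text: str,
           memories: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Hybrid recall: rank by semantic similarity, with a flat boost
    for exact keyword hits so precise queries stay precise.
    `memories` is a list of (text, embedding) pairs."""
    scored = []
    for text, vec in memories:
        score = cosine(query_vec, vec)
        if query_text.lower() in text.lower():
            score += 1.0  # exact match dominates when present
        scored.append((score, text))
    return [text for _, text in sorted(scored, reverse=True)[:k]]
```

With real embeddings, "bad URLs on the site" lands near "LLM hallucinating external URLs" in vector space even though they share almost no words, which is exactly the failure mode keyword search can't cover.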

The Compound Nightly Review

All these memory files don't update themselves.

Every night at 10:30pm, a cron job kicks off. A separate session (Alfred in batch mode) reviews everything that happened during the day.

He reads all the Telegram conversations from the main session, all cron job outputs (lead scans, bookmark processing, error monitoring), and the current daily note.

Then he:

  • Extracts key learnings
  • Updates CURRENT.md if priorities shifted
  • Updates the knowledge graph if facts changed
  • Writes a summary to the daily note

It runs on Sonnet, not Opus. Cheaper, still smart enough for synthesis work.
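The shape of that batch pass is straightforward. A skeleton, with `summarize` standing in for the actual model call (the prompt wording and function signature here are my assumptions, not OpenClaw's cron implementation):

```python
from pathlib import Path

def nightly_review(day_note: Path, conversation_log: str, summarize) -> str:
    """Skeleton of the 10:30pm batch pass.

    Reads the day's raw capture, asks a model (via the injected
    `summarize` callable) to extract learnings and changed facts,
    and appends the synthesis back to the daily note.
    """
    raw = day_note.read_text() if day_note.exists() else ""
    summary = summarize(
        "Extract key learnings, shifted priorities, and changed facts:\n"
        + raw + "\n" + conversation_log
    )
    with day_note.open("a") as f:
        f.write("\n## Nightly Review\n" + summary + "\n")
    return summary
```

Injecting the model call as a parameter is also what makes the "Sonnet, not Opus" swap a one-line config change rather than a rewrite.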

This is inspired by Max Frenzel's daily AI-enabled review system. Max does 15-minute structured reviews at the end of each workday: mood, key activities, blockers, what's next. The insight is that it separates capture from synthesis.

During the day, Alfred just writes things down. At night, the review session connects the dots.

Max's article on this was a lightbulb moment for me. He describes the AI as acting like a coach: "asking follow-up questions when I'm vague, pushing me to articulate half-formed thoughts, and helping me spot patterns I might otherwise miss."

That's what the compound review does. It goes beyond a log dump. Reflection and pattern detection happen alongside the memory updates, all in one pass.

Real-Time Capture Beats Batch Jobs

One lesson we learned fast: write important things down immediately, not later.

The nightly review is powerful, but it's not magic. If we have a critical conversation at 9pm and Alfred doesn't write the key points to the daily note right then, the batch job at 10:30pm might miss it.

So the rule now: during important conversations, Alfred updates the daily note in real time. Evening recaps, strategic decisions, new project context. All written down as we talk.

Batch jobs are for synthesis and cleanup, not for capturing what matters.

The Identity Files: SOUL.md, USER.md, AGENTS.md

Memory isn't just data. It's also who Alfred is and how he operates.

Three workspace files define this:

SOUL.md — Who Alfred is. Personality and communication style.

Example snippet:

You are Alfred. You are relentlessly resourceful, direct, and proactive.
You don't wait for permission to improve. You push back when something
doesn't make sense. You remember things.

USER.md — Who I am. What I care about. How I work.

AGENTS.md — Operating rules. Learned lessons. Workflows that work.

Every session, Alfred reads all three before doing anything. These files are his continuity. Without them, he'd wake up knowing he's an AI assistant but not knowing he's Alfred specifically.

This is why the "Text > Brain" principle matters so much. Mental notes don't survive restarts. If Alfred learns something useful today, it goes into AGENTS.md immediately. That way, tomorrow-Alfred benefits from today-Alfred's experience.

Compound learning. Every session makes the next one better.

Task Management: No Separate App

Most people want to connect their AI agent to Notion or Todoist or some task management system.

We tried that. It added complexity for no real benefit.

Here's what works better: project-specific tasks.md files in each project's `.claude/` directory.

~/Coding/seo-saas-platform/.claude/tasks.md
~/Coding/grahams-blog/.claude/tasks.md
~/Coding/twitter/.claude/tasks.md
~/Coding/sensei/.claude/tasks.md

I have 18 projects set up this way. Each tasks.md is just markdown: current tasks, recently completed, backlog. Nothing fancy. They live alongside the code in each GitHub repo, which means they're version-controlled and backed up automatically.

When Alfred does the nightly review, he updates the relevant tasks.md files. Mark completed items. Add new ones based on what we discussed.

When I ask "what's left on SEOTakeoff?" Alfred opens `~/Coding/seo-saas-platform/.claude/tasks.md` and tells me.

No API integrations, no sync issues, no subscription fees. Just files.

This works because it lives where the work lives. The task list for SEOTakeoff is in the SEOTakeoff repo, not in some separate task management tool that Alfred has to poll via API.
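Because every project follows the same `.claude/tasks.md` convention, a cross-project status check is a glob away. A sketch, assuming GitHub-style `- [ ]` task-list markdown (the function and its exact parsing rules are mine, for illustration):

```python
from pathlib import Path

def open_tasks(coding_root: Path) -> dict[str, list[str]]:
    """Collect unchecked '- [ ]' items from every project's
    .claude/tasks.md under the coding root, keyed by project name."""
    result = {}
    for tasks_file in sorted(coding_root.glob("*/.claude/tasks.md")):
        pending = [line.strip()[len("- [ ] "):]
                   for line in tasks_file.read_text().splitlines()
                   if line.strip().startswith("- [ ] ")]
        if pending:
            result[tasks_file.parent.parent.name] = pending
    return result
```

Answering "what's left on SEOTakeoff?" is then just a file read, with no API to poll and nothing to sync.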

The other benefit: I still do work outside of OpenClaw/Alfred, whether in Cursor (using Claude Code), in Claude Code itself, or in another tool like Conductor, and the tasks.md files stay useful there too.

This way I can work wherever I want, and everything stays up to date.

Had I set up OpenClaw first and then my coding projects, I probably could have kept everything in one folder. But it's easier for me this way (and easier to keep GitHub repos separate).

How This Compounds Over Time

Alfred gets noticeably better every week. And it's not because the model improved — it's because the memory system did.

Week 1, Alfred knew my name and basic preferences. By week 4, he knows which subreddits have strict rules, which leads we've contacted before, which approaches worked and which didn't, what my writing voice sounds like, and which projects are currently blocked on which issues.

Conversations add to the knowledge graph, mistakes get documented in AGENTS.md so he doesn't repeat them, and evening recaps surface patterns I wouldn't have noticed on my own.

Compound learning. Human and AI, both getting better, stored in plain text files.

Structure your memory well, and it becomes more valuable over time instead of just noisier. That's the core insight from Nat Eliason and Tiago Forte's approaches to personal knowledge management, and it translates directly to AI agents.

The "Text > Brain" Principle

One rule makes all of this work: if you want the agent to remember something, write it to a file.

Don't rely on the model's context window or assume the agent will "just remember." The nightly review won't catch everything either. Write it down.

When you learn a lesson → update AGENTS.md

When you make a decision → update the daily note

When context changes → update the knowledge graph

When you figure out a better approach → update TOOLS.md

The agent's brain is the filesystem. Everything else is temporary.

Also, this doesn't need to be complicated: just tell your agent "hey, remember this" or "hey, this was a mistake, make sure to document it and avoid it in the future," and that's it.

What You Don't Need

I spent a week researching memory frameworks before realizing the answer was pretty boring.

You don't need a separate task management app, a database backend, or a sync service. Definitely not a web dashboard to "view your agent's memory."

Plain markdown files, a structured approach (PARA or equivalent), automated reviews that write to those files, and search that works across all of them.

That's the whole system. Text files and cron jobs.

OpenClaw gives you the runtime. The memory system is just files in your workspace.

Getting Started

If you're building this yourself, resist the urge to set up everything at once. I built each layer only when I felt the pain of not having it.

Start with daily notes. One file per day, write down what happens. That alone is 80% of the value and takes five minutes to set up.

Add a high-level summary file (I split mine into CORE.md and CURRENT.md) when the daily notes pile up and you can't quickly scan what matters. That took me about two weeks.

The knowledge graph came at week three, when I had enough entities (projects, people, recurring topics) that I needed structure beyond flat files.

Vector search came when I found myself saying "what was that thing we discussed about..." and couldn't find it with grep. Around week four.

The nightly compound review was last. I was manually updating memory files every evening, which felt like homework. Automating it was the best decision I made.

Layer by layer. Each one compounds on the last.

Next: Where to Run It

You've got the paradigm (who your agent is). You've got the team structure (multiple specialized agents). You've got the memory system (so it actually remembers things).

Now the question is: where does this thing run?

MacBook Air? Cloud VM? Mac Studio with local models?

Each has tradeoffs. Some of them are surprising.

That's the next post.

This is part of a series on running an always-on AI agent with OpenClaw.
