Memory Mechanisms
LLM's Short-term Memory (Context)
The "short-term memory" of a large language model is its context window: the prompt plus everything exchanged so far in the current conversation.
Characteristics
- Limited Capacity: Constrained by context window size (e.g., 4K, 32K, 200K tokens)
- Session-dependent: Only valid in the current conversation
- Dynamic Updates: Continuously changes as conversation progresses
- Cost-sensitive: Longer context means higher usage costs
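The session-dependent point can be made concrete with a toy sketch: an LLM API call is stateless, so the only "memory" is the message list the client resends with every request. The `chat` function here is a hypothetical stand-in for a real API call, not an actual library:

```python
# Toy stand-in for an LLM call: it can only "remember" what appears in
# the messages it receives, because the API itself is stateless.
def chat(messages):
    """Answers "What's my name?" using nothing but the supplied context."""
    context = " ".join(m["content"] for m in messages)
    if "Zhang San" in context:
        return "Your name is Zhang San."
    return "I don't know your name."

# Ongoing conversation: the client resends the history, so the fact survives.
history = [
    {"role": "user", "content": "My name is Zhang San"},
    {"role": "user", "content": "What's my name?"},
]
print(chat(history))  # Your name is Zhang San.

# New conversation: the history is gone, and so is the "memory".
print(chat([{"role": "user", "content": "What's my name?"}]))  # I don't know your name.
```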
Example
Conversation start:
User: My name is Zhang San
AI: Hello Zhang San!
After multiple rounds of conversation:
User: What's my name?
AI: Your name is Zhang San. (Remembered from context)
New conversation:
User: What's my name?
AI: I don't know your name. (New conversation, no previous information)
LLM's Long-term Memory (Training Data)
The "long-term memory" of large language models is the knowledge learned during training, encoded in the model's parameters.
Characteristics
- Huge Capacity: Contains all knowledge from training data
- Persistent: Doesn't disappear when conversation ends
- Static and Fixed: Fixed after training completes (unless retrained)
- Difficult to Update: Updating requires retraining or fine-tuning
Example
User: What is Python?
AI: Python is a high-level programming language created by Guido van Rossum in 1991...
(This is long-term memory learned from training data)
User: What's the birth date of Python's creator?
AI: Python's creator Guido van Rossum was born on January 31, 1956...
(This is also long-term memory learned from training data)
Memory Capacity Limitations
Short-term Memory Limitations
Problem:
- Context window is limited (e.g., 4K, 32K, 200K tokens)
- Information beyond the window is "forgotten"
- Early information in long conversations may be lost
Example:
Long conversation scenario:
Round 1: [Important information A]
Round 2: [Information B]
...
Round 100: [Information Z]
Problem: Early important information A may have been "forgotten"
Solutions:
- Periodically summarize key information
- Only keep necessary information
- Use external memory systems
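The "only keep necessary information" strategy can be sketched as a token-budget trim of the message history. Word count stands in for a real tokenizer here, so this is an approximation, not a production implementation:

```python
# Sketch: drop the oldest messages once the history exceeds a token budget.
def trim_history(messages, max_tokens=50):
    """Keep the most recent messages that fit within max_tokens."""
    kept, total = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = len(msg["content"].split())  # crude token estimate: word count
        if total + cost > max_tokens:
            break                           # budget exhausted: stop keeping
        kept.append(msg)
        total += cost
    return list(reversed(kept))             # restore chronological order

# 10 messages of ~23 "tokens" each; only the newest ones fit the budget.
history = [{"role": "user", "content": f"message number {i} " + "filler " * 20}
           for i in range(10)]
trimmed = trim_history(history, max_tokens=50)
print(len(trimmed))  # 2 — only the most recent messages fit
```

A real system would count tokens with the model's own tokenizer, since billing and truncation both happen at the token level, not the word level.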
Long-term Memory Limitations
Problem:
- Training data has a cutoff date
- Cannot know new information after training
- May contain outdated information
Example:
User: Who is the US President in 2024?
AI: [Answer based on 2023 data, possibly inaccurate]
(Because training data cutoff is 2023)
Solutions:
- Use online search functionality
- Combine with external knowledge bases
- Periodically update models
Memory Decay Problem
What is Memory Decay
Memory decay refers to the phenomenon where, as a conversation grows longer, early information becomes less and less influential: it may be pushed out of the context window entirely, or simply carry less effective weight amid many competing tokens, causing the model to "forget" it.
Example:
Conversation start:
User: My name is Zhang San, I'm a programmer, living in Shanghai...
AI: Okay, I've noted it.
After 50 rounds:
User: What's my profession?
AI: I'm not quite sure, did you mention it before?
(Early information has decayed)
Impact of Memory Decay
- Information Loss: Early information may be forgotten
- Inconsistency: May give inconsistent answers
- Context Confusion: May confuse different information
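One common mitigation is to fold older turns into a running summary so key facts stay inside recent context. A minimal sketch, using a naive `summarize` stand-in (a real system would ask the LLM itself to produce the summary):

```python
def summarize(messages):
    # Naive stand-in: keep the first sentence of each old message.
    facts = [m["content"].split(".")[0].strip() for m in messages]
    return "Summary of earlier conversation: " + "; ".join(facts)

def compact(messages, keep_recent=2):
    """Replace all but the most recent turns with a single summary message."""
    if len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [{"role": "system", "content": summarize(old)}] + recent

history = [
    {"role": "user", "content": "My name is Zhang San. Nice to meet you."},
    {"role": "user", "content": "I am a programmer. I work on backends."},
    {"role": "user", "content": "What frameworks should I learn?"},
    {"role": "user", "content": "Thanks for the advice."},
]
compacted = compact(history)
print(compacted[0]["content"])
# Summary of earlier conversation: My name is Zhang San; I am a programmer
```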
Solutions
- Periodic Summarization
Prompt: "Let's summarize the key information so far..."
- Repeat Key Information
Prompt: "Remember, my name is Zhang San, my profession is programmer, and I live in Shanghai."
- Use Structured Format
Prompt: "My information:
- Name: Zhang San
- Profession: Programmer
- Address: Shanghai"
External Memory Mechanisms (RAG)
What is RAG
RAG (Retrieval-Augmented Generation) is a technique that retrieves relevant information from an external knowledge base and supplies it to the model as context, letting the model answer from sources it was never trained on.
How RAG Works
Step 1: User asks question
Step 2: Retrieve relevant information from external knowledge base
Step 3: Input retrieved information along with user question to model
Step 4: Model generates answer based on retrieved information
Advantages of RAG
- Extended Memory: Not limited by context window
- Real-time Updates: Knowledge base can be updated anytime
- Higher Accuracy: Answers are grounded in retrieved sources rather than the model's recollection
- Traceable: Answers can cite the sources they came from
RAG Application Scenarios
- Enterprise Knowledge Base: Company documents, processes, FAQs
- Technical Documentation: API docs, development guides
- Academic Research: Papers, research reports
- Legal Consultation: Laws, cases
RAG Implementation Example
Scenario: Answering questions based on company documents
System components:
1. Vector database: Stores vector representations of documents
2. Retrieval system: Retrieves relevant documents based on questions
3. LLM: Generates answers based on retrieved information
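A toy sketch of these three components, assuming word-count vectors in place of a real embedding model and a prompt-assembly step in place of the actual LLM call:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a word-count vector (a real system uses an embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# 1. Vector "database": documents stored alongside their vectors.
documents = [
    "Leave request process: submit the leave form to your manager for approval.",
    "Expense policy: receipts must be filed within 30 days.",
]
index = [(doc, embed(doc)) for doc in documents]

# 2. Retrieval system: rank documents by similarity to the question.
def retrieve(question, k=1):
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# 3. Generation: retrieved text becomes the context in the LLM prompt.
def build_prompt(question):
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is the leave request process?"))
```

Real deployments swap in a proper embedding model and a vector database, but the shape of the pipeline stays the same: embed, retrieve, then generate with the retrieved text as context.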
Workflow:
User question: "What is the company's leave request process?"
↓
Retrieval: Retrieve relevant documents from vector database
↓
Context: [Retrieved document content]
↓
Generation: LLM generates answer based on context
↓
Answer: "According to company documents, the leave request process is as follows..."
Memory and Forgetting
Why Forgetting is Needed
- Avoid Interference: Irrelevant information may interfere with current task
- Save Resources: Retaining all information is costly
- Adapt to Changes: Need to update information when environment changes
Active Forgetting Strategies
- Clear Irrelevant Information
Prompt: "Let's start a new topic, forget previous discussion about X."
- Start New Conversation
Starting a new conversation clears all previous context.
- Selective Retention
Prompt: "Remember these key pieces of information: [key info], forget other details."
Practical Application Cases
Case 1: Long Conversation Management
Scenario: Continuous technical consultation
Problem: Conversation is very long, early information may be forgotten
Solution:
1. Periodic summarization
Prompt: "Summarize the key technical points discussed so far"
2. Structured storage
Prompt: "Project information:
- Tech stack: [tech stack]
- Architecture: [architecture]
- Issues: [issues]"
3. Staged processing
Prompt: "Let's complete this stage, summarize the key points, then move to the next stage"
Case 2: Knowledge Base Q&A
Scenario: Answering questions based on company documents
Solution: Use RAG
1. Build vector database
- Convert documents to vectors
- Store in vector database
2. Implement retrieval system
- Retrieve relevant documents based on question
- Return most relevant document fragments
3. Combine LLM to generate answers
- Use retrieved information as context
- Let LLM generate answer based on context
Case 3: Personalized Assistant
Scenario: Remember user preferences
Solution:
1. User profile
- Store user preferences, history, etc.
- Use database for persistent storage
2. Context injection
- Inject user profile at conversation start
- Let AI remember key information
3. Dynamic updates
- Update user profile based on conversation
- Maintain information timeliness
Summary
Memory mechanisms are an important component of AI systems:
Key Points:
- ✅ LLMs have short-term memory (context) and long-term memory (training data)
- ✅ Memory capacity has limitations
- ✅ Memory decays over time
- ✅ External memory (RAG) can extend memory capabilities
- ✅ Need to manage memory and forgetting
Best Practices:
- Periodically summarize key information
- Use structured formats to store information
- Reasonably use external memory systems
- Manage context window
- Actively forget when necessary
Remember:
- AI memory is not real "memory"
- Short-term memory has limitations
- Long-term memory is static and fixed
- External memory can extend capabilities
Understanding memory mechanisms helps better use AI, especially when handling long conversations and complex tasks.
Next Steps
- What is an Agent - Learn about Agent concepts
- Agent Architecture - Learn about Agent architecture design