Agent Long-term Memory with Spring AI & Redis
- July 23, 2025
- 0. GitHub Repository
- 1. Add the required dependencies
- 2. Define the Memory model
- 3. Configure the Vector Store
- 4. Implement the Memory Service
- 5. Implement Spring AI Advisors
- 5.1 Advisor for Long-term memory retrieval
- 5.2 Advisor for Long-term memory recording
- 6. Plugging the advisors into our ChatClient
- 7. Implement the Chat Service
- 8. Configure the Agent System Prompt
- 9. Create the REST Controller
- Step 1: Clone the repository
- Step 2: Configure your environment
- Step 3: Start the services
- Step 4: Use the application
You're building an AI agent with memory using Spring AI and Redis. Unlike traditional chatbots that forget previous interactions, memory-enabled agents can recall past conversations and facts. The approach works by storing two types of memory in Redis: short-term memory (conversation history) and long-term memory (facts and experiences as vectors), allowing the agent to provide personalized, context-aware responses.
LLMs respond to each message in isolation, treating every interaction as if it's the first time they've spoken with a user. They lack the ability to remember previous conversations, preferences, or important facts.
Memory-enabled AI agents, on the other hand, can maintain context across multiple interactions. They remember who you are, what you've told them before, and can use that information to provide more personalized, relevant responses.
In a travel assistant scenario, for example, if a user mentions "I'm allergic to shellfish" in one conversation, and later asks for restaurant recommendations in Boston, a memory-enabled agent would recall the allergy information and filter out inappropriate suggestions, creating a much more helpful and personalized experience.
Video: What is an embedding model?
Behind the scenes, this works thanks to vector similarity search. It turns text into vectors (embeddings) — lists of numbers — stores them in a vector database, and then finds the ones closest to your query when relevant information needs to be recalled.
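To make this concrete, here is a minimal Kotlin sketch of the core operation: comparing two embedding vectors by cosine similarity. The calls in the trailing comments assume Spring AI's EmbeddingModel, whose embed(String) method returns a float array; everything else is plain Kotlin:

import kotlin.math.sqrt

// Cosine similarity between two embedding vectors:
// dot product divided by the product of the vector norms.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size) { "Embeddings must have the same dimension" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}

// With a Spring AI EmbeddingModel injected, usage looks roughly like:
// val fact  = embeddingModel.embed("I'm allergic to shellfish")
// val query = embeddingModel.embed("seafood restaurants in Boston")
// val score = cosineSimilarity(fact, query) // closer to 1.0 means more similar

A vector database such as Redis performs this kind of comparison at scale, using an index instead of a linear scan.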
Video: What is semantic search?
Today, we're going to build a memory-enabled AI agent that helps users plan travel. It will remember user preferences, past trips, and important details across multiple conversations, even if the user leaves and comes back later.
To do that, we'll build a Spring Boot app from scratch and use Redis as our memory store. It'll handle both short-term memory (conversation history) and long-term memory (facts and preferences as vector embeddings), enabling our agent to provide truly personalized assistance.
Redis as a Memory Store for AI Agents
Video: What is a vector database?
Over the last 15 years, Redis has become the foundational infrastructure for real-time applications. Today, with Redis Open Source 8, it's committed to becoming the foundational infrastructure for AI applications as well.
Redis Open Source 8 not only turns the community version of Redis into a vector database, but also makes it the fastest and most scalable vector database on the market today. Redis 8 lets you scale to one billion vectors without sacrificing latency.
Learn more: https://redis.io/blog/searching-1-billion-vectors-with-redis-8/
For AI agents, Redis serves as both:
- A short-term memory store using Redis Lists to maintain conversation history (see the sketch after this list)
- A long-term memory store using Redis JSON and the Redis Query Engine that enables vector search to store and retrieve facts and experiences
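To make the short-term side concrete, here is a small, self-contained sketch of conversation history kept in a Redis List using Jedis. The key name is hypothetical, and the application built below delegates this bookkeeping to Spring AI's chat memory support instead:

import redis.clients.jedis.JedisPooled

fun main() {
    val jedis = JedisPooled("localhost", 6379)
    val key = "chat:history:raphael" // hypothetical key naming scheme

    // Append each turn to the end of the list
    jedis.rpush(key, "user: I'm allergic to shellfish")
    jedis.rpush(key, "assistant: Noted! I'll keep that in mind.")

    // Read the whole conversation back, oldest turn first
    val history = jedis.lrange(key, 0, -1)
    history.forEach(::println)
}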
Spring AI and Redis
Spring AI provides a unified API for working with various AI models and vector stores. Combined with Redis, it makes it easy to build memory-enabled AI agents that can:
- Store and retrieve vector embeddings for semantic search
- Maintain conversation context across sessions
- Extract and deduplicate memories from conversations
- Summarize long conversations to prevent context window overflow
Building the Application
Our application will be built using Spring Boot with Spring AI and Redis. It will implement a travel assistant that remembers user preferences and past trips, providing personalized recommendations based on this memory.
0. GitHub Repository
The full application can be found on GitHub: https://github.com/redis-developer/redis-springboot-resources/tree/main/artificial-intelligence/agent-long-term-memory-with-spring-ai
1. Add the required dependencies
In your Spring Boot application, add the following dependencies to your build file (Gradle Kotlin DSL shown here):
implementation("org.springframework.ai:spring-ai-transformers:1.0.0")
implementation("org.springframework.ai:spring-ai-starter-vector-store-redis")
implementation("org.springframework.ai:spring-ai-starter-model-openai")
implementation("com.redis.om:redis-om-spring:1.0.0-RC3")
2. Define the Memory model
The core of our implementation is the Memory class that represents items stored in long-term memory:
data class Memory(
    val id: String? = null,
    val content: String,
    val memoryType: MemoryType,
    val userId: String,
    val metadata: String = "{}",
    val createdAt: LocalDateTime = LocalDateTime.now()
)

enum class MemoryType {
    EPISODIC, // Personal experiences and preferences
    SEMANTIC  // General knowledge and facts
}
3. Configure the Vector Store
We'll use Spring AI's RedisVectorStore to store and search vector embeddings of memories:
@Configuration
class MemoryVectorStoreConfig {

    @Bean
    fun memoryVectorStore(
        embeddingModel: EmbeddingModel,
        jedisPooled: JedisPooled
    ): RedisVectorStore {
        return RedisVectorStore.builder(jedisPooled, embeddingModel)
            .indexName("longTermMemoryIdx")
            .contentFieldName("content")
            .embeddingFieldName("embedding")
            .metadataFields(
                RedisVectorStore.MetadataField("memoryType", Schema.FieldType.TAG),
                RedisVectorStore.MetadataField("metadata", Schema.FieldType.TEXT),
                RedisVectorStore.MetadataField("userId", Schema.FieldType.TAG),
                RedisVectorStore.MetadataField("createdAt", Schema.FieldType.TEXT)
            )
            .prefix("memory:")
            .initializeSchema(true)
            .vectorAlgorithm(RedisVectorStore.Algorithm.HSNW)
            .build()
    }
}
Let's break this down:
- Index Name: longTermMemoryIdx. Redis will create an index with this name for searching memories.
- Content Field: content. The raw memory content that will be embedded.
- Embedding Field: embedding. The field that will store the resulting vector embedding.
- Metadata Fields:
  - memoryType: a TAG field for filtering by memory type (EPISODIC or SEMANTIC)
  - metadata: a TEXT field for storing additional context about the memory
  - userId: a TAG field for filtering by user ID
  - createdAt: a TEXT field for storing the creation timestamp
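For illustration, once a memory has been stored, the JSON document in Redis looks roughly like the following (hypothetical key and values, with the embedding truncated for readability; you can inspect it with JSON.GET):

JSON.GET memory:<uuid>

{
  "content": "User visited Paris in 2009 for their honeymoon",
  "embedding": [0.0123, -0.0456, ...],
  "memoryType": "EPISODIC",
  "metadata": "{}",
  "userId": "raphael",
  "createdAt": "2025-07-23T10:15:30"
}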
4. Implement the Memory Service
The MemoryService handles storing and retrieving memories from Redis:
@Service
class MemoryService(
    private val memoryVectorStore: RedisVectorStore
) {
    private val systemUserId = "system"

    fun storeMemory(
        content: String,
        memoryType: MemoryType,
        userId: String? = null,
        metadata: String = "{}"
    ): StoredMemory {
        // Check if a similar memory already exists to avoid duplicates
        if (similarMemoryExists(content, memoryType, userId)) {
            return StoredMemory(
                Memory(
                    content = content,
                    memoryType = memoryType,
                    userId = userId ?: systemUserId,
                    metadata = metadata,
                    createdAt = LocalDateTime.now()
                )
            )
        }

        // Create a document for the vector store
        val document = Document(
            content,
            mapOf(
                "memoryType" to memoryType.name,
                "metadata" to metadata,
                "userId" to (userId ?: systemUserId),
                "createdAt" to LocalDateTime.now().toString()
            )
        )

        // Store the document in the vector store
        memoryVectorStore.add(listOf(document))

        return StoredMemory(
            Memory(
                content = content,
                memoryType = memoryType,
                userId = userId ?: systemUserId,
                metadata = metadata,
                createdAt = LocalDateTime.now()
            )
        )
    }

    fun retrieveMemories(
        query: String,
        memoryType: MemoryType? = null,
        userId: String? = null,
        limit: Int = 5,
        distanceThreshold: Float = 0.9f
    ): List<StoredMemory> {
        // Build filter expression
        val b = FilterExpressionBuilder()
        val filterList = mutableListOf<FilterExpressionBuilder.Op>()

        // Add user filter
        val effectiveUserId = userId ?: systemUserId
        filterList.add(b.or(b.eq("userId", effectiveUserId), b.eq("userId", systemUserId)))

        // Add memory type filter if specified
        if (memoryType != null) {
            filterList.add(b.eq("memoryType", memoryType.name))
        }

        // Combine filters
        val filterExpression = when (filterList.size) {
            0 -> null
            1 -> filterList[0]
            else -> filterList.reduce { acc, expr -> b.and(acc, expr) }
        }?.build()

        // Execute search
        val searchResults = memoryVectorStore.similaritySearch(
            SearchRequest.builder()
                .query(query)
                .topK(limit)
                .filterExpression(filterExpression)
                .build()
        )

        // Transform results to StoredMemory objects
        return searchResults.mapNotNull { result ->
            if (distanceThreshold < (result.score ?: 1.0)) {
                val metadata = result.metadata
                val memoryObj = Memory(
                    id = result.id,
                    content = result.text ?: "",
                    memoryType = MemoryType.valueOf(metadata["memoryType"] as String? ?: MemoryType.SEMANTIC.name),
                    metadata = metadata["metadata"] as String? ?: "{}",
                    userId = metadata["userId"] as String? ?: systemUserId,
                    createdAt = try {
                        LocalDateTime.parse(metadata["createdAt"] as String?)
                    } catch (_: Exception) {
                        LocalDateTime.now()
                    }
                )
                StoredMemory(memoryObj, result.score)
            } else {
                null
            }
        }
    }
}
Key features of the memory service (a usage sketch follows the list):
- Stores memories as vector embeddings in Redis
- Retrieves memories using vector similarity search
- Filters memories by user ID and memory type
- Prevents duplicate memories through similarity checking
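As a quick usage sketch, with hypothetical values and assuming the MemoryService is injected into another Spring bean (StoredMemory wraps a Memory together with its similarity score):

// Store a new episodic memory for a given user
val stored = memoryService.storeMemory(
    content = "User is allergic to shellfish",
    memoryType = MemoryType.EPISODIC,
    userId = "raphael"
)

// Later, before answering a restaurant question, recall what matters
val relevant = memoryService.retrieveMemories(
    query = "restaurant recommendations in Boston",
    userId = "raphael",
    limit = 3
)
relevant.forEach { println("${it.memory.memoryType}: ${it.memory.content}") }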
5. Implement Spring AI Advisors
We’re going to rely on the Spring AI Advisors API. Advisors are a way to intercept, modify, and enhance AI-driven interactions.
We will implement two advisors: one for retrieval and one for recording. These advisors will be plugged into our ChatClient and will intercept every interaction with the LLM.
5.1 Advisor for Long-term memory retrieval
The retrieval advisor runs before LLM calls. It takes the user’s current message, performs a vector similarity search over Redis, and injects the most relevant memories into the system portion of the prompt so the model can ground its answer.
@Component
class LongTermMemoryRetrievalAdvisor(
    private val memoryService: MemoryService,
) : CallAdvisor, Ordered {

    companion object {
        const val USER_ID = "ltm_user_id"
        const val TOP_K = "ltm_top_k"
    }

    override fun getOrder() = Ordered.HIGHEST_PRECEDENCE + 40

    override fun getName() = "LongTermMemoryRetrievalAdvisor"

    override fun adviseCall(req: ChatClientRequest, chain: CallAdvisorChain): ChatClientResponse {
        val userId = (req.context()[USER_ID] as? String) ?: "system"
        val k = (req.context()[TOP_K] as? Int) ?: 5
        val query = req.prompt().userMessage.text

        // Fetch the most relevant long-term memories for this user
        val memories = memoryService.retrieveMemories(query, userId = userId)
            .take(k)

        val memoryBlock = buildString {
            appendLine("Use the MEMORY below if relevant. Keep answers factual and concise.")
            appendLine("----- MEMORY -----")
            memories.forEachIndexed { i, m -> appendLine("${i + 1}. ${m.memory.content}") }
            appendLine("------------------")
        }

        // Prepend the memory block to the existing system message
        val enrichedPrompt = req.prompt().augmentSystemMessage { sys ->
            val existing = sys.text
            sys.mutate()
                .text(
                    buildString {
                        appendLine(memoryBlock)
                        if (existing.isNotBlank()) {
                            appendLine()
                            append(existing)
                        }
                    }
                ).build()
        }

        val enrichedReq = req.mutate()
            .prompt(enrichedPrompt)
            .build()

        return chain.nextCall(enrichedReq)
    }
}
5.2 Advisor for Long-term memory recording
The recorder advisor runs after the assistant responds. It looks at the last user message and the assistant’s reply, asks the model to extract atomic, useful facts (episodic or semantic), deduplicates them, and stores them in Redis.
@Component
class LongTermMemoryRecorderAdvisor(
    private val memoryService: MemoryService,
    private val chatModel: ChatModel
) : CallAdvisor, Ordered {

    data class MemoryCandidate(val content: String, val type: MemoryType, val userId: String?)
    data class ExtractionResult(val memories: List<MemoryCandidate> = emptyList())

    private val extractorConverter = BeanOutputConverter(ExtractionResult::class.java)

    override fun getOrder(): Int = Ordered.HIGHEST_PRECEDENCE + 60

    override fun getName(): String = "LongTermMemoryRecorderAdvisor"

    override fun adviseCall(req: ChatClientRequest, chain: CallAdvisorChain): ChatClientResponse {
        // 1) Proceed with the normal call (other advisors may have enriched the prompt)
        val res = chain.nextCall(req)

        // 2) Build extraction prompt (user + assistant text of *this* turn)
        val userText = req.prompt().userMessage.text
        val assistantText = res.chatResponse()?.result?.output?.text

        // 3) Ask the model to extract long-term memories as structured JSON
        val schemaHint = extractorConverter.jsonSchema // JSON schema string for the POJO

        val extractSystem = """
            You extract LONG-TERM MEMORIES from a dialogue turn.

            A memory is either:
            1. EPISODIC MEMORIES: Personal experiences and user-specific preferences
               Examples: "User prefers Delta airlines", "User visited Paris last year"
            2. SEMANTIC MEMORIES: General domain knowledge and facts
               Examples: "Singapore requires passport", "Tokyo has excellent public transit"

            Only extract clear, factual information. Do not make assumptions or infer
            information that isn't explicitly stated. If no memories can be extracted,
            return an empty array.

            The instance must conform to this JSON Schema (for validation, do not output it):
            $schemaHint

            Do not include code fences, schema, or properties. Output a single-line JSON object.
        """.trimIndent()

        val extractUser = """
            USER SAID: $userText
            ASSISTANT REPLIED: $assistantText

            Extract up to 5 memories with correct type; set userId if present/known.
        """.trimIndent()

        val options: ChatOptions = OpenAiChatOptions.builder()
            .responseFormat(ResponseFormat.builder().type(ResponseFormat.Type.JSON_OBJECT).build())
            .build()

        val extraction = chatModel.call(
            Prompt(
                listOf(
                    UserMessage(extractUser),
                    SystemMessage(extractSystem)
                ),
                options
            )
        )

        val parsed = extractorConverter.convert(extraction.result.output.text ?: "")
            ?: ExtractionResult()

        // 4) Persist memories (MemoryService handles dedupe/thresholding)
        val userId = (req.context["ltm_user_id"] as? String) // optional per-call param
        parsed.memories.forEach { m ->
            val owner = m.userId ?: userId
            memoryService.storeMemory(
                content = m.content,
                memoryType = m.type,
                userId = owner
            )
        }

        return res
    }
}
6. Plugging the advisors into our ChatClient
In our ChatConfig class, we will configure our ChatClient as follows. Note that each advisor's getOrder() value determines its position in the chain: lower values run first, so the retrieval advisor (HIGHEST_PRECEDENCE + 40) enriches the prompt before the recorder advisor (HIGHEST_PRECEDENCE + 60) sees it.
@Bean
fun chatClient(
    chatModel: ChatModel,
    // chatMemory: ChatMemory, (Necessary for short-term memory)
    longTermRecorder: LongTermMemoryRecorderAdvisor,
    longTermMemoryRetrieval: LongTermMemoryRetrievalAdvisor
): ChatClient {
    return ChatClient.builder(chatModel)
        .defaultAdvisors(
            // MessageChatMemoryAdvisor.builder(chatMemory).build(),
            longTermRecorder,
            longTermMemoryRetrieval
        ).build()
}
7. Implement the Chat Service
Since the advisors are plugged into the ChatClient itself, we don't need to manage memory ourselves when interacting with the LLM. The only thing we need to ensure is that every interaction sends the expected parameters, namely the session or user ID, so that the advisors know which history to look at.
@Service
class ChatService(
    private val chatClient: ChatClient,
    private val shortTermMemoryRepository: ShortTermMemoryRepository,
    private val travelAgentSystemPrompt: Message,
    private val chatMemoryRepository: ChatMemoryRepository
) {
    private val log = LoggerFactory.getLogger(ChatService::class.java)

    fun sendMessage(
        message: String,
        userId: String,
    ): ChatResult {
        // Use userId as the key for conversation history and long-term memory
        log.info("Processing message from user $userId: $message")

        val response = chatClient
            .prompt(
                Prompt(
                    travelAgentSystemPrompt,
                    UserMessage(message)
                )
            )
            .advisors {
                it
                    .param(ChatMemory.CONVERSATION_ID, userId)
                    .param("ltm_user_id", userId)
            }
            .call()

        return ChatResult(
            response = response.chatResponse()!!
        )
    }

    fun getConversationHistory(userId: String): List<Message?> {
        return chatMemoryRepository.findByConversationId(userId)
    }

    fun clearConversationHistory(userId: String) {
        shortTermMemoryRepository.deleteById(userId)
        log.info("Cleared conversation history for user $userId from Redis")
    }
}
8. Configure the Agent System Prompt
The agent is configured with a system prompt that explains its capabilities and access to different types of memory:
@Bean
fun travelAgentSystemPrompt(): Message {
    val promptText = """
        You are a travel assistant helping users plan their trips.
        You remember user preferences and provide personalized recommendations
        based on past interactions.

        You have access to the following types of memory:
        1. Short-term memory: The current conversation thread
        2. Long-term memory:
           - Episodic: User preferences and past trip experiences (e.g., "User prefers window seats")
           - Semantic: General knowledge about travel destinations and requirements

        Always be helpful, personal, and context-aware in your responses.
        Always answer in text format. No markdown or special formatting.
    """.trimIndent()

    return SystemMessage(promptText)
}
9. Create the REST Controller
The REST controller exposes endpoints for chat and memory management:
@RestController
@RequestMapping("/api")
class ChatController(private val chatService: ChatService) {

    @PostMapping("/chat")
    fun chat(@RequestBody request: ChatRequest): ChatResponse {
        val result = chatService.sendMessage(request.message, request.userId)
        return ChatResponse(
            message = result.response.result.output.text ?: "",
            metrics = result.metrics
        )
    }

    @GetMapping("/history/{userId}")
    fun getHistory(@PathVariable userId: String): List<MessageDto> {
        return chatService.getConversationHistory(userId).map { message ->
            MessageDto(
                role = when (message) {
                    is SystemMessage -> "system"
                    is UserMessage -> "user"
                    is AssistantMessage -> "assistant"
                    else -> "unknown"
                },
                content = when (message) {
                    is SystemMessage -> message.text
                    is UserMessage -> message.text
                    is AssistantMessage -> message.text
                    else -> ""
                }
            )
        }
    }

    @DeleteMapping("/history/{userId}")
    fun clearHistory(@PathVariable userId: String) {
        chatService.clearConversationHistory(userId)
    }
}
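With the app running, you can exercise these endpoints from the command line. The payloads below are illustrative; the field names match the controller above:

curl -X POST localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "I prefer window seats", "userId": "raphael"}'

curl localhost:8080/api/history/raphael

curl -X DELETE localhost:8080/api/history/raphael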
Running the Demo
The easiest way to run the demo is with Docker Compose, which sets up all required services in one command.
Step 1: Clone the repository
git clone https://github.com/redis/redis-springboot-recipes.git
cd redis-springboot-recipes/artificial-intelligence/agent-long-term-memory-with-spring-ai
Step 2: Configure your environment
Create a .env file with your OpenAI API key:
OPENAI_API_KEY=sk-your-api-key
Step 3: Start the services
docker compose up --build
This will start:
- redis: for storing both vector embeddings and chat history
- redis-insight: a UI to explore the Redis data
- agent-memory-app: the Spring Boot app that implements the memory-aware AI agent
Step 4: Use the application
When all services are running, go to localhost:8080 to access the demo. You'll see a travel assistant interface with a chat panel and a memory management sidebar:
- Enter a user ID and click "Start Chat".
- Send a message like: "Hi, my name's Raphael. I went to Paris back in 2009 with my wife for our honeymoon and we had a lovely time. For our 10-year anniversary we're planning to go back. Help us plan the trip!"
The system will reply to your message and, if it identifies potential memories worth keeping, store them as either semantic or episodic memories. You can see the stored memories in the "Memory Management" sidebar.
On top of that, with each message, the system will also return performance metrics.
If you refresh the page, the chat window and memory sidebar are cleared. If you then re-enter the same user ID, the long-term memories will be reloaded in the sidebar, and the short-term memory (the chat history) will be restored as well.
Exploring the Data in Redis Insight
Redis Insight provides a visual interface for exploring the data stored in Redis. Access it at localhost:5540 to see:
- Short-term memory (conversation history) stored in Redis Lists
- Long-term memory (facts and experiences) stored as JSON documents with vector embeddings
- The vector index schema used for similarity search
If you run the FT.INFO longTermMemoryIdx command in the Redis Insight workbench, you'll see the details of the vector index schema that enables efficient memory retrieval.
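You can also query the index directly. As a quick example, the following command lists a user's stored memories without performing a vector search (field names come from the vector store configuration earlier; the user ID is illustrative):

FT.SEARCH longTermMemoryIdx "@userId:{raphael}" RETURN 2 content memoryType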
Wrapping up
And that's it — you now have a working AI agent with memory using Spring Boot and Redis.
Instead of forgetting everything between conversations, your agent can now remember user preferences, past experiences, and important facts. Redis handles both short-term memory (conversation history) and long-term memory (vector embeddings) — all with the performance and scalability Redis is known for.
With Spring AI and Redis, you get an easy way to integrate this into your Java applications. The combination of vector similarity search for semantic retrieval and traditional data structures for conversation history gives you a powerful foundation for building truly intelligent agents.
Whether you're building customer service bots, personal assistants, or domain-specific experts, this memory architecture gives you the tools to create more helpful, personalized, and context-aware AI experiences.
Try it out, experiment with different memory types, explore other embedding models, and see how far you can push the boundaries of AI agent capabilities!
Stay Curious!