This guide demonstrates how to combine OpenAI’s Agents SDK for voice applications with Mem0’s memory capabilities to create a voice assistant that remembers user preferences and past interactions.
```python
import os

# OpenAI Agents SDK imports
from agents import (
    Agent,
    function_tool
)
from agents.voice import (
    AudioInput,
    SingleAgentVoiceWorkflow,
    VoicePipeline
)
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions

# Mem0 imports
from mem0 import AsyncMemoryClient

# Set up API keys (replace with your actual keys)
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
os.environ["MEM0_API_KEY"] = "your-mem0-api-key"

# Define a global user ID for simplicity
USER_ID = "voice_user"

# Initialize Mem0 client
mem0_client = AsyncMemoryClient()
```
This section handles:
Importing required modules from OpenAI Agents SDK and Mem0
Setting up environment variables for API keys
Defining a simple user identification system (using a global variable)
Initializing the Mem0 client that will handle memory operations
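Hard-coded keys are fine for a quick test, but the placeholders are easy to leave in by accident. A minimal sketch of a startup check, assuming a hypothetical helper named `require_env` (it is not part of the OpenAI Agents SDK or Mem0) that reads a key from the environment and fails early if it is missing or still a placeholder:

```python
import os


def require_env(name: str, placeholder_prefix: str = "your-") -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for this guide; not part of either SDK.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set")
    if value.startswith(placeholder_prefix):
        raise RuntimeError(f"{name} still holds the placeholder value")
    return value
```

One way to use it would be to call `require_env("OPENAI_API_KEY")` and `require_env("MEM0_API_KEY")` before creating the Mem0 client, so a misconfigured environment surfaces before any audio is recorded.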
```python
import logging

# Set up logging at the top of your file
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    force=True
)
logger = logging.getLogger("memory_voice_agent")

# Then use logger in your function tools
@function_tool
async def save_memories(
    memory: str
) -> str:
    """Store a user memory in memory."""
    # This will be visible in your console
    logger.debug(f"Saving memory: {memory} for user {USER_ID}")

    # Store the preference in Mem0
    memory_content = f"User memory - {memory}"
    await mem0_client.add(
        memory_content,
        user_id=USER_ID,
    )

    return f"I've saved your memory: {memory}"
```
This function:
Takes a memory string
Creates a formatted memory string
Stores it in Mem0 using the add() method
Scopes the memory to the current user through the user_id argument
Returns a confirmation message that the agent will speak
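The complete listing in this guide passes `metadata={"type": "agent_response"}` to `add()` when it stores the agent's reply. The same keyword could be used to tag user memories. The sketch below shows that idea against an in-memory stand-in so it runs without a Mem0 account; `FakeMemoryClient`, `save_memory`, the `category` parameter, and the `"user_memory"` tag are all illustrative assumptions rather than Mem0 names.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeMemoryClient:
    """In-memory stand-in for the Mem0 client (hypothetical, for local testing)."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    async def add(self, content: str, user_id: str,
                  metadata: Optional[Dict[str, Any]] = None) -> None:
        # Record the call instead of sending it anywhere.
        self.calls.append(
            {"content": content, "user_id": user_id, "metadata": metadata}
        )


async def save_memory(client: Any, user_id: str, memory: str,
                      category: str = "user_memory") -> str:
    """Same steps as the tool above, plus a metadata tag (tag name is assumed)."""
    memory_content = f"User memory - {memory}"
    await client.add(memory_content, user_id=user_id, metadata={"type": category})
    return f"I've saved your memory: {memory}"


client = FakeMemoryClient()
reply = asyncio.run(save_memory(client, "voice_user", "prefers window seats"))
print(reply)
print(client.calls[0])
```

Keeping the storage logic in a plain coroutine that takes the client as an argument also makes it testable without the `@function_tool` wrapper.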
```python
@function_tool
async def search_memories(
    query: str
) -> str:
    """
    Find memories relevant to the current conversation.
    Args:
        query: The search query to find relevant memories
    """
    print(f"Finding memories related to: {query}")
    results = await mem0_client.search(
        query,
        filters={"user_id": USER_ID},
        top_k=5,
        threshold=0.7,  # Higher threshold for more relevant results
    )

    # Format and return the results
    if not results.get('results', []):
        return "I don't have any relevant memories about this topic."

    memories = [f"• {result['memory']}" for result in results.get('results', [])]
    return "Here's what I remember that might be relevant:\n" + "\n".join(memories)
```
This tool:
Takes a search query string
Passes it to Mem0’s semantic search to find related memories
Sets a threshold for relevance to ensure quality results
Returns a formatted list of relevant memories or a default message
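The formatting step of this tool can be pulled into a pure function and checked without calling Mem0. The sketch below assumes a hypothetical helper named `format_memories`. It accepts the `{"results": [...]}` shape the tool expects and, as a defensive assumption in case the response shape differs between client versions, a bare list as well. The sample data is made up.

```python
from typing import Any, Dict, List, Union

SearchResponse = Union[Dict[str, Any], List[Dict[str, Any]]]


def format_memories(response: SearchResponse) -> str:
    """Turn a search response into the text the agent speaks."""
    # Accept either {"results": [...]} or a bare list of result dicts.
    items = response.get("results", []) if isinstance(response, dict) else response
    # Skip entries that carry no memory text.
    memories = [f"• {item['memory']}" for item in items if item.get("memory")]
    if not memories:
        return "I don't have any relevant memories about this topic."
    return "Here's what I remember that might be relevant:\n" + "\n".join(memories)


# Made-up sample data, not real Mem0 output.
sample = {
    "results": [
        {"memory": "Likes jazz", "score": 0.91},
        {"memory": "Lives in Lisbon", "score": 0.78},
    ]
}
print(format_memories(sample))
print(format_memories({"results": []}))
```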
```python
def create_memory_voice_agent():
    # Create the agent with memory-enabled tools
    agent = Agent(
        name="Memory Assistant",
        instructions=prompt_with_handoff_instructions(
            """You're speaking to a human, so be polite and concise.
            Always respond in clear, natural English.
            You have the ability to remember information about the user.
            Use the save_memories tool when the user shares important information worth remembering.
            Use the search_memories tool when you need context from past conversations or the user asks you to recall something.
            """,
        ),
        model="gpt-5-mini",
        tools=[save_memories, search_memories],
    )

    return agent
```
This function:
Creates an OpenAI Agent with specific instructions
Configures it to use gpt-5-mini (you can use other models)
Registers the memory-related tools with the agent
Wraps the instructions in prompt_with_handoff_instructions, which prepends the SDK's recommended handoff prompt to them
```python
async def record_from_microphone(duration=5, samplerate=24000):
    """Record audio from the microphone for a specified duration."""
    print(f"Recording for {duration} seconds...")

    # Create a buffer to store the recorded audio
    frames = []

    # Callback function to store audio data
    def callback(indata, frames_count, time_info, status):
        frames.append(indata.copy())

    # Start recording
    with sd.InputStream(samplerate=samplerate, channels=1, callback=callback, dtype=np.int16):
        await asyncio.sleep(duration)

    # Combine all frames into a single numpy array
    audio_data = np.concatenate(frames)
    return audio_data
```
This function:
Records from the microphone asynchronously for a fixed duration
Uses the sounddevice library to capture audio input
Stores frames in a buffer during recording
Combines frames into a single numpy array when complete
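When a recording sounds wrong or the agent hears nothing, it can help to write the captured buffer to disk and play it back. The sketch below uses only the standard library and assumes a hypothetical helper named `save_wav`. It writes mono 16-bit PCM at the same 24000 Hz rate used above, and a synthetic tone stands in for real microphone data so the example runs without audio hardware.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path


def save_wav(path: str, pcm_bytes: bytes, samplerate: int = 24000) -> None:
    """Write mono 16-bit PCM bytes to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono, matching channels=1 above
        wav_file.setsampwidth(2)      # 2 bytes per sample, i.e. int16
        wav_file.setframerate(samplerate)
        wav_file.writeframes(pcm_bytes)


# Synthetic one-second 440 Hz tone standing in for a real recording.
samplerate = 24000
tone = struct.pack(
    f"<{samplerate}h",
    *(int(8000 * math.sin(2 * math.pi * 440 * n / samplerate))
      for n in range(samplerate)),
)

wav_path = str(Path(tempfile.gettempdir()) / "mem0_voice_check.wav")
save_wav(wav_path, tone)

# Read the header back to confirm what was written.
with wave.open(wav_path, "rb") as check:
    print(check.getframerate(), check.getnframes())
```

With the NumPy array returned by `record_from_microphone`, `audio_data.tobytes()` should give the bytes to pass in on little-endian machines, since WAV stores samples in little-endian order.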
Now that we’ve explained each component, here’s the complete implementation that combines OpenAI Agents SDK for voice with Mem0’s memory capabilities:
```python
import asyncio
import os
import logging
from typing import Optional, List, Dict, Any
import numpy as np
import sounddevice as sd
from pydantic import BaseModel

# OpenAI Agents SDK imports
from agents import (
    Agent,
    function_tool
)
from agents.voice import (
    AudioInput,
    SingleAgentVoiceWorkflow,
    VoicePipeline
)
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions

# Mem0 imports
from mem0 import AsyncMemoryClient

# Set up API keys (replace with your actual keys)
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
os.environ["MEM0_API_KEY"] = "your-mem0-api-key"

# Define a global user ID for simplicity
USER_ID = "voice_user"

# Initialize Mem0 client
mem0_client = AsyncMemoryClient()

# Create tools that utilize Mem0's memory
@function_tool
async def save_memories(
    memory: str
) -> str:
    """
    Store a user memory in memory.
    Args:
        memory: The memory to save
    """
    print(f"Saving memory: {memory} for user {USER_ID}")

    # Store the preference in Mem0
    memory_content = f"User memory - {memory}"
    await mem0_client.add(
        memory_content,
        user_id=USER_ID,
    )

    return f"I've saved your memory: {memory}"

@function_tool
async def search_memories(
    query: str
) -> str:
    """
    Find memories relevant to the current conversation.
    Args:
        query: The search query to find relevant memories
    """
    print(f"Finding memories related to: {query}")
    results = await mem0_client.search(
        query,
        filters={"user_id": USER_ID},
        top_k=5,
        threshold=0.7,  # Higher threshold for more relevant results
    )

    # Format and return the results
    if not results.get('results', []):
        return "I don't have any relevant memories about this topic."

    memories = [f"• {result['memory']}" for result in results.get('results', [])]
    return "Here's what I remember that might be relevant:\n" + "\n".join(memories)

# Create the agent with memory-enabled tools
def create_memory_voice_agent():
    # Create the agent with memory-enabled tools
    agent = Agent(
        name="Memory Assistant",
        instructions=prompt_with_handoff_instructions(
            """You're speaking to a human, so be polite and concise.
            Always respond in clear, natural English.
            You have the ability to remember information about the user.
            Use the save_memories tool when the user shares important information worth remembering.
            Use the search_memories tool when you need context from past conversations or the user asks you to recall something.
            """,
        ),
        model="gpt-5-mini",
        tools=[save_memories, search_memories],
    )

    return agent

async def record_from_microphone(duration=5, samplerate=24000):
    """Record audio from the microphone for a specified duration."""
    print(f"Recording for {duration} seconds...")

    # Create a buffer to store the recorded audio
    frames = []

    # Callback function to store audio data
    def callback(indata, frames_count, time_info, status):
        frames.append(indata.copy())

    # Start recording
    with sd.InputStream(samplerate=samplerate, channels=1, callback=callback, dtype=np.int16):
        await asyncio.sleep(duration)

    # Combine all frames into a single numpy array
    audio_data = np.concatenate(frames)
    return audio_data

async def main():
    print("Starting Memory Voice Agent")

    # Create the agent and context
    agent = create_memory_voice_agent()

    # Set up the voice pipeline
    pipeline = VoicePipeline(
        workflow=SingleAgentVoiceWorkflow(agent)
    )

    # Configure TTS settings
    pipeline.config.tts_settings.voice = "alloy"
    pipeline.config.tts_settings.speed = 1.0

    try:
        while True:
            # Get user input
            print("\nPress Enter to start recording (or 'q' to quit)...")
            user_input = input()
            if user_input.lower() == 'q':
                break

            # Record and process audio
            audio_data = await record_from_microphone(duration=5)
            audio_input = AudioInput(buffer=audio_data)

            print("Processing your request...")

            # Process the audio input
            result = await pipeline.run(audio_input)

            # Create an audio player
            player = sd.OutputStream(samplerate=24000, channels=1, dtype=np.int16)
            player.start()

            # Store the agent's response for adding to memory
            agent_response = ""

            print("\nAgent response:")
            # Play the audio stream as it comes in
            async for event in result.stream():
                if event.type == "voice_stream_event_audio":
                    player.write(event.data)
                elif event.type == "voice_stream_event_content":
                    # Accumulate and print the text response
                    content = event.data
                    agent_response += content
                    print(content, end="", flush=True)

            # Release the output stream so a new one is not leaked on every turn
            player.stop()
            player.close()

            print("\n")

            # Example of saving the conversation to Mem0 after completion
            if agent_response:
                try:
                    await mem0_client.add(
                        f"Agent response: {agent_response}",
                        user_id=USER_ID,
                        metadata={"type": "agent_response"}
                    )
                except Exception as e:
                    print(f"Failed to store memory: {e}")

    except KeyboardInterrupt:
        print("\nExiting...")

if __name__ == "__main__":
    asyncio.run(main())
```
Replace the placeholder API keys with your actual keys
Make sure your microphone is properly connected
Run the script with a Python version supported by the OpenAI Agents SDK (3.9 or newer)
Press Enter to start recording, then speak your request
Type q and press Enter to quit the application
The agent records your request, processes it with the OpenAI model, calls the Mem0 tools when it needs to save or recall something, and responds with both printed text and synthesized speech.
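The response handling in `main()` comes down to one loop that routes each stream event by its `type`. The sketch below reproduces that routing with made-up events so it runs without the SDK or audio hardware. `FakeEvent`, `fake_stream`, and `consume` are hypothetical names, and the two event type strings are copied from the listing above.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, List, Tuple


@dataclass
class FakeEvent:
    """Hypothetical stand-in for a voice stream event."""
    type: str
    data: Any


async def fake_stream() -> AsyncIterator[FakeEvent]:
    # Made-up events in an order a response might plausibly arrive.
    yield FakeEvent("voice_stream_event_content", "You like ")
    yield FakeEvent("voice_stream_event_audio", b"\x00\x01")
    yield FakeEvent("voice_stream_event_content", "jazz.")
    yield FakeEvent("voice_stream_event_audio", b"\x02\x03")


async def consume(stream: AsyncIterator[FakeEvent]) -> Tuple[str, List[bytes]]:
    """Collect text and audio chunks, mirroring the loop in main()."""
    text = ""
    audio_chunks: List[bytes] = []
    async for event in stream:
        if event.type == "voice_stream_event_audio":
            audio_chunks.append(event.data)  # main() writes this to the player
        elif event.type == "voice_stream_event_content":
            text += event.data               # main() prints and accumulates this
    return text, audio_chunks


text, audio_chunks = asyncio.run(consume(fake_stream()))
print(text, len(audio_chunks))
```

Event types that match neither branch are ignored here, which is also what the loop in `main()` does.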
By combining OpenAI’s Agents SDK with Mem0’s memory capabilities, you can create voice agents that maintain persistent memory of user preferences and past interactions. This significantly enhances the user experience by making conversations more natural and personalized.

As you build your voice application, experiment with different memory strategies and filtering approaches to find the optimal balance between comprehensive memory and efficient retrieval for your specific use case.
When working with the OpenAI Agents SDK, you might notice that regular print() statements inside @function_tool-decorated functions don’t appear in your console output. This is because the Agents SDK captures and redirects standard output when executing these functions.

To effectively debug your function tools, use Python’s logging module instead:
```python
import logging

# Set up logging at the top of your file
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    force=True
)
logger = logging.getLogger("memory_voice_agent")

# Then use logger in your function tools
@function_tool
async def save_memories(
    memory: str
) -> str:
    """Store a user memory in memory."""
    # This will be visible in your console
    logger.debug(f"Saving memory: {memory} for user {USER_ID}")

    # Rest of your function...
```
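The reason logging helps can be shown without the SDK. The sketch below simulates the redirection with `contextlib.redirect_stdout`. It is only an illustration of the general mechanism and makes no claim about the SDK's internals. A log handler holds its own stream, so its output is unaffected when standard output is swapped out. The logger name and `noisy_tool` function are hypothetical.

```python
import contextlib
import io
import logging

# A separate stream stands in for the console's stderr.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))

logger = logging.getLogger("memory_voice_agent_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output in one place


def noisy_tool(memory: str) -> str:
    """Hypothetical tool body that both prints and logs."""
    print(f"print: saving {memory}")
    logger.debug("logger: saving %s", memory)
    return "ok"


# Simulate a caller that captures standard output while the tool runs.
captured_stdout = io.StringIO()
with contextlib.redirect_stdout(captured_stdout):
    noisy_tool("likes jazz")

print("captured by caller:", captured_stdout.getvalue().strip())
print("reached the log:", log_stream.getvalue().strip())
```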
Multimodal Support
Learn how to add vision and audio memory alongside voice interactions.
Build a Mem0 Companion
Master the core patterns for building memory-powered companions.