Multimodal Support
Integrate images and documents into your interactions with Mem0
Mem0 extends its capabilities beyond text by supporting multimodal data, including images and documents. With this feature, users can seamlessly integrate visual and document content into their interactions—allowing Mem0 to extract relevant information from various media types and enrich the memory system.
How It Works
When a user submits an image or document, Mem0 processes it to extract textual information and other pertinent details. These details are then added to the user’s memory, enhancing the system’s ability to understand and recall multimodal inputs.
Supported Media Types
Mem0 currently supports the following media types:
- Images - JPG, PNG, and other common image formats
- Documents - MDX, TXT, and PDF files
Integration Methods
1. Images
Using an Image URL (Recommended)
You can include an image by providing its direct URL. This method is simple and efficient for online images.
Using Base64 Image Encoding for Local Files
For local images—or when embedding the image directly is preferable—you can use a Base64-encoded string.
2. Text Documents (MDX/TXT)
Mem0 supports both online and local text documents in MDX or TXT format.
Using a Document URL
Using Base64 Encoding for Local Documents
3. PDF Documents
Mem0 supports PDF documents via URL.
Complete Example with Multiple File Types
Here’s a comprehensive example showing how to work with different file types:
Using these methods, you can seamlessly incorporate various media types into your interactions, further enhancing Mem0’s multimodal capabilities.
If you have any questions, please feel free to reach out to us using one of the following methods: