Integrate images and documents into your interactions with Mem0
Mem0 extends beyond text by supporting multimodal data, including images and documents. You can include visual and document content in your interactions, and Mem0 extracts the relevant information from each media type to enrich the memory system.
When a user submits an image or document, Mem0 processes it to extract textual information and other pertinent details. These details are then added to the user’s memory, enhancing the system’s ability to understand and recall multimodal inputs.
    import os
    from mem0 import MemoryClient

    os.environ["MEM0_API_KEY"] = "your-api-key"

    client = MemoryClient()

    messages = [
        {
            "role": "user",
            "content": "Hi, my name is Alice."
        },
        {
            "role": "assistant",
            "content": "Nice to meet you, Alice! What do you like to eat?"
        },
        {
            "role": "user",
            "content": {
                "type": "image_url",
                "image_url": {
                    "url": "https://www.superhealthykids.com/wp-content/uploads/2021/10/best-veggie-pizza-featured-image-square-2.jpg"
                }
            }
        },
    ]

    # Calling the add method to ingest messages into the memory system
    client.add(messages, user_id="alice")

    import base64

    # Path to the document file
    document_path = "path/to/your/document.txt"

    # Function to convert file to Base64
    def file_to_base64(file_path):
        with open(file_path, "rb") as file:
            return base64.b64encode(file.read()).decode('utf-8')

    # Encode the document in Base64
    base64_document = file_to_base64(document_path)

    # Create the message dictionary with the Base64-encoded document
    document_message = {
        "role": "user",
        "content": {
            "type": "mdx_url",
            "mdx_url": {
                "url": base64_document
            }
        }
    }

    client.add([document_message], user_id="alice")

    # Define the PDF URL
    pdf_url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"

    # Create the message dictionary with the PDF URL
    pdf_message = {
        "role": "user",
        "content": {
            "type": "pdf_url",
            "pdf_url": {
                "url": pdf_url
            }
        }
    }

    client.add([pdf_message], user_id="alice")
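The examples above use three content types: `image_url`, `mdx_url`, and `pdf_url`. As a rough sketch, the type could be chosen from a file's extension before the message is built. The helper name `content_type_for` and the extension table are assumptions made for illustration. Only `.jpg`, `.txt`, and `.pdf` appear on this page, so the other entries are guesses, and none of this is an official list of formats Mem0 accepts.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to the content types used on this page.
# The extension list is an assumption for illustration, not an official Mem0 list.
EXTENSION_TO_TYPE = {
    ".jpg": "image_url",
    ".jpeg": "image_url",
    ".png": "image_url",
    ".txt": "mdx_url",
    ".mdx": "mdx_url",
    ".pdf": "pdf_url",
}


def content_type_for(url: str) -> str:
    """Pick a content type from the file extension of a URL or POSIX-style path."""
    # urlparse drops any query string or fragment, leaving only the path part.
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    try:
        return EXTENSION_TO_TYPE[extension]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {extension!r}") from None
```

The returned string can be used both as the `type` value and as the key that holds the `url`, since the examples on this page use the same name for both.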
Here’s a comprehensive example showing how to work with different file types:
    import base64
    from mem0 import MemoryClient

    client = MemoryClient()

    def file_to_base64(file_path):
        with open(file_path, "rb") as file:
            return base64.b64encode(file.read()).decode('utf-8')

    # Example 1: Using an image URL
    image_message = {
        "role": "user",
        "content": {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/sample-image.jpg"
            }
        }
    }

    # Example 2: Using a text document URL
    text_message = {
        "role": "user",
        "content": {
            "type": "mdx_url",
            "mdx_url": {
                "url": "https://www.w3.org/TR/2003/REC-PNG-20031110/iso_8859-1.txt"
            }
        }
    }

    # Example 3: Using a PDF URL
    pdf_message = {
        "role": "user",
        "content": {
            "type": "pdf_url",
            "pdf_url": {
                "url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
            }
        }
    }

    # Add each message to the memory system
    client.add([image_message], user_id="alice")
    client.add([text_message], user_id="alice")
    client.add([pdf_message], user_id="alice")
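The three message dictionaries above differ only in their content type and URL, so the repetition could be folded into a small helper. The following is a minimal sketch under that assumption: `media_message`, `add_media`, and `RecordingClient` are hypothetical names introduced here, not part of the Mem0 SDK. The only client behavior relied on is the `client.add(messages, user_id=...)` call already shown on this page.

```python
def media_message(content_type: str, url: str) -> dict:
    """Build a user message for one of the content types shown above."""
    if content_type not in ("image_url", "mdx_url", "pdf_url"):
        raise ValueError(f"Unexpected content type: {content_type!r}")
    # The content type doubles as the key that holds the URL.
    return {
        "role": "user",
        "content": {"type": content_type, content_type: {"url": url}},
    }


def add_media(client, items, user_id: str) -> list:
    """Add each (content_type, url) pair as its own message, one call per item."""
    results = []
    for content_type, url in items:
        message = media_message(content_type, url)
        results.append(client.add([message], user_id=user_id))
    return results


class RecordingClient:
    """Hypothetical stand-in for MemoryClient that only records calls."""

    def __init__(self):
        self.calls = []

    def add(self, messages, user_id):
        self.calls.append((messages, user_id))
        return len(self.calls)
```

With a real client, the call would look like `add_media(client, [("pdf_url", pdf_url)], user_id="alice")`. The recording stand-in is there only so the sketch can be tried without an API key.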
Using these methods, you can incorporate images, text documents, and PDFs into your interactions with Mem0.