LLM Reranker

The LLM-based reranker provides maximum flexibility by using any Large Language Model to score document relevance. This approach allows custom prompts and domain-specific scoring logic.
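
Conceptually, the reranker sends each candidate document to the LLM together with the query, asks for a numeric relevance score, and sorts candidates by that score. The standalone sketch below illustrates the idea using the OpenAI SDK directly; it is not Mem0's internal implementation, and the function names are made up for illustration.
Python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def llm_relevance_score(query: str, document: str) -> float:
    # Ask the LLM for a single relevance score between 0.0 and 1.0
    prompt = (
        "Rate how well this document answers the query.\n"
        f'Query: "{query}"\nDocument: "{document}"\n'
        "Reply with a single number between 0.0 and 1.0."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,  # deterministic scoring
        max_tokens=10,
    )
    try:
        return float(response.choices[0].message.content.strip())
    except ValueError:
        return 0.0  # fall back if the model returns non-numeric text

def rerank(query: str, documents: list[str], top_k: int = 5) -> list[tuple[str, float]]:
    # Score every candidate, then sort by score descending and keep top_k
    scored = [(doc, llm_relevance_score(query, doc)) for doc in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]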

Supported LLM Providers

Any LLM provider supported by Mem0 can be used for reranking:
  • OpenAI: GPT-4, GPT-3.5-turbo, etc.
  • Anthropic: Claude models
  • Together: Open-source models
  • Groq: Fast inference
  • Ollama: Local models
  • And more…

Configuration

Python
from mem0 import Memory

config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "my_memories",
            "path": "./chroma_db"
        }
    },
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini"
        }
    },
    "rerank": {
        "provider": "llm",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai",
            "api_key": "your-openai-api-key",  # or set OPENAI_API_KEY
            "top_k": 5,
            "temperature": 0.0
        }
    }
}

memory = Memory.from_config(config)

Custom Scoring Prompt

You can provide a custom prompt for relevance scoring:
Python
custom_prompt = """You are a relevance scoring assistant. Rate how well this document answers the query.

Query: "{query}"
Document: "{document}"

Score from 0.0 to 1.0 where:
- 1.0: Perfect match, directly answers the query
- 0.8-0.9: Highly relevant, good match  
- 0.6-0.7: Moderately relevant, partial match
- 0.4-0.5: Slightly relevant, limited useful information
- 0.0-0.3: Not relevant or no useful information

Provide only a single numerical score between 0.0 and 1.0."""

config["rerank"]["config"]["scoring_prompt"] = custom_prompt

Usage Example

Python
import os
from mem0 import Memory

# Set API key
os.environ["OPENAI_API_KEY"] = "your-api-key"

# Initialize memory with LLM reranker
config = {
    "vector_store": {"provider": "chroma"},
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
    "rerank": {
        "provider": "llm",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai",
            "temperature": 0.0
        }
    }
}

memory = Memory.from_config(config)

# Add memories
messages = [
    {"role": "user", "content": "I'm learning Python programming"},
    {"role": "user", "content": "I find object-oriented programming challenging"}, 
    {"role": "user", "content": "I love hiking in national parks"}
]

memory.add(messages, user_id="david")

# Search with LLM reranking
results = memory.search("What programming topics is the user studying?", user_id="david")

for result in results['results']:
    print(f"Memory: {result['memory']}")
    print(f"Vector Score: {result['score']:.3f}")
    print(f"Rerank Score: {result['rerank_score']:.3f}")
    print()
Output
Memory: I'm learning Python programming
Vector Score: 0.856
Rerank Score: 0.920

Memory: I find object-oriented programming challenging
Vector Score: 0.782
Rerank Score: 0.850

Domain-Specific Scoring

Create specialized scoring for your domain:
Python
medical_prompt = """You are a medical relevance expert. Score how relevant this medical record is to the clinical query.

Clinical Query: "{query}"
Medical Record: "{document}"

Consider:
- Clinical relevance and accuracy
- Patient safety implications
- Diagnostic value
- Treatment relevance

Score from 0.0 to 1.0. Provide only the numerical score."""

config = {
    "rerank": {
        "provider": "llm",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai",
            "scoring_prompt": medical_prompt,
            "temperature": 0.0
        }
    }
}

Multiple LLM Providers

Use different LLM providers for reranking:
Python
# Using Anthropic Claude
anthropic_config = {
    "rerank": {
        "provider": "llm",
        "config": {
            "model": "claude-3-haiku-20240307",
            "provider": "anthropic",
            "temperature": 0.0
        }
    }
}

# Using local Ollama model
ollama_config = {
    "rerank": {
        "provider": "llm", 
        "config": {
            "model": "llama2:7b",
            "provider": "ollama",
            "temperature": 0.0
        }
    }
}

Configuration Parameters

Parameter        Description                         Type    Default
model            LLM model to use for scoring        str     "gpt-4o-mini"
provider         LLM provider name                   str     "openai"
api_key          API key for the LLM provider        str     None
top_k            Maximum documents to return         int     None
temperature      Temperature for LLM generation      float   0.0
max_tokens       Maximum tokens for LLM response     int     100
scoring_prompt   Custom prompt template              str     Default prompt
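
Putting the parameters together, a fully specified rerank block might look like the sketch below. The values are illustrative, and custom_prompt refers to the template defined earlier:
Python
config["rerank"] = {
    "provider": "llm",
    "config": {
        "model": "gpt-4o-mini",           # LLM used for scoring
        "provider": "openai",             # LLM provider name
        "api_key": None,                  # falls back to OPENAI_API_KEY
        "top_k": 5,                       # return at most 5 documents
        "temperature": 0.0,               # deterministic scoring
        "max_tokens": 100,                # cap on the scoring response
        "scoring_prompt": custom_prompt,  # custom template from above
    },
}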

Advantages

  • Maximum Flexibility: Custom prompts for any use case
  • Domain Expertise: Leverage LLM knowledge for specialized domains
  • Interpretability: Understand scoring through prompt engineering
  • Multi-criteria: Score based on multiple relevance factors

Considerations

  • Latency: Higher latency than specialized rerankers
  • Cost: LLM API costs per reranking operation
  • Consistency: May have slight variations in scoring
  • Prompt Engineering: Requires careful prompt design

Best Practices

  1. Temperature: Use 0.0 for consistent scoring
  2. Prompt Design: Be specific about scoring criteria
  3. Token Efficiency: Keep prompts concise to reduce costs
  4. Caching: Cache results for repeated queries when possible
  5. Fallback: Handle API errors gracefully (a combined sketch of both follows below)
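
The caching and fallback practices can be combined in a thin wrapper around memory.search. The sketch below is one possible approach, not part of the Mem0 API; the cached_search helper is hypothetical.
Python
import hashlib

_cache: dict[str, dict] = {}

def cached_search(memory, query: str, user_id: str) -> dict:
    """Cache reranked results per (query, user) and degrade gracefully on errors."""
    key = hashlib.sha256(f"{user_id}:{query}".encode()).hexdigest()
    if key in _cache:
        return _cache[key]  # skip the LLM round-trip for repeated queries
    try:
        results = memory.search(query, user_id=user_id)
    except Exception as exc:  # e.g. LLM provider outage or rate limit
        print(f"Reranking failed ({exc}); returning empty results")
        return {"results": []}
    _cache[key] = results
    return results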