Overview
The LLM reranker lets you use any supported language model as a reranker. It prompts the LLM to score and rank memories by their relevance to the query. While slower than specialized rerankers, it offers maximum flexibility and can be tailored to your domain with custom prompts.
Configuration
Basic Setup
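A minimal configuration sketch is shown below; the `reranker` block and its keys follow the parameter table in the next section, but exact key names may differ in your installed version.

```python
# Minimal sketch of an LLM reranker configuration.
# Key names are taken from the parameter table below; verify against your version.
config = {
    "reranker": {
        "provider": "llm",
        "config": {
            "llm": {
                "provider": "openai",
                "config": {"model": "gpt-4o-mini"},
            },
            "top_k": 10,        # number of candidates to rerank
            "temperature": 0.0, # deterministic scoring
        },
    },
}
```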
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `llm` | dict | Required | LLM configuration object |
| `top_k` | int | 10 | Number of results to rerank |
| `temperature` | float | 0.0 | LLM temperature for consistency |
| `custom_prompt` | str | None | Custom reranking prompt |
| `score_range` | tuple | (0, 10) | Score range for relevance |
Advanced Configuration
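A fuller sketch that uses every parameter from the table above; the values, prompt wording, and key names are illustrative.

```python
# Advanced configuration sketch: custom prompt, score range, and a smaller top_k.
config = {
    "reranker": {
        "provider": "llm",
        "config": {
            "llm": {
                "provider": "openai",
                "config": {"model": "gpt-4", "temperature": 0.0},
            },
            "top_k": 5,               # rerank only the strongest candidates
            "temperature": 0.0,       # keep scoring consistent across runs
            "score_range": (0, 10),   # scale the LLM is asked to use
            "custom_prompt": (
                "Rate the relevance of the memory to the query on a scale of "
                "0-10 and return only the number."
            ),
        },
    },
}
```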
Supported LLM Providers
OpenAI
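An OpenAI-backed LLM block for the reranker might look like this (model name and keys are illustrative); it drops into the `llm` field of the reranker config.

```python
# OpenAI-backed reranking LLM (model name illustrative).
llm_config = {
    "provider": "openai",
    "config": {"model": "gpt-4o-mini", "api_key": "sk-..."},
}
```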
Anthropic
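An Anthropic-backed equivalent (model identifier illustrative):

```python
# Anthropic-backed reranking LLM.
llm_config = {
    "provider": "anthropic",
    "config": {"model": "claude-3-sonnet-20240229"},
}
```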
Ollama (Local)
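A local Ollama-backed setup, assuming Ollama is running on its default port; the base-URL key name is an assumption.

```python
# Local Ollama model: no API cost, data stays on your machine.
llm_config = {
    "provider": "ollama",
    "config": {
        "model": "llama3",
        "ollama_base_url": "http://localhost:11434",  # key name assumed
    },
}
```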
Azure OpenAI
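An Azure OpenAI variant; the Azure-specific keys shown here are assumptions and should be checked against your deployment settings.

```python
# Azure OpenAI-backed reranking LLM (endpoint and key names assumed).
llm_config = {
    "provider": "azure_openai",
    "config": {
        "model": "gpt-4",
        "azure_kwargs": {
            "api_key": "...",
            "azure_endpoint": "https://your-resource.openai.azure.com/",
            "api_version": "2024-02-01",
        },
    },
}
```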
Custom Prompts
Default Prompt Behavior
The default prompt asks the LLM to score each memory's relevance to the query on a 0-10 scale.
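A sketch of what such a prompt looks like; the wording and the `{query}`/`{memory}` placeholders are illustrative, not the library's exact text.

```python
# Illustrative default-style scoring prompt; the real template may differ.
DEFAULT_RERANK_PROMPT = """Given a query and a memory, rate how relevant the memory
is to the query on a scale of 0 to 10, where 0 means completely irrelevant and
10 means perfectly relevant.

Query: {query}
Memory: {memory}

Return only the numeric score."""
```

Custom Prompt Examples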
Domain-Specific Scoring
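For example, a prompt biased toward a technical-support domain, passed via the `custom_prompt` parameter (wording and placeholders are illustrative):

```python
# Domain-specific scoring prompt for a technical support assistant.
custom_prompt = """You are ranking memories for a technical support assistant.
Rate the memory 0-10 for how well it helps answer the user's question, giving
extra weight to error messages, stack traces, and version numbers.

Query: {query}
Memory: {memory}

Return only the numeric score."""
```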
Contextual Relevance
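A prompt that emphasizes surrounding context rather than keyword overlap (illustrative):

```python
# Contextual relevance: reward memories that change the answer, not just match words.
custom_prompt = """Rate how relevant the memory is to the query on a 0-10 scale,
considering not just keyword overlap but whether the memory provides context
(time, place, people involved) that would change the answer.

Query: {query}
Memory: {memory}

Score (0-10):"""
```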
Conversational Context
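A prompt tuned for multi-turn conversations (illustrative):

```python
# Conversational context: favor recent exchanges and stated preferences.
custom_prompt = """The user is in an ongoing conversation. Rate the memory 0-10
for how useful it is for continuing that conversation naturally, favoring recent
exchanges and the user's stated preferences.

Conversation query: {query}
Memory: {memory}

Score:"""
```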
Usage Examples
Basic Usage
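A minimal end-to-end sketch, assuming a mem0-style `Memory.from_config(...)` client and the configuration shape from Basic Setup (the reranker key names remain assumptions):

```python
from mem0 import Memory

# Reuses the Basic Setup configuration sketch.
config = {
    "reranker": {
        "provider": "llm",
        "config": {
            "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
            "top_k": 10,
        },
    },
}

memory = Memory.from_config(config)
memory.add("I prefer window seats on long flights", user_id="alice")

# Candidates are retrieved by vector search, then rescored by the LLM reranker.
results = memory.search("What are my travel preferences?", user_id="alice")
print(results)  # exact result shape depends on your installed version
```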
Batch Processing with Error Handling
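A sketch of running several queries in a batch with retries and a graceful failure path; the `memory` client is assumed from the previous example.

```python
import time

def search_batch(memory, queries, user_id, max_retries=3):
    """Run several reranked searches, retrying transient failures with backoff."""
    results = {}
    for query in queries:
        for attempt in range(max_retries):
            try:
                results[query] = memory.search(query, user_id=user_id)
                break
            except Exception as exc:  # e.g. rate limits or timeouts from the reranking LLM
                if attempt == max_retries - 1:
                    results[query] = {"results": [], "error": str(exc)}
                else:
                    time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    return results
```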
Performance Considerations
Speed vs Quality Trade-offs
| Model Type | Speed | Quality | Cost | Best For |
|---|---|---|---|---|
| GPT-3.5 Turbo | Fast | Good | Low | High-volume applications |
| GPT-4 | Medium | Excellent | Medium | Quality-critical applications |
| Claude 3 Sonnet | Medium | Excellent | Medium | Balanced performance |
| Ollama Local | Variable | Good | Free | Privacy-sensitive applications |
Optimization Strategies
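Two common levers are keeping `top_k` small and caching results for repeated queries. A minimal caching sketch, assuming the `memory` client from the Basic Usage example:

```python
from functools import lru_cache

# Keep top_k small in the reranker config so the LLM only scores a handful of
# candidates per query, and memoize repeated queries so identical searches
# skip both the vector search and the LLM call.
@lru_cache(maxsize=256)
def cached_search(query: str, user_id: str):
    return memory.search(query, user_id=user_id)
```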
Advanced Use Cases
Multi-Step Reasoning
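One option is a custom prompt that walks the LLM through explicit reasoning steps before it emits a score (illustrative wording):

```python
# Multi-step reasoning prompt: reason first, then output only the score.
custom_prompt = """Think through the following steps before scoring:
1. What is the query really asking for?
2. What information does the memory contain?
3. Does that information directly answer, partially answer, or not answer the query?

Query: {query}
Memory: {memory}

After reasoning through the steps above, output only the final relevance score (0-10)."""
```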
Comparative Ranking
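If your setup passes all candidates to the LLM at once, you can ask it to order the full list in a single call. This sketch assumes a `{memories}` placeholder for the candidate list, which may not exist in your version:

```python
# Comparative ranking prompt: one call orders the whole candidate list.
custom_prompt = """You will be given a query and a numbered list of memories.
Order the memories from most to least relevant to the query and return only
the numbers, comma-separated (for example: "3,1,2").

Query: {query}
Memories:
{memories}"""
```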
Emotional Intelligence
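A prompt that weights emotional context when scoring (illustrative):

```python
# Emotional-intelligence prompt: factor in feelings, preferences, relationships.
custom_prompt = """Rate the memory 0-10 for relevance to the query, giving extra
weight to the user's emotional state, preferences, and relationships whenever
they matter to the query.

Query: {query}
Memory: {memory}

Score:"""
```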
Error Handling and Fallbacks
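A sketch of a fallback wrapper that preserves the original vector-search ordering when the reranking call fails; the function name and the two-client setup are illustrative, not part of the library.

```python
def search_with_fallback(reranked_memory, plain_memory, query, user_id):
    """Prefer LLM-reranked results; fall back to vector-only ordering on API errors.

    `reranked_memory` and `plain_memory` are two Memory clients, one configured
    with the LLM reranker and one without (see the configuration sketches above).
    """
    try:
        return reranked_memory.search(query, user_id=user_id)
    except Exception as exc:
        # Typical failures: rate limits, timeouts, or malformed LLM output.
        print(f"Reranking failed ({exc}); returning vector-search order instead")
        return plain_memory.search(query, user_id=user_id)
```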
Best Practices
- Use Specific Prompts: Tailor prompts to your domain and use case
- Set Temperature to 0: Ensure consistent scoring across runs
- Limit Top-K: Don’t rerank too many candidates to control costs
- Implement Fallbacks: Always have a backup plan for API failures
- Monitor Costs: Track API usage, especially with expensive models
- Cache Results: Consider caching reranking results for repeated queries
- Test Prompts: Experiment with different prompts to find what works best
Troubleshooting
Common Issues
Inconsistent Scores
- Set temperature to 0.0
- Use more specific prompts
- Consider making multiple calls and averaging the scores

Rate Limits and API Errors
- Implement exponential backoff
- Use cheaper models for high-volume scenarios
- Add retry logic with delays

Poor Ranking Quality
- Refine your custom prompt
- Try different LLM models
- Add examples to your prompt