You’ll use this when…
- Queries are nuanced and require semantic understanding beyond vector distance.
- Large memory collections produce too many near matches to review manually.
- You want consistent scoring across providers by delegating ranking to a dedicated model.
All configuration snippets translate directly to the TypeScript SDK—swap dictionaries for objects while keeping the same keys (`provider`, `config`, rerank flags).

Feature anatomy
- Initial vector search: Retrieve candidate memories by similarity.
- Reranker pass: A specialized model scores each candidate against the original query.
- Reordered results: Mem0 sorts responses using the reranker’s scores before returning them.
- Optional fallbacks: Toggle reranking per request or disable it entirely if performance or cost becomes a concern.
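Conceptually, the flow above reduces to "score, then re-sort." A minimal sketch in plain Python (the candidate memories and scores here are invented for illustration; Mem0 performs these steps internally):

```python
# Vector search returns candidates ordered by embedding similarity.
candidates = [
    {"memory": "Prefers window seats on long flights", "score": 0.81},
    {"memory": "Booked a flight to Tokyo last spring", "score": 0.78},
    {"memory": "Is allergic to peanuts", "score": 0.74},
]

# The reranker scores each candidate against the original query
# (say, "What flights has the user taken?"); values here are made up.
reranker_scores = [0.35, 0.92, 0.10]
for candidate, rerank_score in zip(candidates, reranker_scores):
    candidate["score"] = rerank_score

# Results are re-sorted by the reranker's scores before being returned.
reranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
print(reranked[0]["memory"])  # → Booked a flight to Tokyo last spring
```

Note how the top result changes: the memory that best answers the question outranks the one that merely sat closest in vector space.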
Supported providers
- Cohere – Multilingual hosted reranker with API-based scoring.
- Sentence Transformer – Local Hugging Face cross-encoders for GPU or CPU.
- Hugging Face – Bring any hosted or on-prem reranker model ID.
- LLM Reranker – Use your preferred LLM (OpenAI, etc.) for prompt-driven scoring.
- Zero Entropy – High-quality neural reranking tuned for retrieval tasks.
Provider comparison
| Provider | Latency | Quality | Cost | Local deploy |
|---|---|---|---|---|
| Cohere | Medium | High | API cost | ❌ |
| Sentence Transformer | Low | Good | Free | ✅ |
| Hugging Face | Low–Medium | Variable | Free | ✅ |
| LLM Reranker | High | Very high | API cost | Depends |
Configure it
Basic setup
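A minimal configuration sketch, assuming the `reranker` block follows Mem0's standard `provider`/`config` pattern; the Cohere model name and key placeholder are illustrative:

```python
# The "reranker" block sits alongside your other Mem0 config sections.
config = {
    "reranker": {
        "provider": "cohere",
        "config": {
            "model": "rerank-english-v3.0",   # example model name
            "api_key": "your-cohere-api-key",
        },
    },
}

def reranked_search(query: str, user_id: str):
    """Search with results reordered by the configured reranker."""
    from mem0 import Memory  # requires: pip install mem0ai
    m = Memory.from_config(config)
    return m.search(query, user_id=user_id)
```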
Confirm `results["results"][0]["score"]` reflects the reranker output—if the field is missing, the reranker was not applied.

Provider-specific options
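Hedged examples of per-provider blocks; the provider identifier strings and model names below are assumptions, so check the reranker configuration reference for the exact values your SDK version expects:

```python
import os

# Cohere: hosted, multilingual, API-key based.
cohere_reranker = {
    "provider": "cohere",
    "config": {
        "model": "rerank-english-v3.0",
        "api_key": os.environ.get("COHERE_API_KEY"),
    },
}

# Sentence Transformer: local cross-encoder, no API key, CPU or GPU.
sentence_transformer_reranker = {
    "provider": "sentence_transformer",
    "config": {"model": "cross-encoder/ms-marco-MiniLM-L-6-v2"},
}

# Hugging Face: bring any hosted or on-prem reranker model ID.
huggingface_reranker = {
    "provider": "huggingface",
    "config": {"model": "BAAI/bge-reranker-base"},
}

# LLM reranker: prompt-driven scoring with your preferred LLM.
llm_reranker = {
    "provider": "llm_reranker",
    "config": {
        "model": "gpt-4o-mini",
        "api_key": os.environ.get("OPENAI_API_KEY"),
    },
}
```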
Keep authentication keys in environment variables when you plug these configs into production projects.
Full stack example
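A sketch of an end-to-end setup. The vector store and reranker choices here are assumptions; substitute whichever components you already run:

```python
config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {"host": "localhost", "port": 6333},
    },
    "reranker": {
        "provider": "sentence_transformer",
        "config": {"model": "cross-encoder/ms-marco-MiniLM-L-6-v2"},
    },
}

def demo(user_id: str = "alice"):
    from mem0 import Memory  # requires: pip install mem0ai
    m = Memory.from_config(config)
    m.add("I prefer aisle seats and vegetarian meals", user_id=user_id)
    results = m.search("What are my travel preferences?", user_id=user_id)
    for hit in results["results"]:
        # Each hit carries the reranker-adjusted score.
        print(hit.get("score"), hit["memory"])
```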
A quick search should now return results with both vector and reranker scores, letting you compare improvements immediately.
Async support
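A sketch assuming the SDK exposes an `AsyncMemory` class that mirrors the synchronous `Memory` surface with awaitable methods; verify the exact class and method names against your SDK version:

```python
import asyncio

config = {
    "reranker": {
        "provider": "cohere",
        "config": {"model": "rerank-english-v3.0", "api_key": "your-key"},
    },
}

async def reranked_search_async(query: str, user_id: str):
    # Assumption: AsyncMemory offers the same from_config/search
    # surface as the synchronous client, with awaitable methods.
    from mem0 import AsyncMemory  # requires: pip install mem0ai
    m = AsyncMemory.from_config(config)
    return await m.search(query, user_id=user_id)

# asyncio.run(reranked_search_async("What did I plan?", "alice"))
```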
Inspect the async response to confirm reranking still applies; the scores should match the synchronous implementation.
Tune performance and cost
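The main levers are the per-request rerank flag and `top_k`, both mentioned elsewhere in this guide: cap how many candidates reach the reranker, and skip reranking entirely on latency-sensitive paths. A sketch (parameter names assume the search signature described in this doc):

```python
def tuned_search(m, query: str, user_id: str, latency_sensitive: bool = False):
    """Cap reranker cost with top_k; skip reranking on hot paths."""
    if latency_sensitive:
        # Vector-only ordering: no reranker round-trip at all.
        return m.search(query, user_id=user_id, rerank=False)
    # top_k bounds how many candidates the (possibly paid) reranker scores.
    return m.search(query, user_id=user_id, rerank=True, top_k=5)
```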
Handle failures gracefully
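One way to keep a fallback, as the best practices below recommend: if the reranked call raises, retry with vector-only ordering (using the per-request `rerank` toggle this doc describes):

```python
import logging

logger = logging.getLogger(__name__)

def safe_search(m, query: str, **kwargs):
    """Search with reranking; fall back to vector-only ordering on failure."""
    try:
        return m.search(query, rerank=True, **kwargs)
    except Exception as exc:  # reranker outage, quota hit, or timeout
        logger.warning("Reranker failed (%s); using vector order", exc)
        return m.search(query, rerank=False, **kwargs)
```

Results still come back in a sensible vector-similarity order, so the user sees slightly weaker ranking instead of an error.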
Migrate from v0.x
See it in action
Basic reranked search
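Assuming `m` is a `Memory` instance configured with a reranker, a basic reranked search looks like this sketch:

```python
def show_reranked(m, query: str, user_id: str):
    results = m.search(query, user_id=user_id, rerank=True)
    for hit in results["results"]:
        # score now reflects the reranker's judgment, not raw vector distance
        print(f'{hit.get("score", 0):.3f}  {hit["memory"]}')
    return results
```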
Expect each result to list the reranker-adjusted score so you can compare ordering against baseline vector results.
Toggle reranking per request
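A sketch for fetching both orderings side by side, using the per-request `rerank` flag:

```python
def compare_orderings(m, query: str, user_id: str):
    """Fetch the same memories with and without the reranker pass."""
    baseline = m.search(query, user_id=user_id, rerank=False)
    reranked = m.search(query, user_id=user_id, rerank=True)
    return baseline, reranked
```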
You should see the same memories in both lists, but the reranked response will reorder them based on semantic relevance.
Combine with metadata filters
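Filters and reranking compose: the filter prunes candidates during retrieval, then the reranker reorders only what survived. A sketch (the filter keys are hypothetical):

```python
travel_filters = {"category": "travel", "year": 2024}  # hypothetical metadata

def filtered_reranked_search(m, query: str, user_id: str):
    # Filters apply at candidate retrieval; the reranker then reorders
    # only memories that already matched every clause.
    return m.search(query, user_id=user_id, filters=travel_filters, rerank=True)
```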
Verify filtered reranked searches still respect every metadata clause—reranking only reorders candidates, it never bypasses filters.
Real-world playbooks
Customer support
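For example, surfacing past tickets relevant to a login issue (the collection ID and filter keys below are hypothetical):

```python
def find_related_tickets(m, issue_description: str):
    # Search the support knowledge base for semantically similar tickets.
    return m.search(
        issue_description,
        user_id="support_kb",                    # hypothetical collection ID
        filters={"category": "authentication"},  # hypothetical metadata
        rerank=True,
    )

# find_related_tickets(m, "customer cannot log in after password reset")
```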
Top results should highlight tickets matching the login issue context so agents can respond faster.
Content recommendation
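A recommendation sketch along the same lines (the `type` filter is a hypothetical metadata limit):

```python
def recommend_content(m, theme: str, user_id: str):
    # Reranking pushes items that genuinely match the theme above
    # near-duplicate vector matches.
    return m.search(
        f"content about {theme}",
        user_id=user_id,
        filters={"type": "article"},  # hypothetical metadata limit
        rerank=True,
    )
```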
Expect high-scoring recommendations that match both the requested theme and any metadata limits you applied.
Personal assistant
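An assistant recall sketch, identical in shape to the other playbooks:

```python
def assistant_recall(m, question: str, user_id: str):
    # Same m.search(...) signature as the other playbooks; only the
    # prompt and filters change.
    return m.search(question, user_id=user_id, rerank=True)

# assistant_recall(m, "When is my dentist appointment?", "alice")
```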
Each workflow keeps the same `m.search(...)` signature, so you can template these queries across agents with only the prompt and filters changing.

Verify the feature is working
- Inspect result payloads for both `score` (vector) and reranker scores; mismatched fields indicate the reranker didn’t execute.
- Track latency before and after enabling reranking to ensure SLAs hold.
- Review provider logs or dashboards for throttling or quota warnings.
- Run A/B comparisons (rerank on/off) to validate improved relevance before defaulting to reranked responses.
Best practices
- Start local: Try Sentence Transformer models to prove value before paying for hosted APIs.
- Monitor latency: Add metrics around reranker duration so you notice regressions quickly.
- Control spend: Use `top_k` and selective toggles to cap hosted reranker costs.
- Keep a fallback: Always catch reranker failures and continue with vector-only ordering.
- Experiment often: Swap providers or models to find the best fit for your domain and language mix.
Configure Rerankers
Review provider fields, defaults, and environment variables before going live.
Build a Custom LLM Reranker
Extend scoring with prompt-tuned LLM rerankers for niche workflows.