Reranking trades extra latency for better precision. Enable it once baseline search is working, and measure relevance before and after so you can see the uplift.
Understand Reranking
Configure Providers
Optimize Performance
Custom Prompts
Zero Entropy Guide
Sentence Transformers
Picking the Right Reranker
- API-first when you need top quality and can absorb request costs (Cohere, Zero Entropy).
- Self-hosted for privacy-sensitive deployments that must stay on your hardware (Sentence Transformers, Hugging Face).
- LLM-driven when you need bespoke scoring logic or complex prompts.
- Hybrid when you need to control spend: enable reranking only on premium journeys.
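The hybrid option can be sketched as a small gate in front of the reranker. This is a minimal illustration, not any provider's API: the `search` function, the `tier` argument, the `premium_tiers` set, and the toy `overlap_rerank` scorer are all hypothetical names introduced here. In practice the scorer would be replaced by a call to whichever provider you selected.

```python
from typing import Callable, FrozenSet, List

# Hypothetical reranker signature: (query, documents) -> documents, best first.
Reranker = Callable[[str, List[str]], List[str]]


def overlap_rerank(query: str, docs: List[str]) -> List[str]:
    """Toy stand-in for a real reranker: order by shared-word count."""
    q_terms = set(query.lower().split())
    # sorted() is stable, so ties keep their first-stage order.
    return sorted(docs, key=lambda d: -len(q_terms & set(d.lower().split())))


def search(
    query: str,
    candidates: List[str],
    tier: str,
    reranker: Reranker,
    premium_tiers: FrozenSet[str] = frozenset({"premium"}),
    top_k: int = 2,
) -> List[str]:
    """Rerank only for premium tiers; otherwise keep first-stage order."""
    if tier in premium_tiers:
        candidates = reranker(query, candidates)
    return candidates[:top_k]
```

Keeping the gate outside the reranker means spend is controlled in one place, and the free path pays no extra latency at all.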
Implementation Checklist
- Confirm baseline search KPIs so you can measure uplift.
- Select a provider and add the `reranker` block to your config.
- Test latency impact with production-like query batches.
- Decide whether to enable reranking globally or per-search via the `rerank` flag.
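One way to cover the KPI and latency items in this checklist is a small harness that runs the same labeled query batch through search with and without reranking. The sketch below assumes you can wrap each variant as a function that takes a query and returns ranked document IDs. `SearchFn`, `precision_at_k`, and `evaluate` are hypothetical helpers written for this example, and precision@k is only one possible relevance KPI.

```python
import statistics
import time
from typing import Callable, Dict, List, Set

# Hypothetical search signature: query -> ranked document IDs, best first.
SearchFn = Callable[[str], List[str]]


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


def evaluate(
    search_fn: SearchFn, labeled: Dict[str, Set[str]], k: int = 3
) -> Dict[str, float]:
    """Run each labeled query once; report mean precision@k and latency."""
    precisions: List[float] = []
    latencies_ms: List[float] = []
    for query, relevant in labeled.items():
        start = time.perf_counter()
        ranked = search_fn(query)
        latencies_ms.append((time.perf_counter() - start) * 1000)
        precisions.append(precision_at_k(ranked, relevant, k))
    return {
        "mean_precision_at_k": statistics.mean(precisions),
        "median_latency_ms": statistics.median(latencies_ms),
        "max_latency_ms": max(latencies_ms),
    }
```

Calling `evaluate` once with the baseline search function and once with the reranked one, on the same labeled batch, gives a like-for-like comparison of relevance gained against latency added. The batch must contain at least one query, and a production-like batch should be large enough that the median and maximum latencies are meaningful.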