Reranking trades extra latency for better precision. Enable it once baseline search is working, and measure relevance before and after so you can confirm the gain.
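One simple way to measure relevance before and after is precision@k over a set of judged queries. The sketch below is illustrative only: the function name, document IDs, and relevance labels are hypothetical, and your own evaluation set and metric may differ.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Return the fraction of the top-k results that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Hypothetical judged query: these IDs and labels are made up for illustration.
relevant = {"d1", "d4", "d7"}
baseline = ["d2", "d1", "d3", "d4", "d5"]   # order returned by baseline search
reranked = ["d1", "d4", "d2", "d7", "d3"]   # order after reranking

baseline_p3 = precision_at_k(baseline, relevant, 3)
reranked_p3 = precision_at_k(reranked, relevant, 3)
uplift = reranked_p3 - baseline_p3
print(f"precision@3 baseline={baseline_p3:.2f} reranked={reranked_p3:.2f} uplift={uplift:.2f}")
```

Averaging this value over many judged queries gives a steadier signal than any single query.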
Supported Rerankers
- Cohere
- Sentence Transformers
- Hugging Face
- LLM Reranker
- Zero Entropy
Reranking Workflow
- Understand Reranking
- Configure Providers
- Optimize Performance
- Custom Prompts
- Zero Entropy Guide
- Sentence Transformers
Picking the Right Reranker
- API-first when you need top quality and can absorb request costs (Cohere, Zero Entropy).
- Self-hosted for privacy-sensitive deployments that must stay on your hardware (Sentence Transformers, Hugging Face).
- LLM-driven when you need bespoke scoring logic or complex prompts.
- Hybrid when you need to control spend: enable reranking only on premium journeys.
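The hybrid option can be sketched as a small gate that decides, per request, whether reranking is worth its cost. This is a minimal illustration, not a library API: the journey names, the `should_rerank` helper, and the `min_candidates` threshold are all assumptions.

```python
# Hypothetical journey names; substitute whatever identifies premium traffic for you.
PREMIUM_JOURNEYS = {"checkout_help", "enterprise_search"}


def should_rerank(journey: str, candidate_count: int, min_candidates: int = 2) -> bool:
    """Rerank only premium journeys that have enough candidates to reorder."""
    return journey in PREMIUM_JOURNEYS and candidate_count >= min_candidates


print(should_rerank("checkout_help", 10))  # premium journey with enough candidates
print(should_rerank("browse", 10))         # non-premium journey, reranking skipped
```

Keeping the decision in one function makes it easy to adjust which journeys pay for reranking without touching the search code.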
Implementation Checklist
- Confirm baseline search KPIs so you can measure uplift.
- Select a provider and add the `reranker` block to your config.
- Test latency impact with production-like query batches.
- Decide whether to enable reranking globally or per-search via the `rerank` flag.
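The checklist can be sketched roughly as follows. Only the `reranker` block and the `rerank` flag come from the text above; the inner `provider` key, the `fake_search` stand-in, and the `measure_latency_ms` helper are assumptions for illustration, so check your provider's documentation for the real config fields and search call.

```python
import statistics
import time

# Assumed shape: the text names a `reranker` block, but the inner key is hypothetical.
config = {
    "reranker": {
        "provider": "cohere",
    }
}


def fake_search(query, rerank=False):
    """Stand-in for a real search call that accepts a per-search `rerank` flag."""
    docs = ["alpha doc", "beta doc", "gamma doc"]
    if rerank:
        # Toy reranker: move documents containing the query text to the front.
        docs = sorted(docs, key=lambda d: query in d, reverse=True)
    return docs


def measure_latency_ms(search_fn, queries, rerank):
    """Time each query and return the median and maximum latency in milliseconds."""
    timings = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query, rerank=rerank)
        timings.append((time.perf_counter() - start) * 1000.0)
    return {"p50": statistics.median(timings), "max": max(timings)}


queries = ["beta", "gamma", "alpha"]  # replace with a production-like batch
without = measure_latency_ms(fake_search, queries, rerank=False)
with_rerank = measure_latency_ms(fake_search, queries, rerank=True)
print("baseline:", without)
print("reranked:", with_rerank)
```

Running the same batch with the flag off and on gives a direct view of the latency cost you are paying for the precision gain.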