Usage
To use the Sentence Transformer reranker with Mem0, set it as the reranker provider in your configuration.

Configuration
Parameter | Description | Default |
---|---|---|
model | Sentence Transformer cross-encoder model | cross-encoder/ms-marco-MiniLM-L-6-v2 |
device | Device to run on (cpu, cuda, mps) | cpu |
top_n | Number of results to return | 10 |
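A hedged sketch of wiring these parameters into a Mem0 configuration — the exact top-level keys (`reranker`, `provider`, `config`) follow Mem0's usual provider-config pattern and may differ slightly across Mem0 versions:

```python
# Illustrative Mem0 reranker configuration using the parameters above.
config = {
    "reranker": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": "cpu",
            "top_n": 10,
        },
    }
}

# Requires mem0 to be installed:
# from mem0 import Memory
# memory = Memory.from_config(config)
# results = memory.search("favorite color?", user_id="alice")
```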
Popular Models
Lightweight Models
- cross-encoder/ms-marco-MiniLM-L-6-v2: Fast and efficient
- cross-encoder/ms-marco-MiniLM-L-4-v2: Even faster, slightly lower accuracy
- cross-encoder/ms-marco-MiniLM-L-2-v2: Fastest, good for real-time applications
High-Performance Models
- cross-encoder/ms-marco-electra-base: Better accuracy, larger model
- cross-encoder/ms-marco-MiniLM-L-12-v2: Balanced performance and speed
- cross-encoder/qnli-electra-base: Good for question-answering tasks
Device Configuration
CPU Usage
GPU Usage (CUDA)
Apple Silicon (MPS)
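The three setups above differ only in the device value. A hedged sketch, using a helper name that is illustrative rather than part of Mem0's API:

```python
# Build a reranker config for a given device; only "device" changes
# between CPU, CUDA, and Apple Silicon (MPS) setups.
def reranker_config(device: str) -> dict:
    return {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": device,
        },
    }

cpu_cfg = reranker_config("cpu")    # default; sufficient for MiniLM models
cuda_cfg = reranker_config("cuda")  # NVIDIA GPU
mps_cfg = reranker_config("mps")    # Apple Silicon
```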
Installation
The sentence-transformers library is required; install it with `pip install sentence-transformers`.

Performance Optimization
Model Selection
- Use MiniLM models for faster inference
- Use larger models (electra-base) for better accuracy
- Consider the trade-off between speed and quality
Device Optimization
- Use GPU (cuda or mps) for larger models
- CPU is sufficient for MiniLM models
- Batch processing improves GPU utilization
Memory Considerations
Custom Models
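For example, a fine-tuned checkpoint can be passed as the model value — the id below is a placeholder, not a real model:

```python
# Hedged sketch: any Hugging Face cross-encoder id or local path can be used.
# "your-org/your-finetuned-cross-encoder" is a placeholder, not a real model.
custom_reranker = {
    "provider": "sentence_transformer",
    "config": {
        "model": "your-org/your-finetuned-cross-encoder",  # or a local path
        "device": "cpu",
        "top_n": 5,
    },
}
```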
You can use any Sentence Transformer cross-encoder model by setting model to its Hugging Face id or a local path.

Advantages
- Local Processing: No external API calls required
- Privacy: Data stays on your infrastructure
- Cost Effective: No per-request charges
- Fast: Especially with GPU acceleration
- Customizable: Can fine-tune on your specific data