
What is Advanced Retrieval?

Advanced Retrieval gives you precise control over how memories are found and ranked. While basic search uses semantic similarity, these advanced options help you find exactly what you need, when you need it.

Search Enhancement Options

Reranking

Reranking reorders the initial search results with a deeper semantic relevance pass, so the most relevant memories come first. Use it when:
  • You need the most relevant result at the top
  • Result order is critical for your application
  • You want consistent quality across different queries
  • You are building user-facing features where accuracy matters

Real-World Use Cases

Python
# Smart home assistant finding device preferences
results = client.search(
    query="How do I like my bedroom temperature?",
    rerank=True,           # Put the most relevant preferences first
    user_id="user123"
)

# Finds: "Keep bedroom at 68°F", "Too cold last night at 65°F", etc.

Choosing the Right Configuration

# Basic search - good for exploration
def quick_search(query, user_id):
    return client.search(
        query=query,
        user_id=user_id
    )

# Reranked search - good for most applications
def standard_search(query, user_id):
    return client.search(
        query=query,
        rerank=True,
        user_id=user_id
    )

# Precise search - same reranking settings as standard_search, kept as a separate entry point for critical applications
def precise_search(query, user_id):
    return client.search(
        query=query,
        rerank=True,
        user_id=user_id
    )

Best Practices

Do

  • Start simple with basic search and measure impact before enabling reranking
  • Use reranking when the top result quality matters most
  • Monitor latency and adjust based on your application’s needs
  • Handle empty results gracefully

Don’t

  • Enable reranking by default without measuring necessity
  • Ignore latency impact in real-time applications
  • Use advanced retrieval for simple, fast lookup scenarios
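Handling empty results gracefully can be sketched as follows. This is a minimal illustration, not the library's own helper: `extract_memories` and `search_or_default` are hypothetical names, and the response shape (either a plain list or a dict with a `"results"` key, with each memory's text under `"memory"`) is an assumption you should check against what your client actually returns.

```python
from typing import Any


def extract_memories(response: Any) -> list:
    """Normalize a search response to a plain list.

    Assumption: the response is None, a list, or a dict with a "results" key.
    """
    if response is None:
        return []
    if isinstance(response, dict):
        return list(response.get("results") or [])
    return list(response)


def search_or_default(client: Any, query: str, user_id: str,
                      default: str = "No saved preference found.") -> str:
    """Return the text of the top reranked memory, or a default if none exist."""
    response = client.search(query=query, user_id=user_id, rerank=True)
    memories = extract_memories(response)
    if not memories:
        # Empty result: return a safe default instead of raising or indexing [0].
        return default
    top = memories[0]
    # Assumption: each memory is a dict with its text under the "memory" key.
    if isinstance(top, dict):
        return top.get("memory", default)
    return str(top)
```

Callers then always receive a string, so user-facing code does not need its own empty-result branch.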

Performance Guidelines

Latency Expectations

Python
# Performance monitoring example
import time

# perf_counter() is a monotonic, high-resolution clock suited to timing code
start_time = time.perf_counter()
results = client.search(
    query="user preferences",
    rerank=True,         # adds roughly 150 ms
    user_id="user123"
)
latency = time.perf_counter() - start_time
print(f"Search completed in {latency:.2f}s")
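To measure what reranking actually costs in your own environment, one option is to time the same query with and without it over several runs. The sketch below assumes nothing about the client beyond a callable; `median_latency` is a hypothetical helper, and the median is used so that one slow request does not skew the result.

```python
import statistics
import time
from typing import Callable


def median_latency(search_fn: Callable[[], object], runs: int = 5) -> float:
    """Call search_fn `runs` times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        search_fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Example usage (requires a configured `client`):
# baseline = median_latency(
#     lambda: client.search(query="user preferences", user_id="user123"))
# reranked = median_latency(
#     lambda: client.search(query="user preferences", rerank=True, user_id="user123"))
# print(f"Reranking overhead: {(reranked - baseline) * 1000:.0f} ms")
```

Comparing the two medians gives a measured overhead for your workload rather than a fixed estimate.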

Optimization Tips

  1. Cache frequent queries to avoid repeated advanced processing
  2. Use session-specific search with run_id to reduce search space
  3. Implement fallback logic when search returns empty results
  4. Monitor and alert on search latency patterns
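Tips 1 and 2 can be combined in a small wrapper, sketched here under stated assumptions. `CachedSearch` is a hypothetical class, not part of the library; passing `run_id` as a keyword argument to `client.search` is inferred from tip 2 and should be verified against your client version. The cache is in-memory only and does not invalidate entries when new memories are added, so keep the TTL short.

```python
import time
from typing import Any, Optional


class CachedSearch:
    """Small in-memory TTL cache in front of client.search (illustrative only)."""

    def __init__(self, client: Any, ttl_seconds: float = 60.0):
        self.client = client
        self.ttl_seconds = ttl_seconds
        self._cache: dict = {}  # key -> (timestamp, result)

    def search(self, query: str, user_id: str,
               run_id: Optional[str] = None, rerank: bool = False) -> Any:
        key = (query, user_id, run_id, rerank)
        now = time.monotonic()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]  # fresh cached result, no call to the backend

        kwargs = {"query": query, "user_id": user_id}
        if rerank:
            kwargs["rerank"] = True
        if run_id is not None:
            # Assumption: run_id narrows the search to a single session.
            kwargs["run_id"] = run_id

        result = self.client.search(**kwargs)
        self._cache[key] = (now, result)
        return result
```

Because `run_id` and `rerank` are part of the cache key, results from one session or configuration are never served for another.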
