Overview

The graph store threshold parameter sets the minimum embedding similarity at which an incoming entity is matched to an existing node during graph data ingestion. Raise it to keep near-identical identifiers distinct, or lower it to let variations of the same entity merge, depending on your use case.

Configuration

Add the threshold parameter to your graph store configuration:
from mem0 import Memory

config = {
    "graph_store": {
        "provider": "neo4j",  # or memgraph, neptune, kuzu
        "config": {
            "url": "bolt://localhost:7687",
            "username": "neo4j",
            "password": "password"
        },
        "threshold": 0.7  # Default value, range: 0.0 to 1.0
    }
}

memory = Memory.from_config(config)

Parameters

| Parameter | Type | Default | Range | Description |
|-----------|------|---------|-------|-------------|
| threshold | float | 0.7 | 0.0 - 1.0 | Minimum embedding similarity score required to match existing nodes during graph ingestion |

Use Cases

Strict Matching (UUIDs, IDs)

Use higher thresholds (0.95-0.99) when working with identifiers that should remain distinct:
config = {
    "graph_store": {
        "provider": "neo4j",
        "config": {...},
        "threshold": 0.95  # Strict matching
    }
}
Example: Prevents near-identical identifiers such as MXxBUE18QVBQTElDQVRJT058MjM3MTM4NjI5 and MXxBUE18QVBQTElDQVRJT058MjA2OTYxMzM from being matched to the same node.
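
To see why identifiers like these are at risk, it helps to look at how much text they share. The snippet below is only an illustration using the Python standard library: character overlap is not the embedding similarity that the graph store computes, but strings with a long shared prefix may plausibly end up with close embeddings.

```python
import os

id_a = "MXxBUE18QVBQTElDQVRJT058MjM3MTM4NjI5"
id_b = "MXxBUE18QVBQTElDQVRJT058MjA2OTYxMzM"

# os.path.commonprefix compares character by character,
# so it works on arbitrary strings, not only file paths.
shared = os.path.commonprefix([id_a, id_b])

print(f"shared prefix: {shared}")
print(f"{len(shared)} of {len(id_a)} characters are identical at the start")
```

With most of each string identical, a high threshold leaves little room for the two identifiers to be treated as the same node.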

Permissive Matching (Natural Language)

Use lower thresholds (0.6-0.7) when entity variations should be merged:
config = {
    "graph_store": {
        # Provider and connection settings omitted for brevity
        "threshold": 0.6  # Permissive matching
    }
}
Example: Merges similar entities like “Bob” and “Robert” as the same person.

Threshold Guidelines

| Use Case | Recommended Threshold | Behavior |
|----------|-----------------------|----------|
| UUIDs, IDs, Keys | 0.95 - 0.99 | Prevent false matches between similar identifiers |
| Structured Data | 0.85 - 0.9 | Balanced precision and recall |
| General Purpose | 0.7 - 0.8 | Default recommendation |
| Natural Language | 0.6 - 0.7 | Allow entity variations to merge |
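
One way to apply these guidelines consistently is to keep the recommended values in a small lookup. The sketch below is an illustration only: `THRESHOLD_PRESETS` and `graph_store_config` are hypothetical names, not part of mem0, and each preset simply takes the lower bound of the recommended range.

```python
# Hypothetical presets: lower bound of each recommended range in the guidelines.
THRESHOLD_PRESETS = {
    "identifiers": 0.95,       # UUIDs, IDs, keys
    "structured": 0.85,        # structured data
    "general": 0.7,            # general purpose (the default)
    "natural_language": 0.6,   # natural language
}


def graph_store_config(use_case, provider, connection):
    """Build a config dict whose graph_store threshold is chosen by use case."""
    if use_case not in THRESHOLD_PRESETS:
        raise KeyError(f"unknown use case: {use_case!r}")
    return {
        "graph_store": {
            "provider": provider,
            "config": connection,
            "threshold": THRESHOLD_PRESETS[use_case],
        }
    }


config = graph_store_config(
    "identifiers",
    "neo4j",
    {"url": "bolt://localhost:7687", "username": "neo4j", "password": "password"},
)
print(config["graph_store"]["threshold"])
```

The resulting dictionary has the same shape as the configurations shown earlier and can be passed to `Memory.from_config`.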

Examples

Example 1: Preventing Data Loss with UUIDs

from mem0 import Memory

config = {
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "bolt://localhost:7687",
            "username": "neo4j",
            "password": "password"
        },
        "threshold": 0.98  # Very strict for UUIDs
    }
}

memory = Memory.from_config(config)

# These UUIDs create separate nodes instead of being incorrectly merged
memory.add(
    [{"role": "user", "content": "MXxBUE18QVBQTElDQVRJT058MjM3MTM4NjI5 relates to Project A"}],
    user_id="user1"
)

memory.add(
    [{"role": "user", "content": "MXxBUE18QVBQTElDQVRJT058MjA2OTYxMzM relates to Project B"}],
    user_id="user1"
)

Example 2: Merging Entity Variations

config = {
    "graph_store": {
        "provider": "neo4j",
        "config": {...},
        "threshold": 0.6  # More permissive
    }
}

memory = Memory.from_config(config)

# These will be merged as the same entity
memory.add([{"role": "user", "content": "Bob works at Google"}], user_id="user1")
memory.add([{"role": "user", "content": "Robert works at Google"}], user_id="user1")

Example 3: Different Thresholds for Different Clients

# Provider and connection settings are omitted for brevity in both configs

# Client 1: Strict matching for transactional data
memory_strict = Memory.from_config({
    "graph_store": {"threshold": 0.95}
})

# Client 2: Permissive matching for conversational data
memory_permissive = Memory.from_config({
    "graph_store": {"threshold": 0.6}
})

Supported Graph Providers

The threshold parameter works with all graph store providers:
  • ✅ Neo4j
  • ✅ Memgraph
  • ✅ Kuzu
  • ✅ Neptune (both Analytics and DB)

How It Works

When adding a relation to the graph:
  1. Embedding Generation: The system generates embeddings for source and destination entities
  2. Node Search: Searches for existing nodes with similar embeddings
  3. Threshold Comparison: Compares similarity scores against the configured threshold
  4. Decision:
    • If similarity ≥ threshold: Uses the existing node
    • If similarity < threshold: Creates a new node
# Pseudocode
if node_similarity >= threshold:
    use_existing_node()
else:
    create_new_node()
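
The decision step above can be sketched in plain Python. This is an illustration under assumptions, not mem0's implementation: it assumes cosine similarity as the score and scans an in-memory dictionary, whereas the real lookup runs against the graph database. `cosine_similarity` and `match_or_create` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_or_create(new_embedding, existing_nodes, threshold=0.7):
    """Return ("existing", name) for the best match at or above threshold, else ("new", None)."""
    best_name, best_score = None, -1.0
    for name, embedding in existing_nodes.items():
        score = cosine_similarity(new_embedding, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return ("existing", best_name)
    return ("new", None)


# Toy two-dimensional embeddings; real embeddings have many more dimensions.
nodes = {"bob": [1.0, 0.0], "google": [0.0, 1.0]}
candidate = [0.8, 0.6]  # closer to "bob" than to "google"

print(match_or_create(candidate, nodes, threshold=0.7))
print(match_or_create(candidate, nodes, threshold=0.95))
```

The same candidate reuses an existing node under the default threshold and becomes a new node under a strict one, which is the trade-off the threshold controls.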

Troubleshooting

Issue: Duplicate nodes being created

Symptom: Expected nodes to merge but they’re created separately
Solution: Lower the threshold
config = {"graph_store": {"threshold": 0.6}}

Issue: Unrelated entities being merged

Symptom: Different entities incorrectly matched as the same node
Solution: Raise the threshold
config = {"graph_store": {"threshold": 0.95}}

Issue: Validation error

Symptom: ValidationError: threshold must be between 0.0 and 1.0
Solution: Ensure the threshold is within the valid range (0.0 to 1.0)
config = {"graph_store": {"threshold": 0.7}}  # Valid: 0.0 ≤ x ≤ 1.0
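
If the threshold comes from user input or an environment variable, a range check before building the config can surface the problem earlier. The helper below is a hypothetical sketch, not a mem0 function; it only mirrors the documented 0.0 to 1.0 range.

```python
def check_threshold(value):
    """Hypothetical pre-flight check mirroring the documented 0.0 to 1.0 range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"threshold must be a number, got {type(value).__name__}")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold must be between 0.0 and 1.0, got {value}")
    return float(value)


config = {"graph_store": {"threshold": check_threshold(0.7)}}
```

A value such as 1.5 raises `ValueError` here, before the configuration ever reaches the library.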

Backward Compatibility

  • Default Value: 0.7 (maintains existing behavior)
  • Optional Parameter: Existing code works without any changes
  • No Breaking Changes: If threshold is not specified, the default of 0.7 is used