Memory OS

Relevance Scoring

Memory OS uses a multi-factor scoring algorithm to rank memories during retrieval. This ensures that the most relevant, recent, and important memories are surfaced first, mimicking how human memory prioritizes information.

Overview

The combined relevance score is calculated from six weighted factors:

Factor                 Weight   Description
Semantic Similarity    40%      How closely the memory matches the query meaning
Recency                20%      How recently the memory was created or accessed
Importance             15%      Explicit importance score assigned to the memory
Access Frequency       10%      How often the memory has been retrieved
User Feedback          10%      Signals from useful/not useful ratings
Entity Co-occurrence   5%       Shared entities between query and memory

The Scoring Formula

TEXT
combined_score = (0.40 * similarity) +
                 (0.20 * recency_score) +
                 (0.15 * importance_score) +
                 (0.10 * access_score) +
                 (0.10 * feedback_score) +
                 (0.05 * entity_score)

All component scores are normalized to a 0-1 range, resulting in a final score between 0 and 1.
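
To make the weighting concrete, here is a minimal sketch of the combination as a function. It is illustrative only (a hypothetical helper, not part of the SDK), assuming each component score is already normalized to 0-1.

JavaScript
// Hypothetical helper illustrating the weighted combination; not part of the SDK.
// Each component score is assumed to already be normalized to the 0-1 range.
function combinedScore({ similarity, recency, importance, access, feedback, entity }) {
  return (
    0.40 * similarity +
    0.20 * recency +
    0.15 * importance +
    0.10 * access +
    0.10 * feedback +
    0.05 * entity
  );
}

// A semantically strong, recently accessed memory with default importance
combinedScore({
  similarity: 0.89,
  recency: 0.75,
  importance: 0.5,
  access: 0.3,
  feedback: 0.5,
  entity: 0
});
// => ~0.66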

Factor Details

1. Semantic Similarity (40%)

The largest factor is semantic similarity, calculated using vector embeddings. When you search, your query is converted to an embedding and compared against stored memory embeddings using cosine similarity.

How it works:

  • Query text is embedded using the same model as stored memories
  • Cosine similarity is computed between query and memory embeddings
  • Higher similarity means the memory content is more semantically related

Example:

TEXT
Query: "What programming languages does the user prefer?"
Memory: "User works primarily with Python and TypeScript"
Similarity: 0.87 (high - related to programming languages)

Memory: "User prefers dark mode in their IDE"
Similarity: 0.45 (low - related to user preferences but not languages)
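
For reference, cosine similarity can be computed as in the minimal sketch below. The helper is illustrative; Memory OS handles embedding and comparison internally, and real embeddings have hundreds of dimensions.

JavaScript
// Illustrative cosine similarity between two embedding vectors (hypothetical helper).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional vectors for illustration
cosineSimilarity([0.2, 0.8, 0.1], [0.25, 0.7, 0.3]); // => ~0.96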

2. Recency (20%)

Recent memories are often more relevant. The recency score decays exponentially based on the memory's age and tier.

Decay rates by tier:

  • Short-term: Half-life of 6 hours
  • Medium-term: Half-life of 7 days
  • Long-term: Half-life of 90 days

Formula:

TEXT
recency_score = exp(-decay_rate * hours_since_access)

A memory accessed 1 hour ago scores higher than one accessed a week ago, assuming both are in the same tier.
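
A minimal sketch of this calculation, assuming each tier's decay rate is derived from its half-life (decay_rate = ln(2) / half-life in hours); the helper is hypothetical, not part of the SDK:

JavaScript
// Hypothetical helper illustrating tier-based recency decay.
// The decay rate is derived from the half-life so the score halves once per half-life.
const HALF_LIFE_HOURS = { short: 6, medium: 7 * 24, long: 90 * 24 };

function recencyScore(tier, hoursSinceAccess) {
  const decayRate = Math.log(2) / HALF_LIFE_HOURS[tier];
  return Math.exp(-decayRate * hoursSinceAccess);
}

recencyScore("short", 1);   // => ~0.89 (accessed 1 hour ago)
recencyScore("short", 168); // => ~0.00 (accessed 1 week ago; short-term has fully decayed)
recencyScore("long", 168);  // => ~0.95 (accessed 1 week ago; long-term barely decays)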

3. Importance Score (15%)

Each memory has an explicit importance score (0-1) that you can set during creation or update. This lets you manually boost critical memories.

Default values:

  • New memories start at 0.5
  • Promote to 0.7-0.9 for important information
  • Demote to 0.2-0.4 for nice-to-have context
JavaScript
// Create a high-importance memory
await client.memories.create({
  content: "User is the CEO and needs executive-level responses",
  tier: "long",
  content_type: "fact",
  importance_score: 0.95,
  metadata: { category: "user-profile" }
});

// Update importance based on feedback
await client.memories.update("memory-id", {
  importance_score: 0.8  // Boost after positive feedback
});
Python
# Create a high-importance memory
client.memories.create(
    content="User is the CEO and needs executive-level responses",
    tier="long",
    content_type="fact",
    importance_score=0.95,
    metadata={"category": "user-profile"}
)

# Update importance based on feedback
client.memories.update("memory-id",
    importance_score=0.8  # Boost after positive feedback
)
Bash
# Create a high-importance memory
curl -X POST https://api.mymemoryos.com/v1/memories \
  -H "Authorization: Bearer $MEMORY_OS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User is the CEO and needs executive-level responses",
    "tier": "long",
    "content_type": "fact",
    "importance_score": 0.95,
    "metadata": {"category": "user-profile"}
  }'

4. Access Frequency (10%)

Memories that are frequently retrieved are likely more relevant. Access frequency is tracked automatically and normalized.

How it works:

  • Each retrieval increments access_count
  • Score is calculated as min(1, access_count / 20)
  • The score reaches its maximum of 1.0 at 20 accesses, preventing runaway scores

This creates a self-reinforcing loop where useful memories become easier to find.
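
A minimal sketch of this normalization (a hypothetical helper, shown for illustration):

JavaScript
// Hypothetical helper illustrating access-frequency normalization.
// The score grows linearly with access_count and caps at 1.0 after 20 accesses.
function accessScore(accessCount) {
  return Math.min(1, accessCount / 20);
}

accessScore(3);  // => 0.15
accessScore(20); // => 1.0
accessScore(50); // => 1.0 (capped)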

5. User Feedback (10%)

Explicit feedback from users or your application adjusts memory relevance.

Feedback types:

  • useful: Boosts relevance score
  • not_useful: Decreases relevance score
  • outdated: Marks for review/decay
  • incorrect: Significantly penalizes the memory
JavaScript
// Record positive feedback
await client.feedback.create({
  memory_id: "memory-id",
  type: "useful",
  context: "User found this information helpful"
});

// Record negative feedback
await client.feedback.create({
  memory_id: "memory-id",
  type: "not_useful",
  context: "Information was outdated"
});
Python
# Record positive feedback
client.feedback.create(
    memory_id="memory-id",
    type="useful",
    context="User found this information helpful"
)

# Record negative feedback
client.feedback.create(
    memory_id="memory-id",
    type="not_useful",
    context="Information was outdated"
)
Bash
# Record positive feedback
curl -X POST https://api.mymemoryos.com/v1/feedback \
  -H "Authorization: Bearer $MEMORY_OS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "memory_id": "memory-id",
    "type": "useful",
    "context": "User found this information helpful"
  }'

6. Entity Co-occurrence (5%)

Memories that share entities with the query receive a boost. Entities are extracted automatically from memory content.

Example:

TEXT
Query: "What does Sarah think about the React migration?"
Entities: ["Sarah", "React"]

Memory 1: "Sarah mentioned concerns about the migration timeline"
Entities: ["Sarah", "migration"]
Entity overlap: 1 (Sarah)
Entity score: 0.5

Memory 2: "The React migration is scheduled for Q2"
Entities: ["React", "migration", "Q2"]
Entity overlap: 1 (React)
Entity score: 0.5

Memory 3: "Sarah prefers Vue over React"
Entities: ["Sarah", "Vue", "React"]
Entity overlap: 2 (Sarah, React)
Entity score: 1.0
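
The scores in this example are consistent with dividing the overlap by the number of query entities. A minimal sketch under that assumption (hypothetical helper, not part of the SDK):

JavaScript
// Hypothetical helper illustrating entity co-occurrence scoring.
// Assumes the score is the fraction of query entities that also appear in the memory.
function entityScore(queryEntities, memoryEntities) {
  const memorySet = new Set(memoryEntities);
  const overlap = queryEntities.filter((e) => memorySet.has(e)).length;
  return queryEntities.length ? overlap / queryEntities.length : 0;
}

entityScore(["Sarah", "React"], ["Sarah", "migration"]);    // => 0.5 (Memory 1)
entityScore(["Sarah", "React"], ["Sarah", "Vue", "React"]); // => 1.0 (Memory 3)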

Understanding Search Results

Search results include individual scores for transparency:

JSON
{
  "results": [
    {
      "id": "mem_123",
      "content": "User prefers Python for data analysis work",
      "similarity": 0.89,
      "combined_score": 0.82,
      "relevance_score": 0.75,
      "tier": "long",
      "created_at": "2024-01-10T15:30:00Z"
    }
  ],
  "search_type": "semantic",
  "threshold": 0.7
}
  • similarity: Raw semantic similarity (0-1)
  • combined_score: Final weighted score used for ranking
  • relevance_score: Stored relevance score (updated by decay and feedback)

Tuning for Your Use Case

Different applications may need to emphasize different factors. Here are common tuning patterns:

Real-time Chat (Prioritize Recency)

For chatbots where recent context matters most:

JavaScript
// Emphasize short-term memories and recent access
const results = await client.search({
  query: "What is the user currently working on?",
  tier: "short",  // Only look at short-term memories
  limit: 5
});

Knowledge Base (Prioritize Importance)

For FAQ or documentation systems where accuracy trumps recency:

JavaScript
// Focus on long-term semantic memories with high importance
const results = await client.search({
  query: "How does authentication work?",
  tier: "long",
  memory_nature: "semantic",
  threshold: 0.8  // Higher threshold for precision
});

Personalization (Balance All Factors)

For personalization engines that need both history and preferences:

JavaScript
// Use default scoring but filter by relevant metadata
const context = await client.getContext({
  query: "Personalize the homepage for this user",
  max_tokens: 3000
  // Default scoring works well for personalization
});

Minimum Thresholds

Use the threshold parameter to set a minimum combined score:

JavaScript
// Only return highly relevant memories
const results = await client.search({
  query: "User preferences",
  threshold: 0.8,  // 80% minimum combined score
  limit: 10
});

// More lenient for broad context
const broadResults = await client.search({
  query: "Any relevant user information",
  threshold: 0.5,  // 50% minimum
  limit: 50
});
Python
# Only return highly relevant memories
results = client.search(
    query="User preferences",
    threshold=0.8,  # 80% minimum combined score
    limit=10
)

# More lenient for broad context
broad_results = client.search(
    query="Any relevant user information",
    threshold=0.5,  # 50% minimum
    limit=50
)
Bash
# Only return highly relevant memories
curl -X POST https://api.mymemoryos.com/v1/search \
  -H "Authorization: Bearer $MEMORY_OS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "User preferences",
    "threshold": 0.8,
    "limit": 10
  }'

Best Practices

1. Set Meaningful Importance Scores

Don't leave everything at the default 0.5. Actively manage importance:

JavaScript
// Critical information
{ importance_score: 0.9 }  // User identity, key preferences

// Standard information
{ importance_score: 0.5 }  // Regular interactions

// Low-priority context
{ importance_score: 0.3 }  // Nice-to-have details

2. Collect Feedback

Implement feedback loops to improve relevance over time:

JavaScript
// After using a memory successfully
if (userFoundResponseHelpful) {
  await client.feedback.create({
    memory_id: usedMemoryId,
    type: "useful"
  });
}

3. Use Appropriate Tiers

The tier affects decay rate, which impacts recency scoring:

  • Short-term memories decay quickly, keeping them relevant only briefly
  • Long-term memories decay slowly, remaining relevant for months

4. Monitor Combined Scores

Track the combined scores of retrieved memories to calibrate your thresholds:

JavaScript
const results = await client.search({ query, limit: 10 });

// Log score distribution
const scores = results.results.map(r => r.combined_score);
console.log(`Score range: ${Math.min(...scores)} - ${Math.max(...scores)}`);
console.log(`Average: ${scores.reduce((a,b) => a+b, 0) / scores.length}`);