Search API
The Search API provides semantic search capabilities using vector embeddings. Unlike keyword search, semantic search understands the meaning of your query and finds conceptually related memories.
How Vector Search Works
- Query Processing: Your text query is converted into a 1536-dimensional embedding vector using OpenAI's text-embedding-3-small model
- Similarity Calculation: The query embedding is compared against all stored memory embeddings using cosine similarity
- Ranking: Results are ranked by a combined score that factors in:
  - Vector similarity (how semantically close the content is)
  - Relevance score (the memory's current relevance based on access patterns)
  - Importance score (manually set priority)
- Filtering: Results are filtered by the threshold and any optional parameters
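The pipeline above can be sketched in a few lines of Python. This is an illustrative model, not the server implementation: the weights match the combined-score formula documented later in this section, and `importance_score` is assumed to be normalized to the 0-1 range.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_memories(query_embedding, memories, threshold=0.7, limit=20):
    """Score, threshold-filter, and rank memories (illustrative sketch)."""
    scored = []
    for m in memories:
        sim = cosine_similarity(query_embedding, m["embedding"])
        if sim < threshold:
            continue  # Filtering step: below-threshold memories are dropped
        combined = (sim * 0.6
                    + m["relevance_score"] * 0.25
                    + m["importance_score"] * 0.15)
        scored.append({**m, "similarity": sim, "combined_score": combined})
    # Ranking step: highest combined score first
    scored.sort(key=lambda m: m["combined_score"], reverse=True)
    return scored[:limit]
```

In production the query embedding has 1536 dimensions; the two-dimensional vectors here just keep the arithmetic easy to follow.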
Semantic Search
POST /api/v1/search
Performs semantic search across your memories.
Required Scope: search:read
Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes* | - | Natural language search query |
| embedding | number[] | No | - | Pre-computed query embedding (1536 dimensions) |
| threshold | number | No | 0.7 | Minimum similarity score (0-1) |
| limit | integer | No | 20 | Maximum number of results (max 100) |
| tier | string | No | - | Filter by tier: short, medium, or long |
| memory_nature | string | No | - | Filter by nature: episodic or semantic |
| tags | string[] | No | - | Filter by tag names |
| entities | string[] | No | - | Filter by entity IDs |
*Either query or embedding is required.
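A minimal client-side check of these constraints before sending a request might look like the following; `validate_search_request` is a hypothetical helper, not part of any official client library.

```python
def validate_search_request(body):
    """Check the query/embedding requirement and parameter bounds before sending."""
    # Either query or embedding is required.
    if not body.get("query") and not body.get("embedding"):
        raise ValueError("Either 'query' or 'embedding' is required")
    embedding = body.get("embedding")
    if embedding is not None and len(embedding) != 1536:
        raise ValueError("'embedding' must have exactly 1536 dimensions")
    threshold = body.get("threshold", 0.7)
    if not 0 <= threshold <= 1:
        raise ValueError("'threshold' must be between 0 and 1")
    if body.get("limit", 20) > 100:
        raise ValueError("'limit' may not exceed 100")
```

Validating locally turns a rejected request into an immediate, descriptive error instead of a round trip to the API.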
Response
| Field | Type | Description |
|---|---|---|
| results | array | Array of search results |
| search_type | string | semantic, or text if the fallback was used |
| threshold | number | Applied similarity threshold |
Search Result Object
| Field | Type | Description |
|---|---|---|
| id | string | Memory ID |
| content | string | Memory content |
| content_type | string | Content type |
| tier | string | Memory tier |
| relevance_score | number | Current relevance score |
| similarity | number | Vector similarity to the query (0-1) |
| combined_score | number | Weighted combination score |
| metadata | object | Memory metadata |
| memory_nature | string | episodic or semantic |
| created_at | string | Creation timestamp |
cURL Example
curl -X POST "https://api.mymemoryos.com/api/v1/search" \
  -H "Authorization: Bearer mos_live_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the user interface preferences?",
    "threshold": 0.6,
    "limit": 10,
    "tier": "long"
  }'
JavaScript Example
const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mos_live_<your_key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    query: 'What are the user interface preferences?',
    threshold: 0.6,
    limit: 10,
    tier: 'long'
  })
});

const { data } = await response.json();
console.log(`Found ${data.results.length} relevant memories`);
for (const result of data.results) {
  console.log(`[${result.similarity.toFixed(2)}] ${result.content}`);
}
Python Example
import requests

response = requests.post(
    'https://api.mymemoryos.com/api/v1/search',
    headers={
        'Authorization': 'Bearer mos_live_<your_key>',
        'Content-Type': 'application/json'
    },
    json={
        'query': 'What are the user interface preferences?',
        'threshold': 0.6,
        'limit': 10,
        'tier': 'long'
    }
)

data = response.json()['data']
print(f"Found {len(data['results'])} relevant memories")
for result in data['results']:
    print(f"[{result['similarity']:.2f}] {result['content']}")
Response Example
{
  "data": {
    "results": [
      {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "content": "User prefers dark mode interfaces and minimal UI designs",
        "content_type": "fact",
        "tier": "long",
        "memory_nature": "semantic",
        "relevance_score": 0.85,
        "similarity": 0.92,
        "combined_score": 0.88,
        "metadata": {
          "category": "preferences"
        },
        "created_at": "2024-01-15T10:30:00.000Z"
      },
      {
        "id": "550e8400-e29b-41d4-a716-446655440001",
        "content": "User requested larger font sizes in settings",
        "content_type": "fact",
        "tier": "long",
        "memory_nature": "semantic",
        "relevance_score": 0.72,
        "similarity": 0.78,
        "combined_score": 0.75,
        "metadata": {},
        "created_at": "2024-01-14T09:15:00.000Z"
      }
    ],
    "search_type": "semantic",
    "threshold": 0.6
  },
  "meta": {
    "request_id": "req_abc123",
    "latency_ms": 120
  }
}
Using Pre-computed Embeddings
If you're generating embeddings client-side, you can pass them directly:
// Generate the query embedding using OpenAI's API
const embeddingResponse = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'user interface preferences'
});
const queryEmbedding = embeddingResponse.data[0].embedding;

// Search with the pre-computed embedding
const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mos_live_<your_key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    embedding: queryEmbedding,
    threshold: 0.7,
    limit: 20
  })
});
Note: The embedding must be a 1536-dimensional vector from OpenAI's text-embedding-3-small model.
Threshold Tuning Guide
The threshold parameter controls the minimum similarity score for results. Choosing the right threshold balances precision and recall.
Threshold Guidelines
| Threshold | Precision | Recall | Use Case |
|---|---|---|---|
| 0.9+ | Very High | Very Low | Exact semantic matches only |
| 0.8-0.9 | High | Low | Strong topical relevance |
| 0.7-0.8 | Medium | Medium | General semantic search (default) |
| 0.6-0.7 | Low | High | Exploratory search, related topics |
| < 0.6 | Very Low | Very High | Broad discovery, may include noise |
Tuning Strategy
// Start with the default threshold
let threshold = 0.7;

// Search and evaluate the results
let results = await search(query, { threshold });

// Too few results? Lower the threshold and search again.
if (results.length < 5) {
  threshold = 0.6;
  results = await search(query, { threshold });
}
// Too many marginal matches? Raise the threshold and search again.
else if (results.some(r => r.similarity < 0.65)) {
  threshold = 0.75;
  results = await search(query, { threshold });
}
Query-Specific Thresholds
Different query types benefit from different thresholds:
| Query Type | Recommended Threshold | Reason |
|---|---|---|
| Specific facts | 0.8+ | Need precise matches |
| General topics | 0.7 | Balance precision/recall |
| Exploratory | 0.5-0.6 | Cast a wide net |
| Conversational | 0.65 | Allow context flexibility |
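These recommendations can be encoded as a small lookup table in client code. A sketch in Python, with hypothetical query-type names and the table's 0.5-0.6 range represented by its midpoint:

```python
# Recommended starting thresholds per query type; values mirror the table above.
QUERY_TYPE_THRESHOLDS = {
    "specific_fact": 0.8,
    "general_topic": 0.7,
    "exploratory": 0.55,
    "conversational": 0.65,
}

def threshold_for(query_type):
    """Look up a starting threshold, falling back to the API default of 0.7."""
    return QUERY_TYPE_THRESHOLDS.get(query_type, 0.7)
```

Classifying the query type is left to the caller; even a simple heuristic (e.g., treating question-mark-free keyword queries as exploratory) gives a better starting point than a single fixed threshold.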
Filtering Search Results
Combine filters to narrow search scope:
const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mos_live_<your_key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    query: 'project meeting notes',
    threshold: 0.6,
    limit: 20,
    tier: 'medium',              // Only medium-term memories
    memory_nature: 'episodic',   // Only event-based memories
    tags: ['work', 'meetings'],  // Must have these tags
    entities: ['entity_123']     // Related to a specific entity
  })
});
Fallback Behavior
If vector search fails (e.g., embedding service unavailable), the API automatically falls back to text-based search:
{
  "data": {
    "results": [...],
    "search_type": "text"
  }
}
The search_type field indicates which search method was used. Text-search results use a default similarity of 0.5.
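Because fallback results all carry the fixed 0.5 similarity, ordering or filtering by similarity is only meaningful for semantic results. A hypothetical client-side handler, sketched in Python:

```python
def interpret_results(data):
    """Separate semantic results from text-fallback results (sketch)."""
    if data["search_type"] == "text":
        # Fallback: similarity is a fixed 0.5 placeholder, so don't rank
        # or threshold-filter on it.
        return {"fallback": True, "results": data["results"]}
    # Semantic search: similarity scores are real and results are pre-ranked.
    return {"fallback": False, "results": data["results"]}
```

Surfacing the fallback flag to your application lets you, for example, hide similarity scores in the UI or retry the semantic search later.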
Performance Considerations
Query Optimization
- Be specific: More specific queries produce better embeddings
- Use filters: Narrow the search space with tier, tags, or entities
- Limit results: Request only what you need
Latency Factors
| Factor | Impact | Mitigation |
|---|---|---|
| Query length | Minimal | None needed |
| Memory count | Linear | Use filters to reduce scope |
| Result limit | Minimal | None needed |
| Embedding generation | ~50-100ms | Pre-compute embeddings |
| Database query | Variable | Indexed queries |
Caching Strategy
For repeated queries, consider caching results:
const cache = new Map();

async function searchWithCache(query, options = {}) {
  const cacheKey = JSON.stringify({ query, ...options });
  if (cache.has(cacheKey)) {
    const cached = cache.get(cacheKey);
    if (Date.now() - cached.timestamp < 60000) { // 1-minute TTL
      return cached.results;
    }
  }
  const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer mos_live_<your_key>',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ query, ...options })
  });
  const { data } = await response.json();
  cache.set(cacheKey, {
    results: data.results,
    timestamp: Date.now()
  });
  return data.results;
}
Combined Score Calculation
The combined_score is calculated as:
combined_score = (similarity * 0.6) + (relevance_score * 0.25) + (importance_score * 0.15)
This weighting ensures:
- Semantic relevance is the primary factor (60%)
- Recent access patterns matter (25%)
- Manual importance settings are considered (15%)
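Applying the formula to the first result in the response example above: similarity 0.92 contributes 0.552 and relevance 0.85 contributes 0.2125, so the reported combined_score of 0.88 implies an importance_score of roughly 0.77 (the response does not include the importance field, so this is an inference, and the reported value may be rounded). A quick check in Python:

```python
def combined_score(similarity, relevance_score, importance_score):
    # Weighted sum: 60% similarity, 25% relevance, 15% importance.
    return similarity * 0.6 + relevance_score * 0.25 + importance_score * 0.15

# First result from the response example; importance 0.77 is inferred, not shown.
score = combined_score(similarity=0.92, relevance_score=0.85, importance_score=0.77)
```

Rounding `score` to two decimals reproduces the 0.88 shown in the example.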