Search API
The Search API provides semantic search capabilities using vector embeddings. Unlike keyword search, semantic search understands the meaning of your query and finds conceptually related memories.
How Vector Search Works
- Query Processing: Your text query is converted into a 1536-dimensional embedding vector using OpenAI's text-embedding-3-small model
- Similarity Calculation: The query embedding is compared against all stored memory embeddings using cosine similarity
- Ranking: Results are ranked by a combined score that factors in:
  - Vector similarity (how semantically close the content is)
  - Relevance score (the memory's current relevance based on access patterns)
  - Importance score (manually set priority)
- Filtering: Results are filtered by the threshold and any optional parameters
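The pipeline above can be sketched in a few lines of Python. This is an illustrative model, not the server implementation: the weights match the combined-score formula documented later in this section, and `importance_score` is assumed to be normalized to the 0-1 range.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_memories(query_embedding, memories, threshold=0.7, limit=20):
    """Score, threshold-filter, and rank memories (illustrative sketch)."""
    scored = []
    for m in memories:
        sim = cosine_similarity(query_embedding, m["embedding"])
        if sim < threshold:
            continue  # Filtering step: below-threshold memories are dropped
        combined = (sim * 0.6
                    + m["relevance_score"] * 0.25
                    + m["importance_score"] * 0.15)
        scored.append({**m, "similarity": sim, "combined_score": combined})
    # Ranking step: highest combined score first
    scored.sort(key=lambda m: m["combined_score"], reverse=True)
    return scored[:limit]
```

In production the query embedding has 1536 dimensions; the two-dimensional vectors here just keep the arithmetic easy to follow.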
Semantic Search
POST /api/v1/search
Performs semantic search across your memories.
Required Scope: search:read
Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes* | - | Natural language search query |
| embedding | number[] | No | - | Pre-computed query embedding (1536 dimensions) |
| threshold | number | No | 0.7 | Minimum similarity score (0-1) |
| limit | integer | No | 20 | Maximum number of results (max 100) |
| tier | string | No | - | Filter by tier: short, medium, or long |
| memory_nature | string | No | - | Filter by nature: episodic or semantic |
| tags | string[] | No | - | Filter by tag names |
| entities | string[] | No | - | Filter by entity IDs |
*Either query or embedding is required.
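A minimal client-side check of these constraints before sending a request might look like the following; `validate_search_request` is a hypothetical helper, not part of any official client library.

```python
def validate_search_request(body):
    """Check the query/embedding requirement and parameter bounds before sending."""
    # Either query or embedding is required.
    if not body.get("query") and not body.get("embedding"):
        raise ValueError("Either 'query' or 'embedding' is required")
    embedding = body.get("embedding")
    if embedding is not None and len(embedding) != 1536:
        raise ValueError("'embedding' must have exactly 1536 dimensions")
    threshold = body.get("threshold", 0.7)
    if not 0 <= threshold <= 1:
        raise ValueError("'threshold' must be between 0 and 1")
    if body.get("limit", 20) > 100:
        raise ValueError("'limit' may not exceed 100")
```

Validating locally turns a rejected request into an immediate, descriptive error instead of a round trip to the API.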
Response
| Field | Type | Description |
|---|---|---|
| results | array | Array of search results |
| search_type | string | semantic, or text if the fallback was used |
| threshold | number | Applied similarity threshold |
Search Result Object
| Field | Type | Description |
|---|---|---|
| id | string | Memory ID |
| content | string | Memory content |
| content_type | string | Content type |
| tier | string | Memory tier |
| relevance_score | number | Current relevance score |
| similarity | number | Vector similarity to the query (0-1) |
| combined_score | number | Weighted combination score |
| metadata | object | Memory metadata |
| memory_nature | string | episodic or semantic |
| created_at | string | Creation timestamp |
cURL Example
curl -X POST "https://api.mymemoryos.com/api/v1/search" \
  -H "Authorization: Bearer mos_live_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the user interface preferences?",
    "threshold": 0.6,
    "limit": 10,
    "tier": "long"
  }'
JavaScript Example
const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mos_live_<your_key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    query: 'What are the user interface preferences?',
    threshold: 0.6,
    limit: 10,
    tier: 'long'
  })
});

const { data } = await response.json();
console.log(`Found ${data.results.length} relevant memories`);
for (const result of data.results) {
  console.log(`[${result.similarity.toFixed(2)}] ${result.content}`);
}
Python Example
import requests

response = requests.post(
    'https://api.mymemoryos.com/api/v1/search',
    headers={
        'Authorization': 'Bearer mos_live_<your_key>',
        'Content-Type': 'application/json'
    },
    json={
        'query': 'What are the user interface preferences?',
        'threshold': 0.6,
        'limit': 10,
        'tier': 'long'
    }
)

data = response.json()['data']
print(f"Found {len(data['results'])} relevant memories")
for result in data['results']:
    print(f"[{result['similarity']:.2f}] {result['content']}")
Response Example
{
  "data": {
    "results": [
      {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "content": "User prefers dark mode interfaces and minimal UI designs",
        "content_type": "fact",
        "tier": "long",
        "memory_nature": "semantic",
        "relevance_score": 0.85,
        "similarity": 0.92,
        "combined_score": 0.88,
        "metadata": {
          "category": "preferences"
        },
        "created_at": "2024-01-15T10:30:00.000Z"
      },
      {
        "id": "550e8400-e29b-41d4-a716-446655440001",
        "content": "User requested larger font sizes in settings",
        "content_type": "fact",
        "tier": "long",
        "memory_nature": "semantic",
        "relevance_score": 0.72,
        "similarity": 0.78,
        "combined_score": 0.75,
        "metadata": {},
        "created_at": "2024-01-14T09:15:00.000Z"
      }
    ],
    "search_type": "semantic",
    "threshold": 0.6
  },
  "meta": {
    "request_id": "req_abc123",
    "latency_ms": 120
  }
}
Using Pre-computed Embeddings
If you're generating embeddings client-side, you can pass them directly:
// Generate the query embedding using OpenAI's API
const embeddingResponse = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'user interface preferences'
});
const queryEmbedding = embeddingResponse.data[0].embedding;

// Search with the pre-computed embedding
const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mos_live_<your_key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    embedding: queryEmbedding,
    threshold: 0.7,
    limit: 20
  })
});
Note: The embedding must be a 1536-dimensional vector from OpenAI's text-embedding-3-small model.
Threshold Tuning Guide
The threshold parameter controls the minimum similarity score for results. Choosing the right threshold balances precision and recall.
Threshold Guidelines
| Threshold | Precision | Recall | Use Case |
|---|---|---|---|
| 0.9+ | Very High | Very Low | Exact semantic matches only |
| 0.8-0.9 | High | Low | Strong topical relevance |
| 0.7-0.8 | Medium | Medium | General semantic search (default) |
| 0.6-0.7 | Low | High | Exploratory search, related topics |
| < 0.6 | Very Low | Very High | Broad discovery, may include noise |
Tuning Strategy
// Start with the default threshold
let threshold = 0.7;

// Search and evaluate the results
let results = await search(query, { threshold });

// Too few results? Lower the threshold and search again.
if (results.length < 5) {
  threshold = 0.6;
  results = await search(query, { threshold });
}
// Too many marginal matches? Raise the threshold and search again.
else if (results.some(r => r.similarity < 0.65)) {
  threshold = 0.75;
  results = await search(query, { threshold });
}
Query-Specific Thresholds
Different query types benefit from different thresholds:
| Query Type | Recommended Threshold | Reason |
|---|---|---|
| Specific facts | 0.8+ | Need precise matches |
| General topics | 0.7 | Balance precision/recall |
| Exploratory | 0.5-0.6 | Cast a wide net |
| Conversational | 0.65 | Allow context flexibility |
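These recommendations can be encoded as a small lookup table in client code. A sketch in Python, with hypothetical query-type names and the table's 0.5-0.6 range represented by its midpoint:

```python
# Recommended starting thresholds per query type; values mirror the table above.
QUERY_TYPE_THRESHOLDS = {
    "specific_fact": 0.8,
    "general_topic": 0.7,
    "exploratory": 0.55,
    "conversational": 0.65,
}

def threshold_for(query_type):
    """Look up a starting threshold, falling back to the API default of 0.7."""
    return QUERY_TYPE_THRESHOLDS.get(query_type, 0.7)
```

Classifying the query type is left to the caller; even a simple heuristic (e.g., treating question-mark-free keyword queries as exploratory) gives a better starting point than a single fixed threshold.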
Filtering Search Results
Combine filters to narrow search scope:
const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mos_live_<your_key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    query: 'project meeting notes',
    threshold: 0.6,
    limit: 20,
    tier: 'medium',              // Only medium-term memories
    memory_nature: 'episodic',   // Only event-based memories
    tags: ['work', 'meetings'],  // Must have these tags
    entities: ['entity_123']     // Related to a specific entity
  })
});
Fallback Behavior
If vector search fails (e.g., embedding service unavailable), the API automatically falls back to text-based search:
{
  "data": {
    "results": [...],
    "search_type": "text"
  }
}
The search_type field indicates which search method was used. Text-search results use a default similarity of 0.5.
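Because fallback results all carry the fixed 0.5 similarity, ordering or filtering by similarity is only meaningful for semantic results. A hypothetical client-side handler, sketched in Python:

```python
def interpret_results(data):
    """Separate semantic results from text-fallback results (sketch)."""
    if data["search_type"] == "text":
        # Fallback: similarity is a fixed 0.5 placeholder, so don't rank
        # or threshold-filter on it.
        return {"fallback": True, "results": data["results"]}
    # Semantic search: similarity scores are real and results are pre-ranked.
    return {"fallback": False, "results": data["results"]}
```

Surfacing the fallback flag to your application lets you, for example, hide similarity scores in the UI or retry the semantic search later.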
Performance Considerations
Query Optimization
- Be specific: More specific queries produce better embeddings
- Use filters: Narrow the search space with tier, tags, or entities
- Limit results: Request only what you need
Latency Factors
| Factor | Impact | Mitigation |
|---|---|---|
| Query length | Minimal | None needed |
| Memory count | Linear | Use filters to reduce scope |
| Result limit | Minimal | None needed |
| Embedding generation | ~50-100ms | Pre-compute embeddings |
| Database query | Variable | Indexed queries |
Caching Strategy
For repeated queries, consider caching results:
const cache = new Map();

async function searchWithCache(query, options = {}) {
  const cacheKey = JSON.stringify({ query, ...options });
  if (cache.has(cacheKey)) {
    const cached = cache.get(cacheKey);
    if (Date.now() - cached.timestamp < 60000) { // 1-minute TTL
      return cached.results;
    }
  }
  const response = await fetch('https://api.mymemoryos.com/api/v1/search', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer mos_live_<your_key>',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ query, ...options })
  });
  const { data } = await response.json();
  cache.set(cacheKey, {
    results: data.results,
    timestamp: Date.now()
  });
  return data.results;
}
Combined Score Calculation
The combined_score is calculated as:
combined_score = (similarity * 0.6) + (relevance_score * 0.25) + (importance_score * 0.15)
This weighting ensures:
- Semantic relevance is the primary factor (60%)
- Recent access patterns matter (25%)
- Manual importance settings are considered (15%)
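Applying the formula to the first result in the response example above: similarity 0.92 contributes 0.552 and relevance 0.85 contributes 0.2125, so the reported combined_score of 0.88 implies an importance_score of roughly 0.77 (the response does not include the importance field, so this is an inference, and the reported value may be rounded). A quick check in Python:

```python
def combined_score(similarity, relevance_score, importance_score):
    # Weighted sum: 60% similarity, 25% relevance, 15% importance.
    return similarity * 0.6 + relevance_score * 0.25 + importance_score * 0.15

# First result from the response example; importance 0.77 is inferred, not shown.
score = combined_score(similarity=0.92, relevance_score=0.85, importance_score=0.77)
```

Rounding `score` to two decimals reproduces the 0.88 shown in the example.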