Search Documents
Retrieve semantically relevant document chunks via vector similarity, hybrid search, or nearest-neighbour matching.
RAG exposes three retrieval surfaces — all operate over the chunks you've embedded via embed and are scoped to your organization.
Semantic search
Vector similarity search on the query's embedding.
Endpoint
POST /api/v1/rag/search
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Natural language search query (min length 1) |
| k | integer | No | Number of results (1–20, default: 5) |
| doc_type | string | No | Filter by document type (e.g. markdown) |
| source_pattern | string | No | Substring match on the source field |
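The constraints in the table can be checked client-side before a request is sent. The sketch below is illustrative only: build_search_payload is a hypothetical helper, not part of the Cerebe SDK, and it assumes the server enforces the documented limits (non-empty query, k between 1 and 20) exactly as written.

```python
from typing import Any, Dict, Optional


def build_search_payload(
    query: str,
    k: int = 5,
    doc_type: Optional[str] = None,
    source_pattern: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON body for POST /api/v1/rag/search (hypothetical helper)."""
    if len(query) < 1:
        raise ValueError("query must contain at least one character")
    if not 1 <= k <= 20:
        raise ValueError("k must be between 1 and 20")

    payload: Dict[str, Any] = {"query": query, "k": k}
    # Optional filters are left out of the body entirely when unset.
    if doc_type is not None:
        payload["doc_type"] = doc_type
    if source_pattern is not None:
        payload["source_pattern"] = source_pattern
    return payload


payload = build_search_payload("how does authentication work?", k=3, doc_type="markdown")
print(payload)
```

Failing fast on an out-of-range k avoids a round trip that would presumably be rejected anyway.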
Examples
# Python (run inside an async function)
from cerebe import AsyncCerebe

client = AsyncCerebe(api_key="ck_live_...")
results = await client.rag.search(
    query="how does authentication work?",
    k=3,
    doc_type="markdown",
)
for r in results.data["results"]:
    print(f"{r['source']} (score={r['score']:.3f})")
    print(r["content"][:200])

// TypeScript
import Cerebe from '@cerebe/sdk'

const client = new Cerebe({ apiKey: 'ck_live_...' })
const results = await client.rag.search({
  query: 'how does authentication work?',
  k: 3,
  docType: 'markdown',
})
for (const r of results.data.results) {
  console.log(`${r.source} (score=${r.score.toFixed(3)})`)
  console.log(r.content.slice(0, 200))
}

# cURL
curl -X POST https://api.cerebe.ai/api/v1/rag/search \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "how does authentication work?",
    "k": 3,
    "doc_type": "markdown"
  }'
Response
{
  "message": "Search completed",
  "data": {
    "results": [
      {
        "source": "docs/auth.md",
        "content": "Authentication uses API-key headers...",
        "score": 0.84,
        "metadata": { "version": "v2", "team": "platform" },
        "doc_type": "markdown"
      }
    ]
  }
}
metadata contains whatever was supplied at embed time; it is not auto-populated with doc_type or any other system fields.
score is the cosine similarity in [0, 1]. Higher is more similar.
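Because a higher score means a closer match, a common post-processing step is to drop weak matches before handing chunks to a model. This is a minimal sketch, assuming the response shape shown above; filter_by_score and the 0.5 cutoff are illustrative choices, not SDK features or recommended values, and the sample records are made up.

```python
from typing import Any, Dict, List


def filter_by_score(results: List[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Keep results whose score is at least min_score, best match first."""
    kept = [r for r in results if r["score"] >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


# Made-up sample data in the documented response shape.
sample = [
    {"source": "docs/billing.md", "content": "Invoices are issued monthly...", "score": 0.31},
    {"source": "docs/auth.md", "content": "Authentication uses API-key headers...", "score": 0.84},
]

strong = filter_by_score(sample, min_score=0.5)
print([r["source"] for r in strong])  # ['docs/auth.md']
```

A suitable cutoff depends on the embedding model and the corpus, so it is worth tuning against real queries rather than copying the value used here.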
Hybrid search
Weighted blend of semantic similarity and keyword overlap. Useful when queries contain specific terms, acronyms, or proper nouns that pure vector search can miss.
Endpoint
POST /api/v1/rag/search/hybrid
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Natural language search query |
| k | integer | No | Number of results (1–20, default: 5) |
| semantic_weight | float | No | Weight for semantic score (0.0–1.0, default: 0.7) |
| keyword_weight | float | No | Weight for keyword score (0.0–1.0, default: 0.3) |
| doc_type | string | No | Filter by document type |
The weights don't have to sum to 1.0 — the final score is a weighted sum and results are ranked by it. The default 0.7 / 0.3 split favours semantic matches while still surfacing exact-term hits.
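To make the blend concrete, here is a minimal sketch of a weighted sum. It assumes the final score is exactly semantic_weight × semantic_score + keyword_weight × keyword_score with no further normalization; the service's actual formula is not specified beyond "weighted sum", hybrid_score is a hypothetical helper, and the component scores are example values.

```python
def hybrid_score(
    semantic_score: float,
    keyword_score: float,
    semantic_weight: float = 0.7,
    keyword_weight: float = 0.3,
) -> float:
    """Assumed blend: a plain weighted sum of the two component scores."""
    for name, weight in (("semantic_weight", semantic_weight), ("keyword_weight", keyword_weight)):
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return semantic_weight * semantic_score + keyword_weight * keyword_score


# Default 0.7 / 0.3 split: 0.7 * 0.81 + 0.3 * 0.72 = 0.567 + 0.216 = 0.783
print(round(hybrid_score(0.81, 0.72), 3))            # 0.783
# Weights need not sum to 1.0; the blended value can then exceed 1.0.
print(round(hybrid_score(0.81, 0.72, 1.0, 1.0), 3))  # 1.53
```

Under this assumption, scaling both weights by the same factor changes the absolute scores but not the order of the results.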
Examples
# Python
results = await client.rag.hybrid_search(
    query="JWT middleware rate-limiting",
    k=5,
    semantic_weight=0.6,
    keyword_weight=0.4,
)

// TypeScript
const results = await client.rag.hybridSearch({
  query: 'JWT middleware rate-limiting',
  k: 5,
  semanticWeight: 0.6,
  keywordWeight: 0.4,
})

# cURL
curl -X POST https://api.cerebe.ai/api/v1/rag/search/hybrid \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "JWT middleware rate-limiting",
    "k": 5,
    "semantic_weight": 0.6,
    "keyword_weight": 0.4
  }'
Response
Hybrid results include both component scores alongside the final score so you can see what's driving the ranking:
{
  "message": "Hybrid search completed",
  "data": {
    "results": [
      {
        "source": "docs/auth.md",
        "content": "...",
        "score": 0.78,
        "semantic_score": 0.81,
        "keyword_score": 0.72,
        "metadata": {},
        "doc_type": "markdown"
      }
    ]
  }
}
Find similar documents
Given a content block, return the chunks most similar to it. Useful for "find related docs" or recommendation-style flows. Operates on the content's embedding (not a query embedding) so you can pass a paragraph, a title, or even an entire short document.
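Conceptually this is a nearest-neighbour lookup: embed the supplied content, then rank the stored chunk embeddings by their similarity to it. The sketch below illustrates the idea with tiny hand-written vectors and a brute-force scan. It is not Cerebe's implementation; the vectors, the cosine helper, and top_k_similar are all made up for illustration, and a production service would typically use an approximate index rather than scanning every chunk.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(
    content_vec: List[float], chunks: Dict[str, List[float]], k: int = 5
) -> List[Tuple[str, float]]:
    """Rank stored chunk vectors by similarity to content_vec, best first."""
    scored = [(source, cosine(content_vec, vec)) for source, vec in chunks.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
chunks = {
    "docs/auth.md": [0.9, 0.1, 0.0],
    "docs/jwks.md": [0.8, 0.3, 0.1],
    "docs/billing.md": [0.0, 0.2, 0.9],
}
content_vec = [1.0, 0.2, 0.0]  # stands in for the embedding of the content block

for source, score in top_k_similar(content_vec, chunks, k=2):
    print(f"{source} (score={score:.3f})")
```

With these toy vectors the two auth-related chunks rank above the billing chunk, which mirrors the "find related docs" use case described above.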
Endpoint
POST /api/v1/rag/search/similar
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | Content block to find similar documents for |
| k | integer | No | Number of results (1–20, default: 5) |
| doc_type | string | No | Filter by document type |
Examples
# Python
results = await client.rag.find_similar(
    content="JWT tokens are validated via the JWKS endpoint",
    k=3,
)

// TypeScript
const results = await client.rag.findSimilar({
  content: 'JWT tokens are validated via the JWKS endpoint',
  k: 3,
})

# cURL
curl -X POST https://api.cerebe.ai/api/v1/rag/search/similar \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "JWT tokens are validated via the JWKS endpoint",
    "k": 3
  }'
Response shape is identical to POST /api/v1/rag/search.