Retrieve semantically relevant document chunks via vector similarity, hybrid search, or nearest-neighbour matching.

Search Documents

RAG exposes three retrieval surfaces — all operate over the chunks you've embedded via embed and are scoped to your organization.

Semantic search

Vector similarity search on the query's embedding.

Endpoint

POST /api/v1/rag/search

Request Body

Parameter	Type	Required	Description
`query`	`string`	Yes	Natural language search query (min length 1)
`k`	`integer`	No	Number of results (1–20, default: 5)
`doc_type`	`string`	No	Filter by document type (e.g. `markdown`)
`source_pattern`	`string`	No	Substring match on the `source` field

Examples

from cerebe import AsyncCerebe

client = AsyncCerebe(api_key="ck_live_...")

results = await client.rag.search(
    query="how does authentication work?",
    k=3,
    doc_type="markdown",
)

for r in results.data["results"]:
    print(f"{r['source']} (score={r['score']:.3f})")
    print(r["content"][:200])

import Cerebe from '@cerebe/sdk'

const client = new Cerebe({ apiKey: 'ck_live_...' })

const results = await client.rag.search({
  query: 'how does authentication work?',
  k: 3,
  docType: 'markdown',
})

for (const r of results.data.results) {
  console.log(`${r.source} (score=${r.score.toFixed(3)})`)
  console.log(r.content.slice(0, 200))
}

curl -X POST https://api.cerebe.ai/api/v1/rag/search \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "how does authentication work?",
    "k": 3,
    "doc_type": "markdown"
  }'

Response

{
  "message": "Search completed",
  "data": {
    "results": [
      {
        "source": "docs/auth.md",
        "content": "Authentication uses API-key headers...",
        "score": 0.84,
        "metadata": { "version": "v2", "team": "platform" },
        "doc_type": "markdown"
      }
    ]
  }
}

metadata contains whatever was supplied at embed time — it is not auto-populated with doc_type or any other system fields.

score is the cosine similarity in [0, 1]. Higher is more similar.

Hybrid search

Weighted blend of semantic similarity and keyword overlap. Useful when queries contain specific terms, acronyms, or proper nouns that pure vector search can miss.

Endpoint

POST /api/v1/rag/search/hybrid

Request Body

Parameter	Type	Required	Description
`query`	`string`	Yes	Natural language search query
`k`	`integer`	No	Number of results (1–20, default: 5)
`semantic_weight`	`float`	No	Weight for semantic score (0.0–1.0, default: 0.7)
`keyword_weight`	`float`	No	Weight for keyword score (0.0–1.0, default: 0.3)
`doc_type`	`string`	No	Filter by document type

The weights don't have to sum to 1.0 — the final score is a weighted sum and results are ranked by it. The default 0.7 / 0.3 split favours semantic matches while still surfacing exact-term hits.

Examples

results = await client.rag.hybrid_search(
    query="JWT middleware rate-limiting",
    k=5,
    semantic_weight=0.6,
    keyword_weight=0.4,
)

const results = await client.rag.hybridSearch({
  query: 'JWT middleware rate-limiting',
  k: 5,
  semanticWeight: 0.6,
  keywordWeight: 0.4,
})

curl -X POST https://api.cerebe.ai/api/v1/rag/search/hybrid \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "JWT middleware rate-limiting",
    "k": 5,
    "semantic_weight": 0.6,
    "keyword_weight": 0.4
  }'

Response

Hybrid results include both component scores alongside the final score so you can see what's driving the ranking:

{
  "message": "Hybrid search completed",
  "data": {
    "results": [
      {
        "source": "docs/auth.md",
        "content": "...",
        "score": 0.78,
        "semantic_score": 0.81,
        "keyword_score": 0.72,
        "metadata": {},
        "doc_type": "markdown"
      }
    ]
  }
}

Given a content block, return the chunks most similar to it. Useful for "find related docs" or recommendation-style flows. Operates on the content's embedding (not a query embedding) so you can pass a paragraph, a title, or even an entire short document.

Endpoint

POST /api/v1/rag/search/similar

Request Body

Parameter	Type	Required	Description
`content`	`string`	Yes	Content block to find similar documents for
`k`	`integer`	No	Number of results (1–20, default: 5)
`doc_type`	`string`	No	Filter by document type

Examples

results = await client.rag.find_similar(
    content="JWT tokens are validated via the JWKS endpoint",
    k=3,
)

const results = await client.rag.findSimilar({
  content: 'JWT tokens are validated via the JWKS endpoint',
  k: 3,
})

curl -X POST https://api.cerebe.ai/api/v1/rag/search/similar \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "JWT tokens are validated via the JWKS endpoint",
    "k": 3
  }'

Response shape is identical to POST /api/v1/rag/search.

Search Documents

Search Documents

Semantic search

Endpoint

Request Body

Examples

Response

Hybrid search

Endpoint

Request Body

Examples

Response

Find similar documents

Endpoint

Request Body

Examples

On this page