Collection Stats

Returns summary statistics for your organization's RAG collection: how many chunks it holds, how many unique sources those chunks came from, and which document types are represented. The response also reports the embedding model and the chunking settings. Useful for dashboards, health checks, and capacity planning.

Endpoint

GET /api/v1/rag/stats

Examples

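import asyncio

from cerebe import AsyncCerebe


async def main() -> None:
    client = AsyncCerebe(api_key="ck_live_...")

    # Fetch summary statistics for the organization's RAG collection
    stats = await client.rag.stats()
    print(stats.data)
    # → {
    #   "total_chunks": 127,
    #   "unique_sources": 34,
    #   "document_types": {"markdown": 120, "yaml": 7},
    #   "embedding_model": "text-embedding-3-small",
    #   "chunk_size": 1000,
    #   "chunk_overlap": 200,
    # }


# `await` is only valid inside a coroutine, so run main() on an event loop
asyncio.run(main())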

import Cerebe from '@cerebe/sdk'

const client = new Cerebe({ apiKey: 'ck_live_...' })

const stats = await client.rag.stats()
console.log(stats.data)

curl https://api.cerebe.ai/api/v1/rag/stats \
  -H "X-API-Key: ck_live_..."
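
For a health check that should not depend on the SDK, the same call can be made with Python's standard library. The following is a minimal sketch rather than an official client: the URL and the `X-API-Key` header are taken from the curl example above, while the helper names `build_stats_request` and `fetch_stats` and the 10-second timeout are illustrative assumptions.

```python
import json
import urllib.request

# URL taken from the curl example above
STATS_URL = "https://api.cerebe.ai/api/v1/rag/stats"


def build_stats_request(api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the API key in the X-API-Key header."""
    # Hypothetical helper name; only the URL and header come from the docs.
    return urllib.request.Request(
        STATS_URL,
        headers={"X-API-Key": api_key},
        method="GET",
    )


def fetch_stats(api_key: str, timeout: float = 10.0) -> dict:
    """Call the endpoint and return the `data` object from the JSON envelope."""
    request = build_stats_request(api_key)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    # which a health check can treat as a failure.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["data"]
```

Keeping request construction separate from the network call makes the header and URL easy to verify in a unit test without contacting the API.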

Response

{
  "message": "Stats retrieved",
  "data": {
    "total_chunks": 127,
    "unique_sources": 34,
    "document_types": {
      "markdown": 120,
      "yaml": 7
    },
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200
  }
}

| Field | Description |
| --- | --- |
| `total_chunks` | Total number of embedded chunks in your org |
| `unique_sources` | Number of distinct `source` values (documents) |
| `document_types` | Chunk count broken down by `doc_type` |
| `embedding_model` | Model used to produce the vector embeddings |
| `chunk_size` | Chunk size in characters used when embedding |
| `chunk_overlap` | Overlap size in characters between adjacent chunks |
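
For dashboards and capacity planning, a few derived figures can be computed from these fields. The sketch below uses the sample response from this page. The function name `summarize` and its output keys are hypothetical. The character estimate assumes fixed-size sliding-window chunking in which every chunk is full length and each new chunk advances by `chunk_size - chunk_overlap` characters. The API does not state this, so treat the figure as a rough upper bound.

```python
def summarize(stats: dict) -> dict:
    """Derive dashboard-friendly figures from the `data` object of the response."""
    total = stats["total_chunks"]
    sources = stats["unique_sources"]
    overlap = stats["chunk_overlap"]
    # Characters of new text contributed by each additional chunk (assumed)
    stride = stats["chunk_size"] - overlap

    # Fraction of all chunks belonging to each document type
    type_share = (
        {doc_type: count / total for doc_type, count in stats["document_types"].items()}
        if total
        else {}
    )

    return {
        "type_share": type_share,
        "avg_chunks_per_source": total / sources if sources else 0.0,
        # A source split into n full chunks spans roughly overlap + n * stride
        # characters; summing over all sources gives the estimate below.
        "approx_chars": sources * overlap + total * stride,
    }


# Sample values copied from the response shown on this page
sample = {
    "total_chunks": 127,
    "unique_sources": 34,
    "document_types": {"markdown": 120, "yaml": 7},
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200,
}

summary = summarize(sample)
print(summary["approx_chars"])  # 34 * 200 + 127 * 800 = 108400
```

Because the final chunk of each source is usually shorter than `chunk_size`, the true character count will tend to fall below `approx_chars`.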
