Cerebe Docs

Collection Stats

Retrieve collection statistics for the RAG corpus.

Returns summary statistics for your organization's RAG collection: the number of chunks, the number of unique sources, and which document types are represented. Useful for dashboards, health checks, and capacity planning.
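As a rough sketch of the health-check use case, the helper below inspects a stats payload shaped like the `data` object documented under Response on this page. The function name `check_collection_health`, the `min_chunks` threshold, and the rule that the `document_types` counts should add up to `total_chunks` are illustrative assumptions, not part of the Cerebe API.

```python
from __future__ import annotations


def check_collection_health(data: dict, min_chunks: int = 1) -> list[str]:
    """Return a list of human-readable problems; an empty list means healthy."""
    problems: list[str] = []
    total = data.get("total_chunks", 0)
    sources = data.get("unique_sources", 0)

    if total < min_chunks:
        problems.append(f"only {total} chunks indexed (expected at least {min_chunks})")
    if sources > total:
        problems.append(f"{sources} sources but only {total} chunks")

    # Assumption: every chunk carries exactly one doc_type, so the
    # per-type counts should add up to the overall chunk count.
    typed = sum(data.get("document_types", {}).values())
    if typed != total:
        problems.append(f"document_types sums to {typed}, but total_chunks is {total}")
    return problems


# Sample payload copied from the response shown on this page.
sample = {
    "total_chunks": 127,
    "unique_sources": 34,
    "document_types": {"markdown": 120, "yaml": 7},
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200,
}
print(check_collection_health(sample))  # → []
```

Returning a list rather than raising lets a dashboard show every problem at once; a stricter health check could fail on the first entry instead.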

Endpoint

GET /api/v1/rag/stats

Examples

Python

import asyncio

from cerebe import AsyncCerebe


async def main() -> None:
    client = AsyncCerebe(api_key="ck_live_...")

    # Fetch summary statistics for the organization's RAG collection.
    stats = await client.rag.stats()
    print(stats.data)
    # → {
    #   "total_chunks": 127,
    #   "unique_sources": 34,
    #   "document_types": {"markdown": 120, "yaml": 7},
    #   "embedding_model": "text-embedding-3-small",
    #   "chunk_size": 1000,
    #   "chunk_overlap": 200,
    # }


# `await` is only valid inside a coroutine, so run it on an event loop.
asyncio.run(main())

JavaScript / TypeScript

import Cerebe from '@cerebe/sdk'

const client = new Cerebe({ apiKey: 'ck_live_...' })

const stats = await client.rag.stats()
console.log(stats.data)

cURL

curl https://api.cerebe.ai/api/v1/rag/stats \
  -H "X-API-Key: ck_live_..."
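If no SDK is available, the same call can be made with Python's standard library. This is a minimal sketch that reuses the URL and `X-API-Key` header from the cURL example; the helper names `build_stats_request` and `fetch_stats` are hypothetical, and error handling is left out.

```python
import json
import urllib.request

BASE_URL = "https://api.cerebe.ai"  # host taken from the cURL example


def build_stats_request(api_key: str) -> urllib.request.Request:
    """Build the GET request for the collection stats endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/rag/stats",
        headers={"X-API-Key": api_key},
        method="GET",
    )


def fetch_stats(api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body."""
    request = build_stats_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_stats("ck_live_...")["data"]` would then yield the statistics object, assuming the body has the shape shown under Response. Splitting request construction from sending keeps the first half testable without network access.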

Response

{
  "message": "Stats retrieved",
  "data": {
    "total_chunks": 127,
    "unique_sources": 34,
    "document_types": {
      "markdown": 120,
      "yaml": 7
    },
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200
  }
}

| Field | Description |
| --- | --- |
| total_chunks | Total number of embedded chunks in your org |
| unique_sources | Number of distinct source values (documents) |
| document_types | Chunk count broken down by doc_type |
| embedding_model | Model used to produce the vector embeddings |
| chunk_size | Chunk size in characters used when embedding |
| chunk_overlap | Overlap size in characters between adjacent chunks |
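For capacity planning, a few derived figures can be computed from these fields. The sketch below is illustrative only: `summarize` is a hypothetical helper, and the size estimate assumes a plain sliding-window chunker in which each chunk starts `chunk_size - chunk_overlap` characters after the previous one. Under that assumption a source split into n chunks is at most n × stride + overlap characters long, so the total is an upper bound rather than a measurement.

```python
from __future__ import annotations


def summarize(data: dict) -> dict:
    """Derive rough planning figures from a collection stats payload."""
    total = data["total_chunks"]
    sources = data["unique_sources"]
    overlap = data["chunk_overlap"]
    # Distance in characters between the starts of adjacent chunks
    # (assumes sliding-window chunking).
    stride = data["chunk_size"] - overlap

    return {
        "avg_chunks_per_source": total / sources if sources else 0.0,
        # Fraction of all chunks that belong to each document type.
        "type_share": (
            {name: count / total for name, count in data["document_types"].items()}
            if total
            else {}
        ),
        "stride_chars": stride,
        # Upper bound on indexed source text, assuming every chunk is full-sized.
        "max_source_chars": total * stride + sources * overlap,
    }


# Values copied from the example response on this page.
example = {
    "total_chunks": 127,
    "unique_sources": 34,
    "document_types": {"markdown": 120, "yaml": 7},
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200,
}
summary = summarize(example)
print(summary["stride_chars"], summary["max_source_chars"])  # → 800 108400
```

With the example values, each source averages just under four chunks, which suggests fairly short documents relative to the 1000-character chunk size.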
