Collection Stats
Retrieve collection statistics for the RAG corpus.
Returns summary statistics for your organization's RAG collection: the total number of chunks, the number of unique sources, and the document types represented. Useful for dashboards, health checks, and capacity planning.
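As an illustration of the health-check use case, here is a minimal sketch that inspects the `data` object this endpoint returns. The function name `check_rag_stats`, the `min_chunks` threshold, and the rule that the `document_types` counts should add up to `total_chunks` are assumptions made for this example, not part of the API.

```python
from typing import Any


def check_rag_stats(data: dict[str, Any], min_chunks: int = 1) -> list[str]:
    """Return human-readable problems; an empty list means the collection looks healthy."""
    problems: list[str] = []

    total = data.get("total_chunks", 0)
    if total < min_chunks:
        problems.append(f"only {total} chunks indexed (expected at least {min_chunks})")

    if total > 0 and data.get("unique_sources", 0) == 0:
        problems.append("chunks exist but no sources are reported")

    # Assumption: every chunk carries a doc_type, so per-type counts sum to the total.
    typed = sum(data.get("document_types", {}).values())
    if typed != total:
        problems.append(f"document_types sums to {typed}, but total_chunks is {total}")

    return problems


# Sample payload copied from the Response example on this page.
sample = {
    "total_chunks": 127,
    "unique_sources": 34,
    "document_types": {"markdown": 120, "yaml": 7},
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200,
}

print(check_rag_stats(sample))  # → []
```

In practice the dictionary would come from `stats.data` rather than a literal, and the thresholds would depend on how large you expect your corpus to be.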
Endpoint
GET /api/v1/rag/stats

Examples
Python
import asyncio

from cerebe import AsyncCerebe


async def main() -> None:
    # await is only valid inside a coroutine, so the call is wrapped and run with asyncio
    client = AsyncCerebe(api_key="ck_live_...")
    stats = await client.rag.stats()
    print(stats.data)
    # → {
    #     "total_chunks": 127,
    #     "unique_sources": 34,
    #     "document_types": {"markdown": 120, "yaml": 7},
    #     "embedding_model": "text-embedding-3-small",
    #     "chunk_size": 1000,
    #     "chunk_overlap": 200,
    # }


asyncio.run(main())

TypeScript
import Cerebe from '@cerebe/sdk'

const client = new Cerebe({ apiKey: 'ck_live_...' })
const stats = await client.rag.stats()
console.log(stats.data)

cURL
curl https://api.cerebe.ai/api/v1/rag/stats \
  -H "X-API-Key: ck_live_..."

Response
{
  "message": "Stats retrieved",
  "data": {
    "total_chunks": 127,
    "unique_sources": 34,
    "document_types": {
      "markdown": 120,
      "yaml": 7
    },
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200
  }
}

| Field | Description |
|---|---|
| total_chunks | Total number of embedded chunks in your org |
| unique_sources | Number of distinct source values (documents) |
| document_types | Chunk count broken down by doc_type |
| embedding_model | Model used to produce the vector embeddings |
| chunk_size | Chunk size in characters used when embedding |
| chunk_overlap | Overlap size in characters between adjacent chunks |
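
The fields above can be combined into a few derived figures for dashboards or capacity planning. The sketch below is one possible approach: the `RagStats` class and its helper names are hypothetical, and the character estimate is only a rough upper bound that assumes every chunk is filled to `chunk_size`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagStats:
    """Typed view of the `data` object (hypothetical helper, not part of the SDK)."""

    total_chunks: int
    unique_sources: int
    document_types: dict[str, int]
    embedding_model: str
    chunk_size: int
    chunk_overlap: int

    @property
    def chunks_per_source(self) -> float:
        """Average number of chunks per distinct source (0.0 for an empty collection)."""
        return self.total_chunks / self.unique_sources if self.unique_sources else 0.0

    def type_share(self, doc_type: str) -> float:
        """Fraction of all chunks that have the given doc_type."""
        if not self.total_chunks:
            return 0.0
        return self.document_types.get(doc_type, 0) / self.total_chunks

    @property
    def max_embedded_chars(self) -> int:
        """Upper bound on embedded characters; overlapping text is counted more than once."""
        return self.total_chunks * self.chunk_size


# Payload copied from the Response example above.
payload = {
    "message": "Stats retrieved",
    "data": {
        "total_chunks": 127,
        "unique_sources": 34,
        "document_types": {"markdown": 120, "yaml": 7},
        "embedding_model": "text-embedding-3-small",
        "chunk_size": 1000,
        "chunk_overlap": 200,
    },
}

stats = RagStats(**payload["data"])
print(round(stats.chunks_per_source, 2))       # → 3.74
print(round(stats.type_share("markdown"), 3))  # → 0.945
print(stats.max_embedded_chars)                # → 127000
```

Unpacking with `**payload["data"]` raises a `TypeError` if the response ever carries a field the class does not declare, so a production version might pick out the known keys explicitly instead.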