Storage
Upload, store, and retrieve files with processing pipelines and memory integration.
Cerebe's Storage service provides S3-compatible file storage with built-in processing pipelines, virus scanning, and optional bridging to the Memory Fabric. Upload files via multipart form, base64, or presigned URLs for direct browser uploads.
Core Operations
Upload File (Multipart)
POST /api/v1/storage/upload

Upload a file with optional processing and memory integration.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | The file to upload (multipart) |
| session_id | string | Yes | Session identifier |
| user_id | string | Yes | User identifier |
| purpose | string | No | Upload purpose (default: memory) |
| processing | string | No | Comma-separated processor names (e.g., ocr,summarize) |
| bridge_to_memory | boolean | No | Store extracted content as a memory (default: false) |
| tenant_id | string | No | Tenant identifier (default: cerebe) |
```python
import httpx

# Multipart upload via REST API (not available in SDK — use base64 upload below for SDK)
async with httpx.AsyncClient() as http:
    with open("essay.pdf", "rb") as f:
        response = await http.post(
            "https://api.cerebe.ai/api/v1/storage/upload",
            headers={"X-API-Key": "ck_live_..."},
            files={"file": ("essay.pdf", f, "application/pdf")},
            data={
                "session_id": "session_abc",
                "user_id": "user_123",
                "purpose": "assessment",
                "processing": "ocr,summarize",
                "bridge_to_memory": "true",
            },
        )

result = response.json()
print(f"Upload ID: {result['upload_id']}")
print(f"CDN URL: {result['cdn_url']}")
```

```javascript
// Multipart upload via REST API (not available in SDK — use base64 upload below for SDK)
import fs from 'node:fs'

const form = new FormData()
// Native fetch requires a Blob/File as the form part; fs streams are not accepted
form.append('file', new Blob([fs.readFileSync('essay.pdf')], { type: 'application/pdf' }), 'essay.pdf')
form.append('session_id', 'session_abc')
form.append('user_id', 'user_123')
form.append('purpose', 'assessment')
form.append('processing', 'ocr,summarize')
form.append('bridge_to_memory', 'true')

const response = await fetch('https://api.cerebe.ai/api/v1/storage/upload', {
  method: 'POST',
  headers: { 'X-API-Key': 'ck_live_...' },
  body: form,
})
const result = await response.json()
console.log(`Upload ID: ${result.upload_id}`)
console.log(`CDN URL: ${result.cdn_url}`)
```

```bash
curl -X POST https://api.cerebe.ai/api/v1/storage/upload \
  -H "X-API-Key: ck_live_..." \
  -F "file=@essay.pdf" \
  -F "session_id=session_abc" \
  -F "user_id=user_123" \
  -F "purpose=assessment" \
  -F "processing=ocr,summarize" \
  -F "bridge_to_memory=true"
```

Upload Response

```json
{
  "upload_id": "upl_a1b2c3d4",
  "cdn_url": "https://cdn.cerebe.ai/uploads/upl_a1b2c3d4/essay.pdf",
  "status": "clean",
  "extracted": {"text": "...extracted content..."},
  "memory_id": "mem_x1y2z3",
  "processing_status": "completed"
}
```

Upload Base64 Content
POST /api/v1/storage/upload-base64

Upload base64-encoded content. Useful for programmatic uploads without multipart form handling.
```python
import base64

with open("image.png", "rb") as f:
    content_b64 = base64.b64encode(f.read()).decode()

result = await client.storage.upload(
    content=content_b64,
    filename="image.png",
    content_type="image/png",
    session_id="session_abc",
    user_id="user_123",
)
```

```bash
curl -X POST https://api.cerebe.ai/api/v1/storage/upload-base64 \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "iVBORw0KGgo...",
    "filename": "image.png",
    "content_type": "image/png",
    "session_id": "session_abc",
    "user_id": "user_123"
  }'
```

Retrieve File Content
GET /api/v1/storage/{upload_id}/content

Download the raw file content by upload ID. Returns the file as a streaming response with the original content type.
```python
import httpx

url = await client.storage.get_url(upload_id="upl_a1b2c3d4")

async with httpx.AsyncClient() as http:
    response = await http.get(url)
    response.raise_for_status()
    with open("downloaded.pdf", "wb") as f:
        f.write(response.content)
```

```javascript
import fs from 'node:fs'

const { url } = await client.storage.getUrl('upl_a1b2c3d4')
const response = await fetch(url)
fs.writeFileSync('downloaded.pdf', Buffer.from(await response.arrayBuffer()))
```

```bash
curl https://api.cerebe.ai/api/v1/storage/upl_a1b2c3d4/content \
  -H "X-API-Key: ck_live_..." \
  -o downloaded.pdf
```

Presigned Upload URL
POST /api/v1/storage/presigned-upload

Generate a presigned URL for direct browser-to-storage uploads, bypassing the API server for large files.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_name | string | Yes | Original filename |
| file_type | string | Yes | MIME type (e.g., application/pdf) |
| file_size | integer | Yes | File size in bytes |
| content_hash | string | Yes | SHA-256 hash for deduplication |
| tenant_id | string | Yes | Tenant identifier |
| user_id | string | No | User identifier |
| purpose | string | No | Upload purpose (default: assessment) |
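The required content_hash is a SHA-256 digest of the file. A minimal sketch of computing it with Python's standard hashlib, streaming in chunks so large uploads never load fully into memory (whether the API expects a bare hex digest or a prefixed form like the sha256_... placeholder in the examples is an assumption to verify against your account):

```python
import hashlib

def sha256_hex(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```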
```python
import httpx

presigned = await client.storage.presigned_upload(
    file_name="large-video.mp4",
    file_type="video/mp4",
    file_size=52428800,
    content_hash="sha256_abc123...",
    tenant_id="my_app",
)
print(f"Upload to: {presigned['upload_url']}")
print(f"Expires in: {presigned['expires_in']}s")

# Upload directly to the presigned URL
async with httpx.AsyncClient() as http:
    with open("large-video.mp4", "rb") as f:
        await http.put(presigned["upload_url"], content=f.read())
```

```bash
# Step 1: Get presigned URL
curl -X POST https://api.cerebe.ai/api/v1/storage/presigned-upload \
  -H "X-API-Key: ck_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "file_name": "large-video.mp4",
    "file_type": "video/mp4",
    "file_size": 52428800,
    "content_hash": "sha256_abc123...",
    "tenant_id": "my_app"
  }'

# Step 2: Upload directly to the returned URL
curl -X PUT "<upload_url_from_step_1>" \
  -H "Content-Type: video/mp4" \
  --data-binary @large-video.mp4
```

Presigned Response
```json
{
  "status": "new",
  "upload_id": "upl_x1y2z3",
  "upload_url": "https://storage.cerebe.ai/presigned/...",
  "expires_in": 3600,
  "cdn_url": "https://cdn.cerebe.ai/uploads/upl_x1y2z3/large-video.mp4"
}
```

Additional Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/storage/check-hash | Check if a file already exists by content hash (deduplication) |
| GET | /api/v1/storage/{upload_id}/metadata | Get upload metadata without downloading content |
| GET | /api/v1/storage/{upload_id}/ephemeral-url | Generate a time-limited presigned URL for browser access |
| POST | /api/v1/storage/extract | Apply processing to already-uploaded content |
| POST | /api/v1/storage/analyze-content | Detect file type using Magika |
Memory Bridge
When bridge_to_memory is enabled, the storage service extracts text content from the uploaded file and stores it as a memory in the Memory Fabric. This makes uploaded documents searchable via the Memory Search API.
Error Responses
| Status | Description |
|---|---|
| 400 | Invalid request (bad base64, missing fields) |
| 401 | Missing or invalid API key |
| 403 | Upload not available (failed virus scan) |
| 404 | Upload not found |
| 500 | Storage processing failed |
Next Steps
- Memory API — Search through content bridged from uploads
- Authentication — API key management