API reference¶
KDBL Context Lake (K-Lake) exposes a REST API. The UI and the kdbl-control CLI both use it, so anything you see in the console can be scripted.
Base URL¶
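Everything except the health probes and the MCP endpoints (/mcp and /.well-known/oauth-protected-resource) lives under /api, relative to your deployment's base URL. The examples on this page refer to that base URL as $KDBL_URL.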
Authentication¶
All /api/* endpoints require a bearer token unless noted otherwise. Two token types are accepted:
- Personal access token (PAT): minted from /api/tokens or the UI. Format: kdblpat_<base64>.
- OIDC bearer token: issued by your tenant's identity provider.
Send the token in the Authorization header using the Bearer scheme, that is, Authorization: Bearer <token>.
Unauthenticated endpoints (no token required):
- GET /healthz: liveness
- GET /readyz: readiness
- GET /api/authn/discover?tenant=<slug>: returns OIDC issuer + client ID for a tenant
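As a minimal sketch of the two cases above, the helpers below wrap curl: one attaches the bearer token, the other calls the discovery endpoint without one. The function names kdbl_api and kdbl_discover are hypothetical, and KDBL_URL and KDBL_TOKEN are assumed to hold your base URL and a PAT or OIDC token (the same variable names the end-to-end example on this page uses).

```shell
# Hypothetical helpers, not part of K-Lake. KDBL_URL and KDBL_TOKEN are
# assumed to be set by the caller.

# Authenticated call: kdbl_api METHOD PATH [extra curl args...]
kdbl_api() {
  method=$1
  path=$2
  shift 2
  curl -sS -X "$method" \
    -H "Authorization: Bearer $KDBL_TOKEN" \
    "$@" "$KDBL_URL$path"
}

# Unauthenticated call: no Authorization header is sent.
# -G moves the URL-encoded tenant slug into the query string.
kdbl_discover() {
  curl -sS -G --data-urlencode "tenant=$1" "$KDBL_URL/api/authn/discover"
}
```

For example, kdbl_api GET /api/users/me should return the current caller, and kdbl_discover my-tenant should work before you hold any token.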
Conventions¶
- All identifiers in path segments are URL-encoded. s3://my-bucket becomes s3%3A%2F%2Fmy-bucket.
- Request and response bodies are JSON.
- Timestamps are ISO-8601 UTC.
- Listing endpoints accept ?limit= and ?after= for cursor pagination. The next cursor is returned in the response body.
- Errors return a JSON body { "error": "...", "detail": "..." } with an appropriate 4xx or 5xx status.
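The URL-encoding convention is easy to get wrong by hand. The following is a minimal POSIX-shell sketch that percent-encodes an identifier for use as a path segment. The function name urlencode is hypothetical, and the sketch handles ASCII input only.

```shell
# Hypothetical helper: percent-encode every character outside the
# RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
# ASCII input only; multi-byte characters are not handled correctly.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# Example use in a request path (not executed here):
#   curl -H "Authorization: Bearer $KDBL_TOKEN" \
#     "$KDBL_URL/api/sources/$(urlencode 's3://my-bucket')/stats"
```

The function body runs in a subshell, so its working variables do not leak into the calling script.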
Endpoints¶
Health¶
| Method | Path | Description |
|---|---|---|
| GET | /healthz | Liveness probe. Always returns 200 if the process is up. |
| GET | /readyz | Readiness probe. Returns 200 only if dependencies are reachable. |
Sources¶
| Method | Path | Description |
|---|---|---|
| GET | /api/sources | List sources in the calling tenant |
| POST | /api/sources | Create a source (see body below) |
| GET | /api/sources/:id | Source detail |
| DELETE | /api/sources/:id | Remove a source and its indexed files |
| GET | /api/sources/:id/stats | File count, bytes, last indexed time |
| GET | /api/sources/:id/health | Last successful crawl + last error |
| GET | /api/sources/:id/crawls | Recent crawl runs and outcomes |
| POST | /api/sources/:id/crawl | Trigger a crawl |
| GET | /api/sources/:id/files | Paginated file listing |
| GET | /api/sources/:id/files/:key | Single file detail |
| POST | /api/sources/:id/enabled | Toggle the enabled flag |
| POST | /api/sources/:id/bulk-ingest | Toggle the bulk-ingest fast path |
| POST | /api/sources/:id/meta-caps | Set which optional enrichments to gather |
| POST | /api/sources/:id/backfill-meta | Enqueue enrichment for previously-indexed files |
| GET | /api/sources/:id/meta-coverage | How many files have each enrichment populated |
| POST | /api/sources/:id/subtree | Adjust per-source concurrency hints |
| POST | /api/sources/:id/security-trim | Set the per-file trim policy ({mode, fail_closed?}) |
| POST | /api/sources/:id/multichannel | SMB3 multi-channel (smbfs only) |
| GET | /api/sources/:id/content/search | Full-text search over extracted content (?q=&limit=) |
| POST | /api/sources/:id/extract | Enable/disable content extraction + filters |
| GET | /api/sources/:id/extract/coverage | Extraction status rollup |
| GET | /api/sources/:id/extract/progress | Live extraction progress |
| GET | /api/sources/:id/crawl-progress | Live crawl progress |
| GET/POST | /api/sources/:id/schedules | List / add a recurring crawl or backfill |
| PATCH/DELETE | /api/sources/:id/schedules/:sid | Update / remove a schedule |
| POST | /api/sources/:id/schedules/:sid/run | Run a schedule now |
Create source body:
{
  "source_id": "s3://my-bucket",
  "protocol": "s3",
  "config": { "bucket": "my-bucket", "region": "us-east-1" },
  "secret": { "access_key_id": "AKIA...", "secret_access_key": "..." }
}
Set secret to null to use ambient credentials on the worker.
The config and secret shapes vary per protocol — see Sources for fields per protocol.
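To illustrate the null secret, here is a minimal sketch that builds the same S3 body with "secret": null and posts it. The function names are hypothetical, KDBL_URL and KDBL_TOKEN are assumed to be set, and the body is assembled with printf on the assumption that bucket and region names contain no characters that need JSON escaping.

```shell
# Hypothetical helpers, not part of K-Lake.

# Print a create-source body for an S3 bucket that relies on the worker's
# ambient credentials ("secret": null). No JSON escaping is performed, so
# this assumes plain bucket and region names.
s3_source_body() {
  bucket=$1
  region=$2
  printf '{"source_id":"s3://%s","protocol":"s3","config":{"bucket":"%s","region":"%s"},"secret":null}\n' \
    "$bucket" "$bucket" "$region"
}

# POST the body to /api/sources, reading it from stdin (--data-binary @-).
create_s3_source() {
  s3_source_body "$1" "$2" |
    curl -sS -X POST \
      -H "Authorization: Bearer $KDBL_TOKEN" \
      -H "Content-Type: application/json" \
      --data-binary @- \
      "$KDBL_URL/api/sources"
}
```

Calling create_s3_source my-bucket us-east-1 would then register the bucket without sending any access keys through the API.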
Files & downloads (open the original)¶
K-Lake extracts text and doesn't store the original bytes, so opening a file
means re-fetching it from its source on demand. To let a browser open or verify a
citation, the API mints short-lived HS256-signed links. The download route
resolves a link by re-checking access for the principal it was minted for, then
streaming the bytes via the extractor; the API never holds source
credentials. Every download is audited (tool = files/download).
| Method | Path | Description |
|---|---|---|
| GET | /api/files/download?t=<token> | Stream the original file behind a signed token. No bearer token is needed: the signed token is the credential. &dl=1 forces a download (attachment) instead of an inline preview. 401 if the token is invalid/expired; 404 if the principal can no longer see the file. |
The links are produced for you on the responses that reference files, so you rarely build the URL yourself:
- GET /api/sources/:id/content/search: each hit carries a url (an inline preview link to the original), alongside key/seq/snippet/rank.
- GET /api/sources/:id/files/:key: the file detail carries both preview_url (inline) and download_url (attachment).
All three link fields (url, preview_url, and download_url) are omitted when signed
downloads aren't configured (KDBL_DOWNLOAD_SIGNING_SECRET / KDBL_API_PUBLIC_URL /
KDBL_INTERNAL_FETCH_TOKEN unset), so clients should treat them as optional. Tokens
expire after about 15 minutes (KDBL_DOWNLOAD_TTL_SECS).
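A client consuming these links might look like the sketch below. It treats the link as optional, sends no Authorization header, and relies on curl's --fail to surface the 401 and 404 cases. The function names are hypothetical, and the link is assumed to have been copied out of a search hit or file detail response.

```shell
# Hypothetical helpers, not part of K-Lake.

# Turn an inline preview link into a forced-download link.
# Assumes the link already carries its ?t=<token> query parameter.
as_attachment() {
  printf '%s&dl=1\n' "$1"
}

# Fetch the original behind a signed link.
# $1: a url / preview_url / download_url value (may be empty when signed
#     downloads are not configured); $2: output file.
fetch_original() {
  link=$1
  out=$2
  if [ -z "$link" ]; then
    echo "no signed link in response; signed downloads may not be configured" >&2
    return 2
  fi
  # No bearer token here: the signed token in the URL is the credential.
  # --fail makes curl exit non-zero on 401 (invalid or expired token)
  # and 404 (the principal can no longer see the file).
  curl -sS --fail -o "$out" "$link"
}
```

Because tokens are short-lived, a client that gets a non-zero exit from fetch_original would typically re-run the search or file detail request to obtain a fresh link rather than retry the old one.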
Access control (per source)¶
| Method | Path | Description |
|---|---|---|
| GET | /api/sources/:id/acl | List principals and roles |
| POST | /api/sources/:id/acl | Grant a principal access |
| DELETE | /api/sources/:id/acl/:principal | Revoke a principal |
Status¶
| Method | Path | Description |
|---|---|---|
| GET | /api/status | Tenant queue depth and per-source rollup |
| GET | /api/status?include_cluster=true | Cluster-wide rollup (requires cluster admin) |
Tokens (self-service)¶
| Method | Path | Description |
|---|---|---|
| GET | /api/tokens | List the calling user's tokens (metadata only, no secrets) |
| POST | /api/tokens | Mint a new PAT. The raw token is returned once. |
| DELETE | /api/tokens/:id | Revoke a token |
Users (tenant admin)¶
| Method | Path | Description |
|---|---|---|
| GET | /api/users | List users in the tenant |
| POST | /api/users | Create a user (returns an initial PAT, shown once) |
| GET | /api/users/me | Current caller |
| GET | /api/users/:id | User detail |
| PATCH | /api/users/:id | Update email, display name, groups, admin flag |
| DELETE | /api/users/:id | Remove a user |
Tenants (cluster admin)¶
| Method | Path | Description |
|---|---|---|
| GET | /api/tenants | List tenants |
| POST | /api/tenants | Create a tenant |
| GET | /api/tenants/:slug | Tenant detail |
| PATCH | /api/tenants/:slug | Update tenant configuration |
| DELETE | /api/tenants/:slug | Remove a tenant |
| GET | /api/tenants/:slug/retention | Tenant retention override |
| PATCH | /api/tenants/:slug/retention | Set tenant retention override |
| GET | /api/tenants/:slug/directory | Read non-secret directory-correlation config + has_*_secret flags |
| PATCH | /api/tenants/:slug/directory | Merge graph / ldap / principal_mappings config (secrets stay CLI-only) |
| GET | /api/cluster/retention | Cluster-wide retention default |
MCP¶
The Model Context Protocol surface. See MCP server and connecting MCP clients.
| Method | Path | Description |
|---|---|---|
| GET | /.well-known/oauth-protected-resource | RFC 9728 Protected Resource Metadata (unauthenticated) |
| GET | /.well-known/oauth-protected-resource/mcp | Path-scoped PRM (unauthenticated) |
| POST | /mcp | Streamable-HTTP JSON-RPC endpoint (OAuth 2.1 bearer; aud = the resource URI) |
| GET | /api/mcp/audit | Query the MCP audit trail (tenant-scoped; ?tool=&principal=&limit=) |
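The audit endpoint's query parameters can be passed safely with curl's --data-urlencode, as in this minimal sketch. The function name mcp_audit and the default limit of 50 are assumptions made for illustration, and KDBL_URL and KDBL_TOKEN are assumed to be set.

```shell
# Hypothetical helper, not part of K-Lake.
# Usage: mcp_audit TOOL [LIMIT]
mcp_audit() {
  # -G sends the URL-encoded pairs as a query string instead of a body.
  # The default limit of 50 is an arbitrary choice for this sketch.
  curl -sS -G \
    -H "Authorization: Bearer $KDBL_TOKEN" \
    --data-urlencode "tool=$1" \
    --data-urlencode "limit=${2:-50}" \
    "$KDBL_URL/api/mcp/audit"
}
```

Assuming file downloads are recorded in this same trail under tool = files/download, mcp_audit files/download 20 would list the 20 most recent ones.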
Example: end-to-end¶
Add a source, trigger a crawl, watch progress:
# Add
curl -X POST -H "Authorization: Bearer $KDBL_TOKEN" \
  -H "Content-Type: application/json" \
  "$KDBL_URL/api/sources" \
  -d '{
    "source_id": "s3://docs-bucket",
    "protocol": "s3",
    "config": { "bucket": "docs-bucket", "region": "us-east-1" },
    "secret": { "access_key_id": "AKIA...", "secret_access_key": "..." }
  }'

# Crawl
curl -X POST -H "Authorization: Bearer $KDBL_TOKEN" \
  "$KDBL_URL/api/sources/s3%3A%2F%2Fdocs-bucket/crawl"

# Watch
curl -H "Authorization: Bearer $KDBL_TOKEN" \
  "$KDBL_URL/api/sources/s3%3A%2F%2Fdocs-bucket/stats"
Rate limits and pagination¶
There are no hard request rate limits on the API. Be a good citizen — for bulk reads use the listing endpoints with ?limit= and follow the cursor rather than hammering point lookups.
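Following the cursor can be sketched as a small loop. This page does not name the response fields, so the items and next_cursor names below are assumptions; check a real response and adjust them. The function name list_all is hypothetical, jq is required, and KDBL_URL and KDBL_TOKEN are assumed to be set.

```shell
# Hypothetical helper, not part of K-Lake. Requires jq.
# ASSUMPTION: each page looks like {"items": [...], "next_cursor": "..."}.
# Usage: list_all PATH [LIMIT]   prints one JSON object per line
list_all() {
  path=$1
  limit=${2:-100}
  after=
  while :; do
    # Build the query: always ?limit=, plus ?after= once we hold a cursor.
    set -- --data-urlencode "limit=$limit"
    if [ -n "$after" ]; then
      set -- "$@" --data-urlencode "after=$after"
    fi
    page=$(curl -sS --fail -G \
      -H "Authorization: Bearer $KDBL_TOKEN" \
      "$@" "$KDBL_URL$path") || return 1
    printf '%s\n' "$page" | jq -c '.items[]'
    # A missing or null cursor ends the loop.
    after=$(printf '%s\n' "$page" | jq -r '.next_cursor // empty')
    [ -n "$after" ] || break
  done
}
```

For instance, list_all /api/sources 200 would stream every source in the tenant, one request per page, instead of issuing a point lookup per source.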