API reference

KDBL Context Lake (K-Lake) exposes a REST API. The UI and the kdbl-control CLI both use it, so anything you see in the console can be scripted.

Base URL

Everything except the health probes and the MCP endpoints (/mcp and /.well-known/oauth-protected-resource) lives under /api:

https://<your-kdbl-host>/api/...

Authentication

All /api/* endpoints require a bearer token unless noted otherwise. Two token types are accepted:

Personal access token (PAT) — minted from /api/tokens or the UI. Format: kdblpat_<base64>.
OIDC bearer token — issued by your tenant's identity provider.

Send the token in the Authorization header:

Authorization: Bearer kdblpat_...
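As a minimal sketch of attaching the header from a script, the snippet below builds (but does not send) an authenticated request with Python's standard library. The host kdbl.example.com, the token value, and the helper name build_request are placeholders for illustration, not part of the API.

```python
import urllib.request


def build_request(base_url: str, path: str, token: str,
                  method: str = "GET") -> urllib.request.Request:
    """Build, but do not send, a request carrying the bearer token."""
    if not token:
        raise ValueError("a bearer token is required for this endpoint")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method=method,
        headers={
            # Works for both token types: a PAT or an OIDC bearer token.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


# Placeholder host and token, for illustration only.
req = build_request("https://kdbl.example.com", "/api/sources", "kdblpat_example")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; the token is best read from an environment variable rather than hard-coded.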

Unauthenticated endpoints (no token required):

GET /healthz — liveness
GET /readyz — readiness
GET /api/authn/discover?tenant=<slug> — returns OIDC issuer + client ID for a tenant

Conventions

Identifiers in path segments must be URL-encoded: s3://my-bucket becomes s3%3A%2F%2Fmy-bucket.
Request and response bodies are JSON.
Timestamps are ISO-8601 UTC.
Listing endpoints accept ?limit= and ?after= for cursor pagination. The next cursor is returned in the response body.
Errors return a JSON body { "error": "...", "detail": "..." } with an appropriate 4xx or 5xx status.
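Two of these conventions are easy to get wrong in client code: encoding identifiers that themselves contain slashes, and reading the error body. The following is a rough sketch; the helper names source_path and describe_error are hypothetical, and the fallback message for non-JSON bodies is an assumption about how a client might want to behave.

```python
import json
from urllib.parse import quote


def source_path(source_id: str, *suffix: str) -> str:
    """Build an /api/sources/... path with every segment percent-encoded."""
    segments = ["api", "sources", source_id, *suffix]
    # safe="" makes quote() encode "/" and ":" as well, so an identifier
    # such as s3://my-bucket stays a single path segment.
    return "/" + "/".join(quote(s, safe="") for s in segments)


def describe_error(status: int, body: bytes) -> str:
    """Turn a 4xx/5xx response body into a one-line message."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}  # e.g. an HTML error page from a proxy in front of the API
    if not isinstance(payload, dict):
        payload = {}
    error = payload.get("error", "unknown error")
    detail = payload.get("detail", "")
    return f"HTTP {status}: {error}" + (f" ({detail})" if detail else "")


print(source_path("s3://my-bucket", "crawl"))
```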

Endpoints

Health

Method	Path	Description
`GET`	`/healthz`	Liveness probe. Always returns 200 if the process is up.
`GET`	`/readyz`	Readiness probe. Returns 200 only if dependencies are reachable.
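A deployment script might wait for /readyz before making API calls. The sketch below is one way to do that; the attempt count, delay, and timeout are arbitrary assumptions, and the check is injected as a callable so the retry loop can be exercised without a live server.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the probe URL answers 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers non-2xx responses (HTTPError), refused connections, timeouts.
        return False


def wait_until_ready(check: Callable[[], bool], attempts: int = 30,
                     delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

With a real host this could be called as wait_until_ready(lambda: probe(base_url + "/readyz")), where base_url is your own deployment's address.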

Sources

Method	Path	Description
`GET`	`/api/sources`	List sources in the calling tenant
`POST`	`/api/sources`	Create a source (see body below)
`GET`	`/api/sources/:id`	Source detail
`DELETE`	`/api/sources/:id`	Remove a source and its indexed files
`GET`	`/api/sources/:id/stats`	File count, bytes, last indexed time
`GET`	`/api/sources/:id/health`	Last successful crawl + last error
`GET`	`/api/sources/:id/crawls`	Recent crawl runs and outcomes
`POST`	`/api/sources/:id/crawl`	Trigger a crawl
`GET`	`/api/sources/:id/files`	Paginated file listing
`GET`	`/api/sources/:id/files/:key`	Single file detail
`POST`	`/api/sources/:id/enabled`	Toggle the enabled flag
`POST`	`/api/sources/:id/bulk-ingest`	Toggle the bulk-ingest fast path
`POST`	`/api/sources/:id/meta-caps`	Set which optional enrichments to gather
`POST`	`/api/sources/:id/backfill-meta`	Enqueue enrichment for previously-indexed files
`GET`	`/api/sources/:id/meta-coverage`	How many files have each enrichment populated
`POST`	`/api/sources/:id/subtree`	Adjust per-source concurrency hints
`POST`	`/api/sources/:id/security-trim`	Set the per-file trim policy (`{mode, fail_closed?}`)
`POST`	`/api/sources/:id/multichannel`	SMB3 multi-channel (`smbfs` only)
`GET`	`/api/sources/:id/content/search`	Full-text search over extracted content (`?q=&limit=`)
`POST`	`/api/sources/:id/extract`	Enable/disable content extraction + filters
`GET`	`/api/sources/:id/extract/coverage`	Extraction status rollup
`GET`	`/api/sources/:id/extract/progress`	Live extraction progress
`GET`	`/api/sources/:id/crawl-progress`	Live crawl progress
`GET`/`POST`	`/api/sources/:id/schedules`	List / add a recurring crawl or backfill
`PATCH`/`DELETE`	`/api/sources/:id/schedules/:sid`	Update / remove a schedule
`POST`	`/api/sources/:id/schedules/:sid/run`	Run a schedule now

Create source body:

{
  "source_id": "s3://my-bucket",
  "protocol": "s3",
  "config": { "bucket": "my-bucket", "region": "us-east-1" },
  "secret": { "access_key_id": "AKIA...", "secret_access_key": "..." }
}

Set secret to null to use ambient credentials on the worker.

The config and secret shapes vary per protocol — see Sources for fields per protocol.
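To illustrate the null-secret case, here is a small sketch that serializes a create-source body. The helper name create_source_body is hypothetical; the field names are the ones shown in the body above, and the config values are the same example values.

```python
import json
from typing import Optional


def create_source_body(source_id: str, protocol: str, config: dict,
                       secret: Optional[dict] = None) -> bytes:
    """Serialize the POST /api/sources body.

    Leaving `secret` as None emits JSON null, which asks the worker to use
    its ambient credentials.
    """
    body = {
        "source_id": source_id,
        "protocol": protocol,
        "config": config,
        "secret": secret,
    }
    return json.dumps(body).encode("utf-8")


payload = create_source_body(
    "s3://my-bucket", "s3", {"bucket": "my-bucket", "region": "us-east-1"}
)
print(payload.decode("utf-8"))
```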

Files & downloads (open the original)

K-Lake extracts text and does not store the original bytes, so opening a file means re-fetching it from its source on demand. To let a browser open or verify a citation, the API mints short-lived HS256-signed links. The download route resolves each link by re-checking access for the principal the link was minted for, then streaming the bytes via the extractor; the API never holds source credentials. Every download is audited (tool = files/download).

Method	Path	Description
`GET`	`/api/files/download?t=<token>`	Stream the original file behind a signed token. No bearer token is needed; the signed token is the credential. Append `&dl=1` to force a download (attachment) instead of an inline preview. Returns `401` if the token is invalid or expired, `404` if the principal can no longer see the file.

The links are produced for you on the responses that reference files, so you rarely build the URL yourself:

GET /api/sources/:id/content/search — each hit carries a url (an inline preview link to the original), alongside key/seq/snippet/rank.
GET /api/sources/:id/files/:key — the file detail carries both preview_url (inline) and download_url (attachment).

All three link fields (url, preview_url, download_url) are omitted when signed downloads aren't configured (KDBL_DOWNLOAD_SIGNING_SECRET / KDBL_API_PUBLIC_URL / KDBL_INTERNAL_FETCH_TOKEN unset), so a client should treat them as optional. Tokens expire after about 15 minutes (KDBL_DOWNLOAD_TTL_SECS).
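Because the link fields are optional, client code needs a fallback path. A minimal sketch, assuming a search hit or file detail has already been parsed into a dict: best_link and as_attachment are hypothetical helper names, and the example link uses a made-up host and token.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def as_attachment(link: str) -> str:
    """Add dl=1 to a signed link so it downloads instead of previewing."""
    parts = urlsplit(link)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "dl"]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def best_link(record: dict, download: bool = False) -> Optional[str]:
    """Pick a link from a search hit or file detail.

    Returns None when the response carries no link fields, which is the
    case when signed downloads are not configured.
    """
    if download and record.get("download_url"):
        return record["download_url"]
    link = record.get("preview_url") or record.get("url")
    if link is None:
        return None
    return as_attachment(link) if download else link


# Made-up hit for illustration; real links are minted by the API.
hit = {"key": "a.pdf", "url": "https://kdbl.example.com/api/files/download?t=abc.def"}
print(best_link(hit, download=True))
```

Since the tokens are short-lived, a client that holds on to results for long may prefer to re-run the search or re-fetch the file detail rather than cache these links.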

Access control (per source)

Method	Path	Description
`GET`	`/api/sources/:id/acl`	List principals and roles
`POST`	`/api/sources/:id/acl`	Grant a principal access
`DELETE`	`/api/sources/:id/acl/:principal`	Revoke a principal

Status

Method	Path	Description
`GET`	`/api/status`	Tenant queue depth and per-source rollup
`GET`	`/api/status?include_cluster=true`	Cluster-wide rollup (requires cluster admin)

Tokens (self-service)

Method	Path	Description
`GET`	`/api/tokens`	List the calling user's tokens (metadata only — no secrets)
`POST`	`/api/tokens`	Mint a new PAT. The raw token is returned once.
`DELETE`	`/api/tokens/:id`	Revoke a token

Users (tenant admin)

Method	Path	Description
`GET`	`/api/users`	List users in the tenant
`POST`	`/api/users`	Create a user (returns an initial PAT, shown once)
`GET`	`/api/users/me`	Current caller
`GET`	`/api/users/:id`	User detail
`PATCH`	`/api/users/:id`	Update email, display name, groups, admin flag
`DELETE`	`/api/users/:id`	Remove a user

Tenants (cluster admin)

Method	Path	Description
`GET`	`/api/tenants`	List tenants
`POST`	`/api/tenants`	Create a tenant
`GET`	`/api/tenants/:slug`	Tenant detail
`PATCH`	`/api/tenants/:slug`	Update tenant configuration
`DELETE`	`/api/tenants/:slug`	Remove a tenant
`GET`	`/api/tenants/:slug/retention`	Tenant retention override
`PATCH`	`/api/tenants/:slug/retention`	Set tenant retention override
`GET`	`/api/tenants/:slug/directory`	Read non-secret directory-correlation config + `has_*_secret` flags
`PATCH`	`/api/tenants/:slug/directory`	Merge `graph` / `ldap` / `principal_mappings` config (secrets stay CLI-only)
`GET`	`/api/cluster/retention`	Cluster-wide retention default

MCP

The Model Context Protocol surface. See MCP server and connecting MCP clients.

Method	Path	Description
`GET`	`/.well-known/oauth-protected-resource`	RFC 9728 Protected Resource Metadata (unauthenticated)
`GET`	`/.well-known/oauth-protected-resource/mcp`	Path-scoped PRM (unauthenticated)
`POST`	`/mcp`	Streamable-HTTP JSON-RPC endpoint (OAuth 2.1 bearer; `aud` = the resource URI)
`GET`	`/api/mcp/audit`	Query the MCP audit trail (tenant-scoped; `?tool=&principal=&limit=`)
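For orientation, the sketch below builds an audit query URL and a JSON-RPC 2.0 message of the kind POSTed to /mcp. The helper names are hypothetical, and tools/list is used only as an example of a method defined by the Model Context Protocol; an MCP session normally begins with an initialize exchange, which this sketch leaves out.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def audit_url(base_url: str, tool: Optional[str] = None,
              principal: Optional[str] = None,
              limit: Optional[int] = None) -> str:
    """Build a GET /api/mcp/audit URL, skipping filters that are not set."""
    filters = {"tool": tool, "principal": principal, "limit": limit}
    params = {k: v for k, v in filters.items() if v is not None}
    url = base_url.rstrip("/") + "/api/mcp/audit"
    return url + ("?" + urlencode(params) if params else "")


def jsonrpc_request(method: str, params: Optional[dict] = None,
                    request_id: int = 1) -> bytes:
    """Serialize a JSON-RPC 2.0 request message."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message).encode("utf-8")


# Placeholder host; files/download is the audit tool name used for downloads.
print(audit_url("https://kdbl.example.com", tool="files/download", limit=50))
print(jsonrpc_request("tools/list").decode("utf-8"))
```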

Example: end-to-end

Add a source, trigger a crawl, watch progress:

# Add
curl -X POST -H "Authorization: Bearer $KDBL_TOKEN" \
     -H "Content-Type: application/json" \
     "$KDBL_URL/api/sources" \
     -d '{
       "source_id": "s3://docs-bucket",
       "protocol": "s3",
       "config": { "bucket": "docs-bucket", "region": "us-east-1" },
       "secret": { "access_key_id": "AKIA...", "secret_access_key": "..." }
     }'

# Crawl
curl -X POST -H "Authorization: Bearer $KDBL_TOKEN" \
     "$KDBL_URL/api/sources/s3%3A%2F%2Fdocs-bucket/crawl"

# Watch live crawl progress
curl -H "Authorization: Bearer $KDBL_TOKEN" \
     "$KDBL_URL/api/sources/s3%3A%2F%2Fdocs-bucket/crawl-progress"

Rate limits and pagination

There are no hard request rate limits on the API. Be a good citizen — for bulk reads use the listing endpoints with ?limit= and follow the cursor rather than hammering point lookups.
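Following the cursor can be sketched as a small generator. This reference does not name the response fields that hold the items or the next cursor, so "items" and "next" below are assumptions exposed as parameters; check a real response and adjust them. The fetch_page callable is also hypothetical: it takes a path with query string and returns the parsed JSON body.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def iter_pages(fetch_page: Callable[[str], dict], path: str, limit: int = 100,
               items_key: str = "items",
               cursor_key: str = "next") -> Iterator[dict]:
    """Yield every item from a listing endpoint, following ?after= cursors.

    items_key and cursor_key are assumed field names, not documented ones.
    """
    after: Optional[str] = None
    while True:
        params = {"limit": limit}
        if after is not None:
            params["after"] = after
        page = fetch_page(f"{path}?{urlencode(params)}")
        yield from page.get(items_key, [])
        after = page.get(cursor_key)
        if not after:
            return  # no cursor in the response means this was the last page
```

Because it is a generator, a caller that stops iterating early never requests the remaining pages.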