API reference

KDBL Context Lake (K-Lake) exposes a REST API. The UI and the kdbl-control CLI both use it, so anything you see in the console can be scripted.

Base URL

Everything except the health probes and the MCP endpoints (/mcp and the /.well-known metadata documents) lives under /api:

https://<your-kdbl-host>/api/...

Authentication

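All /api/* endpoints require a bearer token, apart from the unauthenticated endpoints listed below and the signed download route (GET /api/files/download), which authenticates with its own signed token. Two token types are accepted: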

  • Personal access token (PAT) — minted from /api/tokens or the UI. Format: kdblpat_<base64>.
  • OIDC bearer token — issued by your tenant's identity provider.

Send the token in the Authorization header:

Authorization: Bearer kdblpat_...
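
As a minimal sketch of attaching that header from Python's standard library, the helper below builds (but does not send) an authenticated request. The function name kdbl_request, the host kdbl.example.com, and the token value are placeholders chosen for illustration; they are not part of K-Lake.

```python
import urllib.request


def kdbl_request(base_url, path, token, method="GET"):
    """Build an authenticated request object; nothing is sent here."""
    url = base_url.rstrip("/") + path
    req = urllib.request.Request(url, method=method)
    # The same header works for a PAT and for an OIDC bearer token.
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("Accept", "application/json")
    return req
```

Passing the returned object to urllib.request.urlopen() performs the call. Reading the token from an environment variable rather than hard-coding it keeps it out of scripts and shell history.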

Unauthenticated endpoints (no token required):

  • GET /healthz — liveness
  • GET /readyz — readiness
  • GET /api/authn/discover?tenant=<slug> — returns OIDC issuer + client ID for a tenant

Conventions

  • Identifiers in path segments must be URL-encoded. For example, s3://my-bucket becomes s3%3A%2F%2Fmy-bucket.
  • Request and response bodies are JSON.
  • Timestamps are ISO-8601 UTC.
  • Listing endpoints accept ?limit= and ?after= for cursor pagination. The next cursor is returned in the response body.
  • Errors return a JSON body { "error": "...", "detail": "..." } with an appropriate 4xx or 5xx status.
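
Two of these conventions are easy to get wrong in client code: encoding identifiers and reading error bodies. The sketch below shows one way to handle both with the standard library. The helper names (encode_segment, source_path, parse_error) and the KdblError class are hypothetical, and the fallback value "unknown" for a non-JSON error body is an assumption rather than documented behaviour.

```python
import json
from urllib.parse import quote


def encode_segment(identifier):
    """Percent-encode one path segment, including ':' and '/'."""
    return quote(identifier, safe="")


def source_path(source_id, *suffix):
    """Build /api/sources/<encoded id>[/<suffix>...]."""
    return "/".join(["/api/sources", encode_segment(source_id), *suffix])


class KdblError(Exception):
    """Hypothetical wrapper for the documented {error, detail} body."""

    def __init__(self, status, error, detail):
        super().__init__(f"{status} {error}: {detail}")
        self.status = status
        self.error = error
        self.detail = detail


def parse_error(status, body_text):
    """Turn an error response into a KdblError, tolerating non-JSON bodies."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return KdblError(status, payload.get("error", "unknown"), payload.get("detail", ""))
```

Leaving safe empty matters here: the default for quote() keeps "/" unescaped, which would split an identifier such as s3://my-bucket across several path segments.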

Endpoints

Health

Method  Path      Description
GET     /healthz  Liveness probe. Always returns 200 if the process is up.
GET     /readyz   Readiness probe. Returns 200 only if dependencies are reachable.

Sources

Method        Path                                 Description
GET           /api/sources                         List sources in the calling tenant
POST          /api/sources                         Create a source (see body below)
GET           /api/sources/:id                     Source detail
DELETE        /api/sources/:id                     Remove a source and its indexed files
GET           /api/sources/:id/stats               File count, bytes, last indexed time
GET           /api/sources/:id/health              Last successful crawl + last error
GET           /api/sources/:id/crawls              Recent crawl runs and outcomes
POST          /api/sources/:id/crawl               Trigger a crawl
GET           /api/sources/:id/files               Paginated file listing
GET           /api/sources/:id/files/:key          Single file detail
POST          /api/sources/:id/enabled             Toggle the enabled flag
POST          /api/sources/:id/bulk-ingest         Toggle the bulk-ingest fast path
POST          /api/sources/:id/meta-caps           Set which optional enrichments to gather
POST          /api/sources/:id/backfill-meta       Enqueue enrichment for previously-indexed files
GET           /api/sources/:id/meta-coverage       How many files have each enrichment populated
POST          /api/sources/:id/subtree             Adjust per-source concurrency hints
POST          /api/sources/:id/security-trim       Set the per-file trim policy ({mode, fail_closed?})
POST          /api/sources/:id/multichannel        SMB3 multi-channel (smbfs only)
GET           /api/sources/:id/content/search      Full-text search over extracted content (?q=&limit=)
POST          /api/sources/:id/extract             Enable/disable content extraction + filters
GET           /api/sources/:id/extract/coverage    Extraction status rollup
GET           /api/sources/:id/extract/progress    Live extraction progress
GET           /api/sources/:id/crawl-progress      Live crawl progress
GET/POST      /api/sources/:id/schedules           List / add a recurring crawl or backfill
PATCH/DELETE  /api/sources/:id/schedules/:sid      Update / remove a schedule
POST          /api/sources/:id/schedules/:sid/run  Run a schedule now

Create source body:

{
  "source_id": "s3://my-bucket",
  "protocol": "s3",
  "config": { "bucket": "my-bucket", "region": "us-east-1" },
  "secret": { "access_key_id": "AKIA...", "secret_access_key": "..." }
}

Set secret to null to use ambient credentials on the worker.

The config and secret shapes vary per protocol — see Sources for fields per protocol.
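
A small sketch of assembling this body in Python follows. It uses only the four fields shown above; the function name create_source_body is hypothetical. Python's None serializes to JSON null, which is how the ambient-credentials case would be expressed.

```python
import json


def create_source_body(source_id, protocol, config, secret=None):
    """Serialize a create-source request body.

    Leaving secret as None emits "secret": null, which asks the worker
    to use its ambient credentials.
    """
    body = {
        "source_id": source_id,
        "protocol": protocol,
        "config": config,
        "secret": secret,
    }
    return json.dumps(body)
```

For example, create_source_body("s3://my-bucket", "s3", {"bucket": "my-bucket", "region": "us-east-1"}) produces a body with an explicit null secret instead of dropping the key.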

Files & downloads (open the original)

K-Lake extracts text and doesn't store the original bytes, so opening a file means re-fetching it from its source on demand. To let a browser open or verify a citation, the API mints short-lived HS256-signed links. The download route resolves each link by re-checking access for the principal the link was minted for, then streaming the bytes via the extractor; the API itself never holds source credentials. Every download is audited (tool = files/download).

Method  Path                           Description
GET     /api/files/download?t=<token>  Stream the original file behind a signed token.

No bearer token is required: the signed token is the credential. Adding &dl=1 forces a download (attachment) instead of an inline preview. The route returns 401 if the token is invalid or expired, and 404 if the principal can no longer see the file.

The links are produced for you in the responses that reference files, so you rarely build the URL yourself:

  • GET /api/sources/:id/content/search — each hit carries a url (an inline preview link to the original), alongside key/seq/snippet/rank.
  • GET /api/sources/:id/files/:key — the file detail carries both preview_url (inline) and download_url (attachment).

All three link fields (url, preview_url, and download_url) are omitted when signed downloads aren't configured (KDBL_DOWNLOAD_SIGNING_SECRET / KDBL_API_PUBLIC_URL / KDBL_INTERNAL_FETCH_TOKEN unset), so clients should treat them as optional. Tokens expire after about 15 minutes (KDBL_DOWNLOAD_TTL_SECS).
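
Because the link fields may be absent, client code should look them up defensively. The sketch below picks whichever link is present and, separately, adds dl=1 to an existing link to request an attachment. The helper names pick_link and force_attachment and the preference order are illustrative assumptions; only the field names and the dl=1 parameter come from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Order of preference is an illustrative choice, not something the API mandates.
LINK_FIELDS = ("download_url", "preview_url", "url")


def pick_link(record):
    """Return the first signed link present on a search hit or file detail, else None."""
    for field in LINK_FIELDS:
        link = record.get(field)
        if link:
            return link
    return None


def force_attachment(link):
    """Return the same signed link with dl=1 set exactly once."""
    parts = urlsplit(link)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Since links expire after a short time, it is safer to fetch a fresh search hit or file detail right before opening a file than to cache a link.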

Access control (per source)

Method  Path                             Description
GET     /api/sources/:id/acl             List principals and roles
POST    /api/sources/:id/acl             Grant a principal access
DELETE  /api/sources/:id/acl/:principal  Revoke a principal

Status

Method  Path                              Description
GET     /api/status                       Tenant queue depth and per-source rollup
GET     /api/status?include_cluster=true  Cluster-wide rollup (requires cluster admin)

Tokens (self-service)

Method  Path             Description
GET     /api/tokens      List the calling user's tokens (metadata only, no secrets)
POST    /api/tokens      Mint a new PAT. The raw token is returned once.
DELETE  /api/tokens/:id  Revoke a token

Users (tenant admin)

Method  Path            Description
GET     /api/users      List users in the tenant
POST    /api/users      Create a user (returns an initial PAT, shown once)
GET     /api/users/me   Current caller
GET     /api/users/:id  User detail
PATCH   /api/users/:id  Update email, display name, groups, admin flag
DELETE  /api/users/:id  Remove a user

Tenants (cluster admin)

Method  Path                          Description
GET     /api/tenants                  List tenants
POST    /api/tenants                  Create a tenant
GET     /api/tenants/:slug            Tenant detail
PATCH   /api/tenants/:slug            Update tenant configuration
DELETE  /api/tenants/:slug            Remove a tenant
GET     /api/tenants/:slug/retention  Tenant retention override
PATCH   /api/tenants/:slug/retention  Set tenant retention override
GET     /api/tenants/:slug/directory  Read non-secret directory-correlation config + has_*_secret flags
PATCH   /api/tenants/:slug/directory  Merge graph / ldap / principal_mappings config (secrets stay CLI-only)
GET     /api/cluster/retention        Cluster-wide retention default

MCP

The Model Context Protocol surface. See MCP server and connecting MCP clients.

Method  Path                                       Description
GET     /.well-known/oauth-protected-resource      RFC 9728 Protected Resource Metadata (unauthenticated)
GET     /.well-known/oauth-protected-resource/mcp  Path-scoped PRM (unauthenticated)
POST    /mcp                                       Streamable-HTTP JSON-RPC endpoint (OAuth 2.1 bearer; aud = the resource URI)
GET     /api/mcp/audit                             Query the MCP audit trail (tenant-scoped; ?tool=&principal=&limit=)
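
The audit filters are ordinary query parameters, so values containing reserved characters need encoding. The sketch below builds an audit URL from optional filters. The function name audit_url and the host are placeholders; the parameter names tool, principal, and limit come from the table above.

```python
from urllib.parse import urlencode


def audit_url(base_url, tool=None, principal=None, limit=None):
    """Build a GET /api/mcp/audit URL, leaving out filters that are not set."""
    params = {"tool": tool, "principal": principal, "limit": limit}
    query = urlencode({key: value for key, value in params.items() if value is not None})
    url = base_url.rstrip("/") + "/api/mcp/audit"
    return f"{url}?{query}" if query else url
```

Filtering on the download tool, for instance, requires the slash in files/download to be escaped as %2F, which urlencode does by default.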

Example: end-to-end

Add a source, trigger a crawl, watch progress:

# Add
curl -X POST -H "Authorization: Bearer $KDBL_TOKEN" \
     -H "Content-Type: application/json" \
     "$KDBL_URL/api/sources" \
     -d '{
       "source_id": "s3://docs-bucket",
       "protocol": "s3",
       "config": { "bucket": "docs-bucket", "region": "us-east-1" },
       "secret": { "access_key_id": "AKIA...", "secret_access_key": "..." }
     }'

# Crawl
curl -X POST -H "Authorization: Bearer $KDBL_TOKEN" \
     "$KDBL_URL/api/sources/s3%3A%2F%2Fdocs-bucket/crawl"

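# Watch: /stats reports file count, bytes, and last indexed time (/crawl-progress reports live crawl progress)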
curl -H "Authorization: Bearer $KDBL_TOKEN" \
     "$KDBL_URL/api/sources/s3%3A%2F%2Fdocs-bucket/stats"

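To script the "watch" step rather than re-running curl by hand, a polling loop is one option. The sketch below is deliberately generic because this page does not describe the shape of the progress or stats responses: the caller supplies both the fetch function and the completion test. The name wait_until and the default interval and attempt count are illustrative assumptions.

```python
import time


def wait_until(fetch, is_done, interval_secs=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until is_done(result) is true, then return that result.

    Raises TimeoutError once max_attempts polls have been made.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if is_done(result):
            return result
        # No need to sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_secs)
    raise TimeoutError(f"condition not met after {max_attempts} polls")
```

Here fetch would wrap a GET to the stats or crawl-progress endpoint and return the decoded JSON, and is_done would inspect whatever field the response actually exposes. Injecting sleep makes the loop easy to test without waiting.
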
Rate limits and pagination

There are no hard request rate limits on the API. Be a good citizen — for bulk reads use the listing endpoints with ?limit= and follow the cursor rather than hammering point lookups.
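
Following the cursor can be sketched as a generator. This page says the next cursor is returned in the response body but does not name the fields, so the keys "items" and "next" below are assumptions exposed as parameters; fetch_page is a caller-supplied function that performs the request with the given query parameters and returns the decoded JSON.

```python
def iter_items(fetch_page, limit=100, items_key="items", cursor_key="next"):
    """Yield every item from a cursor-paginated listing endpoint.

    items_key and cursor_key are assumed field names; adjust them to
    match the real response body.
    """
    after = None
    while True:
        params = {"limit": limit}
        if after is not None:
            params["after"] = after
        page = fetch_page(params)
        yield from page.get(items_key, [])
        after = page.get(cursor_key)
        if not after:
            # A missing or empty cursor means there are no more pages.
            return
```

Because it is a generator, callers can stop early without fetching the remaining pages.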