Sizing guide

KDBL Context Lake (K-Lake) scales horizontally. Workers are stateless: when you need more crawl throughput, add worker replicas. The database caps how fast metadata can be persisted, so size it for your peak ingest rate, not your steady state.

This page gives starting points. Tune from there based on the metrics in Telemetry.

Defaults

The shipped manifests set conservative requests with headroom in the limits.

Worker

	Request	Limit
CPU	1 core	4 cores
Memory	512 Mi	2 Gi

Each worker handles many in-flight crawl tasks concurrently. Increase replicas to increase throughput; the workers coordinate through the work queue and will not duplicate work.

API

	Request	Limit
CPU	100 m	1 core
Memory	128 Mi	512 Mi

The API is a thin layer: it reads and writes the database on behalf of the UI and CLI. Two replicas are a sensible default for availability; add more only if heavy programmatic API traffic slows responses.

UI

	Request	Limit
CPU	50 m	250 m
Memory	32 Mi	128 Mi

The UI is static assets served by a lightweight web server. One or two replicas is enough.
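As a rough planning aid, the worker, API, and UI defaults above can be summed for a given replica mix to see what the cluster must be able to schedule. The sketch below is illustrative only: the replica counts are hypothetical, and the function and key names are not part of K-Lake.

```python
# Default requests and limits from the tables above,
# expressed in millicores (m) and mebibytes (Mi).
DEFAULTS = {
    "worker": {"cpu_request_m": 1000, "cpu_limit_m": 4000,
               "mem_request_mi": 512, "mem_limit_mi": 2048},
    "api":    {"cpu_request_m": 100, "cpu_limit_m": 1000,
               "mem_request_mi": 128, "mem_limit_mi": 512},
    "ui":     {"cpu_request_m": 50, "cpu_limit_m": 250,
               "mem_request_mi": 32, "mem_limit_mi": 128},
}


def cluster_footprint(replicas):
    """Sum requests and limits across components for a replica mix."""
    totals = {key: 0 for key in DEFAULTS["worker"]}
    for component, count in replicas.items():
        for key, value in DEFAULTS[component].items():
            totals[key] += value * count
    return totals


# Hypothetical mix: 4 workers, 2 API replicas, 1 UI replica.
footprint = cluster_footprint({"worker": 4, "api": 2, "ui": 1})
print(footprint)
```

The request totals are what the scheduler needs to place the pods; the limit totals are the worst case if every pod bursts at once.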

Database

K-Lake persists everything in a managed database. Recommended starting points:

CPU: 2 cores, scale up under heavy ingest
Memory: at least 8 Gi for ingests above a few million files
Disk: provision for the eventual file count; budget tens of bytes per file record, plus indexes
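The disk guidance can be turned into a back-of-the-envelope estimate. This is a minimal sketch under stated assumptions: the per-record size, index overhead, and growth margin are placeholder values, not measured K-Lake figures, so replace them with numbers observed on your own database.

```python
def estimate_disk_gib(file_count, bytes_per_record=64,
                      index_overhead=1.0, growth_margin=0.3):
    """Rough database disk estimate in GiB.

    bytes_per_record: assumed average record size ("tens of bytes").
    index_overhead:   assumed index size as a fraction of record data.
    growth_margin:    assumed extra headroom for growth.
    All three defaults are placeholders, not K-Lake measurements.
    """
    record_bytes = file_count * bytes_per_record
    with_indexes = record_bytes * (1 + index_overhead)
    total_bytes = with_indexes * (1 + growth_margin)
    return total_bytes / 2**30


# Hypothetical target: 100 million files.
print(f"{estimate_disk_gib(100_000_000):.1f} GiB")
```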

A connection pooler is recommended in front of the database if you run more than a handful of workers.

When to scale up

Symptom	Action
Queue depth (`kdbl_queue_depth{state="pending"}`) sustained high	Add worker replicas
Worker CPU at limit, queue still growing	Bump worker CPU limit, then add replicas
Database CPU pinned, queue stable	Scale the database vertically
API responses slow under heavy CLI/API use	Add API replicas
OOMKills on workers during very wide directory listings	Bump worker memory limit
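The table above can be read as a small decision procedure. The following sketch encodes one possible ordering of the checks; the `Symptoms` fields are hypothetical flags that you would derive from your own alerts and dashboards (for example, from `kdbl_queue_depth{state="pending"}`), not values K-Lake exposes directly.

```python
from dataclasses import dataclass


@dataclass
class Symptoms:
    # Hypothetical flags; how each is computed is up to your monitoring.
    queue_pending_sustained_high: bool = False
    queue_growing: bool = False
    worker_cpu_at_limit: bool = False
    database_cpu_pinned: bool = False
    api_slow: bool = False
    worker_oomkills: bool = False


def recommend(s):
    """Map observed symptoms to the actions listed in the table."""
    actions = []
    if s.worker_oomkills:
        actions.append("Bump worker memory limit")
    # The throughput rows are checked from most to least specific.
    if s.worker_cpu_at_limit and s.queue_growing:
        actions.append("Bump worker CPU limit, then add replicas")
    elif s.database_cpu_pinned and not s.queue_growing:
        actions.append("Scale the database vertically")
    elif s.queue_pending_sustained_high:
        actions.append("Add worker replicas")
    if s.api_slow:
        actions.append("Add API replicas")
    return actions


print(recommend(Symptoms(queue_pending_sustained_high=True)))
```

Checking the more specific rows first avoids recommending extra replicas when the workers are CPU-bound or the database is already saturated.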

Source-level tuning

Two source-level toggles affect throughput. Both are exposed via the UI, the CLI (`source bulk-ingest`, `source meta-caps`), and the API.

Bulk ingest: on by default. Optimizes the write path for first-time crawls and large catch-up runs. Leave it on unless your workload is dominated by many small incremental updates and you have benchmarked the alternative.
Metadata caps: control which optional enrichments (S3 tags, NTFS / NFSv4 ACLs, xattrs) are gathered. Each additional enrichment adds crawl time and storage. Start narrow and widen as needed.

Estimating headroom

A worker pod can sustain steady-state ingest from a single source at network-bound rates for most object stores and NAS protocols. Real throughput depends heavily on:

Object size distribution: many small files are slower to ingest than fewer large ones
Source latency: same-region S3 is much faster than a remote SMB share
Whether metadata enrichment is enabled

Plan for capacity using a representative source: run a crawl on it, watch the metrics, and extrapolate.
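One way to extrapolate from a sample crawl is sketched below. It assumes throughput scales roughly linearly with worker count, which holds only until the database or the source becomes the bottleneck. The `efficiency` factor and all sample numbers are hypothetical.

```python
import math


def workers_needed(sample_files, sample_seconds, sample_workers,
                   target_files, target_hours, efficiency=0.8):
    """Estimate worker replicas needed to crawl target_files in target_hours.

    efficiency is an assumed derating factor for imperfect scaling;
    it is a placeholder, not a measured K-Lake value.
    """
    per_worker_rate = sample_files / sample_seconds / sample_workers
    required_rate = target_files / (target_hours * 3600)
    return math.ceil(required_rate / (per_worker_rate * efficiency))


# Hypothetical sample: one worker crawled 1.8 million files in one hour.
# Hypothetical target: 36 million files within a 4-hour window.
print(workers_needed(1_800_000, 3600, 1, 36_000_000, 4))
```

Treat the result as a starting replica count, then confirm against queue depth and database CPU during a real run.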