Chat-UI QuickStart: talk to KDBL Context Lake (K-Lake) over MCP
K-Lake ships an optional local chat surface: a self-hosted chat interface wired to a self-hosted LLM, with K-Lake's knowledge base exposed to it as tools over the Model Context Protocol (MCP). It is self-seeding — deploying it yields a fully-configured chat (models present, system prompt set, K-Lake tools bound, grounded answers that cite signed original-file links) with no manual UI clicks. Answers are grounded in your content and carry the same per-user security and auditing as the rest of K-Lake.
To connect a remote cloud AI (such as a Claude or ChatGPT connector) to an on-prem K-Lake instead, see Connect your AI.
What gets deployed
The optional chat bundle stands up two things in your cluster:
- A local chat interface — the chat surface the user types into. It is configured login-free for evaluation.
- An MCP bridge that re-exposes K-Lake's MCP tools (search_content, get_file_text, and the rest) so the chat interface can call them.
The model backend is a self-hosted LLM running in your cluster (CPU or GPU).
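To make the bridge's role concrete, here is a minimal sketch of the kind of message a chat interface sends when it invokes a tool over MCP. MCP frames tool invocations as JSON-RPC 2.0 requests with the method `tools/call`; the tool names come from this page, but the argument names (`query`, `file_id`) and their values are hypothetical placeholders, not K-Lake's documented schema.

```python
import json
from itertools import count

# JSON-RPC request ids only need to be unique within a session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumption: "query" and "file_id" are illustrative argument names only.
search = build_tool_call("search_content", {"query": "retention policy for audit logs"})
fetch = build_tool_call("get_file_text", {"file_id": "example-file-id"})

print(json.dumps(search, indent=2))
```

The real argument schema for each tool is advertised by the server itself (MCP clients discover it with a `tools/list` request), so a client would normally read it from there rather than hard-code it.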
Prerequisites
- A self-hosted LLM running in your cluster. The served model name must match the model id the chat bundle expects.
- A tenant PAT for the data the chat should retrieve. The PAT's permission scope is exactly what the chat can see.
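The model-name requirement can be checked before deploying. The sketch below assumes the self-hosted LLM exposes an OpenAI-compatible model listing (a JSON object with a `data` array of entries that each carry an `id`); that is an assumption about your LLM server, not something this page states, and the model ids shown are hypothetical.

```python
def served_model_ids(models_response: dict) -> set[str]:
    """Collect the model ids from an OpenAI-style model listing."""
    return {entry["id"] for entry in models_response.get("data", [])}


def missing_model_ids(expected: list[str], models_response: dict) -> list[str]:
    """Return the expected ids that the LLM server does not serve."""
    served = served_model_ids(models_response)
    return [model_id for model_id in expected if model_id not in served]


# Hypothetical listing, shaped like an OpenAI-compatible model-list response.
example_response = {
    "object": "list",
    "data": [{"id": "local-llm-a", "object": "model"}],
}

# Hypothetical ids the chat bundle is configured to expect.
print(missing_model_ids(["local-llm-a", "local-llm-b"], example_response))
```

An empty result means every expected model id is being served under the name the chat bundle looks for.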
Deploy
- Create one out-of-band secret holding the tenant PAT the chat uses to authenticate to K-Lake's MCP endpoint (kept out of source control).
- Apply the optional chat bundle included in your deployment package.
- Reach the chat interface: either point a hostname at any node IP via the bundled Ingress, or port-forward the chat service.
- Open it, pick a model, and ask a grounded question. Each fact in the answer should carry a clickable signed original-file link; on click, the file is re-fetched, permissions are re-checked, and the access is audited.
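The first three steps can be sketched as a small dry-run script that prints the commands without running them. It assumes a Kubernetes cluster managed with `kubectl`, which this page implies (Ingress, port-forward) but does not state. Every name in it (namespace, secret name and key, manifest file, service name, ports) is a hypothetical placeholder; take the real values from your deployment package.

```python
import shlex

# All of the following values are hypothetical placeholders.
NAMESPACE = "klake"
SECRET_NAME = "klake-chat-pat"
SECRET_KEY = "token"
BUNDLE_MANIFEST = "chat-bundle.yaml"
CHAT_SERVICE = "svc/klake-chat"
LOCAL_PORT, SERVICE_PORT = 8080, 80


def deploy_commands(pat_file: str) -> list[list[str]]:
    """Return the kubectl invocations for the deploy steps, in order."""
    return [
        # 1. Secret created from a file so the PAT never appears in source
        #    control or on the command line.
        ["kubectl", "-n", NAMESPACE, "create", "secret", "generic",
         SECRET_NAME, f"--from-file={SECRET_KEY}={pat_file}"],
        # 2. Apply the optional chat bundle.
        ["kubectl", "-n", NAMESPACE, "apply", "-f", BUNDLE_MANIFEST],
        # 3. Port-forward the chat service to localhost.
        ["kubectl", "-n", NAMESPACE, "port-forward", CHAT_SERVICE,
         f"{LOCAL_PORT}:{SERVICE_PORT}"],
    ]


# Print shell-quoted commands for review; nothing is executed here.
for argv in deploy_commands("/secure/path/pat.txt"):
    print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run anywhere; once the names match your package, each list can be passed to `subprocess.run(argv, check=True)`.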
How the seeding works
The chat bundle is self-configuring: on deploy, a one-shot bootstrap step seeds the model definitions, the retrieval system prompt, and the binding to K-Lake's MCP tools, then exits. A redeploy re-reconciles this configuration rather than duplicating it, so re-applying the bundle is safe. The system prompt is the source of truth for retrieval behaviour — edit it and re-apply to re-sync.
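The "safe to re-apply" property can be illustrated with a minimal sketch of an idempotent upsert keyed by a stable id. This is not K-Lake's bootstrap code: the keys, fields, and prompt text are hypothetical, and the sketch only upserts (it does not model whether the real bootstrap removes entries that are no longer desired).

```python
def reconcile(current: dict, desired: dict) -> dict:
    """Return the state after upserting `desired` entries over `current`.

    Entries are keyed by a stable id, so re-applying the same desired
    configuration overwrites in place instead of appending duplicates.
    """
    merged = dict(current)
    merged.update(desired)
    return merged


# Hypothetical seed configuration: one model, one prompt, one tool binding.
desired = {
    "model:local-llm": {"kind": "model", "backend": "self-hosted"},
    "prompt:retrieval": {"kind": "system_prompt",
                         "text": "Cite every fact with its source link."},
    "tools:klake-mcp": {"kind": "tool_binding",
                        "tools": ["search_content", "get_file_text"]},
}

first = reconcile({}, desired)       # initial deploy
second = reconcile(first, desired)   # redeploy with no changes
assert first == second               # nothing duplicated

# Editing the system prompt and re-applying re-syncs that one entry.
edited = {**desired, "prompt:retrieval": {"kind": "system_prompt",
                                          "text": "Answer only from retrieved content."}}
third = reconcile(second, edited)
print(len(third), third["prompt:retrieval"]["text"])
```

Keying by a stable id is what makes the operation idempotent: the second apply finds the same three keys and leaves the count unchanged, while an edited prompt replaces the stored one.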
Caveats
- Shared GPU. If more than one LLM is configured but they share a single GPU, only one is loaded at a time; the others simply contribute no models to the picker until scaled up.
- Login-free mode. The chat interface is configured without authentication for ease of evaluation. Do not expose it to untrusted networks in this mode.