Agent Server Container
The agent-server-container/ directory contains the pluggable server that bridges the Kubernetes operator with an AI backend running inside the agent pod. Each subdirectory implements the server for a specific AI engine. The operator is engine-agnostic: as long as the container implements the API contract below, it works with any engine.
agent-server-container/
  github-copilot/      ← GitHub Copilot SDK engine (default, shipped)
    server.py          ← FastAPI server (SDK-backed)
    entrypoint.sh      ← Container entrypoint (auth setup, skill staging)
    Containerfile      ← Container image definition
  claude-code/         ← (example) Claude Code engine: add your own!
API Contract
Every agent server image must expose at minimum these HTTP endpoints:
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Liveness probe; returns {"status":"ok"} |
| /asyncchat | POST | Enqueue a message (with optional session_config); returns {"queue_id": "..."} |
| /cancel/{queue_id} | DELETE | Cancel/disconnect the in-flight request for a given queue item |
The server must POST streaming chunks and final responses back to $WEBHOOK_URL (injected by the operator). Optional endpoints for richer functionality:
| Endpoint | Method | Description |
|---|---|---|
| /chat | POST | Synchronous chat; blocks until the agent responds |
| /models | GET | List available models (enables model picker in the UI) |
| /config/instructions | GET/PUT | Manage instructions file on the PVC |
| /config/skills | GET | List all skills on the PVC |
| /config/skills/{name} | GET/PUT/DELETE | Manage individual skills |
| /config/agents | GET/PUT | Manage custom agent definitions on the PVC |
| /tasks/monitor | POST | Register a background monitoring task (see Background Task API) |
| /tasks | GET | List all background tasks |
| /tasks/{id} | GET | Get details of a specific background task |
| /tasks/{id} | DELETE | Cancel and remove a background task |
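When adding a new engine, it can help to check the routes a server exposes against the three required endpoints. The sketch below is illustrative only: REQUIRED_ROUTES and missing_required are hypothetical names, not part of the operator or of any shipped image.

```python
# Required (method, path) pairs taken from the API contract table.
REQUIRED_ROUTES = {
    ("GET", "/health"),
    ("POST", "/asyncchat"),
    ("DELETE", "/cancel/{queue_id}"),
}


def missing_required(routes):
    """Return the required (method, path) pairs absent from `routes`, sorted."""
    return sorted(REQUIRED_ROUTES - set(routes))


# Example: a half-finished server that only implements the health probe.
gaps = missing_required([("GET", "/health"), ("GET", "/models")])
```

With FastAPI, the input pairs could be collected from the route objects in app.routes (each API route carries a path and a set of methods), though that wiring is left out here.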
GitHub Copilot SDK (Default Engine)
The GitHub Copilot implementation uses the GitHub Copilot Python SDK (CopilotClient) to communicate with the Copilot CLI running in server mode via JSON-RPC. This replaces the previous subprocess-per-request approach with a persistent connection, proper session management, and typed streaming events.
sequenceDiagram
participant Op as Operator
participant Shim as Agent Server (SDK)
participant CLI as Copilot CLI (server mode)
Note over Shim,CLI: Persistent JSON-RPC connection
Op->>Shim: POST /asyncchat + session_config
Shim-->>Op: { queue_id }
Shim->>CLI: create_session(opts) / resume_session()
Shim->>CLI: session.send(message)
loop SDK streaming events
CLI-->>Shim: assistant.message_delta
Shim-->>Op: POST /chunk (thinking/tool_call/response)
CLI-->>Shim: tool.execution_start
Shim-->>Op: POST /chunk (tool_call)
CLI-->>Shim: tool.execution_complete
Shim-->>Op: POST /chunk (tool_result)
end
CLI-->>Shim: session.idle
Shim->>CLI: session.disconnect()
Shim-->>Op: POST /response (final answer)
Note over Op,Shim: Cancellation
Op->>Shim: DELETE /cancel/{queue_id}
Shim->>CLI: session.disconnect()
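The event flow in the diagram amounts to a small routing table from SDK event types to webhook chunk types. The sketch below is one possible reading of it, not the shipped implementation: route_event and EVENT_TO_CHUNK are hypothetical names, and mapping assistant.message_delta to the response chunk type is an assumption (the diagram lists thinking, tool_call, and response as possible types at that step).

```python
# Event names come from the sequence diagram; the mapping itself is an assumption.
EVENT_TO_CHUNK = {
    "assistant.message_delta": "response",
    "tool.execution_start": "tool_call",
    "tool.execution_complete": "tool_result",
}


def route_event(event_type):
    """Decide what the shim does with one SDK event.

    Returns ("chunk", chunk_type) for events forwarded to /chunk,
    ("final", None) when the session is idle and the final answer
    should be posted, and ("ignore", None) for anything else.
    """
    if event_type == "session.idle":
        return ("final", None)
    chunk_type = EVENT_TO_CHUNK.get(event_type)
    if chunk_type is None:
        return ("ignore", None)
    return ("chunk", chunk_type)
```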
Key SDK features used:
- CopilotClient(SubprocessConfig): singleton managing the CLI in server mode
- PermissionHandler.approve_all: auto-approve tool executions
- asyncio.Semaphore(3): bounded concurrency for parallel sessions
- client.list_models(): query available models for the settings UI
- session.on(callback): typed event streaming for real-time chunks
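The bounded-concurrency item can be illustrated without the SDK. The following is a minimal sketch using only asyncio: run_bounded and fake_session are hypothetical stand-ins for the server's session handling, and the peak counter exists only to make the limit observable.

```python
import asyncio

MAX_PARALLEL = 3  # mirrors asyncio.Semaphore(3) from the list above


async def run_bounded(jobs, limit=MAX_PARALLEL):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def guarded(job):
        nonlocal running, peak
        async with sem:  # waits here while `limit` jobs are already running
            running += 1
            peak = max(peak, running)
            try:
                return await job()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_session(i):
    """Stand-in for one agent session; sleeps instead of calling the SDK."""
    await asyncio.sleep(0.01)
    return i * 2


results, peak = asyncio.run(
    run_bounded([lambda i=i: fake_session(i) for i in range(8)])
)
```

Eight jobs are submitted at once, but the semaphore keeps the number of concurrently running sessions at three; the remaining jobs wait for a slot.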
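Webhook Payloads
Every agent server must POST these payloads to the operator webhook. $WEBHOOK_URL (injected by the operator) points at the /response endpoint; the chunk and notification endpoints are reached by replacing /response in that URL.
Chunk (streamed during execution; POST to $WEBHOOK_URL with /response replaced by /chunk):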
{
  "queue_id": "<uuid>",
  "seq": 1,
  "type": "thinking|tool_call|tool_result|response|info|error",
  "content": "...",
  "session_id": "<copilot-session-id>",
  "send_ref": "...",
  "namespace": "...",
  "agent_ref": "..."
}
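A small helper can keep chunk payloads consistent across a request. The sketch below is an assumption-laden illustration: ChunkBuilder is a hypothetical name, and it assumes seq starts at 1 and increases by one per chunk within a queue item, which the example payload suggests but does not state.

```python
import itertools

# Allowed values of the "type" field, as listed in the chunk payload.
CHUNK_TYPES = {"thinking", "tool_call", "tool_result", "response", "info", "error"}


class ChunkBuilder:
    """Builds chunk payloads for one queue item with an increasing seq."""

    def __init__(self, queue_id, session_id, send_ref, namespace, agent_ref):
        self._base = {
            "queue_id": queue_id,
            "session_id": session_id,
            "send_ref": send_ref,
            "namespace": namespace,
            "agent_ref": agent_ref,
        }
        self._seq = itertools.count(1)  # assumption: numbering starts at 1

    def build(self, chunk_type, content):
        if chunk_type not in CHUNK_TYPES:
            raise ValueError(f"unknown chunk type: {chunk_type!r}")
        return {**self._base, "seq": next(self._seq), "type": chunk_type, "content": content}


builder = ChunkBuilder("q-1", "s-1", "ref-1", "default", "my-agent")
first = builder.build("thinking", "Checking node status")
second = builder.build("response", "Node worker-3 is Ready")
```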
Final response (POST to $WEBHOOK_URL):
{
  "queue_id": "<uuid>",
  "response": "full answer text",
  "session_id": "<session-id>",
  "send_ref": "...",
  "namespace": "...",
  "agent_ref": "..."
}
Notification (POST to $WEBHOOK_URL with /response replaced by /notification):
{
  "session_id": "<session-id>",
  "agent_ref": "<agent-name>",
  "namespace": "<namespace>",
  "message": "Node worker-3 is now Ready!",
  "notification_type": "success",
  "title": "Background task completed",
  "task_ref": "<task-id>"
}
The operator webhook validates this payload, creates a KubeCopilotNotification CR, and the Web UI delivers it to the user via SSE. notification_type must be one of info, success, warning, or error (defaults to info). title and task_ref are optional.
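The notification rules above (a fixed set of types, info as the default, optional title and task_ref) can be captured in a small builder. This is a sketch rather than the server's actual code; build_notification is a hypothetical helper, and rejecting unknown types on the sending side is an assumption.

```python
NOTIFICATION_TYPES = ("info", "success", "warning", "error")


def build_notification(session_id, agent_ref, namespace, message,
                       notification_type="info", title=None, task_ref=None):
    """Build a notification payload; optional fields are omitted when unset."""
    if notification_type not in NOTIFICATION_TYPES:
        raise ValueError(f"notification_type must be one of {NOTIFICATION_TYPES}")
    payload = {
        "session_id": session_id,
        "agent_ref": agent_ref,
        "namespace": namespace,
        "message": message,
        "notification_type": notification_type,
    }
    if title is not None:
        payload["title"] = title
    if task_ref is not None:
        payload["task_ref"] = task_ref
    return payload


minimal = build_notification("s-1", "my-agent", "default", "Check finished")
full = build_notification("s-1", "my-agent", "default", "Node worker-3 is now Ready!",
                          notification_type="success",
                          title="Background task completed", task_ref="t-42")
```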
Background Task API
The GitHub Copilot agent server implements a background task framework for long-running operations. Tasks periodically check a Kubernetes resource condition or pod phase, and fire a notification to the user session when the condition is met (or when the task times out).
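The check-until-met-or-timeout behaviour can be sketched as a simple polling loop. The code below is a minimal illustration, not the shipped framework: monitor and the outcome strings "completed" and "timed_out" are hypothetical, and the real server would send a notification where this sketch returns.

```python
import asyncio
import time


async def monitor(check, check_interval, timeout):
    """Poll `check` every `check_interval` seconds until it is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if await check():
            return "completed"
        await asyncio.sleep(check_interval)
    return "timed_out"


async def _demo():
    calls = 0

    async def ready_on_third_check():
        nonlocal calls
        calls += 1
        return calls >= 3

    async def never_ready():
        return False

    # Short intervals keep the demo fast; real tasks use seconds, not milliseconds.
    first = await monitor(ready_on_third_check, check_interval=0.01, timeout=1.0)
    second = await monitor(never_ready, check_interval=0.01, timeout=0.05)
    return first, second, calls


outcome = asyncio.run(_demo())
```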
Tasks are persisted to $COPILOT_HOME/tasks.json and automatically re-launched on pod restart.
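Persistence and re-launch could look roughly like the following. The on-disk layout ({"tasks": [...]}), the "running" status value, and the helper names are all assumptions made for illustration; the actual format of tasks.json is not described here.

```python
import json
import os
import tempfile
from pathlib import Path


def save_tasks(path, tasks):
    """Write tasks atomically: write a temp file, then rename over the target."""
    path = Path(path)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"tasks": tasks}, indent=2))
    os.replace(tmp, path)


def load_tasks(path):
    """Return the persisted tasks, or an empty list on first start."""
    try:
        return json.loads(Path(path).read_text())["tasks"]
    except FileNotFoundError:
        return []


def tasks_to_relaunch(tasks):
    """Select tasks that were still active when the pod stopped (assumed status value)."""
    return [t for t in tasks if t.get("status") == "running"]


# Simulate a restart: persist two tasks, then reload and pick the active one.
with tempfile.TemporaryDirectory() as home:
    tasks_file = Path(home) / "tasks.json"
    empty = load_tasks(tasks_file)
    save_tasks(tasks_file, [
        {"task_id": "t1", "status": "running"},
        {"task_id": "t2", "status": "completed"},
    ])
    restored = tasks_to_relaunch(load_tasks(tasks_file))
```

Writing to a temporary file and renaming it avoids leaving a half-written tasks.json behind if the pod is killed mid-write.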
POST /tasks/monitor
Register a new background monitoring task.
Request body:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| session_id | string | ✅ | - | Session to notify when complete |
| agent_ref | string | ✅ | - | Name of the KubeCopilotAgent |
| namespace | string | - | "" | Kubernetes namespace of the target resource |
| task_type | string | - | monitor_resource | Monitor type: monitor_resource or monitor_pod_phase |
| config | object | - | {} | Monitor-specific config (see below) |
| check_interval | int | - | 30 | Seconds between condition checks (minimum: 5) |
| timeout | int | - | 3600 | Maximum seconds to wait before timing out (maximum: 86400) |
| notification_message | string | - | "Background task completed" | Message sent in the notification |
| notification_type | string | - | success | Notification severity: info, success, warning, error |
| title | string | - | "Task Completed" | Toast popup title |
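The defaults and bounds in the request table map naturally onto a small data class. The sketch below is hypothetical (MonitorRequest is not a name from the server), and clamping out-of-range values is an assumption: the real server may reject them instead.

```python
from dataclasses import dataclass, field

MIN_CHECK_INTERVAL = 5   # seconds, from the table
MAX_TIMEOUT = 86400      # seconds, from the table
TASK_TYPES = ("monitor_resource", "monitor_pod_phase")


@dataclass
class MonitorRequest:
    session_id: str
    agent_ref: str
    namespace: str = ""
    task_type: str = "monitor_resource"
    config: dict = field(default_factory=dict)
    check_interval: int = 30
    timeout: int = 3600
    notification_message: str = "Background task completed"
    notification_type: str = "success"
    title: str = "Task Completed"

    def __post_init__(self):
        if self.task_type not in TASK_TYPES:
            raise ValueError(f"task_type must be one of {TASK_TYPES}")
        # Assumption: clamp to the documented bounds rather than raising.
        self.check_interval = max(self.check_interval, MIN_CHECK_INTERVAL)
        self.timeout = min(self.timeout, MAX_TIMEOUT)


defaults = MonitorRequest(session_id="abc", agent_ref="my-agent")
clamped = MonitorRequest(session_id="abc", agent_ref="my-agent",
                         check_interval=1, timeout=1_000_000)
```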
monitor_resource config fields:
| Field | Default | Description |
|---|---|---|
| resource_type | nodes | Kubernetes resource type (e.g. pods, deployments) |
| resource_name | "" | Name of the resource |
| condition_type | Ready | Status condition type to check |
| condition_status | True | Expected condition status |
| api_version | v1 | API version (e.g. v1, apps/v1) |
| resource_namespace | "" | Namespace of the resource (empty for cluster-scoped) |
monitor_pod_phase config fields:
| Field | Default | Description |
|---|---|---|
| pod_name | "" | Name of the pod |
| pod_namespace | default | Namespace of the pod |
| target_phase | Running | Target pod phase (e.g. Running, Succeeded) |
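Both monitor types reduce to a predicate over the resource's status block. The functions below are a sketch of those predicates operating on plain dictionaries shaped like Kubernetes API objects (status.conditions as a list of type/status entries, status.phase for pods); the function names are hypothetical and fetching the object from the cluster is left out.

```python
def condition_met(resource, condition_type="Ready", condition_status="True"):
    """True if status.conditions contains the given type with the given status."""
    conditions = resource.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == condition_type and c.get("status") == condition_status
        for c in conditions
    )


def pod_phase_reached(pod, target_phase="Running"):
    """True if the pod's status.phase equals the target phase."""
    return pod.get("status", {}).get("phase") == target_phase


node = {"status": {"conditions": [
    {"type": "MemoryPressure", "status": "False"},
    {"type": "Ready", "status": "True"},
]}}
pod = {"status": {"phase": "Pending"}}
```

Note that condition statuses are the strings "True", "False", or "Unknown" in the Kubernetes API, which is why the default is a string rather than a boolean.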
Response:
Example — monitor a node until Ready:
curl -X POST http://<agent-svc>:8080/tasks/monitor \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "abc",
    "agent_ref": "my-agent",
    "namespace": "default",
    "task_type": "monitor_resource",
    "config": {
      "resource_type": "nodes",
      "resource_name": "worker-3",
      "condition_type": "Ready",
      "condition_status": "True"
    },
    "check_interval": 30,
    "timeout": 3600,
    "notification_message": "Node worker-3 is now Ready!"
  }'
GET /tasks
List all background tasks. Returns { "tasks": [...] } where each item includes task_id, task_type, status, session_id, title, and config.
GET /tasks/{task_id}
Get full details of a specific task, including check_interval, timeout, and notification_message.
DELETE /tasks/{task_id}
Cancel and remove a task. Returns { "status": "deleted", "task_id": "..." }.
Environment Variables
Variables injected by the operator into the agent container:
| Variable | Description |
|---|---|
| GITHUB_TOKEN | Auth token from the githubTokenSecretRef (can be repurposed for any API key) |
| WEBHOOK_URL | URL of the operator's internal webhook (http://<svc>/response) |
| COPILOT_HOME | Persistent storage root (backed by a PV) |
| KUBECONFIG | Path to kubeconfig if a kubeconfigSecretRef is set |
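A server might read these variables once at startup and derive the other webhook endpoints from WEBHOOK_URL. The sketch below assumes WEBHOOK_URL ends in /response, as the table indicates; Settings, load_settings, and endpoint are hypothetical names, and operator.example is a placeholder host.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    webhook_url: str
    copilot_home: str
    github_token: str
    kubeconfig: Optional[str]

    def endpoint(self, name: str) -> str:
        """Swap the trailing /response for another webhook path such as chunk."""
        suffix = "/response"
        if not self.webhook_url.endswith(suffix):
            raise ValueError("WEBHOOK_URL is expected to end with /response")
        return self.webhook_url[: -len(suffix)] + "/" + name


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from an environment mapping (pass os.environ in a real server)."""
    return Settings(
        webhook_url=env["WEBHOOK_URL"],
        copilot_home=env["COPILOT_HOME"],
        github_token=env.get("GITHUB_TOKEN", ""),
        kubeconfig=env.get("KUBECONFIG"),  # only set when a kubeconfigSecretRef exists
    )


settings = load_settings({
    "WEBHOOK_URL": "http://operator.example/response",
    "COPILOT_HOME": "/copilot",
})
```

Stripping only a trailing /response is slightly stricter than a plain string replace, which would also rewrite an earlier occurrence of that text in the URL.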
Skills and AGENT.md are mounted into the container as ConfigMaps:
- Skills ConfigMap → /copilot-skills-staging/ → entrypoint.sh stages them into $COPILOT_HOME/skills/<name>/SKILL.md
- AGENT.md ConfigMap → $COPILOT_HOME/AGENT.md
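The staging step is performed by entrypoint.sh, but its logic is easy to express in Python for testing or for engines that prefer a Python entrypoint. The sketch below assumes each skill arrives as a <name>.md file in the staging directory; stage_skills is a hypothetical helper and the paths are passed in rather than hard-coded.

```python
import shutil
import tempfile
from pathlib import Path


def stage_skills(staging_dir, home):
    """Copy <staging_dir>/<name>.md to <home>/skills/<name>/SKILL.md; return staged names."""
    staged = []
    for src in sorted(Path(staging_dir).glob("*.md")):
        if not src.is_file():
            continue
        dest = Path(home) / "skills" / src.stem / "SKILL.md"
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
        staged.append(src.stem)
    return staged


# Demonstration against temporary directories instead of the real mount points.
with tempfile.TemporaryDirectory() as tmp:
    staging = Path(tmp) / "staging"
    staging.mkdir()
    (staging / "triage.md").write_text("# Triage skill\n")
    home = Path(tmp) / "home"
    staged = stage_skills(staging, home)
    staged_text = (home / "skills" / "triage" / "SKILL.md").read_text()
```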
Creating a New Agent Image (e.g., Claude Code)
To add a new AI engine (such as Claude Code), create a new subdirectory under agent-server-container/ and implement the API contract:
graph TB
subgraph repo["agent-server-container/"]
A["github-copilot/<br/><sub>default engine · Copilot SDK</sub>"]
B["claude-code/<br/><sub>example new engine</sub>"]
end
subgraph files["Required Files"]
F1["server.py<br/><sub>implement /health, /asyncchat, /cancel</sub>"]
F2["entrypoint.sh<br/><sub>auth setup, skill staging</sub>"]
F3["Containerfile<br/><sub>build the image</sub>"]
end
B --> F1
B --> F2
B --> F3
style repo fill:#1a1a2e,stroke:#00bcd4,color:#e0e0e0
style files fill:#16213e,stroke:#e94560,color:#e0e0e0
1. Write entrypoint.sh
Set up auth and launch server.py:
#!/bin/bash
set -e
# The operator injects the secret as GITHUB_TOKEN; reuse it as the Anthropic key
# unless ANTHROPIC_API_KEY is already set.
export ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY:-${GITHUB_TOKEN:-}}"
# Default to the operator-provided persistent volume so state survives restarts.
export AGENT_HOME="${AGENT_HOME:-${COPILOT_HOME:-/agent}}"
mkdir -p "${AGENT_HOME}/sessions" "${AGENT_HOME}/.cache"
# Stage skills (same pattern as github-copilot)
if [ -d /copilot-skills-staging ]; then
  for f in /copilot-skills-staging/*.md; do
    [ -f "$f" ] || continue
    skill_name="$(basename "$f" .md)"
    mkdir -p "${AGENT_HOME}/skills/${skill_name}"
    cp "$f" "${AGENT_HOME}/skills/${skill_name}/SKILL.md"
  done
fi
exec /opt/venv/bin/python /server.py
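2. Write server.py
Implement the three required endpoints:
import asyncio
import os
import uuid

import httpx
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
WEBHOOK_URL = os.environ.get("WEBHOOK_URL", "")
_active_procs = {}  # queue_id -> running CLI process
_tasks = set()      # strong references so background tasks are not garbage-collected

class AsyncChatRequest(BaseModel):
    message: str
    session_id: str | None = None
    send_ref: str | None = None
    namespace: str | None = None
    agent_ref: str | None = None

@app.get("/health")
async def health():
    return {"status": "ok"}

@app.post("/asyncchat")
async def asyncchat(req: AsyncChatRequest):
    queue_id = str(uuid.uuid4())
    task = asyncio.create_task(process(queue_id, req))
    _tasks.add(task)
    task.add_done_callback(_tasks.discard)
    return {"queue_id": queue_id, "status": "queued"}

@app.delete("/cancel/{queue_id}")
async def cancel(queue_id: str):
    proc = _active_procs.pop(queue_id, None)
    if proc:
        proc.terminate()
        return {"status": "cancelled", "queue_id": queue_id}
    return {"status": "not_found", "queue_id": queue_id}

async def process(queue_id: str, req: AsyncChatRequest):
    chunk_url = WEBHOOK_URL.replace("/response", "/chunk")
    meta = {"queue_id": queue_id, "session_id": req.session_id, "send_ref": req.send_ref,
            "namespace": req.namespace, "agent_ref": req.agent_ref}
    # Launch the Claude Code CLI. The flags are placeholders: adapt them to the actual binary.
    cmd = ["claude", "--no-interactive", "--output-format", "stream-json", req.message]
    proc = await asyncio.create_subprocess_exec(*cmd, stdout=asyncio.subprocess.PIPE)
    _active_procs[queue_id] = proc
    lines, seq = [], 0
    async with httpx.AsyncClient() as http:
        # Simplification: forward each raw stdout line as a "response" chunk.
        # A real shim would parse the stream-json events and map them to chunk types.
        async for raw in proc.stdout:
            seq += 1
            lines.append(raw.decode().rstrip())
            await http.post(chunk_url, json={**meta, "seq": seq, "type": "response", "content": lines[-1]})
        await proc.wait()
        # Assumption: skip the final response if /cancel already removed the entry.
        if _active_procs.pop(queue_id, None) is not None:
            await http.post(WEBHOOK_URL, json={**meta, "response": "\n".join(lines)})

if __name__ == "__main__":
    # entrypoint.sh runs this file directly, so start the HTTP server here.
    uvicorn.run(app, host="0.0.0.0", port=8080)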
3. Write Containerfile
FROM python:3.12-slim
# Create the virtualenv that entrypoint.sh launches (/opt/venv/bin/python)
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:${PATH}"
RUN pip install --no-cache-dir fastapi uvicorn httpx
# Install the Claude Code CLI. Placeholder: adjust to the actual install method.
RUN pip install --no-cache-dir claude-code
RUN useradd -m -s /bin/bash agent
WORKDIR /home/agent
COPY entrypoint.sh /entrypoint.sh
COPY server.py /server.py
RUN chmod +x /entrypoint.sh
USER agent
EXPOSE 8080
ENTRYPOINT ["/entrypoint.sh"]
4. Add a Makefile target
CLAUDE_IMG ?= quay.io/yourorg/kube-claude-code-agent-server:v1.0
.PHONY: container-build-claude container-push-claude
container-build-claude:
	$(CONTAINER_TOOL) build -t $(CLAUDE_IMG) ./agent-server-container/claude-code/
container-push-claude:
	$(CONTAINER_TOOL) push $(CLAUDE_IMG)
5. Create a KubeCopilotAgent CR pointing to the new image
apiVersion: kubecopilot.io/v1
kind: KubeCopilotAgent
metadata:
  name: claude-code-agent
  namespace: kube-copilot-agent
spec:
  image: quay.io/yourorg/kube-claude-code-agent-server:v1.0
  githubTokenSecretRef:  # reuse field for ANTHROPIC_API_KEY via a secret
    name: anthropic-token
  skillsConfigMap: claude-skills
  agentConfigMap: claude-agent-md
  storageSize: "1Gi"
The operator treats every KubeCopilotAgent the same way regardless of which AI engine runs inside — as long as the container implements the API contract, the full UI, streaming, session history, and cancellation features work automatically.