KeGAL Quick Reference — Agent Guide for Graph Construction¶
This document is the single authoritative reference for building KeGAL graphs. Read it completely before writing any graph YAML. Every rule here reflects tested, production-verified behaviour.
1. Graph File Anatomy¶
A KeGAL graph is a YAML file with these top-level keys:
models: # required — one or more LLM configurations
verbose: # optional bool — enable verbose logging
user_message: # optional string — default user prompt
mcp_servers: # optional — out-of-process tool servers
blackboard: # optional — named shared markdown buffers
tools: # optional — in-process Python tool schemas
prompts: # required — prompt templates (0-indexed list)
nodes: # required — agent nodes
edges: # required — execution topology
chat_history: # optional — persistent conversation history
2. Models¶
models:
- llm: "ollama" # provider: ollama | anthropic | anthropic_aws | openai | bedrock | gemini
model: "qwen2.5:7b" # model name or ARN
host: "http://localhost:11434" # ollama only
context_window: 32768 # required when using react compact: true
api_key: "${ANTHROPIC_API_KEY}" # use ${ENV_VAR} to read from environment
# aws_region_name, aws_access_key, aws_secret_key for bedrock / anthropic_aws
- The list is 0-indexed. Nodes reference models by index:
model: 0. - Multiple entries allow different nodes to use different models/providers.
context_windowis required for accurate ReAct compaction. Without it,max_tokensis used as a proxy (much smaller).
3. Prompts¶
prompts:
- template: # index 0
system_template:
role: |
System instruction text.
any_key: |
More system text (multiple keys are concatenated).
prompt_template:
label_a: |
{user_message} # placeholder
label_b: |
{message_passing} # placeholder
- The list is 0-indexed. Nodes reference prompts by index:
prompt.template: 0. system_templatekeys are concatenated to build the system prompt.prompt_templatekeys become the user-turn content.- Template keys are arbitrary labels — they only affect readability.
Available Placeholders¶
| Placeholder | Source | Notes |
|---|---|---|
{user_message} |
compiler.user_message or YAML user_message: |
Activate with prompt.user_message: true on the node |
{message_passing} |
accumulated pipe from upstream nodes | Activate with message_passing.input: true on the node |
{blackboard} |
named board file content | Activate with blackboard.read: true on the node |
{retrieved_chunks} |
compiler.retrieved_chunks |
Activate with prompt.chunks: true on the node |
4. Nodes¶
Every node entry follows this schema:
nodes:
- id: "node_id" # required — unique string
model: 0 # required — index into models list
temperature: 0.0 # float 0.0–1.0
max_tokens: 512 # int — max output tokens
show: true # bool — include in output report (ignored on react agents — use message_passing.output instead)
prompt:
template: 0 # required — index into prompts list
user_message: true # inject {user_message} placeholder
chunks: false # inject {retrieved_chunks} placeholder
chat_history: "session" # named chat history key
# Data flow
message_passing:
input: true # receive upstream pipe as {message_passing}
output: true # write final LLM text to pipe after execution
# Tools (choose one or both)
tools: ["tool_name"] # in-process Python tools (schema declared top-level)
mcp_servers: # out-of-process MCP tools
- id: "server_id"
tools: ["tool_a"] # optional whitelist; omit to expose all server tools
max_tool_calls: 10 # int — max LLM-call iterations in tool loop (default: 10)
# Shared state
blackboard:
id: "board_id" # must match a declared board
read: true # inject board content as {blackboard}
write: true # append LLM response to board after execution
# ReAct controller (mutually exclusive with tools/mcp_servers/blackboard)
react:
max_iterations: 20 # int — hard cap on dispatch loop
compact: true # enable automatic conversation compaction
compact_threshold: 0.8 # compact at this fraction of context_window (default: 0.8)
react_output: # JSON schema the controller must return each iteration
type: object
properties:
next_agent: { type: string }
agent_input: { type: string }
done: { type: boolean }
final_answer: { type: string }
reasoning: { type: string }
required: ["done"]
# Structured output (non-ReAct nodes)
structured_output:
description: "optional description"
parameters:
field_name: { type: string, description: "..." }
score: { type: integer }
valid: { type: boolean }
required: ["field_name"]
ReAct Controller Restrictions¶
| Feature | Controller | Agent nodes |
|---|---|---|
tools |
Raises ValueError |
✓ |
mcp_servers |
Raises ValueError |
✓ |
blackboard.write: true |
Raises ValueError |
✓ |
blackboard.read: true |
Raises ValueError |
✓ |
message_passing.input |
✓ seeds initial conversation | ✓ |
message_passing.output |
✓ pushes final_answer downstream |
✓ |
All four restrictions are hard errors raised at Compiler() construction — the graph never starts.
5. Edges¶
Edges define the execution topology. Nodes NOT referenced in edges are still executed — they run in dependency order inferred from message_passing flags.
5.1 Sequential (linear chain)¶
The simplest case — no edges or bare edges list. The scheduler infers order from message_passing flags automatically.
5.2 Fan-out: children — parallel branches¶
children means parallel. All children start simultaneously after the parent completes.
Never use children for sequential intent — use message_passing ordering instead.
One child is effectively sequential (nothing to run in parallel with), but multiple children always run concurrently.
Children are recursive:
edges:
- node: "A"
children:
- node: "B"
children:
- node: "D"
- node: "E" # D and E run in parallel after B
- node: "C" # C runs in parallel with B
5.3 Fan-in: fan_in — aggregation gate¶
synthesizer waits for all listed predecessors. Combine with children for fan-out → fan-in:
edges:
- node: "dispatcher"
children:
- node: "agent_a"
- node: "agent_b"
- node: "agent_c"
- node: "synthesizer"
fan_in:
- node: "agent_a"
- node: "agent_b"
- node: "agent_c"
A node appears in children (who launches it) AND fan_in (who waits for it). This double appearance is correct and intentional.
5.5 Ordered variants: ordered_children and ordered_fan_in¶
ordered_children runs siblings sequentially instead of in parallel.
Each entry depends on the previous one, and the first depends on the parent:
edges:
- node: "parent"
ordered_children:
- node: "step_1" # depends on parent
- node: "step_2" # depends on step_1
- node: "step_3" # depends on step_2
ordered_fan_in makes predecessors run sequentially before the aggregator:
edges:
- node: "synthesizer"
ordered_fan_in:
- node: "A" # runs first
- node: "B" # depends on A
- node: "C" # depends on B → synthesizer waits for C
Both can coexist with their parallel counterparts on the same edge:
edges:
- node: "parent"
children: # X and Y in parallel
- node: "X"
- node: "Y"
ordered_children: # A then B sequentially
- node: "A"
- node: "B"
Both work identically inside react sub-graph dispatches — use ordered_children
on a react agent entry to sequence its sub-DAG without nesting children:
react:
- node: "analyze_agent"
ordered_children:
- node: "writer_agent"
- node: "validator_agent" # flat, readable — no nested children needed
5.4 ReAct edges¶
Agent nodes listed under react: are excluded from the main DAG — they only run when the controller dispatches them.
react and children/fan_in are mutually exclusive on the same edge entry.
Each entry in react: is itself a full GraphEdge and may have its own children/fan_in to define a sub-DAG that runs when that agent is dispatched:
edges:
- node: "controller"
react:
- node: "analyze_agent"
children:
- node: "writer_agent" # runs after analyze_agent in sub-graph
- node: "report_agent"
Ordering rule for children sub-DAGs: When two agents run inside the same dispatch (linked via children), they share one isolated pipe. Within that execution, message_passing order inference does NOT apply — use children to guarantee sequence.
Separate react entries are ordered by the controller. agent_a and agent_b as distinct entries in the react: list are dispatched in separate controller iterations. Message passing between them flows through the controller: agent_a's output becomes the controller's observation, the controller embeds it in agent_input, and agent_b receives it via message_passing.input: true.
6. Message Passing — Data Flow¶
Message passing flows text between nodes through a shared pipe.
How it works¶
- A node with
output: trueappends its final LLM text to the pipe after execution. - A node with
input: truereceives the accumulated pipe as{message_passing}in its prompt.
Output priority (what gets written to the pipe)¶
response.messages— the LLM's final text response (always preferred)response.tool_results— raw tool output (fallback when no text was produced)response.json_output— structured output (last fallback)
Important: Even when a node calls tools, it is the LLM's final text that flows downstream — not the raw tool results. Design downstream prompts to receive processed findings, not raw data.
Sequential inference¶
The scheduler infers node_a → node_b automatically when node_a has output: true and node_b has input: true, based on declaration order. No explicit edge entries are needed for a linear chain.
Isolation in ReAct sub-graphs¶
When the controller dispatches an agent:
- A fresh local pipe is created: [agent_input]
- The agent sub-graph runs in isolation against this local pipe
- After the sub-graph completes, the controller receives as observation:
- The pipe growth (everything added beyond the initial agent_input)
- Falls back to the last sub-graph node's response.messages if no growth
- The global pipe is restored unchanged after dispatch
7. Blackboard¶
Named markdown buffers persisted to disk during compile().
blackboard:
path: ./blackboards/ # directory for board files (relative to YAML)
boards:
- id: "findings"
file: findings.md
cleanup: true # truncate at Compiler init (default: true)
- id: "report"
file: report.md
cleanup: true
import: ["findings"] # prepend 'findings' content at read time
Node categories¶
| Category | read |
write |
Role |
|---|---|---|---|
| Cat-1 | false | true | Writer — seeds the board |
| Cat-2 | true | true | Enricher — reads then extends |
| Cat-3 | true | false | Reader — consumes final board |
Execution order is inferred automatically: Cat-1 → Cat-2 (parallel) → Cat-3. No explicit children/fan_in needed for this basic pattern.
Blackboard write behaviour¶
write: true appends the node's response.messages (the LLM's final text) to the board file.
Only use write: true on nodes without tools. A node that calls tools produces noisy intermediate messages mixed with the final text — the blackboard content will be polluted.
For clean blackboard writes from tool-calling workflows, use a dedicated no-tool writer node that receives the findings via message_passing and writes them via an MCP append_text_file tool, or use a pure write: true node that only outputs text.
8. MCP Servers¶
Out-of-process tool servers using the Model Context Protocol.
mcp_servers:
- id: "my_tools"
transport: "stdio" # stdio | sse
command: "/path/to/python" # stdio only — absolute path recommended
args: ["./server.py"] # stdio only
# url: "http://host:8080/sse" # sse only
Attaching to nodes¶
nodes:
- id: "tool_node"
mcp_servers: ["my_tools"] # string shorthand — all tools exposed
- id: "scoped_node"
mcp_servers:
- id: "my_tools"
tools: ["read_file", "list_dir"] # whitelist — model only sees these tools
Filter tools per node. Local LLMs hallucinate less when exposed only to the tools they need.
max_tool_calls — default 10 LLM-call iterations. Increase for nodes that must read many files:
MCP servers must be on agent nodes, never on the ReAct controller.
9. Python Tool Executors¶
In-process tools defined as Python functions, wired at Compiler construction.
# In the YAML
tools:
- name: "get_weather"
description: "Returns the current weather for a given city."
parameters:
city: { type: string, description: "The city name." }
required: ["city"]
nodes:
- id: "tool_node"
tools: ["get_weather"]
# In Python
def get_weather(city: str) -> str:
return f"Sunny, 22°C in {city}"
with Compiler(uri="graph.yml", tool_executors={"get_weather": get_weather}) as c:
c.compile()
Tool names in tool_executors must exactly match names in the YAML tools: list.
A node can have both tools and mcp_servers simultaneously.
10. ReAct Loop¶
Controller prompt requirements¶
The controller system prompt must declare: - The available agents and what each does - The expected JSON output fields - The workflow / rules
The controller must return react_output JSON on every iteration. Key fields:
- next_agent — ID of the agent to dispatch (omit only when done: true)
- agent_input — instruction string passed to the agent
- done — true only when the full task is complete
- reasoning — current state and rationale (helps compaction)
- final_answer / final_summary — optional summary when done
Agent node requirements¶
Agent nodes must have:
- message_passing: {input: true} — to receive agent_input
- message_passing: {output: true} — if the observation should come from the pipe growth
(otherwise it falls back to response.messages of the last sub-graph node)
Observation extraction priority (what the controller sees)¶
- Pipe growth since dispatch start (if any node in the sub-graph has
output: true) - Last sub-graph node's
response.messages - Last sub-graph node's
response.tool_results[-1] - Last sub-graph node's
response.json_output - Echo of
agent_input(with warning) if nothing produced
Conversation compaction (compact)¶
react:
max_iterations: 30
compact: true
compact_threshold: 0.8 # compact when conversation > 80% of context_window (default)
models:
- llm: "ollama"
model: "qwen3:30b"
host: "http://..."
context_window: 32768 # required for accurate threshold calculation
Always set context_window on the model when using compact: true.
Piping controller output downstream¶
A controller with message_passing.output: true writes its final_answer to the shared pipe after the loop ends. Downstream nodes with input: true receive it automatically.
If the loop ends without a final_answer (e.g. max_iterations reached without done: true), nothing is written to the pipe and a warning is logged. Always ensure the controller produces a final_answer before signalling done: true.
11. Execution Order — DAG Rules¶
The compiler builds the DAG in four stages:
- Structural:
childrencreates fan-out;fan_increates aggregation. - Message passing:
output: truenode → all laterinput: truenodes (by declaration order). - Guard barrier: nodes with
validationinstructured_outputprecede all other nodes. - Blackboard: Cat-1 → Cat-2 (parallel) → Cat-3 (inferred from
read/writeflags).
Nodes in the same topological level run in parallel (ThreadPoolExecutor). Guard nodes run first within a level, sequentially. The ReAct controller runs last within its level, after all regular nodes complete.
12. Guard Nodes¶
A node whose structured_output includes a validation boolean field becomes a guard. If validation: false, the graph aborts immediately and subsequent nodes never run.
nodes:
- id: "input_guard"
model: 0
temperature: 0.0
max_tokens: 64
show: false
structured_output:
parameters:
validation: { type: boolean, description: "true if input is valid" }
reason: { type: string }
required: ["validation"]
prompt: { template: 0, user_message: true }
Guard nodes are injected automatically before all non-guard nodes. No explicit edge declarations needed.
13. Multi-Model Assignment¶
Assign different models to different nodes to optimise cost and speed:
models:
- llm: "ollama"
model: "qwen3:30b" # model 0 — strong, for complex reasoning
host: "http://..."
context_window: 32768
- llm: "ollama"
model: "qwen2.5:14b" # model 1 — medium, for structured analysis
host: "http://..."
context_window: 32768
- llm: "ollama"
model: "llama3.2:latest" # model 2 — small, for trivial tasks
host: "http://..."
context_window: 8192
nodes:
- id: "controller"
model: 0 # strongest model for orchestration
- id: "analyst"
model: 1 # medium model for analysis
- id: "list_agent"
model: 2 # smallest model for trivial tool calls
14. Critical Rules and Common Pitfalls¶
Rule 1 — children is parallel, not sequential¶
Multiple nodes listed under children run concurrently. Never use children expecting sequential order across siblings.
# WRONG — these run in parallel, not in sequence
children:
- node: "step_1"
- node: "step_2" # does NOT wait for step_1
# CORRECT — step_2 waits for step_1
edges:
- node: "step_1"
children:
- node: "step_2"
# Or rely on message_passing order inference for a linear chain.
Rule 2 — Use ordered_children for sequential siblings¶
children is parallel. For sequential execution of siblings under a common parent, use ordered_children:
# WRONG — parallel, not sequential
children:
- node: "step_1"
- node: "step_2"
# CORRECT — sequential
ordered_children:
- node: "step_1"
- node: "step_2"
Inside react dispatches, ordered_children replaces the nested children workaround for sequencing agents within a single dispatch.
Rule 3 — Controller never has tools, MCP servers, or blackboard access¶
Setting tools, mcp_servers, blackboard.read, or blackboard.write on a controller raises ValueError at Compiler() construction. Attach all of these to agent nodes only.
Rule 4 — Blackboard writes must come from no-tool nodes¶
blackboard.write: true appends response.messages. A node that also calls tools will pollute the blackboard with tool-call strings. Use a dedicated text-only node for clean writes.
Rule 5 — context_window is required for compact¶
Without context_window on the model, compact: true uses max_tokens as the budget — far too small, causing premature compaction.
Rule 6 — Prompt template indices start at 0¶
Both models: and prompts: are 0-indexed lists. Index out of range is caught at Compiler() construction, not at compile().
Rule 7 — MCP paths must be absolute on the target machine¶
The command field is executed on the machine running kegal, not the machine authoring the YAML. Use absolute paths for command and ensure args paths resolve from the working directory where kegal run is executed.
Rule 8 — show: true is ignored on react agent nodes¶
React agent outputs are not included in compiled output — agent nodes run in an isolated context that is discarded after each dispatch. Setting show: true on a react agent logs a warning at Compiler() construction. Use message_passing.output: true on the agent and message_passing.output: true on the controller to surface results downstream.
Rule 9 — One MCP server per node reference¶
The object form {id: server_id, tools: [...]} is the correct way to filter tools per node. The string shorthand "server_id" exposes all tools.
15. Complete Working Example — ReAct with Sub-DAG¶
A controller dispatching a two-step agent pair (analyze → write) and a report agent:
models:
- llm: "ollama"
model: "qwen3:30b"
host: "http://localhost:11434"
context_window: 32768
mcp_servers:
- id: "files"
transport: "stdio"
command: "/path/to/python"
args: ["./file_server.py"]
blackboard:
path: ./blackboards/
boards:
- id: findings
file: findings.md
cleanup: true
prompts:
# 0 — controller
- template:
system_template:
role: |
You are an orchestration controller.
Available agents:
analyze_agent — reads and analyses a document; findings go to blackboard
report_agent — generates the final report from the blackboard
Workflow: dispatch analyze_agent for each document, then dispatch report_agent.
Return JSON: next_agent, agent_input, done, reasoning.
prompt_template:
task: "{user_message}"
# 1 — analyze_agent (reads docs, outputs findings as final text)
- template:
system_template:
role: |
Read the document at FILE. Produce ONLY a structured findings block as your final response.
prompt_template:
task: "{message_passing}"
# 2 — writer_agent (receives findings via message_passing, appends to blackboard file)
- template:
system_template:
role: |
You receive an instruction block followed by a findings block in {message_passing}.
Call append_text_file(path=./blackboards/findings.md, content=<findings block>).
Reply with: "Written."
prompt_template:
task: "{message_passing}"
# 3 — report_agent (reads blackboard, writes report)
- template:
system_template:
role: |
Read {blackboard}. Write a full report using write_text_file.
Reply with: "Report written."
prompt_template:
findings: "{blackboard}"
task: "{message_passing}"
nodes:
- id: "controller"
model: 0
temperature: 0.0
max_tokens: 1536
show: true # controller output (final_answer) shown in report
message_passing: { output: true } # final_answer flows to downstream nodes
prompt: { template: 0, user_message: true }
react:
max_iterations: 30
compact: true
compact_threshold: 0.8 # default
react_output:
type: object
properties:
next_agent: { type: string }
agent_input: { type: string }
done: { type: boolean }
reasoning: { type: string }
required: ["done", "next_agent", "agent_input", "reasoning"]
- id: "analyze_agent"
model: 0
temperature: 0.0
max_tokens: 2048
max_tool_calls: 15
show: false # react agent — show has no effect
message_passing: { input: true, output: true }
mcp_servers:
- id: "files"
tools: ["read_text_file", "read_pdf_file"]
prompt: { template: 1 }
- id: "writer_agent"
model: 0
temperature: 0.0
max_tokens: 64
max_tool_calls: 2
show: false
message_passing: { input: true, output: false }
mcp_servers:
- id: "files"
tools: ["append_text_file"]
prompt: { template: 2 }
- id: "report_agent"
model: 0
temperature: 0.0
max_tokens: 4096
max_tool_calls: 10
show: false # react agents: show=true is ignored — use message_passing.output
message_passing: { input: true, output: false }
blackboard: { id: findings, read: true }
mcp_servers:
- id: "files"
tools: ["write_text_file", "append_text_file"]
prompt: { template: 3 }
edges:
- node: "controller"
react:
- node: "analyze_agent"
children:
- node: "writer_agent" # runs after analyze_agent in the sub-graph
- node: "report_agent"
16. CLI Usage¶
# Run once from the project directory (requires kegal.yml)
kegal run
# kegal.yml specifies the graph file and mode
# graph: validation_graph.yml
# mode: once (default)
# mode: chat (interactive loop)
kegal.yml fields:
graph: my_graph.yml # required — path to graph YAML, relative to kegal.yml
mode: once # once (default) | chat
tools_module: ./tools.py # optional — Python file with tool_executors dict
message: true # chat mode only — prompt for user_message each turn
chunks: false # chat mode only — prompt for RAG chunks each turn
tools_module loads Python tool executors at startup via importlib. The
file must define a non-empty tool_executors = {"name": fn} dict at module
level. Missing file or missing dict → hard error, exit 1.
17. Providers Quick Reference¶
llm value |
Provider | Required fields |
|---|---|---|
ollama |
Local Ollama | model, host |
anthropic |
Anthropic API | model, api_key |
anthropic_aws |
Claude via Bedrock | model (ARN), aws_region_name, aws_access_key, aws_secret_key |
bedrock |
AWS Bedrock native | model, aws_region_name, aws_access_key, aws_secret_key |
openai |
OpenAI API | model, api_key |
gemini |
Google Gemini | model, api_key |
All string fields in models: support ${ENV_VAR} substitution — the value is read from os.environ at graph load time. Use this to avoid hardcoding secrets in YAML files:
models:
- llm: "anthropic"
model: "claude-sonnet-4-6"
api_key: "${ANTHROPIC_API_KEY}"
- llm: "gemini"
model: "gemini-2.0-flash"
api_key: "${GEMINI_API_KEY}"
- llm: "bedrock"
model: "..."
aws_region_name: "${AWS_REGION}"
aws_access_key: "${AWS_ACCESS_KEY_ID}"
aws_secret_key: "${AWS_SECRET_ACCESS_KEY}"
Set the variables in your shell before running:
Or use a conda environment variable:conda env config vars set GEMINI_API_KEY="AIza..." -n myenv