Agent Loop

The agent loop is the core processing cycle: a message arrives, the agent assembles context, calls the LLM, executes tools, and produces a response. This page walks through each step based on the actual implementation.

Entry points

A turn can be triggered by:

A chat message via a channel (Feishu, Discord, HTTP)
A heartbeat timer firing
A scheduled task from the scheduler

Step-by-step flow

1. Receive the trigger message

The Think.think() method receives:

msg — the incoming message that triggered this turn
tool_msgs — results from tool calls in a previous iteration
mode — "text" or "voice"

2. Build the system prompt

The system prompt is assembled from 5 ordered layers (see System Prompt):

Fixed skeleton — agent role, persona, skills list, knowledge context, conversation summary
Runtime facts — agent ID, channel info, current time, workspace path, turn variables
Workspace context — contents of AGENTS.md, SOUL.md, IDENTITY.md if present
Heartbeat context — injected only for heartbeat-triggered turns
Ephemeral system prompt — per-turn temporary instructions (used for sub-agents)

3. Fetch conversation history

Recent messages are retrieved from the sensor memory store. Up to context_top_k (default 12) most recent messages for the current conversation are loaded. Each session key stores a rolling cap of 100 raw messages.

4. Build the message list

The LLM message list is constructed in order:

System prompt messages (from step 2)
Historical conversation messages (from step 3), prefixed with [timestamp] speaker: content
The current trigger message
Any tool result messages from a previous iteration

Messages are deduplicated by ID.

5. Sanitize messages

Message sanitization ensures OpenAI-compatible tool call ordering:

Orphan tool messages (no preceding assistant tool_call) are removed
Incomplete tool call blocks (missing tool responses) are dropped
Tool responses are ensured to be immediately after their assistant call

6. Call the LLM

The assembled messages are sent to the LLM via the configured provider. The agent can either:

Stream the response — yield deltas as they arrive
Return the complete response at once

The LLM can respond with:

Text — a direct reply
Tool calls — requests to execute tools (read, exec, web_fetch, etc.)

7. Handle the result

If the LLM returns text: the response is sent back through the channel
If the LLM returns tool calls: each tool is executed, results are collected, and the loop restarts from step 2 with the tool results as tool_msgs

Voice mode (mode="voice") uses a separate VoiceThink class that manages realtime WebSocket connections with the LLM for streaming audio, VAD, and session reconnection.

Tool execution

Available tools depend on the mode:

Mode	Available tools
Text	`read`, `write`, `exec`, `web_fetch`, `api_request`, `delegate_task`, `manage_schedule`, `process`
Voice	`skip_voice_reply` (plus a subset of text tools)

Tool results are collected and fed back into the loop. The LLM can make multiple rounds of tool calls before producing a final text response.

Context budget

The LLM context window is finite. MushroomAgent manages it through:

Message history: limited to context_top_k recent messages (configurable)
Workspace files: each file capped at 4000 chars, total at 12000 chars
Tool output: individual tools apply their own truncation limits
Token counting: max_completion_tokens is calculated based on available context window

If the combined prompt exceeds the model's context window, the LLM call will fail. Adjust memory.context_top_k or keep workspace files concise to avoid this.

Sub-agents

The delegate_task tool spawns child agents for isolated tasks. Sub-agents:

Receive an ephemeral system prompt with the task instructions
Skip workspace context files (skip_context_files=True)
Run in quiet mode (quiet_mode=True)
Output a plain-text result back to the parent agent

Entry points​

Step-by-step flow​

1. Receive the trigger message​

2. Build the system prompt​

3. Fetch conversation history​

4. Build the message list​

5. Sanitize messages​

6. Call the LLM​

7. Handle the result​

Tool execution​

Context budget​

Sub-agents​