Configuration Reference
MushroomAgent's primary agent runtime config lives in ~/.mushroom_agent/config.yaml. This page documents every section and field.
Configuration is loaded in order:
1. MUSHROOM_ROOT/config/config.yaml (project root)
2. ~/.mushroom_agent/config.yaml (local overrides)
Fields set in the local file override fields with the same name from the project root file.
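The override rule can be sketched as a dictionary merge. This is an illustration only: the helper name merge_config is hypothetical, and merging nested sections key by key (rather than replacing a whole section) is an assumption this page does not confirm.

```python
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`.

    Assumption of this sketch: nested mappings are merged key by key,
    and any other value (scalar or list) is replaced whole.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Stand-ins for the parsed project-root and local config files.
project = {"llm": {"model": "gpt-5.2", "temperature": 0.0}, "rtc": {"enabled": False}}
local = {"llm": {"temperature": 0.7}}
effective = merge_config(project, local)
```

Under these assumptions, effective keeps llm.model from the project root file and takes llm.temperature from the local file.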
mushroom-agent init also creates two sidecar files next to config.yaml:
| File | Purpose |
|---|---|
| hardware.yaml | Local device inputs and outputs for mushroom-agent start and mushroom-agent node attach |
| remote.yaml | Remote node attachment settings for mushroom-agent node attach |
The shortest local chat path only requires editing config.yaml and setting llm.api_key.
hardware.yaml
hardware.yaml configures local device inputs and outputs for mushroom-agent start and mushroom-agent node attach.
For camera input, hardware.video_input.width, hardware.video_input.height, and hardware.video_input.jpeg_quality define the image size and JPEG compression used by RTC voice sessions and the node video preview. Camera frame sizing and compression are owned by hardware.yaml, not config.yaml.
Camera proximity greeting is configured under hardware.video_input.proximity:
hardware:
  video_input:
    enabled: true
    devices:
      - device_index: 0
        camera_id: "camera_0"
      - device_index: 1
        camera_id: "camera_1"
    proximity:
      enabled: true
      activate_cameras: ["camera_0"]
      sample_interval_ms: 300
      near_confidence: 0.6
      near_consecutive: 2
      far_consecutive: 3
      near_face_ratio: 0.04
      near_exit_ratio_margin: 0.02
      backend: "onnxrt"
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable the best-effort proximity observer |
| activate_cameras | string[] | [] | Optional list of video_input.devices[*].camera_id values used by proximity detection |
| sample_interval_ms | int | 300 | Minimum interval between proximity samples |
| near_confidence | float | 0.6 | Minimum detector confidence required before a face can be treated as near |
| near_consecutive | int | 2 | Number of consecutive near samples required before emitting user_approached_camera |
| far_consecutive | int | 3 | Number of consecutive far samples required before re-arming another approach event |
| near_face_ratio | float | 0.04 | Largest-face area ratio required to enter the near state |
| near_exit_ratio_margin | float | 0.02 | Amount subtracted from near_face_ratio before an already-near user is treated as far |
| backend | string | "onnxrt" | Face inference backend: "onnxrt" or "tensorrt" |
When proximity is enabled, the node samples video frames before normal upload and emits user_approached_camera only after a debounced approach transition. If activate_cameras is empty, proximity uses the same stitched camera frame as the normal video stream. If it contains one or more camera IDs, proximity crops those camera segments from the stitched frame, re-stitches only the selected cameras in video_input.devices order, and runs detection on that temporary sample frame. The normal input_video.append path is unchanged and continues to upload all configured video input cameras.
near_confidence and near_face_ratio must both pass before a sample is treated as near. near_consecutive and far_consecutive control how quickly the runtime enters and exits the near state. Once the user is near, the runtime uses near_exit_ratio_margin to derive the lower exit threshold so slight movement around the boundary does not re-arm greetings. Face detector and embedder models are resolved from ~/.mushroom_agent/models/ or downloaded from the default model repositories when available. If identity recognition is enabled and the transition-time face embedding matches a registered identity with display_name, the greeting system event includes that display name; raw images and raw identity IDs are not included.
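The debounce and hysteresis behavior described above can be sketched as a small state machine. This is not the runtime's implementation: the class name ProximityDebouncer and its update method are hypothetical, and treating a low-confidence sample as far while the user is already near is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ProximityDebouncer:
    # Defaults mirror the documented field defaults.
    near_confidence: float = 0.6
    near_consecutive: int = 2
    far_consecutive: int = 3
    near_face_ratio: float = 0.04
    near_exit_ratio_margin: float = 0.02
    is_near: bool = False
    _near_streak: int = 0
    _far_streak: int = 0

    def update(self, confidence: float, face_ratio: float) -> bool:
        """Feed one sample; return True when an approach event should fire."""
        # Once near, the ratio threshold drops by the exit margin (hysteresis).
        threshold = self.near_face_ratio
        if self.is_near:
            threshold -= self.near_exit_ratio_margin
        sample_near = confidence >= self.near_confidence and face_ratio >= threshold
        if sample_near:
            self._near_streak += 1
            self._far_streak = 0
        else:
            self._far_streak += 1
            self._near_streak = 0
        # Enter the near state only after enough consecutive near samples.
        if not self.is_near and self._near_streak >= self.near_consecutive:
            self.is_near = True
            return True
        # Re-arm only after enough consecutive far samples.
        if self.is_near and self._far_streak >= self.far_consecutive:
            self.is_near = False
        return False


debouncer = ProximityDebouncer()
# (confidence, largest-face area ratio) pairs: approach, hover, leave, approach again.
samples = [(0.9, 0.05), (0.9, 0.05), (0.9, 0.03), (0.9, 0.01),
           (0.9, 0.01), (0.9, 0.01), (0.9, 0.05), (0.9, 0.05)]
events = [debouncer.update(c, r) for c, r in samples]
```

In this sketch the third sample (ratio 0.03) stays near because it is above the lowered exit threshold, so the second event fires only after three far samples have re-armed the detector.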
agent
Agent identity and runtime settings.
agent:
  id: "local-agent"
  name: "LocalAgent"
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | — | Unique agent identifier |
| name | string | — | Display name for the agent |
| ephemeral_system_prompt | string | "" | Per-turn temporary instructions (used by sub-agents) |
| skip_context_files | bool | false | Skip workspace context files (AGENTS.md, SOUL.md, IDENTITY.md) |
| quiet_mode | bool | false | Suppress non-essential output (used by sub-agents) |
llm
LLM provider configuration.
llm:
  api_type: "openai"
  api_key: "sk-..."
  base_url: "https://api.openai.com/v1"
  model: "gpt-5.2"
  temperature: 0.0
  max_completion_tokens: 4096
  stream: false
  timeout: 600
  proxy: ""
| Field | Type | Default | Description |
|---|---|---|---|
| api_type | string | — | Provider type: openai, deepseek, anthropic, gemini, etc. |
| api_key | string | — | API key (also picked up from LLM_API_KEY env var) |
| base_url | string | — | API endpoint URL |
| api_version | string | — | API version (Azure) |
| model | string | — | Model name |
| temperature | float | 0.0 | Sampling temperature |
| top_p | float | 1.0 | Nucleus sampling |
| top_k | int | 0 | Top-k sampling |
| max_completion_tokens | int | 4096 | Maximum output tokens |
| stream | bool | false | Enable streaming responses |
| timeout | int | 600 | Request timeout in seconds |
| proxy | string | — | Proxy URL (overrides global proxy) |
llm.vision only controls native vision enablement, provider detail mode, and the maximum number of images attached to one model turn. It does not expose camera sizing, JPEG quality, image byte limits, or fetch timeout settings.
For the full list of supported providers, see LLM Providers.
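The api_key row notes that the key is also picked up from the LLM_API_KEY environment variable. A minimal sketch of that lookup follows; the function name resolve_llm_api_key is hypothetical, and the precedence shown (config value first, environment second) is an assumption, since this page does not state which source wins.

```python
import os


def resolve_llm_api_key(llm_cfg: dict, env=None) -> str:
    """Return the LLM API key from config, else from LLM_API_KEY.

    Assumption: a non-empty llm.api_key takes precedence over the
    environment variable. The real precedence is not documented here.
    """
    env = os.environ if env is None else env
    key = (llm_cfg.get("api_key") or "").strip()
    if key:
        return key
    key = (env.get("LLM_API_KEY") or "").strip()
    if key:
        return key
    raise ValueError("llm.api_key is not set and LLM_API_KEY is empty")
```

Passing env explicitly, as in resolve_llm_api_key({}, {"LLM_API_KEY": "sk-..."}), keeps the helper easy to test without touching the real environment.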
tools
Tool enablement and configuration.
tools:
  read:
    enabled: true
  exec:
    enabled: true
    allowlist:
      - curl
      - python
      - python3
      - ls
      - rg
      - cat
      - pwd
    default_timeout: 120
    max_output_chars: 80000
| Field | Type | Default | Description |
|---|---|---|---|
| tools.manage_schedule.enabled | bool | true | Enable schedule management tool |
| tools.skip_voice_reply.enabled | bool | true | Enable voice skip-reply tool |
| tools.emit_ui_command.enabled | bool | false | Enable UI command emit tool |
| tools.memory_search.enabled | bool | true | Enable long-term memory search tool |
| tools.update_identity_profile.enabled | bool | false | Enable identity profile update tool |
| tools.read.enabled | bool | true | Enable file read tool |
| tools.exec.enabled | bool | true | Enable shell exec tool |
| tools.exec.allowlist | list | [] | Commands allowed in exec |
| tools.exec.default_timeout | int | 1800 | Default timeout for exec commands (seconds) |
| tools.exec.max_output_chars | int | 200000 | Max output chars captured |
| tools.write.enabled | bool | false | Enable file write tool |
| tools.web_fetch.enabled | bool | false | Enable web fetch tool |
| tools.web_fetch.timeout | int | 20 | Web fetch request timeout (seconds) |
| tools.web_fetch.max_bytes | int | 200000 | Max bytes to fetch |
| tools.api_request.enabled | bool | false | Enable API request tool |
| tools.web_search.enabled | bool | false | Enable web search tool |
| tools.process.enabled | bool | false | Enable long-running process management |
| tools.delegate_task.enabled | bool | false | Enable task delegation (sub-agents) |
| tools.delegate_task.max_depth | int | 2 | Max sub-agent nesting depth |
| tools.delegate_task.max_iterations | int | 20 | Max iterations per sub-agent |
In the local starter created by mushroom-agent init, read.cwd, exec.cwd, and write.cwd are intentionally omitted. These tools fall back to workspace.path automatically.
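The fallback to workspace.path can be pictured with the sketch below. The helper resolve_tool_cwd is hypothetical and operates on an already-parsed config dictionary; how the runtime actually resolves the directory is not described on this page.

```python
from pathlib import Path


def resolve_tool_cwd(config: dict, tool: str) -> Path:
    """Return the working directory for `tool`.

    Uses tools.<tool>.cwd when present, otherwise workspace.path.
    """
    tool_cfg = config.get("tools", {}).get(tool, {})
    cwd = tool_cfg.get("cwd") or config["workspace"]["path"]
    return Path(cwd).expanduser()


# Example config: exec omits cwd, read sets one explicitly.
config = {
    "workspace": {"path": "/absolute/path/to/workspace"},
    "tools": {"exec": {"enabled": True}, "read": {"enabled": True, "cwd": "/tmp/read-root"}},
}
exec_cwd = resolve_tool_cwd(config, "exec")
read_cwd = resolve_tool_cwd(config, "read")
```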
Available tools: read, write, exec, web_fetch, api_request, process, delegate_task, manage_schedule, skip_voice_reply.
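To illustrate what tools.exec.allowlist is meant to express, here is a minimal sketch of an allowlist check. The function is_command_allowed is hypothetical, and two behaviors are assumptions: only the first token of the command is compared, and an empty allowlist permits nothing. The sketch does not handle shell operators such as pipes or semicolons.

```python
import shlex


def is_command_allowed(command: str, allowlist: list[str]) -> bool:
    """Return True when the command's executable is in the allowlist.

    Assumption: the executable must match an allowlist entry exactly,
    so a path such as /tmp/ls does not match the entry "ls".
    """
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected.
        return False
    if not argv:
        return False
    return argv[0] in allowlist


allowlist = ["curl", "python", "python3", "ls", "rg", "cat", "pwd"]
ok = is_command_allowed("rg --files src", allowlist)
blocked = is_command_allowed("rm -rf build", allowlist)
```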
channels
Platform channel configuration.
channels:
  feishu:
    enabled: false
    app_id: ""
    app_secret: ""
    verification_token: ""
    encrypt_key: ""
  discord:
    enabled: false
    bot_token: ""
  livekit:
    enabled: false
    url: ""
    api_key: ""
    api_secret: ""
| Field | Type | Default | Description |
|---|---|---|---|
| feishu.enabled | bool | false | Enable Feishu bot |
| feishu.app_id | string | — | Feishu app ID |
| feishu.app_secret | string | — | Feishu app secret |
| feishu.verification_token | string | — | Feishu verification token |
| feishu.encrypt_key | string | — | Feishu encrypt key |
| discord.enabled | bool | false | Enable Discord bot |
| discord.bot_token | string | — | Discord bot token |
| livekit.enabled | bool | false | Enable LiveKit voice |
| livekit.url | string | — | LiveKit server URL |
| livekit.api_key | string | — | LiveKit API key |
| livekit.api_secret | string | — | LiveKit API secret |
rtc
Realtime voice mode switch and VAD settings.
rtc:
  enabled: false
  voice_mode: "realtime"
  vad_mode: "vad"
  allow_interruptions: true
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable voice-mode preflight and runtime voice configuration |
| voice_mode | string | "realtime" | Voice runtime mode |
| vad_mode | string | "vad" | Voice activity detection mode |
| allow_interruptions | bool | true | Default voice interruption behavior for realtime sessions when clients do not provide session.create.config.allow_interruptions |
When rtc.enabled is false, mushroom-agent serve --ui and normal websocket chat do not require tts.tts_token or realtime_llm.api_key.
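The rule above can be read as a conditional preflight check. The sketch below is illustrative only: missing_voice_credentials is a hypothetical helper, and treating llm.api_key as an acceptable substitute for realtime_llm.api_key during preflight is an assumption based on the fallback noted in the realtime_llm section.

```python
def missing_voice_credentials(config: dict) -> list[str]:
    """List voice credentials that are absent while rtc.enabled is true.

    Returns an empty list when rtc is disabled, since text chat does not
    need voice credentials.
    """
    if not config.get("rtc", {}).get("enabled", False):
        return []
    missing = []
    if not config.get("tts", {}).get("tts_token"):
        missing.append("tts.tts_token")
    # Assumption: the realtime key may fall back to llm.api_key.
    realtime_key = (
        config.get("realtime_llm", {}).get("api_key")
        or config.get("llm", {}).get("api_key")
    )
    if not realtime_key:
        missing.append("realtime_llm.api_key")
    return missing
```

For example, a config with rtc.enabled set to true and no tts or key sections would report both tts.tts_token and realtime_llm.api_key as missing.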
tts
Text-to-speech configuration.
tts:
  provider: "minimax"
  group_id: ""
  tts_url: "wss://api.minimax.io"
  tts_token: "YOUR_TTS_TOKEN"
  voice_id: "Chinese (Mandarin)_Warm_Bestie"
| Field | Type | Default | Description |
|---|---|---|---|
| provider | string | "minimax" | TTS provider |
| group_id | string | — | Provider group ID |
| tts_url | string | — | TTS WebSocket URL |
| tts_token | string | — | TTS auth token |
| voice_id | string | — | Voice identifier |
realtime_llm
Realtime LLM for voice mode.
realtime_llm:
  model: "gpt-realtime"
  api_key: "YOUR_REALTIME_API_KEY"
| Field | Type | Default | Description |
|---|---|---|---|
| model | string | "gpt-realtime" | Realtime model name |
| api_key | string | — | API key for realtime (falls back to llm.api_key) |
skills
Skills configuration.
skills:
  enabled: true
  roots: []
  include_defaults: true
  explicit_only: false
  max_active: 3
  active: []
  allowlist: []
  blocklist: []
  inject_mode: "hybrid"
  summary_max_chars: 280
  content_max_chars: 2000
  refresh_interval: 300
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable skills system |
| roots | list | [] | Additional skill directories beyond workspace.path/skills |
| include_defaults | bool | true | Include workspace skills |
| explicit_only | bool | false | Only load explicitly requested skills |
| max_active | int | 3 | Max concurrently active skills |
| active | list | [] | Always-active skill names |
| allowlist | list | [] | Only allow these skills |
| blocklist | list | [] | Block these skills |
| inject_mode | string | "hybrid" | "summary", "full", or "hybrid" |
| summary_max_chars | int | 280 | Max chars in summary |
| content_max_chars | int | 2000 | Max chars when loading full content |
| refresh_interval | int | 300 | Auto-refresh interval (seconds) |
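One plausible reading of how allowlist and blocklist interact is sketched below. The helper select_skills is hypothetical, and two points are assumptions not confirmed by this page: an empty allowlist means no restriction, and blocklist wins when a skill appears in both lists.

```python
def select_skills(available: list[str], skills_cfg: dict) -> list[str]:
    """Filter discovered skill names by allowlist and blocklist.

    Assumptions: empty allowlist = allow everything; blocklist takes
    precedence over allowlist. Input order is preserved.
    """
    allow = set(skills_cfg.get("allowlist", []))
    block = set(skills_cfg.get("blocklist", []))
    selected = []
    for name in available:
        if allow and name not in allow:
            continue
        if name in block:
            continue
        selected.append(name)
    return selected


# Hypothetical skill names, for illustration only.
discovered = ["weather", "calendar", "notes"]
usable = select_skills(discovered, {"allowlist": [], "blocklist": ["calendar"]})
```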
workspace
Workspace directory settings.
workspace:
  path: "/absolute/path/to/workspace"
| Field | Type | Default | Description |
|---|---|---|---|
| path | string | — | Workspace directory path |
| use_uid | bool | — | Use UID-based path |
The local starter created by mushroom-agent init treats workspace.path as the only required path anchor. Logs, workspace skills/, memory storage, and tool working directories are derived internally from this directory.
memory
Memory and knowledge retrieval.
memory:
  context_top_k: 12
  chat_mp_top_k: 1
  knowledge_top_k: 2
  similar_threshold: 0.01
  knowledge_similar_threshold: 0.1
| Field | Type | Default | Description |
|---|---|---|---|
| context_top_k | int | 12 | Recent context items to retrieve |
| chat_mp_top_k | int | 1 | Chat memory items to retrieve |
| knowledge_top_k | int | 2 | Knowledge items to retrieve |
| similar_threshold | float | 0.01 | Similarity threshold for context |
| knowledge_similar_threshold | float | 0.1 | Similarity threshold for knowledge |
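The interplay between a top_k value and its similarity threshold can be sketched as filter-then-truncate. The function retrieve is hypothetical, and it assumes that a higher score means a closer match and that items scoring exactly at the threshold are kept; the actual scoring and comparison used by the runtime are not documented here.

```python
def retrieve(scored_items: list[tuple[str, float]], top_k: int, threshold: float) -> list[str]:
    """Keep items with score >= threshold, then return the best top_k."""
    kept = [(item, score) for item, score in scored_items if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in kept[:top_k]]


# Hypothetical knowledge candidates with similarity scores.
candidates = [("doc-a", 0.42), ("doc-b", 0.05), ("doc-c", 0.30), ("doc-d", 0.11)]
# Mirrors knowledge_top_k: 2 and knowledge_similar_threshold: 0.1.
knowledge = retrieve(candidates, top_k=2, threshold=0.1)
```

With these inputs, doc-b is dropped by the threshold and doc-d is dropped by the top_k cap.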
embedding
Embedding model configuration.
embedding:
  api_type: "openai"
  api_key: ""
  base_url: "https://api.openai.com/v1"
  model: "text-embedding-3-small"
  dimensions: 1536
| Field | Type | Default | Description |
|---|---|---|---|
| api_type | string | — | openai or dashscope |
| api_key | string | — | API key |
| base_url | string | — | API endpoint URL |
| model | string | — | Embedding model name |
| dimensions | int | 1536 | Embedding vector dimensions |
| embed_batch_size | int | — | Batch size for embedding |
scheduler
Scheduled task settings.
scheduler:
  execution_timeout_seconds: 300
| Field | Type | Default | Description |
|---|---|---|---|
| execution_timeout_seconds | float | 300.0 | Max execution time for scheduled tasks (seconds) |
| one_shot_retry_max_attempts | int | 3 | Max retry attempts for one-shot tasks |
| error_auto_disable_threshold | int | 3 | Consecutive errors before auto-disabling a task |
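The meaning of "consecutive errors" in error_auto_disable_threshold can be illustrated with a small counter. TaskState and record_run are hypothetical names; resetting the counter on any successful run is an assumption implied by the word "consecutive", not a documented behavior.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    enabled: bool = True
    consecutive_errors: int = 0


def record_run(state: TaskState, succeeded: bool, error_auto_disable_threshold: int = 3) -> TaskState:
    """Update the error streak and disable the task once it reaches the threshold."""
    if succeeded:
        # A success breaks the streak.
        state.consecutive_errors = 0
    else:
        state.consecutive_errors += 1
        if state.consecutive_errors >= error_auto_disable_threshold:
            state.enabled = False
    return state


state = TaskState()
# Two failures, a success, then two more failures: the streak never reaches 3.
for ok in [False, False, True, False, False]:
    record_run(state, ok)
still_enabled = state.enabled
# A third consecutive failure crosses the threshold.
record_run(state, False)
```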
heartbeat
Periodic heartbeat for proactive agent checks.
heartbeat:
  enabled: true
  every_seconds: 300
  run_on_startup: false
  active_hours: []
  prompt: ""
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable periodic heartbeats |
| every_seconds | int | 300 | Interval between heartbeats (seconds) |
| run_on_startup | bool | false | Run a heartbeat immediately on startup |
| wake_coalesce_ms | int | 250 | Coalesce window for wake events (ms) |
| wake_retry_ms | int | 1000 | Retry interval for failed wake events (ms) |
| ack_max_chars | int | 300 | Max characters for heartbeat acknowledgment |
| conversation_ttl_seconds | int | 0 | Conversation inactivity TTL (0 = disabled) |
| timezone | string | "" | Timezone for active hours (e.g., "Asia/Shanghai") |
| active_hours | list | [] | Restrict heartbeats to these hours (e.g., ["09:00-18:00"]) |
| prompt | string | — | Custom heartbeat prompt (defaults to built-in) |
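A window such as "09:00-18:00" in active_hours can be checked as sketched below. The helpers in_window and heartbeat_allowed are hypothetical. The sketch assumes that now is already expressed in the configured timezone, that the end of a window is exclusive, that a window whose start is later than its end wraps past midnight, and that an empty list means no restriction.

```python
from datetime import datetime, time


def in_window(current: time, window: str) -> bool:
    """Return True when `current` falls inside an "HH:MM-HH:MM" window."""
    start_text, end_text = window.split("-", 1)
    start = time.fromisoformat(start_text.strip())
    end = time.fromisoformat(end_text.strip())
    if start <= end:
        return start <= current < end
    # Overnight window, e.g. "22:00-06:00".
    return current >= start or current < end


def heartbeat_allowed(now: datetime, active_hours: list[str]) -> bool:
    """Return True when a heartbeat may run at `now` (already in local time)."""
    if not active_hours:
        return True
    return any(in_window(now.time(), window) for window in active_hours)


during_day = heartbeat_allowed(datetime(2026, 5, 19, 10, 30), ["09:00-18:00"])
after_hours = heartbeat_allowed(datetime(2026, 5, 19, 18, 0), ["09:00-18:00"])
```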
server.accesskeys
Accesskeys for agent server and remote node authentication. Use mushroom-agent accesskey to manage this list. The accesskey plaintext is shown only at creation time. Accesskeys do not expire automatically; revoke them with disable or delete.
server:
  accesskeys:
    - id: "akid_example"
      name: "Kitchen Pi"
      accesskey_hash: "<sha256>"
      accesskey_preview: "ak_abcd...wxyz"
      enabled: true
      created_at: "2026-05-19T00:00:00Z"
      intended_node_id: "pi-kitchen"
      description: ""
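The accesskey_hash and accesskey_preview fields suggest that the plaintext key is stored only in hashed form. The sketch below shows one way such an entry could be produced and checked. All three function names are hypothetical, and the details are assumptions: an unsalted SHA-256 hex digest of the UTF-8 plaintext, and a preview made of the first seven and last four characters. Prefer mushroom-agent accesskey over hand-written entries.

```python
import hashlib
import hmac


def hash_accesskey(plaintext: str) -> str:
    """Return the SHA-256 hex digest of the key (assumed storage format)."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def preview_accesskey(plaintext: str) -> str:
    """Return a short preview such as "ak_abcd...wxyz" (assumed format)."""
    return f"{plaintext[:7]}...{plaintext[-4:]}"


def verify_accesskey(presented: str, entry: dict) -> bool:
    """Check a presented key against one server.accesskeys entry."""
    if not entry.get("enabled", False):
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(hash_accesskey(presented), entry["accesskey_hash"])


# A made-up key for illustration; real keys come from the CLI.
example_key = "ak_abcdEFGH1234wxyz"
entry = {
    "id": "akid_example",
    "accesskey_hash": hash_accesskey(example_key),
    "accesskey_preview": preview_accesskey(example_key),
    "enabled": True,
}
```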