Configuration Reference

MushroomAgent's primary agent runtime config lives in ~/.mushroom_agent/config.yaml. This page documents every section and field.

Configuration is loaded in order:

  1. MUSHROOM_ROOT/config/config.yaml (project root)
  2. ~/.mushroom_agent/config.yaml (local overrides)

Local settings override fields with the same name.
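
The override rule can be sketched as a merge of two parsed config mappings. This is a minimal sketch, not MushroomAgent's loader: `merge_config` is a hypothetical helper, and it assumes nested sections are merged field by field rather than replaced wholesale, which this page does not state.

```python
from copy import deepcopy
from typing import Any


def merge_config(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of base with override applied on top of it.

    Assumption: nested mappings are merged key by key; any other value
    (scalar or list) from the override replaces the base value.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Parsed contents of the project-root file (loaded first).
project = {"llm": {"model": "gpt-5.2", "temperature": 0.0}, "rtc": {"enabled": False}}
# Parsed contents of the local file (loaded second, so it wins on conflicts).
local = {"llm": {"api_key": "sk-..."}, "rtc": {"enabled": True}}

effective = merge_config(project, local)
```

Under that assumption, `effective["llm"]` keeps `model` and `temperature` from the project file and gains `api_key` from the local file, while `rtc.enabled` takes the local value.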

mushroom-agent init also creates two sidecar files next to config.yaml:

| File | Purpose |
| --- | --- |
| hardware.yaml | Local device inputs and outputs for mushroom-agent start and mushroom-agent node attach |
| remote.yaml | Remote node attachment settings for mushroom-agent node attach |

The shortest local chat path only requires editing config.yaml and setting llm.api_key.

hardware.yaml

hardware.yaml configures local device inputs and outputs for mushroom-agent start and mushroom-agent node attach.

For camera input, hardware.video_input.width, hardware.video_input.height, and hardware.video_input.jpeg_quality define the image size and JPEG compression used by RTC voice sessions and the node video preview. Camera frame sizing and compression are owned by hardware.yaml, not config.yaml.

Camera proximity greeting is configured under hardware.video_input.proximity:

hardware:
  video_input:
    enabled: true
    devices:
      - device_index: 0
        camera_id: "camera_0"
      - device_index: 1
        camera_id: "camera_1"
    proximity:
      enabled: true
      activate_cameras: ["camera_0"]
      sample_interval_ms: 300
      near_confidence: 0.6
      near_consecutive: 2
      far_consecutive: 3
      near_face_ratio: 0.04
      near_exit_ratio_margin: 0.02
      backend: "onnxrt"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable the best-effort proximity observer |
| activate_cameras | string[] | [] | Optional list of video_input.devices[*].camera_id values used by proximity detection |
| sample_interval_ms | int | 300 | Minimum interval between proximity samples |
| near_confidence | float | 0.6 | Minimum detector confidence required before a face can be treated as near |
| near_consecutive | int | 2 | Number of consecutive near samples required before emitting user_approached_camera |
| far_consecutive | int | 3 | Number of consecutive far samples required before re-arming another approach event |
| near_face_ratio | float | 0.04 | Largest-face area ratio required to enter the near state |
| near_exit_ratio_margin | float | 0.02 | Amount subtracted from near_face_ratio before an already-near user is treated as far |
| backend | string | "onnxrt" | Face inference backend: "onnxrt" or "tensorrt" |

When proximity is enabled, the node samples video frames before normal upload and emits user_approached_camera only after a debounced approach transition. If activate_cameras is empty, proximity uses the same stitched camera frame as the normal video stream. If it contains one or more camera IDs, proximity crops those camera segments from the stitched frame, re-stitches only the selected cameras in video_input.devices order, and runs detection on that temporary sample frame. The normal input_video.append path is unchanged and continues to upload all configured video input cameras.

near_confidence and near_face_ratio must both pass before a sample is treated as near. near_consecutive and far_consecutive control how quickly the runtime enters and exits the near state. Once the user is near, the runtime uses near_exit_ratio_margin to derive the lower exit threshold so slight movement around the boundary does not re-arm greetings. Face detector and embedder models are resolved from ~/.mushroom_agent/models/ or downloaded from the default model repositories when available. If identity recognition is enabled and the transition-time face embedding matches a registered identity with display_name, the greeting system event includes that display name; raw images and raw identity IDs are not included.
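
The debounce and hysteresis behavior described above can be sketched as a small state machine. This is an illustration only, not the runtime's code: `ProximityConfig` and `ProximityDebouncer` are hypothetical names, the use of inclusive (`>=`) comparisons is an assumption, and so is the choice to keep applying near_confidence while the user is already near.

```python
from dataclasses import dataclass


@dataclass
class ProximityConfig:
    # Defaults mirror the documented proximity defaults.
    near_confidence: float = 0.6
    near_consecutive: int = 2
    far_consecutive: int = 3
    near_face_ratio: float = 0.04
    near_exit_ratio_margin: float = 0.02


class ProximityDebouncer:
    """Hypothetical debouncer: emits one approach event per near transition."""

    def __init__(self, cfg: ProximityConfig) -> None:
        self.cfg = cfg
        self.is_near = False
        self._near_streak = 0
        self._far_streak = 0

    def update(self, confidence: float, face_ratio: float) -> bool:
        """Feed one sample; return True when an approach event should fire."""
        cfg = self.cfg
        if not self.is_near:
            # Both thresholds must pass for a sample to count as near.
            near = confidence >= cfg.near_confidence and face_ratio >= cfg.near_face_ratio
            self._near_streak = self._near_streak + 1 if near else 0
            if self._near_streak >= cfg.near_consecutive:
                self.is_near = True
                self._near_streak = 0
                self._far_streak = 0
                return True
            return False
        # Already near: use the lower exit threshold so small movements
        # around the entry boundary do not count as leaving.
        exit_ratio = cfg.near_face_ratio - cfg.near_exit_ratio_margin
        still_near = confidence >= cfg.near_confidence and face_ratio >= exit_ratio
        self._far_streak = 0 if still_near else self._far_streak + 1
        if self._far_streak >= cfg.far_consecutive:
            # Re-arm: the next debounced approach can fire again.
            self.is_near = False
            self._far_streak = 0
        return False
```

With the defaults, two consecutive passing samples fire the event. A face ratio of 0.03 then keeps the user near because it is above the derived exit threshold of 0.02, and three consecutive samples below that threshold re-arm the detector.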

agent

Agent identity and runtime settings.

agent:
  id: "local-agent"
  name: "LocalAgent"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| id | string | | Unique agent identifier |
| name | string | | Display name for the agent |
| ephemeral_system_prompt | string | "" | Per-turn temporary instructions (used by sub-agents) |
| skip_context_files | bool | false | Skip workspace context files (AGENTS.md, SOUL.md, IDENTITY.md) |
| quiet_mode | bool | false | Suppress non-essential output (used by sub-agents) |

llm

LLM provider configuration.

llm:
  api_type: "openai"
  api_key: "sk-..."
  base_url: "https://api.openai.com/v1"
  model: "gpt-5.2"
  temperature: 0.0
  max_completion_tokens: 4096
  stream: false
  timeout: 600
  proxy: ""

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| api_type | string | | Provider type: openai, deepseek, anthropic, gemini, etc. |
| api_key | string | | API key (also picked up from LLM_API_KEY env var) |
| base_url | string | | API endpoint URL |
| api_version | string | | API version (Azure) |
| model | string | | Model name |
| temperature | float | 0.0 | Sampling temperature |
| top_p | float | 1.0 | Nucleus sampling |
| top_k | int | 0 | Top-k sampling |
| max_completion_tokens | int | 4096 | Maximum output tokens |
| stream | bool | false | Enable streaming responses |
| timeout | int | 600 | Request timeout in seconds |
| proxy | string | | Proxy URL (overrides global proxy) |

llm.vision only controls native vision enablement, provider detail mode, and the maximum number of images attached to one model turn. It does not expose camera sizing, JPEG quality, image byte limits, or fetch timeout settings.

For the full list of supported providers, see LLM Providers.

tools

Tool enablement and configuration.

tools:
  read:
    enabled: true
  exec:
    enabled: true
    allowlist:
      - curl
      - python
      - python3
      - ls
      - rg
      - cat
      - pwd
    default_timeout: 120
    max_output_chars: 80000

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| tools.manage_schedule.enabled | bool | true | Enable schedule management tool |
| tools.skip_voice_reply.enabled | bool | true | Enable voice skip-reply tool |
| tools.emit_ui_command.enabled | bool | false | Enable UI command emit tool |
| tools.memory_search.enabled | bool | true | Enable long-term memory search tool |
| tools.update_identity_profile.enabled | bool | false | Enable identity profile update tool |
| tools.read.enabled | bool | true | Enable file read tool |
| tools.exec.enabled | bool | true | Enable shell exec tool |
| tools.exec.allowlist | list | [] | Commands allowed in exec |
| tools.exec.default_timeout | int | 1800 | Default timeout for exec commands (seconds) |
| tools.exec.max_output_chars | int | 200000 | Max output chars captured |
| tools.write.enabled | bool | false | Enable file write tool |
| tools.web_fetch.enabled | bool | false | Enable web fetch tool |
| tools.web_fetch.timeout | int | 20 | Web fetch request timeout (seconds) |
| tools.web_fetch.max_bytes | int | 200000 | Max bytes to fetch |
| tools.api_request.enabled | bool | false | Enable API request tool |
| tools.web_search.enabled | bool | false | Enable web search tool |
| tools.process.enabled | bool | false | Enable long-running process management |
| tools.delegate_task.enabled | bool | false | Enable task delegation (sub-agents) |
| tools.delegate_task.max_depth | int | 2 | Max sub-agent nesting depth |
| tools.delegate_task.max_iterations | int | 20 | Max iterations per sub-agent |
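
To make tools.exec.allowlist concrete, here is one way such a check could work. This page does not describe how the exec tool matches commands, so treat the sketch as an assumption: `is_exec_allowed` is a hypothetical helper that compares only the first token of the command line against the allowlist.

```python
import shlex


def is_exec_allowed(command: str, allowlist: list[str]) -> bool:
    """Return True if the command's executable name is on the allowlist.

    Assumption: only the first token is checked. Pipes and shell operators
    such as && are not analyzed in this sketch.
    """
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    return bool(argv) and argv[0] in allowlist


allowlist = ["curl", "python", "python3", "ls", "rg", "cat", "pwd"]
```

With this allowlist, `is_exec_allowed("ls -la", allowlist)` passes and `is_exec_allowed("rm -rf /tmp/x", allowlist)` does not. A stricter policy would also need to inspect chained commands, which a first-token check cannot see.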

In the local starter created by mushroom-agent init, read.cwd, exec.cwd, and write.cwd are intentionally omitted. These tools fall back to workspace.path automatically.
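
The working-directory fallback can be pictured as follows. `resolve_tool_cwd` is a hypothetical helper written for illustration; the only behavior taken from this page is that a tool without its own cwd uses workspace.path.

```python
from pathlib import Path
from typing import Any, Mapping


def resolve_tool_cwd(config: Mapping[str, Any], tool: str) -> Path:
    """Return the tool's cwd if configured, otherwise workspace.path."""
    tool_cfg = config.get("tools", {}).get(tool, {})
    # A missing or empty cwd falls back to the workspace directory.
    cwd = tool_cfg.get("cwd") or config["workspace"]["path"]
    return Path(cwd)


config = {
    "workspace": {"path": "/absolute/path/to/workspace"},
    "tools": {"exec": {"enabled": True}, "read": {"enabled": True, "cwd": "/data"}},
}
```

Here `resolve_tool_cwd(config, "exec")` yields the workspace path, while `resolve_tool_cwd(config, "read")` yields `/data` because an explicit cwd is set.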

Available tools: read, write, exec, web_fetch, api_request, process, delegate_task, manage_schedule, skip_voice_reply.

channels

Platform channel configuration.

channels:
  feishu:
    enabled: false
    app_id: ""
    app_secret: ""
    verification_token: ""
    encrypt_key: ""
  discord:
    enabled: false
    bot_token: ""
  livekit:
    enabled: false
    url: ""
    api_key: ""
    api_secret: ""

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| feishu.enabled | bool | false | Enable Feishu bot |
| feishu.app_id | string | | Feishu app ID |
| feishu.app_secret | string | | Feishu app secret |
| feishu.verification_token | string | | Feishu verification token |
| feishu.encrypt_key | string | | Feishu encrypt key |
| discord.enabled | bool | false | Enable Discord bot |
| discord.bot_token | string | | Discord bot token |
| livekit.enabled | bool | false | Enable LiveKit voice |
| livekit.url | string | | LiveKit server URL |
| livekit.api_key | string | | LiveKit API key |
| livekit.api_secret | string | | LiveKit API secret |

rtc

Realtime voice mode switch and VAD settings.

rtc:
  enabled: false
  voice_mode: "realtime"
  vad_mode: "vad"
  allow_interruptions: true

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable voice-mode preflight and runtime voice configuration |
| voice_mode | string | "realtime" | Voice runtime mode |
| vad_mode | string | "vad" | Voice activity detection mode |
| allow_interruptions | bool | true | Default voice interruption behavior for realtime sessions when clients do not provide session.create.config.allow_interruptions |

When rtc.enabled is false, mushroom-agent serve --ui and normal websocket chat do not require tts.tts_token or realtime_llm.api_key.
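
A preflight check along these lines would explain why the voice credentials only matter when rtc.enabled is true. This is a sketch under assumptions: `missing_voice_credentials` is a hypothetical function, and it folds in the realtime_llm.api_key fallback to llm.api_key documented in the realtime_llm section.

```python
from typing import Any


def missing_voice_credentials(config: dict[str, Any]) -> list[str]:
    """List the voice credentials that are required but not configured."""
    if not config.get("rtc", {}).get("enabled", False):
        # Voice mode is off, so text chat needs no voice credentials.
        return []
    missing = []
    if not config.get("tts", {}).get("tts_token"):
        missing.append("tts.tts_token")
    # realtime_llm.api_key falls back to llm.api_key when it is not set.
    realtime_key = (
        config.get("realtime_llm", {}).get("api_key")
        or config.get("llm", {}).get("api_key")
    )
    if not realtime_key:
        missing.append("realtime_llm.api_key")
    return missing
```

An empty result means startup can proceed. With `rtc.enabled: false` the result is always empty, regardless of which tokens are present.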

tts

Text-to-speech configuration.

tts:
  provider: "minimax"
  group_id: ""
  tts_url: "wss://api.minimax.io"
  tts_token: "YOUR_TTS_TOKEN"
  voice_id: "Chinese (Mandarin)_Warm_Bestie"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| provider | string | "minimax" | TTS provider |
| group_id | string | | Provider group ID |
| tts_url | string | | TTS WebSocket URL |
| tts_token | string | | TTS auth token |
| voice_id | string | | Voice identifier |

realtime_llm

Realtime LLM for voice mode.

realtime_llm:
  model: "gpt-realtime"
  api_key: "YOUR_REALTIME_API_KEY"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | "gpt-realtime" | Realtime model name |
| api_key | string | | API key for realtime (falls back to llm.api_key) |

skills

Skills configuration.

skills:
  enabled: true
  roots: []
  include_defaults: true
  explicit_only: false
  max_active: 3
  active: []
  allowlist: []
  blocklist: []
  inject_mode: "hybrid"
  summary_max_chars: 280
  content_max_chars: 2000
  refresh_interval: 300

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable skills system |
| roots | list | [] | Additional skill directories beyond workspace.path/skills |
| include_defaults | bool | true | Include workspace skills |
| explicit_only | bool | false | Only load explicitly requested skills |
| max_active | int | 3 | Max concurrently active skills |
| active | list | [] | Always-active skill names |
| allowlist | list | [] | Only allow these skills |
| blocklist | list | [] | Block these skills |
| inject_mode | string | "hybrid" | One of "summary", "full", or "hybrid" |
| summary_max_chars | int | 280 | Max chars in summary |
| content_max_chars | int | 2000 | Max chars when loading full content |
| refresh_interval | int | 300 | Auto-refresh interval (seconds) |
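
How active, allowlist, blocklist, and max_active interact is not spelled out on this page. The sketch below shows one plausible reading: `select_skills` is a hypothetical function, an empty allowlist is assumed to mean "allow everything", always-active skills are assumed to be considered first, and the filters are assumed to apply to them as well. The enabled and explicit_only switches are not modeled.

```python
from typing import Any, Iterable


def select_skills(
    available: set[str], requested: Iterable[str], cfg: dict[str, Any]
) -> list[str]:
    """Pick the skills to activate for a turn, capped at max_active."""
    allow = set(cfg.get("allowlist", []))
    block = set(cfg.get("blocklist", []))

    def permitted(name: str) -> bool:
        if name not in available or name in block:
            return False
        # Assumption: an empty allowlist places no restriction.
        return not allow or name in allow

    selected: list[str] = []
    # Always-active skills are considered before requested ones.
    for name in [*cfg.get("active", []), *requested]:
        if permitted(name) and name not in selected:
            selected.append(name)
    return selected[: cfg.get("max_active", 3)]


available = {"a", "b", "c", "d", "e"}
cfg = {"active": ["a"], "allowlist": [], "blocklist": ["b"], "max_active": 3}
```

Under these assumptions, requesting `["b", "c", "a", "d", "e"]` activates `a`, `c`, and `d`: `b` is blocked, the duplicate `a` is ignored, and `e` is cut by max_active.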

workspace

Workspace directory settings.

workspace:
  path: "/absolute/path/to/workspace"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| path | string | | Workspace directory path |
| use_uid | bool | | Use UID-based path |

The local starter created by mushroom-agent init treats workspace.path as the only required path anchor. Logs, workspace skills/, memory storage, and tool working directories are derived internally from this directory.

memory

Memory and knowledge retrieval.

memory:
  context_top_k: 12
  chat_mp_top_k: 1
  knowledge_top_k: 2
  similar_threshold: 0.01
  knowledge_similar_threshold: 0.1

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| context_top_k | int | 12 | Recent context items to retrieve |
| chat_mp_top_k | int | 1 | Chat memory items to retrieve |
| knowledge_top_k | int | 2 | Knowledge items to retrieve |
| similar_threshold | float | 0.01 | Similarity threshold for context |
| knowledge_similar_threshold | float | 0.1 | Similarity threshold for knowledge |
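
The top-k and threshold pairs are most easily read together. The sketch below assumes that higher scores mean greater similarity and that the threshold is applied before the top-k cut; `top_matches` is a hypothetical helper, not the retrieval code itself.

```python
def top_matches(
    scored: list[tuple[str, float]], top_k: int, threshold: float
) -> list[tuple[str, float]]:
    """Keep items scoring at or above threshold, best first, at most top_k."""
    kept = [(item, score) for item, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Example with the documented knowledge defaults:
# knowledge_top_k = 2, knowledge_similar_threshold = 0.1.
candidates = [("doc-a", 0.5), ("doc-b", 0.05), ("doc-c", 0.2), ("doc-d", 0.3)]
knowledge = top_matches(candidates, top_k=2, threshold=0.1)
```

Here `doc-b` is dropped by the threshold and `doc-c` by the top-k cut, leaving `doc-a` and `doc-d`.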

embedding

Embedding model configuration.

embedding:
  api_type: "openai"
  api_key: ""
  base_url: "https://api.openai.com/v1"
  model: "text-embedding-3-small"
  dimensions: 1536

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| api_type | string | | openai or dashscope |
| api_key | string | | API key |
| base_url | string | | API endpoint URL |
| model | string | | Embedding model name |
| dimensions | int | 1536 | Embedding vector dimensions |
| embed_batch_size | int | | Batch size for embedding |

scheduler

Scheduled task settings.

scheduler:
  execution_timeout_seconds: 300

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| execution_timeout_seconds | float | 300.0 | Max execution time for scheduled tasks (seconds) |
| one_shot_retry_max_attempts | int | 3 | Max retry attempts for one-shot tasks |
| error_auto_disable_threshold | int | 3 | Consecutive errors before auto-disabling a task |
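
The error_auto_disable_threshold field implies a counter of consecutive failures. A minimal sketch, assuming a successful run resets the counter; `TaskState` and `record_run` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    enabled: bool = True
    consecutive_errors: int = 0


def record_run(state: TaskState, succeeded: bool, threshold: int = 3) -> TaskState:
    """Update a task's state after one scheduled run."""
    if succeeded:
        # Assumption: any success clears the consecutive-error streak.
        state.consecutive_errors = 0
        return state
    state.consecutive_errors += 1
    if state.consecutive_errors >= threshold:
        # Threshold reached: the task is disabled until re-enabled.
        state.enabled = False
    return state
```

With the default threshold of 3, two failures followed by a success leave the task enabled, while three failures in a row disable it.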

heartbeat

Periodic heartbeat for proactive agent checks.

heartbeat:
  enabled: true
  every_seconds: 300
  run_on_startup: false
  active_hours: []
  prompt: ""

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable periodic heartbeats |
| every_seconds | int | 300 | Interval between heartbeats (seconds) |
| run_on_startup | bool | false | Run a heartbeat immediately on startup |
| wake_coalesce_ms | int | 250 | Coalesce window for wake events (ms) |
| wake_retry_ms | int | 1000 | Retry interval for failed wake events (ms) |
| ack_max_chars | int | 300 | Max characters for heartbeat acknowledgment |
| conversation_ttl_seconds | int | 0 | Conversation inactivity TTL (0 = disabled) |
| timezone | string | "" | Timezone for active hours (e.g., "Asia/Shanghai") |
| active_hours | list | [] | Restrict heartbeats to these hours (e.g., ["09:00-18:00"]) |
| prompt | string | | Custom heartbeat prompt (defaults to built-in) |
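
The interaction between active_hours and timezone can be sketched as a window check. Several details here are assumptions rather than documented behavior: an empty timezone is treated as UTC, the window end is exclusive, and a window such as "22:00-06:00" is taken to wrap past midnight. `heartbeat_allowed` is a hypothetical function.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo


def parse_window(window: str) -> tuple[time, time]:
    """Parse an "HH:MM-HH:MM" window into start and end times."""
    start, end = window.split("-")
    return time.fromisoformat(start), time.fromisoformat(end)


def heartbeat_allowed(now_utc: datetime, active_hours: list[str], tz_name: str = "") -> bool:
    """Return True if a heartbeat may run at now_utc (a timezone-aware datetime)."""
    if not active_hours:
        # No restriction configured: heartbeats may run at any time.
        return True
    # Assumption: an empty timezone setting means UTC.
    tz = ZoneInfo(tz_name) if tz_name else timezone.utc
    local = now_utc.astimezone(tz).time()
    for window in active_hours:
        start, end = parse_window(window)
        if start <= end:
            if start <= local < end:
                return True
        elif local >= start or local < end:
            # The window wraps past midnight, e.g. "22:00-06:00".
            return True
    return False
```

For example, with `active_hours: ["09:00-18:00"]` and no timezone, a heartbeat at 10:00 UTC is allowed and one at 20:00 UTC is skipped.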

server.accesskeys

Accesskeys used to authenticate against the agent server and remote nodes. Use mushroom-agent accesskey to manage this list; the accesskey plaintext is shown only once, at creation. Accesskeys do not expire automatically; revoke them with disable or delete.

server:
  accesskeys:
    - id: "akid_example"
      name: "Kitchen Pi"
      accesskey_hash: "<sha256>"
      accesskey_preview: "ak_abcd...wxyz"
      enabled: true
      created_at: "2026-05-19T00:00:00Z"
      intended_node_id: "pi-kitchen"
      description: ""
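
Because only a hash and a preview are stored, verification has to hash the presented key and compare digests. The sketch below is an assumption about how that could work, not the server's implementation: it assumes accesskey_hash is an unsalted hex SHA-256 digest of the plaintext, that the preview keeps the first seven and last four characters, and that disabled entries are rejected. All three function names are hypothetical.

```python
import hashlib
import hmac
from typing import Any, Optional


def hash_accesskey(plaintext: str) -> str:
    """Hex SHA-256 of the plaintext (assumed format of accesskey_hash)."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def preview_accesskey(plaintext: str) -> str:
    """Shortened display form, e.g. ak_abcd...wxyz (assumed layout)."""
    return f"{plaintext[:7]}...{plaintext[-4:]}"


def find_accesskey(presented: str, records: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Return the enabled record matching the presented key, if any."""
    digest = hash_accesskey(presented)
    for record in records:
        # compare_digest avoids leaking match position through timing.
        if record.get("enabled", False) and hmac.compare_digest(digest, record["accesskey_hash"]):
            return record
    return None


secret = "ak_abcd0123456789wxyz"  # made-up key for illustration
records = [
    {"id": "akid_example", "accesskey_hash": hash_accesskey(secret), "enabled": True},
]
```

With these records, `find_accesskey(secret, records)` returns the `akid_example` entry, and setting `enabled` to false on that entry makes the same key fail.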