Restart Guard
Purpose / 目标
Safely restart gateway while preserving context and guaranteeing a post-restart report path to the user session.
安全重启网关,保留上下文,并保证重启后可主动回报到用户会话。
Trigger / 触发条件
Use this skill when the task involves OpenClaw gateway restart, watchdog recovery, or post-restart reporting.
当任务涉及 OpenClaw 网关重启、看门狗恢复、重启后回报时使用。
Natural-language triggers (must auto-run, do not ask user for script commands):
"可以重启了"
"现在重启吧"
"restart now"
"go ahead and restart"
自然语言触发(必须自动执行,不让用户手工跑脚本):
“可以重启了”
“现在重启吧”
“restart now”
“go ahead and restart”
Required Preconditions / 前置条件
openclawCLI is available.Restart config exists (
config.example.yamlorconfig/restart-guard.yaml.examplecopied to runtime path).Agent can execute shell commands.
openclawCLI 可用。重启配置文件存在(从示例拷贝到运行路径)。
agent 具备执行命令能力。
Workflow / 标准流程
0) Default behavior / 默认行为
When user expresses restart intent without specifying channel details:
Run full flow automatically via
scripts/auto_restart.py.Default
--notify-mode origin.Infer origin session key automatically (env/context/sessions), no user input required.
Auto-discover external channels and persist
effective_notify_plan.Before trigger, proactively announce disaster delivery route/channels to origin session.
After restart event arrives, net summarizes result to user.
当用户仅表达重启意图且未指定渠道细节时:
使用
scripts/auto_restart.py自动执行全流程。默认
--notify-mode origin。自动推断源会话 key(env/context/sessions),无需用户补参数。
自动发现外部渠道并写入
effective_notify_plan。触发前先在源会话预告灾难通知路由与渠道。
收到重启事件后,由 net 向用户汇总结果。
1) Discover channels and mode / 发现渠道与模式(可选)
python3 <skill-dir>/scripts/discover_channels.py --config <config-path> --json
Ask user:
notify mode (
originrecommended, orselected,all)selected channel/target if needed
询问用户:
通知模式(推荐
origin,可选selected、all)若需要,指定渠道与目标
2) Write context / 写入现场
python3 <skill-dir>/scripts/write_context.py \
--config <config-path> \
--reason "config change" \
--verify 'openclaw health --json' 'ok' \
--resume "report restart result to user"
3) Execute restart / 执行重启
Recommended one-command entry:
python3 <skill-dir>/scripts/auto_restart.py \
--config <config-path> \
--reason "config change" \
--notify-mode origin
推荐单命令入口:
python3 <skill-dir>/scripts/auto_restart.py \
--config <config-path> \
--reason "配置变更" \
--notify-mode origin
python3 <skill-dir>/scripts/restart.py \
--config <config-path> \
--reason "config change" \
--notify-mode origin \
--origin-session-key <session-key>
Selected channel mode:
python3 <skill-dir>/scripts/restart.py \
--config <config-path> \
--reason "config change" \
--notify-mode selected \
--channel telegram \
--target 726647436
4) Postcheck / 事后校验
python3 <skill-dir>/scripts/postcheck.py --config <config-path>
Contract / 契约
Event contract:
restart_guard.result.v1Required fields:
status,restart_idContext adds:
restart_idorigin_session_keynotify_modechannel_selectioneffective_notify_planstate_timestampsdiagnostics_filedelivery_status
Optional event fields:
severityfailure_phaseerror_codedelivery_attemptsdelivery_routedelivery_exhausteddiagnostics_file
Notes / 注意事项
webuiis not treated as disabled notification anymore; origin-session ACK is primary path.webui不再视为禁用通知;主路径是回发到发起会话。Verify/diagnostics commands run in strict non-shell mode.
校验/诊断命令以严格非 shell 模式执行(包含管道等 shell 元字符会被拒绝)。
For implementation-level replication details, see
ENHANCED_RESTART_IMPLEMENTATION_SPEC.md.若需按工程级标准复刻实现,请参考
ENHANCED_RESTART_IMPLEMENTATION_SPEC.md。Do not expose internal scripts/steps unless user explicitly asks for internals.
除非用户明确要求细节,否则不要向用户暴露内部脚本步骤。
Guardian uses strict success invariant:
down_detected && start_attempted && up_healthy
Guardian success requires strict invariant:
down_detected && start_attempted && up_healthy
Failure Handling / 故障处理
On timeout/failure, guardian writes local diagnostics file (
restart-diagnostics-<restart_id>.md/json), sends concise summary, and retries delivery within budget.若超时或失败,guardian 会写本地诊断文件(
restart-diagnostics-<restart_id>.md/json),发送简要摘要,并在预算内重试送达。Fixed disaster route:
origin session -> agent:main:main -> all discovered external channels.固定灾难路由:
源会话 -> agent:main:main -> 所有已发现外部渠道。Guardian exits after successful delivery or budget exhaustion; no long-lived watchdog process after disaster handling.
灾难处理结束后(送达成功或预算耗尽)guardian 必须退出,不长期驻留。