Local tool routing Experimental

The big feature. The remote Hermes agent can read, write, search, execute, capture, paste, and edit on your machine — not the server — through the same WSS relay it uses for chat. The agent's brain and conversation state stay on the host; your laptop is the hands.

What the agent can do

Tools are registered in the desktop toolset. The agent sees them as normal tools alongside its usual ones — no special syntax needed, just "read my notes" or "run tsc --noEmit".

| Tool | Signature | Example use or notes |
| --- | --- | --- |
| `desktop_read_file` | `(path: string, max_bytes?: number)` | "Read my notes.md and summarize." |
| `desktop_write_file` | `(path: string, content: string, create_dirs?: boolean)` | "Write a quick-start guide to `~/Desktop/quickstart.md`." |
| `desktop_patch` | `(path: string, patch: string)` | Apply a unified diff. Strict: no fuzzy matching. Interactive approval prompt in `shell`/`chat` mode. |
| `desktop_terminal` | `(command: string, cwd?: string, timeout?: number)` | "Run `tsc --noEmit` and tell me what's broken." |
| `desktop_search_files` | `(pattern: string, cwd?: string, max_results?: number, content?: boolean)` | "Find every file mentioning `DesktopToolRouter`." ripgrep with pure-Node fallback; skips `.git` / `node_modules` / `dist` / `.next` / `.cache`. |
| `desktop_clipboard_read` | `()` | Read the user's system clipboard. Windows / macOS / Linux (Wayland-first). |
| `desktop_clipboard_write` | `(text: string)` | Write text to the system clipboard. |
| `desktop_screenshot` | `(display?: number \| string, save_to?: string)` | Capture all monitors (default), primary (`'primary'`), or a specific display (`1` / `2` / ...). Returns base64 + dimensions, or saves to `save_to` and returns the path. |
| `desktop_open_in_editor` | `(path: string, line?: number, col?: number, wait?: boolean)` | Open a file in the user's editor. Detects `$VISUAL` → `$EDITOR` → `code` / `cursor` / `subl` / `nvim` / `vim` on PATH → platform fallback. Injects `-g path:line:col` for GUI editors. |

All calls run under a 30-second AbortController ceiling enforced by the router. desktop_terminal accepts a per-call timeout (in seconds per the wire spec, converted to ms internally) that is clamped to a 10-minute maximum. desktop_screenshot has its own 10 s timeout and 50 MB cap; the desktop_clipboard_* tools have a 5 s timeout and a 10 MB cap.
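
The seconds-to-milliseconds conversion and the 10-minute clamp for desktop_terminal can be sketched as follows. This is a minimal illustration, not the router's actual code: the constant names and `resolveTerminalTimeoutMs` are hypothetical, and falling back to the 30-second ceiling when no valid timeout is supplied is an assumption.

```typescript
// Hypothetical constants mirroring the limits described in this section.
const DEFAULT_CEILING_MS = 30 * 1000;      // router-wide 30 s ceiling
const TERMINAL_MAX_MS = 10 * 60 * 1000;    // 10-minute clamp for desktop_terminal

/**
 * Resolve the effective timeout in milliseconds for a desktop_terminal call.
 * `timeoutSeconds` is the wire value (seconds). Missing, non-finite, or
 * non-positive values fall back to the default ceiling (an assumption).
 */
function resolveTerminalTimeoutMs(timeoutSeconds?: number): number {
  if (
    timeoutSeconds === undefined ||
    !Number.isFinite(timeoutSeconds) ||
    timeoutSeconds <= 0
  ) {
    return DEFAULT_CEILING_MS;
  }
  // Convert seconds to ms, then clamp to the 10-minute maximum.
  return Math.min(Math.round(timeoutSeconds * 1000), TERMINAL_MAX_MS);
}
```

With this sketch, a request for a one-hour timeout is silently reduced to ten minutes rather than rejected.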

The router heartbeats desktop.status every 30 s, advertising the full handler-name list, so the server's desktop channel knows which tools your client can service. Servers ping /desktop/_ping?tool=<name> to fail fast when a tool isn't advertised.

How it works

1. Pair and connect via hermes-relay (a bare invocation defaults to shell/TUI mode) or hermes-relay chat.
2. On connect, the CLI's DesktopToolRouter attaches to the relay's desktop channel and heartbeats every 30 s with the list of advertised tools.
3. Hermes's Python-side desktop_tool.py handlers register with tools.registry (same pattern as android_tool.py), so the agent sees desktop_read_file as just another tool.
4. When the agent calls a desktop_* tool, the Python handler HTTP-POSTs to localhost:8767/desktop/<tool_name> on the host.
5. The relay's desktop channel forwards the call over WSS to the connected CLI.
6. The CLI's DesktopToolRouter dispatches to an in-process handler (fs.ts, terminal.ts, search.ts, clipboard.ts, screenshot.ts, editor.ts).
7. The handler runs on your machine and returns the result; the response bubbles back: CLI → relay → Python → Hermes → agent.

Typical round-trip: 60–100 ms for a simple command.
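
The client-side dispatch step can be pictured with a small name-to-handler registry. This is only a sketch under stated assumptions: `SketchRouter`, `ToolArgs`, and `ToolResult` are hypothetical names, and the real wire envelope and error format are not documented here.

```typescript
// Hypothetical shapes; the real wire envelope is not specified in this page.
type ToolArgs = Record<string, unknown>;
type ToolResult = { ok: true; result: unknown } | { ok: false; error: string };
type Handler = (args: ToolArgs) => Promise<unknown>;

class SketchRouter {
  private readonly handlers = new Map<string, Handler>();

  register(name: string, handler: Handler): void {
    this.handlers.set(name, handler);
  }

  // Sorted handler names, i.e. the list a heartbeat would advertise.
  advertisedTools(): string[] {
    return Array.from(this.handlers.keys()).sort();
  }

  // Look up the handler by tool name and wrap its outcome in a result object.
  async dispatch(name: string, args: ToolArgs): Promise<ToolResult> {
    const handler = this.handlers.get(name);
    if (handler === undefined) {
      return { ok: false, error: `tool not advertised: ${name}` };
    }
    try {
      return { ok: true, result: await handler(args) };
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}
```

Returning a structured error instead of throwing keeps a failing handler from taking down the connection that carries every other tool call.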

No hermes-agent core changes. It's the same pattern the Android client uses for android_tap / android_screenshot / etc. — just swapping the bridge endpoint for a desktop one.

`desktop_open_in_editor` and interactive patches

In shell / chat modes (interactive TTY, not daemon, not piped stdin), the router carries an interactive: true flag. Two handlers use it:

- desktop_open_in_editor launches the user's editor with the file at the requested line/col. Useful for "open this for me to review" agent flows.
- desktop_patch renders agent-proposed patches as ANSI-colored unified diffs (green/red/cyan, NO_COLOR/isTTY aware) on stderr, then prompts:

  ```
  Apply patch? [y]es / [n]o / [e]dit / [r]edraw  ›
  ```

  - y: apply the patch (strict, no fuzz).
  - n: reject; the agent gets a structured error.
  - e: open the patch in $EDITOR and re-read it on close (so you can hand-tweak before applying).
  - r: redraw the diff (in case it scrolled out).

In non-interactive modes (daemon, piped stdin), desktop_patch auto-rejects with a structured reason. The daemon never silently applies an agent-proposed edit.
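
The approval flow reduces to a small decision function. The sketch below is illustrative only: `decidePatch` and `PatchDecision` are hypothetical names, and both accepting the full words ("yes", "no", ...) and treating unrecognized input as a rejection are assumptions made here to keep the example fail-closed.

```typescript
type PatchDecision = "apply" | "reject" | "edit" | "redraw";

/**
 * Map a raw prompt answer to a decision.
 * Non-interactive sessions never prompt and always reject.
 * Unrecognized input is rejected (assumption: fail closed).
 */
function decidePatch(interactive: boolean, answer: string): PatchDecision {
  if (!interactive) return "reject";
  switch (answer.trim().toLowerCase()) {
    case "y":
    case "yes":
      return "apply";
    case "e":
    case "edit":
      return "edit";
    case "r":
    case "redraw":
      return "redraw";
    default:
      return "reject";
  }
}
```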

Native paste pipeline (alpha.13/14)

The Ctrl+A v chord and the chat REPL's /paste command share the same plumbing:

1. The client reads its own clipboard via captureClipboardImage(). Windows uses PowerShell with the -STA flag (alpha.10 fixed an MTA bug that returned null on a populated clipboard); macOS uses pngpaste; Linux uses wl-paste --type image/png with an xclip fallback.
2. It validates magic bytes (PNG 89 50 4E 47 / JPEG FF D8 FF / WEBP RIFF....WEBP) to prevent content-type laundering.
3. It POSTs the bytes to /clipboard/inbox on the relay (the new shared stageClipboardImageToInbox(url, token) helper).
4. In Ctrl+A v mode: it types /paste\r into the PTY so the upstream Hermes TUI consumes it.
5. In /paste mode: it stages the image with the server via the image.attach.bytes RPC; the next prompt.submit ships with the image attached.
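
The magic-byte check in the pipeline above can be sketched like this. The function names are hypothetical and the real validator may inspect more of the header; the signatures themselves are the ones listed in the text, with the WEBP marker sitting at byte offset 8 after the 4-byte RIFF size field.

```typescript
type ImageKind = "png" | "jpeg" | "webp";

// True when `bytes` contains `sig` starting at `offset`.
function hasSignature(bytes: Uint8Array, sig: number[], offset: number = 0): boolean {
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => bytes[offset + i] === b);
}

// Identify an image by its leading bytes; returns null for anything else.
function sniffImageKind(bytes: Uint8Array): ImageKind | null {
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47])) return "png";
  if (hasSignature(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  // WEBP: "RIFF" at offset 0, a 4-byte size field, then "WEBP" at offset 8.
  if (
    hasSignature(bytes, [0x52, 0x49, 0x46, 0x46]) &&
    hasSignature(bytes, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return "webp";
  }
  return null;
}
```

Checking the bytes rather than trusting a declared content type is what blocks a non-image payload from being uploaded under an image label.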

Server-side, the fork's _enrich_with_attached_images pipeline handles multimodal payload plumbing and session-scoped image state — same path a local Hermes paste takes.

Dragging a file from Explorer and dropping it onto Windows Terminal also works for image attach (the server's input.detect_drop recognizes the dropped path).

Consent gate

On your first shell or chat session per relay URL with tools enabled, you'll see a prompt:

```
Desktop tools are about to be exposed to the remote Hermes agent.
The agent can read/write files, run shell commands, and search your filesystem.
This is AGENT-CONTROLLED access. Only use with trusted Hermes installs.
Type 'yes' to enable, or rerun with --no-tools to disable.
>
```

Only yes (case-insensitive) enables. Anything else (y, no, Enter, Ctrl+C) denies.
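
The acceptance rule is strict enough to fit in one function. This is a sketch, not the CLI's code: `isConsentAnswer` is a hypothetical name, `null` stands in for an aborted prompt (Ctrl+C or EOF), and trimming surrounding whitespace is an assumption.

```typescript
/**
 * Return true only for the literal word "yes", in any letter case.
 * `null` models an aborted prompt (Ctrl+C / EOF) and always denies.
 */
function isConsentAnswer(input: string | null): boolean {
  if (input === null) return false;
  return input.trim().toLowerCase() === "yes";
}
```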

Consent is stored per-URL in ~/.hermes/remote-sessions.json as toolsConsented: true and sticks across sessions. You won't be asked again for this relay until the URL changes or you wipe the session.

Kill-switches:

- --no-tools on any subcommand suppresses the router entirely for that invocation.
- Non-TTY stdin (e.g. piped invocations) fails closed and never auto-consents.
- Delete the session record (or set toolsConsented: false in the file) to force a re-prompt.
- daemon mode fails closed unless toolsConsented: true is already on the record. The --allow-tools flag (only valid alongside --token) is the explicit-trust escape hatch for service-managed installs.
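
Taken together, the consent record and the kill-switches form a small decision table. The sketch below is one possible reading, not the CLI's implementation: `ToolGateInput`, `ToolGate`, and `resolveToolGate` are hypothetical, and the precedence order (with --no-tools winning over everything, and --allow-tools honored only in daemon mode) is an assumption.

```typescript
interface ToolGateInput {
  mode: "shell" | "chat" | "daemon";
  noTools: boolean;        // --no-tools was passed
  allowTools: boolean;     // --allow-tools was passed
  hasToken: boolean;       // --token was passed
  stdinIsTTY: boolean;     // interactive terminal on stdin
  toolsConsented: boolean; // stored per-URL consent
}

type ToolGate = "enabled" | "disabled" | "prompt";

function resolveToolGate(input: ToolGateInput): ToolGate {
  // --no-tools suppresses the router for this invocation, regardless of consent.
  if (input.noTools) return "disabled";
  // Stored consent sticks across sessions.
  if (input.toolsConsented) return "enabled";
  // Daemon mode cannot prompt: only the explicit-trust flag pair enables tools.
  if (input.mode === "daemon") {
    return input.allowTools && input.hasToken ? "enabled" : "disabled";
  }
  // Interactive modes prompt on a TTY and fail closed otherwise.
  return input.stdinIsTTY ? "prompt" : "disabled";
}
```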

Safety walls

The desktop tools run in-process on your machine with your full user privileges. That's a real risk — a compromised relay or a misaligned agent could ask to rm -rf /, exfiltrate tokens, or rewrite your .ssh/config. The walls:

- Consent per-URL, not per-run. Once you say yes to ws://hermes.example.com, the agent on THAT server has persistent tool access. A different URL re-prompts.
- No sudo / privilege escalation. All tools inherit your shell's environment. desktop_terminal "sudo rm -rf /" requires a passwordless sudo configuration to succeed, and we're not adding one.
- Per-call AbortController ceiling. Each tool call has a hard stop at 30 seconds. A long-running compromise would trip this.
- Handler implementations are defensive:
  - desktop_read_file caps at max_bytes (default 1 MB) and truncates with a marker.
  - desktop_write_file refuses to create parent dirs unless create_dirs: true is set.
  - desktop_patch is strict: any hunk mismatch aborts the whole patch. No fuzzy matching. Better to fail than to corrupt. Interactive approval in shell/chat mode; auto-rejects in daemon/non-interactive.
  - desktop_terminal uses bash -lc on POSIX, cmd /c on Windows. There is no shell injection beyond what the command itself carries (it IS the command).
  - desktop_search_files skips .git / node_modules / dist / .next / .cache by default.
  - desktop_clipboard_* capped at 10 MB / 5 s timeout in either direction.
  - desktop_screenshot capped at 50 MB / 10 s timeout; cleans up tempfiles when not saving to a user-supplied path.
- No stdin. desktop_terminal pipes /dev/null to the child, so a command that reads stdin sees end-of-file immediately rather than blocking the handler.
- SIGKILL on abort/timeout. No chance for a signal handler to trap and keep running.
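
Two of the defensive limits listed above, the read cap and the search skip-list, are simple enough to sketch. The names here are hypothetical and the real handlers may differ; in particular, matching the skip-list against every path segment (so a plain file named dist would also be skipped) is an assumption, and the truncation marker is left to the caller.

```typescript
const DEFAULT_MAX_BYTES = 1024 * 1024; // 1 MB default read cap
const SKIPPED_DIRS = new Set([".git", "node_modules", "dist", ".next", ".cache"]);

// Cap a buffer at maxBytes and report whether anything was cut off,
// so the caller can append a truncation marker.
function capBytes(
  data: Uint8Array,
  maxBytes: number = DEFAULT_MAX_BYTES,
): { data: Uint8Array; truncated: boolean } {
  if (data.length <= maxBytes) return { data, truncated: false };
  return { data: data.subarray(0, maxBytes), truncated: true };
}

// True when any segment of the path is one of the skipped directory names.
// Accepts both POSIX and Windows separators.
function isSkippedPath(relativePath: string): boolean {
  return relativePath.split(/[\\/]+/).some((segment) => SKIPPED_DIRS.has(segment));
}
```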

What we DON'T have yet (v1.0 targets):

- Command allowlist / blocklist per session.
- Destructive-verb confirmation modal (like the Android bridge's send_sms/call prompts).
- Per-tool sandbox (e.g., restrict desktop_read_file to a project root).
- Code signing (hermes-relay binary is currently unsigned).

Diagnosing routing

If the agent says "desktop_terminal is not available" or calls time out immediately:

```bash
# On the server, verify the channel sees your client.
# The remote command is single-quoted so the URL's "?" reaches curl intact.
ssh bailey@<host> 'curl -s "http://127.0.0.1:8767/desktop/_ping?tool=desktop_terminal"'
```

Expected:

```json
{
  "connected": true,
  "advertised_tools": [
    "desktop_clipboard_read",
    "desktop_clipboard_write",
    "desktop_open_in_editor",
    "desktop_patch",
    "desktop_read_file",
    "desktop_screenshot",
    "desktop_search_files",
    "desktop_terminal",
    "desktop_write_file"
  ],
  "client_status": { ... },
  "last_seen_at": 1776964298.02,
  "pending_commands": 0
}
```

If connected: false:

- No active shell/chat/daemon session is connected. Start one.
- --no-tools was used. Retry without it.
- Consent was denied. Delete the session record or re-pair.

If connected: true but the agent still says the tool is missing:

- The toolset isn't enabled for this Hermes session. Inside the shell, ask Hermes: "enable the desktop toolset for this session." Or add it to your Hermes config's default enabled toolsets.
- The plugin wasn't loaded on the gateway. See the hermes-relay-self-setup skill: plugins.enabled in ~/.hermes/config.yaml must include hermes-relay.
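
The triage above can be expressed as a function over the ping response. This is a sketch: the field names come from the sample response shown earlier, while `PingResponse`, `diagnosePing`, and the returned labels are hypothetical.

```typescript
// Subset of the _ping response fields used for triage.
interface PingResponse {
  connected: boolean;
  advertised_tools: string[];
  pending_commands: number;
}

type PingDiagnosis = "no-client" | "tool-not-advertised" | "client-ok";

function diagnosePing(ping: PingResponse, tool: string): PingDiagnosis {
  // No session connected, --no-tools was used, or consent was denied.
  if (!ping.connected) return "no-client";
  // Client is connected but does not service this tool.
  if (ping.advertised_tools.indexOf(tool) === -1) return "tool-not-advertised";
  // The client side is fine; look at the Hermes toolset or plugin config next.
  return "client-ok";
}
```

A "client-ok" result points the investigation at the server side: the toolset not being enabled for the session, or the plugin not being loaded on the gateway.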

Daemon mode — tools without an open shell

hermes-relay daemon runs the WSS connection + tool router headless, so the agent can reach your machine while you're in another window or VS Code or off making coffee. See Subcommands → daemon for full lifecycle/log details.

Service installers (Windows sc.exe / systemd user unit / launchd plist) ship with v1.0; until then, run from a terminal tab or wrap with your service manager of choice. Tracked in ROADMAP.md.

Related

- Pairing: you must pair before tools work.
- Subcommands: --no-tools flag, daemon mode, tools.list introspection.
- Troubleshooting: common tool routing errors.
