Features
Hermes-Relay ships in two flavors — Google Play (easy install, conservative scope) and Sideload (manual install, full feature set). Most chat, voice, and session features are identical across both. The bridge channel — where the agent reads or controls your phone — is where the two tracks diverge.
A Sideload only badge on a feature row means the feature is compiled out of the Google Play APK and is only available in the sideload build. See the Release tracks page for the full breakdown and a decision guide.
Chat & Communication
| Feature | Description |
|---|---|
| Direct API Connection | HTTP/SSE streaming to Hermes API Server |
| Voice Mode | Real-time voice conversation — sphere listens, agent speaks back via your server's configured TTS/STT providers |
| Markdown Rendering | Full markdown with syntax-highlighted code blocks |
| Reasoning Display | Collapsible extended-thinking blocks |
| Personalities | Dynamic from GET /api/config — picker, agent name on bubbles |
| Command Palette | Searchable command browser — 29 gateway commands, personalities, 90+ skills |
| Slash Commands | Inline autocomplete as you type / |
| QR Code Pairing | Scan hermes-pair QR to auto-configure connection |
| Token Tracking | Per-message usage and cost |
| Tool Progress | Configurable display — Off, Compact, or Detailed |
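The Direct API Connection row refers to HTTP/SSE streaming. As a rough illustration of what a client does with such a stream, the sketch below groups raw server-sent-event lines into event payloads following the generic SSE framing rules (`data:` fields, blank-line terminators, `:` comments). The function name `iter_sse_data` is hypothetical, and the payload format Hermes actually sends is not described on this page, so the sample strings are placeholders.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event.

    Generic SSE framing only; this is not the Hermes client code.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line, commonly used as a keep-alive ping.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One optional leading space after the colon is stripped.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    # An event that never received its terminating blank line is dropped.


# Placeholder stream: two events separated by a keep-alive comment.
sample = ["data: Hel\n", "\n", ": ping\n", "data: lo\n", "\n"]
chunks = list(iter_sse_data(sample))
```

A streaming chat view would append each yielded chunk to the message bubble as it arrives instead of collecting them into a list.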
Bridge — Phone Control
Reading the screen
| Feature | Description | Track |
|---|---|---|
| Read what's on screen | Agent sees the active window so it can answer "what does this say?" | Both |
| Multi-window read | Sees system overlays, popups, and the notification shade — not just the foregrounded app | Both |
| Filtered node search | "Find every clickable labelled 'Save'" — precise accessibility-tree queries instead of guessing | Both |
| Per-node property lookups | Stable node IDs that can be handed back to tap/scroll, plus full property bags for resolution | Both |
| Screen change detection | Cheap screen-hash + diff so the agent can wait for a screen to actually change without polling | Both |
| Real-time UI event stream | Live accessibility-event stream for "wait until this loads" and "notice when a dialog opens" waits | Both |
| Notification triage | Agent reads incoming notifications and summarizes them for you | Both |
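To make the "screen-hash + diff" idea concrete, here is a minimal sketch under assumed data shapes. The `Node` fields, `screen_hash`, and `screen_diff` are hypothetical names chosen for illustration; the page does not say which node properties the app actually hashes.

```python
import hashlib
from typing import Iterable, NamedTuple


class Node(NamedTuple):
    # Assumed minimal view of an accessibility node.
    node_id: str
    text: str
    bounds: tuple  # (left, top, right, bottom)


def screen_hash(nodes: Iterable[Node]) -> str:
    """Hash the node set so that traversal order does not affect the result."""
    digest = hashlib.sha256()
    for node in sorted(nodes):
        digest.update(repr(node).encode("utf-8"))
    return digest.hexdigest()


def screen_diff(before: Iterable[Node], after: Iterable[Node]) -> dict:
    """Report which nodes appeared or disappeared between two snapshots."""
    old, new = set(before), set(after)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


loading = [Node("n1", "Loading...", (0, 0, 100, 40))]
loaded = [
    Node("n1", "Inbox", (0, 0, 100, 40)),
    Node("n2", "Compose", (0, 50, 100, 90)),
]
changed = screen_hash(loading) != screen_hash(loaded)
delta = screen_diff(loading, loaded)
```

Comparing two short digests is cheap, so the full diff only needs to be computed once the hashes disagree.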
Acting on the phone
| Feature | Description | Track |
|---|---|---|
| Tap, type, swipe, scroll | Core UI actions with destructive-verb confirmation | Both * |
| Long-press | Context menus, text selection, widget rearranging — by coordinate or node ID | Both * |
| Drag | Rearrange icons, pull notification shade, drag map pins — point A → point B over a duration | Both * |
| Smarter tap fallbacks | Three-tier cascade handles apps that wrap labels in non-clickable parents | Both * |
| Clipboard bridge | Read and write the system clipboard from the agent side | Both * |
| System media control | Play / pause / next / previous / volume on whichever app is playing | Both * |
| Macro batching | Run a sequence of actions as one workflow without a round-trip per step | Both * |
| Raw Intent escape hatch | Send a direct Android Intent or broadcast for apps that expose deep-link actions | Both * |
| Gesture reliability under idle | Short-lived wake-lock keeps gestures landing on dim/idle screens | Both * |
| Per-app playbooks | Bundled android skill with reusable flows for common apps | Both |
| Voice → bridge intent routing | "Text Sam I'll be 10 min late" — fully hands-free | Sideload only |
| Vision-driven navigation (android_navigate) | Agent looks at the screen and figures out what to tap on its own | Sideload only |
| Workflow recording | Show the agent something once, ask it to repeat the workflow later | Sideload only |
Phone utilities (Sideload only)
| Feature | Description |
|---|---|
| Direct SMS | Send text messages via SmsManager with send-result confirmation — no dialer bounce |
| Contact search | Look up a phone number by contact name for voice intents like "text Mom" |
| One-tap dialing | Place a call directly, with a dialer-opener fallback on Google Play |
| Location awareness | GPS last-known-location read for "where am I?" and location-scoped commands |
* On Google Play, the accessibility service is read-only — the gesture-synthesis code is compiled out for policy reasons. The action UI still surfaces but write actions silently no-op. Sideload ships the full gesture surface and the phone-utility tools above.
Bridge — Safety Rails (always on, both tracks)
| Feature | Description |
|---|---|
| Per-app blocklist | Banking, password managers, and work email default-blocked from bridge actions |
| Confirmation on destructive verbs | "Send", "pay", "delete", "transfer" always prompt before acting |
| Auto-disable on idle | Bridge turns itself off after a configurable idle period; re-enable requires biometric |
| Activity log | Every command logged with timestamp, result, and screenshot thumbnail |
| Persistent notification | One-tap kill switch in the system tray whenever bridge is on |
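The destructive-verb rule can be pictured as a simple gate in front of every tap. The sketch below is an assumption about how such a check might look, not the app's actual matching logic: it uses the four verbs listed in the table and a whole-word, case-insensitive match, and `needs_confirmation` is a hypothetical name.

```python
import re

# The four verbs named in the table above.
DESTRUCTIVE_VERBS = ("send", "pay", "delete", "transfer")

# Whole-word, case-insensitive match so that "Sender" or "Repayment"
# does not trigger a prompt. This matching policy is an assumption.
_PATTERN = re.compile(
    r"\b(" + "|".join(DESTRUCTIVE_VERBS) + r")\b", re.IGNORECASE
)


def needs_confirmation(label: str) -> bool:
    """Return True when a UI label contains a destructive verb."""
    return _PATTERN.search(label) is not None


labels = ["Send", "Pay now", "Sender info", "Cancel"]
prompts = [label for label in labels if needs_confirmation(label)]
```

A production check would likely also need localized verb lists and per-app exceptions, which this sketch leaves out.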
Session Management
| Feature | Description |
|---|---|
| Session drawer | Create, switch, rename, delete sessions |
| Auto-titles | Sessions titled from first message |
| Message history | Loads from server on session switch |
| Persistence | Last session resumes on app restart |
Analytics
| Feature | Description |
|---|---|
| Stats for Nerds | TTFT, completion times, token usage, health latency, stream rates |
| Canvas bar charts | Purple gradient charts in Settings |
UX Polish
| Feature | Description |
|---|---|
| Animated splash screen | Scale + overshoot + fade animation, hold-while-loading |
| Chat empty state | Logo + suggestion chips |
| Animated streaming dots | Pulsing 3-dot indicator during streaming |
| Haptic feedback | On send, copy, stream complete, error |
| App context prompt | Toggleable system message for mobile context |
Security
| Feature | Description |
|---|---|
| Android Keystore session storage | StrongBox-preferred, TEE fallback |
| TOFU cert pinning | Trust-on-first-use SHA-256 SPKI fingerprints |
| Bearer token auth | Optional API key authentication |
| Per-channel grants | Time-bound access for terminal/bridge channels |
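Trust-on-first-use pinning comes down to remembering a fingerprint the first time a host is seen and comparing against it afterwards. The sketch below shows that flow with a SHA-256 digest over the server's SubjectPublicKeyInfo bytes. `TofuPinStore` and its in-memory dictionary are hypothetical stand-ins, and extracting the SPKI bytes from a certificate is assumed to happen elsewhere.

```python
import base64
import hashlib
import hmac


def spki_fingerprint(spki_der: bytes) -> str:
    """Base64-encoded SHA-256 digest of DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


class TofuPinStore:
    """In-memory pin store; a real app would persist pins securely."""

    def __init__(self) -> None:
        self._pins = {}

    def check(self, host: str, spki_der: bytes) -> str:
        fingerprint = spki_fingerprint(spki_der)
        pinned = self._pins.get(host)
        if pinned is None:
            # First contact: trust the key and remember it.
            self._pins[host] = fingerprint
            return "pinned"
        # Later contacts must present the same key.
        if hmac.compare_digest(pinned, fingerprint):
            return "match"
        return "mismatch"


store = TofuPinStore()
first = store.check("hermes.example", b"placeholder-key-A")
second = store.check("hermes.example", b"placeholder-key-A")
third = store.check("hermes.example", b"placeholder-key-B")
```

Pinning the public key rather than the whole certificate means a renewed certificate that reuses the same key still matches the stored pin.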
Choose your track
Hermes-Relay ships as two distinct APKs from the same source tree. Pick whichever fits — or install both side-by-side, they coexist on the device.
| Feature | Google Play | Sideload |
|---|---|---|
| Chat & voice | | |
| Chat with your agent: Direct streaming chat over the Hermes API. No middleman. | Included | Included |
| Voice mode (push-to-talk): Hold a button, speak, the agent answers out loud. | Included | Included |
| Sessions, personalities, slash commands: Full session browser, profile picker, searchable command palette. | Included | Included |
| Bridge: read your phone | | |
| Read what is on your screen: Agent can see the active screen so it can answer "what does this say?" | Included | Included |
| Notification triage: Agent reads incoming notifications and summarizes them for you. | Included | Included |
| Calendar read: "What is on my schedule today?" Read-only access to your calendar. | Included | Included |
| Bridge: control your phone | | |
| Tap, type, and swipe (with confirmation): Agent can perform UI actions on your behalf. On Google Play, the accessibility service is read-only by design, and write actions silently no-op so the build stays inside Play’s policy envelope. Sideload ships the full gesture surface. | Limited | Included |
| Reply to messages from voice: "Text Sam I will be 10 min late", fully hands-free. Voice-routed bridge intents are sideload-only. | Not in this track | Included |
| Vision-driven navigation: Agent looks at the screen and figures out what to tap on its own. Vision-driven UI navigation requires the unrestricted accessibility surface and is sideload-only. | Not in this track | Included |
| Workflow recording (future): Show the agent something once, ask it to repeat the workflow later. | Not in this track | Included |
| Safety rails | | |
| App blocklist: Banking, password managers, and work email default-blocked from bridge actions. | Included | Included |
| Confirmation on destructive verbs: "Send", "pay", "delete", "transfer" always prompt before acting. | Included | Included |
| Auto-disable + activity log: Bridge turns itself off when idle. Every action is logged with a thumbnail. | Included | Included |
| Install & updates | | |
| One-tap install: Install from the Play Store with no special permissions. Sideload requires enabling "Install unknown apps" for your browser the first time. | Included | Not in this track |
| Automatic updates: Get new versions without thinking about it. Sideload updates are a manual download from GitHub Releases. | Included | Limited |
Both tracks build from the same Kotlin source tree. Tier 3, 4, and 6 features are compiled out of the Google Play APK at build time via Gradle product flavors — they are not present in the binary, not just hidden behind a switch.
For the full decision guide and install instructions for each, see Release tracks.
Coming Soon
| Feature | Status |
|---|---|
| Push Notifications | Future — Agent-initiated alerts |
| Memory Viewer | Future — View/edit agent memories |
| Cross-device handoff | Future — Hand a task from phone to desktop terminal session |