
Direct API Connection

Hermes-Relay connects directly to the Hermes API Server for chat, bypassing the relay server entirely.

How It Works

Phone (HTTP/SSE) → Hermes API Server (:8642)

The app uses the Hermes /api/sessions REST API:

| Method | Endpoint | Purpose |
| --- | --- | --- |
| GET | /api/sessions | List sessions |
| POST | /api/sessions | Create session |
| GET | /api/sessions/{id}/messages | Get message history |
| POST | /api/sessions/{id}/chat/stream | Stream chat (SSE) |
| PATCH | /api/sessions/{id} | Rename session |
| DELETE | /api/sessions/{id} | Delete session |
| GET | /health | Health check |
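
As a rough illustration of how a client might address these endpoints, here is a minimal Python sketch using only the standard library. The host (`localhost`), the helper names, and the `{"title": ...}` rename body are assumptions made for this example; only the paths, methods, and port 8642 come from this page.

```python
import json
import urllib.request

# Assumption: the server runs locally; the port is the one shown in the diagram above.
BASE_URL = "http://localhost:8642"


def build_request(method, path, body=None, base_url=BASE_URL):
    """Build (but do not send) a request for one of the endpoints listed above."""
    data = None
    headers = {"Accept": "application/json"}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )


def list_sessions_request():
    return build_request("GET", "/api/sessions")


def rename_session_request(session_id, title):
    # Assumption: the rename payload is {"title": ...}; the real field name may differ.
    return build_request("PATCH", f"/api/sessions/{session_id}", {"title": title})


def send(request, timeout=10):
    """Send a prepared request and decode the response body as JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `send(list_sessions_request())` would return whatever JSON the server produces for the session list; the response shape is not documented here, so the sketch does not assume one.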

Authentication

If the Hermes server is configured with API_SERVER_KEY, the app sends:

Authorization: Bearer <API_SERVER_KEY>

Most local Hermes setups don't require a key. The API key field in Settings is optional.
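
The optional-key behavior can be sketched in a few lines of Python. The function names and the base URL argument are hypothetical; the only parts taken from this page are the `Authorization: Bearer` header format and the `/health` endpoint.

```python
import urllib.request


def auth_headers(api_key=None):
    """Return the Authorization header only when a key has been configured."""
    key = (api_key or "").strip()
    if not key:
        # No key set: send no Authorization header at all.
        return {}
    return {"Authorization": f"Bearer {key}"}


def health_request(base_url, api_key=None):
    """Build (but do not send) a GET /health request, adding auth only if a key is set."""
    return urllib.request.Request(
        base_url + "/health", headers=auth_headers(api_key), method="GET"
    )
```

Treating an empty or whitespace-only key as "no key" is a design choice of this sketch, so that a blank Settings field does not produce a malformed `Bearer` header.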

When provided, the key is stored in Android's EncryptedSharedPreferences using AES-256-GCM encryption backed by the Android Keystore.

SSE Streaming

Chat responses stream via Server-Sent Events with these Hermes-native event types:

| Event | Description | Key Fields |
| --- | --- | --- |
| session.created | Session initialized | session_id, run_id, title? |
| run.started | Agent run begins | session_id, run_id, user_message (object) |
| message.started | Assistant message begins | session_id, run_id, message (object with id, role) |
| assistant.delta | Text content chunk | session_id, run_id, message_id, delta |
| tool.progress | Reasoning/thinking chunk | session_id, run_id, message_id, delta |
| tool.pending | Tool queued for execution | session_id, run_id, tool_name, call_id |
| tool.started | Tool execution started | session_id, run_id, tool_name, call_id, preview?, args |
| tool.completed | Tool finished successfully | session_id, run_id, tool_call_id, tool_name, args, result_preview |
| tool.failed | Tool execution failed | session_id, run_id, call_id, tool_name, error |
| assistant.completed | Response finished | session_id, run_id, message_id, content, completed, partial, interrupted |
| run.completed | Entire agent run finished | session_id, run_id, message_id, completed, partial, interrupted, api_calls? |
| error | Error occurred | message, error |
| done | Stream closed | session_id, run_id, state: "final" |
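
To make the event flow concrete, the following Python sketch parses an SSE stream and assembles the assistant's reply from `assistant.delta` chunks. It assumes the standard SSE wire format, with the event name in the `event:` field and a JSON object in `data:`; this page does not show the exact framing, so treat that as an assumption. The sample payloads are invented for illustration.

```python
import json


def parse_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, a single leading space after the colon is dropped.
            data.append(value[1:] if value.startswith(" ") else value)


def collect_reply(events):
    """Concatenate assistant.delta chunks until the stream reports done."""
    parts = []
    for name, payload in events:
        if name == "assistant.delta":
            parts.append(payload.get("delta", ""))
        elif name == "error":
            raise RuntimeError(payload.get("message", "stream error"))
        elif name == "done":
            break
    return "".join(parts)


# Hypothetical stream contents, shaped after the Key Fields column above.
sample = [
    "event: assistant.delta\n",
    'data: {"session_id": "s1", "run_id": "r1", "message_id": "m1", "delta": "Hel"}\n',
    "\n",
    "event: assistant.delta\n",
    'data: {"session_id": "s1", "run_id": "r1", "message_id": "m1", "delta": "lo"}\n',
    "\n",
    "event: done\n",
    'data: {"session_id": "s1", "run_id": "r1", "state": "final"}\n',
    "\n",
]

print(collect_reply(parse_sse(sample)))  # prints "Hello"
```

Events the sketch does not handle explicitly (tool and run lifecycle events, for example) are simply skipped; a real client would likely surface them in the UI.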

Why Direct?

Previous versions routed chat through a WebSocket relay. Direct connection is simpler, has lower latency, and aligns with how every other Hermes frontend works (Open WebUI, ClawPort, etc.).