Operational Setup — 2026-05-28 Live Session

This document captures everything established during the first live Hermes session on 2026-05-28. It represents the authoritative record of decisions made in Slack that were not committed at the time.

Deployment State

Gateway: Running on GCE instance instance-20250306-165614 (us-central1-c)
Host: admin@34.27.189.109 via ~/.ssh/easier_validation_gce_ed25519
Install path: /srv/easier-hermes
Hermes version: v2026.5.16
Slack channel: #int-agentops (C0B7JE5QYDN)
Slack app: @Easier
Config: /srv/easier-hermes/data/config.yaml

Model Modes (Permanent)

Three modes. Switch in Slack using plain language: "use free mode", "switch to quality", etc.

Mode	Model	Cost	Use for
Free (default)	`openrouter/free`	$0	Everything. Auto-cycles free providers.
Budget	`openrouter:deepseek/deepseek-v4-flash`	~$0.10/1M tok	Daily pulse, routine ops
Quality	`openrouter:anthropic/claude-sonnet-4.6`	~$3–5/day	Hard reasoning, strategic, client-facing

Rules:

- Free mode is the permanent default. Never switch to a paid model automatically.
- If all free providers are rate-limited: fail loudly, notify Anthony, offer to temporarily use budget.
- For any one-off paid capability (web search, image gen, image analysis): quote estimated cost from live OpenRouter pricing and wait for explicit approval before proceeding.
- Switch back to free automatically once the paid task is done.
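The rules above can be sketched roughly as follows. This is an illustrative sketch only, not the free-mode-automation skill itself: `parse_mode_request`, `PaidTask`, and `model_for_task` are hypothetical names, and the cost estimate is assumed to be supplied by the caller from live OpenRouter pricing.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Mode table copied from this document; everything else is hypothetical.
MODES = {
    "free": "openrouter/free",
    "budget": "openrouter:deepseek/deepseek-v4-flash",
    "quality": "openrouter:anthropic/claude-sonnet-4.6",
}
DEFAULT_MODE = "free"


def parse_mode_request(message: str) -> Optional[str]:
    """Return the mode named in a plain-language Slack message, if any."""
    words = re.findall(r"[a-z]+", message.lower())
    if "use" not in words and "switch" not in words:
        return None
    for mode in MODES:
        if mode in words:
            return mode
    return None


@dataclass
class PaidTask:
    description: str
    mode: str                  # "budget" or "quality"
    estimated_cost_usd: float  # assumed to come from live OpenRouter pricing


def model_for_task(task: Optional[PaidTask], approved: bool = False) -> str:
    """Pick a model: free by default, paid only with explicit approval."""
    if task is None:
        # No paid task in flight, so stay on (or drop back to) free.
        return MODES[DEFAULT_MODE]
    if not approved:
        # Quote the cost and stop; the caller waits for an explicit yes.
        raise PermissionError(
            f"Approval needed: {task.description} on {task.mode}, "
            f"est. ${task.estimated_cost_usd:.2f}"
        )
    return MODES[task.mode]
```

Calling `model_for_task(None)` after a paid task finishes returns the free router again, which is one way to express the "switch back automatically" rule.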

Free Mode Configuration

Applied to /srv/easier-hermes/data/config.yaml on 2026-05-28:

model:
  provider: openrouter
  model: "openrouter/free"
  fallback: ""  # No automatic paid fallback

auxiliary:
  default: { provider: openrouter, model: "openrouter/free" }
  title_generation: { provider: openrouter, model: "openrouter/free" }
  vision: { provider: openrouter, model: "openrouter/free" }
  compression: { provider: openrouter, model: "openrouter/free" }
  session_search: { provider: openrouter, model: "openrouter/free" }

tools:
  web: false
  browser: false
  image_gen: false

Note on openrouter/free: The auto-router cycles through all available free models (currently GPT-OSS-120B, LFM2.5, etc.). The :free suffix variants (e.g. deepseek-v4-flash:free) are rate-limited and unreliable, so use the router rather than named free variants.
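A quick way to confirm the free-mode settings above have not drifted is to check the parsed config for anything that could incur cost. The sketch below assumes the YAML has already been loaded into a plain dict (for example with a YAML parser of your choice); `free_mode_violations` is a hypothetical helper, not part of Hermes.

```python
FREE_MODEL = "openrouter/free"
# Tools this document lists as disabled in free mode.
PAID_TOOLS = ("web", "browser", "image_gen")


def free_mode_violations(config: dict) -> list:
    """Return human-readable problems; an empty list means free mode holds."""
    problems = []
    model = config.get("model", {})
    if model.get("model") != FREE_MODEL:
        problems.append(f"model.model is {model.get('model')!r}")
    if model.get("fallback"):
        problems.append("model.fallback is set (paid fallback possible)")
    for name, aux in config.get("auxiliary", {}).items():
        if aux.get("model") != FREE_MODEL:
            problems.append(f"auxiliary.{name}.model is {aux.get('model')!r}")
    tools = config.get("tools", {})
    for name in PAID_TOOLS:
        if tools.get(name):
            problems.append(f"tools.{name} is enabled")
    return problems


# Example mirroring part of the config shown above.
example = {
    "model": {"provider": "openrouter", "model": FREE_MODEL, "fallback": ""},
    "auxiliary": {"vision": {"provider": "openrouter", "model": FREE_MODEL}},
    "tools": {"web": False, "browser": False, "image_gen": False},
}
print(free_mode_violations(example))
```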

Image Analysis in Free Mode

image_gen is disabled (no image generation).
Vision/image analysis still works via auxiliary.vision which uses openrouter/free.
If a free model can't handle an image, Hermes must quote cost and request approval before using a paid vision model.

Context Window Management

Problem: OpenRouter logs showed 100k+ token context per call (~$0.05 each on DeepSeek).

Decisions made:

Caveman memory: Store key facts in ultra-minimal language in long-term memory (GitHub vault).
Bad: "Sure! I'd be happy to help. The issue is most likely caused by..."
Good: "Bug: auth middleware. Token expiry: < not <="
Short-term memory: Extract only essential context for active task; don't send full docs.
Auto-compression: When context nears model limit, summarise older turns before next call.
Session search: For recurring queries, retrieve from memory rather than re-derive.
Delegate: Split large jobs into parallel sub-agents via delegate_task.
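The auto-compression decision above could look something like this. It is a minimal sketch under stated assumptions: the 4-characters-per-token estimate is a rough heuristic rather than a real tokenizer, the 80% threshold and `keep_recent` value are illustrative, and `caveman_summary` is a placeholder for whatever summariser Hermes actually uses.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); not a real tokenizer.
    return max(1, len(text) // 4)


def caveman_summary(turns: List[str]) -> str:
    # Placeholder summariser: keep the first 60 characters of each older turn.
    return "SUMMARY: " + " | ".join(t[:60] for t in turns)


def compress_history(
    turns: List[str],
    limit_tokens: int,
    threshold: float = 0.8,   # assumed: compress at 80% of the model limit
    keep_recent: int = 4,     # assumed: most recent turns stay verbatim
    summarise: Callable[[List[str]], str] = caveman_summary,
) -> List[str]:
    """Replace older turns with one summary when context nears the limit."""
    total = sum(estimate_tokens(t) for t in turns)
    if total < threshold * limit_tokens or len(turns) <= keep_recent:
        return list(turns)
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    return [summarise(older)] + recent
```

Keeping the most recent turns verbatim preserves the active task's context, while everything older collapses into a single short entry.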

Skills Created (2026-05-28)

free-mode-automation — Manages model switching, cost quoting, Slack workflows. Location: /srv/easier-hermes/data/skills/free-mode-automation/
caveman-memory — Stores critical info in ultra-minimal format. Location: /srv/easier-hermes/data/skills/caveman-memory/
project-onboarding — Created via self-improvement review.

Vault Structure (2026-05-28)

Created at /srv/easier-hermes/vault/:

vault/
  index.md                          # Navigation map
  log.md                            # Session log (first entry: 2026-05-28 handoff)
  raw/synthetic/
    marketing-overview-may-2026.md
    sales-pipeline-may-2026.md
    client-relationship-acme-2026-05.md
    fulfilment-monthly-report-may-2026.md
    operations-weekly-2026-05.md
    rd-research-log-may-2026.md
  briefs/coo-ai-ops-manager/
    dry-run-pulse-2026-05-28.md     # First COO daily pulse (format approved)
  evals/
    coo-eval-benchmark.md           # 15 eval questions

Cron Jobs

Job	Schedule	Status
`coo-daily-pulse`	08:00 UTC daily	Paused (format approved; resume when ready)

To resume: hermes cron resume coo-daily-pulse
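For reference, an 08:00 UTC daily schedule corresponds to `0 8 * * *` in standard five-field cron syntax; whether Hermes stores the schedule in that form is an assumption. The sketch below computes when the next pulse would fire after resuming; `next_pulse_run` is a hypothetical helper and expects a timezone-aware datetime.

```python
from datetime import datetime, time, timedelta, timezone


def next_pulse_run(now: datetime) -> datetime:
    """Return the next 08:00 UTC strictly after `now` (must be tz-aware)."""
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), time(8, 0), tzinfo=timezone.utc)
    if candidate <= now_utc:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


print(next_pulse_run(datetime.now(timezone.utc)).isoformat())
```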

Decisions Made

Decision	Choice
Free mode default	Permanent — never auto-switch to paid
Easier Now	Not a current concern; in-dev, hands off
Outreach	Not a current concern
Content pipeline attribution	Backlog — not yet
Vault sync to Obsidian	Not yet decided
Daily pulse delivery	#int-agentops, 08:00 UTC
Pulse format	Approved as-is
Model switching UI	Plain language in Slack ("use free mode" etc.)
Paid task approval	Must quote cost + get explicit approval
Context rot prevention	Caveman memory + short-term extraction
Slack tables	Use Block Kit JSON, not Markdown tables
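As an illustration of the Block Kit decision, a two-column table such as the one above can be rendered with section blocks, whose `fields` display in two columns. This is a sketch rather than the format Hermes actually emits: `table_to_blocks` is a hypothetical helper, and it relies on Slack's documented limit of 10 fields per section block.

```python
import json
from typing import Dict, List, Sequence, Tuple

MAX_FIELDS_PER_SECTION = 10  # Slack limit for a section block's "fields"


def table_to_blocks(
    header: Tuple[str, str], rows: Sequence[Tuple[str, str]]
) -> List[Dict]:
    """Render a two-column table as Block Kit section blocks."""
    cells = [f"*{h}*" for h in header]  # bold header cells in mrkdwn
    for row in rows:
        if len(row) != 2:
            raise ValueError("only two-column rows are supported")
        cells.extend(row)
    blocks = []
    # 10 is even, so each chunk keeps left/right cells paired.
    for start in range(0, len(cells), MAX_FIELDS_PER_SECTION):
        chunk = cells[start:start + MAX_FIELDS_PER_SECTION]
        blocks.append({
            "type": "section",
            "fields": [{"type": "mrkdwn", "text": cell} for cell in chunk],
        })
    return blocks


blocks = table_to_blocks(
    ("Decision", "Choice"),
    [("Pulse format", "Approved as-is"), ("Vault sync to Obsidian", "Not yet decided")],
)
print(json.dumps(blocks, indent=2))
```

The resulting list would be passed as the `blocks` payload of a Slack message.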

What Was NOT Committed at the Time

The following were applied live on the server but not pushed to GitHub:

- Live config.yaml changes (now reflected in config.yaml.template)
- Skills (free-mode-automation, caveman-memory)
- Vault files (index.md, log.md, synthetic notes, dry-run pulse, evals)
- Cron job configuration

These remain on the GCE server. They are not version-controlled. See docs/16-hermes-git-workflow.md for how Hermes should commit going forward.

github-backup/docs/15-operational-setup-2026-05-28.md