Edit: github-backup/docs/15-operational-setup-2026-05-28.md
# Operational Setup: 2026-05-28 Live Session

This document captures everything established during the first live Hermes session on 2026-05-28. It represents the authoritative record of decisions made in Slack that were not committed at the time.

---

## Deployment State

- **Gateway**: Running on GCE instance `instance-20250306-165614` (us-central1-c)
- **Host**: `admin@34.27.189.109` via `~/.ssh/easier_validation_gce_ed25519`
- **Install path**: `/srv/easier-hermes`
- **Hermes version**: v2026.5.16
- **Slack channel**: #int-agentops (`C0B7JE5QYDN`)
- **Slack app**: @Easier
- **Config**: `/srv/easier-hermes/data/config.yaml`

---

## Model Modes (Permanent)

Three modes. Switch in Slack using plain language: "use free mode", "switch to quality", etc.

| Mode | Model | Cost | Use for |
|------|-------|------|---------|
| **Free (default)** | `openrouter/free` | $0 | Everything. Auto-cycles free providers. |
| **Budget** | `openrouter:deepseek/deepseek-v4-flash` | ~$0.10/1M tok | Daily pulse, routine ops |
| **Quality** | `openrouter:anthropic/claude-sonnet-4.6` | ~$3–5/day | Hard reasoning, strategic, client-facing |

**Rules:**

- Free mode is the permanent default. Never switch to a paid model automatically.
- If all free providers are rate-limited: fail loudly, notify Anthony, offer to temporarily use budget.
- For any one-off paid capability (web search, image gen, image analysis): quote estimated cost from live OpenRouter pricing and wait for explicit approval before proceeding.
- Switch back to free automatically once the paid task is done.

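The mode rules above can be sketched in a few lines of Python. This is only an illustration of the intended behaviour, not how Hermes implements it: `Session`, `parse_mode_request`, `handle_message`, and `run_paid_task` are hypothetical names, and fetching the cost quote from live OpenRouter pricing is left to the caller.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Mode table copied from this document.
MODES = {
    "free": "openrouter/free",
    "budget": "openrouter:deepseek/deepseek-v4-flash",
    "quality": "openrouter:anthropic/claude-sonnet-4.6",
}
DEFAULT_MODE = "free"


def parse_mode_request(message: str) -> Optional[str]:
    """Return the single mode named in a plain-language message, else None."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    named = [mode for mode in MODES if mode in words]
    # Ambiguous messages (zero or several mode names) change nothing.
    return named[0] if len(named) == 1 else None


@dataclass
class Session:
    mode: str = DEFAULT_MODE

    @property
    def model(self) -> str:
        return MODES[self.mode]


def handle_message(session: Session, message: str) -> str:
    """Apply an explicit user request such as 'switch to quality'."""
    requested = parse_mode_request(message)
    if requested is not None:
        session.mode = requested
    return session.model


def run_paid_task(session: Session, paid_mode: str, approved: bool,
                  task: Callable[[str], str]) -> Optional[str]:
    """Run a one-off paid task only with explicit approval, then revert to free."""
    if paid_mode not in MODES or paid_mode == DEFAULT_MODE:
        raise ValueError(f"not a paid mode: {paid_mode!r}")
    if not approved:
        return None  # caller should post the cost quote and wait
    session.mode = paid_mode
    try:
        return task(session.model)
    finally:
        session.mode = DEFAULT_MODE  # always switch back to free
```

For example, `handle_message(Session(), "switch to quality")` returns the Quality model string, while `run_paid_task(..., approved=False, ...)` returns `None` without touching the session.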
---

## Free Mode Configuration

Applied to `/srv/easier-hermes/data/config.yaml` on 2026-05-28:

```yaml
model:
  provider: openrouter
  model: "openrouter/free"
  fallback: ""  # No automatic paid fallback

auxiliary:
  default: { provider: openrouter, model: "openrouter/free" }
  title_generation: { provider: openrouter, model: "openrouter/free" }
  vision: { provider: openrouter, model: "openrouter/free" }
  compression: { provider: openrouter, model: "openrouter/free" }
  session_search: { provider: openrouter, model: "openrouter/free" }

tools:
  web: false
  browser: false
  image_gen: false
```

**Note on `openrouter/free`**: The auto-router cycles through all available free models (currently GPT-OSS-120B, LFM2.5, etc.). The `:free` suffix variants (e.g. `deepseek-v4-flash:free`) are rate-limited and unreliable; use the router, not named free variants.

---

## Image Analysis in Free Mode

- `image_gen` is disabled (no image generation).
- Vision/image **analysis** still works via `auxiliary.vision` which uses `openrouter/free`.
- If a free model can't handle an image, Hermes must quote cost and request approval before using a paid vision model.

---

## Context Window Management

**Problem**: OpenRouter logs showed 100k+ token context per call (~$0.05 each on DeepSeek).

**Decisions made:**

1. **Caveman memory**: Store key facts in ultra-minimal language in long-term memory (GitHub vault).
   - Bad: "Sure! I'd be happy to help. The issue is most likely caused by..."
   - Good: "Bug: auth middleware. Token expiry: < not <="
2. **Short-term memory**: Extract only essential context for active task; don't send full docs.
3. **Auto-compression**: When context nears model limit, summarise older turns before next call.
4. **Session search**: For recurring queries, retrieve from memory rather than re-derive.
5. **Delegate**: Split large jobs into parallel sub-agents via `delegate_task`.

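The auto-compression decision (item 3) could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not Hermes's actual compressor: `estimate_tokens` uses a crude four-characters-per-token heuristic rather than a real tokenizer, the 80% threshold and `keep_recent` default are arbitrary, and `summarise` is a hypothetical callback that would normally call the `auxiliary.compression` model.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Rough size estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)


def compress_history(
    turns: List[str],
    limit_tokens: int,
    summarise: Callable[[List[str]], str],
    threshold: float = 0.8,   # assumed trigger point, not from the document
    keep_recent: int = 4,     # assumed number of turns kept verbatim
) -> List[str]:
    """Replace older turns with one summary when context nears the limit."""
    total = sum(estimate_tokens(turn) for turn in turns)
    if total < threshold * limit_tokens or len(turns) <= keep_recent:
        return list(turns)  # still comfortably inside the window

    split = len(turns) - keep_recent
    older, recent = turns[:split], turns[split:]
    # One caveman-style summary stands in for all older turns.
    return [summarise(older)] + recent
```

A caller would run this before each model call, passing the model's context limit. Keeping the most recent turns verbatim preserves the active task's detail while the summary carries only the essentials of earlier turns.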
---

## Skills Created (2026-05-28)

- **`free-mode-automation`**: Manages model switching, cost quoting, Slack workflows. Location: `/srv/easier-hermes/data/skills/free-mode-automation/`
- **`caveman-memory`**: Stores critical info in ultra-minimal format. Location: `/srv/easier-hermes/data/skills/caveman-memory/`
- **`project-onboarding`**: Created via self-improvement review.

---

## Vault Structure (2026-05-28)

Created at `/srv/easier-hermes/vault/`:

```
vault/
  index.md                       # Navigation map
  log.md                         # Session log (first entry: 2026-05-28 handoff)
  raw/synthetic/
    marketing-overview-may-2026.md
    sales-pipeline-may-2026.md
    client-relationship-acme-2026-05.md
    fulfilment-monthly-report-may-2026.md
    operations-weekly-2026-05.md
    rd-research-log-may-2026.md
  briefs/coo-ai-ops-manager/
    dry-run-pulse-2026-05-28.md  # First COO daily pulse (format approved)
  evals/
    coo-eval-benchmark.md        # 15 eval questions
```

---

## Cron Jobs

| Job | Schedule | Status |
|-----|----------|--------|
| `coo-daily-pulse` | 08:00 UTC daily | Paused (format approved; resume when ready) |

To resume: `hermes cron resume coo-daily-pulse`

---

## Decisions Made

| Decision | Choice |
|----------|--------|
| Free mode default | Permanent; never auto-switch to paid |
| Easier Now | Not a current concern; in-dev, hands off |
| Outreach | Not a current concern right now |
| Content pipeline attribution | Backlog; not yet |
| Vault sync to Obsidian | Not yet decided |
| Daily pulse delivery | #int-agentops, 8am UTC |
| Pulse format | Approved as-is |
| Model switching UI | Plain language in Slack ("use free mode" etc.) |
| Paid task approval | Must quote cost + get explicit approval |
| Context rot prevention | Caveman memory + short-term extraction |
| Slack tables | Use Block Kit JSON, not Markdown tables |

---

## What Was NOT Committed at the Time

The following were applied live on the server but not pushed to GitHub:

- Live `config.yaml` changes (now reflected in `config.yaml.template`)
- Skills (`free-mode-automation`, `caveman-memory`)
- Vault files (index.md, log.md, synthetic notes, dry-run pulse, evals)
- Cron job configuration

These remain on the GCE server. They are not version-controlled. See `docs/16-hermes-git-workflow.md` for how Hermes should commit going forward.
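
Before committing the server-only items listed above, it may help to confirm they are still present. The sketch below is a hypothetical inventory check, not part of Hermes: `SERVER_ONLY_PATHS` and `inventory` are illustrative names, and the list covers only paths this document names explicitly. The cron job configuration is left out because its location is not stated here.

```python
from pathlib import Path
from typing import Dict

# Paths named in this document, relative to the install path.
SERVER_ONLY_PATHS = [
    "data/config.yaml",
    "data/skills/free-mode-automation",
    "data/skills/caveman-memory",
    "vault/index.md",
    "vault/log.md",
]


def inventory(install_root: str) -> Dict[str, bool]:
    """Map each expected path to whether it exists under install_root."""
    root = Path(install_root)
    return {rel: (root / rel).exists() for rel in SERVER_ONLY_PATHS}


if __name__ == "__main__":
    # On the GCE server the install path is /srv/easier-hermes.
    for rel, present in inventory("/srv/easier-hermes").items():
        print(f"{'ok     ' if present else 'MISSING'}  {rel}")
```

Anything reported as missing would need to be located before following the workflow in `docs/16-hermes-git-workflow.md`.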