FreeOSBot is not a separate AI. It is the DevOps persona of Phoenix Daemon, pre-shipped with the right Postgres role, the right MCP allowlist, the right escalation policy, and the right safety envelope for regulated on-prem operations. ShadowOps is its edge extension: per-node Watchmen agents that pre-classify the log firehose so a single DevOps persona can supervise hundreds of hosts without saturating.
Persistent multi-persona AI operations platform. One engine, five-layer brain, eleven-stage cognitive pipeline, four-tier action gate, hard persona severance.
Phoenix's DevOps persona shipped as a turnkey package: pre-built persona YAML, Postgres role, MCP allowlist (k8s · Vault · ArgoCD · Wazuh · Helm · Trivy · Loki · Prometheus · Grafana · Dagster · Git · Shell), escalation contract, and safety envelope for healthcare-grade infrastructure.
Per-node agent (DaemonSet on k8s, systemd unit on bare-metal) that tails journalctl + docker events + kubelet locally. Pre-classifies log lines through a deterministic regex tier and an optional local Ollama tier. Sends only structured envelopes to FreeOSBot — never raw log text.
FreeOSBot does not reinvent observability or replace your AI assistant. It plugs into Phoenix's existing severance model and uses ShadowOps to scale fan-in past the point where a centralised log tail breaks.
Phoenix is the long-running daemon: brain, pipeline, gate, severance. It runs every persona — not just DevOps. Memory, audit, drift watchdog and constitutional core are all upstream.
Same engine, same brain, same safety. The difference is that FreeOSBot is exactly the DevOps persona YAML, pre-shipped with the operator contract, escalation policy and toolset that regulated on-prem clusters actually need.
An edge fleet that turns the log firehose into a manageable structured stream. Watchmen pre-filter, optionally pre-classify with a small local model, and emit envelopes — never raw text — to FreeOSBot.
FreeOSBot ships personas/devops.yaml with a healthcare-friendly default contract: escalation.assigned_person.name is required before any Tier-1+ autonomous execution; auto_discover_sources is seeded for k3s + Helm + ArgoCD + dpkg + npm + pip; security_feeds.yaml is seeded with NVD, GitHub Advisory, the kernel security list and OSS licence press; and a default-deny outbound allowlist opens only the channels you configure. The tool catalogue is the same 279 tools the engine ships, gated to this one persona by the YAML overlay. You add a persona by editing YAML, not by building a separate product.
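As an illustration of the fail-closed rule on escalation.assigned_person.name, here is a minimal sketch in Python. It assumes the persona YAML has already been parsed into a plain dict; the function names are hypothetical, and the real action gate applies further checks that are not modelled here.

```python
def assigned_person_name(persona: dict) -> str:
    """Read escalation.assigned_person.name, tolerating missing or null keys."""
    escalation = persona.get("escalation") or {}
    person = escalation.get("assigned_person") or {}
    return str(person.get("name") or "").strip()


def may_execute(persona: dict, tier: int) -> bool:
    """Hypothetical gate check: refuse Tier-1+ actions when no person is assigned."""
    if tier >= 1 and not assigned_person_name(persona):
        return False  # fail-closed: nobody to escalate to
    return True
```

A blank or missing name blocks every Tier-1+ action, while Tier-0 actions pass this particular check.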
A two-tier architecture. Watchmen at the edge handle the >95% of log lines that are deterministic. Only structured envelopes — never raw logs — fan in to FreeOSBot. The DevOps persona then applies Phoenix's full safety model to anything that needs a decision.
Regex match against seeds/log_patterns.yaml. CrashLoopBackOff → restart (node-scoped). OOMKilled → queue a GitOps PR template. Disk pressure → prune + escalate. Noise → drop.
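The Tier 0 pass can be pictured roughly as follows. This is a sketch under assumptions: the actual schema of seeds/log_patterns.yaml is not shown on this page, so the Rule fields, the regexes, and the action names below are hypothetical stand-ins.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    pattern: re.Pattern
    category: str
    action: str  # hypothetical action name


# Hypothetical stand-in for the contents of seeds/log_patterns.yaml.
RULES = [
    Rule(re.compile(r"CrashLoopBackOff"), "crash", "restart"),
    Rule(re.compile(r"OOMKilled"), "oom", "queue_gitops_pr"),
    Rule(re.compile(r"disk\s*pressure", re.IGNORECASE), "disk", "prune_and_escalate"),
]


def classify(line: str) -> Optional[Rule]:
    """Return the first matching rule, or None when the line is noise."""
    for rule in RULES:
        if rule.pattern.search(line):
            return rule
    return None  # noise: the caller drops the line
```

Precompiled patterns and a first-match scan keep the pass free of any model or network call, which is the property the tier depends on.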
Sub-millisecond. No LLM. No network. Audited locally; aggregate digest emitted hourly to FreeOSBot for visibility.
Per-node 4B-class model — single-token {remediate|escalate|ignore} verdict + confidence. Only fires when Tier 0 confidence is below threshold. Routed locally via bifrost_llm; bypasses Bifrost circuit breaker; no token accounting.
Feature-flagged · default off in Phase 1. Killed and disabled if node memory pressure spikes — fail-safe to Tier 0 only.
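The hand-off between the two tiers might look like the sketch below. The threshold value, the parameter names, and the choice to return the Tier 0 verdict when Tier 1 is unavailable are all assumptions made for illustration; only the conditions themselves (below-threshold confidence, feature flag off by default, memory-pressure fallback) come from the description above.

```python
from typing import Callable, Tuple

Verdict = Tuple[str, float]  # (remediate | escalate | ignore, confidence)


def decide(
    tier0: Verdict,
    tier1: Callable[[], Verdict],
    *,
    threshold: float = 0.8,        # hypothetical value
    tier1_enabled: bool = False,   # feature flag, off by default
    memory_pressure: bool = False,
) -> Verdict:
    """Consult the local model only when Tier 0 is unsure and Tier 1 is usable."""
    _, confidence = tier0
    if confidence >= threshold:
        return tier0
    if not tier1_enabled or memory_pressure:
        return tier0  # fail-safe: fall back to Tier 0 only
    return tier1()
```

Passing the model call as a callable means it is never invoked unless every precondition holds.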
Stable schema. correlation_id, cluster_id, severity, category, auto_remediation, evidence (≤ 4 KB log excerpt). Never raw log text.
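To make the envelope shape concrete, here is a hedged sketch of the listed fields as a Python dataclass. The field types, the reading of 4 KB as 4096 bytes, and the decision to keep the tail of an oversized excerpt (rather than reject it) are assumptions; the page only names the fields and the size bound.

```python
import json
from dataclasses import asdict, dataclass

MAX_EVIDENCE_BYTES = 4 * 1024  # assumed reading of the 4 KB bound


@dataclass
class Envelope:
    correlation_id: str
    cluster_id: str
    severity: str
    category: str
    auto_remediation: dict
    evidence: str  # bounded log excerpt, never the full log

    def __post_init__(self) -> None:
        raw = self.evidence.encode("utf-8")
        if len(raw) > MAX_EVIDENCE_BYTES:
            # Keep the most recent bytes; drop any split UTF-8 character.
            tail = raw[-MAX_EVIDENCE_BYTES:]
            self.evidence = tail.decode("utf-8", errors="ignore")

    def to_json(self) -> str:
        return json.dumps(asdict(self))
```

Enforcing the bound at construction time means an oversized excerpt cannot reach the wire by accident.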
Phoenix admission ingress accepts the envelope. Pipeline runs RECON, memory probe, action gate, plan mode. Operator escalation routed per escalation.policy: page · escalate-1h · daily digest · hourly digest.
Without ShadowOps, the log firehose breaks any centralised AI assistant, both economically and architecturally. With ShadowOps, the same DevOps persona holds up under load and stays inside the Phoenix safety model.
Watchmen drop the noise locally and self-remediate the auto-fixable patterns (pod restarts, journal pruning, log rotation) within a contract that mechanically refuses any action outside the node. Only the ~1% that needs a real decision becomes an envelope. FreeOSBot then applies the full Phoenix pipeline: RECON pulls the last-deploy diff, the memory probe cites prior incidents, the action gate scales approval to blast radius, and plan mode auto-triggers at three or more mutating steps. Operator escalation stays inside the existing Telegram / email / PagerDuty contract.
Two columns, no overlap. The left is what every Phoenix persona ships with — and FreeOSBot inherits unchanged. The right is what FreeOSBot and ShadowOps add on top.
PHOENIX_BUDGET_HARD=on.
FOR UPDATE SKIP LOCKED single-claim event dispatch.
shell:* calls intercepted before MCP routing: no JSON-RPC overhead, no 30 s MCP hangs.
PHOENIX_DRIFT_K8S_TARGETS; opens draft PRs with a reconcile checklist.
pull_request open / sync · deterministic diff + memory probe + scout + comment.
escalation.assigned_person is blank: fail-closed.
payment-api goes into CrashLoopBackOff.
A node-local Watchman tails journalctl on every host. The on-call SRE is asleep. Here is exactly what the next 90 seconds look like, end to end. Every line is something FreeOSBot + ShadowOps actually do today, not roadmap.
Per-node ShadowOps Watchman classifies the log line against seeds/log_patterns.yaml. Severity HIGH, category crash, blast-radius node-scope. Watchman attempts the node-scoped remediation it is allowed to take.
Container crashes again 12 s after restart. Watchman blast-radius is bounded to the node — it cannot roll back, cannot touch ArgoCD, cannot read Vault. It builds a v1 envelope and POSTs it to FreeOSBot's admission ingress.
envelope.correlation_id = 01HK…
envelope.category = crash
envelope.auto_remediation = { attempted: true, outcome: failed }
envelope.evidence.log_excerpt = <last 20 lines>
Phoenix admission control accepts the envelope. Sticky router hashes correlation_id → replica_id. All follow-up events for this incident will land on the same replica: no cross-replica chatter, no double-claim.
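Sticky routing of this kind can be sketched with a stable hash. The function below is an assumption about the general technique, not the router's actual algorithm: it uses SHA-256 with a simple modulo, and the function name and replica count are hypothetical.

```python
import hashlib


def replica_for(correlation_id: str, replica_count: int) -> int:
    """Map a correlation id to a replica index, stable across processes."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    digest = hashlib.sha256(correlation_id.encode("utf-8")).digest()
    # Python's built-in hash() is salted per process, so a cryptographic
    # digest is used here to get the same answer on every replica.
    return int.from_bytes(digest[:8], "big") % replica_count
```

A plain modulo remaps most incidents when the replica count changes; a consistent-hashing ring would reduce that churn, and the page does not say which approach the router takes.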
Investigative intent triggers RECON. Up to 12 iterations / 90 s, read-only enforced at two layers (text filter + subprocess permission drop). RECON pulls last 500 log lines, the last deploy diff, last 3 incidents involving payment-api, current cert chain, downstream dependencies. Memory probe surfaces [INC-2026-04-22-A4F1] — same crash signature, fixed by rolling back to :v3.4.1.
Diagnosis: image :v3.4.2 shipped 41 minutes ago changes the JWT validation library; the same memory entry from April flags this as the recurring failure mode. Recommended action: rollback to :v3.4.1 via kubectl set image. Cumulative tier ≥ 4 → plan mode triggers and persists to Postgres.
Memory probe also surfaces a standing common-sense entry: "always require operator approval for production rollbacks, even when previously successful." FreeOSBot sends a Telegram with diagnosis, evidence bundle, plan, and a request for a single-use token via the separate approval channel.
→ telegram(@on-call): 🚨 prod-eu-1 / payment-api crash
plan: rollback :v3.4.2 → :v3.4.1
cite: INC-2026-04-22-A4F1 (same fix worked, 18d ago)
reply with: APPROVE TOKEN-01HK… or DECLINE
Operator wakes briefly, taps approve. Token verified, marked used, cannot be replayed. FreeOSBot records who approved, when, and against which correlation id in the audit log, not just in a chat scrollback.
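The single-use property of the approval token can be illustrated with a small in-memory sketch. The class and method names are hypothetical, and a real deployment would presumably persist tokens and consume them atomically in its database; this only shows the verify, mark-used, no-replay sequence.

```python
import secrets


class ApprovalTokens:
    """Hypothetical in-memory store for single-use approval tokens."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}  # token -> correlation id

    def issue(self, correlation_id: str) -> str:
        token = secrets.token_urlsafe(16)
        self._pending[token] = correlation_id
        return token

    def redeem(self, token: str, correlation_id: str) -> bool:
        # pop() removes the token on first use, so a replay finds nothing.
        expected = self._pending.pop(token, None)
        return expected is not None and expected == correlation_id
```

Redeeming a token against the wrong correlation id also consumes it, which keeps the sketch fail-closed at the cost of requiring a fresh token.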
Tier 2: semi-reversible. FreeOSBot issues kubectl set image deploy/payment-api …:v3.4.1. Snapshot of the previous deploy is preserved. Operator can /cancel <corr_id> from any replica during the next 60 seconds; Redis pub/sub reaches replica B and ACK-confirms.
New pod is Ready. No undo invoked. Audit row written: actor, action, tier, evidence hash, correlation id, approver token reference. Permanent, append-only. The nightly consolidator marks the resolution as ADR-eligible — promoted to accepted after 7 days unless revoked.
Structured incident report on the operator's desk. Timeline, root cause cited from prior memory, what changed in the deploy, what was rolled back, recommended long-term fix (pin the JWT library version in the base image), who needs to know (the platform team committed :v3.4.2). All cross-referenced to the relevant DEC and PLAN ids.
FreeOSBot replies in 30 seconds. Cites INC-2026-04-22-A4F1 and INC-2026-05-10-B7E3. Quotes the operator's standing rollback policy. Surfaces the long-term fix that was filed as an ADR. The new SRE is productive on day one against an institution that did not have to reconvene a war room. This is the compounding return chatbots structurally cannot offer.
Start with a free 30-day pilot — we deploy a single Watchman into your cluster, read-only, no remediation. Convert when you're convinced. The platform is AGPL v3 — you keep it whether we keep working together or not, and hyperscalers cannot fork it into a closed managed service.
Whether you're scoping a pilot, evaluating a multi-cluster federation, or just want to understand how a DevOps persona under Phoenix actually works — we're happy to have that conversation.