gordon-risk

Purpose

gordon-risk is the portfolio-level circuit-breaker service. It watches trading.* aggregates and market_data.* (via named views) and intervenes when portfolio-level thresholds are breached. Five independent breakers evaluate on every scheduler tick:

  • DrawdownBreaker: peak-to-trough equity
  • ConnectivityBreaker: gordon-data freshness
  • VpinBreaker: toxic-flow kill switch via live VPIN computation
  • MacroBreaker: FRED macro regime
  • CorrelationBreaker: cross-asset correlation density

When any breaker fires, the halt-latch is set and flatten + pause commands are issued to gordon-executor and gordon-bot respectively. The emergency-flatten endpoint supports operator-initiated flattening with a drill target of under 30 seconds end-to-end. The service never touches individual orders; per-order invariants live in gordon-executor.
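The tick behaviour described above could look roughly like the following. This is a minimal sketch, not the service's actual code: the Breaker trait, the Snapshot fields, the tick function, and the thresholds (10% drawdown, 30 s data age) are all assumptions made for illustration, and only two of the five breakers are shown.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical snapshot of tick inputs (field names are assumptions).
struct Snapshot {
    peak_equity: f64,
    equity: f64,
    data_age_secs: u64,
}

/// Hypothetical breaker interface: returns a reason when the breaker fires.
trait Breaker {
    fn name(&self) -> &'static str;
    fn evaluate(&self, snap: &Snapshot) -> Option<String>;
}

struct DrawdownBreaker {
    max_drawdown: f64, // assumed threshold, as a fraction of peak equity
}

struct ConnectivityBreaker {
    max_age_secs: u64, // assumed freshness limit
}

impl Breaker for DrawdownBreaker {
    fn name(&self) -> &'static str {
        "drawdown"
    }
    fn evaluate(&self, snap: &Snapshot) -> Option<String> {
        // Peak-to-trough drawdown as a fraction of peak equity.
        let dd = (snap.peak_equity - snap.equity) / snap.peak_equity;
        (dd > self.max_drawdown).then(|| format!("drawdown {:.2} exceeds {:.2}", dd, self.max_drawdown))
    }
}

impl Breaker for ConnectivityBreaker {
    fn name(&self) -> &'static str {
        "connectivity"
    }
    fn evaluate(&self, snap: &Snapshot) -> Option<String> {
        (snap.data_age_secs > self.max_age_secs)
            .then(|| format!("data stale for {}s", snap.data_age_secs))
    }
}

/// Evaluates every breaker independently and sets the latch if any fired.
/// Returns the names of the breakers that fired on this tick.
fn tick(breakers: &[Box<dyn Breaker>], snap: &Snapshot, halt_latch: &AtomicBool) -> Vec<&'static str> {
    let fired: Vec<&'static str> = breakers
        .iter()
        .filter(|b| b.evaluate(snap).is_some())
        .map(|b| b.name())
        .collect();
    if !fired.is_empty() {
        // The latch is only ever set here; clearing it is a separate operator action.
        halt_latch.store(true, Ordering::SeqCst);
    }
    fired
}

fn main() {
    let breakers: Vec<Box<dyn Breaker>> = vec![
        Box::new(DrawdownBreaker { max_drawdown: 0.10 }),
        Box::new(ConnectivityBreaker { max_age_secs: 30 }),
    ];
    let latch = AtomicBool::new(false);
    let snap = Snapshot { peak_equity: 100_000.0, equity: 85_000.0, data_age_secs: 5 };
    let fired = tick(&breakers, &snap, &latch);
    println!("fired={:?} halted={}", fired, latch.load(Ordering::SeqCst));
}
```

Note that the sketch never clears the latch inside tick, which mirrors the rule that resuming is an explicit operator step.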

Version + port + env var

Field         Value
Version       3.4.1
Port          8082
Env override  GORDON_RISK_BIND_ADDR
DB role       gordon_risk
Image         ghcr.io/dlepaux/gordon-risk

HTTP endpoints

Health / ops

Method  Path      Purpose
GET     /healthz  Liveness + degraded-state probe (breaker_eval_stale, vpin_missing, vpin_stale)
GET     /readyz   Readiness probe
GET     /metrics  Prometheus metrics
GET     /config   Redacted config dump
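As an illustration of how the three degraded-state flags reported by /healthz might be derived, here is a small sketch. The flag names come from the table above; the HealthInputs struct, the degraded_flags function, and both staleness limits are assumptions, since the real thresholds are not stated on this page.

```rust
use std::time::Duration;

/// Hypothetical inputs to the health probe; None means "never observed".
struct HealthInputs {
    since_last_breaker_eval: Duration,
    since_last_vpin: Option<Duration>,
}

// Assumed staleness limits, chosen only for the example.
const MAX_EVAL_AGE: Duration = Duration::from_secs(10);
const MAX_VPIN_AGE: Duration = Duration::from_secs(60);

/// Returns the degraded-state flags that apply; an empty list means healthy.
fn degraded_flags(h: &HealthInputs) -> Vec<&'static str> {
    let mut flags = Vec::new();
    if h.since_last_breaker_eval > MAX_EVAL_AGE {
        flags.push("breaker_eval_stale");
    }
    match h.since_last_vpin {
        // No VPIN value has ever been computed.
        None => flags.push("vpin_missing"),
        // A value exists but is older than the assumed limit.
        Some(age) if age > MAX_VPIN_AGE => flags.push("vpin_stale"),
        Some(_) => {}
    }
    flags
}

fn main() {
    let inputs = HealthInputs {
        since_last_breaker_eval: Duration::from_secs(42),
        since_last_vpin: None,
    };
    println!("degraded: {:?}", degraded_flags(&inputs));
}
```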

Business endpoints

Method  Path                Purpose
POST    /emergency-flatten  Operator-initiated full portfolio flatten (token-gated)
POST    /risk/resume        Clear halt latch after operator review (token-gated)
POST    /bots/{id}/pause    Pause a specific bot (column-level update on bot_configs.desired_state)
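One way a token gate for the two token-gated endpoints could be structured is sketched below. Everything here is hypothetical: the page does not say how the token is transmitted or compared, so the Gate enum, check_operator_token, and the fail-closed behaviour when no token is configured are assumptions. The byte-wise comparison avoids an early exit on the first mismatch, but a plain loop like this is not a guaranteed constant-time primitive under compiler optimization.

```rust
/// Compares two byte strings without returning early on the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical outcome of the gate check.
#[derive(Debug, PartialEq)]
enum Gate {
    Allowed,
    Unauthorized,
    Misconfigured,
}

/// `configured` would come from GORDON_RISK_OPERATOR_TOKEN; `presented` from the request.
fn check_operator_token(configured: Option<&str>, presented: Option<&str>) -> Gate {
    match (configured, presented) {
        // Assumption: fail closed when no token (or an empty one) is configured.
        (None, _) | (Some(""), _) => Gate::Misconfigured,
        (Some(_), None) => Gate::Unauthorized,
        (Some(want), Some(got)) if constant_time_eq(want.as_bytes(), got.as_bytes()) => Gate::Allowed,
        _ => Gate::Unauthorized,
    }
}

fn main() {
    let outcome = check_operator_token(Some("example-token"), Some("example-token"));
    println!("{:?}", outcome);
}
```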

NATS subjects

Subject                          Direction                                       Durable consumer
risk.events.{breaker_lowercase}  Publishes breaker state-change events           (none)
risk.commands                    Publishes flatten + pause commands to executor  (none)

gordon-risk does not consume NATS subjects. It receives DB state via direct Postgres reads and emits commands via both pg-NOTIFY (risk_commands channel) and NATS.
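To make the templated subject concrete, the sketch below expands risk.events.{breaker_lowercase} for each of the five breakers. The BreakerKind enum and the exact lowercase tokens are assumptions: the page does not say whether the token is the short name (for example "vpin") or the full type name, so treat the strings as illustrative.

```rust
/// Hypothetical enum over the five breakers named in this page.
#[derive(Debug, Clone, Copy)]
enum BreakerKind {
    Drawdown,
    Connectivity,
    Vpin,
    Macro,
    Correlation,
}

impl BreakerKind {
    const ALL: [BreakerKind; 5] = [
        BreakerKind::Drawdown,
        BreakerKind::Connectivity,
        BreakerKind::Vpin,
        BreakerKind::Macro,
        BreakerKind::Correlation,
    ];

    /// Assumed lowercase token used in the subject name.
    fn token(self) -> &'static str {
        match self {
            BreakerKind::Drawdown => "drawdown",
            BreakerKind::Connectivity => "connectivity",
            BreakerKind::Vpin => "vpin",
            BreakerKind::Macro => "macro",
            BreakerKind::Correlation => "correlation",
        }
    }

    /// Builds the state-change subject for this breaker.
    fn subject(self) -> String {
        format!("risk.events.{}", self.token())
    }
}

fn main() {
    for kind in BreakerKind::ALL {
        println!("{:?} -> {}", kind, kind.subject());
    }
}
```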

Database access

Action               Detail
Reader               trading.* (portfolio valuation, positions, bot configs)
Writer               trading.risk_events, trading.risk_audit_log (INSERT only)
Writer               trading.bot_commands (INSERT)
Column-level update  bot_configs.desired_state only (migration 0011 GRANT)
Views used           v_klines_reader (VPIN live computation from spot klines), v_metrics_reader, v_macro_reader
DB role              gordon_risk (least-privilege, migration 0044)

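gordon-risk does not depend on the gordon-data HTTP API for data: the data contract is the Postgres schema, not the service surface. The only HTTP contact with gordon-data is the ConnectivityBreaker healthz probe (see GORDON_RISK_DATA_URL).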

Prometheus metrics

  • gordon_risk_breaker_evaluations_total
  • gordon_risk_breaker_fires_total
  • gordon_risk_bot_commands_issued_total
  • gordon_risk_command_write_errors_total
  • gordon_risk_operator_flatten_requests_total
  • gordon_risk_operator_resume_requests_total
  • gordon_risk_operator_pause_bot_requests_total
  • gordon_risk_flatten_crashed_during_total
  • gordon_errors_total

Alert: rate(gordon_risk_command_write_errors_total[5m]) > 0 triggers GordonRiskCommandWriteFailed (critical).

Key env vars

Variable                    Purpose
GORDON_RISK_BIND_ADDR       HTTP bind address (default :8082)
GORDON_DATABASE_URL         Postgres connection string
GORDON_BUS_NATS_URL         NATS JetStream URL
GORDON_RISK_OPERATOR_TOKEN  Token required on /emergency-flatten and /risk/resume
GORDON_RISK_DATA_URL        gordon-data healthz base URL (ConnectivityBreaker probe, default http://gordon-data:8081)
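A single-chokepoint loader for these variables might look like the sketch below. Config::from_env is named elsewhere in this page, but its real shape is not shown, so the field names, the from_lookup helper, and the choice of which variables are required are assumptions. The two defaults are the ones listed in the table.

```rust
use std::env;

/// Hypothetical config struct; field names are assumptions.
#[allow(dead_code)]
#[derive(Debug)]
struct Config {
    bind_addr: String,
    database_url: String,
    nats_url: String,
    operator_token: String,
    data_url: String,
}

impl Config {
    /// The only place in this sketch that touches std::env::var.
    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| env::var(key).ok())
    }

    /// Testable core: `lookup` abstracts the process environment.
    fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        // Assumption: variables without a documented default are required.
        let required = |key: &str| lookup(key).ok_or_else(|| format!("{key} is not set"));
        Ok(Config {
            bind_addr: lookup("GORDON_RISK_BIND_ADDR").unwrap_or_else(|| ":8082".to_string()),
            database_url: required("GORDON_DATABASE_URL")?,
            nats_url: required("GORDON_BUS_NATS_URL")?,
            operator_token: required("GORDON_RISK_OPERATOR_TOKEN")?,
            data_url: lookup("GORDON_RISK_DATA_URL")
                .unwrap_or_else(|| "http://gordon-data:8081".to_string()),
        })
    }
}

fn main() {
    match Config::from_env() {
        // Print only the bind address so the token never reaches the logs.
        Ok(cfg) => println!("listening on {}", cfg.bind_addr),
        Err(e) => eprintln!("config error: {e}"),
    }
}
```

Routing the lookup through a closure keeps the loader testable without mutating the process environment.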

Invariants

  • Portfolio-level only. Per-order invariants live in gordon-executor. This service commands, never executes.
  • Authoritative kill switch. When flatten fires, every bot pauses + every open position closes. No exceptions.
  • Halt latch is a hard barrier. Full contract: halt-latch. Resume requires explicit POST /risk/resume.
  • Sole writer of audit tables. risk_events, risk_audit_log, and the halt-latch columns on risk_state are written by gordon-risk only.
  • bot_configs write is column-scoped. Risk holds a column-level UPDATE grant on desired_state only. Every write is captured by bot_configs_audit_trigger and tagged changed_by = CURRENT_USER.
  • Fast flatten. Drill target: <30s end-to-end. Every flatten code path is measured.
  • Single env chokepoint. Every std::env::var call goes through Config::from_env().
  • No strategy knowledge. Risk does not know what a bot is trading, only that portfolio-level bounds are breached.
  • DP-18 startup flatten-reconcile probe (shipped 2026-05-17). On every boot, reconcile_orphaned_flattens() scans risk_audit_log for flatten_requested rows that never transitioned to flatten_completed or flatten_crashed_during, and closes them with a structured audit row. Metric: gordon_risk_flatten_crashed_during_total.
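The startup reconcile described in the last invariant can be illustrated with an in-memory model. This is a sketch under stated assumptions, not the shipped code: the real probe works against risk_audit_log in Postgres, whereas here the log is a Vec, the AuditRow and Event types are hypothetical, and closing an orphan by appending a flatten_crashed_during row is inferred from the metric name.

```rust
use std::collections::HashSet;

/// Hypothetical audit event kinds, named after the row types in the text.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    FlattenRequested,
    FlattenCompleted,
    FlattenCrashedDuring,
}

/// Hypothetical audit row; the real table has more columns.
#[derive(Debug, Clone)]
struct AuditRow {
    flatten_id: u64,
    event: Event,
}

/// Appends a closing row for every orphaned request and returns how many were
/// closed (the value a counter such as gordon_risk_flatten_crashed_during_total
/// would be incremented by).
fn reconcile_orphaned_flattens(log: &mut Vec<AuditRow>) -> usize {
    // Flatten ids that already reached a terminal state.
    let terminal: HashSet<u64> = log
        .iter()
        .filter(|r| r.event != Event::FlattenRequested)
        .map(|r| r.flatten_id)
        .collect();
    // Requests with no terminal row are orphans from a previous crash.
    let orphans: Vec<u64> = log
        .iter()
        .filter(|r| r.event == Event::FlattenRequested && !terminal.contains(&r.flatten_id))
        .map(|r| r.flatten_id)
        .collect();
    for id in &orphans {
        log.push(AuditRow { flatten_id: *id, event: Event::FlattenCrashedDuring });
    }
    orphans.len()
}

fn main() {
    let mut log = vec![
        AuditRow { flatten_id: 1, event: Event::FlattenRequested },
        AuditRow { flatten_id: 1, event: Event::FlattenCompleted },
        AuditRow { flatten_id: 2, event: Event::FlattenRequested },
    ];
    let closed = reconcile_orphaned_flattens(&mut log);
    println!("closed {} orphaned flatten(s)", closed);
}
```

Because closed orphans gain a terminal row, running the function a second time finds nothing to do, which is the property a boot-time probe needs.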

Status

Phase 4 Risk: 7/8 stories done. Five circuit breakers live. Emergency flatten escalation state machine + resume endpoint shipped. Coverage 88.33% overall (safety-critical floor 85%). One story open.
