CrispyTrader / SBFB — Operator & Architecture Reference

This page is the day-to-day operator manual and a deep-dive into how the Slow Brain / Fast Brain (SBFB) system actually works. Every command on this page is the same one used by the launcher scripts and the ops CLI in this repo. When something on this page disagrees with the spec, the spec (MVP / FSD / PSD) wins.

Real money: This is an algorithmic futures trading system. The architecture rule in .cursor/rules/00-architecture.mdc is non-negotiable: LLM output never reaches the OMS without (1) Pydantic schema validation, (2) sanity envelopes, and (3) the prop-firm-aware risk gate. The fast brain must never call an LLM API. The slow brain must never call the broker. On any uncertainty: degrade to last-known-good and stop opening new positions.

1. Overview & mental model

SBFB is exactly two layers and one bus between them. Get this picture and the rest of the system snaps into place.

flowchart LR
    subgraph SlowBrain [Slow Brain - LangGraph swarm]
        direction TB
        News["News Analyst<br/>Haiku 4.5"]
        Tech["Tech Analyst<br/>Haiku 4.5"]
        Composer["Thesis Composer<br/>Sonnet 4.6"]
        Cross["Cross-check<br/>OPENAI_GPT_MODEL"]
        Reviewer["Risk Reviewer<br/>Sonnet 4.6"]
        News --> Composer
        Tech --> Composer
        Composer --> Cross
        Composer --> Reviewer
        Cross --> Reviewer
    end
    subgraph Validation [Pydantic schema and sanity envelopes and diff guard]
        Schema["Schema validation"]
        Env["Sanity envelopes"]
        Diff["Diff guard"]
        Schema --> Env --> Diff
    end
    subgraph FastBrain [Fast Brain - NautilusTrader deterministic]
        direction TB
        Bars["Databento bars"]
        Signal["Signal engine"]
        Gate["Prop-firm risk gate"]
        Sink["Execution sink"]
        Bars --> Signal --> Gate --> Sink
    end
    Reviewer --> Validation
    Validation --> Postgres["Postgres<br/>theses, params_versions"]
    Validation --> Redis["Redis<br/>params:active"]
    Redis --> FastBrain
    Sink -->|"paper"| Logs["Logs only"]
    Sink -->|"discord_advisor"| Discord["sbfb-trades channel"]
    Sink -->|"tradovate_auto"| TVBroker["Tradovate REST/WS"]
    Sink -->|"topstepx_auto"| TXBroker["TopstepX ProjectX REST<br/>(SignalR deferred)"]

The 60-second mental model

  1. Slow Brain wakes up on a schedule (06:30 PT + top-of-hour 07:00–13:00 PT, M–F), reads news + bars, and emits a Pydantic-validated Thesis + Parameters.
  2. That object passes schema validation, hard-coded sanity envelopes (max_contracts, stop range, validity window), and a diff guard (>50% sizing change or direction flip needs a co-sign).
  3. The accepted version lands in Postgres and is published to Redis under params:active.
  4. Fast Brain consumes params:active, watches bars, runs the signal engine, and on every potential entry pushes the order through the prop-firm risk gate before the execution sink does anything.
  5. The execution sink is one of four: paper (log only), discord_advisor (Discord card + operator ack), tradovate_auto (bracketed Tradovate REST order with isAutomated: true), or topstepx_auto (bracketed TopstepX POST /api/Order/place with tick-based stopLossBracket/takeProfitBracket and customTag=auto=true). All four share the same gate and signal path.
  6. The Discord bot is the journal: trade open cards, threaded close replies, redacted error reports. In discord_advisor mode it also accepts ✅/❌ reactions from allow-listed user ids as ack/decline. It is not a control surface.
  7. The practice bot (scripts/practice_walkthrough.py) is a CLI that replays accepted theses or synthetic ORB-long theses against real Databento bars to estimate what the system would have done.

2. Prerequisites

Toolchain (Windows)

  • Docker Desktop — runs Postgres 16, Redis 7, Prometheus, Loki, Grafana, the watchdog, and the fast brain container.
  • uv (docs.astral.sh/uv) — Python env manager; the launcher refuses to run without it.
  • Python 3.12 — pinned in pyproject.toml (requires-python = ">=3.12,<3.13").
  • PowerShell — the launcher and slow-brain loop are .ps1 with .bat wrappers you can double-click.

External accounts & keys

Anthropic
Claude Sonnet 4.6 + Haiku 4.5. Key in ANTHROPIC_API_KEY.
OpenAI
Cross-family check only (OPENAI_GPT_MODEL, default gpt-5.4-mini). Key in OPENAI_API_KEY.
Databento
CME futures bars. Key in DATABENTO_API_KEY.
Discord application
Bot user + invite to your server. See §7.
Tradovate Demo
Optional until you flip FAST_BRAIN_EXECUTION_MODE=tradovate_auto. Live mode is hard-refused by code until PSD Phase 2.
TopstepX (ProjectX API)
Optional until you flip FAST_BRAIN_EXECUTION_MODE=topstepx_auto. Requires the $29/mo TopstepX API add-on; needs a username + API key. Live mode is hard-refused by code until PSD Phase 2.

Apex: Do not point this system at Apex Trader Funding. Apex's compliance page prohibits all forms of automation; running SBFB there forfeits balances and closes accounts. See PSD §5.

Recommended prop-firm path

  1. MFFU Pro $50K — full automation explicitly permitted (since 2025-07-23).
  2. Topstep $50K — also allowed but the TopstepX API forbids VPS, so the fast brain has to run on your workstation.
  3. Tradeify Select — automation-friendly “Flex Policy”; pair it with FAST_BRAIN_EXECUTION_MODE=discord_advisor when the API is gated.

3. First-time bring-up

Do these once, in order, from the repo root:

Step 1 — fill in .env

Copy your real API keys into the variables defined in §12.1. Leave FAST_BRAIN_EXECUTION_MODE=paper; if you've also configured broker creds, keep TRADOVATE_ENV=demo and TOPSTEPX_ENV=demo for the first run.

Step 2 — bring up local infrastructure

docker compose up -d --wait

Expect 7 healthy services: postgres, redis, prometheus, loki, grafana, watchdog, fast-brain.

Step 3 — sync Python deps

uv sync

Step 4 — run database migrations

uv run python -m alembic upgrade head

Step 5 — smoke every external system

uv run python -m ops smoke

Expects up for postgres + redis, and configured or up for tradovate_demo / anthropic / databento. Anything down aborts the smoke. Note: the TopstepX broker is not yet in the smoke set — it ships in the same follow-up that adds SignalR realtime and ops flat dispatch for TopstepX.

Step 6 — fast-brain heartbeat (one-shot)

uv run python -m fast_brain.run --once

Boots the strategy, confirms the data feed, and exits. Used by scripts/launch.ps1 to fail fast if the fast brain can't start.

Step 7 — slow-brain pass (one-shot)

uv run python -m slow_brain.run --as-of 2026-05-08T14:30:00Z --live-llm

Replace the timestamp with a current UTC ISO timestamp. The happy-path proof is an accepted thesis; a clean rejection still proves the infrastructure works.
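One way to produce a current value for --as-of is sketched below. The helper names utc_iso and utc_iso_now are illustrative and not part of the repo; the output format simply mirrors the 2026-05-08T14:30:00Z example above.

```python
from datetime import datetime, timezone


def utc_iso(ts: datetime) -> str:
    """Format an aware datetime as a second-resolution UTC ISO string with a Z suffix."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def utc_iso_now() -> str:
    """Current time in the same shape as the --as-of examples on this page."""
    return utc_iso(datetime.now(timezone.utc))
```

Printing utc_iso_now() gives a string that can be pasted after --as-of.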

Done: If steps 5–7 all printed green, you can move on to §4 Daily startup. The Makefile bundles a few of these: make up, make migrate, make fast-brain, make slow-brain, make smoke.

4. Daily startup

The day-to-day workflow is two double-clicks.

4.1 Bring the stack up

scripts\launch.bat

This wraps scripts\launch.ps1 with -KeepWindow so the window stays open. The launcher:

  1. Loads .env into the process environment.
  2. Verifies docker and uv are on PATH.
  3. Starts Docker Desktop if the daemon isn't reachable, with a 4-second probe and a 180-second wait.
  4. Runs docker compose up -d --wait, then uv sync, then alembic upgrade head.
  5. Runs a one-shot fast-brain heartbeat.
  6. Runs one slow-brain pass at now (skip with -NoSlowBrain).
  7. Opens Grafana at http://localhost:3000 (admin/admin).

Any failed step posts the last 60 log lines to #sbfb-errors via scripts/post_discord_error.py and aborts.

4.2 Start the hourly slow-brain loop

scripts\slow_brain_loop.bat

This wraps scripts\slow_brain_loop.ps1 and stays running all session. It fires uv run python -m slow_brain.run --as-of <utc-iso> --live-llm at these Pacific local times, Mon–Fri, holidays skipped:

Slot (PT) | Equivalent (ET) | Purpose
06:30 | 09:30 | RTH open bias
07:00 | 10:00 | top-of-hour refresh
08:00 | 11:00 | top-of-hour refresh
09:00 | 12:00 | top-of-hour refresh
10:00 | 13:00 | top-of-hour refresh
11:00 | 14:00 | top-of-hour refresh
12:00 | 15:00 | last full-hour refresh
13:00 | 16:00 | RTH close cycle
  • If you boot inside the 5-minute grace window after a slot, that slot fires immediately.
  • Cycle failures (missing OPENAI key, transient Anthropic 5xx, Postgres hiccup, etc.) are posted to #sbfb-errors and the loop keeps running. Missing one slot is bad; missing every subsequent slot is worse.
  • The holiday list is hardcoded for 2026–2027 in slow_brain_loop.ps1; refresh annually against CME's holiday calendar.
  • Preview the next eight slots without firing anything:
powershell -ExecutionPolicy Bypass -File scripts\slow_brain_loop.ps1 -DryRun
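The slot rules above (fixed PT times, Mon–Fri, holidays skipped, 5-minute grace window) can be sketched in Python as follows. This is an illustration of the scheduling logic only, not a port of slow_brain_loop.ps1: the function name next_slots, the holidays argument, and the exact boundary of the grace window are assumptions.

```python
from datetime import date, datetime, time, timedelta
from zoneinfo import ZoneInfo

PT = ZoneInfo("America/Los_Angeles")
# Slot times from the schedule table: 06:30 plus top-of-hour 07:00-13:00 PT.
SLOTS = [time(6, 30)] + [time(hour, 0) for hour in range(7, 14)]
GRACE = timedelta(minutes=5)  # assumed to be an exclusive upper bound


def next_slots(now: datetime, count: int,
               holidays: frozenset = frozenset()) -> list:
    """Return the next `count` slot datetimes in PT, Mon-Fri, skipping `holidays`.

    A slot that passed less than GRACE ago is still returned, mirroring the
    "fires immediately" rule for a boot inside the grace window.
    """
    now_pt = now.astimezone(PT)
    found = []
    day = now_pt.date()
    while len(found) < count:
        if day.weekday() < 5 and day not in holidays:
            for slot in SLOTS:
                fire_at = datetime.combine(day, slot, tzinfo=PT)
                if fire_at + GRACE > now_pt and len(found) < count:
                    found.append(fire_at)
        day += timedelta(days=1)
    return found
```

Calling next_slots(datetime.now(PT), 8) gives a preview comparable to the -DryRun output, provided the hardcoded holiday list is passed in as a frozenset of dates.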

4.3 Verify the stack is armed

uv run python -m ops status
uv run python -m ops slow-brain-status

In advisor mode, ops status reports fast_brain_execution_mode: discord_advisor and broker_connection: n/a (discord_advisor) so the missing broker websocket isn't mistaken for a fault.

5. The Trader (fast brain)

The fast brain is a NautilusTrader strategy. It owns the hot path and is the only thing that talks to the broker.

5.1 Entry points

uv run python -m fast_brain.run
Long-running strategy loop (the one in the fast-brain docker service).
uv run python -m fast_brain.run --once
Boots, heartbeats once, exits. Used by the launcher to fail fast.
make fast-brain
Shorthand for the live mode (DATA_MODE=live).
make backtest-orb
Runs the data source in backtest mode (DATA_MODE=backtest) for ORB sweeps.

5.2 Execution modes

The single environment variable FAST_BRAIN_EXECUTION_MODE picks the sink at startup. The slow-brain output, signal generator, and risk gate are identical across modes; only the sink differs.

Mode | Sink class | Broker action | Operator action | Persistence
paper (default) | PaperSink | log only | none | none
discord_advisor | DiscordAdvisorSink | none; bot posts a TradeAdvisoryEvent | place trade in broker, react ✅/❌, or uv run python -m ops ack-fill | advisories rows
tradovate_auto (PSD Phase 2) | TradovateSink | bracketed REST order with isAutomated: true | monitor | orders / fills rows
topstepx_auto (PSD Phase 2) | TopstepxSink | bracketed POST /api/Order/place with tick-based stopLossBracket / takeProfitBracket and customTag=auto=true; reconciliation polls /api/Order/searchOpen + /api/Position/searchOpen every 60 s. SignalR realtime stream is deferred. | monitor | orders / fills rows
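A minimal sketch of how a single environment variable can select the sink at startup. The mapping uses the mode and class names from the table; the function resolve_sink, the fallback to paper when the variable is unset, and the choice to raise on an unknown value are assumptions, not the repo's actual startup code.

```python
from typing import Mapping

# Mode -> sink class name, as listed in the execution-mode table.
SINK_BY_MODE = {
    "paper": "PaperSink",
    "discord_advisor": "DiscordAdvisorSink",
    "tradovate_auto": "TradovateSink",
    "topstepx_auto": "TopstepxSink",
}


def resolve_sink(env: Mapping[str, str]) -> str:
    """Return the sink class name for FAST_BRAIN_EXECUTION_MODE.

    Assumptions: an unset variable falls back to "paper" (the documented
    default) and an unrecognised value raises instead of guessing.
    """
    mode = env.get("FAST_BRAIN_EXECUTION_MODE", "paper").strip()
    if mode not in SINK_BY_MODE:
        raise ValueError(f"unknown FAST_BRAIN_EXECUTION_MODE: {mode!r}")
    return SINK_BY_MODE[mode]
```

Failing on an unknown mode keeps a typo in .env from silently routing orders to the wrong sink.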

5.3 Switching mode

  1. Edit .env: set FAST_BRAIN_EXECUTION_MODE to paper, discord_advisor, tradovate_auto, or topstepx_auto.
  2. If switching to tradovate_auto, also fill in the TRADOVATE_* block and confirm TRADOVATE_ENV=demo for now.
  3. If switching to topstepx_auto, fill in the TOPSTEPX_* block (USERNAME, API_KEY, TOKEN_ENCRYPTION_KEY), confirm TOPSTEPX_ENV=demo, and either set TOPSTEPX_ACCOUNT_ID or leave it blank to auto-discover via POST /api/Account/search at startup. Also point FAST_BRAIN_PROFILE_PATH at profiles/topstep_50k_combine.yaml only after attaching a dated Topstep rules PDF (the in-tree YAML is a stub).
  4. Restart only the fast brain; the rest of the stack stays up:
docker compose restart fast-brain
docker compose logs -f fast-brain

5.4 Order rules (fast brain side)

  • Bracket orders only — entry + stop + target. Market on entry, stop-market and limit on exits. No naked stops.
  • One open position per instrument in MVP; partial fills handled broker-side (Tradovate OCO; TopstepX Auto-OCO via the bracket payload).
  • Every order is tagged for automation per CME Group rules: Tradovate uses isAutomated: true; TopstepX uses customTag=thesis=…;params=…;auto=true.
  • Every fill writes to Postgres and recomputes session + account-level realized and unrealized P&L. Breach ⇒ immediate flatten + disable.
  • Reconciliation runs every 60 s: pulls open orders + positions from the broker (/orders + /positions on Tradovate; /api/Order/searchOpen + /api/Position/searchOpen on TopstepX), compares to local truth, alerts on drift.
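The reconciliation comparison in the last bullet reduces to a diff between two position maps. The sketch below assumes positions are already normalised to net signed quantity per instrument; find_drift and that data shape are hypothetical, and the real reconcilers also compare open orders.

```python
def find_drift(local: dict, broker: dict) -> dict:
    """Compare net position per instrument.

    Returns {instrument: (local_qty, broker_qty)} for every instrument where
    the two sides disagree; an empty dict means no drift.
    """
    drift = {}
    for instrument in sorted(set(local) | set(broker)):
        local_qty = local.get(instrument, 0)
        broker_qty = broker.get(instrument, 0)
        if local_qty != broker_qty:
            drift[instrument] = (local_qty, broker_qty)
    return drift
```

A non-empty result is what would trigger the drift alert; an instrument missing on one side counts as a quantity of zero.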

5.5 Boot-time data feed flag

FAST_BRAIN_START_DATABENTO_LIVE=true

Set this to false if your Databento plan does not include live GLBX.MDP3. With historical-only access, the live subscription raises 422 dataset_unavailable_range on boot and the fast brain crashes before it can heartbeat.

6. The Slow Brain

The slow brain is a small LangGraph swarm. Every cycle is one Python process; the loop in scripts/slow_brain_loop.ps1 runs it on a schedule.

6.1 Run it directly

uv run python -m slow_brain.run --as-of 2026-05-06T14:30:00Z              # uses VCR-recorded responses
uv run python -m slow_brain.run --as-of 2026-05-06T14:30:00Z --live-llm   # hits Anthropic + OpenAI
uv run python -m slow_brain.debrief --as-of 2026-05-06T20:30:00Z --live-llm

6.2 Agent graph

flowchart TD
    Begin([START])
    NewsAnalyst["news_analyst<br/>Haiku 4.5 plus Polygon news"]
    TechAnalyst["tech_analyst<br/>Haiku 4.5 plus Databento bars"]
    Composer["thesis_composer<br/>Sonnet 4.6 structured output"]
    CrossCheck["cross_check<br/>OPENAI_GPT_MODEL"]
    RiskReviewer["risk_reviewer<br/>Sonnet 4.6"]
    Validation["validation<br/>schema + envelopes + diff guard"]
    Persist[("Postgres and Redis publish")]
    Begin --> NewsAnalyst
    Begin --> TechAnalyst
    NewsAnalyst -->|"NewsReport"| Composer
    TechAnalyst -->|"RegimeReport"| Composer
    Composer -->|"Thesis plus Parameters"| CrossCheck
    Composer --> RiskReviewer
    CrossCheck -->|"direction_bias"| Validation
    RiskReviewer -->|"approve / reject / mutate"| Validation
    Validation --> Persist

6.3 Schedule (per FSD §2.2)

  • 08:30 ET — pre-market bias.
  • 10:30 / 12:30 / 14:30 ET — intraday refreshes.
  • 16:30 ET — post-close debrief; writes lessons to the reflections table (bounded BM25 memory).

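The PowerShell loop fires on its own PT schedule (06:30 plus top-of-hour 07:00–13:00 PT, i.e. 09:30 and 10:00–16:00 ET; see §4.2), which does not line up one-to-one with the FSD times above. Each call is one process; nothing is shared between cycles except what's persisted.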

6.4 Provider strategy

news_analyst, tech_analyst
Haiku 4.5 ($1/$5 per MTok). Prompt caching saves up to 90% on repeat system prompts.
thesis_composer, risk_reviewer
Sonnet 4.6 ($3/$15 per MTok), temperature=0, structured output via Pydantic schema.
cross_check
OpenAI OPENAI_GPT_MODEL (default gpt-5.4-mini) — cross-family check on direction_bias. Disagreement coerces published params to flat.

OPENAI_GPT_MODEL defaults to gpt-5.4-mini (downgraded on 2026-05-08 from the originally spec'd gpt-5.5 for cost). Both are OpenAI gpt-5.x models, so the cross-check still comes from a different model family than the Anthropic composer and the FSD §2.2 cross-family-diversity property is preserved. OPENAI_GPT_MODEL must not be wired into the composer, risk reviewer, or any fast-brain path.
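The disagreement rule for cross_check is small enough to state as code. This is a sketch of the rule as described in this section; the function name published_direction is hypothetical.

```python
from typing import Literal

Direction = Literal["long", "short", "flat"]


def published_direction(composer_bias: Direction, cross_check_bias: Direction) -> Direction:
    """Keep the composer's bias only when the cross-check agrees; otherwise publish flat."""
    return composer_bias if composer_bias == cross_check_bias else "flat"
```

The rule is deliberately one-sided: disagreement can only remove exposure, never add it.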

6.5 Token budget

~$5–$15/day at 4× intraday cadence with cache hits. The LLM_DAILY_BUDGET_USD env var caps spend; the swarm refuses to fire when over budget.
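The per-call arithmetic behind that estimate can be sketched with the prices from §6.4. The sketch assumes the two figures in each pair are input and output prices per million tokens; the token counts in the usage note are made up for illustration, and the comparison used in may_fire (refuse once spend reaches the cap) is an assumption about how LLM_DAILY_BUDGET_USD is enforced.

```python
# USD per million tokens as (input, output), taken from the provider list in 6.4.
PRICES = {"haiku-4.5": (1.0, 5.0), "sonnet-4.6": (3.0, 15.0)}


def call_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call, ignoring prompt-cache discounts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def may_fire(spent_today_usd: float, daily_budget_usd: float) -> bool:
    """Budget check: refuse once today's spend has reached the cap (assumed rule)."""
    return spent_today_usd < daily_budget_usd
```

For example, a hypothetical Sonnet call with 20,000 input and 2,000 output tokens costs (20,000 × 3 + 2,000 × 15) / 1,000,000 = $0.09 before any cache savings.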

6.6 Re-record VCR fixtures

make rerecord-prompts
# equivalent to:
ALLOW_LIVE_LLM_RECORDING=1 uv run pytest tests/slow_brain/regression --record-mode=rewrite

7. The Discord Bot

The Discord bot is the journal, not a control surface. It publishes what the trader does (open / close / errors / health) and, in discord_advisor mode, accepts ✅/❌ reactions on its own advisory cards as ack/decline. Slash commands, DMs, and message-content reads remain forbidden.

7.1 One-time bot setup

  1. Create a Discord application at discord.com/developers/applications; name it SBFB Trader.
  2. Enable the Bot user, copy its token to DISCORD_BOT_TOKEN.
  3. OAuth2 URL Generator scopes: bot, applications.commands. Permissions: Send Messages, Embed Links, Add Reactions, Manage Messages, Read Message History, Use External Emojis.
  4. Invite the bot to your server with the generated URL.
  5. Create channels #sbfb-trades, #sbfb-errors, #sbfb-system; copy the guild + channel ids into .env.
  6. Optional: create a Discord role for CRITICAL pages and put its id in DISCORD_ROLE_CRITICAL.

7.2 Channel layout

Channel | Purpose
#sbfb-trades | One card per trade open; closes thread under their open
#sbfb-errors | Redacted error cards; dedupe reactions and seen N× footers
#sbfb-system | Reserved for system-level health (broker disconnect, etc.)
#sbfb-debriefs | Optional: end-of-day debrief markdown threads (post-MVP)

7.3 Safety properties

  • Outbound only by default. No interactive components, no slash commands, no DMs. intents.message_content = False. The bot cannot read user messages.
  • Reaction-ack carve-out. When DISCORD_REACTION_ACK_ENABLED=true the bot enables intents.reactions and listens to on_raw_reaction_add. The listener gates on:
    1. the message must have been authored by the bot,
    2. the reactor's user id must appear in DISCORD_OPERATOR_USER_IDS,
    3. the emoji must be ✅ or ❌.
    Anything else is dropped and logged as discord_reaction_unauthorised.
  • Redaction before send. Every text field that reaches Discord passes through redact() (Anthropic / OpenAI / Slack / Discord / AWS keys, Bearer headers, JWTs, Postgres + Redis credentials, account numbers, emails). The redaction module is held to 100% line + branch coverage.
  • Hot path never blocks. dispatch() enqueues with put_nowait and drops on overflow; the worker is the only Discord caller.
  • Discord outage does not stop trading. The dispatcher drops on queue full and logs one discord_queue_overflow per streak.
  • Allowed mentions are pinned. Every send passes discord.AllowedMentions(everyone=False, users=False, roles=True).
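Two of the properties above, the three-part reaction gate and the non-blocking dispatch, can be sketched as pure functions. This is not the bot's code: classify_reaction and dispatch are illustrative, the ids are plain integers, and the stdlib queue.Queue stands in for whatever queue the real dispatcher uses.

```python
import queue
from typing import Optional

ACTION_BY_EMOJI = {"✅": "ack", "❌": "decline"}


def classify_reaction(author_id: int, bot_id: int, reactor_id: int,
                      operator_ids: frozenset, emoji: str) -> Optional[str]:
    """Apply the three gates in order; return "ack" / "decline", or None when dropped."""
    if author_id != bot_id:             # 1. message must be authored by the bot
        return None
    if reactor_id not in operator_ids:  # 2. reactor must be allow-listed
        return None
    return ACTION_BY_EMOJI.get(emoji)   # 3. emoji must be one of the two


def dispatch(q: queue.Queue, event: object) -> bool:
    """Enqueue without blocking; drop the event and return False when the queue is full."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        return False
```

Because the bot's own pre-seeded reactions come from a user id that is not in the operator list, they fall out at the second gate.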

7.4 Advisor-mode lifecycle (FAST_BRAIN_EXECUTION_MODE=discord_advisor)

sequenceDiagram
    participant FB as FastBrain
    participant Sink as DiscordAdvisorSink
    participant Bot as DiscordBot
    participant Op as Operator
    participant DB as Postgres
    FB->>Sink: Allow decision
    Sink->>Bot: TradeAdvisoryEvent amber embed prefixed ADVISORY
    Bot->>Bot: pre-seed check and x reactions
    Note over FB: pending_advisory_id slot held;<br/>further entries reject with AdvisoryPending
    alt Operator acks
        Op->>Bot: tap check
        Bot->>Sink: on_reaction ack advisory_id
        Sink->>DB: status acked and fill_price equals signal_price
        Sink->>FB: release slot and mark long or short at signal price
    else Operator declines
        Op->>Bot: tap x
        Bot->>Sink: on_reaction decline advisory_id
        Sink->>DB: status declined
        Sink->>FB: release slot
    else No ack before valid_until_utc
        Sink->>DB: status expired
        Sink->>FB: release slot
    end
    Note over FB: AdvisorPositionMonitor watches bars;<br/>when bar.high or bar.low crosses stop or target,<br/>posts a threaded TradeClosedEvent.
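The three branches of the lifecycle form a small state machine: a pending advisory ends up acked, declined, or expired. The sketch below is an assumption-laden illustration; the Advisory dataclass, the resolve function, and the rule that an ack arriving at or after valid_until_utc counts as expired are not taken from the repo.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Advisory:
    advisory_id: str
    signal_price: float
    valid_until_utc: datetime
    status: str = "pending"
    fill_price: Optional[float] = None


def resolve(adv: Advisory, event: Optional[str], now: datetime) -> Advisory:
    """Move a pending advisory to acked / declined / expired; terminal states are left alone."""
    if adv.status != "pending":
        return adv
    if now >= adv.valid_until_utc:
        adv.status = "expired"              # assumed: expiry wins over a late reaction
    elif event == "ack":
        adv.status = "acked"
        adv.fill_price = adv.signal_price   # per the diagram: fill_price equals signal_price
    elif event == "decline":
        adv.status = "declined"
    return adv
```

In every terminal branch the fast brain's pending slot is released, which is why the function never moves an advisory back to pending.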

7.5 Operator fallbacks

If Discord is down or you're on a locked-down workstation:

uv run python -m ops ack-fill --advisory-id <id> [--price 4321.25]
uv run python -m ops decline-advisory --advisory-id <id>

Same effect as the corresponding reaction. End of session, run:

uv run python -m ops eod-recon --csv path\to\tradeify-export.csv

Diffs the bot's ack log against the broker-side trade history. The command exits non-zero if any advisory has no broker counterpart.

7.6 Smoke tests

uv run python -m notifications.discord.smoke_test --dry-run   # CI-safe; renders to stdout
uv run python -m notifications.discord.smoke_test             # posts one of each card type

8. The Practice Bot

“Practice bot” is shorthand for scripts/practice_walkthrough.py — a CLI, not a daemon. It pulls 1-min OHLCV bars from Databento for the front-month contract and walks one or more theses bar-by-bar to estimate what the system would have done.

Conservative ceiling: This is not a backtest. A level counts as filled the moment any bar touches it; slippage, commissions, and the live entry-confirmation gating are intentionally not modeled. Treat the dollar numbers as an upper bound.

8.1 Three canonical recipes

Replay yesterday's accepted theses against real bars

uv run --no-sync python -m scripts.practice_walkthrough --date 2026-05-07 --root-symbol ES

Loads every decision='accepted' thesis whose created_at falls on the target UTC date and walks each one against real Databento bars.

10-business-day synthetic ORB-long simulation

uv run --no-sync python -m scripts.practice_walkthrough --simulate-days 10 --root-symbol ES

Generates one synthetic ES opening-range-breakout long thesis per US business day, ending at yesterday (or --end-date if you pass it). Useful for eyeballing the strategy across a fortnight without needing real LLM theses.

Pin to a window, override size, and skip the risk gate

uv run --no-sync python -m scripts.practice_walkthrough \
  --simulate-days 5 \
  --window 09:30-12:00 --tz et \
  --contracts 3 \
  --no-risk-gate

Use this to stress-test a particular intraday window, or to sanity-check a sizing change without touching the live config.

8.2 Outcome labels (column outcome in the report)

Label | Meaning
take_profit | Bar high/low crossed the TP level first
stop_loss | Bar high/low crossed the SL level first (same-bar tag = stop, conservative)
time_stop | Held past exit.time_stop_min without TP/SL
expired | Reached valid_until with no TP/SL/time-stop
no_entry | direction_bias='flat', so no entry would fire
outside_session | session=RTH but valid_from was outside RTH (use --ignore-session to override)
no_data | No bars in [valid_from, valid_until]
gate_reject | Risk gate rejected (envelopes, blackouts, sizing risk, etc.)
gate_flatten | Risk gate ordered flatten-all (e.g. dd-headroom breach)
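The first four labels come from a bar-by-bar walk, which can be sketched for a long position as below. This is a simplified illustration, not the code in scripts/practice_walkthrough.py: the Bar tuple and walk_long are hypothetical, entry is assumed to have happened before the first bar, and the time stop is assumed to trigger once minutes held reaches time_stop_min.

```python
from typing import NamedTuple


class Bar(NamedTuple):
    high: float
    low: float


def walk_long(bars: list, take_profit: float, stop_loss: float, time_stop_min: int) -> str:
    """Label a long position walked over 1-minute bars that start at entry."""
    if not bars:
        return "no_data"
    for minutes_held, bar in enumerate(bars, start=1):
        if bar.low <= stop_loss:        # stop checked first: a same-bar tag counts as a stop
            return "stop_loss"
        if bar.high >= take_profit:
            return "take_profit"
        if minutes_held >= time_stop_min:
            return "time_stop"
    return "expired"                    # ran out of bars inside the validity window
```

Checking the stop before the target is what makes the same-bar case conservative: a bar that touches both levels is scored as a loss.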

8.3 Useful flags

--profile profiles\mffu_pro_50k.yaml
Pick the prop-firm profile the gate uses (the path shown is the default).
--no-risk-gate
Skip the gate entirely; raw fill simulation.
--atr-5m 10
ATR(5m) the gate uses for stop-envelope and sizing-risk math.
--raw-symbol ESM6
Override the front-month resolver (skips Databento metadata lookup).
--ignore-session
Walk the validity window even when valid_from is outside RTH.
--window 09:30-12:00 --tz et
Pin the synthetic thesis (or filter loaded theses) to an intraday window.
--end-date 2026-05-07
End date (inclusive) for --simulate-days.

Run with no arguments to see the full argparse help.

9. Daily ops runbook

Every operator command is a sub-command of the ops CLI. There is no web UI on purpose — this is the only control surface.

9.1 Status & health

uv run python -m ops status              # KILL/PAUSE flags, dd_headroom, open positions, broker conn
uv run python -m ops slow-brain-status    # most recent slow-brain cycle
uv run python -m ops smoke                # ping postgres / redis / tradovate / anthropic / databento
                                          # (TopstepX broker not yet in the smoke set)

9.2 Pause & resume

uv run python -m ops pause   # PAUSE=true: no new entries, manage existing exits
uv run python -m ops resume  # clears PAUSE

9.3 Flatten & kill (3-step confirm)

uv run python -m ops flat   # flattens all open positions; KILL stays off
uv run python -m ops kill   # flattens AND sets KILL=true (refuses new orders until cleared)

Both commands force a three-step confirmation:

  1. Type the literal word FLATTEN.
  2. Confirm the printed open-position list is correct.
  3. Confirm again before any broker call goes out.

If Redis is unreachable the command refuses to claim flatten succeeded and exits non-zero.
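A three-step confirmation of this kind can be sketched with an injectable prompt function so it stays testable. The prompt wording, the y/N convention for steps 2 and 3, and the name confirm_flatten are assumptions; only the literal word FLATTEN and the order of the steps come from the list above.

```python
from typing import Callable


def confirm_flatten(open_positions: list, ask: Callable[[str], str] = input) -> bool:
    """Return True only if all three confirmations pass, in order."""
    # Step 1: the literal word, case-sensitive.
    if ask("Type FLATTEN to continue: ") != "FLATTEN":
        return False
    # Step 2: operator checks the printed open-position list.
    listing = ", ".join(open_positions) or "(none)"
    if ask(f"Open positions: {listing}. Correct? [y/N] ").strip().lower() != "y":
        return False
    # Step 3: final confirmation before any broker call would go out.
    return ask("Send flatten to broker now? [y/N] ").strip().lower() == "y"
```

Any answer other than the expected one aborts, so an accidental Enter at any step leaves positions untouched.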

9.4 Rollback to a prior params_version

uv run python -m ops rollback --version 42

Prints current and target params, asks once for confirmation, then republishes the target body to params:active and writes a row to the params history stream.

9.5 Advisor acks (only meaningful in discord_advisor mode)

uv run python -m ops ack-fill --advisory-id <id> [--price 4321.25]
uv run python -m ops decline-advisory --advisory-id <id>
uv run python -m ops eod-recon --csv path\to\tradeify-export.csv

eod-recon exits non-zero on any unmatched advisory or unmatched broker row. That non-zero exit is the EOD pass/fail signal during the advisor-mode 30-day evaluation.

9.6 Quick reference card

Command | Effect | Reversible?
ops pause | Stops new entries | yes (resume)
ops resume | Clears PAUSE | yes
ops flat | Flattens open positions | no (positions are closed)
ops kill | Flattens + refuses new orders | requires manual KILL clear
ops rollback --version N | Republishes a prior params_version | yes (republish another)
ops ack-fill | Marks advisory as filled | no (row is updated)
ops decline-advisory | Marks advisory as declined | no

10. Architecture deep-dive

This section is the “why” behind the runbook. Read it once when you have time; refer back when you're debugging or considering a change.

10.1 Full system graph

Every component the operator touches, end to end. Read it as four layers stacked top to bottom: external feeds (news + bars + calendar) drop into the slow brain (an LLM swarm), which writes through validation into the state layer (Postgres for durable history, Redis for hot working state), which the fast brain consumes on every bar and pushes through the risk gate into one of four execution sinks. Operator surface (Discord, ops CLI, Grafana) and observability (Prometheus, Loki, watchdog) sit alongside the hot path; they never inject orders, only observe and gate them.

flowchart TB
    %% External feeds
    subgraph ext [External feeds]
        direction TB
        Polygon["Polygon News API"]
        DBLive["Databento Live<br/>GLBX.MDP3 OHLCV-1m"]
        DBHist["Databento Historical<br/>backfill to Parquet"]
        CMECal["CME Holiday Calendar<br/>(slow_brain_loop hardcoded)"]
    end

    %% Slow brain (LangGraph swarm)
    subgraph sb [Slow brain - LangGraph swarm]
        direction TB
        SBLoop["slow_brain_loop.ps1<br/>06:30 PT plus top-of-hour, M-F"]
        News["news_analyst<br/>Haiku 4.5 + Polygon"]
        Tech["tech_analyst<br/>Haiku 4.5"]
        Composer["composer<br/>Sonnet 4.6 temp=0"]
        Cross["cross_check<br/>OpenAI gpt-5.x cross-family"]
        Reviewer["risk_reviewer<br/>Sonnet 4.6 temp=0"]
        SBLoop --> News
        SBLoop --> Tech
        News --> Composer
        Tech --> Composer
        Composer --> Cross
        Cross --> Reviewer
    end
    Polygon --> News
    DBHist --> Tech
    DBLive -. "recent bars" .-> Tech
    CMECal --> SBLoop

    %% Validation
    subgraph val [Validation pipeline]
        direction TB
        Schema["Pydantic schema<br/>frozen + extra=forbid"]
        Envelope["Sanity envelopes<br/>max_contracts, stop_range, valid_until"]
        Diff["Diff guard<br/>>50% sizing or direction flip needs co-sign"]
        Schema --> Envelope --> Diff
    end
    Reviewer --> Schema

    %% State layer
    subgraph state [State layer]
        direction LR
        PG[("Postgres<br/>theses, params_versions,<br/>advisories, fills, orders")]
        Rds[("Redis<br/>params:active, KILL/PAUSE,<br/>heartbeats, data:bars:ES")]
    end
    Diff -->|"INSERT theses + params_versions"| PG
    Diff -->|"SET params:active + XADD history"| Rds
    Diff -. "reject: keep last-known-good" .-> Rds
    DBLive -->|bars stream| Rds

    %% Fast brain (deterministic hot path)
    subgraph fb [Fast brain - deterministic hot path]
        direction TB
        Hydrator["ParameterHydrator<br/>refreshes from Redis"]
        BarLoop["consume_bars<br/>iter_live_bars from Redis"]
        Strategy["RiskManagedStrategy.on_signal<br/>features lagged one bar"]
        Gate{"fast_brain.risk.gate.evaluate<br/>news / envelopes / caps /<br/>DDL / HFT / diff"}
        SinkDisp{Sink dispatch<br/>FAST_BRAIN_EXECUTION_MODE}
        BarLoop --> Strategy
        Hydrator --> Strategy
        Strategy --> Gate
        Gate -->|Allow + OrderIntent| SinkDisp
        Gate -->|RejectWithReason| Strategy
        Gate -->|FlattenAll| Strategy
    end
    Rds -->|params:active| Hydrator
    Rds -->|data:bars:ES| BarLoop

    %% Execution sinks
    subgraph sinks [Execution sinks - one is selected at boot]
        direction LR
        Paper["PaperSink<br/>log only"]
        DiscordSink["DiscordAdvisorSink<br/>posts BUY/SELL card +<br/>writes advisories row"]
        TVSink["TradovateSink"]
        TXSink["TopstepxSink"]
    end
    SinkDisp -->|paper| Paper
    SinkDisp -->|discord_advisor| DiscordSink
    SinkDisp -->|tradovate_auto| TVSink
    SinkDisp -->|topstepx_auto| TXSink

    %% Brokers
    subgraph brokers [Brokers]
        direction TB
        TVAdapter["TradovateExecutionAdapter<br/>+ TradovateSymbology"]
        TVRest["Tradovate REST<br/>/order/placeOSO<br/>isAutomated=true"]
        TVWS["Tradovate WebSocket<br/>fills, positions"]
        TVRecon["TradovateReconciler 60s"]
        TXAdapter["TopstepxExecutionAdapter<br/>+ TopstepxSymbology<br/>(dynamic front-month)"]
        TXRest["TopstepX REST<br/>/api/Order/place<br/>customTag=auto=true,<br/>tick brackets"]
        TXRecon["TopstepxReconciler 60s<br/>(SignalR realtime deferred)"]
    end
    TVSink --> TVAdapter --> TVRest
    TXSink --> TXAdapter --> TXRest
    TVRest <--> TVAPI[("api.tradovate.com<br/>demo gateway")]
    TVWS <--> TVAPI
    TXRest <--> TXAPI[("api.topstepx.com<br/>ProjectX gateway")]
    TVRecon --> TVRest
    TXRecon --> TXRest
    TVRecon -->|drift alert| Strategy
    TXRecon -->|drift alert| Strategy
    TVWS -->|fills| Strategy

    %% Operator surface
    subgraph op [Operator surface]
        direction TB
        Disc["Discord bot<br/>#sbfb-trades / #sbfb-errors /<br/>#sbfb-system / #sbfb-debriefs"]
        OpsCLI["ops CLI<br/>status / pause / resume /<br/>flat / kill / rollback /<br/>ack-fill / eod-recon / smoke"]
        Grafana["Grafana<br/>dashboards + alerts"]
    end
    DiscordSink --> Disc
    Strategy -. "lifecycle + errors" .-> Disc
    Disc -->|"reactions ack/decline"| OpsCLI
    OpsCLI -->|KILL / PAUSE / rollback| Rds
    OpsCLI -->|REST flatten| TVRest
    OpsCLI -->|reads heartbeat + open positions| Rds

    %% Observability + watchdog
    subgraph obs [Observability + safety net]
        direction LR
        Prom["Prometheus<br/>/metrics scrape"]
        Loki["Loki + structlog<br/>thesis_id + params_version<br/>on every line"]
        WD["watchdog<br/>EOD recon + heartbeat probe"]
    end
    Strategy --> Prom
    Hydrator --> Prom
    TVRecon --> Prom
    TXRecon --> Prom
    Strategy --> Loki
    Hydrator --> Loki
    Rds -->|heartbeat| WD
    WD -->|fail-safe REST flatten if heartbeat dies| TVRest
    Prom --> Grafana
    Loki --> Grafana

What the diagram does not show, on purpose: there is no path from the slow brain to the broker, and no path from the fast brain to any LLM. That asymmetry is enforced by code (see §10.6 and the architecture invariants in .cursor/rules/00-architecture.mdc). Every other arrow above is implemented in the files mapped at §12.3.

10.2 End-to-end thesis sequence

sequenceDiagram
    participant Loop as SlowBrainLoop
    participant SB as SlowBrainRun
    participant LLM as AnthropicAndOpenAI
    participant Val as Validation
    participant PG as Postgres
    participant Rds as Redis
    participant FB as FastBrainRun
    participant Gate as RiskGate
    participant Sink as ExecSink
    Loop->>SB: invoke with as_of UTC iso and live-llm flag
    SB->>LLM: news plus tech plus composer plus cross_check plus risk_reviewer
    LLM-->>SB: Thesis and Parameters as Pydantic structured output
    SB->>Val: validate Thesis Parameters and last_known_good
    alt schema or envelope fail
        Val-->>SB: reject and increment counters and exit
    else accepted
        Val->>PG: INSERT theses and params_versions
        Val->>Rds: SET params active and XADD history
    end
    Note over FB: subscribes to params active on boot
    FB->>FB: bar event triggers signal eval
    FB->>Gate: evaluate parameters account_state profile
    alt gate Allow
        Gate-->>FB: Allow
        FB->>Sink: route order via paper or discord_advisor or tradovate_auto or topstepx_auto
    else gate Reject or FlattenAll
        Gate-->>FB: reject or flatten
    end

10.3 Contract layer (condensed)

All boundary types are Pydantic v2 models with frozen=True and extra="forbid". The full definitions live in contracts/parameters.py per FSD §2.3.

from datetime import datetime
from typing import Literal

from pydantic import BaseModel, ConfigDict, model_validator

# Direction, Regime, EntryRule, ExitRule, PositionSizing and RiskBlackouts are
# part of the full definitions in contracts/parameters.py (omitted here).

class Parameters(BaseModel):
    model_config = ConfigDict(frozen=True, extra="forbid")
    instrument: Literal["ES", "MES", "NQ", "MNQ", "CL", "GC"]
    session: Literal["RTH", "ETH"] = "RTH"
    direction_bias: Direction          # "long" | "short" | "flat"
    regime: Regime                     # "trend_up" | "trend_down" | "mean_revert" | "chop"
    entry: EntryRule                   # type, ref_window_min, confirm_atr_mult, side
    exit: ExitRule                     # take_profit, stop_loss, time_stop_min, invalidation_text
    sizing: PositionSizing             # method, value, max_contracts (capped per profile)
    blackouts: RiskBlackouts           # pre/post_event_min, flat_by_local_time
    valid_from: datetime
    valid_until: datetime              # must be > valid_from; envelope caps at +6h
    max_concurrent_positions: int      # 0..5

    @model_validator(mode="after")
    def _consistency(self) -> "Parameters":
        # Cross-field checks that run after per-field validation.
        if self.entry.side != self.direction_bias and self.direction_bias != "flat":
            raise ValueError("entry.side must match direction_bias unless flat")
        if self.valid_until <= self.valid_from:
            raise ValueError("valid_until must be > valid_from")
        return self

10.4 Validation pipeline

  1. Schema validation via Pydantic. Any failure → reject, increment schema_fail_counter, fast brain keeps running last-known-good.
  2. Sanity envelopes (hard-coded, not LLM-controlled):
    • sizing.max_contracts ≤ floor(account_dd_buffer / (atr_5m * tick_value)) — caps risk to ≤ 25% of remaining drawdown.
    • exit.stop_loss.value within [0.25 * ATR, 3 * ATR].
    • valid_until - valid_from ≤ 6 hours.
  3. Diff guard: if the new Parameters differ from the previous active by > 50% on sizing.value or flips direction, require risk_reviewer_cosign = true.
  4. Cross-check enforcement: composer's direction_bias must match the OpenAI cross-check; otherwise the published params are coerced to flat.
  5. Versioning: every accepted Parameters object is stored immutably in params_versions with a monotonically increasing version_id. Active version is published to Redis params:active. Rollback ≡ republish a prior version.
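
The envelope and diff-guard steps above can be sketched in a few lines. This is a minimal illustration, not the code in slow_brain/validation.py: the Candidate dataclass, the function names, and the choice to treat any change of direction (including to or from flat) as a "flip" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from math import floor


@dataclass(frozen=True)
class Candidate:
    """Hypothetical flat view of the fields the envelopes inspect."""
    direction: str          # "long" | "short" | "flat"
    sizing_value: Decimal
    max_contracts: int
    stop_loss: Decimal      # stop distance, same units as ATR
    valid_from: datetime
    valid_until: datetime


def envelope_violations(c, atr_5m, tick_value, dd_buffer):
    """Return the names of violated envelopes (empty list means accepted)."""
    problems = []
    cap = floor(dd_buffer / (atr_5m * tick_value))
    if c.max_contracts > cap:
        problems.append("max_contracts")
    if not (Decimal("0.25") * atr_5m <= c.stop_loss <= Decimal("3") * atr_5m):
        problems.append("stop_loss")
    if c.valid_until - c.valid_from > timedelta(hours=6):
        problems.append("validity_window")
    return problems


def needs_cosign(prev, new):
    """Diff guard: a direction change or a > 50% sizing change needs a co-sign."""
    if prev.direction != new.direction:
        return True
    if prev.sizing_value == 0:
        return new.sizing_value != 0
    change = abs(new.sizing_value - prev.sizing_value) / prev.sizing_value
    return change > Decimal("0.5")


t0 = datetime(2026, 5, 6, 13, 30, tzinfo=timezone.utc)
prev = Candidate("long", Decimal("1"), 2, Decimal("4"), t0, t0 + timedelta(hours=2))
new = Candidate("long", Decimal("2"), 2, Decimal("4"), t0, t0 + timedelta(hours=8))

print(envelope_violations(new, Decimal("5"), Decimal("12.5"), Decimal("1500")))  # ['validity_window']
print(needs_cosign(prev, new))  # True
```

Returning a list of violated envelopes rather than a bare boolean keeps the reject reason available for counters and alerts.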

10.5 Prop-firm risk gate

Hard-coded precedence: firm rules > strategy params > LLM bias. Each prop firm is one YAML profile under profiles/. The gate loads the profile selected by FAST_BRAIN_PROFILE_PATH at boot.

# profiles/mffu_pro_50k.yaml (excerpt)
firm: MyFundedFutures
plan: Pro
account_size: 50000
drawdown_type: EOD_trailing
drawdown_amount: 1500
intraday_dll: null            # MFFU has no DLL on Pro; we self-impose 750
self_imposed_dll: 750
contract_caps:
  ES: 5
  MES: 50
news_rule:
  tier1_events: [FOMC, CPI, NFP]
  flat_window_min: 2
flat_by: "16:09 America/New_York"
auto_liq_at: "16:10 America/New_York"
consistency_rule:
  evaluation_pct: 50
  funded_pct: null
allow_overnight: false
allow_full_automation: true
source_pdf: docs/profiles/raw/mffu/2025-07-23.pdf
verified_on: 2026-05-06

The gate rejects any Parameters whose worst-case 1R loss × max_contracts exceeds min(self_imposed_dll, 0.5 * remaining_dd_buffer). It also enforces the per-firm rules for news blackouts, flat-by time, and contract caps.
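
As a worked illustration of the loss-limit rule, here is a small sketch. The function name gate_allows, its argument list, and the reason strings are hypothetical and do not mirror the real gate in fast_brain/. The numbers use the profile excerpt above (self_imposed_dll 750, drawdown 1500, ES cap 5) and the ES point value of 50 USD.

```python
from decimal import Decimal


def gate_allows(stop_distance_points, point_value, max_contracts,
                self_imposed_dll, remaining_dd_buffer, contract_cap):
    """Return (allowed, reason). Money values are Decimal, never float."""
    if max_contracts > contract_cap:
        return False, "contract_cap"
    # Worst case: every contract is stopped out at the full 1R distance.
    worst_case = stop_distance_points * point_value * max_contracts
    limit = min(self_imposed_dll, Decimal("0.5") * remaining_dd_buffer)
    if worst_case > limit:
        return False, "loss_limit"
    return True, "ok"


args = dict(stop_distance_points=Decimal("4"), point_value=Decimal("50"),
            self_imposed_dll=Decimal("750"), remaining_dd_buffer=Decimal("1500"),
            contract_cap=5)

print(gate_allows(max_contracts=3, **args))  # (True, 'ok')            600 <= 750
print(gate_allows(max_contracts=4, **args))  # (False, 'loss_limit')   800 > 750
print(gate_allows(max_contracts=6, **args))  # (False, 'contract_cap')
```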

10.6 Determinism rules

  • Fast brain is deterministic given inputs.
  • Use the engine's clock (self.clock.utc_now()) in strategy logic; never datetime.now().
  • Seed numpy and random in tests.
  • Composer and risk-reviewer LLM calls always set temperature=0. This minimizes entropy but does not make the output deterministic, so never rely on it.
  • No time.sleep in strategy event handlers.
  • Catch specific exceptions; never bare except:; reraise unknowns.
  • Decimal or tick-snapped int for prices — never raw float for money.
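
To make the last rule concrete, the sketch below snaps a price to the instrument's tick grid using Decimal. The helper name snap_to_tick and the half-up rounding mode are assumptions for illustration; the repo may round differently. The tick size of 0.25 is the ES tick.

```python
from decimal import Decimal, ROUND_HALF_UP


def snap_to_tick(price: Decimal, tick_size: Decimal) -> Decimal:
    """Round a price to the nearest multiple of tick_size."""
    ticks = (price / tick_size).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return ticks * tick_size


print(snap_to_tick(Decimal("5301.13"), Decimal("0.25")))  # 5301.25
print(0.1 + 0.2 == 0.3)                                   # False: why raw float is banned
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```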

11. Failure modes

Two ironclad invariants: (1) On any uncertainty (schema fail, stale params past valid_until, broker disconnect, data gap > 5 s, reconciliation drift), degrade to last-known-good and stop opening new positions. Existing positions are managed by their broker-side stops. (2) Never invent recovery. If the spec doesn't say what to do, halt and alert.

| Failure | Detection | Recovery |
|---|---|---|
| Slow-brain LLM call fails or times out | LangGraph node exception | Skip cycle; fast brain continues on last-known-good. After 3 consecutive failures, alert operator. |
| LLM produces schema-invalid output | Pydantic ValidationError | Reject; record failure; use last-known-good. Auto-retry once with a stricter prompt. |
| LLM produces “garbage but valid” (e.g. direction=long when news clearly bearish) | Risk-reviewer veto + diff guard (> 50% sizing change) | Reject without a co-sign from a second model. |
| Broker WS disconnect (Tradovate) | Heartbeat missed > 5 s | Auto-reconnect with exponential backoff; if positions open, attempt REST flatten as fallback. |
| TopstepX gateway error or 401 storm | TopstepxApiError / repeated 401 after one re-auth | Phase 1 has no SignalR stream, so the fallback is the 60 s REST reconciler in brokers.topstepx.reconciliation; persistent failure surfaces a drift alert and (per the architecture failure-mode rule) the runtime degrades to last-known-good and refuses new entries. |
| Data feed gap | Bar timestamp > 90 s old | Flag stale; pause new entries; existing stops still work because they live broker-side. |
| Broker rejects order | Reject reason in WS | Log + alert; do not retry blindly, since most rejects are risk-rule violations. |
| Postgres down | Health check | Switch to write-ahead-log to disk; refuse new params (read-only mode). |
| LLM provider outage | Provider error or timeout | Auto-failover to secondary provider; if both down, last-known-good with shortened valid_until. |
| Cross-check disagreement on direction_bias | cross_check node return | Coerce published Parameters to direction_bias=flat. |
| DD > 70% of buffer or schema-fail rate > 5%/hr | Alert rule | Auto KILL switch: refuses new entries; existing exits managed via broker stops; operator must clear KILL. |
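
Several rows above reduce to the same decision: "may the fast brain open a new position right now?" The sketch below collects those checks in one place. It is an assumption-laden illustration, not the repo's runtime: the function name, argument names, and reason strings are hypothetical, and only the thresholds (90 s bar age, 70% drawdown) come from the table.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal


def entry_block_reasons(now, valid_until, last_bar_ts, dd_used_fraction, broker_connected):
    """Return the reasons new entries must be refused (empty list means allowed)."""
    reasons = []
    if now > valid_until:
        reasons.append("stale_params")
    if now - last_bar_ts > timedelta(seconds=90):
        reasons.append("data_gap")
    if not broker_connected:
        reasons.append("broker_disconnect")
    if dd_used_fraction > Decimal("0.70"):
        reasons.append("kill_switch_dd")
    return reasons


now = datetime(2026, 5, 6, 14, 0, tzinfo=timezone.utc)
print(entry_block_reasons(
    now=now,
    valid_until=now + timedelta(hours=1),
    last_bar_ts=now - timedelta(seconds=120),   # bar is 120 s old
    dd_used_fraction=Decimal("0.40"),
    broker_connected=True,
))  # ['data_gap']
```

Note that every branch only blocks new entries. None of them touches open positions, which stay under their broker-side stops.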

12. Reference

12.1 .env cheatsheet

Variable names only — never copy real values into this page or any other doc. Sensitive values (Anthropic key, Discord bot token, Tradovate creds, Tradovate token-encryption key, TopstepX API key, TopstepX token-encryption key, OpenAI key) live only in .env, which is gitignored.

Local infrastructure

DATABASE_URL
Postgres connection string. Default: postgresql+asyncpg://sbfb:sbfb_local_only@localhost:5432/sbfb.
REDIS_URL
Redis connection string. Default: redis://localhost:6379/0.

Data feeds

DATABENTO_API_KEY
Databento futures key.
DATA_MODE
live or backtest.

Slow brain (LLMs)

ANTHROPIC_API_KEY
Claude key.
ANTHROPIC_HAIKU_MODEL
Default claude-haiku-4-5.
ANTHROPIC_SONNET_MODEL
Default claude-sonnet-4-6.
OPENAI_API_KEY
OpenAI key (cross-check only).
OPENAI_GPT_MODEL
Cross-family check model. Default gpt-5.4-mini; must be gpt-5.x.
LLM_DAILY_BUDGET_USD
Hard daily cap; swarm refuses to fire over budget.

Discord bot

DISCORD_BOT_TOKEN
Bot token; held as SecretStr; never logged.
DISCORD_GUILD_ID
Server id.
DISCORD_CHANNEL_TRADES
Channel id for trade open/close cards.
DISCORD_CHANNEL_ERRORS
Channel id for redacted errors.
DISCORD_CHANNEL_SYSTEM
Channel id for system health.
DISCORD_CHANNEL_DEBRIEFS
Optional: end-of-day debriefs.
DISCORD_QUEUE_MAX_SIZE
Default 1000.
DISCORD_DEDUPE_WINDOW_SECONDS
Default 300 (5 min).
DISCORD_ROLE_CRITICAL
Optional: role to ping on CRITICAL alerts.
DISCORD_READY_TIMEOUT_SECONDS
Default 30.
DISCORD_REACTION_ACK_ENABLED
Default true; set false in CI.
DISCORD_OPERATOR_USER_IDS
Comma-separated allow-list for ✅/❌ acks.

Tradovate broker

TRADOVATE_ENV
Must be demo; live is hard-refused until PSD Phase 2.
TRADOVATE_USERNAME
Demo username.
TRADOVATE_PASSWORD
Demo password.
TRADOVATE_APP_ID
App id (crispytrader).
TRADOVATE_APP_VERSION
App version.
TRADOVATE_CID
OAuth client id.
TRADOVATE_SEC
OAuth client secret.
TRADOVATE_ACCOUNT_ID
Account id used by ops flat and the OMS.
TRADOVATE_TOKEN_ENCRYPTION_KEY
Symmetric key for token-store-at-rest.
TRADOVATE_TOKEN_STORE_PATH
Default .secrets/tradovate_tokens.enc.
RUN_TRADOVATE_DEMO_E2E
Set to 1 to opt into the live-demo E2E test (off by default).

TopstepX broker (ProjectX API)

TOPSTEPX_ENV
Must be demo; live is hard-refused until PSD Phase 2.
TOPSTEPX_USERNAME
TopstepX account username.
TOPSTEPX_API_KEY
TopstepX API key (generate from your TopstepX account settings).
TOPSTEPX_ACCOUNT_ID
Optional. Leave blank to auto-discover the first canTrade=true account via POST /api/Account/search at startup.
TOPSTEPX_TOKEN_ENCRYPTION_KEY
Fernet key for the on-disk JWT cache (generate via python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())").
TOPSTEPX_TOKEN_STORE_PATH
Default .secrets/topstepx_tokens.enc.
TOPSTEPX_BASE_URL
Default https://api.topstepx.com; override only for sandbox testing.
TOPSTEPX_RTC_URL
Default https://rtc.topstepx.com; reserved for the deferred SignalR client.

Fast brain

FAST_BRAIN_PROFILE_PATH
Risk-gate profile YAML. Default profiles/mffu_pro_50k.yaml.
FAST_BRAIN_EXECUTION_MODE
paper | discord_advisor | tradovate_auto | topstepx_auto.
FAST_BRAIN_ENABLE_TRADING
Legacy boolean; ignored when FAST_BRAIN_EXECUTION_MODE is set explicitly. Keep false in paper mode.
FAST_BRAIN_START_DATABENTO_LIVE
Set false if your Databento plan does not include live GLBX.MDP3.
FAST_BRAIN_DATA_STREAM
Default data:bars:ES.
FAST_BRAIN_HEARTBEAT_SECONDS
Default 5.

12.2 Local ports & URLs

| Service | Port | URL / notes |
|---|---|---|
| Postgres | 5432 | postgresql://sbfb:sbfb_local_only@localhost:5432/sbfb |
| Redis | 6379 | redis://localhost:6379/0 |
| Prometheus | 9090 | http://localhost:9090 |
| Loki | 3100 | internal; queried via Grafana |
| Grafana | 3000 | http://localhost:3000 (admin / admin) |

12.3 File map

contracts/
Pydantic v2 boundary types (Thesis, Parameters, sub-rules).
slow_brain/
LangGraph swarm: nodes/, prompts/, validation.py, run.py, debrief.py, status.py.
fast_brain/
NautilusTrader strategy, signal engine, execution sinks, risk gate.
brokers/
Per-broker integrations and the shared _token_store.py Fernet helper. brokers/tradovate/ = REST + WS + reconciliation; brokers/topstepx/ = ProjectX REST auth + dynamic symbology + bracket adapter + REST reconciler (SignalR realtime client deferred — see brokers/topstepx/realtime.py).
data/
Databento ingestion (historical.py, symbology.py) + DB engine.
profiles/
Prop-firm YAML profiles + loader + schema.
notifications/discord/
The journal bot: dispatcher, redaction, embeds, reactions, smoke test.
ops/
Operator CLI (ops/cli.py), redis flag helpers, watchdog.
scripts/
launch.ps1, slow_brain_loop.ps1, practice_walkthrough.py, post_discord_error.py, .bat wrappers.
obs/
Prometheus + Loki + Grafana provisioning + dashboards.
migrations/
Alembic migrations.
tests/
pytest, hypothesis, VCR-recorded LLM regressions.
.logs/
Local log captures from the launcher and slow-brain loop.
.secrets/
Encrypted token store(s); gitignored.
docker-compose.yml
Postgres, Redis, Prometheus, Loki, Grafana, watchdog, fast-brain.
Makefile
up, migrate, test, smoke, fast-brain, slow-brain, backfill, backtest-orb, rerecord-prompts.

12.4 Authoritative specs

  • docs/MVP.md — scope, success criteria, build cost, tech-stack rationale.
  • docs/FSD.md — functional spec: components, contract layer, risk layer, broker integration, failure modes.
  • docs/PSD.md — product/project spec: vision, NFRs, prop-firm landscape, milestones, risks, success metrics.
  • .cursor/rules/00-architecture.mdc — the hard invariants (two-layer rule, determinism, failure mode, stack lock, style).

13. Troubleshooting

Step 'Docker compose up' failed in scripts\launch.ps1

The Docker daemon isn't ready. The launcher tries to start Docker Desktop and waits up to 180 s; if that fails, open Docker Desktop manually, wait until the whale icon is steady, then re-run scripts\launch.bat.

422 dataset_unavailable_range at fast-brain boot

Your Databento plan does not include live GLBX.MDP3. Set FAST_BRAIN_START_DATABENTO_LIVE=false in .env and restart fast-brain. Backfill historical data with make backfill and use discord_advisor or paper mode until you upgrade.

TRADOVATE_ENV=live is hard-refused

Live broker mode is intentionally blocked by code until PSD Phase 2. Keep TRADOVATE_ENV=demo for the entire paper + sim period; flipping to live will raise on import.

TOPSTEPX_ENV=live is hard-refused

Same gate as Tradovate — brokers.topstepx.auth.TopstepxSettings raises LiveBrokerNotPermittedError at construction when env=live, blocking the runtime before any order can ship. Keep TOPSTEPX_ENV=demo until PSD Phase 2 is signed off.
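
The shape of such a guard can be sketched as follows. This is not the code in brokers.topstepx.auth: the exception class here is a local stand-in that reuses the name, and require_demo_env is a hypothetical helper. Treating an unset variable as a refusal is also an assumption.

```python
class LiveBrokerNotPermittedError(RuntimeError):
    """Local stand-in for the repo's exception of the same name."""


def require_demo_env(var_name: str, env: dict) -> str:
    """Return "demo" or raise; anything else (including unset) is refused."""
    value = env.get(var_name, "").strip().lower()
    if value != "demo":
        raise LiveBrokerNotPermittedError(
            f"{var_name} must be 'demo' until PSD Phase 2, got {value!r}"
        )
    return value


print(require_demo_env("TOPSTEPX_ENV", {"TOPSTEPX_ENV": "demo"}))  # demo

try:
    require_demo_env("TRADOVATE_ENV", {"TRADOVATE_ENV": "live"})
except LiveBrokerNotPermittedError as exc:
    print(f"refused: {exc}")
```

Raising at construction time, rather than at first order, means a misconfigured environment stops the runtime before any broker call is attempted.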

TopstepxOrderError: TOPSTEPX_ACCOUNT_ID is unset at first order

Auto-discovery couldn't pin a tradable account. Either:

  1. Set TOPSTEPX_ACCOUNT_ID explicitly in .env if you know which account to use, or
  2. Confirm the API user has at least one account with canTrade=true — evaluation accounts in a flagged or read-only state will be skipped and the runtime halts rather than guess.

The bootstrap call lives in FastBrainRuntime._bootstrap_topstepx_account and runs once at run_forever startup.

ContractNotFoundError from TopstepX symbology

POST /api/Contract/search returned no rows for the configured instrument root (ES, MES, NQ, MNQ, etc.). Confirm the operator account actually has data entitlement for that contract; per the fast-brain rule, the runtime refuses to silently fall back to a stale or wrong contract id.

TopstepX bracket distance “rounded to zero ticks”

The Decimal stop or take-profit price was within one tick of the latest bar close used as reference_price. Widen the strategy's brackets in contracts/parameters.py or check that the slow brain isn't emitting degenerate exits. The check lives in brokers.topstepx.adapter.compute_bracket_ticks; it fails at the boundary so a bad parameter never reaches the wire.
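
For intuition, a check of this kind can be sketched as below. It is a hypothetical re-implementation for illustration only, not compute_bracket_ticks itself; the function name, signature, and half-up rounding are assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def bracket_ticks_sketch(reference_price: Decimal, bracket_price: Decimal,
                         tick_size: Decimal) -> int:
    """Convert a bracket price into a whole-tick distance, refusing zero."""
    distance = abs(bracket_price - reference_price)
    ticks = int((distance / tick_size).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    if ticks == 0:
        raise ValueError("bracket distance rounded to zero ticks")
    return ticks


ref = Decimal("5300.00")
tick = Decimal("0.25")

print(bracket_ticks_sketch(ref, Decimal("5296.00"), tick))  # 16

try:
    bracket_ticks_sketch(ref, Decimal("5300.10"), tick)     # 0.4 of a tick
except ValueError as exc:
    print(f"rejected: {exc}")
```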

Discord errors aren't being posted

Check, in order:

  1. DISCORD_BOT_TOKEN is set and the bot is invited to the server.
  2. The bot has Send Messages, Embed Links, Add Reactions, and Read Message History in each channel.
  3. The channel ids in .env match the actual channels (right-click → “Copy Channel ID” with developer mode on).
  4. Run uv run python -m notifications.discord.smoke_test — it posts one of each card type, so any failure isolates the broken channel.

The slow-brain loop posted to #sbfb-errors

Tail the cycle log:

Get-Content .logs\slow_brain_loop-*.log -Tail 200 | more

The most common causes are an unset OPENAI_API_KEY (cross-check fails), a transient Anthropic 5xx (retry next slot), or a Postgres hiccup (restart the postgres service). The loop keeps running; one failed slot is non-fatal.

Reaction-ack not registering

  1. Confirm DISCORD_REACTION_ACK_ENABLED=true.
  2. Confirm your Discord user id is in DISCORD_OPERATOR_USER_IDS (comma-separated).
  3. Make sure you're reacting to a message authored by the bot — reactions on other messages are dropped and logged as discord_reaction_unauthorised.
  4. Fall back to uv run python -m ops ack-fill --advisory-id <id>.

ops flat / ops kill says “Redis unreachable”

The CLI refuses to claim flatten succeeded if Redis is down. Bring Redis back up first (docker compose up -d redis), confirm with uv run python -m ops smoke, then re-run the flatten.

ops eod-recon exits non-zero

That's the alarm. Either an advisory has no broker counterpart (the bot asked you to trade it but you didn't) or there's a broker fill with no advisory (you traded outside the bot). Either one is a reason not to advance to the next phase. Investigate the printed mismatches before continuing.
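
The two mismatch classes amount to a pair of set differences. The sketch below shows the idea only; it assumes each broker fill can be tagged with the advisory id it answers, and the function name and dictionary keys are hypothetical rather than taken from ops/cli.py.

```python
def reconcile(advisory_ids, fill_advisory_ids):
    """Compare the day's advisories against broker fills, keyed by advisory id."""
    advisories = set(advisory_ids)
    fills = set(fill_advisory_ids)
    return {
        "advisory_without_fill": sorted(advisories - fills),
        "fill_without_advisory": sorted(fills - advisories),
    }


result = reconcile(["adv-1", "adv-2", "adv-3"], ["adv-1", "adv-3", "manual-9"])
print(result)
# {'advisory_without_fill': ['adv-2'], 'fill_without_advisory': ['manual-9']}

# A non-zero exit code signals that at least one mismatch needs investigation.
exit_code = 1 if any(result.values()) else 0
print(exit_code)  # 1
```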

Slow-brain pass is rejected for cross-check disagreement

This is expected behavior, not a bug. The composer's direction_bias disagreed with the OpenAI cross-check, so the published params were coerced to flat. That's the FSD §2.2 cross-family-diversity gate doing its job — you'll see the disagreement in the Postgres theses row's decision metadata.

Practice walkthrough prints no accepted theses

Either no theses were accepted on that UTC date, or you're querying the wrong date. Check SELECT thesis_id, decision, created_at FROM theses ORDER BY created_at DESC LIMIT 20; in Postgres. Or use --simulate-days to run on synthetic theses instead.


Last reviewed against the spec: 2026-05-08. When the runbook drifts from the code, fix the code or fix this page — not both, and never silently.