Data backfill

When to use: bootstrapping a new environment with historical market data, triggering a gap-fill after an ingest outage, or auditing coverage.

gordon-data is the sole writer of market_data.*. All historical data flows through its backfill CLI — not through gordon-lab or direct SQL inserts. gordon-lab is read-only at the DB layer.

Prerequisites

  • gordon-data is running and reachable.
  • GORDON_DATABASE_URL is set.
  • For dev stack: the stack started by make dev-up is running.
  • For production: the v7 stack on srv-apps is healthy (GET /healthz returns 200 on port 8081).
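
The prerequisites above can be checked in one pass before starting a backfill. The following is a minimal sketch, not part of the gordon tooling: the function names and the HEALTH_URL variable are hypothetical, and the default host (localhost) is an assumption; only the /healthz path, port 8081, and GORDON_DATABASE_URL come from this page.

```shell
# Preflight sketch for the prerequisites above.
# HEALTH_URL is an assumption: path and port come from this page, the host is a guess.
HEALTH_URL="${HEALTH_URL:-http://localhost:8081/healthz}"

check_env() {
  # Fail when GORDON_DATABASE_URL is unset or empty.
  if [ -z "${GORDON_DATABASE_URL:-}" ]; then
    echo "GORDON_DATABASE_URL is not set" >&2
    return 1
  fi
}

check_health() {
  # Ask curl for the status code only and require exactly 200.
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' "$HEALTH_URL")
  if [ "$code" != "200" ]; then
    echo "healthz returned ${code:-no response}" >&2
    return 1
  fi
}

preflight() {
  check_env && check_health && echo "preflight ok"
}
```

Calling preflight prints "preflight ok" only when both checks pass; otherwise it returns non-zero with the reason on stderr.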

One-shot bootstrap

Run make dev-seed for the first-time bootstrap on a fresh dev environment. This triggers the full historical fetch for BTC+ETH:

bash
make dev-seed

First run: approximately 20 minutes for 1y of klines + funding + OI + metrics + Fear and Greed + stablecoin supply + GEX snapshot + FRED macro.

Re-runs are idempotent: only the gap since the last fetch is retrieved, and rows that already exist are skipped on insert (ON CONFLICT DO NOTHING).

Skip stages:

bash
SKIP_GEX=1 SKIP_MACRO=1 make dev-seed

Override the window:

bash
FROM_DATE=2025-01-01 make dev-seed

gordon-data backfill CLI

For targeted backfills, gap-fills, or production use, call the gordon-data backfill CLI directly via the running container or process.

Trigger a backfill

bash
gordon-data backfill <source> trigger

Sources:

Source          What it fetches
klines          1m spot and perp klines for all configured symbols
funding         8h funding rates
open_interest   Hourly OI snapshots
metrics         Long/short ratios
sentiment       Fear and Greed index + stablecoin supply ratio
macro           FRED macro indicators
gex             Gamma exposure snapshots

Example — trigger a funding backfill:

bash
gordon-data backfill funding trigger
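
To trigger every source in turn, the same command can be wrapped in a loop. This is a sketch under assumptions: trigger_all and GORDON_DATA_BIN are hypothetical names introduced here (the variable only lets the binary be swapped out for a dry run), and whether trigger blocks until the job finishes is not stated on this page, so the loop may simply queue the jobs.

```shell
# Hypothetical helper: trigger each documented source in turn.
# GORDON_DATA_BIN is an assumed override for dry runs; it defaults to the real CLI.
GORDON_DATA_BIN="${GORDON_DATA_BIN:-gordon-data}"
SOURCES="klines funding open_interest metrics sentiment macro gex"

trigger_all() {
  for src in $SOURCES; do
    echo "triggering $src" >&2
    # Stop at the first source whose trigger command fails.
    "$GORDON_DATA_BIN" backfill "$src" trigger || {
      echo "trigger failed for $src" >&2
      return 1
    }
  done
}
```

Setting GORDON_DATA_BIN=echo before calling trigger_all prints the commands that would run instead of executing them.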

Coverage audit

bash
gordon-data backfill report

Prints a coverage table per source: symbol, timeframe, earliest row, latest row, gap count. Use this after an ingest outage to identify what needs to be re-fetched.

Expected output (abridged):

Source        Symbol   Earliest              Latest                Gaps
klines_spot   BTCUSDT  2024-01-01 00:00:00   2026-05-17 12:00:00   0
klines_spot   ETHUSDT  2024-01-01 00:00:00   2026-05-17 12:00:00   2
funding       BTCUSDT  2024-01-01 00:00:00   2026-05-17 08:00:00   0

For klines, a gap count above 0 means 1m bars are missing within that symbol's window; in the sample above, ETHUSDT has 2 gaps in klines_spot. Trigger a klines backfill to fill them; gordon-data fills gaps inline during ingest.
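
When the report is long, the rows that need attention can be filtered out with awk. This is a sketch that assumes the real output keeps the layout of the abridged sample, with the gap count as the last whitespace-separated field; rows_with_gaps is a hypothetical helper name.

```shell
# Reads `gordon-data backfill report` output on stdin and prints rows with gaps.
# Assumes the gap count is the last field, as in the abridged sample.
rows_with_gaps() {
  # The header row is skipped because its last field ("Gaps") is not numeric.
  awk '$NF ~ /^[0-9]+$/ && $NF + 0 > 0 { print $1, $2, $NF }'
}
```

Usage: gordon-data backfill report | rows_with_gaps. Each output line holds the source, the symbol, and the gap count.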

Job control

Active backfill jobs can be monitored and cancelled via the gordon-data REST API.

List active jobs

bash
curl -fsS http://localhost:8081/backfill/jobs | jq .

Expected output:

json
[
  {
    "id": "01JXXXXXXXXXXXXXXXXXXXXXXXXX",
    "source": "klines",
    "status": "running",
    "rows_fetched": 14400,
    "started_at": "2026-05-17T12:00:00Z"
  }
]

Cancel a job

bash
curl -X DELETE http://localhost:8081/backfill/jobs/<id>

Expected: HTTP 200 with {"status":"cancelled"}. The job stops at the next checkpoint; rows already written are not rolled back, and re-triggering the backfill later is safe because writes are idempotent.
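
The two calls above can be combined to cancel every running job at once. This is a sketch that assumes the list response has the shape of the sample (an array of objects with id and status fields); running_job_ids, cancel_running_jobs, and BASE_URL are hypothetical names, and jq must be installed.

```shell
# BASE_URL default matches the localhost examples on this page.
BASE_URL="${BASE_URL:-http://localhost:8081}"

# Prints the id of every job whose status is "running" (job-list JSON on stdin).
running_job_ids() {
  jq -r '.[] | select(.status == "running") | .id'
}

# Sends one DELETE per running job and prints each response on its own line.
cancel_running_jobs() {
  curl -fsS "$BASE_URL/backfill/jobs" | running_job_ids | while read -r id; do
    curl -fsS -X DELETE "$BASE_URL/backfill/jobs/$id"
    echo
  done
}
```

Since each job stops only at its next checkpoint, listing the jobs again after a short wait confirms that they are gone.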

Verify

After a backfill completes:

bash
gordon-data backfill report

All gap counts should be 0 for the targeted source/symbol combination.

For klines specifically, cross-check row counts and time bounds directly in Postgres:

bash
docker compose exec postgres psql -U gordon -d gordon -c "
  SET search_path = market_data;
  SELECT symbol, COUNT(*) AS rows, MIN(open_time) AS earliest, MAX(open_time) AS latest
  FROM spot_klines
  GROUP BY symbol
  ORDER BY symbol;
"

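To judge whether a klines row count is complete, compare it with the number of 1m bars the window should contain. This is a sketch under assumptions: expected_bars is a hypothetical helper, it treats open_time as the start of each bar with both bounds inclusive, and it relies on GNU date (the -d flag), so it will not work with BSD/macOS date as written.

```shell
# Expected number of 1m bars between two UTC timestamps, both inclusive.
# Requires GNU date (-d); BSD/macOS date uses different flags.
expected_bars() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  # One bar per 60 seconds, plus one so the first and last bars both count.
  echo $(( (end - start) / 60 + 1 ))
}
```

Usage: expected_bars "2024-01-01 00:00:00" "2024-01-01 23:59:00" prints 1440, one bar per minute of that day. If the rows value from the query is lower than the result for the earliest and latest values of a symbol, bars are missing.
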