Designing Durable Event Tables (Without Crying in Prod)
Event tables rot fast when every product pod ships whatever JSON felt right that sprint. Durable design reduces rework, retro fixes, and silent metric drift.
Common Failure Modes
- Mutable names (
button_clickedrenamed tocta_clicked) break historical queries. - Mixed semantic layers (session + user + anonymous in one table).
- Late-arriving mobile batches causing day splits to shift for days.
- Replays double counting revenue events.
Principles I Stick To
- Stability over clever names. Prefer
product_add_to_cartovercart_add. - One grain per table. If you need session + user, build a derived view.
- Append only. Fix upstream generation, never delete rows quietly.
- Version intentionally. If structure meaningfully changes, create
event_v2.
Schema Skeleton
CREATE TABLE events (
event_id BIGINT, -- platform generated
user_id BIGINT,
session_id BIGINT,
event_name TEXT,
event_time TIMESTAMP,
src_ts TIMESTAMP, -- raw ingestion time
received_at TIMESTAMP, -- warehouse processed time
context_json VARIANT, -- original payload
app_version TEXT,
platform TEXT,
country_code TEXT
) CLUSTER BY (event_time);
Patterns That Help
1. Late Data Bucketing: Maintain a rolling fix-up job that recomputes last 3 days aggregates only.
2. Replay Safety: Use a
primary_key(event_id) or hash of stable fields to dedupe idempotently.3. Soft Version Flag: Add
schema_version column even before you think you need it.4. Context Decoding Views: Create
events_flat_v1 with selected JSON fields for analysts— guardrails reduce regex spelunking.Data Quality Triggers
SELECT event_name,
COUNT(*) AS events,
COUNT(DISTINCT user_id) AS users,
SUM(CASE WHEN app_version IS NULL THEN 1 ELSE 0 END) AS missing_app_version
FROM events
WHERE event_time >= CURRENT_DATE - 1
GROUP BY 1
ORDER BY events DESC;
Evolving Safely
If you must change semantics (e.g., checkout_completed now fires only on payment success), freeze v1 and start emitting v2. Run a comparison report for 30 days so downstream jobs can migrate cleanly.
Durable event tables are boring by design. Boring scales.
Takeaways
- Name for clarity + future queries.
- Enforce a single grain.
- Expect late data; design for it.
- Add a schema version flag early.
- Guard replays with deterministic IDs.
Build once, extend sanely, sleep better.