Analytics Dashboard — Case Study

01 — The problem

Reporting was a monthly forensic project.

Every month, the same exercise: pull a fresh Shopify CSV, ferry it into a different sheet for each question (revenue, retention, cohorts, geography, payments, RFM), and stitch the answers into a deck. Half the month was over by the time anyone had the numbers from the half that ended.

A few earlier dashboards existed — fragile Google Sheets that recalculated for 20 minutes, a Streamlit prototype on a laptop tunnelled out through Cloudflare. Both worked. Neither was something the team could actually rely on.

02 — Approach

Parquet at rest, DuckDB at query.

The bet was simple: store everything as columnar Parquet, query it with DuckDB, ship a thin FastAPI layer and a React frontend. No data warehouse, no Snowflake bill, no ETL platform. The whole stack runs on one t3.large in Tokyo.

A nightly sync.py appends new Shopify orders to Parquet. On service start, the query cache pre-warms the heaviest views so the first user of the day doesn't pay the cold-start tax.

190k+ Orders queried

Nightly Incremental sync

<200ms Warm query latency

SHOPIFY API

Nightly cron

→

sync.py

Append to Parquet

→

DUCKDB

Pre-warmed cache

→

FASTAPI

JSON endpoints

→

Dashboard

React · Vite

03 — Modules

Eight views, one source of truth.

Each module reads from the same Parquet snapshot. No hidden joins, no second copy of the data, no "this number doesn't match the other dashboard."

Executive KPIs — revenue, AOV, units, repeat-customer rate, returning-revenue share, with deltas vs prior period.
Revenue trends — daily/weekly/monthly with channel splits and rolling averages.
India geography heatmap — state and pincode-level performance overlaid on real coordinates.
Payments — COD vs prepaid split, gateway breakdown, GoKwik reconciliation.
Product performance — SKU sales, returns rate, combo SKUs decomposed to components.
Cohort retention — triangular matrix, ageing windows, retention curves by acquisition month.
RFM segmentation — recency, frequency, monetary scoring and named segments.
Weekday patterns — hour × day heatmaps revealing dispatch and demand cycles.

04 — The migration

From a laptop and a tunnel to AWS Tokyo.

Earlier life: Streamlit on a personal laptop, exposed via a Cloudflare tunnel. The day the laptop closed, the dashboard died. Migration target: treat the dashboard like the production system it had quietly become.

Uptime

0%

Pages loaded

8 modules

Daily backup

Automated

SSO

Workspace

FastAPI DuckDB Parquet React Vite Shopify API AWS EC2 t3.large Caddy systemd Google Workspace SSO Cloudflare DNS

05 — Outcome

Numbers, on demand.

Leadership opens the dashboard, picks a date range, gets answers in milliseconds. The monthly reporting forensics ended. The pincode RTO and combo-affinity work below were only feasible because this layer existed first — they piggyback on the same Parquet snapshots and DuckDB engine.

Analytics —Shopify BI

Reporting was a monthly forensic project.

Parquet at rest, DuckDB at query.

Eight views, one source of truth.

From a laptop and a tunnel to AWS Tokyo.

Numbers, on demand.

Analytics —
Shopify BI