Tools: SQL (SQLite) · Python · DB Browser for SQLite
Dataset: Brazilian E-Commerce by Olist (Kaggle) — 96,477 orders
Status: ✅ Complete
Analyse 96,000+ e-commerce transactions to answer key business questions on revenue, customers, delivery, and product performance.
- Data Loading — Python (Pandas + SQLite) to import 5 CSV tables
- SQL Analysis — 10 advanced queries (CTEs, Window Functions, JOINs)
- Business Insights — actionable recommendations from findings
- Total Revenue: $15,422,461 across 96,477 orders
- Avg Order Value: $153.07
- Top Category: Beauty & Health — $1,258,681 revenue
- Pareto Finding: 16 out of 73 categories drive 79% of revenue
- Peak Month: November 2017 — $1,153,528 (Black Friday effect)
- Payment: Credit card dominates — 76% of all transactions
- Delivery: São Paulo fastest — avg 8.8 days
- Late Deliveries: Alagoas worst — 23.93% late rate
- Retention: 99.9% one-time buyers — huge retention opportunity
- JOINS (INNER JOIN across 5 tables)
- CTEs (Common Table Expressions)
- Window Functions (RANK, LAG, SUM OVER, AVG OVER)
- Aggregate Functions (SUM, COUNT, AVG, ROUND)
- CASE WHEN statements
- Subqueries
- Date functions (strftime, julianday)
- Focus marketing on Beauty & Health — top revenue category
- Launch loyalty program — 99.9% one-time buyers is a big gap
- Improve delivery in Northeast — AL, MA, PI have 15-24% late rates
- Target November campaigns — Black Friday drives 53% MoM growth
- Expand credit card offers — 76% preference, highest avg order value
| File | Description |
|---|---|
queries/analysis.sql |
All 10 SQL queries |
setup.ipynb |
Python script to load CSVs into SQLite |
ecommerce.db |
SQLite database |
WITH category_revenue AS (
SELECT product_category_name AS category,
ROUND(SUM(price), 2) AS revenue
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
GROUP BY category
)
SELECT category, revenue,
ROUND(SUM(revenue) OVER (ORDER BY revenue DESC) * 100.0 /
SUM(revenue) OVER(), 2) AS cumulative_pct
FROM category_revenue
ORDER BY revenue DESC;