Lewis NDAMBIRI | Industrial Engineer & AI/Data Science Specialist

Dual Master's in Industrial Engineering & Computer Science | Optimization, Data, Automation

View My GitHub Profile

Retail Analytics Dashboard: End-to-End SQL & ETL Project

Python PostgreSQL Power BI License

🎯 Business Problem Statement

Retail executives lack real-time, granular visibility into performance drivers - making it difficult to optimize inventory, allocate marketing spend, and retain high-value customers. Without an integrated analytics pipeline that transforms raw transactional data into actionable insights, decisions are reactive, inefficient, and often based on incomplete information.

This project solves that gap by delivering an end-to-end analytics engine that:


From Python ETL β†’ PostgreSQL β†’ Advanced SQL β†’ Power BI

This is an end-to-end analytics project demonstrating how raw transactional data can be transformed into actionable business insights using PostgreSQL, Python ETL, advanced SQL, and Power BI.

This project combines data engineering, analytical SQL, and business intelligence principles to answer key questions about revenue, products, stores, and customer behavior.

Retail Analytics Dashboard


🎯 Objective


πŸ“ Repository Structure

retail-analytics-sql/
β”œβ”€β”€ data/ # Raw synthetic CSV files generated by ETL scripts
β”œβ”€β”€ docs/ # ER diagram & Power BI dashboard screenshot
β”œβ”€β”€ etl/ # Python scripts for data generation and loading
β”œβ”€β”€ sql/ # SQL schema and business-focused queries
β”œβ”€β”€ requirements.txt # Python dependencies
└── README.md # Project documentation

πŸ—ƒ Database Schema

The project uses a star schema for OLAP queries:

The schema is defined in sql/schema.sql and enables efficient analytical queries.

ER Diagram


βš™οΈ ETL Pipeline

The ETL workflow is implemented in Python (etl/generate_data.py and etl/load_data.py):

  1. Extract: Read raw CSV data from data/.
  2. Transform:
    • Load dimension tables directly.
    • Calculate profit per transaction:
      profit = total_amount - (quantity * unit_cost)
      
    • Prepare sales_fact data for loading.
  3. Load: Insert transformed data into PostgreSQL tables (sales_fact and dimension tables).

🧠 Advanced SQL Queries

The sql/queries.sql file contains queries designed to answer real business questions:

Key Analysis

  1. Sales Overview
    • Monthly revenue, growth %, and running totals by region
    • Example:
      SELECT
          sale_month,
          SUM(total_amount) AS monthly_revenue,
          LAG(SUM(total_amount)) OVER (ORDER BY sale_month) AS prev_month_revenue
      FROM sales_fact
      GROUP BY sale_month;
      
  2. Product Analytics
    • Top products by revenue, category rollups, store-level rankings
  3. Customer Analytics
    • RFM segmentation: Recency, Frequency, Monetary value
    • Cohort retention: M0, M1, M2 analysis
  4. Store & Regional Performance
    • Revenue trends, Average Order Value (AOV), and KPI vs target
  5. KPI & Dashboard Metrics
    • Profit margins, growth %, ranking, and regional performance

Techniques used: CTEs, window functions (LAG, RANK, NTILE), ROLLUP for subtotals, and safe division (NULLIF) for production-ready queries.


πŸ“Š Key Metrics & Insights


πŸš€ Setup Instructions

Prerequisites

1. Create Database

2. Install Python Dependencies

3. Generate & Load Data

4. Run Queries

5. Visualize Metrics

πŸ“Š Insights & Use Cases

πŸ›  Technologies Used

πŸ“œ License

This project is licensed under the MIT License – see LICENSE for details. Free to use, modify, and learn from. Attribution appreciated.

πŸ’‘ This project demonstrates how to transform raw transactional data into actionable insights using data engineering, SQL analytics, and business intelligence techniques.