E-Commerce Sales Data Project
- Bethel Okoye
- Sep 16
Phase 1: Data Analysis with Python
In this phase, we use Python to load and analyze a synthetic e-commerce sales dataset. The dataset contains 20,000 transactions with columns including OrderID, CustomerID, ProductID, Quantity, Price, OrderDate, Region, and TotalAmount.
Key steps in Python:
- Load the data using pandas.
- Clean and preprocess the data.
- Perform basic analysis such as total sales, average order value, top-selling products, and regional performance.
Solution
1. First, import the required Python packages, then load the dataset and clean it.
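As a rough sketch of this step, the snippet below loads a CSV with pandas and applies a few common cleaning steps. The column names come from the dataset description; the specific cleaning steps (dropping duplicates and incomplete rows) and the `load_and_clean` helper are assumptions for illustration, and the four sample rows are made up so the snippet runs on its own.

```python
import io

import pandas as pd


def load_and_clean(source):
    """Load the sales CSV and apply basic cleaning (assumed steps)."""
    # Parse OrderDate as a datetime so it can be grouped by month later.
    df = pd.read_csv(source, parse_dates=["OrderDate"])
    # Drop exact duplicate rows, then rows with any missing value.
    df = df.drop_duplicates()
    df = df.dropna()
    return df.reset_index(drop=True)


# Made-up rows in the same column layout, for illustration only.
sample_csv = io.StringIO(
    "OrderID,CustomerID,ProductID,Quantity,Price,OrderDate,Region,TotalAmount\n"
    "1,C1,P1,2,10.0,2023-01-05,North,20.0\n"
    "1,C1,P1,2,10.0,2023-01-05,North,20.0\n"
    "2,C2,P2,1,,2023-01-06,South,\n"
    "3,C3,P1,3,5.0,2023-02-01,East,15.0\n"
)
clean = load_and_clean(sample_csv)
print(clean)
```

To run it on the real data, pass the CSV path listed under Dataset Information instead of `sample_csv`.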

Next, run some basic checks to validate the data.
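The checks could look something like the sketch below. Which checks to run is a judgment call; the ones here (missing values, duplicate rows, non-positive quantities, negative prices) and the `validation_report` helper are assumptions, and the sample rows are made up.

```python
import pandas as pd


def validation_report(df):
    """Return a few basic data-quality counts for the sales table."""
    return {
        "rows": len(df),
        "missing_values": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "non_positive_quantity": int((df["Quantity"] <= 0).sum()),
        "negative_price": int((df["Price"] < 0).sum()),
    }


# Made-up rows: one zero quantity and one missing price.
sample = pd.DataFrame(
    {
        "OrderID": [1, 2, 3],
        "Quantity": [2, 0, 1],
        "Price": [10.0, 5.0, None],
    }
)
report = validation_report(sample)
print(report)
```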


2. Perform basic analysis such as total sales, average order value, top-selling products, and regional performance.
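A minimal sketch of these four metrics is shown below. It assumes that average order value means revenue divided by the number of distinct orders, and that "top-selling" means units sold; both are interpretations rather than something the dataset dictates. The `basic_analysis` helper and the sample rows are made up for illustration.

```python
import pandas as pd


def basic_analysis(df, top_n=5):
    """Compute total sales, average order value, top products, and regional sales."""
    total_sales = df["TotalAmount"].sum()
    # Average order value: revenue per distinct order (assumed definition).
    avg_order_value = total_sales / df["OrderID"].nunique()
    # Top-selling products by units sold (assumed definition).
    top_products = df.groupby("ProductID")["Quantity"].sum().nlargest(top_n)
    sales_by_region = (
        df.groupby("Region")["TotalAmount"].sum().sort_values(ascending=False)
    )
    return total_sales, avg_order_value, top_products, sales_by_region


# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "OrderID": [1, 2, 3, 4],
        "ProductID": ["P1", "P2", "P1", "P3"],
        "Quantity": [2, 1, 3, 1],
        "TotalAmount": [20.0, 30.0, 30.0, 5.0],
        "Region": ["North", "South", "North", "East"],
    }
)
total_sales, avg_order_value, top_products, sales_by_region = basic_analysis(sample)
print(total_sales, avg_order_value)
print(top_products)
print(sales_by_region)
```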



Phase 2: Data Querying with SQL
In this phase, the cleaned dataset is loaded into a relational database. SQL is used to perform queries such as:
- Total sales per region.
- Monthly sales trends.
- Top 10 customers by sales.
- Most frequently ordered products.
Solution
First, select the database to use, then run the basic analyses.
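The post does not say which database engine was used, so the sketch below assumes SQLite through Python's built-in `sqlite3` module, with a hypothetical table name `sales`. It writes a small made-up DataFrame to an in-memory database and confirms the row count.

```python
import sqlite3

import pandas as pd

# Made-up rows standing in for the cleaned dataset.
sample = pd.DataFrame(
    {
        "OrderID": [1, 2, 3],
        "CustomerID": ["C1", "C2", "C1"],
        "ProductID": ["P1", "P2", "P1"],
        "Quantity": [2, 1, 3],
        "Price": [10.0, 30.0, 10.0],
        "OrderDate": ["2023-01-05", "2023-01-20", "2023-02-01"],
        "Region": ["North", "South", "North"],
        "TotalAmount": [20.0, 30.0, 30.0],
    }
)

# An in-memory database; pass a file path instead to keep the data on disk.
conn = sqlite3.connect(":memory:")
sample.to_sql("sales", conn, index=False, if_exists="replace")

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(row_count)
```

Dates are stored as ISO `YYYY-MM-DD` text here because SQLite has no dedicated date type.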


- Total sales per region.
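One way to write this query, assuming SQLite and a hypothetical `sales` table, is sketched below. The table holds only the columns the query needs, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (Region TEXT, TotalAmount REAL)")
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 20.0), ("South", 30.0), ("North", 30.0), ("East", 5.0)],
)

query = """
    SELECT Region, SUM(TotalAmount) AS total_sales
    FROM sales
    GROUP BY Region
    ORDER BY total_sales DESC
"""
sales_per_region = conn.execute(query).fetchall()
print(sales_per_region)
```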

- Top 10 customers by sales.
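A possible version of this query, again assuming SQLite and a hypothetical `sales` table with made-up rows, groups by customer and keeps the ten largest totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (CustomerID TEXT, TotalAmount REAL)")
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("C1", 20.0), ("C2", 30.0), ("C1", 30.0), ("C3", 5.0)],
)

query = """
    SELECT CustomerID, SUM(TotalAmount) AS total_spent
    FROM sales
    GROUP BY CustomerID
    ORDER BY total_spent DESC
    LIMIT 10
"""
top_customers = conn.execute(query).fetchall()
print(top_customers)
```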

- Monthly sales trends.
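The sketch below shows one way to get monthly totals. It assumes SQLite, a hypothetical `sales` table, and dates stored as ISO `YYYY-MM-DD` text; `strftime` is SQLite's date function, and other engines use different ones. The rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (OrderDate TEXT, TotalAmount REAL)")
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2023-01-05", 20.0),
        ("2023-01-20", 30.0),
        ("2023-02-01", 30.0),
        ("2023-02-10", 5.0),
    ],
)

query = """
    SELECT strftime('%Y-%m', OrderDate) AS month,
           SUM(TotalAmount) AS total_sales
    FROM sales
    GROUP BY month
    ORDER BY month
"""
monthly_sales = conn.execute(query).fetchall()
print(monthly_sales)
```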

- Most frequently ordered products.
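"Most frequently ordered" is read here as the number of order rows a product appears in, not the units sold; that reading is an assumption, and swapping `COUNT(*)` for `SUM(Quantity)` gives the other one. As before, the sketch assumes SQLite, a hypothetical `sales` table, and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ProductID TEXT, Quantity INTEGER)")
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("P1", 2), ("P2", 1), ("P1", 3), ("P3", 1)],
)

query = """
    SELECT ProductID, COUNT(*) AS times_ordered
    FROM sales
    GROUP BY ProductID
    ORDER BY times_ordered DESC, ProductID
    LIMIT 10
"""
most_ordered = conn.execute(query).fetchall()
print(most_ordered)
```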

Phase 3: Data Visualization
In this phase, tools like Tableau, Power BI, or matplotlib/seaborn are used to create visualizations such as:
- Sales trends over time.
- Regional sales comparison.
- Heatmaps of product popularity.
- Customer segmentation by purchase behavior.
Solution
- Sales trends over time.
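With matplotlib, a monthly trend line could be drawn roughly as follows. The choice of monthly granularity is an assumption, and the sample rows are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "OrderDate": pd.to_datetime(
            ["2023-01-05", "2023-01-20", "2023-02-01", "2023-03-10"]
        ),
        "TotalAmount": [20.0, 30.0, 30.0, 5.0],
    }
)

# Sum sales per calendar month.
monthly = sample.groupby(sample["OrderDate"].dt.to_period("M"))["TotalAmount"].sum()

fig, ax = plt.subplots()
ax.plot(monthly.index.astype(str), monthly.values, marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Total sales")
ax.set_title("Sales trend over time")
```

Call `fig.savefig(...)` or `plt.show()` afterwards to see the chart.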


- Regional sales comparison.
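A bar chart is one simple way to compare regions. The sketch below uses matplotlib and made-up rows; the chart type is an assumption, not necessarily what the original screenshots showed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "Region": ["North", "South", "North", "East"],
        "TotalAmount": [20.0, 30.0, 30.0, 5.0],
    }
)

# Total sales per region, largest first.
by_region = sample.groupby("Region")["TotalAmount"].sum().sort_values(ascending=False)

fig, ax = plt.subplots()
ax.bar(by_region.index, by_region.values)
ax.set_xlabel("Region")
ax.set_ylabel("Total sales")
ax.set_title("Sales by region")
```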


- Product popularity.
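For a heatmap of product popularity, one option is to pivot units ordered by product and region and draw the grid with matplotlib's `imshow`. Measuring popularity in units and breaking it down by region are both assumptions, and the rows are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "ProductID": ["P1", "P2", "P1", "P3"],
        "Region": ["North", "South", "North", "East"],
        "Quantity": [2, 1, 3, 1],
    }
)

# Rows are products, columns are regions, cells are units ordered.
pivot = sample.pivot_table(
    index="ProductID", columns="Region", values="Quantity",
    aggfunc="sum", fill_value=0,
)

fig, ax = plt.subplots()
im = ax.imshow(pivot.values, cmap="viridis", aspect="auto")
ax.set_xticks(range(len(pivot.columns)))
ax.set_xticklabels(list(pivot.columns))
ax.set_yticks(range(len(pivot.index)))
ax.set_yticklabels(list(pivot.index))
fig.colorbar(im, ax=ax, label="Units ordered")
ax.set_title("Product popularity by region")
```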


- Customer segmentation by purchase behavior.
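Segmentation can be done in many ways. As a minimal sketch, the snippet below groups customers by total spend using fixed thresholds and plots order count against spend. The thresholds (25 and 100) and the segment names are hypothetical, and the rows are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "OrderID": [1, 2, 3, 4, 5],
        "CustomerID": ["C1", "C2", "C1", "C3", "C4"],
        "TotalAmount": [20.0, 30.0, 30.0, 5.0, 150.0],
    }
)

# One row per customer: number of orders and total spend.
per_customer = sample.groupby("CustomerID").agg(
    orders=("OrderID", "nunique"),
    total_spent=("TotalAmount", "sum"),
)

# Hypothetical spend thresholds; tune these to the real distribution.
per_customer["segment"] = pd.cut(
    per_customer["total_spent"],
    bins=[0, 25, 100, float("inf")],
    labels=["low", "medium", "high"],
)

fig, ax = plt.subplots()
for name, group in per_customer.groupby("segment", observed=True):
    ax.scatter(group["orders"], group["total_spent"], label=str(name))
ax.set_xlabel("Number of orders")
ax.set_ylabel("Total spent")
ax.set_title("Customer segments by purchase behavior")
ax.legend()
```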


Dataset Information
The dataset used for this project is stored as a CSV file: /mnt/data/ecommerce_sales_data.csv