profile

Starting With Data

🟒 Python Power-Ups: Boost Your Sales Analysis in 3 Simple Steps (And Yes, the Sample Code is Included!)

Published 7 months agoΒ β€’Β 7 min read

Starting with Data

Actionable data analytics tips, tricks, and tutorials to increase your earning potential. Delivered right to your inbox every Saturday.

Python Power-Ups: Boost Your Sales Analysis in 3 Simple Steps (And Yes, the Sample Code is Included!)

Every time I speak to sales managers, they share the same complaint.

"I have so much sales data, but I don't know what to do with it."

Sound familiar?

Drowning in numbers:

  • Monthly targets
  • Lead conversion rates
  • Quarterly revenue figures

A relentless barrage of data. Impossible to keep up with.

And even if you somehow manage, interpreting that data to make meaningful decisions? A colossal task.

You've considered learning programming, and diving into Python, perhaps. But even the thought of setting up complicated software, dealing with installations, and debugging code is daunting.

It's a complete nightmare. And it doesn't feel worth all the hassle.

Never feel overwhelmed with data again

In this guide, I break down the entire process into 3 manageable steps.

The best part?

No installations.

No fuss.

Just pure, actionable steps.

Here's what's waiting for you:

  • Quickly generate insights from your sales data
  • Guide your sales team based on real numbers
  • Make proactive decisions to drive business goals

Sounds like a dream, right?

The D.A.T.A Framework

Over the past 15+ years of my data career, I've followed the same 4 steps for every new project. And I call it the D.A.T.A. Framework. Clever, I know. πŸ˜‰

  • Define: Start by defining sales objectives and the key metrics that matter.
  • Acquire: Gather the data you need. This might involve exporting data from different systems or tools. You also want to gather requirements at this step.
  • Transform: Convert raw data into a format suitable for analysis. Clean up inconsistencies, handle missing values, and structure your data to make your life easy.
  • Analyze: Use statistical tools, charts, and graphs to study trends, compare sales figures, and deduce patterns that can inform business strategies.

Don't overcomplicate it! Just keep that framework in mind (like I do) to keep things simple.

πŸ‘‰ btw if you want me to share more about this framework (and others that I use all the time) just let me know!

Okay, let's apply the D.A.T.A. Framework to solve the sales manager's problem with Python.

Remember: there are only two types of data professionals: action-takers and sideliners.

This is where you decide if you're an action-taker or a sideliner.

Use my Global Superstore Sales Analysis with Python notebook so you can follow along. A free Google Colab account is required, but you can also download the ipynb file.

Step 1: Setup and Configuration

The first step is setting everything up.

Protip: Always ensure you have the latest version of libraries to avoid compatibility issues.

# Install the latest version of the plotly library
!pip install plotly --upgrade

# Import necessary libraries
import pandas as pd             # Library for data manipulation and analysis
import plotly.express as px    # Plotly Express for simple syntax plotting
import plotly.io as pio        # To access Plotly's input-output module 

# Set the default plotly template to 'seaborn' for consistent and visually appealing plots
px.defaults.template = "seaborn"

Step 2: Load the Data and Get a Preview

Next, let's load the Superstore Sales dataset and get a preview. Understanding your data structure is the first step in any analysis. This is the Acquire step of the D.A.T.A Framework.

Rookie mistake: Not checking the first few rows of your dataset. Always inspect the initial rows to understand your data's structure.

# Load the Superstore Sales dataset

# URL pointing to the raw data file on GitHub
url = 'https://raw.githubusercontent.com/yannie28/Global-Superstore/master/Global_Superstore(CSV).csv'
# Read the CSV data from the URL into a pandas DataFrame
data = pd.read_csv(url)

# Display the first few rows of the data to understand its structure and get a quick overview
data.head()

Here's what it should look like:

Step 3: Analyze the Data

The final step for using Python to build a sales analysis is to actually create the analysis.

Now, if you've been following the D.A.T.A. Framework, this is the final two parts. I'll show you how to transform the raw CSV data into something useable and then we'll build out three different analyses, all in Python.

Analysis 1: Sales Analysis by Category

Visualizing total sales by category provides a high-level overview of where the majority of sales are coming from.

Proper labeling in your plots is essential. It makes your charts easily understandable to anyone viewing them.

# Create a bar chart using Plotly Express
fig = px.bar(
    # Group data by 'Category', sum the values, then reset the index for plotting
    data.groupby('Category').sum().reset_index(),
    # Set 'Category' as the x-axis variable
    x='Category',
    # Set 'Sales' as the y-axis variable
    y='Sales',
    # Provide a title for the chart
    title='Total Sales by Category',
    # Rename the 'Sales' label for clarity
    labels={'Sales': 'Total Sales ($)'},
    # Use the 'seaborn' theme for the chart
    template='seaborn'
)
# Update the trace to display y-values on the bars and format them as currency
fig.update_traces(texttemplate='%{y:$,.0f}', textposition='outside')
# Display the generated chart
fig.show()

Here's what you should see

Analysis 2: Monthly Sales Analysis

Breaking down sales on a monthly basis helps in understanding trends, seasonal variations, and anomalies.

Protip: Be sure to convert any date columns into a datetime datatype. Always ensure date-related operations are performed on columns of the correct datatype.

# Convert the 'Order Date' column from string format to datetime format
data['Order Date'] = pd.to_datetime(data['Order Date'])

# Extract the month and year from 'Order Date' and store it in a new column 'Order Month'
data['Order Month'] = data['Order Date'].dt.to_period('M').astype(str)

# Group the data by 'Order Month' and calculate the sum for each month, then reset the index for plotting
monthly_sales = data.groupby('Order Month').sum().reset_index()

# Create a line chart to visualize monthly sales over time
fig = px.line(monthly_sales, 
              x='Order Month',  # x-axis: Month and Year
              y='Sales',        # y-axis: Sales amount
              labels={'Sales': 'Monthly Sales ($)', 'Order Month': 'Month'},
              title='Monthly Sales Over Time')

# Update the y-axis to display values as currency (with $ prefix)
fig.update_layout(yaxis_tickprefix='$')

# Display the generated chart
fig.show()

Here's the sales data by month so you can easily spot trends:

Analysis 3: Monthly Sales Over Time by Category

Okay, now we're getting somewhere! Let's combine the insights we've gotten from the first two analyses into one where we can look at monthly trends by sales category.

# Group the data by both 'Order Month' and 'Category', 
# calculate the sum for each grouping, then reset the index for plotting
monthly_sales_by_category = data.groupby(['Order Month', 'Category']).sum().reset_index()

# Create a line chart to visualize monthly sales over time, segmented by product category
fig = px.line(monthly_sales_by_category, 
              x='Order Month',  # x-axis: Month and Year
              y='Sales',        # y-axis: Sales amount
              color='Category', # Differentiate lines by product category
              labels={'Sales': 'Monthly Sales ($)', 'Order Month': 'Month', 'Category': 'Product Category'},
              title='Monthly Sales Over Time by Category')

# Update the y-axis to display values as currency (with $ prefix)
fig.update_layout(yaxis_tickprefix='$')

# Display the generated chart
fig.show()

And here's what it should look like. Nice!!

Analysis 4: Profit vs Sales Analysis by Subcategory

Visualizing the relationship between profit and sales for each subcategory can reveal which products are the most lucrative.

More profit = more cash money!

A scatter plot is particularly useful for this type of analysis as it visually separates high-profit, high-sales products from the rest.

# Group the data by 'Sub-Category', 
# calculate the sum for each subcategory, then reset the index for plotting
subcategory_data = data.groupby('Sub-Category').sum().reset_index()

# Create a scatter plot to visualize the relationship between profit and sales for each subcategory
fig = px.scatter(subcategory_data, 
                 x='Sales',          # x-axis: Sales amount
                 y='Profit',         # y-axis: Profit amount
                 color='Sub-Category', # Differentiate points by subcategory using color
                 size='Sales',       # Vary point size based on sales amount
                 labels={'Sales': 'Total Sales ($)', 'Profit': 'Total Profit ($)'},
                 title='Profit vs Sales by Subcategory')

# Update the x and y axes to display values as currency (with $ prefix)
fig.update_layout(xaxis_tickprefix='$', yaxis_tickprefix='$')

# Display the generated scatter plot
fig.show()

And here's the profit vs. sales scatter plot (created in Python and Plotly from CSV data!)

Great job!

If you worked through those steps, you are well on your way to analyzing all sorts of data with Python.

Here's a quick recap of what you learned

  • Getting everything set up and installed properly using Google Colab
  • Loading sales data and previewing it quickly
  • Creating a category bar chart and monthly sales line chart
  • Combining categories and monthly sales to understand the data at a deeper level
  • Understanding profit vs sales by subcategory using a scatterplot

By following the D.A.T.A. Framework, we streamlined the process of deriving actionable insights from raw sales data.

Ideas for next steps

  1. Expand: What other data could you include to make your analysis even more useful? Product details? Marketing data? Sales rep data?
  2. Predict: Python has a ton of features to help you with predictive analytics use cases. This could be a game-changer in proactive decision-making.
  3. Enhance: Enhance data visualization with interactive dashboards. Tools like Dash by Plotly can be integrated with Python for this purpose.
  4. Iterate: Implement a system to act on the insights and continuously monitor the outcomes. This iterative approach ensures the sales strategy remains dynamic and responsive.

The idea for this newsletter came directly from a reader – just like you!

​Take 3 minutes to let me know what you want help with next.​

Until next time, keep exploring and happy analyzing!

Brian

PS: The final Solving with SQL cohort for 2023 (possibly EVER) starts on Oct 16. If you want to level up your SQL skills by solving real-world business problems alongside other data professionals, then you should definitely register now. I'm planning out 2024 now and this is probably the last time I'll offer cohorts for SQL only.


Whenever you're ready, there are three ways I can help

  1. Get more data analytics tips in my previous newsletter articles​
  2. Build your data analytics portfolio in 2 weeks with The Data Analytics Portfolio Playbook​
  3. Join the waitlist for Solving with SQL, my 30-day cohort-based course (final cohort for 2023)

You are receiving this because you signed up for Starting with Data, purchased one of my data analytics products, or enrolled in one of my data analytics courses. Unsubscribe at any time using the link below.

113 Cherry St #92768, Seattle, WA 98104
​Unsubscribe οΎ· Preferences​

Starting With Data

Weekly data analytics tutorials and guides. 5 minutes to read.

Learn to build analytics projects with SQL, Tableau, Excel, and Python. For data analysts looking to level up their career and complete beginners looking to get started. No fluff. No theory. Just step-by-step tutorials anyone can follow.

Read more from Starting With Data

Business leaders are dumb. That's what a lot of business analysts think because they create something "awesome" with a dataset and then it gets ignored. Unfortunately, these types of business analysts don't realize that leaders aren't dumb. They are just busy. They are responsible for making decisions and making them quickly. And leaders need answers (based on data) more than anything. Think about it: if they need answers and you have the skills to provide those answers... You become their...

5 months agoΒ β€’Β 5 min read

Starting with Data Actionable data analytics tips, tricks, and tutorials to increase your earning potential. Delivered right to your inbox every Saturday. Python Data Blending Made Easy: 3 Simple Steps to Combine Data Sets in Python (Even if You’re Not Sure Where to Start) Hey Reader, Have you ever been given a few different spreadsheets and had to combine them into one? Use clunky VLOOKUPs in Excel Copy and paste (hoping you don't break something) Give up and ask a more technical coworker...

5 months agoΒ β€’Β 3 min read

Starting with Data Actionable data analytics tips, tricks, and tutorials to increase your earning potential. Delivered right to your inbox every Saturday. Stop Guessing, Start Visualizing: Turn Retail Data into Decision-Making Tools with Interactive Python Dashboards (Tutorial + Code) Hey Reader, You won't get noticed unless you are solving real-world business problems. find a problem create a solution share your work land your dream job Pretty simple. The hard part is knowing where to start....

6 months agoΒ β€’Β 5 min read
Share this post