Data Science for Sales with Python

Source: https://unsplash.com/photos/AT77Q0Njnt0

Every data scientist, even a beginner, knows that Python is currently the most popular language, and she also knows what neural networks and support vector machines are. But which tasks actually bring value to companies on a daily basis? What business-related skills should a successful data scientist have?

Well, the first and most important thing for a business is to sell something. Sure, logistics, process optimization, HR, and other departments are important and have metrics of their own, but if there are no sales, there is no business. That’s why there are many more metrics and data associated with customers and sales than with any other department, and that’s why data scientists can have a major impact by analyzing sources of revenue.

In this article, I am going to go through some of the most common business problems that data science can solve, and I will do it in Python. To follow the examples you can find a mall-dataset, a movie-dataset, and a whiskey-dataset here.

Profiling: also known as behavior-description, profiling attempts to characterize the typical behavior of an individual, group, or population. An example profiling question would be: “What is the typical cell phone usage of this customer segment?”. The most common procedure here is to use Matplotlib to explore the shape of the distributions of variables and draw conclusions about them. Let’s see an example with the mall customers dataset:

We can see there is a categorical variable “Gender” that can translate to a dummy variable just using one-hot encoding like this:
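A minimal sketch of that encoding step with pandas. The small DataFrame and its column names are made up to stand in for the mall customers data, so adjust them to match your file:

```python
import pandas as pd

# Hypothetical stand-in for the mall customers data; column names are assumed.
customers = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5],
    "Gender": ["Male", "Female", "Female", "Male", "Female"],
    "Age": [19, 21, 35, 42, 31],
})

# One-hot encode Gender. drop_first=True drops the "Female" level,
# leaving a single "Male" column: 0 for females, 1 for males.
dummies = pd.get_dummies(customers["Gender"], drop_first=True, dtype=int)
customers["Gender"] = dummies["Male"]

print(customers)
```

Keeping a single 0/1 column instead of two avoids storing the same information twice.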

As the histogram shows, there are more female shoppers than male shoppers:

0 are females and 1 are males

Now let’s see the age pyramid:
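Before plotting, the same information can be checked numerically by binning ages into decades. This is only a sketch: the ages below are invented, and the `Age` column name is an assumption.

```python
import pandas as pd

# Made-up ages standing in for the dataset's age column.
ages = pd.Series([19, 21, 23, 31, 35, 36, 38, 42, 58, 64], name="Age")

# Bin ages into decades: [10, 20), [20, 30), ..., [70, 80).
decades = pd.cut(ages, bins=list(range(10, 81, 10)), right=False)

# sort=False keeps the bands in age order instead of ordering by count.
counts = decades.value_counts(sort=False)

print(counts)
print("Most common age band:", counts.idxmax())
```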

Most typical shoppers are 30–40 years old

Finally, let’s plot a histogram of the income and spending scores:
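A sketch of the plotting step with Matplotlib. The income and spending values are randomly generated here (a log-normal and a uniform sample, chosen only to imitate a skewed and a flat distribution), and the column names are assumptions:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data: skewed "income", roughly flat "spending".
rng = np.random.default_rng(0)
customers = pd.DataFrame({
    "income": rng.lognormal(mean=4.0, sigma=0.4, size=200),
    "spending": rng.uniform(1, 100, size=200),
})

# One histogram per variable, side by side.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, column in zip(axes, customers.columns):
    ax.hist(customers[column], bins=20)
    ax.set_title(column)
    ax.set_xlabel(column)
    ax.set_ylabel("number of customers")
fig.tight_layout()

# Sample skewness as a numeric check on the shape of each distribution.
print(customers.skew())

# Call plt.show() in an interactive session to display the figure.
```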

As expected, income follows a skewed distribution

Spending distribution is more uniform than income

Based on these conclusions, we could go on and group our customers using clustering in whatever form we prefer (by low, average, or high income/spending, gender, age, etc.). That’s what we’ll do in the following examples.

Association Discovery and Link Prediction: also known as frequent itemset mining and market-basket analysis, these techniques attempt to find associations between entities (typically products or customers) based on the transactions involving them. The most common application is cross-selling: “people who bought item X also bought item Y”.

Let’s see an example with the movies dataset. First we load the “u.data” and “movies” files and merge them into a single Pandas DataFrame, then we drop the timestamp column, which we don’t need.
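A sketch of the load-and-merge step. To keep it runnable without the files, two tiny in-memory stand-ins replace them. The tab-separated layout (user, item, rating, timestamp) follows the usual MovieLens “u.data” format, while the layout of the titles file and all the rows are assumptions:

```python
import io

import pandas as pd

# In-memory stand-ins for the two files (made-up rows, assumed layout).
ratings_file = io.StringIO(
    "0\t50\t5\t881250949\n"
    "0\t172\t5\t881250949\n"
    "1\t50\t4\t891717742\n"
)
titles_file = io.StringIO(
    "item_id,title\n"
    "50,Star Wars (1977)\n"
    "172,Fargo (1996)\n"
)

# The ratings file has no header row, so the column names are supplied here.
ratings = pd.read_csv(
    ratings_file,
    sep="\t",
    names=["user_id", "item_id", "rating", "timestamp"],
)
titles = pd.read_csv(titles_file)

# Join on the movie id, then drop the timestamp column.
df = ratings.merge(titles, on="item_id").drop(columns="timestamp")
print(df)
```

With the real files, you would pass their paths to `pd.read_csv` instead of the `StringIO` objects.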

We have joined user ratings and movie titles in a single DataFrame. Now we can explore the average rating and number of ratings for each movie like this:
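One way to compute both figures is a single `groupby` with named aggregations. This is a sketch on invented ratings, and the column names `title` and `rating` are assumptions:

```python
import pandas as pd

# Made-up ratings standing in for the merged DataFrame.
df = pd.DataFrame({
    "title": ["Star Wars (1977)", "Star Wars (1977)",
              "Fargo (1996)", "Fargo (1996)", "Fargo (1996)"],
    "rating": [5, 4, 3, 5, 4],
})

# Mean rating and number of ratings per movie, most-rated first.
summary = (
    df.groupby("title")["rating"]
    .agg(avg_rating="mean", num_ratings="count")
    .sort_values("num_ratings", ascending=False)
)
print(summary)
```

The rating count matters because an average built on a handful of ratings is not very trustworthy.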

Average rating and rating count per movie

We can pivot our DataFrame and obtain a matrix with movie titles in the columns and users in the rows, doing this:
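A sketch of the pivot with `pivot_table`, again on invented rows with assumed column names:

```python
import pandas as pd

# Made-up long-format ratings: one row per (user, movie) pair.
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "title": ["Star Wars", "Fargo", "Star Wars", "Fargo", "Star Wars"],
    "rating": [5, 3, 4, 2, 5],
})

# Users in the rows, movie titles in the columns, ratings as values.
# Movies a user has not rated show up as NaN.
ratings_matrix = df.pivot_table(index="user_id", columns="title", values="rating")
print(ratings_matrix)
```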

Finally, we can calculate the correlations between one movie’s ratings and those of every other movie, and use them to recommend similar ones. Let’s do it with Star Wars, for example:
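A sketch of the correlation step using `DataFrame.corrwith`. The user-by-movie matrix below is invented, and the titles are only illustrative:

```python
import numpy as np
import pandas as pd

# Made-up user x movie ratings matrix; NaN means "not rated".
ratings_matrix = pd.DataFrame(
    {
        "Star Wars": [5, 4, 1, 2, 5],
        "Empire Strikes Back": [5, 5, 1, 1, 4],
        "Pretty Woman": [1, 2, 5, 4, np.nan],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="user_id"),
)

# Correlate every movie's ratings with the Star Wars ratings,
# drop Star Wars itself, and rank the rest from most to least similar.
similar = (
    ratings_matrix.corrwith(ratings_matrix["Star Wars"])
    .drop("Star Wars")
    .sort_values(ascending=False)
)
print(similar)
```

On real data it is usually worth keeping only movies with a reasonable number of ratings, since correlations computed from a few shared users can be misleadingly high.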

Movies recommended for people who liked Star Wars

Clustering and Similarity Matching: These techniques deal with grouping individuals in a population together by their similarity. They are most commonly used in customer segmentation and recommendation systems, to answer questions like: Do our customers form groups? What products should we offer or develop? How should our customer care teams (or sales teams) be structured?

In the example of the Whisky dataset we can see that there are 5 types of whisky:

Looking at the distribution of scores, I see that most of them are concentrated around the mean (score = 87), so I can classify the whiskies into three categories: low quality (scores below 85), average quality (scores 85–90), and premium quality (scores greater than 90).

Here is my method to re-score them:
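A sketch of one way to apply those thresholds, using `numpy.select`. The scores and the `score` column name are made up, and this is not necessarily how the original notebook did it:

```python
import numpy as np
import pandas as pd

# Made-up scores covering each side of both thresholds.
whisky = pd.DataFrame({"score": [84, 85, 90, 91]})

# Conditions are checked in order: below 85 is "low",
# anything else up to and including 90 is "average", the rest is "premium".
conditions = [whisky["score"] < 85, whisky["score"] <= 90]
labels = ["low", "average"]
whisky["category"] = np.select(conditions, labels, default="premium")

print(whisky)
```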

Now there are 5 types of Whisky with 3 categories of ratings

Finally, it’s time to do the clustering: I will use the elbow method to determine the right number of clusters and then implement a K-Means algorithm:
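A sketch of the elbow method with scikit-learn. It runs on synthetic 2-D points drawn around three known centers rather than on the Whisky data, so the elbow appears at K=3 here. The range of K values is an arbitrary choice:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data: 150 points scattered around three well-separated centers.
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [5, 5], [0, 5]],
    cluster_std=0.5,
    random_state=0,
)

# Fit K-Means for K = 1..8 and record the within-cluster sum of squares.
k_values = list(range(1, 9))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

# The "elbow" is where adding another cluster stops reducing the error much.
for k, inertia in zip(k_values, inertias):
    print(f"K={k}: inertia={inertia:.1f}")
```

Plotting `inertias` against `k_values` gives the elbow graph. Categorical columns such as type and rating category need to be encoded as numbers before K-Means can use them.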

Error decreases as the number of clusters (K) increases

Looking at the elbow graph, we can see that the error flattens out at 15 groups, which matches our expectations exactly: 5 types × 3 rating categories.

We can group Whisky types just like this:
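A sketch of the final grouping on a toy table. The type names, the two-level rating category, and the one-hot encoding are all assumptions made for illustration. Here there are 2 types × 2 categories, so 4 clusters play the role of the 15 in the article:

```python
import pandas as pd
from sklearn.cluster import KMeans

# Made-up table: 2 hypothetical types x 2 rating categories, repeated 3 times.
whisky = pd.DataFrame({
    "type": ["Scotch", "Scotch", "Bourbon", "Bourbon"] * 3,
    "category": ["low", "premium", "low", "premium"] * 3,
})

# K-Means needs numbers, so one-hot encode both categorical columns.
features = pd.get_dummies(whisky, dtype=float)

# One cluster per distinct (type, category) combination.
n_groups = len(whisky.drop_duplicates())
model = KMeans(n_clusters=n_groups, n_init=10, random_state=0)
whisky["cluster"] = model.fit_predict(features)

print(whisky.drop_duplicates().sort_values("cluster"))
```

The cluster numbers themselves are arbitrary; what matters is that rows with the same type and category end up with the same label.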

Perfect: we have grouped our unlabeled Whisky data into 15 clusters, each with its own label. Now we can use this new dataset for supervised learning tasks, but that will be the subject of my next article.

Happy coding!
