Business Intelligence Visualizations with Python — Part 2

 3 years ago
source link: https://towardsdatascience.com/business-intelligence-visualizations-with-python-part-2-92f8a8463026

1. Additional Plot Types

Even though these plot types are included in the second part of this series of Business Intelligence Visualizations with Python, they are no less important, as they complement the plots already introduced. I believe you’ll find them even more interesting than the basic plots!

To begin, we import the required libraries:

# Imports
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D

A. Horizontal Bar Plots with error bars:

A bar plot is a chart that presents data using rectangular bars whose heights (or lengths, for horizontal bars) are proportional to the values they represent. The basic command for bar charts is plt.bar(x_values, y_values).

The additional feature in this plot is error bars, which are graphical representations of the variability of data. They are commonly used to indicate the estimated error in a measurement.

This time, we’ll be plotting a horizontal bar plot with the following input data:

# Input data for error bars and labels
mean_values = [1, 2, 3]
std_dev = [0.2, 0.3, 0.4]
bar_labels = ['Bar 1', 'Bar 2', 'Bar 3']
y_values = [0,1,2]

Now let’s plot the bars with the plt.barh command:

# Create bar plots
plt.yticks(y_values, bar_labels, fontsize=10)
plt.barh(y_values, mean_values, xerr=std_dev, align='center', alpha=0.5, color='red')

# Labels and plotting
plt.title('Horizontal Bar plot with error', fontsize=13)
plt.xlim([0, 3.5])
plt.grid()
plt.show()
Sample plot — Image by Author
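
In practice the means and standard deviations usually come from raw measurements rather than being typed in by hand. The following is a minimal sketch of that step, assuming hypothetical random samples (the `samples` array and its parameters are illustrative and not part of the original data). Call plt.show() afterwards to display the figure.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical raw measurements: 50 samples for each of 3 groups (illustrative only)
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=[1.0, 2.0, 3.0], scale=[0.2, 0.3, 0.4], size=(50, 3))

# Summary statistics per group (one column per bar)
mean_values = samples.mean(axis=0)
std_dev = samples.std(axis=0, ddof=1)  # sample standard deviation

bar_labels = ['Bar 1', 'Bar 2', 'Bar 3']
y_values = np.arange(len(bar_labels))

fig, ax = plt.subplots()
# xerr draws a horizontal error bar of +/- one standard deviation on each bar
container = ax.barh(y_values, mean_values, xerr=std_dev,
                    align='center', alpha=0.5, color='red')
ax.set_yticks(y_values)
ax.set_yticklabels(bar_labels)
ax.set_title('Horizontal bar plot with error bars from samples')
```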

A variation of this plot can be made with the insertion of labels or texts to the bars. We’ll do this with the following input data:

# Input data for bars and labels
data = range(200, 225, 5)
bar_labels = ['Bar 1', 'Bar 2', 'Bar 3', 'Bar 4', 'Bar 5']
y_values = [0,1,2,3,4]

Next, prepare the plot:

# Create bar plots
fig = plt.figure(figsize=(12,8))
plt.yticks(y_values, bar_labels, fontsize=15)
bars = plt.barh(y_values, data, align='center', alpha=0.5, color='orange', edgecolor='red')

# Labels and plotting
for b, d in zip(bars, data):
    # Annotate each bar with its value relative to the smallest value
    plt.text(b.get_width() + b.get_width()*0.08, b.get_y() + b.get_height()/2,
             '{0:.2%}'.format(d/min(data)), ha='center', va='bottom', fontsize=12)
plt.title('Horizontal bar plot with labels', fontsize=15)
plt.ylim([-1,len(data)+0.5])
plt.xlim((125,240))
plt.vlines(min(data), -1, len(data)+0.5, linestyles='dashed')
plt.show()
Sample plot — Image by Author
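
To see what the text labels contain, the format string can be evaluated on its own. This small sketch reuses the same data and expresses each value as a percentage of the smallest one; the names `baseline` and `labels` are introduced here only for illustration.

```python
data = range(200, 225, 5)
baseline = min(data)  # smallest value, also marked by the dashed vertical line

# '{0:.2%}' multiplies the ratio by 100 and appends a percent sign
labels = ['{0:.2%}'.format(d / baseline) for d in data]
print(labels)
# ['100.00%', '102.50%', '105.00%', '107.50%', '110.00%']
```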

B. Back-to-back Bar Plots:

We continue with the family of bar plots, in this case with a variation that compares two sets of data horizontally. The commands to create this plot are the same as with the horizontal bar plot, but negating values for one of the sets of data.

# Input data for both sets of data utilizing numpy arrays to negate one set:
X1 = np.array([1, 2, 3])
X2 = np.array([3, 2, 1])
y_values = [0,1,2]
bar_labels = ['Bar 1', 'Bar 2', 'Bar 3']

Now let’s plot the bars with the plt.barh command and the negation feature:

# Plot bars
fig = plt.figure(figsize=(12,8))
plt.yticks(y_values, bar_labels, fontsize=13)
plt.barh(y_values, X1,align='center', alpha=0.5, color='blue')
plt.barh(y_values, -X2, align='center', alpha=0.5, color='purple')
plt.title('Back-to-back Bar Plot', fontsize=13)
plt.ylim([-1,len(X1)+0.1])
plt.grid()
plt.show()
Sample plot - Image by author
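
One side effect of negating a data set is that the x-axis shows negative tick values for the left-hand bars. A possible refinement, sketched here as an optional addition rather than part of the original plot, is to format the ticks with their absolute values using Matplotlib's FuncFormatter. The name `abs_formatter` is an illustrative choice.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

X1 = np.array([1, 2, 3])
X2 = np.array([3, 2, 1])
y_values = [0, 1, 2]

fig, ax = plt.subplots()
ax.barh(y_values, X1, align='center', alpha=0.5, color='blue')
ax.barh(y_values, -X2, align='center', alpha=0.5, color='purple')

# Show magnitudes on the x-axis instead of signed values
abs_formatter = FuncFormatter(lambda value, pos: '{:g}'.format(abs(value)))
ax.xaxis.set_major_formatter(abs_formatter)
ax.set_title('Back-to-back bar plot with unsigned ticks')
```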

C. Bar Plots with height labels:

This chart is similar to the earlier bar plot with labels, except that it is vertically oriented and each bar is annotated with its height, which makes the values easier to read. The labels are added with the command ax.text.

In addition, I use the Figure method autofmt_xdate, included in Matplotlib, to automate the rotation of the x-axis labels. Take a look at the code:

# Input information:
n_bars = [0,1,2,3]
values = [3000, 5000, 12000, 20000]
labels = ['Group 1', 'Group 2','Group 3', 'Group 4']

# Create figure and plots
fig, ax = plt.subplots(figsize=(12,8))
ax.set_facecolor('xkcd:gray')
fig.patch.set_facecolor('xkcd:gray')
fig.autofmt_xdate()
bars = plt.bar(n_bars, values, align='center', color='peru', edgecolor='steelblue')
plt.xticks(n_bars, labels, fontsize=13)

# Add text labels to the top of the bars
def rotate_label(bars):
    for bar in bars:
        height = bar.get_height()
        # Center the label horizontally and place it 5% above the bar
        ax.text(bar.get_x() + bar.get_width()/2., 1.05 * height, '%d' % int(height),
                ha='center', va='bottom', fontsize=13)

# Labels and plotting
rotate_label(bars)
plt.ylim([0, 25000])
plt.title('Bar plot with Height Labels', fontsize=14)
plt.tight_layout()
plt.show()
Sample plot — Image by Author
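
As an alternative to positioning each label by hand with ax.text, newer Matplotlib releases (3.4 and later) provide Axes.bar_label, which attaches a label to every bar in a container. The sketch below assumes such a version is installed. It is a different technique from the loop above, shown only for comparison.

```python
import matplotlib.pyplot as plt

n_bars = [0, 1, 2, 3]
values = [3000, 5000, 12000, 20000]
labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']

fig, ax = plt.subplots(figsize=(12, 8))
bars = ax.bar(n_bars, values, align='center', color='peru', edgecolor='steelblue')
ax.set_xticks(n_bars)
ax.set_xticklabels(labels, fontsize=13)

# bar_label places one text label at the edge of each bar (requires Matplotlib >= 3.4)
height_labels = ax.bar_label(bars, fmt='%d', padding=3, fontsize=13)
ax.set_ylim(0, 25000)
ax.set_title('Bar plot with height labels via bar_label', fontsize=14)
```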

D. Bar Plots with color gradients:

Let’s add some color to the equation. In the following chart, I introduce Matplotlib’s colormap module (matplotlib.cm), which is used to apply intuitive color schemes to the plotted values. First, I’ll proceed with the imports:

import matplotlib.colors as col
import matplotlib.cm as cm

Now I’ll insert sample data to plot the chart. The colormap is applied through the ScalarMappable class, which normalizes the data before returning RGBA colors from the given colormap.

To clarify the previous statement, RGBA is one form of digital color representation, alongside HEX and HSL. HEX is the most widely used and best known: it is a six-digit hexadecimal number in which each pair of digits sets the intensity of red, green, and blue. For example, in the hex color #123456, 12 is the red component, 34 is green, and 56 is blue. RGBA adds a fourth component, alpha, which controls opacity on a percentage scale: 0% is fully transparent and 100% is fully opaque, which is the way we traditionally see colors.
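
The relationship between the two notations can be checked with the helper functions in matplotlib.colors. This is a small sketch using the example color from the text; the variable names are illustrative.

```python
import matplotlib.colors as col

# Each hex pair becomes a channel in [0, 1]: 0x12/255, 0x34/255, 0x56/255
rgba = col.to_rgba('#123456')

# The alpha channel defaults to 1.0 (fully opaque) but can be set explicitly
half_transparent = col.to_rgba('#123456', alpha=0.5)

# Converting back drops the alpha channel unless keep_alpha=True is passed
round_trip = col.to_hex(rgba)

print(rgba)
print(half_transparent)
print(round_trip)
```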

Matplotlib’s documentation has further details on the different colormaps that can be chosen. Take a look at the code to generate the plot in order to have a clearer view:

# Sample values
means = range(10,18)
x_values = range(0,8)

# Create Colormap (the colormap is the second argument of ScalarMappable, not of Normalize)
cmap1 = cm.ScalarMappable(col.Normalize(min(means), max(means)), cm.spring)
cmap2 = cm.ScalarMappable(col.Normalize(0, 20), cm.spring)

# Plot bars
# Subplot 1
fig = plt.figure(figsize=(12,8))
plt.subplot(121)
plt.bar(x_values, means, align='center', alpha=0.5, color=cmap1.to_rgba(means))
plt.ylim(0, max(means) * 1.1)

# Subplot 2
plt.subplot(122)
plt.bar(x_values, means, align='center', alpha=0.5, color=cmap2.to_rgba(means))
plt.ylim(0, max(means) * 1.1)
plt.show()
Sample plot — Image by Author
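
The two subplots differ only in the normalization range, so it can help to look at the normalized values directly. In this sketch (variable names are illustrative), the first Normalize stretches the data over the whole colormap, while the second maps it into a narrower band of the 0 to 20 range.

```python
import matplotlib.colors as col

means = range(10, 18)

# Normalization used for the first subplot: data minimum -> 0, data maximum -> 1
norm_data = col.Normalize(min(means), max(means))
# Normalization used for the second subplot: fixed range 0..20
norm_fixed = col.Normalize(0, 20)

scaled_data = [float(norm_data(m)) for m in means]
scaled_fixed = [float(norm_fixed(m)) for m in means]

print(scaled_data[0], scaled_data[-1])    # ends of the colormap
print(scaled_fixed[0], scaled_fixed[-1])  # a band in the middle of the colormap
```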

E. Bar Plots with pattern fill:

Now we’re going to add some styling to our data presentation using bar plots and pattern fills. This can be done by calling the set_hatch method on each bar, or by passing the hatch argument to plt.bar.

# Input data:
patterns = ('-', '+', 'x', '\\', '*', 'o', 'O', '.')
mean_values = range(1, len(patterns)+1)
y_values = [0,1,2,3,4,5,6,7]

# Create figure and bars
fig, ax = plt.subplots(figsize=(12,8))
bars = plt.bar(y_values, mean_values, align='center', color='salmon')
for bar, pattern in zip(bars, patterns):
    bar.set_hatch(pattern)

# Labeling and plotting
plt.xticks(y_values, patterns, fontsize=13)
plt.title('Bar plot with patterns')
plt.show()
Sample plot — Image by Author
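
The second option mentioned above, passing hatch as an argument, applies a single pattern to every bar in the call. A minimal sketch with made-up values follows. The edgecolor is set here because the hatch lines are drawn in the edge color.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# One hatch pattern for all bars; repeating a character ('//') makes it denser
bars = ax.bar([0, 1, 2], [1, 2, 3], align='center',
              color='salmon', edgecolor='black', hatch='//')
ax.set_title('Bar plot with a single hatch pattern')
```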

F. Simple Heatmap:

A Heatmap is a graphical representation of data in which values are depicted by color. They make it easy to visualize complex data and understand it at a glance. The variation in color may be by hue or intensity, giving obvious visual cues to the reader about how the represented values are distributed.

In this case, the variation in color represents the number of observations that fall within a particular range of values, and Matplotlib’s colorbar shows the scale. The plot itself is a 2-dimensional histogram, created with the command plt.hist2d.

In the code below, I create two normally-distributed variables X and Y with a mean of 0 and 5 respectively.


A 2D histogram is like looking at a regular histogram from the “top”. To read the color distribution, note that the bins near the center of the plot are yellowish and correspond to the highest values of the colorbar, which is reasonable since X values should peak at 0 and Y values should peak at 5.

# Input a sample of normally distributed observations centered at x=0 and y=5
x = np.random.randn(100000)
y = np.random.randn(100000) + 5

# Create figure, 2D histogram and labels
plt.figure(figsize=(10,8))
plt.hist2d(x, y, bins=40)
plt.xlabel('X values - Centered at 0', fontsize=13)
plt.ylabel('Y values - Centered at 5', fontsize=13)
cbar = plt.colorbar()
cbar.ax.set_ylabel('Number of observations', fontsize=13)
plt.show()
Sample plot — Image by Author
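
The counts behind the colors can also be inspected without plotting. The sketch below uses np.histogram2d, which performs the same binning, with a seeded random generator so the result is reproducible (the seed and variable names are assumptions made for illustration). It locates the fullest bin, which should sit near x=0 and y=5.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.standard_normal(100000)
y = rng.standard_normal(100000) + 5

# counts[i, j] is the number of observations in x-bin i and y-bin j
counts, x_edges, y_edges = np.histogram2d(x, y, bins=40)

# Locate the fullest bin and compute its center
ix, iy = np.unravel_index(np.argmax(counts), counts.shape)
peak_x = 0.5 * (x_edges[ix] + x_edges[ix + 1])
peak_y = 0.5 * (y_edges[iy] + y_edges[iy + 1])

print(counts.shape, int(counts.sum()))
print(peak_x, peak_y)
```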

G. Shadowed Pie chart:

Pie charts are used to display elements of a data set as proportions of a whole. In addition to the traditional plt.pie command, we’ll use the boolean argument shadow=True to bring some styling to the slices of the pie chart.

# Create figure and plot the chart:
plt.figure(figsize=(10,8))
plt.pie((10,5),labels=('Blue','Orange'),shadow=True,colors=('steelblue', 'orange'),
explode=(0,0.15),
startangle=90,
autopct='%1.1f%%'
)
plt.legend(fancybox=True, fontsize=13)
plt.axis('equal')
plt.title('Shadowed Pie Chart',fontsize=15)
plt.tight_layout()
plt.show()
Sample plot — Image by Author
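
The percentages printed on the slices come from the autopct format string, which is applied to each slice's share of the total. As a rough sketch, assuming a Matplotlib version in which pie returns the wedges, the label texts and the percentage texts as a tuple when autopct is given, the generated strings can be read back like this:

```python
import matplotlib.pyplot as plt

sizes = (10, 5)

fig, ax = plt.subplots()
patches, texts, autotexts = ax.pie(
    sizes,
    labels=('Blue', 'Orange'),
    colors=('steelblue', 'orange'),
    explode=(0, 0.15),      # pull the second slice away from the center
    startangle=90,
    autopct='%1.1f%%',      # one decimal place followed by a literal percent sign
)

# Each slice's share of the total, as formatted by autopct
percent_labels = [t.get_text() for t in autotexts]
print(percent_labels)
```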
