Coronavirus Time Series Line and Bar Chart

 3 years ago
source link: https://datacrayon.com/posts/statistics/data-is-beautiful/coronavirus-time-series-line-and-bar-chart/

Preamble

import numpy as np                   # for multi-dimensional containers 
import pandas as pd                  # for DataFrames
import plotly
import plotly.graph_objects as go    # for data visualisation
from plotly.subplots import make_subplots

Introduction

In this section, we're going to use daily confirmed cases data for COVID-19 in the UK made available at coronavirus.data.gov.uk to create a time series plot. Our goal will be to visualise the number of new cases and cumulative cases over time.

Terms of use taken from the data source

No special restrictions or limitations on using the item’s content have been provided.

Visualising the Table

The first step is to read the CSV data into a pandas.DataFrame and display the first five rows.

data = pd.read_csv('https://shahinrostami.com/datasets/coronavirus-cases_latest.csv')
data.head()

  areaType          areaName   areaCode        date  newCasesByPublishDate  cumCasesByPublishDate
0   nation           England  E92000001  2020-07-22                  519.0                 255038
1   nation  Northern Ireland  N92000002  2020-07-22                    9.0                   5868
2   nation          Scotland  S92000003  2020-07-22                   10.0                  18484
3   nation             Wales  W92000004  2020-07-22                   22.0                  16987
4   nation           England  E92000001  2020-07-21                  399.0                 254519
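If the hosted file is not reachable, the same pd.read_csv call can be tried against CSV text held in memory. The following is a minimal sketch, assuming a hand-typed copy of the five rows shown above; the names csv_text and preview are illustrative and not part of the original article. It also passes parse_dates so that the date column is read as datetimes rather than strings, which is optional.

```python
import io

import pandas as pd

# Hand-typed copy of the five rows shown above (an assumption made so the
# sketch runs without network access; it is not the full dataset).
csv_text = """areaType,areaName,areaCode,date,newCasesByPublishDate,cumCasesByPublishDate
nation,England,E92000001,2020-07-22,519.0,255038
nation,Northern Ireland,N92000002,2020-07-22,9.0,5868
nation,Scotland,S92000003,2020-07-22,10.0,18484
nation,Wales,W92000004,2020-07-22,22.0,16987
nation,England,E92000001,2020-07-21,399.0,254519
"""

# read_csv accepts any file-like object, so StringIO stands in for the URL.
# parse_dates converts the 'date' column to datetimes while reading.
preview = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

print(preview.dtypes)
```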

Let's filter this data to only include rows where the areaName column is England.

data = data[data['areaName']=='England']
data.head()

   areaType areaName   areaCode        date  newCasesByPublishDate  cumCasesByPublishDate
0    nation  England  E92000001  2020-07-22                  519.0                 255038
4    nation  England  E92000001  2020-07-21                  399.0                 254519
8    nation  England  E92000001  2020-07-20                  535.0                 254120
12   nation  England  E92000001  2020-07-19                  672.0                 253585
16   nation  England  E92000001  2020-07-18                  796.0                 252913

This data looks ready to plot. We have our dates in a column named date, the new daily cases in a column named newCasesByPublishDate, and the daily cumulative cases in a column named cumCasesByPublishDate. For this plot, we'll enable a secondary y-axis so that we can present our cumulative cases as a line, and our new cases with bars.
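Before plotting, a quick consistency check can be worthwhile. The sketch below is not part of the original article: it assumes a hand-typed copy of the five England rows shown above (the name sample is illustrative), parses the dates, sorts them chronologically, and checks that the day-to-day change in cumCasesByPublishDate agrees with newCasesByPublishDate.

```python
import pandas as pd

# The five England rows shown above, typed in by hand so the sketch runs offline.
sample = pd.DataFrame({
    "date": ["2020-07-22", "2020-07-21", "2020-07-20", "2020-07-19", "2020-07-18"],
    "newCasesByPublishDate": [519.0, 399.0, 535.0, 672.0, 796.0],
    "cumCasesByPublishDate": [255038, 254519, 254120, 253585, 252913],
})

# Parse the date strings and put the rows in chronological order.
sample["date"] = pd.to_datetime(sample["date"])
sample = sample.sort_values("date").reset_index(drop=True)

# The day-to-day change in the cumulative column should match the new cases.
# The earliest row has no previous day, so its difference is NaN and is skipped.
daily_change = sample["cumCasesByPublishDate"].diff()
matches = (daily_change.iloc[1:] == sample["newCasesByPublishDate"].iloc[1:]).all()

print(matches)
```

For these five rows the two columns agree. On the full dataset the check may fail on days where figures were revised, so a mismatch would be something to investigate rather than proof of an error.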

from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(go.Scatter(x=data['date'], y=data['cumCasesByPublishDate'],
                         mode='lines+markers',
                         name='Total Cases',
                         line_color='crimson'),
                         secondary_y=True)

fig.add_trace(go.Bar(x=data['date'], y=data['newCasesByPublishDate'],
                     name='New Cases',
                     marker_color='darkslategray'),
                     secondary_y=False)
fig.show()

[Interactive figure: New Cases as bars on the left y-axis (0 to 5000) and Total Cases as a line on the right y-axis (0 to 250k), from April 2020 to July 2020]

It's an interactive plot, so you can hover over it to get more information.

Conclusion

In this section, we went on a rather quick journey. This involved loading the CSV data directly from a web resource, filtering it to England, and then plotting lines and bars on the same plot with a secondary y-axis.
