50

Correlating stock returns using Python, visualizing in Seaborn

 5 years ago
source link: https://www.tuicool.com/articles/hit/ru2EfeZ
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

In this tutorial I'll walk you through a simple methodology to correlate various stocks against each other. We'll grab the prices of the selected stocks using the IEX api , drop them into a clean dataframe, run a correlation, and visualize our results.

(1) Import libraries, select our list of stocks to correlate

import numpy as np

import pandas as pd

#used to grab the stock prices, with IEX api

import pandas_datareader as web

from datetime import datetime

#to visualize the results

import matplotlib.pyplot as plt

import seaborn

#select start date for correlation window as well as list of tickers

start = datetime(2017, 1, 1)

symbols_list = ['AAPL', 'F', 'TWTR', 'FB', 'AAL', 'AMZN', 'GOOGL', 'GE']

(2) Pull stock prices, push into clean dataframe

#array to store prices

symbols=[]

#pull price using iex for each symbol in list defined above

for ticker in symbols_list:

r = web.DataReader(ticker, "iex", start)

# add a symbol column

r['Symbol'] = ticker

symbols.append(r)

# concatenate into df

df = pd.concat(symbols)

df = df.reset_index()

df = df[['date', 'close', 'Symbol']]

df.head()

We're now left with a table in this format:

date close symbol 2017-01-03 113.4101 AAPL 2017-01-04 113.2832 AAPL 2017-01-05 113.8593 AAPL

3) However, we want our symbols represented as columns so we'll have to pivot the dataframe:

df_pivot = df.pivot('date','symbol','close').reset_index()

df_pivot.head()

date AAL AAPL AMZN F FB GE GOOGL TWTR 0 2017-01-03 45.7092 113.4101 753.67 11.4841 116.86 30.1370 808.01 16.44 1 2017-01-04 46.1041 113.2832 757.18 12.0132 118.69 30.1465 807.77 16.86 2 2017-01-05 45.3044 113.8593 780.45 11.6483 120.67 29.9754 813.02 17.09 3 2017-01-06 45.6203 115.1286 795.99 11.6392 123.41 30.0610 825.21 17.17 4 2017-01-09 46.4792 116.1832 796.92 11.5206 124.90 29.9183 827.18 17.50

(4) Next, we can run the correlation. Using the Pandas 'corr' function to compute the Pearson correlation coeffecient between each pair of equities

corr_df = df_pivot.corr(method='pearson')

#reset symbol as index (rather than 0-X)

corr_df.head().reset_index()

del corr_df.index.name

corr_df.head(10)

symbol AAL AAPL AMZN F FB GE GOOGL TWTR AAL 1.000000 0.235239 0.226061 0.068024 0.356063 -0.332681 0.494075 0.169487 AAPL 0.235239 1.000000 0.868763 0.184501 0.911380 -0.895210 0.903191 0.781755 AMZN 0.226061 0.868763 1.000000 0.108351 0.744732 -0.937415 0.864455 0.955373 F 0.068024 0.184501 0.108351 1.000000 0.206055 -0.216064 0.189753 0.161078 FB 0.356063 0.911380 0.744732 0.206055 1.000000 -0.814703 0.900033 0.650404 GE -0.332681 -0.895210 -0.937415 -0.216064 -0.814703 1.000000 -0.882526 -0.866871 GOOGL 0.494075 0.903191 0.864455 0.189753 0.900033 -0.882526 1.000000 0.789379 TWTR 0.169487 0.781755 0.955373 0.161078 0.650404 -0.866871 0.789379 1.000000

5) Finally, we can plot a heatmap of the correlations (with Seaborn and Matplotlib) to better visualize the results:

#take the bottom triangle since it repeats itself

mask = np.zeros_like(corr_df)

mask[np.triu_indices_from(mask)] = True

#generate plot

seaborn.heatmap(corr_df, cmap='RdYlGn', vmax=1.0, vmin=-1.0 , mask = mask, linewidths=2.5)

plt.yticks(rotation=0)

plt.xticks(rotation=90)

plt.show()

py_corr_output.png

Here we can see that, as expected, the tech companies are generally pretty highly correlated (as indicated by dark green -- AMZN/GOOGL, FB/AAPL, etc). Conversely, Ford (F) and General Electric (GE) are either not correlated or negatively correlated with the rest of the group.

Hope you enjoyed the post. If you're looking to land a role in the data science/analytics space (or just looking to learn a bit more about it), feel free to sign up to receive a few DS interview questions per week .


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK