Building Models with Keras

 4 years ago
source link: https://mc.ai/building-models-with-keras/

Keras is a high-level API for building neural networks in Python. The API supports sequential neural networks, recurrent neural networks, and convolutional neural networks. Its modularity, user-friendliness, and extensibility make prototyping easy and fast. In this post, we will walk through the process of building sequential neural networks for regression and classification tasks using Keras. The documentation for Keras can be found here.

Let’s get started!

REGRESSION

DATA PREPARATION

The data we will be using for regression is the California Housing Price dataset. The data can be found here.

First, let’s import the data and print the first five rows:

import pandas as pd 
df = pd.read_csv("housing.csv")
print(df.head())

Now, let’s define our input and target variables. We will be using longitude, latitude, housing_median_age, total_rooms, total_bedrooms, population, households, and median_income to predict median_house_value.

import numpy as np
X = np.array(df[['longitude', 'latitude', 'housing_median_age', 'total_rooms', 'total_bedrooms', 'population', 'households', 'median_income']])
y = np.array(df['median_house_value'])

We will then split our data for training and testing:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)

Now all of our necessary variables are defined. Let’s build some models!

DEFINING THE MODEL

The sequential model is a linear stack of layers.

from keras.models import Sequential
model = Sequential()

If this import fails with an error, try importing Keras from tensorflow in this and all subsequent imports:

from tensorflow.keras.models import Sequential
model = Sequential()

ADDING LAYERS

We can use the .add() method to add layers. We will add dense layers which we need to import separately:

from keras.layers import Dense

The model should know what input shape to expect in the first layer. Because of this, you need to pass information about the number of features in your model. Since we have 8 features, we need to pass in an input shape of (8,). We will add a dense layer with 8 nodes:

model.add(Dense(8, input_shape = (8,)))

Let’s add an additional hidden layer. For our hidden layer we will use the relu activation function:

model.add(Dense(8, activation = 'relu'))
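As a rough illustration of what the relu activation does to each node's output, here is a minimal plain-Python sketch. The function name `relu` and the sample values are illustrative only and are not part of the Keras API.

```python
def relu(x: float) -> float:
    """Rectified linear unit: keep positive values, clamp negatives to zero."""
    return max(0.0, x)


# Hypothetical pre-activation values for four nodes.
samples = [-2.0, -0.5, 0.0, 1.5]
activated = [relu(v) for v in samples]
print(activated)
```

Negative inputs are zeroed out while positive inputs pass through unchanged, which is what lets the network model non-linear relationships.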

Finally, let’s add our output layer. For regression problems, we typically define the activation function in the output layer to be linear. Additionally, the output layer has one node for regression problems:

model.add(Dense(1, activation = 'linear'))

COMPILING

The next thing we have to do is configure the learning process. This is done using the compile method. In the compile method, we have to pass the following parameters:

  1. Loss Function: This is the function that evaluates how well your algorithm models your data set.
  2. Optimizer: This is a method that finds the weights that minimize your loss function.
  3. Metrics: For regression, we typically define the metric to be the loss function. This allows us to keep track of the loss as the model is being trained.

We will use RMSprop (root mean square propagation) for the optimizer, the mean squared error for the loss function, and mean squared error for the metrics:

model.compile(optimizer='rmsprop', loss='mse', metrics =['mse'])
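To make the 'mse' loss concrete, the sketch below computes the mean squared error by hand in plain Python. The helper name `mean_squared_error` and the house values are made up for illustration. Keras computes the same quantity internally over each batch.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only (not taken from the housing data).
actual = [200.0, 150.0, 300.0]
predicted = [210.0, 140.0, 330.0]

# Squared errors are 100, 100 and 900, so the mean is 1100 / 3.
mse = mean_squared_error(actual, predicted)
print(mse)
```

Because the errors are squared, the single 30-unit miss contributes far more to the loss than the two 10-unit misses combined.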

We can look at the model summary to analyze our neural network architecture:

print(model.summary())
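The summary lists a parameter count per layer. As a sanity check, a dense layer has one weight per input-unit pair plus one bias per unit, so the counts for the 8-feature model built above can be reproduced by hand. The helper name `dense_param_count` is hypothetical and used only for this sketch.

```python
def dense_param_count(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


n_features = 8            # number of input columns used in this post
layer_units = [8, 8, 1]   # Dense(8), Dense(8, relu), Dense(1, linear)

counts = []
n_inputs = n_features
for units in layer_units:
    counts.append(dense_param_count(n_inputs, units))
    n_inputs = units      # this layer's output feeds the next layer

total_params = sum(counts)
print(counts, total_params)
```

If the numbers printed by the summary differ from these, the model probably contains more layers than intended.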

FITTING

For model training we will use the .fit() method:

model.fit(X_train, y_train)

By default, .fit() trains for a single epoch and prints the loss and metric values for that epoch.

We can pass a value for the number of epochs (the number of passes over the training data) to try to reduce the error:

model.fit(X_train, y_train, epochs = 10)

You can play around with the number of epochs to try to minimize error:

model.fit(X_train, y_train, epochs = 50)

You can do the same with the number of hidden layers. Let’s try adding two additional hidden layers:

# Start from a fresh model; otherwise .add() would stack these layers onto the existing network
model = Sequential()
model.add(Dense(8, input_shape = (8,)))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(8, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
model.compile(optimizer='rmsprop', loss='mse', metrics =['mse'])
model.fit(X_train, y_train, epochs = 50)

We see that the last five epochs have a lower mean squared error. We can also try adding more nodes. Let’s try 16 nodes instead of 8:

# Rebuild the model from scratch with 16 nodes per layer
model = Sequential()
model.add(Dense(16, input_shape = (8,)))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
model.compile(optimizer='rmsprop', loss='mse', metrics =['mse'])
model.fit(X_train, y_train, epochs = 50)

Finally, let’s try using a different optimizer. Let’s give the ‘adam’ optimizer a shot:

# Rebuild the model from scratch and compile it with the adam optimizer
model = Sequential()
model.add(Dense(16, input_shape = (8,)))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
model.compile(optimizer='adam', loss='mse', metrics =['mse'])
model.fit(X_train, y_train, epochs = 50)

The resulting model performs about the same. Finally, let’s try a large number of epochs, say 500, together with a larger batch_size of 100:

# Rebuild the model from scratch, then train longer with larger batches
model = Sequential()
model.add(Dense(16, input_shape = (8,)))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
model.compile(optimizer='adam', loss='mse', metrics =['mse'])
model.fit(X_train, y_train, epochs = 500, batch_size=100)
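To see what batch_size changes, it helps to count the weight updates per epoch: one update per batch, with the final partial batch included. The sketch below assumes a training set of 16,512 rows, which is what an 80% split of a 20,640-row file would give. That row count is an assumption about the data file, and `steps_per_epoch` is a hypothetical helper name.

```python
import math


def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of batches (and weight updates) in one pass over the data."""
    return math.ceil(n_samples / batch_size)


n_train = 16512  # assumed size of X_train; check len(X_train) on your own data

default_steps = steps_per_epoch(n_train, 32)       # 32 is the default batch_size of .fit()
large_batch_steps = steps_per_epoch(n_train, 100)  # batch_size used above
print(default_steps, large_batch_steps)
```

Larger batches mean fewer, smoother updates per epoch, which is one reason to pair a bigger batch_size with more epochs.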

We see some improvements as a result. Though we are minimizing mean squared error, we can display a different error metric, like the mean absolute percentage error:

# Rebuild the model from scratch and report MAPE as the metric
model = Sequential()
model.add(Dense(16, input_shape = (8,)))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
model.compile(optimizer='adam', loss='mse', metrics =['mape'])
model.fit(X_train, y_train, epochs = 500, batch_size=100)

We see the last epoch has a mean absolute percentage error of 28%.
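For reference, the mean absolute percentage error can be sketched in a few lines of plain Python. The helper name and the sample values are illustrative. This simplified version assumes no target is zero, whereas Keras guards against division by zero internally.

```python
def mean_absolute_percentage_error(y_true, y_pred):
    """Average of |error| / |target|, expressed as a percentage.

    Assumes every target value is non-zero.
    """
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative values only: relative errors of 25%, 10% and 0%.
actual = [200.0, 100.0, 400.0]
predicted = [150.0, 110.0, 400.0]

mape = mean_absolute_percentage_error(actual, predicted)
print(round(mape, 2))
```

Unlike the mean squared error, this metric is scale-free, so it is easier to interpret when house values span a wide range.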

PREDICTING

In order to generate predictions we do the following:

y_pred = model.predict(X_test)

We can visualize our predictions using matplotlib:

import matplotlib.pyplot as plt
plt.clf()
fig = plt.figure()
fig.suptitle('Scatter plot of Actual versus Predicted')
plt.scatter(x=y_pred, y=y_test, marker='.')
plt.xlabel('Predicted')
plt.ylabel('Actual ')
plt.show()

The more closely the relationship between predicted and actual values resembles a straight line, the more accurate our model is. There is much more we can try in terms of optimizing our model, but I’ll leave that for you to play around with.

CLASSIFICATION

Now let’s walk through the same process for building a classification model. There are many similarities in the workflow with a few small differences!

DATA PREPARATION

The data we will be using for classification is the Telco Customer Churn dataset. It can be found here.

First, let’s import the data and print the first five rows:

import pandas as pd 
df = pd.read_csv("Customer_Churn.csv")
print(df.head())

For simplicity, we will be using all of the categorical and numerical data to predict Churn. First, we need to convert the categorical columns into numerical values that the neural network can handle. For example, for gender we have:

df.gender = pd.Categorical(df.gender)
df['gender_code'] = df.gender.cat.codes
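To show what the category codes look like, here is a small plain-Python sketch that mimics the behaviour for string labels: the distinct values are sorted and each value is replaced by its position in that sorted list. The helper `category_codes` is hypothetical and only illustrates the idea. Presumably the other categorical columns and Churn are encoded the same way to produce the remaining `_code` columns used later.

```python
def category_codes(values):
    """Map each label to the index of that label in the sorted list of distinct labels."""
    categories = sorted(set(values))
    lookup = {label: index for index, label in enumerate(categories)}
    return [lookup[v] for v in values], categories


# Illustrative column values.
genders = ["Female", "Male", "Male", "Female"]
codes, categories = category_codes(genders)
print(categories)
print(codes)
```

The codes carry no numeric meaning beyond identifying the category, which is acceptable for two-valued columns but worth keeping in mind for columns with many levels.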

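Now let’s define our input and output arrays, assuming the remaining categorical columns and Churn have been encoded into _code columns in the same way: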

import numpy as np
# Encoded categorical columns plus the two numeric columns, each listed once
features = ['gender_code', 'SeniorCitizen_code', 'PhoneService_code', 'MultipleLines_code', 
 'InternetService_code', 'Partner_code', 'Dependents_code', 'PaymentMethod_code', 
 'PaperlessBilling_code', 'Contract_code', 'StreamingMovies_code',
 'StreamingTV_code', 'TechSupport_code', 'DeviceProtection_code', 'OnlineBackup_code',
 'OnlineSecurity_code', 'tenure', 'MonthlyCharges']
X = np.array(df[features])
y = np.array(df['Churn_code'])

Let’s also standardize the input:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X = sc.fit_transform(X)
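Standardizing rescales each column to zero mean and unit variance. The sketch below does this for a single column in plain Python, using the population standard deviation as StandardScaler does. The helper name `standardize` and the sample tenure values are illustrative.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Subtract the column mean and divide by the population standard deviation.

    Assumes the column is not constant (standard deviation > 0).
    """
    mean = fmean(column)
    std = pstdev(column)
    return [(value - mean) / std for value in column]


# Illustrative tenure values with mean 5 and standard deviation 2.
tenure = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(tenure)
print(scaled)
```

Putting features such as tenure and MonthlyCharges on a common scale keeps any single column from dominating the gradient updates.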

We will then split our data for training and testing:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)

Now all of our necessary variables are defined. Let’s build some models!

DEFINING THE MODEL & ADDING LAYERS

Let’s start with an 8 node input layer with input shape corresponding to the number of features:

model = Sequential()
model.add(Dense(8, input_shape = (len(features),)))

Let’s add one hidden layer:

model.add(Dense(8, activation='relu'))

Next, let’s add our output layer. For binary classification we use 1 node for the output and the sigmoid activation function:

model.add(Dense(1, activation='sigmoid')) 
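The sigmoid squashes the output node's raw value into the range (0, 1), so it can be read as the probability of churn. A minimal plain-Python sketch follows. The function name and input values are illustrative, and the 0.5 cut-off is the conventional choice rather than anything Keras enforces.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical raw outputs of the final node for three customers.
raw_outputs = [-2.0, 0.5, 3.0]
probabilities = [sigmoid(z) for z in raw_outputs]

# Convert probabilities to class labels with a 0.5 threshold.
labels = [1 if p > 0.5 else 0 for p in probabilities]
print([round(p, 3) for p in probabilities], labels)
```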

COMPILING

Now let’s compile our model. We will use the Adam optimizer, binary cross-entropy for the loss, and ‘accuracy’ for the metric. The Keras documentation advises that we set the metric to the value ‘accuracy’:

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
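Binary cross-entropy penalizes confident wrong predictions much more heavily than confident correct ones. The plain-Python sketch below shows the calculation. The helper name and the clipping constant `eps` are assumptions made for this illustration and are not Keras parameters.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0]
good_loss = binary_crossentropy(labels, [0.9, 0.1])  # confident and correct
bad_loss = binary_crossentropy(labels, [0.1, 0.9])   # confident and wrong
print(round(good_loss, 3), round(bad_loss, 3))
```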

Let’s print the summary of our model:

print(model.summary())

FITTING

For model training we will use the .fit() method:

model.fit(X_train, y_train)

With the default settings, this trains for a single epoch and prints the loss and accuracy for that epoch.

Similar to the regression problem, feel free to play around with the number of nodes, layers, number of epochs, batch_size and the optimizer type.

PREDICTING

The model outputs an array of probabilities with shape (n_samples, 1). In the prediction step we flatten it, convert each value to a float, and round it to 0 or 1 using a list comprehension:

y_pred = [round(float(p)) for p in model.predict(X_test).ravel()]

We can summarize the quality of the predictions with the classification report from sklearn.metrics:

from sklearn import metrics
print(metrics.classification_report(y_test, y_pred))

We can also look at the roc_auc_score and the f1_scores:
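As a sketch of what these two scores measure, the plain-Python functions below compute an F1 score from the confusion counts and a ROC AUC by comparing every positive/negative pair. The function names and the label lists are illustrative. In practice, `f1_score` and `roc_auc_score` from `sklearn.metrics` compute the same quantities, and the AUC is usually more informative when given predicted probabilities rather than rounded labels.

```python
def binary_f1(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def pairwise_roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half.

    Assumes y_true contains at least one positive and one negative label.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Illustrative labels, not results from the churn model.
true_labels = [1, 0, 1, 1, 0, 0]
predicted_labels = [1, 0, 0, 1, 1, 0]

f1 = binary_f1(true_labels, predicted_labels)
auc = pairwise_roc_auc(true_labels, predicted_labels)
print(round(f1, 3), round(auc, 3))
```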

A value equal to 1.0, in both cases, is perfect. You can significantly improve performance by tuning model parameters. I encourage you to try increasing the number of neurons (nodes), epochs, layers, and engineering additional features.

CONCLUSION

In this post, we walked through the process of building regression and classification models using the Keras neural network API. We went over the process of defining a model object, adding layers, configuring the models with the compile method, training our models, making predictions, and evaluating our model performance. I encourage you to experiment with the neural network architectures for both regression and classification. Once you feel comfortable, try applying your knowledge to other datasets and prediction problems. I hope this post was interesting. The code from this post will be available on GitHub. Thank you for reading and happy machine learning!

