Predicting Irish electricity consumption with neural networks
source link: https://www.tuicool.com/articles/hit/7zaUZbv
Summary of Study
This analysis is divided into two parts:
- The neuralnet library in R is used to predict electricity consumption through the use of various explanatory variables
- An LSTM network is generated using Keras to predict electricity consumption using the time series exclusive of any explanatory variables
The relevant data was sourced from data.gov.ie and met.ie. Electricity consumption data was provided on an hourly basis, but was converted to daily data for the purpose of this analysis.
The variables are as follows:
- eurgbp: EUR/GBP currency rate
- rain: Rainfall
- maxt: Maximum temperature
- mint: Minimum temperature
- wdsp: Wind speed
- sun: Sunlight hours
- kwh: KWH (consumption)
Ireland obtains about 45% of its electricity from natural gas, 96% of which is imported from Scotland. EUR/GBP currency fluctuations are therefore likely to affect the cost of electricity in Ireland, so the exchange rate was included as an explanatory variable.
Weather conditions also influence electricity usage, so weather data for the Dublin region was included for the same dates.
Key Findings
It was found that of the two models, LSTM was able to predict electricity consumption more accurately, with the training and test predictions closely mirroring actual consumption:
The model produced a root mean squared error (RMSE) of 353.25 on the training dataset and 255.13 on the test dataset, against daily consumption values in the thousands of kilowatts.
Part 1: neuralnet
A neural network consists of:
- Input layers: Layers that take inputs based on existing data
- Hidden layers: Layers that apply weighted, non-linear transformations to the inputs; the weights are optimised during training (via backpropagation) in order to improve the predictive power of the model
- Output layers: Output of predictions based on the data from the input and hidden layers
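To make the layer structure concrete, here is a minimal NumPy sketch of a forward pass through a network of the same shape as the one fitted below (6 inputs, hidden layers of 5 and 2 units, 1 output). It assumes logistic hidden activations and a linear output; the weights are random placeholders, not the fitted values, and the function names are hypothetical.

```python
import numpy as np

def logistic(z):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Pass x through logistic hidden layers and a linear output layer."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = logistic(a @ W + b)
    return a @ weights[-1] + biases[-1]

rng = np.random.default_rng(0)
sizes = [6, 5, 2, 1]  # inputs, hidden layer 1, hidden layer 2, output

# Random placeholder parameters (assumption: NOT the fitted weights)
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n) for n in sizes[1:]]

x = rng.random((3, 6))  # three hypothetical observations scaled to [0, 1]
y_hat = forward(x, weights, biases)

n_params = sum(W.size for W in weights) + sum(b.size for b in biases)
print(y_hat.shape)  # (3, 1)
print(n_params)     # 50
```

The parameter count of 50 (35 + 12 + 3) matches the number of weights and intercepts listed in the result matrix in section 1.2.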
1.1. Data Normalization
The data is normalized and split into training and test data:
    # MAX-MIN NORMALIZATION
    normalize <- function(x) {
      return ((x - min(x)) / (max(x) - min(x)))
    }
    maxmindf <- as.data.frame(lapply(fullData, normalize))

    # TRAINING AND TEST DATA
    trainset <- maxmindf[1:378, ]
    testset <- maxmindf[379:472, ]
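For readers following along in Python, the same two steps can be sketched with NumPy. This is only an illustration under stated assumptions: the consumption values are made up, the helper names are hypothetical, and the 80% cut-off approximates the 378-of-472 row split used above.

```python
import numpy as np

def min_max_normalize(x):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def min_max_denormalize(scaled, original):
    """Invert min_max_normalize using the range of the original values."""
    original = np.asarray(original, dtype=float)
    return scaled * (original.max() - original.min()) + original.min()

# Made-up daily consumption values, for illustration only
kwh = np.array([4200.0, 4600.0, 5000.0, 4400.0, 4800.0])

scaled = min_max_normalize(kwh)
print(scaled)  # [0.   0.5  1.   0.25 0.75]

# Chronological split: first 80% for training, remainder for testing
split = int(len(scaled) * 0.8)
train, test = scaled[:split], scaled[split:]

# Round trip back to the original scale
restored = min_max_denormalize(scaled, kwh)
```

The split is chronological rather than random, which keeps the test observations strictly later in time than the training observations.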
1.2. Neural Network Output
The neural network is then run and the parameters are generated:
    # NEURAL NETWORK
    > library(neuralnet)
    > nn <- neuralnet(kwh ~ eurgbp + rain + maxt + mint + wdsp + sun,data=trainset, hidden=c(5,2), linear.output=TRUE, threshold=0.01)
    > nn$result.matrix
                                          1
    error                    2.168927756297
    reached.threshold        0.008657878909
    steps                  994.000000000000
    Intercept.to.1layhid1   -0.943475389102
    eurgbp.to.1layhid1       1.221792852624
    rain.to.1layhid1         0.222508044224
    maxt.to.1layhid1         1.356892947349
    mint.to.1layhid1        -0.377284881968
    wdsp.to.1layhid1         0.749993672528
    sun.to.1layhid1         -0.250669884677
    Intercept.to.1layhid2    3.424295572041
    eurgbp.to.1layhid2      -4.921292790902
    rain.to.1layhid2         3.380551856044
    maxt.to.1layhid2        -2.353604121342
    mint.to.1layhid2         0.877423599705
    wdsp.to.1layhid2        -0.581900515451
    sun.to.1layhid2         -7.083263552687
    Intercept.to.1layhid3    0.352457802915
    eurgbp.to.1layhid3       3.715376984054
    rain.to.1layhid3        -1.030450129246
    maxt.to.1layhid3        -0.672907974572
    mint.to.1layhid3         0.898040603876
    wdsp.to.1layhid3        -1.474470972212
    sun.to.1layhid3         -1.793900522508
    Intercept.to.1layhid4    0.819225033685
    eurgbp.to.1layhid4     -16.770362105816
    rain.to.1layhid4        -2.483557437596
    maxt.to.1layhid4        -0.059472312293
    mint.to.1layhid4         2.650852686615
    wdsp.to.1layhid4         3.863732942893
    sun.to.1layhid4          0.224801123127
    Intercept.to.1layhid5  -13.987427433833
    eurgbp.to.1layhid5      -1.661519269508
    rain.to.1layhid5       -52.279711798215
    maxt.to.1layhid5        22.717540151979
    mint.to.1layhid5        11.670399514036
    wdsp.to.1layhid5         9.713301368020
    sun.to.1layhid5         10.804887927196
    Intercept.to.2layhid1   -0.834412474581
    1layhid.1.to.2layhid1    1.629948945316
    1layhid.2.to.2layhid1   -3.064448233097
    1layhid.3.to.2layhid1    0.197497636177
    1layhid.4.to.2layhid1   -0.370098281335
    1layhid.5.to.2layhid1   -0.402324278545
    Intercept.to.2layhid2   -1.176093680811
    1layhid.1.to.2layhid2    1.312897190062
    1layhid.2.to.2layhid2    0.593640022150
    1layhid.3.to.2layhid2    1.906008701982
    1layhid.4.to.2layhid2    1.811035017074
    1layhid.5.to.2layhid2   -0.725078284924
    Intercept.to.kwh        -0.093973916107
    2layhid.1.to.kwh         0.700847362516
    2layhid.2.to.kwh         0.922218125575
Here is what our neural network looks like in visual format:
1.3. Model Validation
We then validate the model (i.e. test its accuracy) by comparing the consumption in KWH estimated by the neural network with the actual consumption recorded in the test set:
    > # nn.results is not defined in the original listing; it is assumed here to
    > # come from neuralnet's compute() applied to the test-set predictors
    > nn.results <- compute(nn, testset[, c("eurgbp", "rain", "maxt", "mint", "wdsp", "sun")])
    > results <- data.frame(actual = testset$kwh, prediction = nn.results$net.result)
    > results
              actual    prediction
    379 0.8394856269 0.72836479401
    380 0.7976933676 0.72836479401
    381 0.8125463657 0.72836479401
    382 0.8377382154 0.72836479401
    383 0.8394856269 0.72836479401
    384 0.8415242737 0.72836479401
    ..........
    467 0.7464359625 0.80778769677
    468 0.7018769682 0.82063018370
    469 0.7004207919 0.78094824279
    470 0.6726078249 0.77185373598
    471 0.7176036721 0.91671846789
    472 0.7199335541 0.80974222504
1.4. Accuracy
The code below converts the max-min scaled predictions and actual values back to their original scale and yields an accuracy figure of 98%. Note that this figure is calculated as 1 - abs(mean(deviation)), i.e. from the mean of the signed percentage deviations, so over- and under-predictions partly cancel each other out. It is therefore not a mean absolute deviation: the 2% is better read as the average bias of the forecasts than as the typical size of an individual error:
    > predicted=results$prediction * abs(diff(range(kwh))) + min(kwh)
    > actual=results$actual * abs(diff(range(kwh))) + min(kwh)
    > comparison=data.frame(predicted,actual)
    > deviation=((actual-predicted)/actual)
    > comparison=data.frame(predicted,actual,deviation)
    > accuracy=1-abs(mean(deviation))
    > accuracy
    [1] 0.9828191884
A mean accuracy of 98% is obtained using a (5,2) hidden configuration. However, note that since this is a mean accuracy, it does not necessarily imply that all predictions generated by the model will have such high accuracy. Indeed, accuracy is lower in certain cases as can be observed from the histogram below.
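The difference between averaging signed deviations (as in the expression abs(mean(deviation)) above) and averaging absolute deviations can be seen with a small sketch. The numbers are made up purely for illustration and are not taken from the study.

```python
import numpy as np

# Made-up values: forecasts miss by 20% and 10% in both directions
actual = np.array([100.0, 100.0, 100.0, 100.0])
predicted = np.array([80.0, 120.0, 90.0, 110.0])

deviation = (actual - predicted) / actual  # signed percentage deviations

# Signed errors cancel, so this looks perfect
signed_accuracy = 1 - abs(deviation.mean())

# Absolute errors do not cancel
absolute_accuracy = 1 - np.abs(deviation).mean()

print(round(signed_accuracy, 2))    # 1.0
print(round(absolute_accuracy, 2))  # 0.85
```

In this toy case every forecast is off by at least 10%, yet the signed measure reports 100% accuracy, which is why a histogram of individual deviations is a useful complement to the headline figure.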
When we plot a histogram of the deviation (with 100 breaks), we see that the majority of forecasts fall within 10% of the actual consumption.
When plotting the predicted and actual consumption, it is observed that while the prediction series generated by the neural network follows the general range of the actual series (i.e. between 4200 and 5000 kWh), the model is not particularly adept at predicting the peaks and valleys in the series (i.e. periods of abnormally low or high usage).
Part 2: LSTM (Long Short-Term Memory Network)
A shortcoming of traditional feed-forward neural network models is that they do not account for dependencies between successive observations in time series data.
When the neural network was generated using neuralnet, it was assumed that all observations are independent of each other. However, this is not necessarily the case.
2.1. Issue of Stationarity
When observing line charts for both KWH (consumption) and the EUR/GBP, we can see that the KWH time series shows a stationary pattern (stationary meaning that the mean, variance, and autocorrelation are constant):
However, when the EUR/GBP currency fluctuations are plotted over the same time period, the data is clearly non-stationary, i.e. the mean, variance, and autocorrelation differ over time:
Given that non-stationarity was present in certain explanatory variables, the LSTM model will now be used to predict future values of KWH against the test set, independent of any other explanatory variables.
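As a rough illustration of what "the mean differs over time" looks like numerically, the sketch below compares the mean of the first and second halves of a series, relative to its spread. This is only a crude heuristic on synthetic data, not a formal stationarity test (such as the augmented Dickey-Fuller test), and both series and function names are hypothetical stand-ins for the KWH and EUR/GBP data.

```python
import numpy as np

def half_stats(series):
    """Return (mean, std) for the first and second half of a series."""
    series = np.asarray(series, dtype=float)
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (first.mean(), first.std()), (second.mean(), second.std())

def mean_shift_ratio(series):
    """Shift in mean between halves, in units of the average half std."""
    (m1, s1), (m2, s2) = half_stats(series)
    return abs(m2 - m1) / ((s1 + s2) / 2)

t = np.arange(200)

# Synthetic stand-ins (assumptions, not the real data):
cyclical = 4600 + 100 * np.sin(2 * np.pi * t / 20)  # fluctuates around a fixed level
drifting = 0.85 + 0.0005 * t                        # level drifts steadily upward

print(mean_shift_ratio(cyclical) < 0.1)  # True: mean is stable
print(mean_shift_ratio(drifting) > 1.0)  # True: mean has shifted
```

A series whose mean shifts by several standard deviations between halves, as the drifting one does, would be a candidate for differencing before being used as a model input.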
In other words, only the values of KWH will be predicted using LSTM. The analysis is carried out using the Keras library in Python. The following guide also provides a detailed overview of predictions with LSTM using a separate example.
2.2. Data Processing
Firstly, the relevant libraries are imported and data processing is carried out:
    # Import libraries
    import numpy as np
    import matplotlib.pyplot as plt
    from pandas import read_csv
    import math
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.layers import LSTM
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.metrics import mean_squared_error
    import os
    path = "filepath"
    os.chdir(path)
    os.getcwd()

    # Form dataset matrix
    def create_dataset(dataset, previous=1):
        dataX, dataY = [], []
        for i in range(len(dataset)-previous-1):
            a = dataset[i:(i+previous), 0]
            dataX.append(a)
            dataY.append(dataset[i + previous, 0])
        return np.array(dataX), np.array(dataY)

    # fix random seed for reproducibility
    np.random.seed(7)

    # load dataset
    dataframe = read_csv('data.csv', usecols=[0], engine='python', skipfooter=3)
    dataset = dataframe.values
    dataset = dataset.astype('float32')

    # normalize dataset with MinMaxScaler
    scaler = MinMaxScaler(feature_range=(0, 1))
    dataset = scaler.fit_transform(dataset)

    # Training and Test data partition
    train_size = int(len(dataset) * 0.8)
    test_size = len(dataset) - train_size
    train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]

    # reshape into X=t and Y=t+1
    previous = 1
    X_train, Y_train = create_dataset(train, previous)
    X_test, Y_test = create_dataset(test, previous)

    # reshape input to be [samples, time steps, features]
    X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
    X_test = np.reshape(X_test, (X_test.shape[0], 1, X_test.shape[1]))
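To see what create_dataset and the subsequent reshape actually produce, the function can be run on a toy series. The sketch below copies the function from the listing above and applies it to the integers 0 to 9 with a window of 3; the toy data is an assumption made purely for illustration.

```python
import numpy as np

def create_dataset(dataset, previous=1):
    """Build (window, next value) pairs from a single-column 2-D array."""
    dataX, dataY = [], []
    for i in range(len(dataset) - previous - 1):
        dataX.append(dataset[i:(i + previous), 0])
        dataY.append(dataset[i + previous, 0])
    return np.array(dataX), np.array(dataY)

# Toy single-column series: 0, 1, ..., 9
toy = np.arange(10, dtype="float32").reshape(-1, 1)

X, Y = create_dataset(toy, previous=3)
print(X[0], Y[0])    # [0. 1. 2.] 3.0
print(X[-1], Y[-1])  # [5. 6. 7.] 8.0
print(X.shape)       # (6, 3)

# Same reshape as in the article: [samples, time steps, features]
X_lstm = np.reshape(X, (X.shape[0], 1, X.shape[1]))
print(X_lstm.shape)  # (6, 1, 3)
```

Two details are worth noting. Because of the "- 1" in the loop bound, the final possible pair (window [6, 7, 8] with target 9) is never generated. Also, with this reshape the window of past values is presented to the LSTM as several features at a single time step rather than as several time steps.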
2.3. LSTM Generation and Predictions
Then, the LSTM model is built and trained, and predictions are generated:
    # Generate LSTM network
    model = Sequential()
    model.add(LSTM(4, input_shape=(1, previous)))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(X_train, Y_train, epochs=100, batch_size=1, verbose=2)

    # Generate predictions
    trainpred = model.predict(X_train)
    testpred = model.predict(X_test)

    # Convert predictions back to normal values
    trainpred = scaler.inverse_transform(trainpred)
    Y_train = scaler.inverse_transform([Y_train])
    testpred = scaler.inverse_transform(testpred)
    Y_test = scaler.inverse_transform([Y_test])

    # calculate RMSE
    trainScore = math.sqrt(mean_squared_error(Y_train[0], trainpred[:,0]))
    print('Train Score: %.2f RMSE' % (trainScore))
    testScore = math.sqrt(mean_squared_error(Y_test[0], testpred[:,0]))
    print('Test Score: %.2f RMSE' % (testScore))

    # Train predictions
    trainpredPlot = np.empty_like(dataset)
    trainpredPlot[:, :] = np.nan
    trainpredPlot[previous:len(trainpred)+previous, :] = trainpred

    # Test predictions
    testpredPlot = np.empty_like(dataset)
    testpredPlot[:, :] = np.nan
    testpredPlot[len(trainpred)+(previous*2)+1:len(dataset)-1, :] = testpred

    # Plot all predictions
    # (line handles get their own names so the prediction arrays are not overwritten)
    actual_line, = plt.plot(scaler.inverse_transform(dataset))
    train_line, = plt.plot(trainpredPlot)
    test_line, = plt.plot(testpredPlot)
    plt.title("Predicted vs. Actual Consumption")
    plt.show()
The model is trained over 100 epochs, and the predictions are generated.
2.4. Accuracy
When plotting the actual consumption (blue line) with the training and test predictions (orange and green lines), the two series follow each other quite closely, with the exception of certain spikes downward (or periods of abnormally low usage):
Moreover, here is the training output for the final epochs of the 100-epoch run:
    Epoch 94/100 - 1s - loss: 0.0108
    Epoch 95/100 - 1s - loss: 0.0108
    Epoch 96/100 - 1s - loss: 0.0107
    Epoch 97/100 - 1s - loss: 0.0108
    Epoch 98/100 - 1s - loss: 0.0108
    Epoch 99/100 - 1s - loss: 0.0108
    Epoch 100/100 - 1s - loss: 0.0109

    >>> # calculate RMSE
    ... trainScore = math.sqrt(mean_squared_error(Y_train[0], trainpred[:,0]))
    >>> print('Train Score: %.2f RMSE' % (trainScore))
    Train Score: 353.25 RMSE
    >>> testScore = math.sqrt(mean_squared_error(Y_test[0], testpred[:,0]))
    >>> print('Test Score: %.2f RMSE' % (testScore))
    Test Score: 255.13 RMSE
The model has a root mean squared error (RMSE) of 353.25 on the training dataset and 255.13 on the test dataset, against daily consumption values in the thousands of kilowatts.
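Since the scores above are RMSE values rather than simple average errors, it may help to see how the two differ. The following sketch computes both on made-up numbers (not data from the study); the helper names are hypothetical.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error: penalises large misses more heavily."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error: the plain average size of the misses."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

# Made-up values: two exact forecasts, one miss of 100 and one of 300
actual = [4600, 4700, 4500, 4800]
predicted = [4500, 4700, 4500, 4500]

print(round(rmse(actual, predicted), 2))  # 158.11
print(mae(actual, predicted))             # 100.0
```

RMSE is never smaller than the mean absolute error, and the gap widens when a few errors are much larger than the rest, as with the occasional downward spikes in consumption noted above.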
However, when running this model, the prediction was made over a 1-day (i.e. t+1) period. How would the model perform over longer time periods, e.g. 10 days or 50 days? Let's find out.
10 days
- Training error: 345.31 RMSE
- Test error: 283.77 RMSE
50 days
- Training error: 288.94 RMSE
- Test error: 396.36 RMSE
The test error was higher across the 10 and 50 day periods: relative to the 1-day test RMSE of 255.13, it rose by about 11% to 283.77 at 10 days and by about 55% to 396.36 at 50 days. Even so, the overall errors remain low in the context of the average of 4609 kilowatts per day in the time series itself.
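To put these errors in context, the reported test RMSE values can be expressed as a share of the series mean. The short sketch below uses only figures quoted in this article; the variable names are hypothetical.

```python
# Figures quoted in the article
mean_kwh = 4609.0
test_rmse = {1: 255.13, 10: 283.77, 50: 396.36}  # period in days -> test RMSE

# Test RMSE as a fraction of the average daily value
relative = {days: err / mean_kwh for days, err in test_rmse.items()}

for days, share in relative.items():
    print(f"{days:>2} days: {share:.1%}")
# Output:
#  1 days: 5.5%
# 10 days: 6.2%
# 50 days: 8.6%
```

On this measure the test error stays below 10% of average daily consumption in all three cases.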
Conclusion
Of the two neural networks, LSTM proved to be more accurate at predicting fluctuations in electricity consumption.
In the case of neuralnet, the model was not completely adept at handling non-stationary data present in various explanatory variables.
Moreover, factors such as temperature already follow set historical trends generally (with the exception of abnormal weather patterns which might have an effect on consumption).
In this regard, a traditional neural network with explanatory variables proved less effective in this instance than LSTM, which was able to model fluctuations in consumption without the need for explanatory data.