
Climate Forecasting with Deep Learning and Keras

Source: https://towardsdatascience.com/climate-forecasting-with-deep-learning-and-keras-ba75f72e9672


Deep Learning for Climate Time Series Forecasting

Source: https://unsplash.com/photos/JZRlnfsdcj0

In my previous article about climate change, I complained about the relative scarcity of AI research dedicated to such an important topic. In this post, I want to dig deeper into the lack of deep learning efforts for climate forecasting and contribute to the topic by showing a nice Keras project.

The truth is that most climate models today are built with either multiple regression or classical time series forecasting techniques such as ARIMA or ARMA. This is mainly because neural network research has focused on unstructured perceptual data (images, video, text, or speech), which is difficult to handle with traditional approaches. However, the fact that deep learning is so successful at classification problems with images or text does not mean that it cannot help with regression or forecasting problems as well.

In fact, the power of deep learning can be unleashed for long time series forecasting as well, and little by little we are seeing new types of networks developed for more accurate numerical predictions.

Climate Data Time-Series

For this project, we are going to use the Jena Climate dataset recorded by the Max Planck Institute for Biogeochemistry. The dataset consists of 14 features, such as temperature, pressure, and humidity, recorded once every 10 minutes.

Location: Weather Station, Max Planck Institute for Biogeochemistry in Jena, Germany

Time-frame Considered: Jan 10, 2009 — December 31, 2016

Let’s grab this dataset and make a few plots to gain some insights about it:
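A minimal sketch of this step might look as follows, assuming the standard zipped copy of the CSV hosted in the TensorFlow datasets bucket (the URL, file name, and "Date Time" column name are those of the usual jena_climate_2009_2016.csv release, not stated in the text):

```python
import pandas as pd
import matplotlib.pyplot as plt
from zipfile import ZipFile
from tensorflow import keras

# Download and extract the Jena Climate CSV (assumed TensorFlow-hosted copy).
uri = (
    "https://storage.googleapis.com/tensorflow/"
    "tf-keras-datasets/jena_climate_2009_2016.csv.zip"
)
zip_path = keras.utils.get_file(fname="jena_climate_2009_2016.csv.zip", origin=uri)
ZipFile(zip_path).extractall()

df = pd.read_csv("jena_climate_2009_2016.csv")

# Plot every feature (all columns except the timestamp) over time.
df.set_index("Date Time").plot(subplots=True, figsize=(15, 20))
plt.show()
```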

This will plot how each feature varies over the course of the time series:

[Figure: variation of each of the 14 features over the full time series]

Now let’s visualize the correlation between the features:
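One simple way to do this is to compute the pairwise correlations with pandas and render them with matplotlib; this is a sketch, assuming the df loaded above, not the article's exact plotting code:

```python
import matplotlib.pyplot as plt

# Pairwise correlations between the 14 numeric features.
corr = df.drop(columns=["Date Time"]).corr()

fig, ax = plt.subplots(figsize=(12, 10))
im = ax.matshow(corr)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=90)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im)
plt.show()
```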

As we can see from this correlation matrix, several features are highly correlated with each other, so we should reduce the dimensionality (for instance, by dropping redundant features) before training the model.

[Figure: correlation heatmap of the features]

Data Preprocessing

Here we are picking ~300,000 data points for training. Observations are recorded every 10 minutes, i.e. six times per hour. We will resample to one point per hour, since no drastic change is expected within 60 minutes. We do this via the sampling_rate argument of the timeseries_dataset_from_array utility.

We are tracking data from the past 720 timestamps (720/6 = 120 hours). This data will be used to predict the temperature 72 timestamps ahead (72/6 = 12 hours).

Since every feature has values in a different range, we normalize them to a comparable scale before training the neural network. We do this by subtracting the mean and dividing by the standard deviation of each feature.
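A minimal sketch of that normalization step; computing the statistics on the training rows only is a choice made here so that no information leaks from the validation period:

```python
# Standardize each feature using mean/std computed on the training rows only.
def normalize(data, train_split):
    data_mean = data[:train_split].mean(axis=0)
    data_std = data[:train_split].std(axis=0)
    return (data - data_mean) / data_std
```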

The training data will be 71.5% of the total dataset, i.e. 300,693 rows; this ratio is stored in split_fraction (we can play with these values). The model is fed the previous 5 days of data, i.e. 720 observations at 10-minute intervals, sampled down to one per hour. The temperature 72 observations later (12 hours × 6 observations per hour) will be used as the label.

As we can see from the correlation heatmap, a few parameters, such as Relative Humidity and Specific Humidity, are redundant. Hence we will use a selected subset of the features, not all of them.

The selected parameters are: Pressure, Temperature, Saturation vapor pressure, Vapor pressure deficit, Specific humidity, Airtight (air density, rho), Wind speed
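Putting the split and the feature selection together might look like this (a sketch; the column keys are the ones used in the standard Jena CSV and may differ in other copies of the file):

```python
split_fraction = 0.715
train_split = int(split_fraction * int(df.shape[0]))  # ~300,693 rows

# The seven selected parameters, by their column keys in the CSV.
selected_features = [
    "p (mbar)",      # Pressure
    "T (degC)",      # Temperature
    "VPmax (mbar)",  # Saturation vapor pressure
    "VPdef (mbar)",  # Vapor pressure deficit
    "sh (g/kg)",     # Specific humidity
    "rho (g/m**3)",  # Airtight (air density)
    "wv (m/s)",      # Wind speed
]
features = df[selected_features]
features.index = df["Date Time"]

# Normalize, then split into training and validation rows.
features = pd.DataFrame(normalize(features.values, train_split))
train_data = features.loc[0 : train_split - 1]
val_data = features.loc[train_split:]
```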

The training dataset labels start from the 792nd observation (720 + 72).

The Keras preprocessing module provides a timeseries_dataset_from_array method that takes in a sequence of data points gathered at equal intervals, along with time series parameters such as the length of the sequences/windows and the spacing between two sequences/windows, to produce batches of sub-time-series inputs and targets sampled from the main time series:
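A sketch of building the training dataset with it, reusing the variables defined above (the batch size is an assumed value; the article does not state one):

```python
past = 720    # look back 720 raw timestamps (5 days of 10-minute records)
future = 72   # predict the temperature 72 raw timestamps (12 hours) ahead
step = 6      # keep one point per hour (sampling_rate)
batch_size = 256  # assumed value

start = past + future  # labels start at the 792nd observation
end = start + train_split

x_train = train_data.values
y_train = features.iloc[start:end][[1]]  # column 1 is temperature

sequence_length = int(past / step)  # 120 hourly points per input window

dataset_train = keras.preprocessing.timeseries_dataset_from_array(
    x_train,
    y_train,
    sequence_length=sequence_length,
    sampling_rate=step,
    batch_size=batch_size,
)
```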

Validation dataset

The validation dataset must not contain the last 792 rows, as we won't have label data for those records; hence 792 must be subtracted from the end of the data.

The validation labels must start 792 observations after train_split; hence we must add past + future (792) to obtain label_start.
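In code, the same logic might read as follows (a sketch, reusing the variables above):

```python
# Drop the last past + future rows: they have no corresponding label.
x_end = len(val_data) - past - future
x_val = val_data.iloc[:x_end].values

# Labels start past + future (792) observations after train_split.
label_start = train_split + past + future
y_val = features.iloc[label_start:][[1]]

dataset_val = keras.preprocessing.timeseries_dataset_from_array(
    x_val,
    y_val,
    sequence_length=sequence_length,
    sampling_rate=step,
    batch_size=batch_size,
)
```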

Let’s Train the Model

We’ll use the ModelCheckpoint callback to regularly save checkpoints, and the EarlyStopping callback to interrupt training when the validation loss is no longer improving.
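A minimal sketch of the model and the training run; the single 32-unit LSTM layer, the learning rate, and the 10 epochs are assumptions, not values stated in the text:

```python
# Infer the input shape from one training batch.
for inputs_batch, targets_batch in dataset_train.take(1):
    break

inputs = keras.layers.Input(shape=(inputs_batch.shape[1], inputs_batch.shape[2]))
lstm_out = keras.layers.LSTM(32)(inputs)   # assumed: 32 units
outputs = keras.layers.Dense(1)(lstm_out)  # one value: the future temperature

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mse")

# Stop when val_loss stalls, and keep the best weights seen so far.
es_callback = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)
ckpt_callback = keras.callbacks.ModelCheckpoint(
    filepath="model_checkpoint.h5",
    monitor="val_loss",
    save_weights_only=True,
    save_best_only=True,
)

history = model.fit(
    dataset_train,
    epochs=10,  # assumed
    validation_data=dataset_val,
    callbacks=[es_callback, ckpt_callback],
)
```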

Forecast

We are now ready to forecast 5 sets of values from the validation set.
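One way to do it (a sketch that prints each prediction next to the true future temperature rather than reproducing the article's plots):

```python
# Take five batches from the validation set and forecast the first
# window of each; column 1 of the inputs is the (normalized) temperature.
for x, y in dataset_val.take(5):
    prediction = model.predict(x)[0]
    true_future = y[0].numpy()
    print(f"true: {true_future}, predicted: {prediction}")
```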

Conclusion

The model has been trained using the Keras LSTM layer, where LSTM stands for Long Short-Term Memory network.

Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections. It can not only process single data points (such as images), but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition, and anomaly detection in network traffic or IDSs (intrusion detection systems).

A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.

LSTM networks are well suited to classifying, processing, and making predictions based on time series data (as in this project), since there can be lags of unknown duration between important events in a time series. LSTMs were developed to deal with the vanishing gradient problem that can be encountered when training traditional RNNs. Relative insensitivity to gap length is an advantage of LSTM over standard RNNs, hidden Markov models, and other sequence learning methods in numerous applications.

As we can see, in this case our model is able to predict future values that cannot be inferred by merely looking at the time series. The behavior of the dataset is quite erratic, and future values diverge considerably from the seasonal average trend. This is very exciting news and positive feedback for Keras and deep learning as powerful instruments for developing long-term climate predictions.

