An Experiment in Forest Fire Prediction Using Deep Learning

Nuzulul Khairu Nissa
Feb 13, 2022

Forest fires are among the most destructive catastrophic events and have a great impact on the environment, infrastructure and human life. To support early warning and detection of forest fires, various methods have been used, including physics-based models, statistical models, machine learning models and deep learning models.

This article conducts a hyperparameter tuning experiment for predicting the burned area of forest fires, specifically in the northeast region of Portugal, based on the spatial, temporal and weather variables observed where the fire is spotted, using deep learning.

We use a public dataset from the UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets/Forest+Fires. This prediction can be used to estimate the forces sent to an incident and to decide the urgency of the situation. The method we will use is an artificial neural network (deep learning), framed as a classification problem for predicting forest fires.

A brief overview of artificial neural networks (ANNs):
ANNs are made of layers, each with an input and an output dimension. The output dimension is determined by the number of neurons (also called ‘nodes’), the computational units that combine weighted inputs through an activation function (which helps the neuron switch on/off). As in most machine learning algorithms, the weights are randomly initialized and optimized during training to minimize a loss function.
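To make this concrete, a single neuron computes a weighted sum of its inputs plus a bias and passes it through an activation function. A minimal NumPy sketch (the numbers and names here are purely illustrative):

import numpy as np

def relu(z):
    # ReLU activation: 0 for negative inputs, the input itself otherwise
    return np.maximum(0, z)

x = np.array([0.5, -1.2, 3.0])   # inputs to the neuron
w = np.random.randn(3)           # randomly initialized weights
b = 0.0                          # bias term
output = relu(np.dot(w, x) + b)  # weighted sum passed through the activation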

For a more complete introduction, see the tutorial Deep Learning with Python: Neural Networks.

Here are the steps to do the experiment:

Step 1: Understanding Dataset

Before we import the dataset, we must import the required libraries.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('seaborn')
import seaborn as sns
from sklearn.preprocessing import LabelEncoder, StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical, plot_model
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

The dataset used in this article is sourced from the UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets/Forest+Fires

To import the dataset, run the following:

df = pd.read_csv('dataset.csv')
df.head(10)

Attribute Information:

  • X : x-axis spatial coordinate within the Montesinho park map: 1 to 9
  • Y : y-axis spatial coordinate within the Montesinho park map: 2 to 9
  • month : month of the year: ‘jan’ to ‘dec’
  • day : day of the week: ‘mon’ to ‘sun’
  • FFMC : FFMC (Fine Fuel Moisture Code) index from the FWI system: 18.7 to 96.20
  • DMC : DMC (Duff Moisture Code) index from the FWI system: 1.1 to 291.3
  • DC : DC (Drought Code) index from the FWI system: 7.9 to 860.6
  • ISI : ISI (Initial Spread Index) index from the FWI system: 0.0 to 56.10
  • temp : temperature in Celsius degrees: 2.2 to 33.30
  • RH : relative humidity in %: 15.0 to 100
  • wind : wind speed in km/h: 0.40 to 9.40
  • rain : outside rain in mm/m2 : 0.0 to 6.4
  • area : the burned area of the forest (in ha): 0.00 to 1090.84
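As a quick sanity check, we can compare the loaded dataframe against the documented ranges above (a small sketch, assuming the CSV loaded as shown earlier):

# inspect column names, dtypes and missing values
df.info()
# summary statistics for the numeric attributes, to compare with the documented ranges
print(df[['FFMC', 'DMC', 'DC', 'ISI', 'temp', 'RH', 'wind', 'rain', 'area']].describe())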

Step 2: Data Preprocessing

1) Add a new column = size_category

For the classification problem, we add a new column, size_category, to split the data into two categories:

  • If the burned area ≤ 6 ha, size_category is labeled 0 (Small Fire)
  • If the burned area > 6 ha, size_category is labeled 1 (Large Fire)

df['size_category'] = np.where(df['area'] > 6, 1, 0)
df.tail(10)

2) Data Preprocessing for Days

The distribution over days of the week is fairly even. Instead of encoding 7 day variables, we will collapse them into weekend (True) or not weekend (False). The assumption is that the amount of area burned in a fire is also related to how firefighters respond to the flames: during the weekend, the number of firefighters or the response in general may differ from weekdays.

# converting to is weekend
df['day'] = ((df['day'] == 'sun') | (df['day'] == 'sat'))
# renaming column
df = df.rename(columns = {'day' : 'is_weekend'})
# visualizing
sns.countplot(x='is_weekend', data=df)
plt.title('Count plot of weekend vs weekday')

The skew is not too large so we are happy with this conversion.

3) Scaling Area and Rain

The distributions of rain and area are too skewed and have large outliers, so we will scale them to even out the distributions.

# natural logarithm scaling (+1 to prevent errors at 0)
df.loc[:, ['rain', 'area']] = df.loc[:, ['rain', 'area']].apply(lambda x: np.log(x + 1), axis = 1)
# visualizing
fig, ax = plt.subplots(2, figsize = (5, 8))
ax[0].hist(df['rain'])
ax[0].title.set_text('histogram of rain')
ax[1].hist(df['area'])
ax[1].title.set_text('histogram of area')

The distribution for rain is still not great, but the distribution for area is much improved. Now we scale the entire dataset. Note that we plan to train a neural network on this dataset, so we scale the area as a preventative measure against exploding gradients.

First we split the data into train and test sets, so that we can fit the scaler on the training set and then scale the test set using the statistics of the training set.
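Note that at this point month is still a text column ('jan' to 'dec'), while the network defined later expects 13 numeric inputs (input_dim=13), so month has to be turned into a number first. The exact encoding is not shown in the snippets above; a minimal sketch of one reasonable option (an assumption, not the article's confirmed preprocessing):

# Hypothetical step: map month names to integers so every feature is numeric
month_order = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
               'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
df['month'] = df['month'].map({m: i for i, m in enumerate(month_order)})
# is_weekend is boolean; cast to int so the scaler treats it as numeric
df['is_weekend'] = df['is_weekend'].astype(int)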

4) Train Test Split

The data is randomly split into training data (80%) and testing data (20%).

features = df.drop(['size_category'], axis = 1)
labels = df['size_category'].values.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(features,labels, test_size = 0.2, random_state = 42)

5) Feature Scaling: StandardScaler

Apply feature scaling to the data with StandardScaler, fitting it on the training set only and reusing it for the test set:

# fitting scaler on the training set only
sc_features = StandardScaler()
# transforming features
X_train = sc_features.fit_transform(X_train)
X_test = sc_features.transform(X_test)
# features
X_train = pd.DataFrame(X_train, columns = features.columns)
X_test = pd.DataFrame(X_test, columns = features.columns)
# labels
y_test = pd.DataFrame(y_test, columns = ['size_category'])
y_train = pd.DataFrame(y_train, columns = ['size_category'])
X_train.head()

Step 3: Hyperparameter Tuning Experiments and Results

1) Experiment 1 : Base Model

Here we create our ANN object using the Keras Sequential class. Once the ANN is initialized, we add its layers. The base model network will have:

  • 1 input layer
  • 3 hidden layers
  • 1 dropout layer
  • 1 output layer

Each hidden layer is created with the Dense class, which is part of the layers module. We pass it two arguments:

  • units: the number of neurons in the layer
  • activation: the activation function to use

We create a sequence of layers to define the neural network and define each layer by initializing weights, defining the activation function and selecting the nodes per hidden layer.

model = Sequential()
# input layer + 1st hidden layer
model.add(Dense(6, input_dim=13, activation='relu'))
# 2nd hidden layer
model.add(Dense(6, activation='relu'))
# 3rd hidden layer
model.add(Dense(6, activation='sigmoid'))
model.add(Dropout(0.2))
# output layer
model.add(Dense(1, activation='relu'))
model.summary()

Next, we compile our ANN using the hyperparameters below:

  • Epochs: how many times the network passes over the entire training set.
  • Batch Size: how many observations are processed before the model weights are updated.
  • Activation Function: transforms the summed weighted input of a node into the output value fed to the next hidden layer or into the final output.
  • Loss Function: measures the error between the output of the algorithm and the given target value.
  • Learning Rate: controls how much the model changes in response to the estimated error each time the weights are updated (see the sketch after this list).
  • Optimizer: the algorithm used to minimize the loss function.
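As an aside on the learning rate: the compile call below uses the string alias 'adam', which keeps Adam's default learning rate. To tune it explicitly, the optimizer can be constructed by hand; a minimal sketch (0.001 is the Keras default for Adam, shown only for illustration and not used in the experiments below):

from tensorflow.keras.optimizers import Adam

# build the optimizer explicitly to control the learning rate
optimizer = Adam(learning_rate=0.001)  # 0.001 is the Keras default
model.compile(optimizer=optimizer,
              loss='binary_crossentropy',
              metrics=['accuracy'])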

To check the performance of the methods, we calculate the accuracy measure.

# Compile Model
model.compile(optimizer = 'adam', metrics=['accuracy'], loss ='binary_crossentropy')
# Train Model
history = model.fit(X_train, y_train, validation_data = (X_test, y_test), batch_size = 10, epochs = 100)
_, train_acc = model.evaluate(X_train, y_train, verbose=0)
_, valid_acc = model.evaluate(X_test, y_test, verbose=0)
print('Train: %.3f, Valid: %.3f' % (train_acc, valid_acc))

Based on the results of experiment 1, which used the hyperparameters of the base model, the accuracy on the training data is 96% and the accuracy on the validation (test) data is 92%.

plt.figure(figsize=[8,5])
plt.plot(history.history['accuracy'], label='Train')
plt.plot(history.history['val_accuracy'], label='Valid')
plt.legend()
plt.xlabel('Epochs', fontsize=16)
plt.ylabel('Accuracy', fontsize=16)
plt.title('Accuracy Curves Epoch 100, Batch Size 10', fontsize=16)
plt.show()

Based on the accuracy graph, the model starts to stabilize at around epochs 60 to 100.

2) Experiment 2: Batch Size: 4, 6, 10, 16, 32, 64, 128, 260

For experiment 2, we train the ANN with the hyperparameter details below:

# Fit a model and plot learning curve
def fit_model(X_train, y_train, X_test, y_test, n_batch):
    # Define Model
    model = Sequential()
    model.add(Dense(6, input_dim=13, activation='relu'))
    model.add(Dense(6, activation='relu'))
    model.add(Dense(6, activation='sigmoid'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='relu'))
    # Compile Model
    model.compile(optimizer='adam',
                  metrics=['accuracy'],
                  loss='binary_crossentropy')
    # Fit Model
    history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=100, verbose=0, batch_size=n_batch)
    # Plot Learning Curves
    plt.plot(history.history['accuracy'], label='train')
    plt.plot(history.history['val_accuracy'], label='test')
    plt.title('batch=' + str(n_batch))
    plt.legend()

# Create learning curves for different batch sizes
batch_sizes = [4, 6, 10, 16, 32, 64, 128, 260]
plt.figure(figsize=(10, 15))
for i in range(len(batch_sizes)):
    # Determine the plot number (4 rows x 2 columns grid)
    plot_no = 420 + (i + 1)
    plt.subplot(plot_no)
    # Fit model and plot learning curves for a batch size
    fit_model(X_train, y_train, X_test, y_test, batch_sizes[i])
# Show learning curves
plt.show()

Based on the accuracy graphs above, the model that shows good enough stability is the one with batch size = 6.

3) Experiment 3: Batch Size = 6, Epochs = 20, 50, 100, 120, 150, 200, 300, 400

For experiment 3, we train the ANN with the hyperparameter details below:

# fit a model and plot learning curve
def fit_model(trainX, trainy, validX, validy, n_epoch):
    # define model
    model = Sequential()
    model.add(Dense(6, input_dim=13, activation='relu'))
    model.add(Dense(6, activation='relu'))
    model.add(Dense(6, activation='sigmoid'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='relu'))
    # compile model
    model.compile(optimizer='adam', metrics=['accuracy'], loss='binary_crossentropy')
    # fit model
    history = model.fit(trainX, trainy, validation_data=(validX, validy), epochs=n_epoch, verbose=0, batch_size=6)
    # plot learning curves
    plt.plot(history.history['accuracy'], label='train')
    plt.plot(history.history['val_accuracy'], label='test')
    plt.title('epoch=' + str(n_epoch))
    plt.legend()

# Create learning curves for different numbers of epochs
epochs = [20, 50, 100, 120, 150, 200, 300, 400]
plt.figure(figsize=(10, 15))
for i in range(len(epochs)):
    # Determine the plot number (4 rows x 2 columns grid)
    plot_no = 420 + (i + 1)
    plt.subplot(plot_no)
    # Fit model and plot learning curves for a number of epochs
    fit_model(X_train, y_train, X_test, y_test, epochs[i])
# Show learning curves
plt.show()

Based on the accuracy graphs above, the models that show good enough stability are those with epochs = 200, 300 and 400.

4) Experiment 4: Batch Size = 6, Early Stopping (Patience), Model Checkpoint

For experiment 4, we train the ANN with the hyperparameter details below:

# define model
def init_model():
    model = Sequential()
    model.add(Dense(6, input_dim=13, activation='relu'))
    model.add(Dense(6, activation='relu'))
    model.add(Dense(6, activation='sigmoid'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='relu'))
    model.compile(optimizer='adam',
                  metrics=['accuracy'],
                  loss='binary_crossentropy')
    return model

# init model
model = init_model()
# simple early stopping
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=150)
# model checkpoint
mc = ModelCheckpoint('best_model.h5', monitor='val_accuracy', mode='max', verbose=1, save_best_only=True)
# fitting model
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=250, verbose=0, batch_size=6, callbacks=[es, mc])
# plot training history
plt.plot(history.history['loss'], label='train')
plt.plot(history.history['val_loss'], label='valid')
plt.legend()
plt.xlabel('Epochs', fontsize=14)
plt.ylabel('Loss', fontsize=14)
plt.title('Loss Curves', fontsize=16)
plt.show()
plt.figure(figsize=[8,5])
plt.plot(history.history['accuracy'], label='Train')
plt.plot(history.history['val_accuracy'], label='Valid')
plt.legend()
plt.xlabel('Epochs', fontsize=16)
plt.ylabel('Accuracy', fontsize=16)
plt.title('Accuracy Curves', fontsize=16)
plt.show()
_, train_acc = model.evaluate(X_train, y_train, verbose=0)
_, valid_acc = model.evaluate(X_test, y_test, verbose=0)
print('Train: %.3f, Valid: %.3f' % (train_acc, valid_acc))

Based on the results of experiment 4, which used early stopping with patience and a model checkpoint, the accuracy on the training data is 97% and the accuracy on the validation (test) data is 99%.
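Since ModelCheckpoint saved the best weights to best_model.h5, the checkpointed model can also be reloaded and evaluated separately (a small sketch; the scores reported above come from the model left in memory after training):

from tensorflow.keras.models import load_model

# reload the weights saved by ModelCheckpoint and re-evaluate on the test set
best_model = load_model('best_model.h5')
_, best_valid_acc = best_model.evaluate(X_test, y_test, verbose=0)
print('Best checkpointed valid accuracy: %.3f' % best_valid_acc)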

Conclusion and Discussion

One of the keys to controlling a forest fire is early detection. In this article we conducted a hyperparameter tuning experiment for predicting the burned area of forest fires, specifically in the northeast region of Portugal, based on the spatial, temporal and weather variables observed where the fire is spotted, using deep learning.

To look for an even better method, we suggest trying other data preprocessing options and classical machine learning algorithms such as Support Vector Machines (SVM), Decision Trees, Random Forest classifiers, Naive Bayes classifiers, etc.; a quick baseline sketch follows below.
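For example, a random forest baseline on the same preprocessed split might look like this (a sketch only; these models were not run in this article):

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# simple tree-ensemble baseline on the same train/test split
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train.values.ravel())
preds = rf.predict(X_test)
print('Random forest valid accuracy: %.3f' % accuracy_score(y_test.values.ravel(), preds))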

References:

http://archive.ics.uci.edu/ml/datasets/Forest+Fires

