Choose Optimal Number of Epochs to Train a Neural Network in Keras - GeeksforGeeks (2024)

Last Updated : 20 Mar, 2024


One of the critical issues in training a neural network on sample data is overfitting. When a model is trained for more epochs than necessary, it learns patterns that are specific to the sample data. Such a model achieves high accuracy on the training set (sample data) but performs poorly on new data such as the test set. In other words, the model loses its capacity to generalize by overfitting the training data.

To mitigate overfitting and improve generalization, the model should be trained for an optimal number of epochs. A part of the training data is set aside for validation, so that the performance of the model can be checked after each epoch of training. The loss and accuracy on both the training set and the validation set are monitored to find the epoch after which the model starts overfitting.

keras.callbacks.EarlyStopping()

Either loss or accuracy can be monitored by the EarlyStopping callback. If loss is being monitored, training comes to a halt when the loss stops decreasing. If accuracy is being monitored, training comes to a halt when the accuracy stops increasing.

Syntax:

keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=0, verbose=0, mode='auto', baseline=None, restore_best_weights=False)

where,

  • monitor: The quantity to be monitored, for example validation loss ('val_loss') or validation accuracy ('val_accuracy').
  • mode: The direction in which the monitored quantity should improve. This can be 'min', 'max' or 'auto'. Use 'min' when the monitored value is a loss and 'max' when it is an accuracy. When the mode is set to 'auto', the direction is inferred from the name of the monitored quantity.
  • min_delta: The minimum change in the monitored quantity that counts as an improvement. A change smaller than min_delta is treated as no improvement.
  • patience: The number of epochs with no improvement after which training is stopped. The callback waits for this many epochs for an improvement before halting.
  • verbose: The verbosity of the callback, 0 or 1. The progress bar and the one-line-per-epoch display are controlled separately by the verbose argument of model.fit.
    • verbose = 0: Silent mode; nothing is displayed.
    • verbose = 1: A message is displayed when the callback takes an action, for example when it stops training.
  • restore_best_weights: A boolean value. If True, the model weights from the epoch with the best value of the monitored quantity are restored. If False, the weights from the last epoch of training are kept.
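
The interplay of patience, min_delta and mode can be sketched in plain Python. This is a simplified illustration of the stopping rule described above, not the Keras source code; the function name early_stopping_epoch and the validation losses are made up for the example.

```python
def early_stopping_epoch(values, patience=0, min_delta=0.0, mode="min"):
    """Return (stop_epoch, best_epoch), both counted from 1.

    Simplified sketch of an early-stopping rule, for illustration only.
    """
    # Flip the sign for "max" so that smaller is always better below.
    sign = 1.0 if mode == "min" else -1.0
    best = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, value in enumerate(values, start=1):
        score = sign * value
        if score < best - min_delta:
            # Improvement larger than min_delta: remember it, reset the counter.
            best = score
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # The rule never fired: training ran for all epochs.
    return len(values), best_epoch


# Hypothetical validation losses, invented for this example.
val_loss = [0.30, 0.20, 0.15, 0.16, 0.17, 0.155, 0.18]
stop_epoch, best_epoch = early_stopping_epoch(val_loss, patience=3)
print(f"stopped after epoch {stop_epoch}, best epoch was {best_epoch}")
```

With restore_best_weights=True, the weights kept would be those from best_epoch rather than from stop_epoch.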

Importing Libraries and Dataset

Python libraries make it very easy for us to handle the data and perform typical and complex tasks with a single line of code.

  • Pandas – This library helps to load the data frame in a 2D array format and has multiple functions to perform analysis tasks in one go.
  • Numpy – Numpy arrays are very fast and can perform large computations in a very short time.
  • Matplotlib – This library is used to draw visualizations.
  • Sklearn – This module contains multiple libraries having pre-implemented functions to perform tasks from data preprocessing to model development and evaluation.
  • OpenCV – This is an open-source library mainly focused on image processing and handling.
  • TensorFlow – This is an open-source library that is used for Machine Learning and Artificial intelligence and provides a range of functions to achieve complex functionalities with single lines of code.
Python3
import keras
from keras.utils import to_categorical
from keras.datasets import mnist

# Loading data
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Reshaping data: adding the number of channels as 1 (grayscale images)
train_images = train_images.reshape((train_images.shape[0],
                                     train_images.shape[1],
                                     train_images.shape[2], 1))
test_images = test_images.reshape((test_images.shape[0],
                                   test_images.shape[1],
                                   test_images.shape[2], 1))

# Scaling down pixel values to the range [0, 1]
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255

# Encoding labels to a binary class matrix
y_train = to_categorical(train_labels)
y_test = to_categorical(test_labels)
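
To make the two preprocessing steps concrete, here is a small plain-Python sketch of what scaling and encoding to a binary class matrix do. The helpers one_hot and scale_pixels are hypothetical names written for this illustration; they mimic the idea behind the steps above, not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Encode integer class labels as rows of a binary class matrix."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # a single 1 at the position of the class
        rows.append(row)
    return rows


def scale_pixels(pixels, max_value=255):
    """Map pixel intensities from [0, max_value] to [0, 1]."""
    return [p / max_value for p in pixels]


# Hypothetical digit labels and pixel values, invented for this example.
encoded = one_hot([3, 0, 9], num_classes=10)
scaled = scale_pixels([0, 51, 255])
print(encoded[0])
print(scaled)
```

Each encoded row has exactly one non-zero entry, which is the form the categorical cross-entropy loss expects.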

From this step onward we will use Keras, the high-level API of the TensorFlow library, to build our CNN model. Keras contains all the functionality needed to define the architecture of a Convolutional Neural Network and train it on the data.

Model Architecture

We will implement a Sequential model which will contain the following parts:

  • Two Convolutional Layers, each followed by a MaxPooling Layer.
  • The Flatten layer flattens the output of the last pooling layer.
  • Then we will have one fully connected layer of 64 units that takes the output of the Flatten layer.
  • The final layer is the output layer which outputs soft probabilities for the ten digit classes.
Python3
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation="relu",
                        input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D(2, 2))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.MaxPooling2D(2, 2))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
model.summary()

Output:

Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape           Param #
=================================================================
 conv2d (Conv2D)                 (None, 26, 26, 32)     320
 max_pooling2d (MaxPooling2D)    (None, 13, 13, 32)     0
 conv2d_1 (Conv2D)               (None, 11, 11, 64)     18496
 max_pooling2d_1 (MaxPooling2D)  (None, 5, 5, 64)       0
 flatten (Flatten)               (None, 1600)           0
 dense (Dense)                   (None, 64)             102464
 dense_1 (Dense)                 (None, 10)             650
=================================================================
Total params: 121,930
Trainable params: 121,930
Non-trainable params: 0
_________________________________________________________________
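
The shapes and parameter counts in the summary can be checked by hand. The sketch below assumes 3x3 kernels with no padding and stride 1, and 2x2 non-overlapping pooling, as in the model above; the helper names are made up for this illustration.

```python
def conv2d_params(kernel_size, in_channels, filters):
    # One weight per kernel cell per input channel, plus one bias per filter.
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(in_units, out_units):
    # One weight per input unit, plus one bias, for every output unit.
    return (in_units + 1) * out_units


def conv_out(size, kernel_size):
    # Spatial size after a convolution without padding, stride 1.
    return size - kernel_size + 1


def pool_out(size, pool_size):
    # Spatial size after non-overlapping max pooling.
    return size // pool_size


size = 28
size = conv_out(size, 3)   # first Conv2D
size = pool_out(size, 2)   # first MaxPooling2D
size = conv_out(size, 3)   # second Conv2D
size = pool_out(size, 2)   # second MaxPooling2D
flat_units = size * size * 64  # length of the flattened vector

params = {
    "conv2d": conv2d_params(3, 1, 32),
    "conv2d_1": conv2d_params(3, 32, 64),
    "dense": dense_params(flat_units, 64),
    "dense_1": dense_params(64, 10),
}
total_params = sum(params.values())
print(flat_units, params, total_params)
```

Most of the parameters sit in the first Dense layer, because it connects all 1600 flattened values to 64 units.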

Model Compilation

While compiling a model we provide these three essential parameters:

  • optimizer – This is the method that updates the model weights to minimize the cost function, typically a variant of gradient descent.
  • loss – The loss function that is minimized during training; its value tells us whether the model is improving with training or not.
  • metrics – Quantities such as accuracy that are computed on the training and the validation data to evaluate the model.
Python3
model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=['accuracy'])
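
As a rough illustration of what the chosen loss and metric measure, the following plain-Python sketch computes categorical cross-entropy and accuracy for one-hot labels. It is a simplified version written for this example (the clipping constant eps is an assumption), not the Keras implementation.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum keeps exp() stable
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def accuracy(true_rows, pred_rows):
    """Fraction of samples whose most probable class matches the label."""
    hits = sum(
        t.index(max(t)) == p.index(max(p))
        for t, p in zip(true_rows, pred_rows)
    )
    return hits / len(true_rows)


# Hypothetical three-class labels and predictions, invented for this example.
labels = [[0, 1, 0], [1, 0, 0]]
preds = [[0.1, 0.8, 0.1], [0.2, 0.5, 0.3]]
loss_first = categorical_crossentropy(labels[0], preds[0])
print(round(loss_first, 4), accuracy(labels, preds))
```

The loss only looks at the probability assigned to the true class, so a confident correct prediction gives a loss close to zero.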

Data Preprocessing

While training a machine learning model, it is considered good practice to split the data into a training part and a validation part. This helps us track the performance of the model epoch by epoch as the training process moves forward. Here the first 10,000 training images are held out for validation.

Python3
val_images = train_images[:10000]
partial_images = train_images[10000:]
val_labels = y_train[:10000]
partial_labels = y_train[10000:]

Early Stopping Callback

EarlyStopping stops the training when model performance is no longer improving. We can also define custom callbacks to stop training as soon as the desired results have been obtained.

Python3
from keras import callbacks

earlystopping = callbacks.EarlyStopping(monitor="val_loss",
                                        mode="min",
                                        patience=5,
                                        restore_best_weights=True)

history = model.fit(partial_images, partial_labels,
                    batch_size=128,
                    epochs=25,
                    validation_data=(val_images, val_labels),
                    callbacks=[earlystopping])

Output:

Epoch 10/25
391/391 [==============================] - 13s 33ms/step - loss: 0.0082 - accuracy: 0.9976 - val_loss: 0.0464 - val_accuracy: 0.9893
Epoch 11/25
391/391 [==============================] - 12s 31ms/step - loss: 0.0064 - accuracy: 0.9981 - val_loss: 0.0487 - val_accuracy: 0.9905
Epoch 12/25
391/391 [==============================] - 14s 35ms/step - loss: 0.0062 - accuracy: 0.9982 - val_loss: 0.0454 - val_accuracy: 0.9885
Epoch 13/25
391/391 [==============================] - 13s 32ms/step - loss: 0.0046 - accuracy: 0.9986 - val_loss: 0.0502 - val_accuracy: 0.9894
Epoch 14/25
391/391 [==============================] - 15s 38ms/step - loss: 0.0039 - accuracy: 0.9987 - val_loss: 0.0511 - val_accuracy: 0.9904
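
The 391 steps per epoch shown in the log follow from the data split and the batch size. A quick check, assuming the standard 60,000-image MNIST training set with 10,000 images held out for validation:

```python
import math


def steps_per_epoch(num_samples, batch_size):
    # The last batch may be smaller, so round up.
    return math.ceil(num_samples / batch_size)


total_train = 60000            # size of the MNIST training set
val_size = 10000               # images held out for validation
train_size = total_train - val_size
batch_size = 128

steps = steps_per_epoch(train_size, batch_size)
last_batch = train_size - (steps - 1) * batch_size
print(steps, last_batch)
```

So each epoch performs 391 weight updates, the last of them on a smaller batch.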

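Note: Training stopped after the 14th epoch. With patience=5, this means the validation loss had failed to improve for five consecutive epochs, which implies that the best validation loss was reached around the 9th epoch (not shown in the excerpt above) and that the model began overfitting after that point. While the training loss keeps decreasing towards zero, the validation loss drifts upward, which indicates overfitting on the training data. Because restore_best_weights=True, the model is left with the weights from the best epoch.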

Let’s visualize the training and validation loss and accuracy for each epoch.

Python3
import pandas as pd
import matplotlib.pyplot as plt

history_df = pd.DataFrame(history.history)

# Training vs. validation loss
history_df.loc[:, ['loss', 'val_loss']].plot()
# Training vs. validation accuracy
history_df.loc[:, ['accuracy', 'val_accuracy']].plot()
plt.show()

Output:


Comparison between Accuracy and Validation Accuracy Epoch-By-Epoch
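
Besides reading the curves by eye, the best epoch can be picked out of the recorded values directly. The sketch below works on a plain dictionary shaped like history.history; the helper best_epoch and the numbers are invented for this illustration and are not the values from the run above.

```python
def best_epoch(history, metric="val_loss", mode="min"):
    """Return (epoch, value) of the best recorded value, epoch counted from 1."""
    values = history[metric]
    pick = min if mode == "min" else max
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Hypothetical per-epoch values, made up for this example.
history_dict = {
    "loss": [0.40, 0.20, 0.10, 0.05],
    "val_loss": [0.35, 0.22, 0.25, 0.30],
    "val_accuracy": [0.90, 0.95, 0.96, 0.95],
}

epoch, value = best_epoch(history_dict)
# Gap between validation and training loss; a growing gap hints at overfitting.
gaps = [v - t for t, v in zip(history_dict["loss"], history_dict["val_loss"])]
print(epoch, value, gaps)
```

In this made-up run the validation loss is lowest at epoch 2, while the gap to the training loss keeps widening afterwards.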




manmayi


FAQs

What is the optimal number of epochs to train a neural network?

Finding the Balance Between Batch Size and Epochs

Hyperparameter | Typical Range | Best Practices
Number of Epochs | 10–50 for small datasets, 50–200 for medium datasets, 100–500+ for large datasets | Start with a larger number, use early stopping to avoid overfitting

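How many epochs is enough for training?

There is no single number that suits every dataset. A figure of about 11 epochs is sometimes quoted as a starting point, but the right value depends on the model and the data and is best found by monitoring the validation loss.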

Is 50 epochs too much?

It can be. An excessive number of epochs can contribute to overfitting in machine learning models, so whether 50 is too many depends on the model and the dataset.

Is more or less epoch better?

Too few epochs can result in underfitting, where the model is unable to capture the patterns in the data. On the other hand, too many epochs can result in overfitting, where the model becomes too specific to the training data and is unable to generalize to new data.

Does number of epochs increase accuracy?

Initially, as the number of epochs increases, the model learns more from the training data, and the prediction accuracy on both the training and validation datasets tends to improve. This is because the model gets more opportunities to adjust its weights and biases to minimize the loss function.

What's the optimal batch size to train a neural network?

It was recommended to use small batch sizes of 32 to 64 with a low learning rate to improve the performance of the network.

What is the minimum number of epochs?

Difference Between Epoch and Batch in Machine Learning

Epoch | Batch
The number of epochs can be anything between one and infinity. | The batch size is always equal to or more than one and equal to or less than the number of samples in the training set.

What does 50 epochs mean?

An “epoch” means one pass over the whole training set. If the training set has 10 samples, then running 20 epochs means the model is trained on these 10 samples 20 times. Likewise, 50 epochs means 50 passes over the training set.

How many epochs to train for LSTM?

There is no fixed number. Neural networks need a loss function to guide the optimization, much like the objective function of a process-based hydrological model. In the hydrological modelling study this answer is drawn from, only the LSTM among the developed models needed early stopping, at 40 epochs.

Why is the number of epochs important?

The number of epochs is an important hyperparameter for the training process of a machine learning model. Too few epochs can result in an underfitted model, where the model has not learned enough from the training data to make accurate predictions.

How many epochs does Bert have?

In the source this answer is drawn from, the original BERT-based model is trained for 3 epochs, and BERT with an additional layer is trained for 4 epochs. All hidden sizes are 100 in the BiDAF-based models, apart from the embedding layer.

How many epochs are needed for early stopping?

People typically define a patience, i.e. the number of epochs to wait before stopping early if there is no progress on the validation set. The patience is often set somewhere between 10 and 100 (10 or 20 is more common), but it really depends on your dataset and network.

How to make a neural network run faster?

Gradient clipping reduces training time by helping the training to converge.
  1. Use Transfer Learning. ...
  2. Optimize Network Architecture. ...
  3. Normalize Data. ...
  4. Stop Training Early. ...
  5. Disable Optional Visualizations. ...
  6. Reduce Validation Time.

How to improve Keras model accuracy?

Techniques to Improve Accuracy
  1. Data Preprocessing. Data preprocessing is a crucial step in enhancing model performance. ...
  2. Increase the Number of Layers. ...
  3. Increase the Number of Neurons. ...
  4. Use Dropout Regularization. ...
  5. Increase the Number of Epochs. ...
  6. Hyperparameter Tuning.

How many images to train CNN?

While there is no fixed threshold for the number of images per class, having hundreds to thousands of images per class is generally recommended for training a CNN effectively.

How many epochs should I train RVC?

If the training dataset's audio quality is poor and the noise floor is high, 20-30 epochs are sufficient. Setting it too high won't improve the audio quality of your low-quality training set. If the training set audio quality is high, the noise floor is low, and there is sufficient duration, you can increase it.

What is the optimal number of layers in a neural network?

If the data is less complex and has fewer dimensions or features, then neural networks with 1 to 2 hidden layers would work. If the data has a large number of dimensions or features, then 3 to 5 hidden layers can be used to get an optimum solution.
