MAX POOLING (2024)

The pooling operation slides a two-dimensional filter over each channel of a feature map and summarises the features lying within the region covered by the filter.
For a feature map with dimensions nh x nw x nc, the output of a pooling layer (with no padding) has dimensions

(floor((nh - f)/s) + 1) x (floor((nw - f)/s) + 1) x nc

where

nh - height of feature map
nw - width of feature map
nc - number of channels in the feature map
f - size of filter
s - stride length
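
As a quick check of the output-size rule, here is a minimal sketch that assumes the usual no-padding convention, floor((n - f)/s) + 1 along each spatial axis. The function name pooled_shape is hypothetical and chosen only for this illustration.

```python
def pooled_shape(nh, nw, nc, f, s):
    """Output shape of an f x f pooling layer with stride s and no padding."""
    if f > nh or f > nw:
        raise ValueError("filter must fit inside the feature map")
    out_h = (nh - f) // s + 1  # floor division drops any partial window
    out_w = (nw - f) // s + 1
    return out_h, out_w, nc    # pooling never changes the channel count


print(pooled_shape(4, 4, 1, f=2, s=2))    # (2, 2, 1)
print(pooled_shape(32, 32, 3, f=3, s=2))  # (15, 15, 3)
```

A 4 x 4 x 1 map pooled with f = 2 and s = 2 gives 2 x 2 x 1, which matches the Keras example later in this article.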

A common CNN model architecture is to have a number of convolution and pooling layers stacked one after the other.

Why Pooling Layers?

  • Pooling layers reduce the spatial dimensions of the feature maps. This reduces the number of parameters to learn in the layers that follow and the amount of computation performed in the network.
  • The pooling layer summarises the features present in a region of the feature map generated by a convolution layer. So, further operations are performed on summarised features instead of precisely positioned features generated by the convolution layer. This makes the model more robust to variations in the position of the features in the input image.
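
To make the parameter saving concrete, the sketch below counts the weights of a dense layer fed by a flattened feature map, before and after a 2x2 pooling layer with stride 2. The 28 x 28 x 64 feature map and the 10 output units are hypothetical numbers picked for illustration, and biases are ignored.

```python
def dense_weight_count(nh, nw, nc, units):
    """Weights in a dense layer fed by a flattened nh x nw x nc feature map (biases ignored)."""
    return nh * nw * nc * units


before = dense_weight_count(28, 28, 64, units=10)  # no pooling
after = dense_weight_count(14, 14, 64, units=10)   # after 2x2 pooling with stride 2

print(before, after, before // after)  # 501760 125440 4
```

The pooling layer itself has nothing to learn. The saving comes from the smaller input seen by the layers after it.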

Types of Pooling:

  1. Max Pooling
  2. Average Pooling
  3. Global Pooling

Max Pooling

Max pooling is a pooling operation that selects the maximum element from the region of the feature map covered by the filter. Thus, the output of a max-pooling layer is a feature map containing the most prominent features of the previous feature map.

Average Pooling

Average pooling computes the average of the elements present in the region of the feature map covered by the filter. Thus, while max pooling gives the most prominent feature in a particular patch of the feature map, average pooling gives the average of the features present in a patch.
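
The difference between the two operations can be sketched with one generic NumPy helper that slides a window and applies either np.max or np.mean to each patch. The helper name pool2d is hypothetical, and the 4x4 input is the same one used in the Keras examples in this article.

```python
import numpy as np


def pool2d(x, f, s, reduce_fn):
    """Slide an f x f window with stride s over a 2-D array and reduce each patch."""
    out_h = (x.shape[0] - f) // s + 1
    out_w = (x.shape[1] - f) // s + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = reduce_fn(patch)
    return out


x = np.array([[2, 2, 7, 3],
              [9, 4, 6, 1],
              [8, 5, 2, 4],
              [3, 1, 2, 6]])

max_out = pool2d(x, 2, 2, np.max)   # [[9, 7], [8, 6]]
avg_out = pool2d(x, 2, 2, np.mean)  # [[4.25, 4.25], [4.25, 3.5]]
print(max_out)
print(avg_out)
```

The explicit loops are slow but keep the window arithmetic visible. A framework layer would normally be used in practice.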

Global Pooling

Global pooling reduces each channel in the feature map to a single value. Thus, an nh x nw x nc feature map is reduced to a 1 x 1 x nc feature map. This is equivalent to using a filter of dimensions nh x nw, i.e. the dimensions of the feature map.
It can be either global max pooling or global average pooling.
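
In NumPy terms, global pooling is a reduction over the two spatial axes. The sketch below assumes a height x width x channels layout, and the small 2 x 2 x 3 feature map is made up for illustration.

```python
import numpy as np

# Hypothetical 2 x 2 x 3 feature map (height x width x channels).
fmap = np.array([[[1, 10, 0], [2, 20, 0]],
                 [[3, 30, 0], [4, 40, 8]]], dtype=float)

global_max = fmap.max(axis=(0, 1))   # one maximum per channel
global_avg = fmap.mean(axis=(0, 1))  # one mean per channel

print(global_max)  # [ 4. 40.  8.]
print(global_avg)  # [ 2.5 25.   2. ]
```

Each result has one entry per channel, which corresponds to the 1 x 1 x nc output described above with the singleton axes dropped.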

MaxPooling is a down-sampling operation often used in Convolutional Neural Networks (CNNs) to reduce the spatial dimensions of the input volume. It is a form of pooling layer, and it helps in retaining the most important information while discarding less important details. MaxPooling is typically applied after convolutional layers in a CNN.

The basic idea behind MaxPooling is to divide the input image into non-overlapping rectangular regions and, for each region, output the maximum value. This operation is performed independently for each channel in the input.

Here’s a simple explanation of how MaxPooling works:

Input Region:

  • The input image is divided into small regions (usually 2x2 or 3x3).
  • For each region, the maximum value is computed.

Output Feature Map:

  • The maximum value for each region is taken and forms the output of that region.
  • The result is a down-sampled version of the input, with reduced spatial dimensions.

Mathematically, if we denote the input as X and the output as Y, the MaxPooling operation can be defined as:

Y[i,j,k]=max(X[2i:2i+2,2j:2j+2,k])

where i and j index the height and width of the output, and k indexes the channels. This form assumes a 2x2 window with a stride of 2.
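
The formula can be written almost literally in NumPy. The following is a minimal sketch for a height x width x channels array with even height and width. The function name max_pool_2x2 and the test input are assumptions made for this example.

```python
import numpy as np


def max_pool_2x2(x):
    """Y[i, j, k] = max(X[2i:2i+2, 2j:2j+2, k]) for an H x W x C array with even H and W."""
    h, w, c = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Split each spatial axis into (block index, position inside the block).
    blocks = x.reshape(h // 2, 2, w // 2, 2, c)
    return blocks.max(axis=(1, 3))


x = np.arange(32).reshape(4, 4, 2)  # 4 x 4 input with 2 channels
y = max_pool_2x2(x)
print(y.shape)  # (2, 2, 2)
```

Because the reduction runs only over the two within-block axes, every channel is pooled independently, as the text states.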

Common choices for the size of the pooling window are 2x2 or 3x3, and the stride (the step size when moving the pooling window) is often set to be equal to the size of the window for non-overlapping pooling.
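
The effect of the window size and stride can be compared with a short NumPy sketch. It assumes NumPy 1.20 or newer for sliding_window_view. The helper name max_pool and the 7x7 input are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def max_pool(x, f, s):
    """Max pooling of a 2-D array with an f x f window and stride s (no padding)."""
    windows = sliding_window_view(x, (f, f))  # shape: (H-f+1, W-f+1, f, f)
    return windows[::s, ::s].max(axis=(2, 3))


x = np.arange(49).reshape(7, 7)

non_overlapping = max_pool(x, f=2, s=2)  # stride equals the window size
overlapping = max_pool(x, f=3, s=2)      # stride smaller than the window size

print(non_overlapping)
print(overlapping)
```

Both settings produce a 3 x 3 output here. With the 2x2 window the last row and column of the 7x7 input are never covered, while neighbouring 3x3 windows share one row or column.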

The following Keras example applies a single 2x2 max pooling layer with a stride of 2 to a 4x4 input:

import numpy as np
from keras.models import Sequential
from keras.layers import MaxPooling2D

# define input image (float32 is the dtype Keras layers work with)
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]], dtype=np.float32)
# reshape to (batch, height, width, channels)
image = image.reshape(1, 4, 4, 1)

# define model containing just a single max pooling layer
model = Sequential([MaxPooling2D(pool_size=2, strides=2)])

# generate pooled output
output = model.predict(image)

# print output image
output = np.squeeze(output)
print(output)

Output:

[[9. 7.]
 [8. 6.]]

Let’s go through a simple example of MaxPooling with a 2x2 pooling window. Consider a small 4x4 input matrix:

X = [[ 1,  3,  2,  4],
     [ 5,  6,  7,  8],
     [ 9, 10, 11, 12],
     [13, 14, 15, 16]]

Now, let’s apply 2x2 MaxPooling to this input matrix. The pooling operation involves moving a 2x2 window across the input and, for each window, taking the maximum value. The output matrix, Y, will have reduced spatial dimensions.

Y[i,j]=max(X[2i:2i+2,2j:2j+2])

Let’s calculate Y step by step:

  1. For i=0 and j=0:

Y[0,0]=max(X[0:2,0:2])

=max([[1, 3],
      [5, 6]])=6

  2. For i=0 and j=1:

Y[0,1]=max(X[0:2,2:4])

=max([[2, 4],
      [7, 8]])=8

  3. For i=1 and j=0:

Y[1,0]=max(X[2:4,0:2])

=max([[ 9, 10],
      [13, 14]])=14

  4. For i=1 and j=1:

Y[1,1]=max(X[2:4,2:4])

=max([[11, 12],
      [15, 16]])=16

The resulting output matrix Y is:

Y = [[ 6,  8],
     [14, 16]]
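
The four steps can be checked with a few lines of NumPy. The input matrix below is an assumption: it is read off from the four patches used in the steps, so that the patch maxima are 6, 8, 14 and 16.

```python
import numpy as np

# Input matrix as implied by the four patches in the worked example (an assumption).
X = np.array([[ 1,  3,  2,  4],
              [ 5,  6,  7,  8],
              [ 9, 10, 11, 12],
              [13, 14, 15, 16]])

# Y[i, j] = max(X[2i:2i+2, 2j:2j+2]) for each of the four windows.
Y = np.array([[X[2 * i:2 * i + 2, 2 * j:2 * j + 2].max() for j in range(2)]
              for i in range(2)])
print(Y)
```

Row i of Y collects the maxima for the two windows in the i-th band of rows, so Y[0,1] is 8 and Y[1,0] is 14.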

Max pooling offers several benefits in the context of CNNs:

  • Feature Invariance: Max pooling makes the model more tolerant of small shifts in the location of features. This means that the network can still recognize a feature when it moves by a few pixels within the image. Pooling does not, by itself, make the model invariant to the orientation of features.
  • Dimensionality Reduction: By downsampling the input, max pooling significantly reduces the number of parameters and computations in the network, thus speeding up the learning process and reducing the risk of overfitting.
  • Noise Suppression: Max pooling helps to suppress noise in the input data. By taking the maximum value within the window, it emphasizes the presence of strong features and diminishes the weaker ones.
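
The invariance to small shifts can be illustrated with a toy NumPy sketch. The single-pixel "features" and the helper max_pool_2x2 are made up for this example.

```python
import numpy as np


def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling of a 2-D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


a = np.zeros((4, 4))
a[0, 0] = 1.0  # feature in the top-left corner of a pooling window

b = np.zeros((4, 4))
b[1, 1] = 1.0  # same feature shifted by one pixel, still inside that window

c = np.zeros((4, 4))
c[2, 2] = 1.0  # feature moved into a different pooling window

same_window = np.array_equal(max_pool_2x2(a), max_pool_2x2(b))
other_window = np.array_equal(max_pool_2x2(a), max_pool_2x2(c))
print(same_window, other_window)  # True False
```

The pooled output is unchanged only while the shift stays inside one pooling window, which is why the invariance is limited to small displacements.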

In practice, max pooling layers are placed after convolutional layers in a CNN. After a convolutional layer extracts features from the input image, the max pooling layer reduces the spatial size of the convolved feature map, keeping only the most salient information. This process is repeated for multiple convolutional and pooling layers, allowing the network to learn a hierarchy of features at various levels of abstraction.

Max pooling is a simple yet effective technique that has been instrumental in the success of CNNs in various applications, particularly in image and video recognition tasks. Its ability to reduce the computational burden while maintaining the essential features has made it a staple component in deep learning architectures.

Despite its benefits, max pooling is not without its challenges. One criticism is that it can sometimes be too aggressive, discarding potentially useful information that could be important for the classification task. Moreover, max pooling is a fixed operation and does not learn from the data, unlike convolutional layers that have learnable parameters.

As a result, some modern CNN architectures have started to move away from traditional max pooling layers, using alternatives like strided convolutions for downsampling or incorporating learnable pooling operations that can adapt to the data.
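
To show how a strided convolution downsamples, here is a minimal single-channel NumPy sketch. It assumes a square kernel and no padding. The function name strided_conv2d is hypothetical, and the fixed averaging kernel only stands in for weights that a network would learn.

```python
import numpy as np


def strided_conv2d(x, kernel, stride):
    """Valid cross-correlation of a 2-D input with a square 2-D kernel, sampled every `stride` pixels."""
    f = kernel.shape[0]
    out_h = (x.shape[0] - f) // stride + 1
    out_w = (x.shape[1] - f) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + f, j * stride:j * stride + f]
            out[i, j] = np.sum(patch * kernel)
    return out


x = np.array([[2., 2., 7., 3.],
              [9., 4., 6., 1.],
              [8., 5., 2., 4.],
              [3., 1., 2., 6.]])

# A fixed averaging kernel stands in for weights that would normally be learned.
kernel = np.full((2, 2), 0.25)
y = strided_conv2d(x, kernel, stride=2)
print(y.shape)  # (2, 2)
```

With this particular kernel the result equals 2x2 average pooling. A trained layer is free to choose other weights, which is the flexibility a fixed pooling operation lacks.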

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(filters=64, kernel_size=3),  # activation is None
    layers.MaxPool2D(pool_size=2),
    # More layers follow
])

A MaxPool2D layer is much like a Conv2D layer, except that it uses a simple maximum function instead of a kernel, with the pool_size parameter analogous to kernel_size. Unlike a convolutional layer, however, a MaxPool2D layer has no trainable weights.

Let’s take another look at the extraction figure from the last lesson. Remember that MaxPool2D is the Condense step.

[Figure: the feature extraction steps from the last lesson, with MaxPool2D as the Condense step]

Notice that after applying the ReLU function (Detect) the feature map ends up with a lot of “dead space,” that is, large areas containing only 0’s (the black areas in the image). Having to carry these 0 activations through the entire network would increase the size of the model without adding much useful information. Instead, we would like to condense the feature map to retain only the most useful part — the feature itself.

This in fact is what maximum pooling does. Max pooling takes a patch of activations in the original feature map and replaces them with the maximum activation in that patch.

When applied after the ReLU activation, it has the effect of “intensifying” features. The pooling step increases the proportion of active pixels to zero pixels.
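
The change in the proportion of active pixels can be measured directly. The sketch below uses a made-up 8x8 feature map with a short diagonal stroke, and the helper max_pool_2x2 is an assumption for this example.

```python
import numpy as np


def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling of a 2-D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


# Hypothetical pre-activation feature map: mostly negative, with a short diagonal stroke.
pre = -np.ones((8, 8))
for k in range(2, 6):
    pre[k, k + 1] = 3.0

relu = np.maximum(pre, 0.0)  # Detect: negative values become 0
pooled = max_pool_2x2(relu)  # Condense

active_before = np.count_nonzero(relu) / relu.size     # 4 of 64 pixels
active_after = np.count_nonzero(pooled) / pooled.size  # 4 of 16 pixels
print(active_before, active_after)  # 0.0625 0.25
```

The stroke survives pooling while most of the zero-valued area is discarded, so the share of active pixels rises.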

Translation Invariance

We called the zero-pixels “unimportant”. Does this mean they carry no information at all? In fact, the zero-pixels carry positional information. The blank space still positions the feature within the image. When MaxPool2D removes some of these pixels, it removes some of the positional information in the feature map. This gives a convnet a property called translation invariance. This means that a convnet with maximum pooling will tend not to distinguish features by their location in the image. ("Translation" is the mathematical word for changing the position of something without rotating it or changing its shape or size.)

Watch what happens when we repeatedly apply maximum pooling to the following feature map.

[Figure: a feature map with two nearby dots, before and after repeated max pooling]

The two dots in the original image became indistinguishable after repeated pooling. In other words, pooling destroyed some of their positional information. Since the network can no longer distinguish between them in the feature maps, it can’t distinguish them in the original image either: it has become invariant to that difference in position.

In fact, pooling only creates translation invariance in a network over small distances, as with the two dots in the image. Features that begin far apart will remain distinct after pooling; only some of the positional information was lost, but not all of it.
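
The behaviour of nearby and distant features under repeated pooling can be reproduced with a small NumPy sketch. The 16x16 maps, the dot positions and the helper names are all made up for illustration.

```python
import numpy as np


def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling of a 2-D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def pool_times(x, n):
    """Apply 2x2 max pooling n times in a row."""
    for _ in range(n):
        x = max_pool_2x2(x)
    return x


near = np.zeros((16, 16))
near[4, 4] = 1.0
near[5, 6] = 1.0   # second dot a couple of pixels away

far = np.zeros((16, 16))
far[4, 4] = 1.0
far[12, 12] = 1.0  # second dot far from the first

# Number of distinct active pixels after 0, 1, 2 and 3 rounds of pooling.
counts_near = [int(np.count_nonzero(pool_times(near, n))) for n in range(4)]
counts_far = [int(np.count_nonzero(pool_times(far, n))) for n in range(4)]
print(counts_near)  # [2, 2, 1, 1]
print(counts_far)   # [2, 2, 2, 2]
```

The nearby dots merge into a single pixel after two rounds, while the distant dots are still separate after three.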

This invariance to small differences in the positions of features is a nice property for an image classifier to have. Just because of differences in perspective or framing, the same kind of feature might be positioned in various parts of the original image, but we would still like for the classifier to recognize that they are the same.

Other Pooling Layers

Average pooling can be applied to the same 4x4 input with the AveragePooling2D layer:

import numpy as np
from keras.models import Sequential
from keras.layers import AveragePooling2D

# define input image (float32 is the dtype Keras layers work with)
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]], dtype=np.float32)
# reshape to (batch, height, width, channels)
image = image.reshape(1, 4, 4, 1)

# define model containing just a single average pooling layer
model = Sequential([AveragePooling2D(pool_size=2, strides=2)])

# generate pooled output
output = model.predict(image)

# print output image
output = np.squeeze(output)
print(output)

Output:

[[4.25 4.25]
 [4.25 3.5 ]]

Global max pooling and global average pooling are available as GlobalMaxPooling2D and GlobalAveragePooling2D:

import numpy as np
from keras.models import Sequential
from keras.layers import GlobalMaxPooling2D, GlobalAveragePooling2D

# define input image (float32 is the dtype Keras layers work with)
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]], dtype=np.float32)
# reshape to (batch, height, width, channels)
image = image.reshape(1, 4, 4, 1)

# define gm_model containing just a single global-max pooling layer
gm_model = Sequential([GlobalMaxPooling2D()])

# define ga_model containing just a single global-average pooling layer
ga_model = Sequential([GlobalAveragePooling2D()])

# generate pooled output
gm_output = gm_model.predict(image)
ga_output = ga_model.predict(image)

# print output image
gm_output = np.squeeze(gm_output)
ga_output = np.squeeze(ga_output)
print("gm_output: ", gm_output)
print("ga_output: ", ga_output)

For this input, the global maximum is 9 and the global average is 65/16 = 4.0625.

This concludes the basic overview of the max pooling layer in CNN architectures.

Article information

Author: Lakeisha Bayer VM
