ML Practicum: Image Classification

Check Your Understanding: Convolution

A two-dimensional, 3x3 convolutional filter is applied to a two-dimensional 4x4 input feature map (no padding added).

What is the shape of the output feature map?
As the 3x3 filter slides over the 4x4 feature map, there are 4 unique locations in which it can be placed, which results in a 2x2 output feature map.

[Animation: a 3x3 convolutional filter sliding over a 4x4 feature map. There are 4 unique positions where the 3x3 filter can be placed, each corresponding to one of 4 elements in the 2x2 output feature map.]
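The sliding described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a stride of 1 and no padding; the function names, the sample feature map values, and the diagonal filter are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def valid_positions(input_size, filter_size):
    """Top-left (row, col) positions where the filter fits entirely
    inside the input, assuming stride 1 and no padding."""
    span = input_size - filter_size + 1
    return [(r, c) for r in range(span) for c in range(span)]


def convolve_valid(feature_map, kernel):
    """Slide the kernel over the feature map and sum the elementwise
    products at each valid position (no padding, stride 1)."""
    n, k = len(feature_map), len(kernel)
    out_size = n - k + 1
    output = [[0] * out_size for _ in range(out_size)]
    for r, c in valid_positions(n, k):
        output[r][c] = sum(
            feature_map[r + i][c + j] * kernel[i][j]
            for i in range(k)
            for j in range(k)
        )
    return output


# Hypothetical 4x4 input feature map and 3x3 filter.
feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

positions = valid_positions(4, 3)
output = convolve_valid(feature_map, kernel)

print(positions)                    # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(len(output), len(output[0]))  # 2 2
print(output)                       # [[18, 21], [30, 33]]
```

Each of the 4 positions contributes exactly one element of the output, which is why the result is 2x2.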
While the filter itself is 3x3, the output feature map is smaller: a 3x3 output would require 9 (3 times 3) distinct filter positions, but there are only 4 locations where the filter can be placed on the 4x4 input feature map.
To generate an output feature map with the same dimensions as the input feature map with no padding, the convolutional filter would have to be 1x1 in shape. A filter larger than 1x1 will produce an output feature map that is smaller than the input feature map. Because our filter is 3x3, the output feature map must be smaller than 4x4.
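The relationship between filter size and output size can be checked with a short calculation. The sketch below assumes a square input, a square filter, a stride of 1, and no padding, in which case each output side is the input side minus the filter side plus 1; the helper name `output_size` is hypothetical.

```python
def output_size(input_size, filter_size):
    """Side length of the output feature map for a square input and
    square filter, assuming stride 1 and no padding."""
    if filter_size < 1 or filter_size > input_size:
        raise ValueError("filter must be between 1 and the input size")
    return input_size - filter_size + 1


# Compare several filter sizes on a 4x4 input feature map.
for k in (1, 2, 3, 4):
    side = output_size(4, k)
    print(f"{k}x{k} filter -> {side}x{side} output")
# 1x1 filter -> 4x4 output
# 2x2 filter -> 3x3 output
# 3x3 filter -> 2x2 output
# 4x4 filter -> 1x1 output
```

Only the 1x1 filter preserves the 4x4 shape; every larger filter shrinks the output.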