MaxPool2d

class brainstate.nn.MaxPool2d(kernel_size, stride=None, padding='VALID', channel_axis=-1, return_indices=False, name=None, in_size=None)

Applies a 2D max pooling over an input signal composed of several input planes.

In the simplest case, the output value of the layer with input size \((N, H, W, C)\), output \((N, H_{out}, W_{out}, C)\) and kernel_size \((kH, kW)\) can be precisely described as:

\[\begin{split}\begin{aligned} out(N_i, h, w, C_j) ={} & \max_{m=0, \ldots, kH-1} \max_{n=0, \ldots, kW-1} \\ & \text{input}(N_i, \text{stride[0]} \times h + m, \text{stride[1]} \times w + n, C_j) \end{aligned}\end{split}\]
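The formula above can be sketched as a naive NumPy loop. This is an illustration only, not brainstate's implementation: the function name `max_pool2d_reference` is hypothetical, and the sketch assumes an \((N, H, W, C)\) input with no padding.

```python
import numpy as np


def max_pool2d_reference(x, kernel_size, stride):
    """Naive max pooling over an (N, H, W, C) array, no padding (hypothetical helper)."""
    kh, kw = kernel_size
    sh, sw = stride
    n, h_in, w_in, c = x.shape
    # Number of windows that fit entirely inside the input.
    h_out = (h_in - kh) // sh + 1
    w_out = (w_in - kw) // sw + 1
    out = np.empty((n, h_out, w_out, c), dtype=x.dtype)
    for h in range(h_out):
        for w in range(w_out):
            # Window covering rows stride[0]*h + m and columns stride[1]*w + n.
            window = x[:, sh * h: sh * h + kh, sw * w: sw * w + kw, :]
            out[:, h, w, :] = window.max(axis=(1, 2))
    return out


x = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
y = max_pool2d_reference(x, kernel_size=(2, 2), stride=(2, 2))
print(y[0, :, :, 0])  # max of each 2x2 block: [[5, 7], [13, 15]]
```

The batch and channel axes are carried along untouched; only the two spatial axes are reduced.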

If padding is applied (that is, padding is not ‘VALID’), the input is implicitly padded with negative infinity, so padded positions can never be selected as the maximum. The constructor signature above does not expose a dilation argument, so the dilation terms in the shape formulas below can be read as 1.

Shape:
  • Input: \((N, H_{in}, W_{in}, C)\) or \((H_{in}, W_{in}, C)\)

  • Output: \((N, H_{out}, W_{out}, C)\) or \((H_{out}, W_{out}, C)\), where

    \[H_{out} = \left\lfloor\frac{H_{in} + 2 * \text{padding[0]} - \text{dilation[0]} \times (\text{kernel\_size[0]} - 1) - 1}{\text{stride[0]}} + 1\right\rfloor\]
    \[W_{out} = \left\lfloor\frac{W_{in} + 2 * \text{padding[1]} - \text{dilation[1]} \times (\text{kernel\_size[1]} - 1) - 1}{\text{stride[1]}} + 1\right\rfloor\]
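As a quick check of the shape formulas above, the following sketch evaluates them for one spatial dimension. The helper name `pool_output_size` is hypothetical; it assumes a symmetric integer padding per dimension, as the formula does, and defaults dilation to 1.

```python
def pool_output_size(in_size, kernel_size, stride, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper)."""
    # Integer floor division matches the floor in the formula.
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Input H=50, W=32 with kernel_size=(3, 2) and stride=(2, 1), no padding.
h_out = pool_output_size(50, kernel_size=3, stride=2)
w_out = pool_output_size(32, kernel_size=2, stride=1)
print(h_out, w_out)  # 24 31
```

These values match the spatial dimensions of the output shape in the Examples section.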
Parameters:
  • kernel_size (int | Sequence[int] | integer | Sequence[integer]) – An integer, or a sequence of integers defining the window to reduce over.

  • stride (int | Sequence[int]) – An integer, or a sequence of integers, representing the inter-window stride. Default: kernel_size

  • padding (str | int | Tuple[int, ...] | Sequence[Tuple[int, int]]) – Either the string ‘SAME’, the string ‘VALID’, or a sequence of n (low, high) integer pairs that give the padding to apply before and after each spatial dimension. Default: ‘VALID’

  • channel_axis (int | None) – Axis of the channel dimension, which is excluded from pooling. If None, there is no channel axis. Default: -1

  • return_indices (bool) – If True, will return the max indices along with the outputs. Useful for MaxUnpool2d. Default: False

  • name (str | None) – The object name.

  • in_size (int | Sequence[int] | integer | Sequence[integer] | None) – The shape of the input tensor.
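To illustrate how the string padding modes differ, here is a minimal sketch of the output size and (low, high) padding for ‘SAME’ along one dimension. It assumes brainstate follows the convention used by JAX for ‘SAME’ padding, which this page does not confirm; the helper name `same_padding` is hypothetical.

```python
def same_padding(in_size, kernel_size, stride):
    """Output size and (low, high) padding under the assumed JAX 'SAME' convention."""
    # 'SAME' targets ceil(in_size / stride) output positions.
    out_size = -(-in_size // stride)
    # Total padding needed so that the last window fits.
    total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    low = total // 2
    return out_size, (low, total - low)


# Same configuration as the Examples section: kernel (3, 2), stride (2, 1).
print(same_padding(50, kernel_size=3, stride=2))  # (25, (0, 1))
print(same_padding(32, kernel_size=2, stride=1))  # (32, (0, 1))
```

Under this assumption, ‘SAME’ would give a 25 x 32 spatial output for a 50 x 32 input, whereas the default ‘VALID’ gives 24 x 31 because only windows lying fully inside the input are used.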

Examples

>>> import brainstate
>>> # pool of square window of size=3, stride=2
>>> m = brainstate.nn.MaxPool2d(3, stride=2)
>>> # pool of non-square window
>>> m = brainstate.nn.MaxPool2d((3, 2), stride=(2, 1), channel_axis=-1)
>>> input = brainstate.random.randn(20, 50, 32, 16)
>>> output = m(input)
>>> output.shape
(20, 24, 31, 16)