MaxPool3d
- class brainstate.nn.MaxPool3d(kernel_size, stride=None, padding='VALID', channel_axis=-1, return_indices=False, name=None, in_size=None)
Applies a 3D max pooling over an input signal composed of several input planes.
In the simplest case, the output value of the layer with input size \((N, D, H, W, C)\), output \((N, D_{out}, H_{out}, W_{out}, C)\) and kernel_size \((kD, kH, kW)\) can be precisely described as:
\[\begin{split}\begin{aligned} \text{out}(N_i, d, h, w, C_j) ={} & \max_{k=0, \ldots, kD-1} \max_{m=0, \ldots, kH-1} \max_{n=0, \ldots, kW-1} \\ & \text{input}(N_i, \text{stride[0]} \times d + k, \text{stride[1]} \times h + m, \text{stride[2]} \times w + n, C_j) \end{aligned}\end{split}\]
If padding is non-zero, then the input is implicitly padded with negative infinity on both sides for padding number of points. dilation controls the spacing between the kernel points. The signature above does not expose a dilation argument, so the shape formulas below can be read with dilation equal to 1.
- Shape:
Input: \((N, D_{in}, H_{in}, W_{in}, C)\) or \((D_{in}, H_{in}, W_{in}, C)\).
Output: \((N, D_{out}, H_{out}, W_{out}, C)\) or \((D_{out}, H_{out}, W_{out}, C)\), where
\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]
\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - \text{dilation}[2] \times (\text{kernel\_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\]
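The shape formulas above can be checked with a few lines of plain Python. This is a minimal sketch, not part of brainstate: the helper name pool_output_size is hypothetical, and it assumes symmetric integer padding and a per-axis dilation exactly as written in the formulas.

```python
def pool_output_size(in_size, kernel_size, stride, padding=0, dilation=1):
    """Output length along one spatial axis, following the floor formula above.

    Hypothetical helper for illustration; it is not a brainstate function.
    """
    # Span covered by one (possibly dilated) window along this axis.
    effective_kernel = dilation * (kernel_size - 1) + 1
    # Integer floor division implements the floor in the formula.
    return (in_size + 2 * padding - effective_kernel) // stride + 1


# Spatial input (D, H, W) = (50, 44, 31), kernel (3, 2, 2), stride (2, 1, 2),
# no padding and no dilation.
out_shape = tuple(
    pool_output_size(i, k, s)
    for i, k, s in zip((50, 44, 31), (3, 2, 2), (2, 1, 2))
)
print(out_shape)  # (24, 43, 15)
```

The batch and channel axes are not pooled, so they pass through unchanged.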
- Parameters:
kernel_size (int | Sequence[int] | integer | Sequence[integer]) – An integer, or a sequence of integers defining the window to reduce over.
stride (int | Sequence[int]) – An integer, or a sequence of integers, representing the inter-window stride. Default: kernel_size
padding (str | int | Tuple[int] | Sequence[Tuple[int, int]]) – Either the string 'SAME', the string 'VALID', or a sequence of n (low, high) integer pairs that give the padding to apply before and after each spatial dimension. Default: 'VALID'
channel_axis (int | None) – Axis of the spatial channels for which pooling is skipped. If None, there is no channel axis. Default: -1
return_indices (bool) – If True, will return the max indices along with the outputs. Useful for MaxUnpool3d. Default: False
in_size (int | Sequence[int] | integer | Sequence[integer] | None) – The shape of the input tensor.
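The two string values of padding can be illustrated with a small sketch. It assumes that 'SAME' and 'VALID' follow the convention used by JAX and TensorFlow, which the text above does not confirm, and the helper name resolve_padding is hypothetical.

```python
def resolve_padding(in_size, kernel_size, stride, mode):
    """Return ((low, high), out_size) for one spatial axis.

    Hypothetical helper; assumes the JAX/TensorFlow meaning of the mode strings.
    """
    if mode == 'VALID':
        # No padding: only windows that fit entirely inside the input are used.
        return (0, 0), (in_size - kernel_size) // stride + 1
    if mode == 'SAME':
        # Output length is ceil(in_size / stride), written with integer division.
        out_size = -(-in_size // stride)
        total = max((out_size - 1) * stride + kernel_size - in_size, 0)
        # Any odd leftover padding goes on the high side.
        return (total // 2, total - total // 2), out_size
    raise ValueError(f"unknown padding mode: {mode!r}")


valid = resolve_padding(44, 3, 2, 'VALID')
same = resolve_padding(44, 3, 2, 'SAME')
print(valid)  # ((0, 0), 21)
print(same)   # ((0, 1), 22)
```

An explicit sequence of (low, high) pairs takes the place of the pair computed here, one pair per spatial dimension.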
Examples
>>> import brainstate
>>> # pool of square window of size=3, stride=2
>>> m = brainstate.nn.MaxPool3d(3, stride=2)
>>> # pool of non-square window
>>> m = brainstate.nn.MaxPool3d((3, 2, 2), stride=(2, 1, 2), channel_axis=-1)
>>> input = brainstate.random.randn(20, 50, 44, 31, 16)
>>> output = m(input)
>>> output.shape
(20, 24, 43, 15, 16)
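The max formula in the description can also be written out directly in plain Python. This is a naive reference sketch for a single sample and a single channel, stored as a nested list shaped (D, H, W), with no padding and no dilation. The function name max_pool3d_reference is hypothetical, and the code does not reflect how brainstate implements the layer.

```python
from itertools import product


def max_pool3d_reference(volume, kernel_size, stride):
    """Naive max pooling over a nested list shaped (D, H, W), without padding."""
    kD, kH, kW = kernel_size
    sD, sH, sW = stride
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    # Output extents when only fully contained windows are used.
    Do, Ho, Wo = (D - kD) // sD + 1, (H - kH) // sH + 1, (W - kW) // sW + 1
    out = [[[None] * Wo for _ in range(Ho)] for _ in range(Do)]
    for d, h, w in product(range(Do), range(Ho), range(Wo)):
        # out(d, h, w) = max over (k, m, n) of input(sD*d + k, sH*h + m, sW*w + n)
        out[d][h][w] = max(
            volume[sD * d + k][sH * h + m][sW * w + n]
            for k, m, n in product(range(kD), range(kH), range(kW))
        )
    return out


# A 2 x 4 x 4 volume holding 0..31 in row-major order.
volume = [[[d * 16 + h * 4 + w for w in range(4)] for h in range(4)] for d in range(2)]
pooled = max_pool3d_reference(volume, kernel_size=(2, 2, 2), stride=(2, 2, 2))
print(pooled)  # [[[21, 23], [29, 31]]]
```

Each output entry is the largest value in its 2 x 2 x 2 block, which in this row-major layout is the block's far corner.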