LPPool3d

class brainstate.nn.LPPool3d(norm_type, kernel_size, stride=None, padding='VALID', channel_axis=-1, name=None, in_size=None)

Applies a 3D power-average pooling over an input signal composed of several input planes.

On each window, the function computed is the (normalized) power-mean:

\[f(X) = \left( \frac{1}{N} \sum_{x \in X} |x|^{p} \right)^{1/p}\]

where \(N\) is the number of elements in the window (\(N = \prod_i \text{kernel\_size}[i]\)).

As \(p \to \infty\), the result approaches max pooling. This is a limit only: \(p = \infty\) itself is not supported, and norm_type must be finite.
At \(p = 1\), the result is average pooling of the absolute values.
At \(p = 2\), the result is root-mean-square (RMS) pooling.
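
The special cases above can be checked on a single window. The following is a minimal pure-Python sketch of the formula, not the library's implementation; the helper name `power_mean` and the sample window are illustrative assumptions.

```python
def power_mean(window, p):
    """Normalized power-mean of a flat window: (mean(|x|^p))^(1/p)."""
    n = len(window)
    return (sum(abs(x) ** p for x in window) / n) ** (1.0 / p)

# Illustrative window with mixed signs; only absolute values matter.
window = [1.0, -2.0, 3.0, -4.0]

mean_abs = power_mean(window, 1)    # average of the absolute values
rms = power_mean(window, 2)         # root mean square
near_max = power_mean(window, 64)   # close to, but below, max(|x|) = 4.0
```

A large finite exponent such as 64 only approximates max pooling; the result stays slightly below the largest absolute value in the window.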

Note

This is a normalized power-mean (the sum is divided by the window size \(N\)). It therefore differs from PyTorch’s LPPool, which computes the unnormalized power-sum \(\left( \sum_{x \in X} |x|^{p} \right)^{1/p}\).
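
Since the two definitions differ only by the division by \(N\) inside the root, the unnormalized power-sum equals the normalized power-mean multiplied by \(N^{1/p}\). A small sketch of that relation, with hypothetical helper names `power_mean` and `power_sum` (neither is part of brainstate or PyTorch):

```python
def power_mean(window, p):
    """Normalized form used on this page: (sum(|x|^p) / N)^(1/p)."""
    n = len(window)
    return (sum(abs(x) ** p for x in window) / n) ** (1.0 / p)

def power_sum(window, p):
    """Unnormalized form described in the note: (sum(|x|^p))^(1/p)."""
    return sum(abs(x) ** p for x in window) ** (1.0 / p)

window = [1.0, -2.0, 3.0, -4.0]
p = 2.0
n = len(window)

normalized = power_mean(window, p)
unnormalized = power_sum(window, p)

# Rescaling by N^(1/p) converts the normalized value to the unnormalized one.
rescaled = normalized * n ** (1.0 / p)
```

When comparing outputs against an unnormalized implementation, this factor depends on the window size and on \(p\), so it has to be recomputed whenever kernel_size or norm_type changes.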

Shape:

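Input: \((B, D_{in}, H_{in}, W_{in}, C)\) or \((D_{in}, H_{in}, W_{in}, C)\).
Output: \((B, D_{out}, H_{out}, W_{out}, C)\) or \((D_{out}, H_{out}, W_{out}, C)\), where \(B\) is the optional batch dimension (distinct from the window size \(N\) above), \(C\) is the channel dimension, and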

\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - \text{kernel\_size}[0]}{\text{stride}[0]} + 1\right\rfloor\]

\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - \text{kernel\_size}[1]}{\text{stride}[1]} + 1\right\rfloor\]

\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - \text{kernel\_size}[2]}{\text{stride}[2]} + 1\right\rfloor\]
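
The three formulas share one pattern, which can be evaluated per spatial axis. The sketch below assumes symmetric integer padding per axis, with 0 standing in for 'VALID' (no padding); it does not cover 'SAME' or asymmetric (low, high) pairs. The helper name `pool_output_size` is hypothetical.

```python
def pool_output_size(in_size, kernel_size, stride, padding=0):
    """floor((in + 2 * padding - kernel) / stride + 1) for one spatial axis."""
    return (in_size + 2 * padding - kernel_size) // stride + 1

# Spatial sizes (D, H, W) with a non-cubic window and per-axis strides.
spatial_in = (50, 44, 31)
kernel = (3, 2, 2)
stride = (2, 1, 2)

spatial_out = tuple(
    pool_output_size(i, k, s) for i, k, s in zip(spatial_in, kernel, stride)
)
# spatial_out == (24, 43, 15)
```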

Parameters:

norm_type (float) – Exponent \(p\) of the power-mean. Must be finite. This argument is required: the signature above defines no default value.
kernel_size (int | Sequence[int] | integer | Sequence[integer]) – An integer, or a sequence of integers defining the window to reduce over.
stride (int | Sequence[int]) – An integer, or a sequence of integers, representing the inter-window stride. Default: kernel_size
padding (str | int | Tuple[int] | Sequence[Tuple[int, int]]) – Either the string ‘SAME’, the string ‘VALID’, or a sequence of n (low, high) integer pairs that give the padding to apply before and after each spatial dimension. Default: ‘VALID’
channel_axis (int | None) – Axis that holds the channels; pooling is not applied along it. If None, there is no channel axis. Default: -1
name (str | None) – The object name.
in_size (int | Sequence[int] | integer | Sequence[integer] | None) – The shape of the input tensor.

Examples

>>> import brainstate
>>> # power-average pooling with a cubic window of size=3, stride=2
>>> m = brainstate.nn.LPPool3d(2, 3, stride=2)
>>> # pooling with a non-cubic window and norm_type=1.5
>>> m = brainstate.nn.LPPool3d(1.5, (3, 2, 2), stride=(2, 1, 2), channel_axis=-1)
>>> input = brainstate.random.randn(20, 50, 44, 31, 16)
>>> output = m(input)
>>> output.shape
(20, 24, 43, 15, 16)
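
For intuition about what the layer computes on each channel, here is a naive pure-Python reference for a single-channel volume with no padding. It is a sketch for small inputs only and makes no claim about how brainstate implements the operation; the function name `lp_pool3d_valid` is hypothetical.

```python
from itertools import product

def lp_pool3d_valid(volume, p, kernel, stride):
    """Naive power-average pooling over a D x H x W nested list (no padding)."""
    in_shape = (len(volume), len(volume[0]), len(volume[0][0]))
    out_shape = tuple(
        (n - k) // s + 1 for n, k, s in zip(in_shape, kernel, stride)
    )
    n_elems = kernel[0] * kernel[1] * kernel[2]  # window size N
    out = [
        [[0.0] * out_shape[2] for _ in range(out_shape[1])]
        for _ in range(out_shape[0])
    ]
    for od, oh, ow in product(*(range(n) for n in out_shape)):
        # Top-left-front corner of the current window.
        d0, h0, w0 = od * stride[0], oh * stride[1], ow * stride[2]
        total = 0.0
        for kd, kh, kw in product(*(range(k) for k in kernel)):
            total += abs(volume[d0 + kd][h0 + kh][w0 + kw]) ** p
        out[od][oh][ow] = (total / n_elems) ** (1.0 / p)
    return out

# A 4 x 4 x 4 volume filled with the constant -3.0.
volume = [[[-3.0] * 4 for _ in range(4)] for _ in range(4)]
pooled = lp_pool3d_valid(volume, p=2.0, kernel=(2, 2, 2), stride=(2, 2, 2))
```

Because the power-mean is normalized by the window size, a constant input of magnitude 3.0 pools to 3.0 in every output cell, whatever the kernel size.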
