FeatureAlphaDropout

class brainstate.nn.FeatureAlphaDropout(prob=0.5, channel_axis=-1, name=None)

Randomly masks out entire channels with Alpha Dropout properties.

Instead of setting activations to zero as in regular Dropout, the activations are set to the negative saturation value of the SELU activation function to maintain self-normalizing properties.
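To make the "negative saturation value" concrete, the following is a minimal sketch using only the Python standard library. It assumes the standard SELU constants from the self-normalizing networks literature; the names SELU_ALPHA, SELU_SCALE, SATURATION and selu are illustrative and are not part of brainstate.

```python
import math

# Standard SELU constants (assumed here; not read from brainstate).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """SELU activation for a single scalar (illustrative helper)."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


# As x -> -inf, exp(x) -> 0, so SELU approaches -scale * alpha.
SATURATION = -SELU_SCALE * SELU_ALPHA

print(SATURATION)    # roughly -1.758
print(selu(-50.0))   # very close to SATURATION
```

Masked activations are placed at this value (before the scale and shift) rather than at zero, which is what distinguishes alpha dropout from regular dropout.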

Each channel (e.g., the \(j\)-th channel of the \(i\)-th sample in the batched input is the tensor \(\text{input}[i, j]\)) is masked independently for each sample on every forward call, using samples from a Bernoulli distribution. Because prob is the probability of keeping, a channel is masked with probability \(1 - \text{prob}\). The channels to be masked are re-drawn on every forward call, and the output is scaled and shifted to maintain zero mean and unit variance.
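The channel-wise masking and the scale-and-shift step can be sketched in plain Python as follows. This is not brainstate's implementation: it assumes the standard alpha-dropout affine correction, treats keep_prob as the probability of keeping a channel (as the prob parameter is documented), and works on nested lists of shape (N, C, L). The helper names affine_coeffs and feature_alpha_dropout are hypothetical.

```python
import math
import random

# SELU negative saturation value, -scale * alpha (standard constants assumed).
ALPHA_PRIME = -1.7580993408473766


def affine_coeffs(keep_prob):
    """Return (a, b) that keep zero-mean, unit-variance input so in expectation."""
    drop = 1.0 - keep_prob
    a = 1.0 / math.sqrt(keep_prob + ALPHA_PRIME ** 2 * keep_prob * drop)
    b = -a * ALPHA_PRIME * drop
    return a, b


def feature_alpha_dropout(x, keep_prob, rng):
    """Apply the sketch to x, a nested list of shape (N, C, L)."""
    a, b = affine_coeffs(keep_prob)
    out = []
    for sample in x:
        new_sample = []
        for channel in sample:
            # One Bernoulli draw per (sample, channel), shared by the whole channel.
            keep = rng.random() < keep_prob
            if keep:
                new_sample.append([a * v + b for v in channel])
            else:
                # The entire channel is set to the saturation value, then rescaled.
                new_sample.append([a * ALPHA_PRIME + b] * len(channel))
        out.append(new_sample)
    return out


rng = random.Random(0)
x = [[[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)] for _ in range(2)]
y = feature_alpha_dropout(x, keep_prob=0.8, rng=rng)
```

A masked channel is constant across all of its positions, whereas element-wise alpha dropout would mask individual positions within a channel. The coefficients a and b depend only on the keep probability, so they can be computed once per layer.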

Usually the input comes from convolutional layers with SELU activation.

As described in the paper [2], if adjacent pixels within feature maps are strongly correlated (as is normally the case in early convolution layers), then i.i.d. dropout will not regularize the activations and will instead just result in an effective learning rate decrease.

In this case, FeatureAlphaDropout will help promote independence between feature maps and should be used instead.

Parameters:
  • prob (float) – Probability of keeping a channel; each channel is kept or masked as a whole. Default is 0.5.

  • channel_axis (int) – The axis representing the channel dimension. Default is -1.

  • name (str | None) – The name of the dynamic system.

Notes

Input shape: \((N, C, *)\) where C is the channel dimension.

Output shape: Same shape as input.

References

Examples

>>> import brainstate
>>> m = brainstate.nn.FeatureAlphaDropout(prob=0.8)
>>> x = brainstate.random.randn(20, 16, 4, 32, 32)
>>> with brainstate.environ.context(fit=True):
...     output = m(x)
>>> output.shape
(20, 16, 4, 32, 32)