FeatureAlphaDropout
- class brainstate.nn.FeatureAlphaDropout(prob=0.5, channel_axis=-1, name=None)
Randomly masks out entire channels with Alpha Dropout properties.
Instead of setting activations to zero as in regular Dropout, the activations are set to the negative saturation value of the SELU activation function to maintain self-normalizing properties.
Each channel (e.g., the \(j\)-th channel of the \(i\)-th sample in the batch input, which is the tensor \(\text{input}[i, j]\)) is masked independently for each sample on every forward call, with a probability set by prob, using samples from a Bernoulli distribution. The elements to be masked are re-drawn on every forward call, and the output is scaled and shifted to maintain zero mean and unit variance.
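The channel-wise masking and the scale-and-shift step can be illustrated with a minimal NumPy sketch. This is not brainstate's implementation: the function name feature_alpha_dropout_sketch and its drop_prob and rng arguments are hypothetical, the batch axis is assumed to be axis 0, and drop_prob is taken here to be the probability of masking a channel, which may not match the convention of prob in this class. The affine constants follow the standard alpha dropout derivation for inputs with zero mean and unit variance.

```python
import numpy as np

# SELU constants; their negated product is SELU's negative saturation value.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805
ALPHA_PRIME = -SELU_ALPHA * SELU_SCALE


def feature_alpha_dropout_sketch(x, drop_prob, channel_axis=1, rng=None):
    """Illustrative only: mask whole channels, then rescale (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    keep = 1.0 - drop_prob  # assumption: drop_prob is the masking probability

    # One Bernoulli draw per (sample, channel); all other axes are broadcast,
    # so a masked channel is masked in its entirety.
    mask_shape = [1] * x.ndim
    mask_shape[0] = x.shape[0]  # assumption: batch axis is 0
    mask_shape[channel_axis] = x.shape[channel_axis]
    mask = rng.random(tuple(mask_shape)) < keep

    # Affine correction that restores zero mean and unit variance,
    # assuming the input itself has zero mean and unit variance.
    a = (keep + ALPHA_PRIME ** 2 * keep * (1.0 - keep)) ** -0.5
    b = -a * ALPHA_PRIME * (1.0 - keep)

    # Masked channels take the saturation value instead of zero.
    return a * np.where(mask, x, ALPHA_PRIME) + b
```

Drawing the mask with singleton dimensions on every non-batch, non-channel axis is what distinguishes this from element-wise alpha dropout, where the mask would have the full shape of x.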
Usually the input comes from convolutional layers with SELU activation.
As described in the paper [2], if adjacent pixels within feature maps are strongly correlated (as is normally the case in early convolution layers) then i.i.d. dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease.
In this case, FeatureAlphaDropout will help promote independence between feature maps and should be used instead.
- Parameters:
Notes
Input shape: \((N, C, *)\) where C is the channel dimension.
Output shape: Same shape as input.
References
Examples
>>> import brainstate
>>> m = brainstate.nn.FeatureAlphaDropout(prob=0.8)
>>> x = brainstate.random.randn(20, 16, 4, 32, 32)
>>> with brainstate.environ.context(fit=True):
...     output = m(x)
>>> output.shape
(20, 16, 4, 32, 32)