smooth_labels

class braintools.metric.smooth_labels(labels, alpha)

Apply label smoothing regularization to one-hot encoded labels.

Label smoothing is a regularization technique that prevents neural networks from becoming overconfident in their predictions by introducing controlled uncertainty in the training labels. This technique replaces hard targets with a weighted mixture of the original one-hot labels and a uniform distribution over all classes.

The smoothing transformation is defined as:

\[\tilde{y}_k = (1 - \alpha) y_k + \frac{\alpha}{K}\]

where \(y_k\) is the original label for class \(k\), \(\alpha\) is the smoothing parameter, \(K\) is the number of classes, and \(\tilde{y}_k\) is the smoothed label.
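As a worked instance of the formula above, the following is a minimal sketch that applies it directly with jax.numpy. The helper name smooth_labels_sketch is hypothetical and exists only for illustration; it is not the library's implementation, and it skips the type and range checks described below.

```python
import jax.numpy as jnp

def smooth_labels_sketch(labels, alpha):
    """Illustrative helper (hypothetical name): applies the formula above."""
    num_classes = labels.shape[-1]  # K, the size of the last (class) axis
    return (1.0 - alpha) * labels + alpha / num_classes

y = jnp.array([1.0, 0.0, 0.0, 0.0])  # one-hot label with K = 4
y_smooth = smooth_labels_sketch(y, alpha=0.2)

# True class:    (1 - 0.2) * 1 + 0.2 / 4 = 0.85
# Other classes: (1 - 0.2) * 0 + 0.2 / 4 = 0.05
print(y_smooth)
```

The three incorrect classes share the probability mass alpha * (K - 1) / K = 0.15 taken from the true class, so the result still sums to 1.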

Parameters:

labels (Array | ndarray | bool | number | int | float | complex | Quantity) – One-hot encoded labels with shape (..., num_classes) where the last dimension represents class probabilities. Must be floating-point type. Each row should contain exactly one 1.0 and zeros elsewhere for proper one-hot encoding.
alpha (float) – Smoothing parameter in the range [0, 1] controlling the degree of smoothing:
- alpha = 0.0: No smoothing (original hard labels)
- alpha = 0.1: Light smoothing (common choice)
- alpha = 1.0: Maximum smoothing (uniform distribution)
Typical values range from 0.05 to 0.2, depending on task complexity. Values outside [0, 1] raise a ValueError.

Returns:

Smoothed label distribution with the same shape as input. Each row sums to 1.0 provided the corresponding input row is itself a valid probability distribution (i.e. its entries sum to 1.0); see Notes.

Return type:

Array

Raises:

TypeError – If labels is not a floating-point array.
ValueError – If alpha is outside the closed interval [0, 1].

Notes

Row-sum precondition. The smoothing is \(\tilde{y} = (1 - \alpha) y + \alpha / K\). Summing over the \(K\) classes gives \((1 - \alpha) \sum_k y_k + \alpha\). This equals 1.0 only when \(\sum_k y_k = 1\) for the input row (e.g. proper one-hot or probability rows). If the input rows do not sum to 1, the smoothed rows will not sum to 1 either; this function does not normalize the input.
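The row-sum identity can be checked numerically. The sketch below uses a hypothetical helper, smooth_labels_sketch, that applies the formula directly (an assumption for illustration, not the library source) and compares a valid one-hot row with a multi-hot row that sums to 2.

```python
import jax.numpy as jnp

def smooth_labels_sketch(labels, alpha):
    """Illustrative helper (hypothetical name): (1 - alpha) * y + alpha / K."""
    num_classes = labels.shape[-1]
    return (1.0 - alpha) * labels + alpha / num_classes

alpha = 0.1
rows = jnp.array([
    [0.0, 1.0, 0.0],  # valid one-hot row, sums to 1
    [1.0, 1.0, 0.0],  # multi-hot row, sums to 2
])

row_sums = smooth_labels_sketch(rows, alpha).sum(axis=-1)
# Predicted by the identity: (1 - alpha) * sum_k y_k + alpha
expected = (1.0 - alpha) * rows.sum(axis=-1) + alpha

# First row:  0.9 * 1 + 0.1 = 1.0
# Second row: 0.9 * 2 + 0.1 = 1.9
print(row_sums, expected)
```

If the inputs may not be normalized, one option is to divide each row by its sum before smoothing; whether that is appropriate depends on what the unnormalized rows are meant to represent.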

alpha is validated to lie in [0, 1]. Note that alpha is treated as a static Python float; passing a traced JAX value will raise during the bounds check, so keep alpha concrete (or mark it static under jax.jit).
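One way to keep alpha concrete under jax.jit is to declare it static. The sketch below assumes a Python-level bounds check similar to the one described above; smooth_labels_sketch is a hypothetical stand-in, not the library function.

```python
from functools import partial

import jax
import jax.numpy as jnp

@partial(jax.jit, static_argnames="alpha")
def smooth_labels_sketch(labels, alpha):
    """Illustrative helper (hypothetical name) with a Python-level range check."""
    # alpha is static, so it is an ordinary Python float at trace time
    # and this comparison does not involve a traced value.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError(f"alpha must be in [0, 1], got {alpha}")
    num_classes = labels.shape[-1]
    return (1.0 - alpha) * labels + alpha / num_classes

smoothed = smooth_labels_sketch(jnp.eye(3), alpha=0.1)
print(smoothed.sum(axis=-1))  # each row of a one-hot input still sums to 1
```

Because alpha is static, each distinct alpha value triggers a separate compilation, which is usually acceptable for a hyperparameter that is fixed during training.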

Label smoothing provides several benefits:

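- Improved calibration: Discourages the model from assigning probability close to 1.0 to a single class
- Better generalization: Acts as a regularizer that can reduce overfitting
- Robustness: Reduces sensitivity to label noise and annotation errors
- Gradient stability: For alpha > 0, the smoothed targets are reached at finite logits, so cross-entropy no longer pushes the true-class logit toward infinity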

The technique is particularly effective for:

- Image classification with large numbers of classes
- Tasks with potential label ambiguity or noise
- Training very deep networks prone to overconfidence
- Knowledge distillation scenarios

Common usage patterns:

- Use with cross-entropy loss for classification
- Combine with other regularization techniques (dropout, weight decay)
- Tune alpha based on validation performance
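The cross-entropy pattern can be sketched as follows. Both smooth_labels_sketch and soft_cross_entropy are hypothetical helpers written for this illustration (they are not part of braintools); jax.nn.one_hot and jax.nn.log_softmax are standard JAX functions. The logits are made-up values chosen so that the model is confident and correct.

```python
import jax
import jax.numpy as jnp

def smooth_labels_sketch(labels, alpha):
    """Illustrative helper (hypothetical name): (1 - alpha) * y + alpha / K."""
    num_classes = labels.shape[-1]
    return (1.0 - alpha) * labels + alpha / num_classes

def soft_cross_entropy(logits, targets):
    """Mean cross-entropy between soft targets and softmax(logits)."""
    log_probs = jax.nn.log_softmax(logits, axis=-1)
    return -jnp.mean(jnp.sum(targets * log_probs, axis=-1))

logits = jnp.array([[4.0, 0.5, -1.0],
                    [0.2, 3.0, 0.1]])      # 2 samples, 3 classes
class_ids = jnp.array([0, 1])              # integer class labels
one_hot = jax.nn.one_hot(class_ids, num_classes=3)  # float one-hot rows

hard_loss = soft_cross_entropy(logits, one_hot)
smooth_loss = soft_cross_entropy(logits, smooth_labels_sketch(one_hot, alpha=0.1))
print(hard_loss, smooth_loss)
```

For these confident, correct predictions the smoothed loss is larger than the hard-label loss, because the smoothed targets also penalize the low probabilities assigned to the other classes.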

Examples

Basic label smoothing for 3-class classification:

>>> import jax.numpy as jnp
>>> import braintools
>>> # One-hot labels for 2 samples, 3 classes
>>> labels = jnp.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
>>> braintools.metric.smooth_labels(labels, alpha=0.1)
Array([[0.93333334, 0.03333334, 0.03333334],
       [0.03333334, 0.93333334, 0.03333334]], dtype=float32)

Verify probability distribution properties for valid one-hot inputs:

>>> smoothed = braintools.metric.smooth_labels(jnp.eye(4), alpha=0.2)
>>> bool(jnp.allclose(jnp.sum(smoothed, axis=1), 1.0))
True
>>> bool(jnp.all(smoothed >= 0))
True
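The two endpoints of alpha can be checked in the same way. This sketch uses a hypothetical helper, smooth_labels_sketch, that applies the smoothing formula directly; it is an assumption for illustration and omits the library's input validation.

```python
import jax.numpy as jnp

def smooth_labels_sketch(labels, alpha):
    """Illustrative helper (hypothetical name): (1 - alpha) * y + alpha / K."""
    num_classes = labels.shape[-1]
    return (1.0 - alpha) * labels + alpha / num_classes

labels = jnp.eye(4)  # 4 one-hot rows, K = 4

unchanged = smooth_labels_sketch(labels, alpha=0.0)  # no smoothing
uniform = smooth_labels_sketch(labels, alpha=1.0)    # every entry becomes 1 / K

print(bool(jnp.allclose(unchanged, labels)))
print(bool(jnp.allclose(uniform, 0.25)))
```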
