ExponentialLR
- class braintools.optim.ExponentialLR(base_lr=0.001, gamma=0.95, last_epoch=0)
Exponential learning rate scheduler - Decays learning rate exponentially.
ExponentialLR multiplies the learning rate by gamma at every epoch, creating a smooth exponential decay. This scheduler is useful when you want a continuous and predictable decrease in the learning rate throughout training.
- Parameters:
base_lr (float | List[float]) – Initial learning rate(s). Can be a single float or a list of floats for multiple parameter groups. Default: 1e-3.
gamma (float) – Multiplicative factor of learning rate decay per epoch. Must be in range (0, 1). Typical values: 0.95-0.99 for slow decay, 0.9-0.95 for moderate decay.
last_epoch (int) – The index of the last epoch. Used for resuming training. Default: 0 (start from the beginning).
Notes
The learning rate at epoch \(t\) is computed as:
\[\eta_t = \eta_0 \cdot \gamma^t\]
where \(\eta_0\) is the initial learning rate (base_lr) and \(t\) is the current epoch number.
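The formula can be checked with a few lines of plain Python. This is a minimal sketch of the arithmetic only, not the braintools implementation; the helper name `exponential_lr` is hypothetical.

```python
def exponential_lr(base_lr: float, gamma: float, epoch: int) -> float:
    """Closed-form learning rate at a given epoch: base_lr * gamma**epoch."""
    return base_lr * gamma ** epoch


# Iterative view: multiply the learning rate by gamma once per epoch.
lr = 0.1
for _ in range(10):
    lr *= 0.95

# Closed-form view: evaluate the formula directly at epoch 10.
closed_form = exponential_lr(0.1, 0.95, 10)

print(f"{lr:.6f} {closed_form:.6f}")  # 0.059874 0.059874
```

Both views agree up to floating-point rounding, which is why the schedule can be evaluated at any epoch without replaying earlier ones.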
Key characteristics:
Smooth exponential decay every epoch
Learning rate decreases at every epoch, with no plateaus or sudden drops
Simple one-parameter control (gamma)
Decay rate is constant on a logarithmic scale (log of the learning rate falls linearly with the epoch)
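The logarithmic-scale property follows directly from the formula for \(\eta_t\). Taking logarithms of both sides gives a straight line in \(t\):

```latex
\[\log \eta_t = \log \eta_0 + t \log \gamma\]
\[\log \eta_{t+1} - \log \eta_t = \log \gamma\]
```

For \(0 < \gamma < 1\) the slope \(\log \gamma\) is negative, so each epoch lowers \(\log \eta_t\) by the same amount.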
Gamma selection guidelines:
gamma=0.95: Moderate decay, lr halves every ~14 epochs
gamma=0.96: Gentle decay, lr halves every ~17 epochs
gamma=0.98: Slow decay, lr halves every ~34 epochs
gamma=0.99: Very slow decay, lr halves every ~69 epochs
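Half-life figures like these come from solving \(\gamma^t = 1/2\) for \(t\), which gives \(t = \ln(0.5) / \ln(\gamma)\). The sketch below computes this and its inverse in plain Python; the helper names `half_life` and `gamma_for_half_life` are hypothetical and not part of braintools.

```python
import math


def half_life(gamma: float) -> float:
    """Epochs needed for the learning rate to halve under lr_t = lr_0 * gamma**t."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must be in (0, 1)")
    return math.log(0.5) / math.log(gamma)


def gamma_for_half_life(epochs: float) -> float:
    """Inverse relation: the gamma whose half-life is the given number of epochs."""
    if epochs <= 0:
        raise ValueError("epochs must be positive")
    return 0.5 ** (1.0 / epochs)


for g in (0.95, 0.96, 0.98, 0.99):
    print(f"gamma={g}: half-life = {half_life(g):.1f} epochs")
# gamma=0.95: half-life = 13.5 epochs
# gamma=0.96: half-life = 17.0 epochs
# gamma=0.98: half-life = 34.3 epochs
# gamma=0.99: half-life = 69.0 epochs
```

The inverse helper is useful when the desired half-life is known first, for example `gamma_for_half_life(20)` for a learning rate that should halve every 20 epochs.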
When to use:
When you want smooth, continuous learning rate reduction
For fine-tuning with gradual decay
When step-based schedules are too abrupt
For long training runs with gradual convergence
Examples
Basic exponential decay:
>>> import braintools
>>> import brainstate
>>>
>>> model = brainstate.nn.Linear(10, 5)
>>> # Decay by 0.95 each epoch
>>> scheduler = braintools.optim.ExponentialLR(base_lr=0.1, gamma=0.95)
>>> optimizer = braintools.optim.SGD(lr=scheduler, momentum=0.9)
>>> optimizer.register_trainable_weights(model.states(brainstate.ParamState))
>>>
>>> for epoch in range(20):
...     # Training code
...     scheduler.step()
...     if epoch % 5 == 0:
...         print(f"Epoch {epoch}: lr = {optimizer.current_lr:.6f}")
Epoch 0: lr = 0.100000
Epoch 5: lr = 0.077378  # lr * 0.95^5
Epoch 10: lr = 0.059874  # lr * 0.95^10
Epoch 15: lr = 0.046329  # lr * 0.95^15
Slow decay for fine-tuning:
>>> # Very gentle decay with gamma=0.99
>>> scheduler = braintools.optim.ExponentialLR(base_lr=0.001, gamma=0.99)
>>> optimizer = braintools.optim.Adam(lr=scheduler)
>>> optimizer.register_trainable_weights(model.states(brainstate.ParamState))
>>>
>>> for epoch in range(100):
...     finetune_epoch(model, optimizer, finetune_loader)
...     scheduler.step()
>>> # After 100 epochs: lr ≈ 0.001 * 0.99^100 ≈ 0.000366
Moderate decay for standard training:
>>> # Moderate decay with gamma=0.96
>>> scheduler = braintools.optim.ExponentialLR(base_lr=0.1, gamma=0.96)
>>> optimizer = braintools.optim.SGD(lr=scheduler, momentum=0.9, weight_decay=1e-4)
>>> optimizer.register_trainable_weights(model.states(brainstate.ParamState))
>>>
>>> for epoch in range(50):
...     optimizer.step(grads)
...     scheduler.step()
>>> # lr smoothly decreases from 0.1 to ~0.013
Combining with warmup:
>>> # Warmup followed by exponential decay
>>> warmup = braintools.optim.LinearLR(
...     start_factor=0.1,
...     end_factor=1.0,
...     total_iters=5
... )
>>> decay = braintools.optim.ExponentialLR(base_lr=0.01, gamma=0.95)
>>> scheduler = braintools.optim.ChainedScheduler([warmup, decay])
>>>
>>> optimizer = braintools.optim.Adam(lr=scheduler)
>>> optimizer.register_trainable_weights(model.states(brainstate.ParamState))
>>>
>>> for epoch in range(100):
...     optimizer.step(grads)
...     scheduler.step()
Using with different optimizers:
>>> # Works with any optimizer
>>> scheduler = braintools.optim.ExponentialLR(base_lr=0.001, gamma=0.98)
>>>
>>> # With Adam
>>> adam_opt = braintools.optim.Adam(lr=scheduler)
>>> adam_opt.register_trainable_weights(model.states(brainstate.ParamState))
>>>
>>> # Or with RMSprop
>>> model2 = brainstate.nn.Linear(10, 5)
>>> scheduler2 = braintools.optim.ExponentialLR(base_lr=0.001, gamma=0.98)
>>> rmsprop_opt = braintools.optim.RMSprop(lr=scheduler2)
>>> rmsprop_opt.register_trainable_weights(model2.states(brainstate.ParamState))
Saving and loading state:
>>> scheduler = braintools.optim.ExponentialLR(base_lr=0.1, gamma=0.95)
>>> optimizer = braintools.optim.SGD(lr=scheduler, momentum=0.9)
>>> optimizer.register_trainable_weights(model.states(brainstate.ParamState))
>>>
>>> # Train for some epochs
>>> for epoch in range(50):
...     scheduler.step()
>>>
>>> # Save checkpoint
>>> checkpoint = {
...     'epoch': 50,
...     'model': model.state_dict(),
...     'scheduler': scheduler.state_dict(),
... }
>>>
>>> # Resume training
>>> new_scheduler = braintools.optim.ExponentialLR(base_lr=0.1, gamma=0.95)
>>> new_scheduler.load_state_dict(checkpoint['scheduler'])
>>> # lr will be correctly set to 0.1 * 0.95^50
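Because the schedule is a closed-form function of the epoch, restoring it only requires the hyperparameters and the epoch counter. The sketch below illustrates that idea with a hypothetical stand-in class, `TinyExponentialSchedule`. It is not the braintools implementation, and the keys in its state dictionary are assumptions made for illustration.

```python
class TinyExponentialSchedule:
    """Hypothetical stand-in for illustration; NOT the braintools class."""

    def __init__(self, base_lr: float, gamma: float):
        self.base_lr = base_lr
        self.gamma = gamma
        self.epoch = 0  # number of step() calls so far

    def step(self) -> None:
        """Advance the schedule by one epoch."""
        self.epoch += 1

    @property
    def lr(self) -> float:
        """Current learning rate: base_lr * gamma**epoch."""
        return self.base_lr * self.gamma ** self.epoch

    def state_dict(self) -> dict:
        """Everything needed to reproduce the current learning rate."""
        return {"base_lr": self.base_lr, "gamma": self.gamma, "epoch": self.epoch}

    def load_state_dict(self, state: dict) -> None:
        """Restore hyperparameters and the epoch counter from a saved state."""
        self.base_lr = state["base_lr"]
        self.gamma = state["gamma"]
        self.epoch = state["epoch"]


# Train for 50 epochs, then snapshot the schedule.
original = TinyExponentialSchedule(base_lr=0.1, gamma=0.95)
for _ in range(50):
    original.step()
saved = original.state_dict()

# A fresh schedule picks up exactly where the old one stopped.
resumed = TinyExponentialSchedule(base_lr=0.1, gamma=0.95)
resumed.load_state_dict(saved)
print(resumed.epoch, resumed.lr == original.lr)  # 50 True
```

No per-epoch history is stored: the restored learning rate is recomputed from the counter alone.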
Aggressive decay:
>>> # Fast decay with gamma=0.9
>>> scheduler = braintools.optim.ExponentialLR(base_lr=0.1, gamma=0.9)
>>> optimizer = braintools.optim.SGD(lr=scheduler, momentum=0.9)
>>> optimizer.register_trainable_weights(model.states(brainstate.ParamState))
>>>
>>> for epoch in range(30):
...     optimizer.step(grads)
...     scheduler.step()
>>> # After 30 epochs: lr ≈ 0.1 * 0.9^30 ≈ 0.00424
See also
StepLR: Step-wise learning rate decay
CosineAnnealingLR: Cosine annealing schedule
MultiStepLR: Multi-step learning rate decay
References