Online Learning Algorithms#
braintrace provides online learning algorithms based on
eligibility trace propagation. All algorithms share the same interface:
wrap a model, compile the graph, then call the algorithm as a drop-in
replacement for the model’s forward pass.
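The wrap, compile, call pattern described above can be sketched with a toy stand-in. Everything here is hypothetical: `LeakyCell`, `ToyOnlineWrapper`, and its `compile` method are illustrative names, not braintrace classes or signatures. The sketch only shows how a wrapper can update an eligibility trace while being called in place of the model.

```python
# Toy stand-ins only: LeakyCell and ToyOnlineWrapper are hypothetical names,
# not braintrace classes.

class LeakyCell:
    """A one-unit leaky integrator: h <- decay * h + w * x."""

    def __init__(self, w=0.5, decay=0.9):
        self.w = w
        self.decay = decay
        self.h = 0.0

    def __call__(self, x):
        self.h = self.decay * self.h + self.w * x
        return self.h


class ToyOnlineWrapper:
    """Wraps a model, is compiled once, then is called in place of the model."""

    def __init__(self, model):
        self.model = model
        self.trace = None  # eligibility trace, allocated at compile time

    def compile(self, example_input):
        # A real implementation would inspect the computation graph here;
        # this toy only allocates the trace for the single weight.
        self.trace = 0.0

    def __call__(self, x):
        if self.trace is None:
            raise RuntimeError("call compile() before the first step")
        # Update the trace alongside the forward pass: e <- decay * e + x.
        self.trace = self.model.decay * self.trace + x
        return self.model(x)


model = LeakyCell()
algo = ToyOnlineWrapper(model)                 # 1. wrap the model
algo.compile(1.0)                              # 2. compile with an example input
outputs = [algo(x) for x in (1.0, 0.0, 2.0)]   # 3. call it like the model
```

In this toy, `algo.trace` equals the derivative of the latest output with respect to `w`, so a gradient is available at every step without storing past inputs.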
Base Classes#
The base class for the eligibility trace algorithm.
The state for storing the eligibility trace during the computation of online learning algorithms.
D-RTRL (Parameter Dimension)#
The Decoupled Real-Time Recurrent Learning algorithm with diagonal approximation. Memory complexity: \(O(B \cdot |\theta|)\), where \(B\) is the batch size and \(|\theta|\) is the number of parameters.
ParamDimVjpAlgorithm: the online gradient computation algorithm with the diagonal approximation and parameter-dimension memory complexity.
D_RTRL is an alias for ParamDimVjpAlgorithm.
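To make the parameter-dimension memory cost concrete, here is a minimal sketch of a per-parameter eligibility trace for a single leaky linear layer and one sample (B = 1). It is not braintrace's implementation. The layer, the squared-error loss, and all function names are assumptions chosen for illustration. For this layer the hidden-to-hidden Jacobian is already diagonal (`alpha` times the identity), so the diagonal approximation happens to be exact.

```python
def step(W, h, x, alpha):
    """One step of a leaky linear layer: h[j] <- alpha*h[j] + sum_i W[j][i]*x[i]."""
    return [alpha * h[j] + sum(w * xi for w, xi in zip(W[j], x))
            for j in range(len(W))]


def sequence_loss(W, xs, ys, alpha):
    """Squared error summed over all time steps (used only for checking)."""
    h = [0.0] * len(W)
    loss = 0.0
    for x, y in zip(xs, ys):
        h = step(W, h, x, alpha)
        loss += 0.5 * sum((hj - yj) ** 2 for hj, yj in zip(h, y))
    return loss


def online_gradient(W, xs, ys, alpha):
    """Accumulate dLoss/dW forward in time using one trace entry per parameter."""
    n_out, n_in = len(W), len(W[0])
    h = [0.0] * n_out
    trace = [[0.0] * n_in for _ in range(n_out)]  # same shape as W
    grad = [[0.0] * n_in for _ in range(n_out)]
    for x, y in zip(xs, ys):
        h = step(W, h, x, alpha)
        for j in range(n_out):
            err = h[j] - y[j]  # learning signal for unit j at this step
            for i in range(n_in):
                # trace[j][i] tracks d h[j] / d W[j][i]
                trace[j][i] = alpha * trace[j][i] + x[i]
                grad[j][i] += err * trace[j][i]
    return grad


W = [[0.3, -0.2, 0.1], [0.05, 0.4, -0.6]]
xs = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [0.0, 1.5, 1.0]]
ys = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
grad = online_gradient(W, xs, ys, alpha=0.9)
```

The trace has the same shape as `W`, which is where the \(|\theta|\) factor in the memory complexity comes from. With a batch, one such trace would be kept per sample.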
ES-D-RTRL (Input-Output Dimension)#
The Event-Synchronized D-RTRL algorithm. Factorizes the eligibility trace into input and output components with exponential smoothing. Memory complexity: \(O(B(I + O))\), where \(I\) and \(O\) are the input and output dimensions.
IODimVjpAlgorithm: the online gradient computation algorithm with the diagonal approximation and input-output-dimension memory complexity.
ES_D_RTRL and pp_prop are aliases for IODimVjpAlgorithm.
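The idea of storing an input-side and an output-side component instead of a full per-parameter trace can be sketched as follows. This is a simplified illustration, not the ES-D-RTRL update rule and not braintrace code. It assumes a leaky linear layer with a shared decay `alpha`, a squared-error loss, and hypothetical function names. For this particular layer the factorization is exact, whereas in general it is an approximation.

```python
def step(W, h, x, alpha):
    """One step of a leaky linear layer: h[j] <- alpha*h[j] + sum_i W[j][i]*x[i]."""
    return [alpha * h[j] + sum(w * xi for w, xi in zip(W[j], x))
            for j in range(len(W))]


def sequence_loss(W, xs, ys, alpha):
    """Squared error summed over all time steps (used only for checking)."""
    h = [0.0] * len(W)
    loss = 0.0
    for x, y in zip(xs, ys):
        h = step(W, h, x, alpha)
        loss += 0.5 * sum((hj - yj) ** 2 for hj, yj in zip(h, y))
    return loss


def factorized_online_gradient(W, xs, ys, alpha):
    """Online gradient that stores I + O trace values instead of I * O."""
    n_out, n_in = len(W), len(W[0])
    h = [0.0] * n_out
    x_trace = [0.0] * n_in     # input-side component: smoothed inputs
    out_factor = [1.0] * n_out  # output-side component: 1 for a linear unit
    grad = [[0.0] * n_in for _ in range(n_out)]
    for x, y in zip(xs, ys):
        h = step(W, h, x, alpha)
        # Exponentially smooth the inputs with the layer's decay.
        x_trace = [alpha * e + xi for e, xi in zip(x_trace, x)]
        for j in range(n_out):
            signal = (h[j] - y[j]) * out_factor[j]
            for i in range(n_in):
                # The per-parameter trace is rebuilt on the fly as an outer product.
                grad[j][i] += signal * x_trace[i]
    return grad


W = [[0.3, -0.2, 0.1], [0.05, 0.4, -0.6]]
xs = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [0.0, 1.5, 1.0]]
ys = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
grad = factorized_online_gradient(W, xs, ys, alpha=0.9)
```

Only `x_trace` and `out_factor` persist between steps, which corresponds to the \(I + O\) term in the memory complexity. The `grad` accumulator is the gradient itself, not part of the trace.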
VJP Algorithm Base#
ETraceVjpAlgorithm: the base class for eligibility trace algorithms that support VJP gradient computation (reverse-mode differentiation).
SNN Online-Learning Algorithms#
Paper-faithful algorithms tailored to spiking neural networks. All are
ETraceVjpAlgorithm subclasses (or factories over the VJP algorithms).
Eligibility Propagation.
Factory returning the appropriate VJP algorithm for the selected regime.
Online Training with Postsynaptic Estimates.
Online Training Through Time.
Online Spatio-Temporal Target Projection.
SNN helpers reusable across the above algorithms:
Frozen random feedback matrix
Low-pass output-side filter
Leaky accumulator
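As a rough illustration of what such helpers do, the sketch below builds a fixed random feedback matrix, projects an output error through it to obtain per-neuron learning signals, and applies a simple leaky filter. The function names, the Gaussian initialization, and the unnormalized filter form `kappa * prev + value` are assumptions for this example and may differ from braintrace's helpers.

```python
import random


def make_feedback_matrix(n_hidden, n_out, seed=0):
    """Draw a random matrix once; it is never trained afterwards."""
    rng = random.Random(seed)  # local generator keeps the matrix reproducible
    return [[rng.gauss(0.0, 1.0) for _ in range(n_out)]
            for _ in range(n_hidden)]


def learning_signal(feedback, output_error):
    """Project the output error to one learning signal per hidden unit."""
    return [sum(b * e for b, e in zip(row, output_error)) for row in feedback]


def low_pass(prev, value, kappa):
    """Leaky filter step (unnormalized form assumed here)."""
    return kappa * prev + value


feedback = make_feedback_matrix(n_hidden=4, n_out=2, seed=0)
signals = learning_signal(feedback, output_error=[0.5, -1.0])

# Filter a constant input of 1.0 for three steps with kappa = 0.5.
filtered = []
state = 0.0
for _ in range(3):
    state = low_pass(state, 1.0, kappa=0.5)
    filtered.append(state)
```

Seeding a local generator means the same feedback matrix is reproduced on every run, which is one way to keep it frozen across training sessions.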
Algorithm Comparison#
| Algorithm | Memory | Computation | Best For |
|---|---|---|---|
| D-RTRL | \(O(B \cdot \lvert\theta\rvert)\) | \(O(B \cdot I \cdot O)\) | RNNs, general-purpose |
| ES-D-RTRL | \(O(B(I + O))\) | \(O(B \cdot I \cdot O)\) | Large SNNs, memory-constrained |
| Eligibility Propagation | \(O(B \cdot \lvert\theta\rvert)\) | \(O(B \cdot I \cdot O)\) | SNNs with κ-filtered / random-feedback learning signals |
| Regime factory | depends on regime | depends on regime | SNN regime-switchable factory (D-RTRL / pp_prop) |
| Online Training with Postsynaptic Estimates | \(O(B \cdot I \cdot O)\) (full) / \(O(B(I+O))\) (approx) | \(O(B \cdot I \cdot O)\) | Deep SNNs; F-OTPE trades rank for memory |
| Online Training Through Time | \(O(B \cdot I)\) | \(O(B \cdot I \cdot O)\) | Very large SNNs; presynaptic λ-trace only |
| Online Spatio-Temporal Target Projection | \(O(B \cdot \lvert\theta\rvert)\) | \(O(B \cdot I \cdot O)\) | Target-projection via fixed random feedback |
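To get a feel for the memory column, the asymptotic formulas can be evaluated for a concrete layer. This is back-of-the-envelope arithmetic, not a measurement. It assumes a single dense layer without bias, so that \(|\theta| = I \cdot O\), and it ignores constant factors. The function name and dictionary keys are made up for this example.

```python
def trace_entries(batch, n_in, n_out):
    """Count stored trace values suggested by each memory formula."""
    n_params = n_in * n_out  # assumption: one dense layer, no bias
    return {
        "param_dim": batch * n_params,        # O(B * |theta|)
        "io_dim": batch * (n_in + n_out),     # O(B * (I + O))
        "presynaptic_only": batch * n_in,     # O(B * I)
    }


sizes = trace_entries(batch=32, n_in=1000, n_out=500)
for name, count in sizes.items():
    print(f"{name}: {count:,} values")
```

For this layer the parameter-dimension trace holds 16,000,000 values, the input-output trace 48,000, and a presynaptic-only trace 32,000.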