Online Learning Algorithms#

braintrace provides online learning algorithms based on eligibility trace propagation. All algorithms share the same interface: wrap a model, compile the graph, then call the algorithm as a drop-in replacement for the model’s forward pass.
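The wrap, compile, call pattern can be illustrated with a toy stand-in written in plain NumPy. Nothing below is braintrace code: `LeakyLayer`, `ToyTraceWrapper`, and its `compile` method are hypothetical names chosen for this sketch, and the real classes may differ in signatures and behaviour. The point is only to show why a compile step precedes the first call: the wrapper needs an example input to size its trace state.

```python
import numpy as np


class LeakyLayer:
    """Hypothetical model: h^t = decay * h^{t-1} + tanh(W @ x^t)."""

    def __init__(self, n_in, n_out, decay=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_out, n_in))
        self.decay = decay
        self.h = np.zeros(n_out)

    def __call__(self, x):
        self.h = self.decay * self.h + np.tanh(self.W @ x)
        return self.h


class ToyTraceWrapper:
    """Hypothetical wrapper: same call signature as the model, plus a trace."""

    def __init__(self, model):
        self.model = model
        self.trace = None  # allocated by compile()

    def compile(self, example_x):
        # An example input fixes the trace shape (n_out, n_in).
        self.trace = np.zeros((self.model.h.size, example_x.size))

    def __call__(self, x):
        if self.trace is None:
            raise RuntimeError("call compile() before the first step")
        pre = self.model.W @ x
        out = self.model(x)  # the ordinary forward pass
        # Update the trace alongside the forward pass.
        self.trace = self.model.decay * self.trace + np.outer(1.0 - np.tanh(pre) ** 2, x)
        return out


x = np.ones(4)
algo = ToyTraceWrapper(LeakyLayer(n_in=4, n_out=3))  # 1. wrap
algo.compile(x)                                      # 2. compile
out = algo(x)                                        # 3. call in place of the model
```

The wrapper returns exactly what the wrapped model would return, so surrounding code does not need to change when the algorithm is swapped in.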

Base Classes#

ETraceAlgorithm

The base class for all eligibility trace algorithms.

EligibilityTrace

The state that stores the eligibility trace while an online learning algorithm runs.

D-RTRL (Parameter Dimension)#

The Decoupled Real-Time Recurrent Learning algorithm with diagonal approximation. Memory complexity: \(O(B \cdot |\theta|)\), where \(B\) is the batch size and \(|\theta|\) is the number of parameters.

\[\boldsymbol{\epsilon}^t \approx \mathbf{D}^t \boldsymbol{\epsilon}^{t-1} + \operatorname{diag}(\mathbf{D}_f^t) \otimes \mathbf{x}^t\]
\[\nabla_{\boldsymbol{\theta}} \mathcal{L} = \sum_{t' \in \mathcal{T}} \frac{\partial \mathcal{L}^{t'}}{\partial \mathbf{h}^{t'}} \circ \boldsymbol{\epsilon}^{t'}\]
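The two equations can be traced by hand on a small example. The sketch below is plain NumPy, not the library's implementation, and makes several assumptions: a single sample (no batch axis), a hypothetical leaky layer \(\mathbf{h}^t = a\,\mathbf{h}^{t-1} + \tanh(\mathbf{W}\mathbf{x}^t)\), and a squared-error loss at every step. Because this toy model has element-wise state dynamics, \(\mathbf{D}^t\) is truly diagonal and the trace recursion is exact here; for a general recurrent model it is only an approximation.

```python
import numpy as np


def d_rtrl_gradient(W, xs, targets, decay=0.9):
    """Accumulate dL/dW online for h^t = decay * h^{t-1} + tanh(W @ x^t)."""
    n_out, n_in = W.shape
    h = np.zeros(n_out)
    eps = np.zeros((n_out, n_in))  # eligibility trace, same shape as W
    grad = np.zeros_like(W)
    loss = 0.0
    for x, y in zip(xs, targets):
        pre = W @ x
        h = decay * h + np.tanh(pre)
        D = np.full(n_out, decay)        # diagonal of dh^t / dh^{t-1}
        D_f = 1.0 - np.tanh(pre) ** 2    # diagonal of dh^t / d(W x^t)
        # Trace update: eps^t = D^t eps^{t-1} + diag(D_f^t) (outer) x^t
        eps = D[:, None] * eps + np.outer(D_f, x)
        dL_dh = h - y                    # from L^t = 0.5 * ||h^t - y^t||^2
        # Gradient update: learning signal times trace, summed over time
        grad += dL_dh[:, None] * eps
        loss += 0.5 * np.sum((h - y) ** 2)
    return loss, grad


rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
xs = rng.normal(size=(5, 4))
ys = rng.normal(size=(5, 3))
loss, grad = d_rtrl_gradient(W, xs, ys)
```

The trace has one entry per parameter, which is where the \(|\theta|\) factor in the memory complexity comes from. No past inputs or states are stored, so the gradient is available at every step without backpropagating through time.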

ParamDimVjpAlgorithm

The online gradient computation algorithm that uses the diagonal approximation and whose memory cost scales with the parameter dimension.

D_RTRL is an alias for ParamDimVjpAlgorithm.

ES-D-RTRL (Input-Output Dimension)#

The Event-Synchronized D-RTRL algorithm. Factorizes the eligibility trace into input and output components with exponential smoothing. Memory complexity: \(O(B(I + O))\), where \(I\) and \(O\) are the input and output dimensions.

\[\boldsymbol{\epsilon}^t \approx \boldsymbol{\epsilon}_{\mathbf{f}}^t \otimes \boldsymbol{\epsilon}_{\mathbf{x}}^t\]
\[\boldsymbol{\epsilon}_{\mathbf{x}}^t = \alpha \boldsymbol{\epsilon}_{\mathbf{x}}^{t-1} + \mathbf{x}^t\]
\[\boldsymbol{\epsilon}_{\mathbf{f}}^t = \alpha \operatorname{diag}(\mathbf{D}^t) \circ \boldsymbol{\epsilon}_{\mathbf{f}}^{t-1} + (1 - \alpha) \operatorname{diag}(\mathbf{D}_f^t)\]
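A minimal NumPy sketch of one factorized update follows. It is an illustration under stated assumptions rather than the library's code: `es_d_rtrl_step` is a hypothetical helper, there is no batch axis, and the diagonals `D` and `D_f` are passed in as vectors instead of being derived from a model.

```python
import numpy as np


def es_d_rtrl_step(eps_x, eps_f, x, D, D_f, alpha):
    """One step of the factorized trace; D and D_f hold Jacobian diagonals."""
    eps_x = alpha * eps_x + x                        # input-side trace, shape (I,)
    eps_f = alpha * D * eps_f + (1.0 - alpha) * D_f  # output-side trace, shape (O,)
    return eps_x, eps_f


n_in, n_out, alpha = 4, 3, 0.8
eps_x, eps_f = np.zeros(n_in), np.zeros(n_out)
x = np.array([1.0, 0.0, 2.0, 0.0])
D = np.full(n_out, 0.9)
D_f = np.array([0.5, 1.0, 0.25])

for _ in range(2):
    eps_x, eps_f = es_d_rtrl_step(eps_x, eps_f, x, D, D_f, alpha)

# The full (O, I) trace is only formed when it is needed.
eps = np.outer(eps_f, eps_x)
```

Only the two vectors are carried between steps, \(I + O\) numbers per sample instead of \(I \cdot O\), which is the source of the \(O(B(I + O))\) memory bound.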

IODimVjpAlgorithm

The online gradient computation algorithm that uses the diagonal approximation and whose memory cost scales with the input and output dimensions.

ES_D_RTRL and pp_prop are aliases for IODimVjpAlgorithm.

VJP Algorithm Base#

ETraceVjpAlgorithm

The base class for eligibility trace algorithms that support VJP gradient computation (reverse-mode differentiation).

SNN Online-Learning Algorithms#

Paper-faithful algorithms tailored to spiking neural networks. All are ETraceVjpAlgorithm subclasses (or factories over the VJP algorithms).

EProp

Eligibility Propagation.

OSTL

Online Spatio-Temporal Learning. Factory returning the appropriate VJP algorithm for the selected regime.

OTPE

Online Training with Postsynaptic Estimates.

OTTT

Online Training Through Time.

OSTTP

Online Spatio-Temporal Target Projection.

SNN helpers reusable across the above algorithms:

FixedRandomFeedback

Frozen random feedback matrix B ∈ ℝ^{n_target × n_layer} with stop_gradient guard.

KappaFilter

Low-pass output-side filter x_filt ← (1-κ)·x + κ·x_filt used by EProp.

PresynapticTrace

Leaky accumulator â ← λ·â + x_t used by OTTT and OTPE-Approx.
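The three helpers reduce to short recurrences, sketched here in NumPy. The function names are hypothetical and do not reflect the classes' actual constructors or methods. Two details are assumptions of this sketch: the feedback matrix maps an output error to a per-layer signal through `error @ B`, and freezing is imitated by marking the array read-only, since NumPy has no stop_gradient.

```python
import numpy as np


def make_fixed_feedback(n_target, n_layer, seed=0):
    """Random matrix B of shape (n_target, n_layer), made read-only."""
    B = np.random.default_rng(seed).normal(size=(n_target, n_layer))
    B.setflags(write=False)  # stand-in for a stop_gradient guard
    return B


def feedback_signal(B, error):
    """Project an output error (n_target,) to a layer signal (n_layer,)."""
    return error @ B


def kappa_filter_step(x_filt, x, kappa):
    """Low-pass filter: new x_filt = (1 - kappa) * x + kappa * x_filt."""
    return (1.0 - kappa) * x + kappa * x_filt


def presynaptic_trace_step(a_hat, x, lam):
    """Leaky accumulator: new a_hat = lam * a_hat + x."""
    return lam * a_hat + x


x_filt = np.zeros(2)
a_hat = np.zeros(2)
for _ in range(3):
    x_filt = kappa_filter_step(x_filt, np.ones(2), kappa=0.5)
    a_hat = presynaptic_trace_step(a_hat, np.ones(2), lam=0.5)
# x_filt is now [0.875, 0.875]; a_hat is now [1.75, 1.75]

B = make_fixed_feedback(n_target=2, n_layer=5)
signal = feedback_signal(B, np.array([1.0, -1.0]))
```

For a constant input the κ-filter converges to the input itself, while the leaky accumulator converges to the input scaled by 1/(1-λ), so the two are not interchangeable even when κ = λ.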

Algorithm Comparison#

| Algorithm | Memory | Computation | Best For |
| --- | --- | --- | --- |
| D_RTRL | \(O(B \cdot \lvert\theta\rvert)\) | \(O(B \cdot I \cdot O)\) | RNNs, general-purpose |
| ES_D_RTRL | \(O(B(I + O))\) | \(O(B \cdot I \cdot O)\) | Large SNNs, memory-constrained |
| EProp | \(O(B \cdot \lvert\theta\rvert)\) | \(O(B \cdot I \cdot O)\) | SNNs with κ-filtered / random-feedback learning signals |
| OSTL | depends on regime | depends on regime | SNN regime-switchable factory (D-RTRL / pp_prop) |
| OTPE | \(O(B \cdot I \cdot O)\) (full) / \(O(B(I+O))\) (approx) | \(O(B \cdot I \cdot O)\) | Deep SNNs; F-OTPE trades rank for memory |
| OTTT | \(O(B \cdot I)\) | \(O(B \cdot I \cdot O)\) | Very large SNNs; presynaptic λ-trace only |
| OSTTP | \(O(B \cdot \lvert\theta\rvert)\) | \(O(B \cdot I \cdot O)\) | Target-projection via fixed random feedback |
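To make the memory column concrete, the following sketch counts trace entries for three of the algorithms. It assumes a single dense layer without bias, so that \(|\theta| = I \cdot O\), and it ignores constant factors and any other state an algorithm keeps; `trace_entries` is a hypothetical helper, not part of the library.

```python
def trace_entries(batch, n_in, n_out):
    """Leading-order trace sizes for one dense layer with |theta| = n_in * n_out."""
    return {
        "D_RTRL": batch * n_in * n_out,       # O(B * |theta|)
        "ES_D_RTRL": batch * (n_in + n_out),  # O(B (I + O))
        "OTTT": batch * n_in,                 # O(B * I)
    }


sizes = trace_entries(batch=32, n_in=512, n_out=512)
# sizes == {"D_RTRL": 8388608, "ES_D_RTRL": 32768, "OTTT": 16384}
```

At this layer size the factorized trace is 256 times smaller than the parameter-dimension trace, which is the trade-off behind the "memory-constrained" recommendation.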