ETraceVjpGraphExecutor
- class braintrace.ETraceVjpGraphExecutor(model, vjp_method='single-step')
The eligibility trace graph executor for the VJP-based online learning algorithms.
This class is used for executing the eligibility trace graph for the VJP-based online learning algorithms, including:
IODimVjpAlgorithm (alias ES_D_RTRL / pp_prop()) for the algorithm with input-output dimensional complexity.
ParamDimVjpAlgorithm (alias D_RTRL) for the algorithm with parameter dimensional complexity.
- Parameters:
model (Module) – The model used to build the eligibility trace graph. The model should only define the one-step behavior.
vjp_method (str) – The method for computing the VJP. It should be either “single-step” or “multi-step”.
“single-step”: The VJP is computed at the current time step, i.e., \(\partial L^t/\partial h^t\).
“multi-step”: The VJP is computed at multiple time steps, i.e., \(\partial L^t/\partial h^{t-k}\), where \(k\) is determined by the data input.
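The difference between the two VJP targets can be illustrated with plain JAX. This is only a sketch of the underlying math, not braintrace's implementation: the cell `step`, the loss `loss`, the helper `rollout`, and all values are hypothetical, and \(k = 2\) is chosen by hand here rather than derived from the input data.

```python
import jax
import jax.numpy as jnp

def step(h_prev, w, x):
    # Hypothetical one-step cell: h^t = tanh(w * h^{t-1} + x)
    return jnp.tanh(w * h_prev + x)

def loss(h):
    # Hypothetical loss evaluated on the hidden state
    return jnp.sum(h ** 2)

def rollout(h_start, inputs, w):
    # Apply the one-step cell once per row of `inputs`
    h = h_start
    for x in inputs:
        h = step(h, w, x)
    return h

w = 0.5
h0 = jnp.array([0.1, -0.2])
xs = jnp.array([[1.0, 0.5], [0.3, -0.4], [0.2, 0.1]])  # three time steps

# "single-step" target: dL^t / dh^t, taken at the latest hidden state only
h_t = rollout(h0, xs, w)
dL_dht = jax.grad(loss)(h_t)

# "multi-step" target: dL^t / dh^{t-k} with k = 2, obtained by pulling the
# single-step cotangent back through the last two cell applications
h_tmk = rollout(h0, xs[:1], w)
_, vjp_fn = jax.vjp(lambda h: rollout(h, xs[1:], w), h_tmk)
(dL_dhtmk,) = vjp_fn(dL_dht)
```

In this toy setting the single-step quantity depends only on the loss, while the multi-step quantity also carries the hidden-to-hidden dependence of the intermediate steps.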
- compile_graph(*args)
Building the eligibility trace graph for the model according to the given inputs.
This is the most important method for the eligibility trace graph. It builds the graph for the model, which is used for computing the weight spatial gradients and the hidden state Jacobian.
- Parameters:
*args – The positional arguments for the model.
- Return type:
- property is_multi_step_vjp: bool
Whether the VJP method is multi-step.
- Returns:
Whether the VJP method is multi-step.
- Return type:
bool
- property is_single_step_vjp: bool
Whether the VJP method is single-step.
- Returns:
Whether the VJP method is single-step.
- Return type:
bool
- solve_h2w_h2h_jacobian(*args)
Solving the hidden-to-weight and hidden-to-hidden Jacobians according to the given inputs and parameters.
This function is typically used for computing the forward propagation of the hidden-to-weight Jacobian.
Now we mathematically define what computations are done in this function.
For the state transition function \(y, h^t = f(h^{t-1}, \theta, x)\), this function aims to solve:
1. The function output: \(y\)
2. The updated hidden states: \(h^t\)
3. The Jacobian matrix of hidden-to-weight, i.e., \(\partial h^t / \partial \theta^t\).
4. The Jacobian matrix of hidden-to-hidden, i.e., \(\partial h^t / \partial h^{t-1}\).
- Parameters:
*args – The positional arguments for the model.
- Return type:
Tuple[Any, Dict[Tuple[str, ...], Any], Dict[Tuple[str, ...], Any], Tuple[Dict[int, Array], Dict[Tuple[int, str], Array]], Sequence[Array]]
- Returns:
The outputs, hidden states, other states, and the spatial gradients of the weights. Returns the single-step results if the inputs do not contain multi-step data; otherwise returns the multi-step results.
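The four quantities listed for \(y, h^t = f(h^{t-1}, \theta, x)\) can be reproduced for a toy cell with `jax.jacobian`. This is a minimal sketch of the math only; it does not call or mirror braintrace's internals, and the function `transition`, the parameter names `W` and `U`, and all values are hypothetical.

```python
import jax
import jax.numpy as jnp

def transition(h_prev, theta, x):
    # Hypothetical one-step cell: h^t = tanh(W h^{t-1} + U x), with y = h^t
    h = jnp.tanh(theta["W"] @ h_prev + theta["U"] @ x)
    y = h  # identity readout, for illustration only
    return y, h

theta = {
    "W": jnp.array([[0.5, -0.1], [0.2, 0.3]]),
    "U": jnp.array([[1.0], [-1.0]]),
}
h_prev = jnp.array([0.1, -0.2])
x = jnp.array([0.7])

# Items 1 and 2: the function output y and the updated hidden state h^t
y, h = transition(h_prev, theta, x)

def hidden_only(h_prev, theta):
    # Keep only the hidden-state output so the Jacobians are of h^t
    return transition(h_prev, theta, x)[1]

# Item 3: hidden-to-weight Jacobian, one array per parameter in theta
h2w = jax.jacobian(hidden_only, argnums=1)(h_prev, theta)

# Item 4: hidden-to-hidden Jacobian dh^t / dh^{t-1}, shape (2, 2)
h2h = jax.jacobian(hidden_only, argnums=0)(h_prev, theta)
```

Here `h2w["W"]` has shape `(2, 2, 2)` and `h2w["U"]` has shape `(2, 2, 1)`: the leading axis indexes the hidden state and the remaining axes index the parameter.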
- solve_h2w_h2h_l2h_jacobian(*args)
Solving the hidden-to-weight and hidden-to-hidden Jacobians and the VJP-transformed loss-to-hidden gradients according to the given inputs.
This function is typically used for computing both the forward propagation of the hidden-to-weight Jacobian and the loss-to-hidden gradients at the current time step.
Particularly, this function aims to solve:
The Jacobian matrix of hidden-to-weight. That is, \(\partial h / \partial w\), where \(h\) is the hidden state and \(w\) is the weight.
The Jacobian matrix of hidden-to-hidden. That is, \(\partial h^t / \partial h^{t-1}\), where \(h^t\) is the hidden state at time step \(t\).
The partial gradients of the loss with respect to the hidden states. That is, \(\partial L / \partial h\), where \(L\) is the loss and \(h\) is the hidden state.
- Parameters:
*args – The positional arguments for the model.
- Return type:
Tuple[Any, Dict[Tuple[str, ...], Any], Dict[Tuple[str, ...], Any], Tuple[Dict[int, Array], Dict[Tuple[int, str], Array]], Sequence[Array], VjpResiduals]
- Returns:
The outputs, hidden states, other states, the spatial gradients of the weights, and the residuals.
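To show how the three quantities above fit together, the sketch below combines them in a textbook RTRL-style recursion written in plain JAX. It is an illustration under stated assumptions, not braintrace's algorithm: the cell `step`, the loss `loss`, the running `trace` variable, and the comparison against an unrolled gradient are all hypothetical, and the loss is applied only at the final step.

```python
import jax
import jax.numpy as jnp

def step(h_prev, w, x):
    # Hypothetical one-step cell with an elementwise weight vector w
    return jnp.tanh(w * h_prev + x)

def loss(h):
    # Hypothetical loss on the final hidden state
    return jnp.sum((h - 1.0) ** 2)

w = jnp.array([0.5, -0.3])
h_init = jnp.array([0.1, -0.2])
xs = jnp.array([[1.0, 0.5], [0.3, -0.4], [0.2, 0.1]])

h = h_init
trace = jnp.zeros((2, 2))  # running total derivative dh^t / dw
for x in xs:
    h2h = jax.jacobian(step, argnums=0)(h, w, x)  # dh^t / dh^{t-1}
    h2w = jax.jacobian(step, argnums=1)(h, w, x)  # dh^t / dw (one-step)
    trace = h2h @ trace + h2w                     # forward trace update
    h = step(h, w, x)

# Loss-to-hidden gradient dL / dh^t, obtained through a VJP
loss_val, vjp_fn = jax.vjp(loss, h)
(l2h,) = vjp_fn(jnp.ones_like(loss_val))

# Online weight gradient: dL/dw = (dL/dh^t) @ (dh^t/dw)
online_grad = l2h @ trace

def unrolled_loss(w_):
    # Reference: differentiate through the fully unrolled sequence
    hh = h_init
    for x in xs:
        hh = step(hh, w_, x)
    return loss(hh)

reference_grad = jax.grad(unrolled_loss)(w)
```

For this small example the forward-accumulated gradient should agree with the gradient of the unrolled computation up to floating-point tolerance, which is the property that makes the hidden-to-weight and hidden-to-hidden Jacobians useful for online learning.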