Approximators
BaseModule
BaseModule ()
Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing to nest them in a tree structure. You can assign the submodules as regular attributes:

```python
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
```

Submodules assigned in this way will be registered, and will have their parameters converted too when you call `to()`, etc.

Note: as per the example above, an `__init__()` call to the parent class must be made before assignment on the child.

`training` (bool): whether this module is in training or evaluation mode.
LinearModel
LinearModel (input_size:int, output_size:int, relu_output:bool=False)
Linear regression model
| | Type | Default | Details |
|---|---|---|---|
| input_size | int | | number of features |
| output_size | int | | number of outputs/actions |
| relu_output | bool | False | whether to apply ReLU activation to the output |
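As a usage illustration, here is a minimal sketch of what these parameters imply, assuming the standard `nn.Module` convention of calling `forward(x)` on a batch of feature vectors; the class name `LinearModelSketch` and the exact layer layout are assumptions, not the package's code:

```python
import torch
import torch.nn as nn

class LinearModelSketch(nn.Module):
    """Illustrative stand-in for LinearModel, built only from the documented parameters."""
    def __init__(self, input_size: int, output_size: int, relu_output: bool = False):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)
        self.relu_output = relu_output

    def forward(self, x):
        out = self.linear(x)                     # (batch, output_size)
        return torch.relu(out) if self.relu_output else out

model = LinearModelSketch(input_size=4, output_size=2)
y = model(torch.randn(32, 4))                    # -> shape (32, 2)
```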
MLP
MLP (input_size:int, output_size:int, hidden_layers:list, drop_prob:float=0.0, batch_norm:bool=False, relu_output:bool=False)
Multilayer perceptron model
| | Type | Default | Details |
|---|---|---|---|
| input_size | int | | number of features |
| output_size | int | | number of outputs/actions |
| hidden_layers | list | | list of number of neurons in each hidden layer |
| drop_prob | float | 0.0 | dropout probability |
| batch_norm | bool | False | whether to apply batch normalization |
| relu_output | bool | False | whether to apply ReLU activation to the output |
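A sketch of one plausible reading of these parameters; the actual ordering of batch normalization, activation, and dropout in this package may differ:

```python
import torch
import torch.nn as nn

def mlp_sketch(input_size, output_size, hidden_layers,
               drop_prob=0.0, batch_norm=False, relu_output=False):
    # Illustrative only: stack Linear -> (BatchNorm) -> ReLU -> (Dropout) per hidden layer.
    layers, in_features = [], input_size
    for width in hidden_layers:
        layers.append(nn.Linear(in_features, width))
        if batch_norm:
            layers.append(nn.BatchNorm1d(width))
        layers.append(nn.ReLU())
        if drop_prob > 0.0:
            layers.append(nn.Dropout(drop_prob))
        in_features = width
    layers.append(nn.Linear(in_features, output_size))
    if relu_output:
        layers.append(nn.ReLU())
    return nn.Sequential(*layers)

model = mlp_sketch(input_size=4, output_size=2, hidden_layers=[64, 64], drop_prob=0.1)
y = model(torch.randn(32, 4))   # -> shape (32, 2)
```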
Transformer
Transformer (input_size:int, output_size:int, max_context_length:int=128, n_layer:int=3, n_head:int=8, n_embd_per_head:int=32, rope_scaling:Optional[Dict]=None, min_multiple=256, gating=True, drop_prob:float=0.0, final_activation:Literal['relu','sigmoid','tanh','elu','leakyrelu','identity']='identity')
Transformer model
| | Type | Default | Details |
|---|---|---|---|
| input_size | int | | number of (time steps, features) |
| output_size | int | | number of outputs/actions |
| max_context_length | int | 128 | maximum context length during inference |
| n_layer | int | 3 | number of layers in the transformer |
| n_head | int | 8 | number of heads per layer |
| n_embd_per_head | int | 32 | embedding dimension per head |
| rope_scaling | Optional | None | whether to use RoPE scaling; not implemented yet |
| min_multiple | int | 256 | minimum multiple for the number of neurons in the MLP block of the transformer |
| gating | bool | True | whether to apply the gating mechanism from the original Llama model (used in Lag-Llama) |
| drop_prob | float | 0.0 | dropout probability |
| final_activation | Literal | identity | final activation function |
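An input-shape sketch only; the assumption that the model consumes a `(batch, time steps, features)` tensor and that `max_context_length` bounds how much history is used at inference follows from the parameter descriptions above, not from verified code:

```python
import torch

batch, time_steps, features = 8, 200, 4
x = torch.randn(batch, time_steps, features)   # assumed (batch, time, features) layout
x_ctx = x[:, -128:, :]                         # keep at most max_context_length (=128) recent steps
```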
apply_rotary_pos_emb
apply_rotary_pos_emb (q, k, cos, sin, position_ids)
rotate_half
rotate_half (x)
Rotates half the hidden dims of the input.
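These two helpers implement the standard RoPE rotation used in Llama-style models; a sketch of that computation (the package's `apply_rotary_pos_emb` additionally takes `position_ids` to select the relevant `cos`/`sin` rows, which is omitted here):

```python
import torch

def rotate_half_sketch(x):
    # Split the last dimension in half and swap the halves with a sign flip:
    # (x1, x2) -> (-x2, x1), the building block of the RoPE rotation.
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_pos_emb_sketch(q, k, cos, sin):
    # Rotate queries and keys by the position-dependent angles in cos/sin:
    # q' = q*cos + rotate_half(q)*sin, and likewise for k.
    q_embed = (q * cos) + (rotate_half_sketch(q) * sin)
    k_embed = (k * cos) + (rotate_half_sketch(k) * sin)
    return q_embed, k_embed
```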
LlamaRotaryEmbedding
LlamaRotaryEmbedding (dim, max_position_embeddings=2048, base=10000, device=None)
Rotary positional embeddings (RoPE) based on https://arxiv.org/abs/2104.09864. Code follows the implementation in https://github.com/time-series-foundation-models/lag-llama.
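The core of RoPE as described in the paper above is a set of per-dimension inverse frequencies derived from `base`, turned into cosine/sine tables per position; a sketch of that cache construction (the exact buffer layout inside `LlamaRotaryEmbedding` may differ):

```python
import torch

def rope_tables_sketch(dim, max_position_embeddings=2048, base=10000):
    # Inverse frequencies for the even dimensions, then one angle per (position, dim/2) pair.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    t = torch.arange(max_position_embeddings).float()
    freqs = torch.outer(t, inv_freq)          # (positions, dim/2)
    emb = torch.cat((freqs, freqs), dim=-1)   # (positions, dim)
    return emb.cos(), emb.sin()               # tables later indexed by position_ids
```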
find_multiple
find_multiple (n:int, k:int)
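No docstring is given; if this mirrors the helper of the same name in the Lag-Llama code referenced throughout this module, it rounds `n` up to the nearest multiple of `k` (used to size the MLP block via `min_multiple`). A sketch of that assumed behaviour:

```python
def find_multiple_sketch(n: int, k: int) -> int:
    # Smallest multiple of k that is >= n (assumed semantics).
    if n % k == 0:
        return n
    return n + k - (n % k)

assert find_multiple_sketch(300, 256) == 512
assert find_multiple_sketch(256, 256) == 256
```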
CausalSelfAttention
CausalSelfAttention (n_embd_per_head, n_head, block_size, dropout)
Causal self-attention module. Based on the implementation in https://github.com/time-series-foundation-models/lag-llama, without using a kv_cache since we always make a prediction only for the next step.
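A minimal sketch of the causal masking this module relies on (each position attends only to itself and earlier positions); the real module additionally projects to `n_head` heads of size `n_embd_per_head` and applies RoPE, which is omitted here:

```python
import torch
import torch.nn.functional as F

def causal_attention_sketch(q, k, v, dropout=0.0):
    # q, k, v: (batch, n_head, time, head_dim); is_causal=True applies the
    # lower-triangular mask so position t never attends to positions > t.
    return F.scaled_dot_product_attention(q, k, v, dropout_p=dropout, is_causal=True)

q = k = v = torch.randn(2, 8, 16, 32)     # batch=2, heads=8, time=16, head_dim=32
out = causal_attention_sketch(q, k, v)    # same shape as q
```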
MLP_block
MLP_block (n_embd_per_head, n_head, min_multiple=256, gating=True)
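Feed-forward block used inside the transformer's `Block`, with the hidden width presumably derived from `n_embd_per_head * n_head` (rounded with `find_multiple` against `min_multiple`) and optional Llama-style gating (see the `gating` flag of `Transformer`). A sketch of the gated variant, with layer names and exact widths as assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMLPSketch(nn.Module):
    # Illustrative only: with gating, a SiLU-activated branch is multiplied by a
    # linear gate branch before projecting back, as in the Llama feed-forward.
    def __init__(self, n_embd: int, hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(n_embd, hidden, bias=False)
        self.fc2 = nn.Linear(n_embd, hidden, bias=False)   # gate branch
        self.proj = nn.Linear(hidden, n_embd, bias=False)

    def forward(self, x):
        return self.proj(F.silu(self.fc1(x)) * self.fc2(x))
```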
RMSNorm
RMSNorm (size:int, dim:int=-1, eps:float=1e-05)
*Root Mean Square Layer Normalization as implemented in https://github.com/time-series-foundation-models/lag-llama.
Derived from https://github.com/bzhangGo/rmsnorm/blob/master/rmsnorm_torch.py. BSD 3-Clause License: https://github.com/bzhangGo/rmsnorm/blob/master/LICENSE.*
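A sketch of the RMSNorm computation these references describe: scale the input by the reciprocal of its root mean square over `dim`, then apply a learned per-feature gain (no mean-centring, unlike LayerNorm):

```python
import torch
import torch.nn as nn

class RMSNormSketch(nn.Module):
    def __init__(self, size: int, dim: int = -1, eps: float = 1e-5):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(size))
        self.dim, self.eps = dim, eps

    def forward(self, x):
        norm_x = torch.mean(x * x, dim=self.dim, keepdim=True)   # mean of squares
        return self.scale * x * torch.rsqrt(norm_x + self.eps)
```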
Block
Block (n_embd_per_head, n_head, block_size, dropout, min_multiple=256, gating=True)
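Given the constructor arguments, this is presumably the transformer block that combines `CausalSelfAttention` and `MLP_block`. A wiring sketch of the pre-norm residual layout used in Llama-style models; the actual norm placement in this package is an assumption:

```python
import torch.nn as nn

class BlockSketch(nn.Module):
    def __init__(self, attn: nn.Module, mlp: nn.Module, norm1: nn.Module, norm2: nn.Module):
        super().__init__()
        self.attn, self.mlp = attn, mlp
        self.norm1, self.norm2 = norm1, norm2

    def forward(self, x):
        x = x + self.attn(self.norm1(x))   # attention sub-layer with residual
        x = x + self.mlp(self.norm2(x))    # feed-forward sub-layer with residual
        return x
```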