Approximators
BaseModule
BaseModule ()
Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing to nest them in a tree structure. You can assign the submodules as regular attributes:

```python
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
```

Submodules assigned in this way will be registered, and will have their parameters converted too when you call `to()`, etc.

Note: as per the example above, an `__init__()` call to the parent class must be made before assignment on the child.

`training` (bool): whether this module is in training or evaluation mode.
LinearModel
LinearModel (input_size:int, output_size:int, relu_output:bool=False)
Linear regression model
| | Type | Default | Details |
|---|---|---|---|
| input_size | int | | number of features |
| output_size | int | | number of outputs/actions |
| relu_output | bool | False | whether to apply ReLU activation to the output |
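As a usage illustration, here is a minimal sketch of what these parameters imply, assuming the standard `nn.Module` convention of calling `forward(x)` on a batch of feature vectors; the class name `LinearModelSketch` and the exact layer layout are assumptions, not the package's code:

```python
import torch
import torch.nn as nn

class LinearModelSketch(nn.Module):
    """Illustrative stand-in for LinearModel, built only from the documented parameters."""
    def __init__(self, input_size: int, output_size: int, relu_output: bool = False):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)
        self.relu_output = relu_output

    def forward(self, x):
        out = self.linear(x)                     # (batch, output_size)
        return torch.relu(out) if self.relu_output else out

model = LinearModelSketch(input_size=4, output_size=2)
y = model(torch.randn(32, 4))                    # -> shape (32, 2)
```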
MLP
MLP (input_size:int, output_size:int, hidden_layers:list, drop_prob:float=0.0, batch_norm:bool=False, relu_output:bool=False)
Multilayer perceptron model
| | Type | Default | Details |
|---|---|---|---|
| input_size | int | | number of features |
| output_size | int | | number of outputs/actions |
| hidden_layers | list | | list of number of neurons in each hidden layer |
| drop_prob | float | 0.0 | dropout probability |
| batch_norm | bool | False | whether to apply batch normalization |
| relu_output | bool | False | whether to apply ReLU activation to the output |
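A sketch of one plausible reading of these parameters; the actual ordering of batch normalization, activation, and dropout in this package may differ:

```python
import torch
import torch.nn as nn

def mlp_sketch(input_size, output_size, hidden_layers,
               drop_prob=0.0, batch_norm=False, relu_output=False):
    # Illustrative only: stack Linear -> (BatchNorm) -> ReLU -> (Dropout) per hidden layer.
    layers, in_features = [], input_size
    for width in hidden_layers:
        layers.append(nn.Linear(in_features, width))
        if batch_norm:
            layers.append(nn.BatchNorm1d(width))
        layers.append(nn.ReLU())
        if drop_prob > 0.0:
            layers.append(nn.Dropout(drop_prob))
        in_features = width
    layers.append(nn.Linear(in_features, output_size))
    if relu_output:
        layers.append(nn.ReLU())
    return nn.Sequential(*layers)

model = mlp_sketch(input_size=4, output_size=2, hidden_layers=[64, 64], drop_prob=0.1)
y = model(torch.randn(32, 4))   # -> shape (32, 2)
```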
Transformer
Transformer (input_size:int, output_size:int, max_context_length:int=128, n_layer:int=3, n_head:int=8, n_embd_per_head:int=32, rope_scaling:Optional[Dict]=None, min_multiple=256, gating=True, drop_prob:float=0.0, final_activation:Literal['relu','sigmoid','tanh','elu','leakyrelu','identity']='identity')
Transformer model
| | Type | Default | Details |
|---|---|---|---|
| input_size | int | | number of (time steps, features) |
| output_size | int | | number of outputs/actions |
| max_context_length | int | 128 | maximum context length during inference |
| n_layer | int | 3 | number of layers in the transformer |
| n_head | int | 8 | number of heads per layer |
| n_embd_per_head | int | 32 | embedding dimension per head |
| rope_scaling | Optional | None | whether to use RoPE scaling; not implemented yet |
| min_multiple | int | 256 | minimum multiple for the number of neurons in the MLP block of the transformer |
| gating | bool | True | whether to apply the gating mechanism from the original Llama model (used in Lag-Llama) |
| drop_prob | float | 0.0 | dropout probability |
| final_activation | Literal | identity | final activation function |
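An input-shape sketch only; the assumption that the model consumes a `(batch, time steps, features)` tensor and that `max_context_length` bounds how much history is used at inference follows from the parameter descriptions above, not from verified code:

```python
import torch

batch, time_steps, features = 8, 200, 4
x = torch.randn(batch, time_steps, features)   # assumed (batch, time, features) layout
x_ctx = x[:, -128:, :]                         # keep at most max_context_length (=128) recent steps
```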
apply_rotary_pos_emb
apply_rotary_pos_emb (q, k, cos, sin, position_ids)
rotate_half
rotate_half (x)
Rotates half the hidden dims of the input.
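These two helpers implement the standard RoPE rotation used in Llama-style models; a sketch of that computation (the package's `apply_rotary_pos_emb` additionally takes `position_ids` to select the relevant `cos`/`sin` rows, which is omitted here):

```python
import torch

def rotate_half_sketch(x):
    # Split the last dimension in half and swap the halves with a sign flip:
    # (x1, x2) -> (-x2, x1), the building block of the RoPE rotation.
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_pos_emb_sketch(q, k, cos, sin):
    # Rotate queries and keys by the position-dependent angles in cos/sin:
    # q' = q*cos + rotate_half(q)*sin, and likewise for k.
    q_embed = (q * cos) + (rotate_half_sketch(q) * sin)
    k_embed = (k * cos) + (rotate_half_sketch(k) * sin)
    return q_embed, k_embed
```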
LlamaRotaryEmbedding
LlamaRotaryEmbedding (dim, max_position_embeddings=2048, base=10000, device=None)
Rotary positional embeddings (RoPE) based on https://arxiv.org/abs/2104.09864. Code follows the implementation in https://github.com/time-series-foundation-models/lag-llama.
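The core of RoPE as described in the paper above is a set of per-dimension inverse frequencies derived from `base`, turned into cosine/sine tables per position; a sketch of that cache construction (the exact buffer layout inside `LlamaRotaryEmbedding` may differ):

```python
import torch

def rope_tables_sketch(dim, max_position_embeddings=2048, base=10000):
    # Inverse frequencies for the even dimensions, then one angle per (position, dim/2) pair.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    t = torch.arange(max_position_embeddings).float()
    freqs = torch.outer(t, inv_freq)          # (positions, dim/2)
    emb = torch.cat((freqs, freqs), dim=-1)   # (positions, dim)
    return emb.cos(), emb.sin()               # tables later indexed by position_ids
```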
find_multiple
find_multiple (n:int, k:int)
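No docstring is given; if this mirrors the helper of the same name in the Lag-Llama code referenced throughout this module, it rounds `n` up to the nearest multiple of `k` (used to size the MLP block via `min_multiple`). A sketch of that assumed behaviour:

```python
def find_multiple_sketch(n: int, k: int) -> int:
    # Smallest multiple of k that is >= n (assumed semantics).
    if n % k == 0:
        return n
    return n + k - (n % k)

assert find_multiple_sketch(300, 256) == 512
assert find_multiple_sketch(256, 256) == 256
```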
CausalSelfAttention
CausalSelfAttention (n_embd_per_head, n_head, block_size, dropout)
Causal self-attention module. Based on the implementation in https://github.com/time-series-foundation-models/lag-llama, without using a kv_cache since we always make a prediction only for the next step.
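A minimal sketch of the causal masking this module relies on (each position attends only to itself and earlier positions); the real module additionally projects to `n_head` heads of size `n_embd_per_head` and applies RoPE, which is omitted here:

```python
import torch
import torch.nn.functional as F

def causal_attention_sketch(q, k, v, dropout=0.0):
    # q, k, v: (batch, n_head, time, head_dim); is_causal=True applies the
    # lower-triangular mask so position t never attends to positions > t.
    return F.scaled_dot_product_attention(q, k, v, dropout_p=dropout, is_causal=True)

q = k = v = torch.randn(2, 8, 16, 32)     # batch=2, heads=8, time=16, head_dim=32
out = causal_attention_sketch(q, k, v)    # same shape as q
```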
MLP_block
MLP_block (n_embd_per_head, n_head, min_multiple=256, gating=True)
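Feed-forward block used inside the transformer's `Block`, with the hidden width presumably derived from `n_embd_per_head * n_head` (rounded with `find_multiple` against `min_multiple`) and optional Llama-style gating (see the `gating` flag of `Transformer`). A sketch of the gated variant, with layer names and exact widths as assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMLPSketch(nn.Module):
    # Illustrative only: with gating, a SiLU-activated branch is multiplied by a
    # linear gate branch before projecting back, as in the Llama feed-forward.
    def __init__(self, n_embd: int, hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(n_embd, hidden, bias=False)
        self.fc2 = nn.Linear(n_embd, hidden, bias=False)   # gate branch
        self.proj = nn.Linear(hidden, n_embd, bias=False)

    def forward(self, x):
        return self.proj(F.silu(self.fc1(x)) * self.fc2(x))
```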
RMSNorm
RMSNorm (size:int, dim:int=-1, eps:float=1e-05)
*Root Mean Square Layer Normalization as implemented in https://github.com/time-series-foundation-models/lag-llama.
Derived from https://github.com/bzhangGo/rmsnorm/blob/master/rmsnorm_torch.py. BSD 3-Clause License: https://github.com/bzhangGo/rmsnorm/blob/master/LICENSE.*
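A sketch of the RMSNorm computation these references describe: scale the input by the reciprocal of its root mean square over `dim`, then apply a learned per-feature gain (no mean-centring, unlike LayerNorm):

```python
import torch
import torch.nn as nn

class RMSNormSketch(nn.Module):
    def __init__(self, size: int, dim: int = -1, eps: float = 1e-5):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(size))
        self.dim, self.eps = dim, eps

    def forward(self, x):
        norm_x = torch.mean(x * x, dim=self.dim, keepdim=True)   # mean of squares
        return self.scale * x * torch.rsqrt(norm_x + self.eps)
```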
Block
Block (n_embd_per_head, n_head, block_size, dropout, min_multiple=256, gating=True)
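Given the constructor arguments, this is presumably the transformer block that combines `CausalSelfAttention` and `MLP_block`. A wiring sketch of the pre-norm residual layout used in Llama-style models; the actual norm placement in this package is an assumption:

```python
import torch.nn as nn

class BlockSketch(nn.Module):
    def __init__(self, attn: nn.Module, mlp: nn.Module, norm1: nn.Module, norm2: nn.Module):
        super().__init__()
        self.attn, self.mlp = attn, mlp
        self.norm1, self.norm2 = norm1, norm2

    def forward(self, x):
        x = x + self.attn(self.norm1(x))   # attention sub-layer with residual
        x = x + self.mlp(self.norm2(x))    # feed-forward sub-layer with residual
        return x
```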