from ddopai.envs.inventory.single_period import NewsvendorEnv
from ddopai.dataloaders.tabular import XYDataLoader
from ddopai.experiments.experiment_functions import run_experiment, test_agent
ERM agents
NewsvendorXGBAgent
NewsvendorXGBAgent (environment_info:ddopai.utils.MDPInfo, cu:float|numpy.ndarray, co:float|numpy.ndarray, obsprocessors:Optional[List[object]]=None, agent_name:str|None='XGBAgent', eta:float=0.3, gamma:float=0, max_depth:int=6, min_child_weight:float=1, max_delta_step:float=0, subsample:float=1, sampling_method:str='uniform', colsample_bytree:float=1, colsample_bylevel:float=1, colsample_bynode:float=1, lambda_:float=1, alpha:float=0, tree_method:str='auto', scale_pos_weight:float=1, refresh_leaf:int=1, grow_policy:str='depthwise', max_leaves:int=0, max_bin:int=256, num_parallel_tree:int=1, multi_strategy:str='one_output_per_tree', max_cached_hist_node:int=65536, nthread:int=1, device:str='CPU')
Agent solving the Newsvendor problem within the ERM framework (i.e., using quantile regression) using the XGBoost library.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
cu | float \| numpy.ndarray | | underage cost
co | float \| numpy.ndarray | | overage cost
obsprocessors | Optional | None |
agent_name | str \| None | XGBAgent |
eta | float | 0.3 | XGB params
gamma | float | 0 |
max_depth | int | 6 |
min_child_weight | float | 1 |
max_delta_step | float | 0 |
subsample | float | 1 |
sampling_method | str | uniform |
colsample_bytree | float | 1 |
colsample_bylevel | float | 1 |
colsample_bynode | float | 1 |
lambda_ | float | 1 |
alpha | float | 0 |
tree_method | str | auto |
scale_pos_weight | float | 1 |
refresh_leaf | int | 1 | updater will always use default
grow_policy | str | depthwise | process type will always use default
max_leaves | int | 0 |
max_bin | int | 256 |
num_parallel_tree | int | 1 |
multi_strategy | str | one_output_per_tree |
max_cached_hist_node | int | 65536 |
nthread | int | 1 | General params
device | str | CPU |
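The link between the cost parameters and quantile regression can be sketched without XGBoost: with underage cost `cu` and overage cost `co`, the order quantity that minimises the average cost on a sample is an empirical `cu / (cu + co)` quantile of demand. The helper names below are illustrative only and are not part of ddopai.

```python
def newsvendor_cost(order, demand, cu, co):
    """Single-period cost: cu per unit short, co per unit left over."""
    return cu * max(demand - order, 0.0) + co * max(order - demand, 0.0)


def average_cost(order, demands, cu, co):
    """Empirical risk of a fixed order quantity on a demand sample."""
    return sum(newsvendor_cost(order, d, cu, co) for d in demands) / len(demands)


def best_order(demands, cu, co):
    # The empirical cost is piecewise linear in the order quantity,
    # so a minimiser always lies on one of the observed demand values.
    return min(sorted(demands), key=lambda q: average_cost(q, demands, cu, co))


demands = [3.0, 7.0, 5.0, 9.0, 1.0, 4.0, 8.0, 2.0, 6.0, 10.0]
cu, co = 3.0, 1.0
critical_ratio = cu / (cu + co)          # target quantile level
q_star = best_order(demands, cu, co)     # cost-minimising order on this sample
share_covered = sum(d <= q_star for d in demands) / len(demands)
```

On this toy sample the minimiser is the smallest observed demand whose empirical coverage reaches the critical ratio, which is the quantity a quantile-regression agent tries to predict from features.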
SGDBaseAgent
SGDBaseAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, input_shape:Tuple, output_shape:Tuple, dataset_params:Optional[dict]=None, dataloader_params:Optional[dict]=None, optimizer_params:Optional[dict]=None, learning_rate_scheduler_params:Optional[Dict]=None, obsprocessors:Optional[List]=None, device:str='cpu', agent_name:str|None=None, test_batch_size:int=1024, receive_batch_dim:bool=False)
Base class for Agents that are trained using Stochastic Gradient Descent (SGD) on PyTorch models.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
dataloader | BaseDataLoader | |
input_shape | Tuple | |
output_shape | Tuple | |
dataset_params | Optional | None | parameters needed to convert the dataloader to a torch dataset
dataloader_params | Optional | None | default: {"batch_size": 32, "shuffle": True}
optimizer_params | Optional | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
learning_rate_scheduler_params | Optional | None | default: None. If dict, then first key is "scheduler" and the rest are the parameters
obsprocessors | Optional | None | default: []
device | str | cpu | "cuda" or "cpu"
agent_name | str \| None | None |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
Important notes:
SGD-based agents are all agents that are trained via SGD such as Linear Models or Neural Networks. Some specific requirements are necessary to make them interface properly with the environment.
Torch pre-processors:
- In addition to the general NumPy-based pre-processors, we also provide pre-processors that work at the tensor level within the `fit_epoch` method and the `predict` method. They can be used in addition to the NumPy-based pre-processors or instead of them. It is important to ensure that the shape of observations (after pre-processing) is the same for those from the environment and those from the dataloader during training.
Dataloader:
- As for normal supervised learning via Torch, we make use of the Torch dataloader to load the data. Instead of defining a custom dataset class, we provide a Wrapper that can be used around our dataloader to make its output and interface the same as a Torch dataset. The dataloader is then initialized when the agent is created such that the agent has access to the same dataloader as the environment.
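A PyTorch map-style dataset only needs `__len__` and `__getitem__`, so the wrapper idea can be sketched without importing Torch. `ToyXYLoader` and `DatasetWrapper` are hypothetical stand-ins for illustration; the wrapper shipped with ddopai may look different.

```python
class ToyXYLoader:
    """Hypothetical stand-in for a dataloader holding features X and demand Y."""

    def __init__(self, X, Y):
        assert len(X) == len(Y)
        self.X, self.Y = X, Y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.Y[idx]


class DatasetWrapper:
    """Exposes the loader through the map-style dataset protocol."""

    def __init__(self, dataloader):
        # Keep a reference, not a copy: agent and environment share one loader.
        self.dataloader = dataloader

    def __len__(self):
        return len(self.dataloader)

    def __getitem__(self, idx):
        features, demand = self.dataloader[idx]
        return features, demand


loader = ToyXYLoader(X=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], Y=[[1.0], [2.0], [3.0]])
dataset = DatasetWrapper(loader)
```

An object with this interface could then be handed to `torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)`, which is where the `dataloader_params` would come into play.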
Training process:
- The outer loop of the training process (epochs) is handled outside the agent by the [`run_experiment`](https://opimwue.github.io/ddopai/40_experiments/experiment_functions.html#run_experiment) function (or can be customized). The agent needs to have a `fit_epoch` method that tells the agent what to do within an epoch. This includes:
  - Getting the data from the dataloader
  - Pre-processing the data
  - Forward pass
  - Loss calculation
  - Backward pass
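The division of labour between the outer epoch loop and a per-epoch method can be sketched in plain Python. This is a toy stand-in, not the ddopai implementation: it fits a linear model with mini-batch SGD on the quantile loss, and the gradient is written out by hand where a PyTorch agent would call `backward()`. All names are illustrative.

```python
import random


def fit_epoch(weights, data, tau, lr, batch_size, rng):
    """One pass over the data; returns the average quantile loss seen during the epoch."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    epoch_loss = 0.0
    for start in range(0, len(indices), batch_size):
        batch = [data[i] for i in indices[start:start + batch_size]]  # 1. get data
        grad = [0.0] * len(weights)
        for x, demand in batch:
            x = list(x) + [1.0]                                # 2. pre-process: append bias feature
            order = sum(w * xi for w, xi in zip(weights, x))   # 3. forward pass
            diff = demand - order
            epoch_loss += max(tau * diff, (tau - 1.0) * diff)  # 4. loss calculation
            slope = -tau if diff > 0 else 1.0 - tau            # d(loss)/d(order)
            for j, xi in enumerate(x):
                grad[j] += slope * xi / len(batch)
        for j in range(len(weights)):                          # 5. gradient step ("backward pass")
            weights[j] -= lr * grad[j]
    return epoch_loss / len(data)


rng = random.Random(0)
data = [((rng.random(),), rng.random()) for _ in range(200)]
weights = [0.0, 0.0]  # one feature weight plus bias

# Outer loop over epochs, handled outside the per-epoch method.
losses = [fit_epoch(weights, data, tau=0.3, lr=0.05, batch_size=32, rng=rng)
          for _ in range(20)]
```

Keeping the epoch loop outside means evaluation, early stopping, or logging can be inserted between epochs without touching the agent.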
SGDBaseAgent.set_dataloader
SGDBaseAgent.set_dataloader (dataloader:ddopai.dataloaders.base.BaseDataLoader, dataset_params:dict, dataloader_params:dict)
Set the dataloader for the agent by wrapping it into a Torch Dataset
Parameter | Type | Details
---|---|---
dataloader | BaseDataLoader |
dataset_params | dict |
dataloader_params | dict | dict with keys: batch_size, shuffle
Returns | None |
SGDBaseAgent.set_loss_function
SGDBaseAgent.set_loss_function ()
Set loss function for the model
SGDBaseAgent.set_model
SGDBaseAgent.set_model (input_shape:Tuple, output_shape:Tuple)
Set the model for the agent
SGDBaseAgent.set_optimizer
SGDBaseAgent.set_optimizer (optimizer_params:dict)
Set the optimizer for the model
Parameter | Type | Details
---|---|---
optimizer_params | dict | dict with keys: optimizer, lr, weight_decay
SGDBaseAgent.set_learning_rate_scheduler
SGDBaseAgent.set_learning_rate_scheduler (learning_rate_scheduler_params)
Set learning rate scheduler (can be None)
Parameter | Details
---|---
learning_rate_scheduler_params |
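Both parameter dicts follow the same pattern: one key names the component and the remaining keys are its keyword arguments. A minimal sketch of that split is shown below; the helper `split_named_params` and the `"StepLR"` scheduler entry are illustrative assumptions, not taken from ddopai.

```python
def split_named_params(params, name_key):
    """Separate the component name (e.g. "Adam") from its keyword arguments."""
    params = dict(params)          # copy so the caller's dict stays untouched
    name = params.pop(name_key)
    return name, params


optimizer_params = {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
# Hypothetical scheduler configuration, for illustration only.
scheduler_params = {"scheduler": "StepLR", "step_size": 10, "gamma": 0.5}

opt_name, opt_kwargs = split_named_params(optimizer_params, "optimizer")
sched_name, sched_kwargs = split_named_params(scheduler_params, "scheduler")
```

With PyTorch available, such a pair could be resolved along the lines of `getattr(torch.optim, opt_name)(model.parameters(), **opt_kwargs)`; whether ddopai resolves names this way is not stated on this page.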
SGDBaseAgent.fit_epoch
SGDBaseAgent.fit_epoch ()
Fit the model for one epoch using the dataloader
SGDBaseAgent.draw_action_
SGDBaseAgent.draw_action_ (observation:numpy.ndarray)
Draw an action based on the fitted model (see predict method)
Parameter | Type | Details
---|---|---
observation | ndarray |
Returns | ndarray |
SGDBaseAgent.predict
SGDBaseAgent.predict (X:numpy.ndarray)
Do one forward pass of the model and return the prediction
Parameter | Type | Details
---|---|---
X | ndarray |
Returns | ndarray |
SGDBaseAgent.train
SGDBaseAgent.train ()
Set the internal state of the agent and its model to train
SGDBaseAgent.eval
SGDBaseAgent.eval ()
Set the internal state of the agent and its model to eval
SGDBaseAgent.to
SGDBaseAgent.to (device:str)
Move the model to the specified device
Parameter | Type | Details
---|---|---
device | str |
SGDBaseAgent.save
SGDBaseAgent.save (path:str, overwrite:bool=True)
Save the PyTorch model to a file in the specified directory.
Parameter | Type | Default | Details
---|---|---|---
path | str | | The directory where the file will be saved.
overwrite | bool | True | Allow overwriting; if False, a FileExistsError will be raised if the file exists.
SGDBaseAgent.load
SGDBaseAgent.load (path:str)
Load the PyTorch model from a file.
Parameter | Type | Details
---|---|---
path | str | Only the path to the folder is needed, not the file itself
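The folder-based `save`/`load` contract, including the `overwrite` flag, can be illustrated with the standard library alone. The file name `model.bin` and the byte payload are placeholders; the real agent stores a PyTorch model under a file name of its own choosing.

```python
import tempfile
from pathlib import Path

MODEL_FILE = "model.bin"  # hypothetical file name, for illustration only


def save(payload: bytes, path: str, overwrite: bool = True) -> Path:
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / MODEL_FILE
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists")
    target.write_bytes(payload)
    return target


def load(path: str) -> bytes:
    # Only the folder is passed in; the file name is resolved internally.
    return (Path(path) / MODEL_FILE).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    save(b"weights-v1", tmp)
    save(b"weights-v2", tmp)              # overwrite=True replaces the file
    restored = load(tmp)
    try:
        save(b"weights-v3", tmp, overwrite=False)
        refused = False
    except FileExistsError:
        refused = True
```

Passing `overwrite=False` is a cheap safeguard when several experiment runs write to the same results directory.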
NVBaseAgent
NVBaseAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, cu:numpy.ndarray|ddopai.utils.Parameter, co:numpy.ndarray|ddopai.utils.Parameter, input_shape:Tuple, output_shape:Tuple, optimizer_params:dict|None=None, learning_rate_scheduler_params=None, dataset_params:dict|None=None, dataloader_params:dict|None=None, obsprocessors:list|None=None, device:str='cpu', agent_name:str|None=None, test_batch_size:int=1024, receive_batch_dim:bool=False, loss_function:Literal['quantile','pinball']='quantile')
Base agent for the Newsvendor problem implementing the loss function for the Empirical Risk Minimization (ERM) approach based on quantile loss.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
dataloader | BaseDataLoader | |
cu | numpy.ndarray \| ddopai.utils.Parameter | |
co | numpy.ndarray \| ddopai.utils.Parameter | |
input_shape | Tuple | |
output_shape | Tuple | |
optimizer_params | dict \| None | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
learning_rate_scheduler_params | NoneType | None | TODO: add base class for learning rate scheduler for typing
dataset_params | dict \| None | None | parameters needed to convert the dataloader to a torch dataset
dataloader_params | dict \| None | None | default: {"batch_size": 32, "shuffle": True}
obsprocessors | list \| None | None | default: []
device | str | cpu | "cuda" or "cpu"
agent_name | str \| None | None |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
loss_function | Literal | quantile |
NVBaseAgent.set_loss_function
NVBaseAgent.set_loss_function ()
Set the loss function for the model to the quantile loss. For training the model uses quantile loss and not the pinball loss with specific cu and co values to ensure similar scale of the feedback signal during training.
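The scale argument can be made concrete with a small sketch. It reads the distinction above as follows, which is an assumption about the naming: the quantile loss uses the level `tau = cu / (cu + co)`, while the pinball loss weights shortages and leftovers with the raw `cu` and `co`. Under that reading the two differ only by the factor `cu + co`.

```python
def quantile_loss(order, demand, tau):
    """Quantile loss at level tau; bounded by |demand - order| whatever the costs."""
    diff = demand - order
    return max(tau * diff, (tau - 1.0) * diff)


def pinball_loss(order, demand, cu, co):
    """Cost-weighted loss: cu per unit short, co per unit left over."""
    return cu * max(demand - order, 0.0) + co * max(order - demand, 0.0)


cu, co = 2.0, 6.0
tau = cu / (cu + co)

# Four units short.
q_short = quantile_loss(10.0, 14.0, tau)
p_short = pinball_loss(10.0, 14.0, cu, co)

# Four units left over.
q_over = quantile_loss(14.0, 10.0, tau)
p_over = pinball_loss(14.0, 10.0, cu, co)
```

Both losses share the same minimiser, so training on the quantile version changes the gradient magnitude but not the target, which keeps learning rates comparable across products with very different cost levels.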
NewsvendorlERMAgent
NewsvendorlERMAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, cu:numpy.ndarray|ddopai.utils.Parameter, co:numpy.ndarray|ddopai.utils.Parameter, input_shape:Tuple, output_shape:Tuple, optimizer_params:dict|None=None, learning_rate_scheduler_params=None, model_params:dict|None=None, dataset_params:dict|None=None, dataloader_params:dict|None=None, obsprocessors:list|None=None, device:str='cpu', agent_name:str|None='lERM', test_batch_size:int=1024, receive_batch_dim:bool=False, loss_function:Literal['quantile','pinball']='quantile')
Newsvendor agent implementing Empirical Risk Minimization (ERM) approach based on a linear (regression) model. Note that this implementation finds the optimal regression parameters via SGD.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
dataloader | BaseDataLoader | |
cu | numpy.ndarray \| ddopai.utils.Parameter | |
co | numpy.ndarray \| ddopai.utils.Parameter | |
input_shape | Tuple | |
output_shape | Tuple | |
optimizer_params | dict \| None | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
learning_rate_scheduler_params | NoneType | None | TODO: add base class for learning rate scheduler for typing
model_params | dict \| None | None | default: {"relu_output": False}
dataset_params | dict \| None | None | parameters needed to convert the dataloader to a torch dataset
dataloader_params | dict \| None | None | default: {"batch_size": 32, "shuffle": True}
obsprocessors | list \| None | None | default: []
device | str | cpu | "cuda" or "cpu"
agent_name | str \| None | lERM |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
loss_function | Literal | quantile |
Further information:
References
----------
.. [1] Gah-Yi Ban, Cynthia Rudin, "The Big Data Newsvendor: Practical Insights
from Machine Learning", 2018.
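Written out, the linear ERM problem that this agent approximates with SGD can be stated as follows. The notation is chosen here for illustration: $x_i$ are the feature vectors, $d_i$ the observed demands, $\beta$ the regression coefficients, and $c_u$, $c_o$ the underage and overage costs.

```latex
\min_{\beta} \; \frac{1}{n} \sum_{i=1}^{n}
  \Big[ c_u \,\big(d_i - \beta^{\top} x_i\big)^{+}
      + c_o \,\big(\beta^{\top} x_i - d_i\big)^{+} \Big],
\qquad (z)^{+} = \max(z, 0)
```

Dividing the objective by $c_u + c_o$ gives a quantile-regression problem at level $c_u / (c_u + c_o)$ with the same minimiser.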
NewsvendorlERMAgent.set_model
NewsvendorlERMAgent.set_model (input_shape, output_shape)
Set the model for the agent to a linear model
Example usage:
import numpy as np

val_index_start = 800 #90_000
test_index_start = 900 #100_000

X = np.random.rand(1000, 2)
Y = np.random.rand(1000, 1)

dataloader = XYDataLoader(X, Y, val_index_start, test_index_start)

environment = NewsvendorEnv(
    dataloader = dataloader,
    underage_cost = 0.42857,
    overage_cost = 1.0,
    gamma = 0.999,
    horizon_train = 365,
)

agent = NewsvendorlERMAgent(environment.mdp_info,
                            dataloader,
                            cu=np.array([0.42857]),
                            co=np.array([1.0]),
                            input_shape=(2,),
                            output_shape=(1,),
                            optimizer_params= {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}, # other optimizers: "SGD", "RMSprop"
                            learning_rate_scheduler_params = None, # TODO add base class for learning rate scheduler for typing
                            model_params = {"relu_output": False},
                            dataloader_params={"batch_size": 32, "shuffle": True},
                            device = "cpu", # "cuda" or "cpu"
                            )

# evaluate the untrained agent on the test set
environment.test()
agent.eval()

R, J = test_agent(agent, environment)

print(R, J)

run_experiment(agent, environment, 2, run_id = "test") # fit agent via run_experiment function

# evaluate again after training
environment.test()
agent.eval()

R, J = test_agent(agent, environment)

print(R, J)
input shape (2,)
INFO:root:Network architecture:
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
LinearModel [1, 1] --
├─Linear: 1-1 [1, 1] 3
├─Identity: 1-2 [1, 1] --
==========================================================================================
Total params: 3
Trainable params: 3
Non-trainable params: 0
Total mult-adds (M): 0.00
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
==========================================================================================
INFO:root:Starting experiment
INFO:root:Initial evaluation: R=-29.736253318797445, J=-28.287550833928687
INFO:root:Starting training with epochs fit
-23.17678889235405 -22.124720267178684
Experiment directory: results/test
100%|██████████| 25/25 [00:00<00:00, 903.73it/s]
100%|██████████| 25/25 [00:00<00:00, 1999.34it/s]
100%|██████████| 2/2 [00:00<00:00, 35.22it/s]
INFO:root:Finished training with epochs fit
INFO:root:Evaluation after training: R=-15.499745268755348, J=-14.77032101771835
-16.54230338871762 -15.75806274718322
NewsvendorDLAgent
NewsvendorDLAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, cu:numpy.ndarray|ddopai.utils.Parameter, co:numpy.ndarray|ddopai.utils.Parameter, input_shape:Tuple, output_shape:Tuple, learning_rate_scheduler_params:Optional[Dict]=None, optimizer_params:dict|None=None, model_params:dict|None=None, dataloader_params:dict|None=None, dataset_params:dict|None=None, device:str='cpu', obsprocessors:list|None=None, agent_name:str|None='DLNV', test_batch_size:int=1024, receive_batch_dim:bool=False, loss_function:Literal['quantile','pinball']='quantile')
Newsvendor agent implementing Empirical Risk Minimization (ERM) approach based on a deep learning model.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
dataloader | BaseDataLoader | |
cu | numpy.ndarray \| ddopai.utils.Parameter | |
co | numpy.ndarray \| ddopai.utils.Parameter | |
input_shape | Tuple | |
output_shape | Tuple | |
learning_rate_scheduler_params | Optional | None |
optimizer_params | dict \| None | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
model_params | dict \| None | None | default: {"hidden_layers": [64, 64], "drop_prob": 0.0, "batch_norm": False, "relu_output": False}
dataloader_params | dict \| None | None | default: {"batch_size": 32, "shuffle": True}
dataset_params | dict \| None | None | parameters needed to convert the dataloader to a torch dataset
device | str | cpu | "cuda" or "cpu"
obsprocessors | list \| None | None | default: []
agent_name | str \| None | DLNV |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
loss_function | Literal | quantile |
Further information:
References
----------
.. [1] Afshin Oroojlooyjadid, Lawrence V. Snyder, Martin Takáč,
"Applying Deep Learning to the Newsvendor Problem", 2018.
NewsvendorDLAgent.set_model
NewsvendorDLAgent.set_model (input_shape, output_shape)
Set the model for the agent to an MLP
Example usage:
# X, Y, val_index_start and test_index_start are reused from the lERM example above
dataloader = XYDataLoader(X, Y, val_index_start, test_index_start)

environment = NewsvendorEnv(
    dataloader = dataloader,
    underage_cost = 0.42857,
    overage_cost = 1.0,
    gamma = 0.999,
    horizon_train = 365,
)

model_params = {
    "hidden_layers": [64, 64],
}

agent = NewsvendorDLAgent(environment.mdp_info,
                          dataloader,
                          cu=np.array([0.42857]),
                          co=np.array([1.0]),
                          input_shape=(2,),
                          output_shape=(1,),
                          optimizer_params= {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}, # other optimizers: "SGD", "RMSprop"
                          learning_rate_scheduler_params = None, # TODO add base class for learning rate scheduler for typing
                          model_params = model_params,
                          dataloader_params={"batch_size": 32, "shuffle": True},
                          device = "cpu" # "cuda" or "cpu"
                          )

# evaluate the untrained agent on the test set
environment.test()
agent.eval()

R, J = test_agent(agent, environment)

print(R, J)

run_experiment(agent, environment, 2, run_id = "test") # fit agent via run_experiment function

# evaluate again after training
environment.test()
agent.eval()

R, J = test_agent(agent, environment)

print(R, J)
INFO:root:Network architecture:
input shape (2,)
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
MLP [1, 1] --
├─Sequential: 1-1 [1, 1] --
│ └─Linear: 2-1 [1, 64] 192
│ └─ReLU: 2-2 [1, 64] --
│ └─Dropout: 2-3 [1, 64] --
│ └─Linear: 2-4 [1, 64] 4,160
│ └─ReLU: 2-5 [1, 64] --
│ └─Dropout: 2-6 [1, 64] --
│ └─Linear: 2-7 [1, 1] 65
│ └─Identity: 2-8 [1, 1] --
==========================================================================================
Total params: 4,417
Trainable params: 4,417
Non-trainable params: 0
Total mult-adds (M): 0.00
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.02
Estimated Total Size (MB): 0.02
==========================================================================================
INFO:root:Starting experiment
INFO:root:Initial evaluation: R=-20.030297947350757, J=-19.11491558256756
INFO:root:Starting training with epochs fit
-22.66337395888819 -21.548795898866043
Experiment directory: results/test
100%|██████████| 25/25 [00:00<00:00, 1212.35it/s]
100%|██████████| 25/25 [00:00<00:00, 1277.10it/s]
100%|██████████| 2/2 [00:00<00:00, 32.30it/s]
INFO:root:Finished training with epochs fit
INFO:root:Evaluation after training: R=-15.082729205825588, J=-14.380392673719802
-16.096224629924393 -15.338865711420437
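The parameter counts in the architecture summary above can be checked by hand: each fully connected layer contributes one weight per input-output pair plus one bias per output. The helper below is a small illustrative calculation, not a ddopai function.

```python
def mlp_param_count(input_dim, hidden_layers, output_dim):
    """Weights plus biases for each fully connected layer of an MLP."""
    sizes = [input_dim, *hidden_layers, output_dim]
    return [n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:])]


# Two input features, hidden_layers=[64, 64], one output (the order quantity).
per_layer = mlp_param_count(input_dim=2, hidden_layers=[64, 64], output_dim=1)
total = sum(per_layer)
```

The three per-layer values correspond to the three `Linear` rows of the summary, and their sum to the reported total; ReLU, Dropout, and Identity layers add no parameters.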
BaseMetaAgent
BaseMetaAgent ()
Initialize self. See help(type(self)) for accurate signature.
NewsvendorlERMMetaAgent
NewsvendorlERMMetaAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, cu:numpy.ndarray|ddopai.utils.Parameter, co:numpy.ndarray|ddopai.utils.Parameter, input_shape:Tuple, output_shape:Tuple, optimizer_params:dict|None=None, learning_rate_scheduler_params=None, model_params:dict|None=None, dataset_params:dict|None=None, dataloader_params:dict|None=None, obsprocessors:list|None=None, device:str='cpu', agent_name:str|None='lERMMeta', test_batch_size:int=1024, receive_batch_dim:bool=False, loss_function:Literal['quantile','pinball']='quantile')
Newsvendor agent implementing the Empirical Risk Minimization (ERM) approach based on a linear (regression) model. In addition to the features, the agent also receives the service level (sl) as input, so that it can forecast the optimal order quantity for different sl values. Depending on the training pipeline, this model can be adapted to become a full meta-learning algorithm across products and across service levels.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | | Parameters for lERM agent
dataloader | BaseDataLoader | |
cu | numpy.ndarray \| ddopai.utils.Parameter | |
co | numpy.ndarray \| ddopai.utils.Parameter | |
input_shape | Tuple | |
output_shape | Tuple | |
optimizer_params | dict \| None | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
learning_rate_scheduler_params | NoneType | None | TODO: add base class for learning rate scheduler for typing
model_params | dict \| None | None | default: {"relu_output": False}
dataset_params | dict \| None | None | parameters needed to convert the dataloader to a torch dataset
dataloader_params | dict \| None | None | default: {"batch_size": 32, "shuffle": True}
obsprocessors | list \| None | None | default: []
device | str | cpu | "cuda" or "cpu"
agent_name | str \| None | lERMMeta |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
loss_function | Literal | quantile |
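The idea of feeding sl to the model alongside the features can be sketched as below, assuming sl denotes the service level `cu / (cu + co)`. How ddopai actually injects this value is not shown on this page, so `add_sl_feature` is a purely hypothetical helper.

```python
def service_level(cu, co):
    """Target quantile level implied by the underage and overage costs."""
    return cu / (cu + co)


def add_sl_feature(features, cu, co):
    """Append the service level to a feature vector (hypothetical helper)."""
    return list(features) + [service_level(cu, co)]


features = [0.2, 0.7]
low_sl = add_sl_feature(features, cu=1.0, co=3.0)    # cheap shortages, costly leftovers
high_sl = add_sl_feature(features, cu=3.0, co=1.0)   # costly shortages, cheap leftovers
```

Because the service level is an input rather than a fixed training constant, a single fitted model can be queried at several cost ratios for the same features.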
NewsvendorDLMetaAgent
NewsvendorDLMetaAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, cu:numpy.ndarray|ddopai.utils.Parameter, co:numpy.ndarray|ddopai.utils.Parameter, input_shape:Tuple, output_shape:Tuple, learning_rate_scheduler_params=None, optimizer_params:dict|None=None, model_params:dict|None=None, dataset_params:dict|None=None, dataloader_params:dict|None=None, device:str='cpu', obsprocessors:list|None=None, agent_name:str|None='DLNV', test_batch_size:int=1024, receive_batch_dim:bool=False, loss_function:Literal['quantile','pinball']='quantile')
Newsvendor agent implementing the Empirical Risk Minimization (ERM) approach based on a neural network. In addition to the features, the agent also receives the service level (sl) as input, so that it can forecast the optimal order quantity for different sl values. Depending on the training pipeline, this model can be adapted to become a full meta-learning algorithm across products and across service levels.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
dataloader | BaseDataLoader | |
cu | numpy.ndarray \| ddopai.utils.Parameter | |
co | numpy.ndarray \| ddopai.utils.Parameter | |
input_shape | Tuple | |
output_shape | Tuple | |
learning_rate_scheduler_params | NoneType | None | TODO: add base class for learning rate scheduler for typing
optimizer_params | dict \| None | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
model_params | dict \| None | None | default: {"hidden_layers": [64, 64], "drop_prob": 0.0, "batch_norm": False, "relu_output": False}
dataset_params | dict \| None | None | parameters needed to convert the dataloader to a torch dataset
dataloader_params | dict \| None | None | default: {"batch_size": 32, "shuffle": True}
device | str | cpu | "cuda" or "cpu"
obsprocessors | list \| None | None | default: []
agent_name | str \| None | DLNV |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
loss_function | Literal | quantile |
NewsvendorDLTransformerAgent
NewsvendorDLTransformerAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, cu:numpy.ndarray|ddopai.utils.Parameter, co:numpy.ndarray|ddopai.utils.Parameter, input_shape:Tuple, output_shape:Tuple, learning_rate_scheduler_params:Optional[Dict]=None, optimizer_params:dict|None=None, model_params:dict|None=None, dataset_params:dict|None=None, dataloader_params:dict|None=None, device:str='cpu', obsprocessors:list|None=None, agent_name:str|None='DLNV', test_batch_size:int=1024, receive_batch_dim:bool=False, loss_function:Literal['quantile','pinball']='quantile')
Newsvendor agent implementing Empirical Risk Minimization (ERM) approach based on a deep learning model with a Transformer architecture.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
dataloader | BaseDataLoader | |
cu | numpy.ndarray \| ddopai.utils.Parameter | |
co | numpy.ndarray \| ddopai.utils.Parameter | |
input_shape | Tuple | |
output_shape | Tuple | |
learning_rate_scheduler_params | Optional | None |
optimizer_params | dict \| None | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
model_params | dict \| None | None | default: {"max_context_length": 128, "n_layer": 3, "n_head": 8, "n_embd_per_head": 32, "rope_scaling": None, "min_multiple": 256, "gating": True, "drop_prob": 0.0, "final_activation": "identity"}
dataset_params | dict \| None | None | parameters needed to convert the dataloader to a torch dataset
dataloader_params | dict \| None | None | default: {"batch_size": 32, "shuffle": True}
device | str | cpu | "cuda" or "cpu"
obsprocessors | list \| None | None | default: []
agent_name | str \| None | DLNV |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
loss_function | Literal | quantile |
NewsvendorDLTransformerMetaAgent
NewsvendorDLTransformerMetaAgent (environment_info:ddopai.utils.MDPInfo, dataloader:ddopai.dataloaders.base.BaseDataLoader, cu:numpy.ndarray|ddopai.utils.Parameter, co:numpy.ndarray|ddopai.utils.Parameter, input_shape:Tuple, output_shape:Tuple, learning_rate_scheduler_params:Optional[Dict]=None, optimizer_params:dict|None=None, model_params:dict|None=None, dataset_params:dict|None=None, dataloader_params:dict|None=None, device:str='cpu', obsprocessors:list|None=None, agent_name:str|None='DLNV', test_batch_size:int=1024, receive_batch_dim:bool=False, loss_function:Literal['quantile','pinball']='quantile')
Newsvendor agent implementing the Empirical Risk Minimization (ERM) approach based on a neural network using the attention mechanism. In addition to the features, the agent also receives the service level (sl) as input, so that it can forecast the optimal order quantity for different sl values. Depending on the training pipeline, this model can be adapted to become a full meta-learning algorithm across products and across service levels.
Parameter | Type | Default | Details
---|---|---|---
environment_info | MDPInfo | |
dataloader | BaseDataLoader | |
cu | numpy.ndarray \| ddopai.utils.Parameter | |
co | numpy.ndarray \| ddopai.utils.Parameter | |
input_shape | Tuple | |
output_shape | Tuple | |
learning_rate_scheduler_params | Optional | None |
optimizer_params | dict \| None | None | default: {"optimizer": "Adam", "lr": 0.01, "weight_decay": 0.0}
model_params | dict \| None | None | default: {"hidden_layers": [64, 64], "drop_prob": 0.0, "batch_norm": False, "relu_output": False}
dataset_params | dict \| None | None | parameters needed to convert the dataloader to a torch dataset
dataloader_params | dict \| None | None | default: {"batch_size": 32, "shuffle": True}
device | str | cpu | "cuda" or "cpu"
obsprocessors | list \| None | None | default: []
agent_name | str \| None | DLNV |
test_batch_size | int | 1024 |
receive_batch_dim | bool | False |
loss_function | Literal | quantile |