Base dataloader
BaseDataLoader
BaseDataLoader ()
Base class for data loaders. A data loader provides all external information to the environment (including lagged data, demand, etc.). Internal data influenced by past decisions (such as inventory levels) is added from within the environment.
Train-Val-Test split:
The dataloader contains all data, including the training, validation and test sets.
To retrieve a specific dataset, set the internal state to train, validation or test using the corresponding functions (train, val and test). The index is then automatically mapped to the correct dataset (see below on data retrieval).
During training, both the agent and the experiment function may need to know the length of the dataset. Therefore, the functions len_train, len_val and len_test must be defined with the @property decorator.
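As an illustration of the length properties described above, here is a minimal sketch. The class name SplitLengths is hypothetical and does not come from the library, and deriving the lengths from val_index_start and test_index_start is an assumption about how a subclass might implement them.

```python
import numpy as np


class SplitLengths:
    """Hypothetical stand-in, not the library's BaseDataLoader."""

    def __init__(self, Y, val_index_start, test_index_start):
        self.Y = np.asarray(Y)
        self.val_index_start = val_index_start
        self.test_index_start = test_index_start

    @property
    def len_train(self):
        # Assumption: training samples occupy indices [0, val_index_start).
        return self.val_index_start

    @property
    def len_val(self):
        # Validation samples occupy [val_index_start, test_index_start).
        return self.test_index_start - self.val_index_start

    @property
    def len_test(self):
        # Test samples run from test_index_start to the end of the data.
        return len(self.Y) - self.test_index_start


loader = SplitLengths(np.arange(10), val_index_start=6, test_index_start=8)
print(loader.len_train, loader.len_val, loader.len_test)  # 6 2 2
```

Because the lengths are properties, callers read them as attributes (loader.len_train) rather than calling them.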
Data retrieval:
Data retrieval is done with the __getitem__ function. The function takes an index and returns the data at that index, typically as an X and Y pair.
For non-distribution-based dataloaders, the __init__ function must have the arguments val_index_start and test_index_start, from which the attributes val_index_start, test_index_start and train_index_end are set. The __getitem__ function must then check the index and return the correct data based on the internal state of the dataloader.
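The retrieval logic described above could look roughly like the following sketch. The class IndexedLoader and the dataset_type attribute are hypothetical names chosen for illustration, and setting train_index_end to val_index_start - 1 is an assumption rather than something this page specifies.

```python
import numpy as np


class IndexedLoader:
    """Hypothetical sketch, not the library's implementation."""

    def __init__(self, X, Y, val_index_start, test_index_start):
        self.X = np.asarray(X)
        self.Y = np.asarray(Y)
        self.val_index_start = val_index_start
        self.test_index_start = test_index_start
        # Assumption: the last training index sits right before the validation set.
        self.train_index_end = val_index_start - 1
        self.dataset_type = "train"

    def train(self):
        self.dataset_type = "train"

    def val(self):
        self.dataset_type = "val"

    def test(self):
        self.dataset_type = "test"

    def __getitem__(self, idx):
        # Map the split-relative index to a position in the full arrays.
        if self.dataset_type == "train":
            start, stop = 0, self.train_index_end + 1
        elif self.dataset_type == "val":
            start, stop = self.val_index_start, self.test_index_start
        else:
            start, stop = self.test_index_start, len(self.Y)
        if idx < 0 or start + idx >= stop:
            raise IndexError(f"index {idx} out of range for {self.dataset_type} set")
        return self.X[start + idx], self.Y[start + idx]


X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features
Y = np.arange(10).reshape(10, 1)   # 10 samples, 1 SKU
loader = IndexedLoader(X, Y, val_index_start=6, test_index_start=8)

x_train, y_train = loader[0]       # row 0 of the full data
loader.val()
x_val, y_val = loader[0]           # row 6 of the full data
loader.test()
x_test, y_test = loader[1]         # row 9 of the full data
```

Index 0 always refers to the first sample of the currently active split, so calling code does not need to know where each split starts.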
BaseDataLoader.__len__
BaseDataLoader.__len__ ()
Returns the length of the dataset. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the dataset.
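For the distribution-based case, a minimal sketch might look as follows. NormalDemandLoader and its parameters are hypothetical. Raising TypeError is an assumed choice, since this page does not say which exception type to use.

```python
import numpy as np


class NormalDemandLoader:
    """Hypothetical distribution-based loader; not part of the documented API."""

    def __init__(self, mean=10.0, std=2.0, seed=0):
        self.mean = mean
        self.std = std
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # No finite dataset exists, so a length cannot be reported.
        raise TypeError("length is not defined for a distribution-based dataloader")

    def __getitem__(self, idx):
        # The index is ignored: each call draws a fresh demand sample.
        y = self.rng.normal(self.mean, self.std, size=(1,))
        return None, y


loader = NormalDemandLoader()
try:
    len(loader)
    length_defined = True
except TypeError:
    length_defined = False
```

Code that loops over a dataloader can catch the error to distinguish sampled data from a fixed dataset.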
BaseDataLoader.__getitem__
BaseDataLoader.__getitem__ (idx)
Always returns a tuple of X and Y data. If no X data is available, return None for X.
BaseDataLoader.X_shape
BaseDataLoader.X_shape ()
Returns the shape of the X data. It should follow the format (n_samples, n_features). If the data has a time dimension with a fixed length, the shape should be (n_samples, n_time_steps, n_features). If the data is generated from a distribution, n_samples should be set to 1.
BaseDataLoader.Y_shape
BaseDataLoader.Y_shape ()
Returns the shape of the Y data. It should follow the format (n_samples, n_SKUs). If the variable of interest is only a single SKU, the shape should be (n_samples, 1). If the data is generated from a distribution, n_samples should be set to 1.
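The shape conventions for X and Y can be illustrated with a small sketch. ShapeDemo is a hypothetical class. Exposing X_shape and Y_shape as properties and reshaping a 1-D target into a column are assumptions made for this example.

```python
import numpy as np


class ShapeDemo:
    """Hypothetical holder that reports shapes in the documented format."""

    def __init__(self, X, Y):
        self.X = np.asarray(X)
        Y = np.asarray(Y)
        # A single SKU given as a 1-D array becomes a column: (n_samples, 1).
        self.Y = Y.reshape(-1, 1) if Y.ndim == 1 else Y

    @property
    def X_shape(self):
        return self.X.shape

    @property
    def Y_shape(self):
        return self.Y.shape


# Tabular features: (n_samples, n_features), single SKU target.
tabular = ShapeDemo(np.zeros((100, 3)), np.zeros(100))
# Fixed-length time windows: (n_samples, n_time_steps, n_features), 4 SKUs.
windowed = ShapeDemo(np.zeros((100, 5, 3)), np.zeros((100, 4)))

print(tabular.X_shape, tabular.Y_shape)    # (100, 3) (100, 1)
print(windowed.X_shape, windowed.Y_shape)  # (100, 5, 3) (100, 4)
```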
BaseDataLoader.get_all_X
BaseDataLoader.get_all_X (dataset_type:str='train')
Returns the entire feature dataset for the requested split: train, val, test, or all data. If no X data is available, return None.
| | Type | Default | Details |
|---|---|---|---|
| dataset_type | str | train | can be ‘train’, ‘val’, ‘test’, ‘all’ |
BaseDataLoader.get_all_Y
BaseDataLoader.get_all_Y (dataset_type:str='train')
Returns the entire target dataset for the requested split: train, val, test, or all data. If no Y data is available, return None.
| | Type | Default | Details |
|---|---|---|---|
| dataset_type | str | train | can be ‘train’, ‘val’, ‘test’, ‘all’ |
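One way get_all_X and get_all_Y might select data by dataset_type is sketched below. The helper select_split is hypothetical, and slicing by val_index_start and test_index_start is an assumption based on the attributes described for non-distribution-based dataloaders.

```python
import numpy as np


def select_split(data, val_index_start, test_index_start, dataset_type="train"):
    """Hypothetical helper returning one split of an array, or None if no data exists."""
    if data is None:
        return None
    if dataset_type == "train":
        return data[:val_index_start]
    if dataset_type == "val":
        return data[val_index_start:test_index_start]
    if dataset_type == "test":
        return data[test_index_start:]
    if dataset_type == "all":
        return data
    raise ValueError("dataset_type must be 'train', 'val', 'test' or 'all'")


Y = np.arange(10).reshape(10, 1)
train_Y = select_split(Y, 6, 8, "train")   # rows 0 to 5
val_Y = select_split(Y, 6, 8, "val")       # rows 6 and 7
test_Y = select_split(Y, 6, 8, "test")     # rows 8 and 9
no_X = select_split(None, 6, 8, "all")     # no feature data available
```

Rejecting unknown values of dataset_type early avoids silently returning the wrong split.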
BaseDataLoader.len_train
BaseDataLoader.len_train ()
Returns the length of the training set. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the training set.
BaseDataLoader.len_val
BaseDataLoader.len_val ()
Returns the length of the validation set. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the validation set.
If no validation set is defined, raise an error.
BaseDataLoader.len_test
BaseDataLoader.len_test ()
Returns the length of the test set. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the test set.
If no test set is defined, raise an error.
BaseDataLoader.train
BaseDataLoader.train ()
Set the internal state of the dataloader to train
BaseDataLoader.val
BaseDataLoader.val ()
Set the internal state of the dataloader to validation
BaseDataLoader.test
BaseDataLoader.test ()
Set the internal state of the dataloader to test
DummyDataLoader
DummyDataLoader ()
Dummy class for data loaders that can be used for environments that do not require any data.