Base dataloader

Base class for dataloaders



 BaseDataLoader ()

Base class for data loaders. A data loader provides all external information to the environment (lagged data, demand, etc.). Internal data that depends on past decisions (such as inventory levels) must be added from within the environment.

Train-Val-Test split:

  • The dataloader contains all data, including the training, validation and test sets.

  • To retrieve a particular split, set the internal state to train, validation or test using the train, val and test functions. The index is then automatically mapped onto the corresponding split (see the notes on data retrieval below).

  • During training, both the agent and the experiment function may need to know the length of each dataset. Therefore, len_train, len_val and len_test must be defined as properties (using the @property decorator).

Data retrieval:

  • Data retrieval is done with the __getitem__ function. The function takes an index and returns the data at that index, typically as an X and Y pair.

  • For non-distribution-based dataloaders, the __init__ function must take the arguments val_index_start and test_index_start, from which the attributes val_index_start, test_index_start and train_index_end are set. The __getitem__ function must then check the index and return the correct data based on the internal state of the dataloader.
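The index handling described above can be sketched roughly as follows. This is a standalone illustration that does not inherit from the real BaseDataLoader; the class name SplitLoaderSketch, the dataset_type attribute, and the choice train_index_end = val_index_start - 1 are assumptions made for the example, not the library's confirmed implementation.

```python
import numpy as np


class SplitLoaderSketch:
    """Hypothetical loader that holds all data and serves one split at a time."""

    def __init__(self, X, Y, val_index_start, test_index_start):
        self.X, self.Y = np.asarray(X), np.asarray(Y)
        self.val_index_start = val_index_start
        self.test_index_start = test_index_start
        # Assumption: training data ends right before validation data starts.
        self.train_index_end = val_index_start - 1
        self.dataset_type = "train"

    def train(self):
        self.dataset_type = "train"

    def val(self):
        self.dataset_type = "val"

    def test(self):
        self.dataset_type = "test"

    def __getitem__(self, idx):
        # (first position, one-past-last position) of each split in the full arrays.
        bounds = {
            "train": (0, self.train_index_end + 1),
            "val": (self.val_index_start, self.test_index_start),
            "test": (self.test_index_start, len(self.Y)),
        }
        start, stop = bounds[self.dataset_type]
        pos = start + idx  # shift the split-relative index into the full arrays
        if idx < 0 or pos >= stop:
            raise IndexError(f"index {idx} is out of range for '{self.dataset_type}'")
        return self.X[pos], self.Y[pos]
```

With ten samples, val_index_start=6 and test_index_start=8, calling val() and then requesting index 0 would return the sample stored at position 6 of the full arrays.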



 BaseDataLoader.__len__ ()

Returns the length of the dataset. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the dataset.
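A minimal sketch of the two cases, using two hypothetical standalone classes. The exception type (NotImplementedError) is an assumption; the text above only requires that an error is raised.

```python
import numpy as np


class NormalDemandSketch:
    """Hypothetical distribution-based loader: demand is sampled, not stored."""

    def __init__(self, mean=10.0, std=2.0, seed=0):
        self.mean, self.std = mean, std
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # There is no finite dataset, so the length is undefined.
        raise NotImplementedError("Length is not defined for distribution-based loaders.")


class ArrayDemandSketch:
    """Hypothetical dataset-based loader: the length is the number of stored samples."""

    def __init__(self, Y):
        self.Y = np.asarray(Y)

    def __len__(self):
        return self.Y.shape[0]
```

Because len() simply propagates whatever __len__ raises, callers that may receive either kind of loader would need to catch the error or check the loader type first.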



 BaseDataLoader.__getitem__ (idx)

Always returns a tuple of X and Y data. If no X data is available, None is returned in place of X.
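As a rough illustration of this return convention, the hypothetical class below (TargetsOnlySketch, not part of the library) stores targets and, optionally, features, and always hands back a two-element tuple.

```python
import numpy as np


class TargetsOnlySketch:
    """Hypothetical loader in which feature data (X) is optional."""

    def __init__(self, Y, X=None):
        self.Y = np.asarray(Y)
        self.X = None if X is None else np.asarray(X)

    def __getitem__(self, idx):
        # Keep the (X, Y) tuple shape even when there are no features.
        x = None if self.X is None else self.X[idx]
        return x, self.Y[idx]
```

Callers can then always unpack the result as `x, y = loader[idx]` and test `x is None` when features are optional.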



 BaseDataLoader.X_shape ()

Returns the shape of the X data. It should follow the format (n_samples, n_features). If the data has a time dimension with a fixed length, the shape should be (n_samples, n_time_steps, n_features). If the data is generated from a distribution, n_samples should be set to 1.



 BaseDataLoader.Y_shape ()

Returns the shape of the Y data. It should follow the format (n_samples, n_SKUs). If the variable of interest is only a single SKU, the shape should be (n_samples, 1). If the data is generated from a distribution, n_samples should be set to 1.
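The shape conventions for X and Y can be illustrated with plain NumPy arrays. The sizes below (100 samples, 7 time steps, 3 features, 2 SKUs) are made up for the example.

```python
import numpy as np

n_samples, n_time_steps, n_features, n_skus = 100, 7, 3, 2

# Features without a time dimension: (n_samples, n_features)
X_flat = np.zeros((n_samples, n_features))
# Features with a fixed-length time dimension: (n_samples, n_time_steps, n_features)
X_lagged = np.zeros((n_samples, n_time_steps, n_features))
# Targets for several SKUs: (n_samples, n_SKUs)
Y_multi = np.zeros((n_samples, n_skus))
# Targets for a single SKU keep the second axis: (n_samples, 1)
Y_single = np.zeros((n_samples, 1))
# Distribution-based data reports n_samples = 1
Y_distribution_shape = (1, n_skus)

print(X_flat.shape)    # (100, 3)
print(X_lagged.shape)  # (100, 7, 3)
print(Y_multi.shape)   # (100, 2)
print(Y_single.shape)  # (100, 1)
```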



 BaseDataLoader.get_all_X (dataset_type:str='train')

Returns the entire feature dataset for the requested split: train, val, test, or all data. If no X data is available, returns None.

Parameter     Type  Default  Details
dataset_type  str   train    can be ‘train’, ‘val’, ‘test’ or ‘all’



 BaseDataLoader.get_all_Y (dataset_type:str='train')

Returns the entire target dataset for the requested split: train, val, test, or all data. If no Y data is available, returns None.

Parameter     Type  Default  Details
dataset_type  str   train    can be ‘train’, ‘val’, ‘test’ or ‘all’
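One possible way to implement the dataset_type selection shared by get_all_X and get_all_Y is a small slicing helper. The function select_split below is hypothetical and assumes the data is stored as one array ordered train, then val, then test.

```python
import numpy as np


def select_split(data, dataset_type, val_index_start, test_index_start):
    """Return the slice of `data` that belongs to `dataset_type` (hypothetical helper)."""
    if data is None:
        # No data of this kind (for example, no features) is available.
        return None
    if dataset_type == "train":
        return data[:val_index_start]
    if dataset_type == "val":
        return data[val_index_start:test_index_start]
    if dataset_type == "test":
        return data[test_index_start:]
    if dataset_type == "all":
        return data
    raise ValueError("dataset_type must be 'train', 'val', 'test' or 'all'")
```

A loader could then implement both methods by calling this helper on its stored X or Y array.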



 BaseDataLoader.len_train ()

Returns the length of the training set. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the training set.



 BaseDataLoader.len_val ()

Returns the length of the validation set. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the validation set.

If no validation set is defined, raise an error.



 BaseDataLoader.len_test ()

Returns the length of the test set. For dataloaders based on distributions, this should raise an error stating that the length is not defined; otherwise it should return the number of samples in the test set.

If no test set is defined, raise an error.
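The three length properties might look roughly like this for a dataset-based loader. The class LengthsSketch is hypothetical, and both the use of None to mean "no such split" and ValueError as the error type are assumptions for illustration.

```python
class LengthsSketch:
    """Hypothetical holder of split boundaries for a dataset of n_samples rows."""

    def __init__(self, n_samples, val_index_start=None, test_index_start=None):
        self.n_samples = n_samples
        self.val_index_start = val_index_start
        self.test_index_start = test_index_start

    @property
    def len_train(self):
        # Training data ends where the first later split begins.
        for end in (self.val_index_start, self.test_index_start):
            if end is not None:
                return end
        return self.n_samples

    @property
    def len_val(self):
        if self.val_index_start is None:
            raise ValueError("No validation set defined.")
        end = self.n_samples if self.test_index_start is None else self.test_index_start
        return end - self.val_index_start

    @property
    def len_test(self):
        if self.test_index_start is None:
            raise ValueError("No test set defined.")
        return self.n_samples - self.test_index_start
```

Because these are properties, they are read without parentheses, for example `loader.len_train`.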



 BaseDataLoader.train ()

Sets the internal state of the dataloader to train.



 BaseDataLoader.val ()

Sets the internal state of the dataloader to validation.



 BaseDataLoader.test ()

Sets the internal state of the dataloader to test.



 DummyDataLoader ()

Dummy class for data loaders that can be used for environments that do not require any data.