Preparing the Kaggle M5 Dataset for Meta-Learning

Preprocessing steps that prepare the Kaggle M5 dataset for meta-learning.

KaggleM5DatasetLoader

 KaggleM5DatasetLoader (data_path, overwrite=False,
                        product_as_feature=False)

Downloads the Kaggle M5 dataset and applies preprocessing steps that prepare it for inventory-management applications.

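run_test = False  # set to True to run the loader example
if run_test:
    # Local directory for the raw Kaggle M5 files; adjust to your environment.
    data_path = "/Users/magnus/Documents/02_PhD/03_Newsvendor_foundation_model/experiments/datasets/raw/kaggle_m5"
    loader = KaggleM5DatasetLoader(data_path, overwrite=False, product_as_feature=False)
    # load_dataset() returns five objects: demand, three feature tables, and a mask.
    demand, SKU_features, time_features, time_SKU_features, mask = loader.load_dataset()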
INFO:root:Using existing data from disk
INFO:root:Importing data
INFO:root:Preprocessing data
INFO:root:--Creating catogory mapping and features
INFO:root:--Preparing sales time series data
INFO:root:--Preparing calendric information
INFO:root:--Preparing snap features
INFO:root:--Preparing price information
INFO:root:--Creating indicator table if products are available for purchase
INFO:root:--Preparing final outputs and ensure consistency of time and feature dimensions
      HOBBIES_1_001_CA_1  HOBBIES_1_002_CA_1  HOBBIES_1_003_CA_1  \
0                      0                   0                   0   
1                      0                   0                   0   
2                      0                   0                   0   
3                      0                   0                   0   
4                      0                   0                   0   
...                  ...                 ...                 ...   
1936                   0                   0                   0   
1937                   3                   0                   2   
1938                   3                   0                   3   
1939                   0                   0                   0   
1940                   1                   0                   1   

      HOBBIES_1_004_CA_1  HOBBIES_1_005_CA_1  HOBBIES_1_006_CA_1  \
0                      0                   0                   0   
1                      0                   0                   0   
2                      0                   0                   0   
3                      0                   0                   0   
4                      0                   0                   0   
...                  ...                 ...                 ...   
1936                   1                   0                   0   
1937                   3                   0                   0   
1938                   0                   2                   5   
1939                   2                   1                   2   
1940                   6                   0                   0   

      HOBBIES_1_007_CA_1  HOBBIES_1_008_CA_1  HOBBIES_1_009_CA_1  \
0                      0                  12                   2   
1                      0                  15                   0   
2                      0                   0                   7   
3                      0                   0                   3   
4                      0                   0                   0   
...                  ...                 ...                 ...   
1936                   1                   5                   0   
1937                   0                   4                   0   
1938                   1                   1                   0   
1939                   1                  40                   1   
1940                   0                  32                   0   

      HOBBIES_1_010_CA_1  ...  FOODS_3_818_WI_3  FOODS_3_819_WI_3  \
0                      0  ...                 0                14   
1                      0  ...                 0                11   
2                      1  ...                 0                 5   
3                      0  ...                 0                 6   
4                      0  ...                 0                 5   
...                  ...  ...               ...               ...   
1936                   1  ...                 3                 6   
1937                   1  ...                 1                 4   
1938                   0  ...                 3                 4   
1939                   0  ...                 0                 1   
1940                   1  ...                 0                 1   

      FOODS_3_820_WI_3  FOODS_3_821_WI_3  FOODS_3_822_WI_3  FOODS_3_823_WI_3  \
0                    1                 0                 4                 0   
1                    1                 0                 4                 0   
2                    1                 0                 2                 2   
3                    1                 0                 5                 2   
4                    1                 0                 2                 0   
...                ...               ...               ...               ...   
1936                 3                 0                 0                 1   
1937                 3                 1                 2                 0   
1938                 3                 1                 1                 0   
1939                 0                 0                 3                 1   
1940                 1                 4                 4                 1   

      FOODS_3_824_WI_3  FOODS_3_825_WI_3  FOODS_3_826_WI_3  FOODS_3_827_WI_3  
0                    0                 0                 0                 0  
1                    0                 6                 0                 0  
2                    0                 0                 0                 0  
3                    0                 2                 0                 0  
4                    0                 2                 0                 0  
...                ...               ...               ...               ...  
1936                 0                 1                 0                 0  
1937                 1                 0                 1                 2  
1938                 0                 1                 1                 2  
1939                 1                 0                 1                 5  
1940                 0                 2                 0                 1  

[1941 rows x 30490 columns]
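
The demand table above has one row per day (1941) and one column per item-store combination (30490), with column labels such as `HOBBIES_1_001_CA_1`. As a minimal sketch of how such labels could be broken down, the snippet below splits a label into category, department, item, state, and store, then aggregates a small synthetic table by store. The helper `split_m5_column` and the toy values are hypothetical and are not part of `KaggleM5DatasetLoader`; the sketch assumes every label follows the five-part pattern visible in the output.

```python
import pandas as pd


def split_m5_column(col: str) -> dict:
    """Split a label like 'HOBBIES_1_001_CA_1' into its five components.

    Hypothetical helper for illustration; assumes the five-part pattern
    <category>_<dept no>_<item no>_<state>_<store no>.
    """
    cat, dept_no, item_no, state, store_no = col.split("_")
    return {
        "cat_id": cat,
        "dept_id": f"{cat}_{dept_no}",
        "item_id": f"{cat}_{dept_no}_{item_no}",
        "state_id": state,
        "store_id": f"{state}_{store_no}",
    }


# Tiny synthetic stand-in for the demand table (values are made up).
demand = pd.DataFrame(
    {
        "HOBBIES_1_001_CA_1": [0, 3, 1],
        "HOBBIES_1_008_CA_1": [12, 4, 32],
        "FOODS_3_819_WI_3": [14, 4, 1],
    }
)

# One metadata row per demand column, indexed by the column label.
meta = pd.DataFrame(
    [split_m5_column(c) for c in demand.columns], index=demand.columns
)

# Daily demand aggregated by store: transpose so the labels become the index,
# group by the store of each label, sum, and transpose back to days x stores.
per_store = demand.T.groupby(meta["store_id"]).sum().T
print(per_store)
```

The same grouping works with `meta["cat_id"]` or `meta["state_id"]` to aggregate at other levels of the hierarchy.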