ddop.datasets.load_SID
- ddop.datasets.load_SID(include_date=False, one_hot_encoding=False, label_encoding=False, return_X_y=False)
Load and return the store item demand dataset.
Dataset Characteristics:
- Number of Instances
887284
- Number of Targets
1
- Number of Features
6
- Target Information
‘demand’ the corresponding demand observation
- Feature Information
‘date’ the date,
‘weekday’ the day of the week,
‘month’ the month of the year,
‘year’ the year,
‘store’ the store id,
‘item’ the item id
- Parameters
include_date (bool, default=False) – Whether to include the demand date
one_hot_encoding (bool, default=False) – Whether to one hot encode categorical features
label_encoding (bool, default=False) – Whether to convert the categorical columns (weekday, month, year) to continuous values. Applied only if one_hot_encoding=False.
return_X_y (bool, default=False) – If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target objects.
- Returns
data (sklearn Bunch) – Dictionary-like object, with the following attributes.
- data: Pandas DataFrame of shape (887284, n_features)
The data matrix.
- target: Pandas DataFrame of shape (887284, n_targets)
The target values.
- n_features: int
The number of features included
- n_targets: int
The number of target variables included
- DESCR: str
The full description of the dataset.
- data_filename: str
The path to the location of the data.
- target_filename: str
The path to the location of the target.
(data, target) (tuple if return_X_y is True)
Notes
The store item demand dataset was published as part of a demand forecasting challenge on Kaggle [1].
References
Examples
>>> from ddop.datasets import load_SID
>>> X, y = load_SID(return_X_y=True)
>>> print(X.shape)
(887284, 5)
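The difference between the one_hot_encoding and label_encoding options can be illustrated on a toy column. The following is a minimal, dependency-free sketch of the two general techniques, not ddop's implementation: the weekday values and the helper names label_encode and one_hot_encode are hypothetical, and the exact codes or column names that load_SID produces may differ.

```python
# Toy illustration of the two encodings; NOT ddop's actual implementation.
# The raw weekday values below are hypothetical.
weekdays = ["MON", "TUE", "MON", "SUN"]


def label_encode(values):
    """Map each distinct category to an integer code (alphabetical order)."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def one_hot_encode(values):
    """Return one 0/1 indicator column per distinct category."""
    categories = sorted(set(values))
    return {cat: [int(v == cat) for v in values] for cat in categories}


labels = label_encode(weekdays)      # a single numeric column
one_hot = one_hot_encode(weekdays)   # one indicator column per category

print(labels)
print(one_hot)
```

Label encoding keeps a single column per feature, whereas one-hot encoding adds one column per category, so the number of columns in the returned data would be expected to grow when one_hot_encoding=True.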