Class implementing the Newsvendor problem for both the single-item and the multi-item case. If underage_cost and overage_cost are scalars and there are multiple SKUs, the same cost is used for all SKUs. If underage_cost and overage_cost are arrays, they must have the same length as the number of SKUs. `num_SKUs` can be set as a parameter or inferred from the DataLoader.

| | Type | Default | Details |
|---|---|---|---|
| underage_cost | Union | 1 | underage cost per unit |
| overage_cost | Union | 1 | overage cost per unit |
| q_bound_low | Union | 0 | lower bound of the order quantity |
| q_bound_high | Union | inf | upper bound of the order quantity |
| dataloader | BaseDataLoader | None | dataloader |
| num_SKUs | int | None | if None it will be inferred from the DataLoader |
| gamma | float | 1 | discount factor |
| horizon_train | int \| str | use_all_data | if "use_all_data" then horizon is inferred from the DataLoader |
| postprocessors | list[object] \| None | None | default is empty list |
| mode | str | train | Initial mode (train, val, test) of the environment |
| return_truncation | str | True | whether to return a truncated condition in step function |

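
The scalar-versus-array rule for `underage_cost` and `overage_cost` can be illustrated with a small NumPy sketch. The helper `expand_cost` is hypothetical and not part of ddopai; it only shows one plausible way to turn a scalar into a per-SKU array and to reject arrays of the wrong length.

```python
import numpy as np


def expand_cost(cost, num_skus):
    """Return a float array with one cost per SKU (hypothetical helper)."""
    cost = np.asarray(cost, dtype=float)
    if cost.ndim == 0:
        # scalar input: reuse the same cost for every SKU
        return np.full(num_skus, float(cost))
    if cost.shape != (num_skus,):
        # array input: its length must match the number of SKUs
        raise ValueError(f"expected {num_skus} costs, got shape {cost.shape}")
    return cost


underage = expand_cost(1, 3)                # same cost for all three SKUs
overage = expand_cost([0.5, 0.25, 1.0], 3)  # one cost per SKU
```
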
Step function implementing the Newsvendor logic. Note that the dataloader will return an observation and a demand, which will be relevant in the next period. The observation will be returned directly, while the demand will be temporarily stored under self.demand and used in the next step.
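
The interplay between the returned observation and the stored demand can be sketched with a toy environment. `ToyNewsvendor` below is an illustrative stand-in under simplifying assumptions (no order-quantity bounds, no modes, no discounting), not the ddopai implementation; it only shows how the demand kept in `self.demand` is used to cost the order placed in the following step.

```python
import numpy as np


class ToyNewsvendor:
    """Toy stand-in for the environment; not the ddopai implementation."""

    def __init__(self, features, demands, underage_cost=1.0, overage_cost=1.0):
        self.features = np.asarray(features, dtype=float)
        self.demands = np.asarray(demands, dtype=float)
        self.cu = underage_cost
        self.co = overage_cost

    def reset(self):
        self.t = 0
        # store the demand that belongs to the first observation for the next step
        self.demand = self.demands[self.t]
        return self.features[self.t]

    def step(self, order_quantity):
        q = np.asarray(order_quantity, dtype=float)
        # the order is evaluated against the demand stored in the previous call
        shortage = np.maximum(self.demand - q, 0.0)
        leftover = np.maximum(q - self.demand, 0.0)
        reward = -float(np.sum(self.cu * shortage + self.co * leftover))
        self.t += 1
        truncated = self.t >= len(self.demands)
        if truncated:
            return None, reward, truncated
        # return the next observation and store its demand for the next step
        self.demand = self.demands[self.t]
        return self.features[self.t], reward, truncated


env = ToyNewsvendor([[0.1], [0.2]], [[5.0], [3.0]], underage_cost=2.0, overage_cost=1.0)
first_obs = env.reset()
obs1, reward1, truncated1 = env.step([4.0])  # one unit short at underage cost 2
obs2, reward2, truncated2 = env.step([4.0])  # one unit left over at overage cost 1
```
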
Example usage of [`NewsvendorEnv`](https://opimwue.github.io/ddopai/20_environments/21_envs_inventory/single_period_envs.html#newsvendorenv) with a distributional dataloader:
Example usage of [`NewsvendorEnv`](https://opimwue.github.io/ddopai/20_environments/21_envs_inventory/single_period_envs.html#newsvendorenv) using a fixed dataset:
```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.preprocessing import MinMaxScaler
from ddopai.dataloaders.tabular import XYDataLoader

# NewsvendorEnv and run_test_loop are assumed to be defined earlier in the session.

# create a simple dataset bounded between 0 and 1.
# We just scale all the data, pretending that it is the demand.
# When using real data, one should only fit the scaler on the training data
X, Y = make_regression(n_samples=8, n_features=2, n_targets=2, noise=0.1, random_state=42)

scaler = MinMaxScaler()
X = scaler.fit_transform(X)
Y = scaler.fit_transform(Y)

dataloader = XYDataLoader(X, Y, val_index_start=4, test_index_start=6)

test_env = NewsvendorEnv(
    underage_cost=np.array([1, 1]),
    overage_cost=np.array([0.5, 0.5]),
    dataloader=dataloader,
    horizon_train="use_all_data",
)

obs = test_env.reset(start_index=0)
print("#################### RESET ####################")

print("#################### RUN IN TRAIN MODE ####################")
run_test_loop(test_env)

print("#################### RUN IN VAL MODE ####################")
test_env.val()
run_test_loop(test_env)

print("#################### RUN IN TEST MODE ####################")
test_env.test()
run_test_loop(test_env)

print("#################### RUN IN TRAIN MODE AGAIN ####################")
test_env.train()
run_test_loop(test_env)
```
Newsvendor Env that can provide a variable service level
Static inventory environment where a decision only affects the next period (Newsvendor problem), but with a variable service level: random during training, fixed during validation and testing.
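
The idea of a service level that is random during training and fixed otherwise can be sketched as follows. The function `draw_service_level`, the `"uniform"` option, and the fallback value 0.5 are assumptions made for illustration; the source only confirms the `"fixed"` option and the bounds `sl_bound_low` and `sl_bound_high`.

```python
import random


def draw_service_level(mode, sl_bound_low=0.1, sl_bound_high=0.9,
                       sl_distribution="uniform", sl_test_val=0.5, rng=None):
    """Pick the service level for one episode (hypothetical helper)."""
    rng = rng or random.Random()
    if mode != "train" or sl_distribution == "fixed":
        # validation, testing, or a fixed distribution: always use sl_test_val
        return sl_test_val
    if sl_distribution == "uniform":
        # training: sample between the two bounds (assumed option name)
        return rng.uniform(sl_bound_low, sl_bound_high)
    raise ValueError(f"unknown sl_distribution: {sl_distribution}")


rng = random.Random(0)
train_sl = draw_service_level("train", rng=rng)
test_sl = draw_service_level("test", sl_test_val=0.8, rng=rng)
fixed_sl = draw_service_level("train", sl_distribution="fixed", sl_test_val=0.7, rng=rng)
```
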
Class implementing the Newsvendor problem for both the single-item and the multi-item case. If underage_cost and overage_cost are scalars and there are multiple SKUs, the same cost is used for all SKUs. If underage_cost and overage_cost are arrays, they must have the same length as the number of SKUs. `num_SKUs` can be set as a parameter or inferred from the DataLoader.

| | Type | Default | Details |
|---|---|---|---|
| sl_bound_low | Union | 0.1 | lower bound of the service level during training |
| sl_bound_high | Union | 0.9 | upper bound of the service level during training |
| sl_distribution | Literal | fixed | distribution of the random service level during training, if fixed then the service level is fixed to sl_test_val |
| evaluation_metric | Literal | quantile_loss | quantile loss is the generic quantile loss (independent of cost levels) while pinball loss uses the specific under- and overage costs |
| sl_test_val | Union | None | service level during test and validation, alternatively use cu and co |
| underage_cost | Union | 1 | underage cost per unit |
| overage_cost | Union | 1 | overage cost per unit |
| q_bound_low | Union | 0 | lower bound of the order quantity |
| q_bound_high | Union | inf | upper bound of the order quantity |
| dataloader | BaseDataLoader | None | dataloader |
| num_SKUs | int | None | if None it will be inferred from the DataLoader |
| gamma | float | 1 | discount factor |
| horizon_train | int \| str | use_all_data | if "use_all_data" then horizon is inferred from the DataLoader |
| postprocessors | list[object] \| None | None | default is empty list |
| mode | str | train | Initial mode (train, val, test) of the environment |
| return_truncation | str | True | whether to return a truncated condition in step function |
| SKUs_in_batch_dimension | bool | True | whether SKUs in the observation space are in the batch dimension (used for meta-learning) |

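
The relationship between a service level and the underage and overage costs, and the difference between the two evaluation metrics, can be sketched as below. The function names are hypothetical, and the formulas are the textbook critical ratio and quantile loss; the exact definitions used inside the library may differ (for example in scaling).

```python
def service_level_from_costs(cu, co):
    """Critical ratio of the Newsvendor problem: cu / (cu + co)."""
    return cu / (cu + co)


def quantile_loss(demand, order, sl):
    """Generic quantile loss at service level sl, independent of cost levels."""
    return sl * max(demand - order, 0.0) + (1.0 - sl) * max(order - demand, 0.0)


def cost_weighted_pinball_loss(demand, order, cu, co):
    """Loss that weights shortages and leftovers with the actual unit costs."""
    return cu * max(demand - order, 0.0) + co * max(order - demand, 0.0)


cu, co = 3.0, 1.0
sl = service_level_from_costs(cu, co)                 # 0.75
ql = quantile_loss(10.0, 8.0, sl)                     # 0.75 * 2 units short
pl = cost_weighted_pinball_loss(10.0, 8.0, cu, co)    # 3.0 * 2 units short
```

With these definitions the cost-weighted loss equals `(cu + co)` times the quantile loss at the matching service level, so both metrics rank order quantities identically for a given cost pair and differ only in scale.
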
Set the observation space of the environment. This is a standard function for simple observation spaces. For more complex observation spaces, this function should be overwritten. Note that the first dimension is assumed to be n_samples, which is not relevant for the observation space.

| | Type | Default | Details |
|---|---|---|---|
| shape | tuple | | shape of the dataloader features |
| low | Union | -inf | lower bound of the observation space |
| high | Union | inf | upper bound of the observation space |
| samples_dim_included | bool | True | whether the first dimension of the shape input is the number of samples |

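
How the `shape` and `samples_dim_included` arguments could determine the per-observation shape and bounds is sketched below. `observation_bounds` is a hypothetical helper, not the library's function; the resulting arrays are of the kind one might pass to a box-type space such as `gymnasium.spaces.Box`, which is an assumption about the underlying framework.

```python
import numpy as np


def observation_bounds(shape, low=-np.inf, high=np.inf, samples_dim_included=True):
    """Return (obs_shape, low_array, high_array) for one observation."""
    # drop the leading n_samples dimension if it is part of the given shape
    obs_shape = tuple(shape[1:]) if samples_dim_included else tuple(shape)
    # expand scalar (or already matching) bounds to the observation shape
    low_arr = np.broadcast_to(np.asarray(low, dtype=np.float32), obs_shape).copy()
    high_arr = np.broadcast_to(np.asarray(high, dtype=np.float32), obs_shape).copy()
    return obs_shape, low_arr, high_arr


# features of shape (n_samples=8, n_features=2) yield observations of shape (2,)
obs_shape, low_arr, high_arr = observation_bounds((8, 2), low=0.0, high=1.0)
same_shape, _, _ = observation_bounds((2,), samples_dim_included=False)
```
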
Return the current observation. This function is for the simple case where the observation is only an x,y pair. For more complex observations, this function should be overwritten.