Wrappers example

In this notebook, we will explore Sinergym’s pre-defined wrappers and how to use them.

You can also create your own wrappers by inheriting from gym.Wrapper or any of its variants.

[ ]:
import gymnasium as gym
import numpy as np

import sinergym
from sinergym.utils.wrappers import *
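As noted above, a custom wrapper only needs to subclass gym.Wrapper (or one of its variants, such as gym.ObservationWrapper or gym.RewardWrapper). Below is a minimal, hypothetical sketch of a wrapper that clips the scalar reward, just to illustrate the pattern:

[ ]:
class ClipReward(gym.Wrapper):
    """Hypothetical example wrapper: clips the scalar reward to a fixed range."""

    def __init__(self, env, low=-1.0, high=1.0):
        super().__init__(env)
        self.low, self.high = low, high

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Clip the reward before returning it to the agent
        return obs, float(np.clip(reward, self.low, self.high)), terminated, truncated, info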

Multi-objective wrapper

MO-Gymnasium is an open-source Python library for developing and comparing multi-objective reinforcement learning algorithms.

Available MO-Gymnasium environments return a reward vector instead of a scalar value, one for each objective.

This wrapper enables Sinergym to return a reward vector. This way, Sinergym is made compatible with both multi-objective algorithms and algorithms that work with a traditional reward value.

We can transform the returned reward into a vector as follows:

[ ]:
env = gym.make('Eplus-5zone-hot-discrete-v1')
env = MultiObjectiveReward(env, reward_terms=['energy_term', 'comfort_term'])

Make sure that reward_terms are available in the info dict returned by the environment’s step method. Otherwise, an execution error will occur.

By default, Sinergym environments return all reward terms of the reward class in the info dict.

[ ]:
env.reset()
action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)
env.close()

print(reward)
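The reward is now a vector with one entry per selected term. If you want to feed it to a standard single-objective algorithm, a common approach (outside the wrapper itself) is a simple weighted sum; the weights below are arbitrary and only for illustration:

[ ]:
# Hypothetical scalarization of the reward vector printed above
# (the weights are arbitrary, chosen only for illustration).
weights = np.array([0.5, 0.5])
scalar_reward = float(np.dot(weights, np.array(reward)))
print('Scalarized reward: ', scalar_reward)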

Previous observation wrapper

This wrapper will add previous timestep observation values to the current environment observation.

You can select the variables whose previous observed values should be tracked. The observation space will be updated with the corresponding new dimension.

[ ]:
env = gym.make('Eplus-5zone-hot-discrete-v1')
env = PreviousObservationWrapper(env, previous_variables=[
    'htg_setpoint',
    'clg_setpoint',
    'air_temperature'])

You can see how the observation values have been updated:

[ ]:
env.reset()
obs, _, _, _, _ = env.step(env.action_space.sample())
obs_dict = dict(zip(env.get_wrapper_attr('observation_variables'), obs))
env.close()

print('NEW OBSERVATION: ', obs_dict)
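You can also filter the dictionary to display only the entries added by the wrapper. This assumes the new variables reuse the original names with a suffix such as '_previous'; check the names printed above and adjust the filter if needed:

[ ]:
# Show only the entries added by the wrapper (the 'previous' substring is an
# assumption; adjust it to the variable names actually printed above).
previous_entries = {k: v for k, v in obs_dict.items() if 'previous' in k}
print('PREVIOUS-VALUE ENTRIES: ', previous_entries)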

Datetime wrapper

This wrapper replaces the day value with an is_weekend flag, and the hour and month values with their sine and cosine encodings.

The observation space is also automatically updated.

[ ]:
env = gym.make('Eplus-5zone-hot-discrete-v1')
env = DatetimeWrapper(env)

This wrapper removes the observation variables month, day, and hour, and replaces them with month_sin, month_cos, is_weekend, hour_sin, and hour_cos:

[ ]:
env.reset()
obs, _, _, _, _ = env.step(env.action_space.sample())
obs_dict = dict(zip(env.get_wrapper_attr('observation_variables'), obs))
env.close()
print('NEW OBSERVATION: ', obs_dict)
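To make the encoding more concrete, here is a generic sketch of how an hour and a month can be mapped to cyclical sine/cosine features. This is illustrative only and not necessarily the wrapper's exact implementation:

[ ]:
# Generic cyclical encoding sketch (illustrative only): map hour [0, 24) and
# month [1, 12] onto the unit circle, so 23h stays close to 0h and December to January.
hour, month = 14, 7
hour_sin, hour_cos = np.sin(2 * np.pi * hour / 24), np.cos(2 * np.pi * hour / 24)
month_sin, month_cos = np.sin(2 * np.pi * (month - 1) / 12), np.cos(2 * np.pi * (month - 1) / 12)
print(hour_sin, hour_cos, month_sin, month_cos)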

Action normalization wrapper

Here’s an example of how to normalize a continuous action space using the NormalizeAction wrapper.

If the normalization range is not defined, it will be [-1,1] by default.

[ ]:
# Create a continuous environment
env = gym.make('Eplus-5zone-hot-continuous-v1')
print('ORIGINAL ACTION SPACE: ', env.get_wrapper_attr('action_space'))

# Apply the normalization wrapper
env = NormalizeAction(env, normalize_range=(-1.0, 1.0))
print('WRAPPED ACTION SPACE: ', env.get_wrapper_attr('action_space'))

env.reset()
for i in range(5):
    action = env.action_space.sample()
    print('Normalized action: ', action)
    _, _, _, _, info = env.step(action)
    print('Action performed in the simulator: ', info['action'])
env.close()
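Under the hood, the mapping between the normalized range and the original action bounds is a linear rescaling. The sketch below shows the idea with example bounds chosen only for illustration (the wrapper's internal implementation may differ slightly):

[ ]:
# Illustrative linear rescaling from [norm_low, norm_high] back to [low, high]
# (a plain affine mapping; the wrapper's internals may differ).
def denormalize(action, low, high, norm_low=-1.0, norm_high=1.0):
    return low + (action - norm_low) * (high - low) / (norm_high - norm_low)


# Example bounds, only for illustration; check env.action_space for the real ones.
low, high = np.array([15.0, 22.5]), np.array([22.5, 30.0])
print('Denormalized action: ', denormalize(np.array([0.0, 0.0]), low, high))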

Action discretization wrapper

Let’s see how to discretize a continuous action space. We will need to specify the new discrete action space and an action mapping function whose output matches the original unwrapped action space:

[ ]:
# We will create a continuous environment
env = gym.make('Eplus-5zone-hot-continuous-v1')
print('ORIGINAL ACTION SPACE: ', env.get_wrapper_attr('action_space'))
print('IS DISCRETE?: ', env.get_wrapper_attr('is_discrete'))

# Defining new discrete space and action mapping function
new_discrete_space = gym.spaces.Discrete(10)  # Action values [0,9]


def action_mapping_function(action):
    mapping = {
        0: [15, 30],  # These lists match with the original action space
        1: [16, 29],
        2: [17, 28],
        3: [18, 27],
        4: [19, 26],
        5: [20, 25],
        6: [21, 24],
        7: [22, 23],
        8: [22, 22.5],
        9: [21, 22.5]
    }

    return mapping[action]


# Apply the discretization wrapper
env = DiscretizeEnv(env, discrete_space=new_discrete_space,
                    action_mapping=action_mapping_function)
print('WRAPPED ACTION SPACE: ', env.get_wrapper_attr('action_space'))
print('IS DISCRETE?: ', env.get_wrapper_attr('is_discrete'))
env.reset()
for i in range(5):
    action = env.action_space.sample()
    print('ACTION DISCRETE: ', action)
    _, _, _, _, info = env.step(action)
    print('Action performed in the simulator: ', info['action'])
env.close()
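Before training, it can be useful to check that every mapped action actually falls inside the original continuous space. The following sketch reuses the objects defined in the cell above:

[ ]:
# Sanity check (illustrative): each mapped action should lie inside the
# original continuous Box space of the unwrapped environment.
original_space = env.unwrapped.action_space
for discrete_action in range(new_discrete_space.n):
    mapped = np.array(action_mapping_function(discrete_action), dtype=np.float32)
    print(discrete_action, '->', mapped,
          'inside original space?', original_space.contains(mapped))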

Discrete incremental wrapper

This wrapper updates an environment to use an incremental setpoint action space. It converts the environment into a discrete one, with an action mapping and an action space that depend on the specified step and delta values.

Each action is added to the current setpoint values instead of overwriting the latest action. Thus, the action sent to the simulator is the current setpoints with the increment or decrement applied, rather than the discrete action that selects the increment or decrement itself.

[ ]:
env = gym.make('Eplus-5zone-hot-continuous-v1')
print('ORIGINAL ACTION SPACE: ', env.get_wrapper_attr('action_space'))

env = DiscreteIncrementalWrapper(
    env, initial_values=[21.0, 25.0], delta_temp=2, step_temp=0.5)

print('WRAPPED ACTION SPACE: ', env.get_wrapper_attr('action_space'))
print('WRAPPED ACTION MAPPING: ', env.get_wrapper_attr('action_mapping'))

The maximum and minimum values defined when creating the action mapping are read from the environment action space, ensuring that the setpoint increments and decrements do not exceed the corresponding limits.

The delta and step values are used to determine how the discrete space of these increments and decrements will be constructed.
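For instance, with delta_temp=2 and step_temp=0.5, the available increments per setpoint would roughly be the values from -2 to +2 in steps of 0.5. The snippet below sketches that idea; it is not the wrapper's exact code:

[ ]:
# Sketch of how delta/step define the available increments (illustrative only):
# values from -delta_temp to +delta_temp in multiples of step_temp, including 0.
delta_temp, step_temp = 2.0, 0.5
increments = np.arange(-delta_temp, delta_temp + step_temp, step_temp)
print(increments)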

Here’s an example of how it works:

[ ]:
env.reset()
print('CURRENT SETPOINTS VALUES: ', env.get_wrapper_attr('current_setpoints'))

for i in range(5):
    action = env.action_space.sample()
    _, _, _, _, info = env.step(action)
    print('Action number ', i, ': ',
          env.get_wrapper_attr('action_mapping')(action))
    print('Setpoints update: ', info['action'])
env.close()

Normalization wrapper

This wrapper is used to transform the observations received from the simulator into values in the range [-1, 1].

It is based on Gymnasium’s dynamic normalization wrapper.

Until its running statistics are properly calibrated, the normalization may not be precise and values may fall outside the expected range, so use this wrapper with caution.
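The dynamic normalization this wrapper is based on keeps running estimates of the observation mean and variance. The following simplified sketch illustrates the idea; it is not the wrapper's actual code:

[ ]:
# Simplified running-statistics normalizer (illustrative only).
class RunningNormalizer:
    def __init__(self, shape, epsilon=1e-8):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.count = epsilon

    def __call__(self, obs):
        # Update the running mean and variance, then normalize the observation
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.var += (delta * (obs - self.mean) - self.var) / self.count
        return (obs - self.mean) / np.sqrt(self.var + 1e-8)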

[ ]:
# Original env
env = gym.make('Eplus-5zone-hot-discrete-v1')

# Normalized env
env = NormalizeObservation(
    env=env)

You can check how the observation values have been normalized:

[ ]:
env.reset()

obs, _, _, _, _ = env.step(env.action_space.sample())
obs_dict = dict(zip(env.get_wrapper_attr('observation_variables'), obs))
env.close()

print('OBSERVATION WITH NORMALIZATION: ', obs_dict)

Logging and storing data with logger wrappers

LoggerWrapper layer

This wrapper uses Sinergym’s LoggerStorage class to properly capture the interaction flow with the environment.

The storage class used by the wrapper can be replaced with a different back-end. It can then be combined with other wrappers that save the stored data, such as CSVLogger or WandBLogger. For more information about Sinergym’s logger, see the Logging System Overview, the Logger Wrappers section, and the example about custom loggers.

[ ]:
env = gym.make('Eplus-5zone-hot-discrete-v1')
env = LoggerWrapper(env, storage_class=LoggerStorage)

This wrapper enables the use of a LoggerStorage instance within the environment class and automatically captures interaction data while actions are sent by an agent. At each reset, the data from this class is cleared to start the next episode. The idea is to combine it with other output loggers like those listed below:

CSVLogger layer

[ ]:
env = CSVLogger(env)

env.reset()
truncated = terminated = False
current_month = 0

while not (terminated or truncated):
    a = env.action_space.sample()
    _, _, terminated, truncated, _ = env.step(a)
env.close()

Once LoggerWrapper has been applied, this wrapper can be used to dump the interaction data of each episode, together with summary metrics, into CSV files in Sinergym’s output directory. More details on this structure can be found in OutputFormat.

Sinergym will raise an error if this wrapper is used without first enabling LoggerWrapper or a similar custom logger.

WandBLogger layer

[ ]:
# env = WandBLogger(env = env,
#                   entity = <wandb_account_entity>,
#                   project_name = <wandb_project_name>,
#                   run_name = <run_name>,
#                   group = 'Notebook_example',
#                   tags = ['tag1', 'tag2'],
#                   save_code = False,
#                   dump_frequency = 1000,
#                   artifact_save = True,
#                   artifact_type = 'output',
#                   excluded_info_keys = ['reward',
#                                         'action',
#                                         'timestep',
#                                         'month',
#                                         'day',
#                                         'hour',
#                                         'time_elapsed(hours)',
#                                         'reward_weight',
#                                         'is_raining'],
#                   excluded_episode_summary_keys = ['terminated',
#                                                    'truncated'])

# env.reset()
# truncated = terminated = False
# current_month = 0
# while not (terminated or truncated):
#     a = env.action_space.sample()
#     _,_,terminated,truncated,_=env.step(a)
# env.close()

Similar to CSVLogger, this wrapper requires the environment to have been previously encapsulated by a LoggerWrapper or any custom logger.

The user must have a pre-existing Weights and Biases account and have it correctly configured.

This wrapper does not override CSVLogger, so both can be applied simultaneously.

Multi-observation wrapper

This wrapper stacks observations in a history queue, whose size can be customized:

[ ]:
# Original environment
env = gym.make('Eplus-5zone-hot-discrete-v1')
obs, info = env.reset()
print('BEFORE MULTI OBSERVATION: ', obs)

# Multi-observation environment with a queue of size 5
env = MultiObsWrapper(env, n=5, flatten=True)
obs, info = env.reset()

The result is:

[ ]:
print('MULTI OBSERVATION: \n', obs)
env.close()
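With flatten=True, the stacked history is returned as a single flat vector, so its length should be n times the size of the base observation. A quick check, assuming the stacking is a simple concatenation:

[ ]:
# Quick sanity check (assumes flatten=True concatenates the n stacked
# observations into one flat vector).
base_size = env.unwrapped.observation_space.shape[0]
print('Base observation size: ', base_size)
print('Stacked observation size: ', len(obs), '(expected', 5 * base_size, ')')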

Weather forecasting wrapper

This wrapper adds weather forecast information to the current observation.

[ ]:
# Original environment
env = gym.make('Eplus-5zone-hot-discrete-v1')
obs, info = env.reset()
print('OBSERVATION VARIABLES BEFORE WEATHER FORECASTING: ',
      env.get_wrapper_attr('observation_variables'))
print('OBSERVATION BEFORE WEATHER FORECASTING: ', obs)

# Weather forecasting environment
env = WeatherForecastingWrapper(env, n=5, delta=1)
obs, info = env.reset()

We can observe the results:

[ ]:
print('OBSERVATION VARIABLES AFTER WEATHER FORECASTING: ',
      env.get_wrapper_attr('observation_variables'))

print('OBSERVATION AFTER WEATHER FORECASTING: ', obs)
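To see how many forecast values were appended, you can compare the length of the wrapped observation with the size of the underlying environment's observation space (a small illustrative check):

[ ]:
# Count how many forecast values the wrapper appended to the observation
# (illustrative; compares the wrapped observation length with the base space).
n_added = len(obs) - env.unwrapped.observation_space.shape[0]
print('Number of added forecast values: ', n_added)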

Energy cost wrapper

This wrapper adds energy cost information to the current observation:

[ ]:
# Original environment
env = gym.make('Eplus-5zone-hot-discrete-v1')
obs, info = env.reset()
print('OBSERVATION VARIABLES BEFORE ADDING ENERGY COST: \n',
      env.get_wrapper_attr('observation_variables'))
print('OBSERVATION VALUES BEFORE ADDING ENERGY COST: \n', obs)

# Energy Cost environment
env = EnergyCostWrapper(
    env, energy_cost_data_path='/workspaces/sinergym/sinergym/data/energy_cost/PVPC_active_energy_billing_Iberian_Peninsula_2023.csv')
obs, info = env.reset()

This is the result:

[ ]:
print('OBSERVATION VARIABLES AFTER ADDING ENERGY COST: \n',
      env.get_wrapper_attr('observation_variables'))
print('OBSERVATION VALUES AFTER ADDING ENERGY COST: \n', obs)

Nesting wrappers

All wrappers included in Sinergym are stackable and organized in layers. However, the order in which these layers are applied can affect the final result, depending on the wrappers being used.

For instance, applying the logger before normalizing differs from doing it in the reverse order. In the first case, the data will be logged without normalization, even though the agent will operate in a normalized environment. In the second case, the logger will capture the normalized values since it encapsulates the normalization applied by the previous layer.

An example of how to nest wrappers is shown below:

[ ]:
env = gym.make('Eplus-5zone-hot-continuous-v1')
env = MultiObjectiveReward(
    env=env,
    reward_terms=[
        'energy_term',
        'comfort_term'])
env = PreviousObservationWrapper(env, previous_variables=[
    'htg_setpoint',
    'clg_setpoint',
    'air_temperature'])
env = DatetimeWrapper(env)
env = DiscreteIncrementalWrapper(
    env, initial_values=[21.0, 25.0], delta_temp=2, step_temp=0.5)
env = NormalizeObservation(
    env=env)
env = LoggerWrapper(env=env)
env = MultiObsWrapper(env=env, n=5, flatten=True)

Now we can simply use the wrapped environment as follows:

[ ]:
for i in range(1):
    obs, info = env.reset()
    truncated = terminated = False
    current_month = 0
    while not (terminated or truncated):
        a = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(a)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', reward, info)
env.close()