18. Basic example

Sinergym uses the standard Farama Gymnasium API. Let’s explore how to create a basic interaction loop.

To begin, we need to import Sinergym and create an environment. In this example, we will use the Eplus-demo-v1 environment.

[1]:
import gymnasium as gym
import numpy as np

import sinergym
env = gym.make('Eplus-demo-v1')
#==============================================================================================#
[ENVIRONMENT] (INFO) : Creating Gymnasium environment.
[ENVIRONMENT] (INFO) : Name: demo-v1
#==============================================================================================#
[MODELING] (INFO) : Experiment working directory created.
[MODELING] (INFO) : Working directory: /workspaces/sinergym/examples/Eplus-env-demo-v1-res1
[MODELING] (INFO) : Model Config is correct.
[MODELING] (INFO) : Update building model Output:Variable with variable names.
[MODELING] (INFO) : Update building model Output:Meter with meter names.
[MODELING] (INFO) : Extra config: runperiod updated to {'apply_weekend_holiday_rule': 'No', 'begin_day_of_month': 1, 'begin_month': 1, 'begin_year': 1991, 'day_of_week_for_start_day': 'Monday', 'end_day_of_month': 1, 'end_month': 3, 'end_year': 1991, 'use_weather_file_daylight_saving_period': 'Yes', 'use_weather_file_holidays_and_special_days': 'Yes', 'use_weather_file_rain_indicators': 'Yes', 'use_weather_file_snow_indicators': 'Yes'}
[MODELING] (INFO) : Updated episode length (seconds): 5184000.0
[MODELING] (INFO) : Updated timestep size (seconds): 3600.0
[MODELING] (INFO) : Updated timesteps per episode: 1440
[MODELING] (INFO) : Runperiod established.
[MODELING] (INFO) : Episode length (seconds): 5184000.0
[MODELING] (INFO) : timestep size (seconds): 3600.0
[MODELING] (INFO) : timesteps per episode: 1440
[REWARD] (INFO) : Reward function initialized.
[ENVIRONMENT] (INFO) : Environment created successfully.

At first glance, it may seem that Sinergym is imported but never used. However, importing Sinergym registers all of its environments with Gymnasium, so Eplus-demo-v1 is readily available with all its features.
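
We can check this by inspecting the environment’s spaces through the standard Gymnasium attributes. This is a minimal sketch; the concrete bounds and shapes will depend on the Sinergym version installed:

# Inspect the spaces exposed by the environment (standard Gymnasium attributes)
print('Observation space: ', env.observation_space)
print('Action space: ', env.action_space)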

With this straightforward setup, we’re prepared to iterate over the episodes. For this basic example, we’ll consider just one episode. Essentially, the required code would look something like this:

[2]:
for i in range(1):
    obs, info = env.reset()
    rewards = []
    truncated = terminated = False
    current_month = 0
    while not (terminated or truncated):
        a = env.action_space.sample()  # sample a random action
        obs, reward, terminated, truncated, info = env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode.
[ENVIRONMENT] (INFO) : Episode 1: demo-v1
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created.
[MODELING] (INFO) : Weather file USA_PA_Pittsburgh-Allegheny.County.AP.725205_TMY3.epw used.
[MODELING] (INFO) : Adapting weather to building model.
[ENVIRONMENT] (INFO) : Saving episode output path.
[ENVIRONMENT] (INFO) : Episode 1 started.
[SIMULATOR] (INFO) : handlers initialized.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
Reward:  -43.96143518328036 {'time_elapsed(hours)': 2.5, 'month': 1, 'day': 1, 'hour': 1, 'is_raining': False, 'action': array([21.257265, 22.842768], dtype=float32), 'timestep': 1, 'reward': -43.96143518328036, 'energy_term': -43.67932315835093, 'comfort_term': -0.2821120249294271, 'reward_weight': 0.5, 'abs_energy_penalty': -87.35864631670186, 'abs_comfort_penalty': -0.5642240498588542, 'total_power_demand': 87.35864631670186, 'total_temperature_violation': 0.5642240498588542}
Simulation Progress [Episode 1]:  53%|█████▎    | 53/100 [00:00<00:00, 206.77%/s, 53% completed] Reward:  -1654090.4591468547 {'time_elapsed(hours)': 745.0833333333334, 'month': 2, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([15.304309, 23.142662], dtype=float32), 'timestep': 744, 'reward': -10307.216587526324, 'energy_term': -10307.216587526324, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -20614.433175052647, 'abs_comfort_penalty': 0, 'total_power_demand': 20614.433175052647, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]:  98%|█████████▊| 98/100 [00:00<00:00, 176.82%/s, 98% completed]Reward:  -2817928.777706766 {'time_elapsed(hours)': 1417.25, 'month': 3, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([15.631611, 24.258501], dtype=float32), 'timestep': 1416, 'reward': -2181.3837787755947, 'energy_term': -2181.3837787755947, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -4362.767557551189, 'abs_comfort_penalty': 0, 'total_power_demand': 4362.767557551189, 'total_temperature_violation': 0.0}
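
The info dictionary returned by step() also breaks the reward down into its components (energy_term, comfort_term, total_power_demand, total_temperature_violation, among others), as shown in the output above. As a minimal sketch, these terms could be collected over an additional randomly actuated episode (hypothetical analysis code, not part of the original example):

# Minimal sketch: track the reward decomposition exposed in `info`.
# The keys used here are the ones shown in the printed info dicts above.
obs, info = env.reset()
terminated = truncated = False
energy_terms, comfort_terms = [], []
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    energy_terms.append(info['energy_term'])
    comfort_terms.append(info['comfort_term'])
print('Cumulative energy term: ', sum(energy_terms))
print('Cumulative comfort term: ', sum(comfort_terms))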

As always, remember to close the environment once the interaction is complete:

[3]:
env.close()
Simulation Progress [Episode 1]:  98%|█████████▊| 98/100 [00:02<00:00, 38.51%/s, 98% completed]
[ENVIRONMENT] (INFO) : Environment closed. [demo-v1]
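
If the interaction loop can fail (for instance, while debugging a custom controller), wrapping it in try/finally guarantees that the environment and its background simulation are always closed. A minimal sketch, reusing the same environment ID:

# Minimal sketch: ensure env.close() runs even if the loop raises an exception.
env = gym.make('Eplus-demo-v1')
try:
    obs, info = env.reset()
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
finally:
    env.close()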

Now, let’s examine the final rewards:

[4]:
print(
    'Mean reward: ',
    np.mean(rewards),
    'Cumulative reward: ',
    sum(rewards))
Mean reward:  -1968.0675045384069 Cumulative reward:  -2834017.206535306

Sinergym offers an extensive list of registered environments. They combine building models with varying configurations: continuous or discrete action spaces, different weather types, weather noise, run periods, timesteps, reward functions, and more. We’ll explore these in the upcoming notebooks.
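
As a minimal sketch using Gymnasium’s standard registry, the environments registered when Sinergym is imported can be listed by filtering the environment IDs, which, as with Eplus-demo-v1, share the Eplus prefix:

# Minimal sketch: list the Sinergym environments registered on import.
# gym.envs.registry is Gymnasium's standard registry of environment specs.
sinergym_ids = [env_id for env_id in gym.envs.registry if env_id.startswith('Eplus')]
print(len(sinergym_ids), 'registered environments, e.g.:', sinergym_ids[:5])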