16. Basic example
Sinergym uses the standard Farama Gymnasium API. Let's see how to create a basic control loop.
First, we need to import Sinergym and create an environment, in our case Eplus-demo-v1:
[2]:
import gymnasium as gym
import numpy as np
import sinergym
env = gym.make('Eplus-demo-v1')
#==============================================================================================#
[ENVIRONMENT] (INFO) : Creating Gymnasium environment... [demo-v1]
#==============================================================================================#
[MODELING] (INFO) : Experiment working directory created [/workspaces/sinergym/examples/Eplus-env-demo-v1-res4]
[MODELING] (INFO) : runperiod established: {'start_day': 1, 'start_month': 1, 'start_year': 1991, 'end_day': 31, 'end_month': 12, 'end_year': 1991, 'start_weekday': 1, 'n_steps_per_hour': 4}
[MODELING] (INFO) : Episode length (seconds): 31536000.0
[MODELING] (INFO) : timestep size (seconds): 900.0
[MODELING] (INFO) : timesteps per episode: 35040
[MODELING] (INFO) : Model Config is correct.
[REWARD] (INFO) : Reward function initialized.
[ENVIRONMENT] (INFO) : Environment demo-v1 created successfully.
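Note how the values in the log fit together: with 4 timesteps per hour, each timestep covers 3600 / 4 = 900 seconds, and a full year gives 365 × 24 × 4 = 35,040 timesteps, i.e. 31,536,000 simulated seconds per episode.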
At first glance, it may appear that Sinergym is imported but never used. However, importing Sinergym registers all of its environments with Gymnasium. In this case, Eplus-demo-v1
becomes available with all its features.
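If you want to check what that import registered, Gymnasium exposes its environment registry as a dictionary of IDs, so the Sinergym environments can be listed with something like this (a minimal sketch, assuming the dict-style registry of recent Gymnasium versions):

import gymnasium as gym
import sinergym  # registers all Eplus-* environments as a side effect

# Collect every Sinergym environment ID currently registered
eplus_env_ids = [env_id for env_id in gym.envs.registry if env_id.startswith('Eplus')]
print(eplus_env_ids)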
Once the environment is created, we are ready to iterate over episodes. For this simple example, we will run only one episode. In summary, the code we need looks like this:
[3]:
for i in range(1):
    obs, info = env.reset()
    rewards = []
    terminated = False
    current_month = 0
    while not terminated:
        a = env.action_space.sample()  # sample a random action
        obs, reward, terminated, truncated, info = env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode... [demo-v1] [Episode 1]
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created [/workspaces/sinergym/examples/Eplus-env-demo-v1-res4/Eplus-env-sub_run1]
[MODELING] (INFO) : Weather file USA_PA_Pittsburgh-Allegheny.County.AP.725205_TMY3.epw used.
[MODELING] (INFO) : Updated building model with whole Output:Variable available names
[MODELING] (INFO) : Updated building model with whole Output:Meter available names
[MODELING] (INFO) : Extra config: runperiod updated to {'apply_weekend_holiday_rule': 'No', 'begin_day_of_month': 1, 'begin_month': 1, 'begin_year': 1991, 'day_of_week_for_start_day': 'Tuesday', 'end_day_of_month': 1, 'end_month': 3, 'end_year': 1991, 'use_weather_file_daylight_saving_period': 'Yes', 'use_weather_file_holidays_and_special_days': 'Yes', 'use_weather_file_rain_indicators': 'Yes', 'use_weather_file_snow_indicators': 'Yes'}
[MODELING] (INFO) : Updated episode length (seconds): 5184000.0
[MODELING] (INFO) : Updated timestep size (seconds): 3600.0
[MODELING] (INFO) : Updated timesteps per episode: 1440
[MODELING] (INFO) : Adapting weather to building model. [USA_PA_Pittsburgh-Allegheny.County.AP.725205_TMY3.epw]
[ENVIRONMENT] (INFO) : Saving episode output path... [/workspaces/sinergym/examples/Eplus-env-demo-v1-res4/Eplus-env-sub_run1/output]
/usr/local/lib/python3.10/dist-packages/opyplus/weather_data/weather_data.py:493: FutureWarning: the 'line_terminator'' keyword is deprecated, use 'lineterminator' instead.
epw_content = self._headers_to_epw(use_datetimes=use_datetimes) + df.to_csv(
[SIMULATOR] (INFO) : Running EnergyPlus with args: ['-w', '/workspaces/sinergym/examples/Eplus-env-demo-v1-res4/Eplus-env-sub_run1/USA_PA_Pittsburgh-Allegheny.County.AP.725205_TMY3.epw', '-d', '/workspaces/sinergym/examples/Eplus-env-demo-v1-res4/Eplus-env-sub_run1/output', '/workspaces/sinergym/examples/Eplus-env-demo-v1-res4/Eplus-env-sub_run1/5ZoneAutoDXVAV.epJSON']
[ENVIRONMENT] (INFO) : Episode 1 started.
[SIMULATOR] (INFO) : handlers initialized.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
Reward: -0.7359103431230598 {'time_elapsed(hours)': 2.5, 'month': 1, 'day': 1, 'hour': 1, 'is_raining': False, 'action': array([18.868284, 25.783413], dtype=float32), 'timestep': 2, 'reward': -0.7359103431230598, 'energy_term': -0.04774477368803297, 'comfort_term': -0.6881655694350268, 'reward_weight': 0.5, 'abs_energy': 954.8954737606593, 'abs_comfort': 1.3763311388700536, 'energy_values': [954.8954737606593], 'temp_values': [18.623668861129946]}
Reward: -184.50823358985986 {'time_elapsed(hours)': 745.25, 'month': 2, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([22.103514, 25.652822], dtype=float32), 'timestep': 745, 'reward': -0.17422294281217932, 'energy_term': -0.04774477368803297, 'comfort_term': -0.12647816912414633, 'reward_weight': 0.5, 'abs_energy': 954.8954737606593, 'abs_comfort': 0.25295633824829267, 'energy_values': [954.8954737606593], 'temp_values': [19.747043661751707]}
Reward: -297.68860299767925 {'time_elapsed(hours)': 1417.3333333333333, 'month': 3, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([18.174824, 25.777523], dtype=float32), 'timestep': 1417, 'reward': -0.02780659811395238, 'energy_term': -0.019212040973969797, 'comfort_term': -0.008594557139982584, 'reward_weight': 0.5, 'abs_energy': 384.2408194793959, 'abs_comfort': 0.017189114279965167, 'energy_values': [384.2408194793959], 'temp_values': [19.982810885720035]}
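As the output above shows, the info dictionary returned by step() carries the reward decomposition alongside the simulation time. For instance, the energy and comfort terms can be read directly (keys taken from the output of this run):

# Inspect the last step's reward decomposition
print('Energy term:  ', info['energy_term'])
print('Comfort term: ', info['comfort_term'])
print('Reward weight:', info['reward_weight'])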
And, as always, don’t forget to close the environment when the interaction finishes:
[4]:
env.close()
[ENVIRONMENT] (INFO) : Environment closed. [demo-v1]
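In longer scripts where the interaction loop might fail halfway through, a plain Python try/finally (a generic pattern, nothing Sinergym-specific) guarantees the simulator is shut down:

env = gym.make('Eplus-demo-v1')
try:
    obs, info = env.reset()
    terminated = False
    while not terminated:
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
finally:
    env.close()  # always runs, even if step() raises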
Now, we can see the final rewards:
[5]:
print(
    'Mean reward: ',
    np.mean(rewards),
    'Cumulative reward: ',
    sum(rewards))
Mean reward: -0.21052321517551723 Cumulative reward: -303.1534298527454
The list of environments registered in Sinergym is extensive: they use different building files and vary in their particularities, such as continuous or discrete action spaces, weather types, noise over the weather, run periods, timesteps, and reward functions. We will see them in the following notebooks.
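As a taste of that variety, any other registered ID can be created in exactly the same way. For example (the ID below is an assumption for illustration; check the registry listing shown earlier for the IDs available in your installed version):

# Hypothetical example: a 5Zone building with a continuous action space
env = gym.make('Eplus-5zone-hot-continuous-v1')
print(env.action_space)
print(env.observation_space)
env.close()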