15. Basic example
Sinergym uses the standard OpenAI gym API. Let's see how to create a basic loop.
First, we need to import Sinergym and create an environment, in our case using ‘Eplus-demo-v1’:
[1]:
import gym
import numpy as np
import sinergym
env = gym.make('Eplus-demo-v1')
/usr/local/lib/python3.10/dist-packages/gym/spaces/box.py:73: UserWarning: WARN: Box bound precision lowered by casting to float32
logger.warn(
[2022-10-07 09:00:27,602] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf ExternalInterface object if it is not present...
[2022-10-07 09:00:27,603] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2022-10-07 09:00:27,605] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf Output:Variable and variables XML tree model for BCVTB connection.
[2022-10-07 09:00:27,606] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2022-10-07 09:00:27,607] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up action definition in building model if exists...
At first glance it may appear that Sinergym is imported but never used. However, importing Sinergym registers all of its environments for use with gym.make, in this case ‘Eplus-demo-v1’, together with all the information contained in the IDF file and the configuration file.
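For instance, we can query gym's registry to see which environments Sinergym has registered. This is a minimal sketch, not part of the example above; the registry API differs between gym versions, so both variants are attempted:

import gym
import sinergym

# Collect all registered environment ids. Older gym versions expose
# registry.all(); newer ones (>= 0.26) use a plain dict of specs.
try:
    env_ids = [spec.id for spec in gym.envs.registry.all()]
except AttributeError:
    env_ids = list(gym.envs.registry.keys())

# Keep only the Sinergym environments, whose ids start with 'Eplus'.
print([env_id for env_id in env_ids if env_id.startswith('Eplus')])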
After this simple definition, we are ready to loop over episodes. For this simple example we are going to consider only one episode. In summary, the code we need is something like this:
[2]:
for i in range(1):
    obs = env.reset()
    rewards = []
    done = False
    current_month = 0
    while not done:
        a = env.action_space.sample()
        obs, reward, done, info = env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
[2022-10-07 09:00:27,862] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2022-10-07 09:00:27,873] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-demo-v1-res1/Eplus-env-sub_run1
Reward: -0.5693658209031192 {'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0, 'total_power': 3780.170717786078, 'total_power_no_units': -0.3780170717786078, 'comfort_penalty': -0.7607145700276305, 'abs_comfort': 0.7607145700276305, 'temperatures': [19.23928542997237], 'out_temperature': 1.8, 'action_': [18, 27]}
Reward: -2112.7625904745278 {'timestep': 2976, 'time_elapsed': 2678400, 'year': 1991, 'month': 2, 'day': 1, 'hour': 0, 'total_power': 22348.47236479097, 'total_power_no_units': -2.234847236479097, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.18911065438731], 'out_temperature': -7.0, 'action_': [22, 23]}
Reward: -4104.265504441257 {'timestep': 5664, 'time_elapsed': 5097600, 'year': 1991, 'month': 3, 'day': 1, 'hour': 0, 'total_power': 12948.13071758654, 'total_power_no_units': -1.2948130717586541, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.87810890551877], 'out_temperature': 8.1, 'action_': [21, 24]}
Reward: -5417.441233237612 {'timestep': 8640, 'time_elapsed': 7776000, 'year': 1991, 'month': 4, 'day': 1, 'hour': 0, 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.12519567758555894, 'abs_comfort': 0.12519567758555894, 'temperatures': [19.87480432241444], 'out_temperature': 7.7, 'action_': [19, 26]}
Reward: -6334.076617115651 {'timestep': 11520, 'time_elapsed': 10368000, 'year': 1991, 'month': 5, 'day': 1, 'hour': 0, 'total_power': 9909.57629951364, 'total_power_no_units': -0.9909576299513639, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [21.98956711598672], 'out_temperature': 13.0, 'action_': [22, 22]}
Reward: -7196.337141748097 {'timestep': 14496, 'time_elapsed': 13046400, 'year': 1991, 'month': 6, 'day': 1, 'hour': 0, 'total_power': 257.32472912215, 'total_power_no_units': -0.025732472912215004, 'comfort_penalty': -1.6342081608280488, 'abs_comfort': 1.6342081608280488, 'temperatures': [21.36579183917195], 'out_temperature': 18.4, 'action_': [19, 26]}
Reward: -10119.078686707608 {'timestep': 17376, 'time_elapsed': 15638400, 'year': 1991, 'month': 7, 'day': 1, 'hour': 0, 'total_power': 175.7796758221068, 'total_power_no_units': -0.017577967582210682, 'comfort_penalty': -2.308881942532789, 'abs_comfort': 2.308881942532789, 'temperatures': [20.69111805746721], 'out_temperature': 17.7, 'action_': [20, 25]}
Reward: -13344.34795468286 {'timestep': 20352, 'time_elapsed': 18316800, 'year': 1991, 'month': 8, 'day': 1, 'hour': 0, 'total_power': 4274.841504910937, 'total_power_no_units': -0.42748415049109373, 'comfort_penalty': -2.6746607947314196, 'abs_comfort': 2.6746607947314196, 'temperatures': [20.32533920526858], 'out_temperature': 20.6, 'action_': [16, 29]}
Reward: -16532.561015096362 {'timestep': 23328, 'time_elapsed': 20995200, 'year': 1991, 'month': 9, 'day': 1, 'hour': 0, 'total_power': 296.4221825034278, 'total_power_no_units': -0.02964221825034278, 'comfort_penalty': -2.1542061821218716, 'abs_comfort': 2.1542061821218716, 'temperatures': [20.84579381787813], 'out_temperature': 18.8, 'action_': [17, 28]}
Reward: -19347.21771433634 {'timestep': 26208, 'time_elapsed': 23587200, 'year': 1991, 'month': 10, 'day': 1, 'hour': 0, 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.14597880471529123, 'abs_comfort': 0.14597880471529123, 'temperatures': [19.85402119528471], 'out_temperature': 13.3, 'action_': [15, 30]}
Reward: -20424.606228211076 {'timestep': 29184, 'time_elapsed': 26265600, 'year': 1991, 'month': 11, 'day': 1, 'hour': 0, 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.1659288529232903, 'abs_comfort': 0.1659288529232903, 'temperatures': [19.83407114707671], 'out_temperature': 13.0, 'action_': [15, 30]}
Reward: -21596.063810147996 {'timestep': 32064, 'time_elapsed': 28857600, 'year': 1991, 'month': 12, 'day': 1, 'hour': 0, 'total_power': 5255.294847641754, 'total_power_no_units': -0.5255294847641755, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [21.02303149966062], 'out_temperature': 5.1, 'action_': [21, 24]}
Reward: -23614.86930856602 {'timestep': 35040, 'time_elapsed': 31536000, 'year': 1992, 'month': 1, 'day': 1, 'hour': 0, 'total_power': 5016.972171658432, 'total_power_no_units': -0.5016972171658431, 'comfort_penalty': -1.9844573338465317, 'abs_comfort': 1.9844573338465317, 'temperatures': [18.01554266615347], 'out_temperature': -12.0, 'action_': [15, 30]}
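The info dictionary returned by env.step() carries the simulation metadata printed above. As a quick sketch (relying on the loop variables still being in scope), we can inspect the last step's values directly; all of these keys appear in the output above:

# Inspect the metadata of the last simulation step.
print('Timestep:', info['timestep'])
print('Simulated seconds elapsed:', info['time_elapsed'])
print('Zone temperatures:', info['temperatures'])
print('Outdoor temperature:', info['out_temperature'])
print('Last applied action:', info['action_'])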
And, as always, don't forget to close the environment:
[3]:
env.close()
[2022-10-07 09:00:40,720] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus simulation closed successfully.
Now we can see the final rewards:
[4]:
print(
    'Mean reward: ',
    np.mean(rewards),
    'Cumulative reward: ',
    sum(rewards))
Mean reward: -0.6739403341485966 Cumulative reward: -23614.86930856602
The list of environments registered in Sinergym is extensive, and the buildings used have varying particularities: continuous or discrete action spaces, noise added to the weather, different run periods, timesteps, reward functions, etc. We will see them in the following notebooks.
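As a small sketch of what that variety looks like, you can create any registered environment and inspect its spaces; the exact spaces printed depend on the environment id you choose:

import gym
import sinergym

# Create the demo environment again and inspect its spaces. Other
# environment ids expose different (e.g. continuous) action spaces.
env = gym.make('Eplus-demo-v1')
print('Action space:', env.action_space)
print('Observation space:', env.observation_space)
env.close()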