15. Basic example
Sinergym uses the standard Gymnasium API. Let's see how to create a basic control loop.
First, we need to import Sinergym and create an environment, in our case using ‘Eplus-demo-v1’:
[3]:
import gymnasium as gym
import numpy as np
import sinergym
env = gym.make('Eplus-demo-v1')
[2023-02-09 11:00:27,049] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf ExternalInterface object if it is not present...
[2023-02-09 11:00:27,050] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2023-02-09 11:00:27,052] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf OutPut:Variable and variables XML tree model for BVCTB connection.
[2023-02-09 11:00:27,053] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2023-02-09 11:00:27,054] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up action definition in building model if exists...
At first glance, it may appear that Sinergym is imported but never used. However, importing Sinergym registers all of its environments, so ‘Eplus-demo-v1’ becomes available to gym.make with all the information contained in the IDF file and the configuration file.
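As a quick optional check (not part of the original example), we can inspect the spaces of the environment we just created; the exact output depends on the installed Sinergym version:

[ ]:
# Optional sanity check: the registered environment exposes its
# action and observation spaces through the standard Gymnasium API.
print(env.action_space)
print(env.observation_space)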
After this simple definition, we are ready to loop over episodes. For this example we will run only one episode. In summary, the code we need looks like this:
[4]:
for i in range(1):
    obs, info = env.reset()
    rewards = []
    terminated = truncated = False
    current_month = 0
    while not (terminated or truncated):
        a = env.action_space.sample()  # random action each timestep
        obs, reward, terminated, truncated, info = env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
[2023-02-09 11:00:30,139] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-02-09 11:00:30,146] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-demo-v1-res1/Eplus-env-sub_run1
Reward: -1.086220480544118 {'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0, 'action': [22, 23], 'total_power': 21724.40961088236, 'total_power_no_units': -2.172440961088236, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [21.81947413483711]}
Reward: -2083.3260107177957 {'timestep': 2976, 'time_elapsed': 2678400, 'year': 1991, 'month': 2, 'day': 1, 'hour': 0, 'action': [16, 29], 'total_power': 4300.67198003095, 'total_power_no_units': -0.43006719800309506, 'comfort_penalty': -1.470462231008021, 'abs_comfort': 1.470462231008021, 'temperatures': [18.52953776899198]}
Reward: -4068.1430820739643 {'timestep': 5664, 'time_elapsed': 5097600, 'year': 1991, 'month': 3, 'day': 1, 'hour': 0, 'action': [22, 23], 'total_power': 17443.56199085256, 'total_power_no_units': -1.7443561990852559, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [21.52682664353406]}
Reward: -5408.011099508528 {'timestep': 8640, 'time_elapsed': 7776000, 'year': 1991, 'month': 4, 'day': 1, 'hour': 0, 'action': [17, 28], 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.02634475252593]}
Reward: -6324.363418661066 {'timestep': 11520, 'time_elapsed': 10368000, 'year': 1991, 'month': 5, 'day': 1, 'hour': 0, 'action': [16, 29], 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.87808257037088]}
Reward: -7166.002341495086 {'timestep': 14496, 'time_elapsed': 13046400, 'year': 1991, 'month': 6, 'day': 1, 'hour': 0, 'action': [22, 22], 'total_power': 5901.306181804051, 'total_power_no_units': -0.5901306181804051, 'comfort_penalty': -1.00595003066665, 'abs_comfort': 1.00595003066665, 'temperatures': [21.99404996933335]}
Reward: -10107.38621508419 {'timestep': 17376, 'time_elapsed': 15638400, 'year': 1991, 'month': 7, 'day': 1, 'hour': 0, 'action': [22, 22], 'total_power': 5959.248391726638, 'total_power_no_units': -0.5959248391726638, 'comfort_penalty': -1.003880381582551, 'abs_comfort': 1.003880381582551, 'temperatures': [21.99611961841745]}
Reward: -13368.049092385163 {'timestep': 20352, 'time_elapsed': 18316800, 'year': 1991, 'month': 8, 'day': 1, 'hour': 0, 'action': [22, 23], 'total_power': 20642.54173343959, 'total_power_no_units': -2.064254173343959, 'comfort_penalty': -1.2461460714288606, 'abs_comfort': 1.2461460714288606, 'temperatures': [21.75385392857114]}
Reward: -16573.882795341822 {'timestep': 23328, 'time_elapsed': 20995200, 'year': 1991, 'month': 9, 'day': 1, 'hour': 0, 'action': [18, 27], 'total_power': 296.4125183546004, 'total_power_no_units': -0.029641251835460045, 'comfort_penalty': -2.85266561770349, 'abs_comfort': 2.85266561770349, 'temperatures': [20.14733438229651]}
Reward: -19392.89340178917 {'timestep': 26208, 'time_elapsed': 23587200, 'year': 1991, 'month': 10, 'day': 1, 'hour': 0, 'action': [20, 25], 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.33172235919861]}
Reward: -20446.271430476787 {'timestep': 29184, 'time_elapsed': 26265600, 'year': 1991, 'month': 11, 'day': 1, 'hour': 0, 'action': [15, 30], 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.45708717771629]}
Reward: -21616.477165610137 {'timestep': 32064, 'time_elapsed': 28857600, 'year': 1991, 'month': 12, 'day': 1, 'hour': 0, 'action': [18, 27], 'total_power': 182.8619608065522, 'total_power_no_units': -0.01828619608065522, 'comfort_penalty': -1.0452010578529105, 'abs_comfort': 1.0452010578529105, 'temperatures': [18.95479894214709]}
Reward: -23634.85490851 {'timestep': 35040, 'time_elapsed': 31536000, 'year': 1992, 'month': 1, 'day': 1, 'hour': 0, 'action': [22, 23], 'total_power': 23109.07172644784, 'total_power_no_units': -2.310907172644784, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.04969081696662]}
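The info dictionary printed above carries the simulation metadata at each timestep (date, applied action, power consumption, comfort penalty, and zone temperatures). As a purely illustrative sketch, a small helper that extracts a few of these fields (the keys come from the output above; the function name and the watts-to-kilowatts conversion are our own assumptions):

[ ]:
def summarize(info):
    # Hypothetical helper: pull a few fields out of the info dict
    # returned by env.step() (keys taken from the output above).
    return {
        'month': info['month'],
        'power_kW': info['total_power'] / 1000.0,  # assuming total_power is in W
        'comfort_penalty': info['comfort_penalty'],
    }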
And, as always, don’t forget to close the environment:
[5]:
env.close()
[2023-02-09 11:00:46,130] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus simulation closed successfully.
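In longer scripts, it can be convenient to guarantee that close is called even if the loop raises an exception, so the EnergyPlus subprocess is not left running. A minimal sketch using try/finally:

[ ]:
env = gym.make('Eplus-demo-v1')
try:
    obs, info = env.reset()
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
finally:
    env.close()  # always terminate the simulation process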
Now we can see the final rewards:
[6]:
print(
    'Mean reward: ',
    np.mean(rewards),
    'Cumulative reward: ',
    sum(rewards))
Mean reward: -0.6745106994437942 Cumulative reward: -23634.85490851
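As a consistency check, the mean reward times the number of timesteps recovers the cumulative reward: the episode covers a full simulated year at 15-minute timesteps (4 × 24 × 365 = 35040 timesteps, matching the final timestep above), and -0.6745106994437942 × 35040 ≈ -23634.85.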
The list of environments registered in Sinergym is extensive, and the buildings they use vary in their particularities: continuous or discrete action spaces, noise applied to the weather, run period, timesteps, reward function, etc. We will see them in the following notebooks.
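Since importing Sinergym registers these environments in the Gymnasium registry, one way to list the available ids is to filter the registry; a minimal sketch (the exact set of ids depends on your Sinergym version):

[ ]:
# List all Sinergym environment ids currently registered
# (assumes ids start with 'Eplus', as in 'Eplus-demo-v1').
eplus_ids = [env_id for env_id in gym.envs.registry if env_id.startswith('Eplus')]
print(sorted(eplus_ids))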