Default building control using an empty action space
It is possible to run a simulation using the default building control performed by EnergyPlus (as specified in the building model file).
For instance, from the container’s workspace, run the following command:
$ energyplus -w sinergym/data/weather/USA_PA_Pittsburgh-Allegheny.County.AP.725205_TMY3.epw sinergym/data/buildings/5ZoneAutoDXVAV.epJSON
However, doing this without our framework has some drawbacks:
- You will only have the default EnergyPlus output and will not have access to additional output information, such as the data provided by the logger wrapper, which tracks all the environment interactions.
- Moreover, building models have a default ``Site:Location`` and ``SizingPeriod:DesignDay``, which Sinergym automatically adjusts based on the specified weather, so you would need to manually modify these settings before launching the simulation.
- Lastly, you would also need to manually adjust the ``RunPeriod`` in the building file before starting the simulation (a sketch of this kind of manual edit is shown below).
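To give a sense of the manual work involved, the sketch below edits an epJSON building file directly with Python's ``json`` module. It is an illustration only: the exact object and field names (``RunPeriod``, ``begin_month``, and so on) depend on the specific building file and EnergyPlus version, and the output file name is arbitrary.

```python
import json

# Illustrative only: object/field names depend on the building file and EnergyPlus version.
building_path = 'sinergym/data/buildings/5ZoneAutoDXVAV.epJSON'

with open(building_path) as f:
    building = json.load(f)

# Adjust the run period by hand (Sinergym would otherwise do this for you).
run_period = next(iter(building['RunPeriod'].values()))
run_period.update({
    'begin_month': 1, 'begin_day_of_month': 1,
    'end_month': 12, 'end_day_of_month': 31,
})

# Site:Location would also have to be aligned with the chosen weather file.
location = next(iter(building['Site:Location'].values()))
print(location)

with open('5ZoneAutoDXVAV_custom.epJSON', 'w') as f:
    json.dump(building, f, indent=4)
```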
Hence, to avoid manual configurations, we recommend setting up an empty action interface in a Sinergym environment. For instance:
[1]:
import gymnasium as gym
import numpy as np

import sinergym
from sinergym.utils.wrappers import LoggerWrapper

env = gym.make(
    'Eplus-office-hot-continuous-v1',
    actuators={},
    action_space=gym.spaces.Box(
        low=0,
        high=0,
        shape=(0,)))
env = LoggerWrapper(env)

for i in range(1):
    obs, info = env.reset()
    rewards = []
    truncated = terminated = False
    current_month = 0
    while not (terminated or truncated):
        a = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
env.close()
#==============================================================================================#
[ENVIRONMENT] (INFO) : Creating Gymnasium environment.
[ENVIRONMENT] (INFO) : Name: office-hot-continuous-v1
#==============================================================================================#
[MODELING] (INFO) : Experiment working directory created.
[MODELING] (INFO) : Working directory: /workspaces/sinergym/examples/Eplus-env-office-hot-continuous-v1-res1
[MODELING] (INFO) : Model Config is correct.
[MODELING] (INFO) : Update building model Output:Variable with variable names.
[MODELING] (INFO) : Update building model Output:Meter with meter names.
[MODELING] (INFO) : Runperiod established.
[MODELING] (INFO) : Episode length (seconds): 31536000.0
[MODELING] (INFO) : timestep size (seconds): 900.0
[MODELING] (INFO) : timesteps per episode: 35040
[REWARD] (INFO) : Reward function initialized.
[ENVIRONMENT] (INFO) : Environment created successfully.
[WRAPPER LoggerWrapper] (INFO) : Wrapper initialized.
#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode.
[ENVIRONMENT] (INFO) : Episode 1: office-hot-continuous-v1
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created.
[MODELING] (INFO) : Weather file USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw used.
[MODELING] (INFO) : Adapting weather to building model.
[ENVIRONMENT] (INFO) : Saving episode output path.
[ENVIRONMENT] (INFO) : Episode 1 started.
[SIMULATOR] (INFO) : handlers initialized.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
Reward: -3.400918408046664 {'time_elapsed(hours)': 0.5, 'month': 1, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 1, 'reward': -3.400918408046664, 'energy_term': -0.008430450234156376, 'comfort_term': -3.3924879578125076, 'reward_weight': 0.5, 'abs_energy_penalty': -168.6090046831275, 'abs_comfort_penalty': -6.784975915625015, 'total_power_demand': 168.6090046831275, 'total_temperature_violation': 6.784975915625015}
Simulation Progress [Episode 1]: 10%|█ | 10/100 [00:02<00:22, 3.94%/s, 10% completed] Reward: -28187.63927138061 {'time_elapsed(hours)': 744.25, 'month': 2, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 2976, 'reward': -11.285602485359158, 'energy_term': -0.010066938284793056, 'comfort_term': -11.275535547074366, 'reward_weight': 0.5, 'abs_energy_penalty': -201.33876569586113, 'abs_comfort_penalty': -22.551071094148732, 'total_power_demand': 201.33876569586113, 'total_temperature_violation': 22.551071094148732}
Simulation Progress [Episode 1]: 17%|█▋ | 17/100 [00:03<00:14, 5.57%/s, 17% completed]Reward: -45940.77927743045 {'time_elapsed(hours)': 1416.25, 'month': 3, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 5664, 'reward': -10.268961643225715, 'energy_term': -0.010066938284793056, 'comfort_term': -10.258894704940923, 'reward_weight': 0.5, 'abs_energy_penalty': -201.33876569586113, 'abs_comfort_penalty': -20.517789409881846, 'total_power_demand': 201.33876569586113, 'total_temperature_violation': 20.517789409881846}
Simulation Progress [Episode 1]: 26%|██▌ | 26/100 [00:05<00:14, 5.20%/s, 26% completed]Reward: -70603.72337365196 {'time_elapsed(hours)': 2160.25, 'month': 4, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 8640, 'reward': -12.211713512655699, 'energy_term': -0.010059948897039251, 'comfort_term': -12.20165356375866, 'reward_weight': 0.5, 'abs_energy_penalty': -201.198977940785, 'abs_comfort_penalty': -24.40330712751732, 'total_power_demand': 201.198977940785, 'total_temperature_violation': 24.40330712751732}
Simulation Progress [Episode 1]: 34%|███▍ | 34/100 [00:07<00:15, 4.25%/s, 34% completed]Reward: -114977.5022623719 {'time_elapsed(hours)': 2880.25, 'month': 5, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 11520, 'reward': -19.68289174648665, 'energy_term': -0.010059948897039251, 'comfort_term': -19.67283179758961, 'reward_weight': 0.5, 'abs_energy_penalty': -201.198977940785, 'abs_comfort_penalty': -39.34566359517922, 'total_power_demand': 201.198977940785, 'total_temperature_violation': 39.34566359517922}
Simulation Progress [Episode 1]: 42%|████▏ | 42/100 [00:09<00:14, 4.11%/s, 42% completed]Reward: -167310.85629364094 {'time_elapsed(hours)': 3624.25, 'month': 6, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 14496, 'reward': -2.21823982552994, 'energy_term': -0.008430450234156376, 'comfort_term': -2.2098093752957837, 'reward_weight': 0.5, 'abs_energy_penalty': -168.6090046831275, 'abs_comfort_penalty': -4.419618750591567, 'total_power_demand': 168.6090046831275, 'total_temperature_violation': 4.419618750591567}
Simulation Progress [Episode 1]: 51%|█████ | 51/100 [00:11<00:13, 3.62%/s, 51% completed]Reward: -181130.2400803524 {'time_elapsed(hours)': 4344.25, 'month': 7, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 17376, 'reward': -8.064856313778423, 'energy_term': -0.26742679043554096, 'comfort_term': -7.797429523342881, 'reward_weight': 0.5, 'abs_energy_penalty': -5348.535808710819, 'abs_comfort_penalty': -15.594859046685762, 'total_power_demand': 5348.535808710819, 'total_temperature_violation': 15.594859046685762}
Simulation Progress [Episode 1]: 59%|█████▉ | 59/100 [00:13<00:10, 3.93%/s, 59% completed]Reward: -193401.17140870017 {'time_elapsed(hours)': 5088.25, 'month': 8, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 20352, 'reward': -5.7091399347815965, 'energy_term': -0.010059948897039251, 'comfort_term': -5.699079985884557, 'reward_weight': 0.5, 'abs_energy_penalty': -201.198977940785, 'abs_comfort_penalty': -11.398159971769115, 'total_power_demand': 201.198977940785, 'total_temperature_violation': 11.398159971769115}
Simulation Progress [Episode 1]: 67%|██████▋ | 67/100 [00:15<00:08, 3.93%/s, 67% completed]Reward: -204732.49252537402 {'time_elapsed(hours)': 5832.25, 'month': 9, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 23328, 'reward': -3.009932245152207, 'energy_term': -0.008430450234156376, 'comfort_term': -3.0015017949180507, 'reward_weight': 0.5, 'abs_energy_penalty': -168.6090046831275, 'abs_comfort_penalty': -6.003003589836101, 'total_power_demand': 168.6090046831275, 'total_temperature_violation': 6.003003589836101}
Simulation Progress [Episode 1]: 76%|███████▌ | 76/100 [00:17<00:05, 4.19%/s, 76% completed]Reward: -215571.57315607253 {'time_elapsed(hours)': 6552.25, 'month': 10, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 26208, 'reward': -25.841002498884077, 'energy_term': -0.010059948897039251, 'comfort_term': -25.830942549987036, 'reward_weight': 0.5, 'abs_energy_penalty': -201.198977940785, 'abs_comfort_penalty': -51.66188509997407, 'total_power_demand': 201.198977940785, 'total_temperature_violation': 51.661885099974064}
Simulation Progress [Episode 1]: 84%|████████▍ | 84/100 [00:19<00:03, 4.71%/s, 84% completed]Reward: -258417.85317828832 {'time_elapsed(hours)': 7296.25, 'month': 11, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 29184, 'reward': -14.908274454668001, 'energy_term': -0.11419542960070476, 'comfort_term': -14.794079025067298, 'reward_weight': 0.5, 'abs_energy_penalty': -2283.908592014095, 'abs_comfort_penalty': -29.588158050134595, 'total_power_demand': 2283.908592014095, 'total_temperature_violation': 29.588158050134595}
Simulation Progress [Episode 1]: 92%|█████████▏| 92/100 [00:21<00:01, 4.88%/s, 92% completed]Reward: -289838.39189161843 {'time_elapsed(hours)': 8016.25, 'month': 12, 'day': 1, 'hour': 0, 'is_raining': False, 'action': array([], dtype=float32), 'timestep': 32064, 'reward': -7.385952598375827, 'energy_term': -0.008430450234156376, 'comfort_term': -7.37752214814167, 'reward_weight': 0.5, 'abs_energy_penalty': -168.6090046831275, 'abs_comfort_penalty': -14.75504429628334, 'total_power_demand': 168.6090046831275, 'total_temperature_violation': 14.75504429628334}
Simulation Progress [Episode 1]: 100%|██████████| 100/100 [00:22<00:00, 5.03%/s, 100% completed]Episode 0 Mean reward: -8.928154911945706 Cumulative reward: -312842.54811457754
Simulation Progress [Episode 1]: 100%|██████████| 100/100 [00:25<00:00, 4.00%/s, 100% completed]
[ENVIRONMENT] (INFO) : Environment closed. [office-hot-continuous-v1]
In this example, a default environment is created, but the default action space and actuator definition are replaced with empty ones. Sinergym handles the necessary background changes, and the random agent implemented in the loop then sends empty actions (``[]``) to the environment.
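To make the "empty action" behaviour concrete, the short check below (not part of the original notebook) samples from a zero-dimensional ``Box`` space and confirms that every action is a zero-length array, matching the ``'action': array([], dtype=float32)`` entries in the log above.

```python
import gymnasium as gym
import numpy as np

# An empty Box space only ever yields zero-length actions.
empty_space = gym.spaces.Box(low=0, high=0, shape=(0,), dtype=np.float32)

action = empty_space.sample()
print(action)                         # array([], dtype=float32)
print(action.shape == (0,))           # True
print(empty_space.contains(action))   # True
```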
When setting an empty action space, Sinergym retains the default actuators that are defined in the building model. Their complexity and implementation will depend on the building definition in the epJSON file.
The benefits of this approach include the ability to mix and match weather files and buildings as desired, with Sinergym automatically applying all the necessary configuration changes.
You can simulate as many years as you want in a single experiment, using the pre-defined loggers.
This method also provides more flexibility when choosing which observation variables to use, as sketched below.
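As a rough illustration of that flexibility, the sketch below overrides the weather file and the observed variables when building the environment. The keyword arguments ``weather_files`` and ``variables`` (a mapping from observation name to the EnergyPlus ``Output:Variable`` name and key) are assumptions based on Sinergym's environment constructor and may vary between versions, so check them against the documentation of your installed release.

```python
import gymnasium as gym
import sinergym

# Hypothetical override of weather and observed variables; the parameter names
# ('weather_files', 'variables') may differ between Sinergym versions.
env = gym.make(
    'Eplus-office-hot-continuous-v1',
    weather_files=['USA_PA_Pittsburgh-Allegheny.County.AP.725205_TMY3.epw'],
    variables={
        'outdoor_temperature': ('Site Outdoor Air Drybulb Temperature', 'Environment'),
    },
    actuators={},
    action_space=gym.spaces.Box(low=0, high=0, shape=(0,)))
```

In practice you would typically extend the default variable set rather than replace it entirely, since the reward function reads some of those observation variables (energy demand and zone temperatures).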