Rule-based controller example
Let’s try a simple rule-based controller (RBC) in a Sinergym environment.
First, we import all the necessary libraries. Remember to always import sinergym, even if it appears unused: importing it is what registers the environments with Gymnasium.
[ ]:
from typing import List, Any, Sequence
from sinergym.utils.constants import YEAR
from datetime import datetime
import gymnasium as gym
import numpy as np
import sinergym
Now we can define the environment:
[2]:
env = gym.make('Eplus-5zone-hot-continuous-v1')
#==============================================================================================#
[ENVIRONMENT] (INFO) : Creating Gymnasium environment.
[ENVIRONMENT] (INFO) : Name: 5zone-hot-continuous-v1
#==============================================================================================#
[MODELING] (INFO) : Experiment working directory created.
[MODELING] (INFO) : Working directory: /workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res1
[MODELING] (INFO) : Model Config is correct.
[MODELING] (INFO) : Update building model Output:Variable with variable names.
[MODELING] (INFO) : Update building model Output:Meter with meter names.
[MODELING] (INFO) : Runperiod established.
[MODELING] (INFO) : Episode length (seconds): 31536000.0
[MODELING] (INFO) : timestep size (seconds): 900.0
[MODELING] (INFO) : timesteps per episode: 35040
[REWARD] (INFO) : Reward function initialized.
[ENVIRONMENT] (INFO) : Environment created successfully.
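Before writing any rules, it helps to inspect what the controller will receive and produce. As an optional check (not part of the original run), you can print the environment’s observation variables and action space; the custom controller below reads observations by name and clips its setpoints to this space:
[ ]:
# Optional: inspect the observation variables and the action space
print(env.get_wrapper_attr('observation_variables'))
print(env.get_wrapper_attr('action_space'))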
You should check out the list of available pre-defined RBCs; a few of them are sketched below.
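As a rough orientation (the exact module contents may vary across Sinergym versions), sinergym.utils.controllers includes, among others:
[ ]:
from sinergym.utils.controllers import (
    RandomController,  # samples random actions from the action space
    RBC5Zone,          # rule-based setpoint controller for the 5Zone building
    RBCDatacenter      # rule-based controller for the datacenter building
)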
In this example, we extend the pre-defined RBC5Zone controller by defining a custom act method:
[ ]:
from sinergym.utils.controllers import RBC5Zone


class MyRuleBasedController(RBC5Zone):

    def act(self, observation: List[Any]) -> Sequence[Any]:
        """Select action based on indoor temperature, season and time of day.

        Args:
            observation (List[Any]): Perceived observation.

        Returns:
            Sequence[Any]: Action chosen.
        """
        # Map observation values to their variable names
        obs_dict = dict(zip(self.env.get_wrapper_attr(
            'observation_variables'), observation))

        # Outdoor temperature is available too, although these rules do not use it
        out_temp = obs_dict['outdoor_temperature']

        day = int(obs_dict['day_of_month'])
        month = int(obs_dict['month'])
        hour = int(obs_dict['hour'])
        year = int(obs_dict['year']) if obs_dict.get('year') else YEAR

        summer_start_date = datetime(year, 6, 1)
        summer_final_date = datetime(year, 9, 30)
        current_dt = datetime(year, month, day)

        # Get season comfort range
        if summer_start_date <= current_dt <= summer_final_date:
            season_comfort_range = self.setpoints_summer
        else:
            season_comfort_range = self.setpoints_winter

        # Update setpoints: nudge them one degree towards the comfort range
        in_temp = obs_dict['air_temperature']

        current_heat_setpoint = obs_dict['htg_setpoint']
        current_cool_setpoint = obs_dict['clg_setpoint']

        new_heat_setpoint = current_heat_setpoint
        new_cool_setpoint = current_cool_setpoint

        if in_temp < season_comfort_range[0]:
            new_heat_setpoint = current_heat_setpoint + 1
            new_cool_setpoint = current_cool_setpoint + 1
        elif in_temp > season_comfort_range[1]:
            new_cool_setpoint = current_cool_setpoint - 1
            new_heat_setpoint = current_heat_setpoint - 1

        # Clip setpoints to the action space
        action_space = self.env.get_wrapper_attr('action_space')
        new_heat_setpoint = np.clip(
            new_heat_setpoint, action_space.low[0], action_space.high[0])
        new_cool_setpoint = np.clip(
            new_cool_setpoint, action_space.low[1], action_space.high[1])

        action = (new_heat_setpoint, new_cool_setpoint)

        # Weekend (Saturday=5, Sunday=6) or night: fixed setback setpoints
        if current_dt.weekday() >= 5 or hour >= 22 or hour < 6:
            action = (18.33, 23.33)

        return action
Now that our controller is ready, we can use it as follows:
[ ]:
# Create rule-based controller
agent = MyRuleBasedController(env)

for i in range(1):
    obs, info = env.reset()
    rewards = []
    truncated = terminated = False
    current_month = 0
    while not (terminated or truncated):
        action = agent.act(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode.
[ENVIRONMENT] (INFO) : Episode 1: 5zone-hot-continuous-v1
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created.
[MODELING] (INFO) : Weather file USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw used.
[MODELING] (INFO) : Adapting weather to building model.
[ENVIRONMENT] (INFO) : Saving episode output path.
[ENVIRONMENT] (INFO) : Episode 1 started.
[SIMULATOR] (INFO) : handlers initialized.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
Reward: -0.10122987987606541 {'time_elapsed(hours)': 0.5, 'month': 1, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(13.8), np.float32(30.0)), 'timestep': 1, 'reward': -0.10122987987606541, 'energy_term': -0.00589738497079933, 'comfort_term': -0.09533249490526607, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': -0.19066498981053215, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.19066498981053215}
/usr/local/lib/python3.12/dist-packages/gymnasium/spaces/box.py:240: UserWarning: WARN: Casting input x to numpy array.
gym.logger.warn("Casting input x to numpy array.")
Simulation Progress [Episode 1]: 10%|█ | 10/100 [00:00<00:09, 9.64%/s, 10% completed] Reward: -500.8158604654543 {'time_elapsed(hours)': 744.25, 'month': 2, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 2976, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 17%|█▋ | 17/100 [00:01<00:08, 9.88%/s, 17% completed]Reward: -771.2477959705227 {'time_elapsed(hours)': 1416.25, 'month': 3, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 5664, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 26%|██▌ | 26/100 [00:02<00:06, 11.30%/s, 26% completed]Reward: -1099.6675505462101 {'time_elapsed(hours)': 2160.25, 'month': 4, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (18.33, 23.33), 'timestep': 8640, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 34%|███▍ | 34/100 [00:03<00:06, 9.53%/s, 34% completed]Reward: -1443.9892728462598 {'time_elapsed(hours)': 2880.25, 'month': 5, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 11520, 'reward': -0.00986244334034497, 'energy_term': -0.00986244334034497, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -197.24886680689937, 'abs_comfort_penalty': 0, 'total_power_demand': 197.24886680689937, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 42%|████▏ | 42/100 [00:04<00:06, 9.08%/s, 42% completed]Reward: -1850.3837512878513 {'time_elapsed(hours)': 3624.25, 'month': 6, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 14496, 'reward': -0.443418973577729, 'energy_term': -0.03306759297397215, 'comfort_term': -0.41035138060375687, 'reward_weight': 0.5, 'abs_energy_penalty': -661.351859479443, 'abs_comfort_penalty': -0.8207027612075137, 'total_power_demand': 661.351859479443, 'total_temperature_violation': 0.8207027612075137}
Simulation Progress [Episode 1]: 51%|█████ | 51/100 [00:05<00:04, 11.44%/s, 51% completed]Reward: -2994.821084822717 {'time_elapsed(hours)': 4344.25, 'month': 7, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (18.33, 23.33), 'timestep': 17376, 'reward': -0.2168368480281406, 'energy_term': -0.03331450494007778, 'comfort_term': -0.1835223430880628, 'reward_weight': 0.5, 'abs_energy_penalty': -666.2900988015556, 'abs_comfort_penalty': -0.3670446861761256, 'total_power_demand': 666.2900988015556, 'total_temperature_violation': 0.3670446861761256}
Simulation Progress [Episode 1]: 59%|█████▉ | 59/100 [00:06<00:04, 8.89%/s, 59% completed]Reward: -4192.843264700024 {'time_elapsed(hours)': 5088.25, 'month': 8, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 20352, 'reward': -0.4444265580218612, 'energy_term': -0.03655451169123719, 'comfort_term': -0.407872046330624, 'reward_weight': 0.5, 'abs_energy_penalty': -731.0902338247438, 'abs_comfort_penalty': -0.815744092661248, 'total_power_demand': 731.0902338247438, 'total_temperature_violation': 0.815744092661248}
Simulation Progress [Episode 1]: 67%|██████▋ | 67/100 [00:06<00:02, 11.54%/s, 67% completed]Reward: -5380.941330999556 {'time_elapsed(hours)': 5832.25, 'month': 9, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 23328, 'reward': -0.2973068555116624, 'energy_term': -0.02993733348744081, 'comfort_term': -0.2673695220242216, 'reward_weight': 0.5, 'abs_energy_penalty': -598.7466697488162, 'abs_comfort_penalty': -0.5347390440484432, 'total_power_demand': 598.7466697488162, 'total_temperature_violation': 0.5347390440484432}
Simulation Progress [Episode 1]: 76%|███████▌ | 76/100 [00:07<00:02, 10.25%/s, 76% completed]Reward: -6627.123778199568 {'time_elapsed(hours)': 6552.25, 'month': 10, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 26208, 'reward': -0.03177037793883851, 'energy_term': -0.03177037793883851, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -635.4075587767702, 'abs_comfort_penalty': 0, 'total_power_demand': 635.4075587767702, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 84%|████████▍ | 84/100 [00:08<00:01, 11.03%/s, 84% completed]Reward: -6944.464641871541 {'time_elapsed(hours)': 7296.25, 'month': 11, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 29184, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 92%|█████████▏| 92/100 [00:09<00:00, 10.92%/s, 92% completed]Reward: -7259.377698005483 {'time_elapsed(hours)': 8016.25, 'month': 12, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 32064, 'reward': -0.018698130195533794, 'energy_term': -0.018698130195533794, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -373.96260391067585, 'abs_comfort_penalty': 0, 'total_power_demand': 373.96260391067585, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 100%|██████████| 100/100 [00:09<00:00, 11.14%/s, 100% completed]Episode 0 Mean reward: -0.21907941002188872 Cumulative reward: -7676.542527166982
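For comparison, you could run the unmodified RBC5Zone through the same loop and contrast its cumulative reward with the custom controller’s. A minimal sketch, assuming the environment is still open:
[ ]:
# Hypothetical baseline run: same evaluation loop, default RBC5Zone rules
baseline_agent = RBC5Zone(env)
obs, info = env.reset()
baseline_rewards = []
truncated = terminated = False
while not (terminated or truncated):
    action = baseline_agent.act(obs)
    obs, reward, terminated, truncated, info = env.step(action)
    baseline_rewards.append(reward)
print('Baseline cumulative reward: ', sum(baseline_rewards))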
Always remember to close the environment:
[5]:
env.close()
Simulation Progress [Episode 1]: 100%|██████████| 100/100 [00:12<00:00, 8.33%/s, 100% completed]
[ENVIRONMENT] (INFO) : Environment closed. [5zone-hot-continuous-v1]
For more information about pre-defined controllers and how to create custom ones, visit the corresponding documentation.