26. Rule Controller example

First, we import the required libraries. Always import sinergym, even if it appears unused: importing it registers the Sinergym environments with Gymnasium.

[1]:
from typing import List, Any, Sequence
from sinergym.utils.constants import YEAR
from datetime import datetime
import gymnasium as gym
import numpy as np
import sinergym

Next, we can define the environment we want to use.

[2]:
env = gym.make('Eplus-5zone-hot-continuous-v1')
#==============================================================================================#
[ENVIRONMENT] (INFO) : Creating Gymnasium environment.
[ENVIRONMENT] (INFO) : Name: 5zone-hot-continuous-v1
#==============================================================================================#
[MODELING] (INFO) : Experiment working directory created.
[MODELING] (INFO) : Working directory: /workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res1
[MODELING] (INFO) : Model Config is correct.
[MODELING] (INFO) : Update building model Output:Variable with variable names.
[MODELING] (INFO) : Update building model Output:Meter with meter names.
[MODELING] (INFO) : Runperiod established.
[MODELING] (INFO) : Episode length (seconds): 31536000.0
[MODELING] (INFO) : timestep size (seconds): 900.0
[MODELING] (INFO) : timesteps per episode: 35040
[REWARD] (INFO) : Reward function initialized.
[ENVIRONMENT] (INFO) : Environment created successfully.
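The episode figures reported in the log are related by simple arithmetic: a one-year episode at 900-second timesteps gives 35040 steps. A minimal check, using only the values printed above (the variable names are ours, not Sinergym attributes):

```python
# Values copied from the log output above; the names are illustrative only.
episode_length_s = 31536000.0   # 365 days * 24 h * 3600 s
timestep_size_s = 900.0         # 15-minute control interval

timesteps_per_episode = int(episode_length_s / timestep_size_s)
steps_per_hour = int(3600 / timestep_size_s)

print(timesteps_per_episode)  # 35040
print(steps_per_hour)         # 4
```

This means the controller's act method is called four times per simulated hour.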

Sinergym already provides rule-based controllers, one for each building. Here we extend the 5Zone controller, RBC5Zone, and override its act method with the rule we want. Feel free to modify the method to define your own rule.

[3]:
from sinergym.utils.controllers import RBC5Zone

class MyRuleBasedController(RBC5Zone):

    def act(self, observation: List[Any]) -> Sequence[Any]:
        """Select action based on indoor air temperature, season and time of day.

        Args:
            observation (List[Any]): Perceived observation.

        Returns:
            Sequence[Any]: Action chosen.
        """
        obs_dict = dict(zip(self.env.get_wrapper_attr('observation_variables'), observation))

        out_temp = obs_dict['outdoor_temperature']

        day = int(obs_dict['day_of_month'])
        month = int(obs_dict['month'])
        hour = int(obs_dict['hour'])
        year = int(obs_dict['year'] if obs_dict.get('year',False) else YEAR)

        summer_start_date = datetime(year, 6, 1)
        summer_final_date = datetime(year, 9, 30)

        current_dt = datetime(year, month, day)

        # Get season comfort range
        if summer_start_date <= current_dt <= summer_final_date:
            season_comfort_range = self.setpoints_summer
        else:
            season_comfort_range = self.setpoints_winter
        # Update setpoints
        in_temp = obs_dict['air_temperature']

        current_heat_setpoint = obs_dict[
            'htg_setpoint']
        current_cool_setpoint = obs_dict[
            'clg_setpoint']

        new_heat_setpoint = current_heat_setpoint
        new_cool_setpoint = current_cool_setpoint

        if in_temp < season_comfort_range[0]:
            new_heat_setpoint = current_heat_setpoint + 1
            new_cool_setpoint = current_cool_setpoint + 1
        elif in_temp > season_comfort_range[1]:
            new_cool_setpoint = current_cool_setpoint - 1
            new_heat_setpoint = current_heat_setpoint - 1

        # Clip setpoints to the action space bounds
        action_space = self.env.get_wrapper_attr('action_space')
        new_heat_setpoint = min(max(new_heat_setpoint, action_space.low[0]), action_space.high[0])
        new_cool_setpoint = min(max(new_cool_setpoint, action_space.low[1]), action_space.high[1])

        action = (new_heat_setpoint, new_cool_setpoint)
        # Weekend (Saturday=5, Sunday=6) or night (22:00 to 05:59): use fixed setback setpoints
        if current_dt.weekday() >= 5 or hour >= 22 or hour < 6:
            action = (18.33, 23.33)

        return action
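
The rule itself can be exercised without running a simulation. The following is a minimal sketch, assuming a hypothetical stand-alone helper `next_setpoints` (not part of Sinergym) that takes plain Python values instead of an observation. The comfort range and bounds in the usage lines are illustrative assumptions, not values read from the environment. Two Python details matter here: `datetime.weekday()` counts Monday as 0, so the weekend is 5 and 6, and a range such as `range(22, 6)` is empty, so a night window that wraps past midnight needs two comparisons.

```python
from datetime import datetime


# Hypothetical helper for illustration; it mirrors the rule-based logic
# with plain values so it can be checked without EnergyPlus.
def next_setpoints(in_temp, heat_sp, cool_sp, comfort_range, low, high, when):
    """Return the (heating, cooling) setpoints for the next step."""
    # Weekend or night (22:00 to 05:59): fixed setback setpoints.
    is_weekend = when.weekday() >= 5
    is_night = when.hour >= 22 or when.hour < 6
    if is_weekend or is_night:
        return (18.33, 23.33)

    # Shift both setpoints by one degree towards the comfort range.
    if in_temp < comfort_range[0]:
        heat_sp, cool_sp = heat_sp + 1, cool_sp + 1
    elif in_temp > comfort_range[1]:
        heat_sp, cool_sp = heat_sp - 1, cool_sp - 1

    # Clip each setpoint to its (low, high) bounds.
    heat_sp = min(max(heat_sp, low[0]), high[0])
    cool_sp = min(max(cool_sp, low[1]), high[1])
    return (heat_sp, cool_sp)


# Illustrative inputs only (assumed comfort range and bounds).
comfort = (23.0, 26.0)
low, high = (12.0, 23.25), (23.25, 30.0)
wednesday_noon = datetime(2021, 7, 14, 12)
print(next_setpoints(28.0, 20.0, 24.0, comfort, low, high, wednesday_noon))
```

Keeping the rule in a pure function like this makes it easy to unit-test edge cases (weekends, the midnight wrap-around, clipping) before plugging it into a controller's act method.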

Now that our controller is ready, we can use it:

[4]:

# create rule-based controller
agent = MyRuleBasedController(env)

for i in range(1):
    obs, info = env.reset()
    rewards = []
    truncated = terminated = False
    current_month = 0
    while not (terminated or truncated):
        action = agent.act(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode.
[ENVIRONMENT] (INFO) : Episode 1: 5zone-hot-continuous-v1
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created.
[MODELING] (INFO) : Weather file USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw used.
[MODELING] (INFO) : Adapting weather to building model.
[ENVIRONMENT] (INFO) : Saving episode output path.
[ENVIRONMENT] (INFO) : Episode 1 started.
[SIMULATOR] (INFO) : handlers initialized.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
Reward:  -0.10122987987606541 {'time_elapsed(hours)': 0.5, 'month': 1, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(13.8), np.float32(30.0)), 'timestep': 1, 'reward': -0.10122987987606541, 'energy_term': -0.00589738497079933, 'comfort_term': -0.09533249490526607, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': -0.19066498981053215, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.19066498981053215}
/usr/local/lib/python3.12/dist-packages/gymnasium/spaces/box.py:240: UserWarning: WARN: Casting input x to numpy array.
  gym.logger.warn("Casting input x to numpy array.")
Simulation Progress [Episode 1]:  10%|█         | 10/100 [00:00<00:09,  9.64%/s, 10% completed] Reward:  -500.8158604654543 {'time_elapsed(hours)': 744.25, 'month': 2, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 2976, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]:  17%|█▋        | 17/100 [00:01<00:08,  9.88%/s, 17% completed]Reward:  -771.2477959705227 {'time_elapsed(hours)': 1416.25, 'month': 3, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 5664, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]:  26%|██▌       | 26/100 [00:02<00:06, 11.30%/s, 26% completed]Reward:  -1099.6675505462101 {'time_elapsed(hours)': 2160.25, 'month': 4, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (18.33, 23.33), 'timestep': 8640, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]:  34%|███▍      | 34/100 [00:03<00:06,  9.53%/s, 34% completed]Reward:  -1443.9892728462598 {'time_elapsed(hours)': 2880.25, 'month': 5, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 11520, 'reward': -0.00986244334034497, 'energy_term': -0.00986244334034497, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -197.24886680689937, 'abs_comfort_penalty': 0, 'total_power_demand': 197.24886680689937, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]:  42%|████▏     | 42/100 [00:04<00:06,  9.08%/s, 42% completed]Reward:  -1850.3837512878513 {'time_elapsed(hours)': 3624.25, 'month': 6, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 14496, 'reward': -0.443418973577729, 'energy_term': -0.03306759297397215, 'comfort_term': -0.41035138060375687, 'reward_weight': 0.5, 'abs_energy_penalty': -661.351859479443, 'abs_comfort_penalty': -0.8207027612075137, 'total_power_demand': 661.351859479443, 'total_temperature_violation': 0.8207027612075137}
Simulation Progress [Episode 1]:  51%|█████     | 51/100 [00:05<00:04, 11.44%/s, 51% completed]Reward:  -2994.821084822717 {'time_elapsed(hours)': 4344.25, 'month': 7, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (18.33, 23.33), 'timestep': 17376, 'reward': -0.2168368480281406, 'energy_term': -0.03331450494007778, 'comfort_term': -0.1835223430880628, 'reward_weight': 0.5, 'abs_energy_penalty': -666.2900988015556, 'abs_comfort_penalty': -0.3670446861761256, 'total_power_demand': 666.2900988015556, 'total_temperature_violation': 0.3670446861761256}
Simulation Progress [Episode 1]:  59%|█████▉    | 59/100 [00:06<00:04,  8.89%/s, 59% completed]Reward:  -4192.843264700024 {'time_elapsed(hours)': 5088.25, 'month': 8, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 20352, 'reward': -0.4444265580218612, 'energy_term': -0.03655451169123719, 'comfort_term': -0.407872046330624, 'reward_weight': 0.5, 'abs_energy_penalty': -731.0902338247438, 'abs_comfort_penalty': -0.815744092661248, 'total_power_demand': 731.0902338247438, 'total_temperature_violation': 0.815744092661248}
Simulation Progress [Episode 1]:  67%|██████▋   | 67/100 [00:06<00:02, 11.54%/s, 67% completed]Reward:  -5380.941330999556 {'time_elapsed(hours)': 5832.25, 'month': 9, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 23328, 'reward': -0.2973068555116624, 'energy_term': -0.02993733348744081, 'comfort_term': -0.2673695220242216, 'reward_weight': 0.5, 'abs_energy_penalty': -598.7466697488162, 'abs_comfort_penalty': -0.5347390440484432, 'total_power_demand': 598.7466697488162, 'total_temperature_violation': 0.5347390440484432}
Simulation Progress [Episode 1]:  76%|███████▌  | 76/100 [00:07<00:02, 10.25%/s, 76% completed]Reward:  -6627.123778199568 {'time_elapsed(hours)': 6552.25, 'month': 10, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 26208, 'reward': -0.03177037793883851, 'energy_term': -0.03177037793883851, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -635.4075587767702, 'abs_comfort_penalty': 0, 'total_power_demand': 635.4075587767702, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]:  84%|████████▍ | 84/100 [00:08<00:01, 11.03%/s, 84% completed]Reward:  -6944.464641871541 {'time_elapsed(hours)': 7296.25, 'month': 11, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 29184, 'reward': -0.00589738497079933, 'energy_term': -0.00589738497079933, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -117.9476994159866, 'abs_comfort_penalty': 0, 'total_power_demand': 117.9476994159866, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]:  92%|█████████▏| 92/100 [00:09<00:00, 10.92%/s, 92% completed]Reward:  -7259.377698005483 {'time_elapsed(hours)': 8016.25, 'month': 12, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (np.float32(18.33), np.float32(23.33)), 'timestep': 32064, 'reward': -0.018698130195533794, 'energy_term': -0.018698130195533794, 'comfort_term': 0.0, 'reward_weight': 0.5, 'abs_energy_penalty': -373.96260391067585, 'abs_comfort_penalty': 0, 'total_power_demand': 373.96260391067585, 'total_temperature_violation': 0.0}
Simulation Progress [Episode 1]: 100%|██████████| 100/100 [00:09<00:00, 11.14%/s, 100% completed]Episode  0 Mean reward:  -0.21907941002188872 Cumulative reward:  -7676.542527166982
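
The loop above prints a running cumulative reward each time the month changes. If per-month totals are of more interest, one option is to record info['month'] alongside each reward and aggregate afterwards. A minimal sketch, where `monthly_totals` is a hypothetical helper (not part of Sinergym) and the numbers are made up for illustration:

```python
from collections import defaultdict


# Hypothetical helper: sums rewards per month from two parallel sequences.
def monthly_totals(months, rewards):
    totals = defaultdict(float)
    for month, reward in zip(months, rewards):
        totals[month] += reward
    return dict(totals)


# Made-up values, not taken from the run above.
months = [1, 1, 2, 2, 2]
rewards = [-0.5, -0.25, -1.0, -0.5, -0.5]
print(monthly_totals(months, rewards))  # {1: -0.75, 2: -2.0}
```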

Always remember to close the environment when you’re done:

[5]:
env.close()
Simulation Progress [Episode 1]: 100%|██████████| 100/100 [00:12<00:00,  8.33%/s, 100% completed]
[ENVIRONMENT] (INFO) : Environment closed. [5zone-hot-continuous-v1]

For more information about our defined controllers and how to create a new one, please visit our Controller Documentation.