22. Rule Controller example
First, we import all the required libraries. Remember to always import sinergym, even if your IDE flags it as unused: the import is what registers Sinergym's environments in Gymnasium.
[1]:
from typing import List, Any, Sequence
from sinergym.utils.constants import YEAR
from datetime import datetime
import gymnasium as gym
import numpy as np
import sinergym
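If you want to double-check what that import has registered, the available environment IDs can be listed from the Gymnasium registry. This snippet is an optional sanity check, not part of the original example:

# Importing sinergym registers every Eplus environment in Gymnasium,
# so the available IDs can be listed directly from the registry
available_envs = [env_id for env_id in gym.envs.registry if env_id.startswith('Eplus')]
print(available_envs[:5])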
Now we can define the environment we want to use; in our case, the 5Zone demo environment.
[2]:
env = gym.make('Eplus-5zone-hot-continuous-v1')
#==============================================================================================#
[ENVIRONMENT] (INFO) : Creating Gymnasium environment... [5zone-hot-continuous-v1]
#==============================================================================================#
[MODELING] (INFO) : Experiment working directory created [/workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res35041]
[MODELING] (INFO) : runperiod established: {'start_day': 1, 'start_month': 1, 'start_year': 1991, 'end_day': 31, 'end_month': 12, 'end_year': 1991, 'start_weekday': 1, 'n_steps_per_hour': 4}
[MODELING] (INFO) : Episode length (seconds): 31536000.0
[MODELING] (INFO) : timestep size (seconds): 900.0
[MODELING] (INFO) : timesteps per episode: 35040
[MODELING] (INFO) : Model Config is correct.
[REWARD] (INFO) : Reward function initialized.
[ENVIRONMENT] (INFO) : Environment 5zone-hot-continuous-v1 created successfully.
For the rule-based controller, have a look at the controllers that are already defined: there is one for each building. Since the demo is based on the 5Zone building, we extend that controller and define the action function we want. Feel free to play with this function to define your own actions.
[3]:
from sinergym.utils.controllers import RBC5Zone
class MyRuleBasedController(RBC5Zone):

    def act(self, observation: List[Any]) -> Sequence[Any]:
        """Select action based on the indoor air temperature and the current date and time.

        Args:
            observation (List[Any]): Perceived observation.

        Returns:
            Sequence[Any]: Action chosen.
        """
        obs_dict = dict(
            zip(self.env.get_wrapper_attr('observation_variables'), observation))

        # Outdoor temperature is available in the observation, although this
        # rule only uses the indoor air temperature
        out_temp = obs_dict['outdoor_temperature']

        day = int(obs_dict['day_of_month'])
        month = int(obs_dict['month'])
        hour = int(obs_dict['hour'])
        year = int(obs_dict.get('year', YEAR))

        summer_start_date = datetime(year, 6, 1)
        summer_final_date = datetime(year, 9, 30)
        current_dt = datetime(year, month, day)

        # Get season comfort range
        if summer_start_date <= current_dt <= summer_final_date:
            season_comfort_range = self.setpoints_summer
        else:
            season_comfort_range = self.setpoints_winter

        # Update setpoints
        in_temp = obs_dict['air_temperature']

        current_heat_setpoint = obs_dict['htg_setpoint']
        current_cool_setpoint = obs_dict['clg_setpoint']

        new_heat_setpoint = current_heat_setpoint
        new_cool_setpoint = current_cool_setpoint

        if in_temp < season_comfort_range[0]:
            new_heat_setpoint = current_heat_setpoint + 1
            new_cool_setpoint = current_cool_setpoint + 1
        elif in_temp > season_comfort_range[1]:
            new_cool_setpoint = current_cool_setpoint - 1
            new_heat_setpoint = current_heat_setpoint - 1

        # Clip setpoints to the action space
        action_space = self.env.get_wrapper_attr('action_space')
        new_heat_setpoint = np.clip(
            new_heat_setpoint, action_space.low[0], action_space.high[0])
        new_cool_setpoint = np.clip(
            new_cool_setpoint, action_space.low[1], action_space.high[1])

        action = (new_heat_setpoint, new_cool_setpoint)
        if current_dt.weekday() >= 5 or hour >= 22 or hour < 6:
            # Weekend or night: fall back to fixed setback setpoints
            action = (18.33, 23.33)

        return action
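Any logic works here as long as act returns an action that lies inside the environment's action space. As a toy variant (the class name is ours, for illustration only), the following controller ignores the observation entirely and always commands the winter comfort setpoints inherited from RBC5Zone:

class AlwaysWinterComfortController(RBC5Zone):

    def act(self, observation: List[Any]) -> Sequence[Any]:
        # setpoints_winter is the (heating, cooling) pair defined in RBC5Zone
        return tuple(self.setpoints_winter)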
Now that we have our controller ready, we can use it:
[4]:
# create rule-based controller
agent = MyRuleBasedController(env)

for i in range(1):
    obs, info = env.reset()
    rewards = []
    terminated = False
    current_month = 0
    while not terminated:
        action = agent.act(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode... [5zone-hot-continuous-v1] [Episode 1]
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created [/workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res35041/Eplus-env-sub_run1]
[MODELING] (INFO) : Weather file USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw used.
[MODELING] (INFO) : Updated building model with whole Output:Variable available names
[MODELING] (INFO) : Updated building model with whole Output:Meter available names
[MODELING] (INFO) : Adapting weather to building model. [USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw]
[ENVIRONMENT] (INFO) : Saving episode output path... [/workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res35041/Eplus-env-sub_run1/output]
/usr/local/lib/python3.10/dist-packages/opyplus/weather_data/weather_data.py:493: FutureWarning: the 'line_terminator'' keyword is deprecated, use 'lineterminator' instead.
epw_content = self._headers_to_epw(use_datetimes=use_datetimes) + df.to_csv(
[SIMULATOR] (INFO) : Running EnergyPlus with args: ['-w', '/workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res35041/Eplus-env-sub_run1/USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw', '-d', '/workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res35041/Eplus-env-sub_run1/output', '/workspaces/sinergym/examples/Eplus-env-5zone-hot-continuous-v1-res35041/Eplus-env-sub_run1/5ZoneAutoDXVAV.epJSON']
[ENVIRONMENT] (INFO) : Episode 1 started.
[SIMULATOR] (INFO) : handlers initialized.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
Reward: -0.913313585398759 {'time_elapsed(hours)': 0.5, 'month': 1, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 30.0), 'timestep': 2, 'reward': -0.913313585398759, 'energy_term': -0.05542625239599594, 'comfort_term': -0.857887333002763, 'reward_weight': 0.5, 'abs_energy': 1108.5250479199187, 'abs_comfort': 1.715774666005526, 'energy_values': [1108.5250479199187], 'temp_values': [18.284225333994474]}
Progress: |**-------------------------------------------------------------------------------------------------| 2%
/usr/local/lib/python3.10/dist-packages/gymnasium/spaces/box.py:240: UserWarning: WARN: Casting input x to numpy array.
gym.logger.warn("Casting input x to numpy array.")
Reward: -2795.338629010751 {'time_elapsed(hours)': 744.375, 'month': 2, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 2977, 'reward': -0.005629148688825955, 'energy_term': -0.005629148688825955, 'comfort_term': -0.0, 'reward_weight': 0.5, 'abs_energy': 112.5829737765191, 'abs_comfort': 0.0, 'energy_values': [112.5829737765191], 'temp_values': [20.780085346555335]}
Reward: -3442.9942477932536 {'time_elapsed(hours)': 1416.25, 'month': 3, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (17.0, 24.5), 'timestep': 5665, 'reward': -0.22725948485913058, 'energy_term': -0.00882370430179678, 'comfort_term': -0.21843578055733381, 'reward_weight': 0.5, 'abs_energy': 176.47408603593558, 'abs_comfort': 0.43687156111466763, 'energy_values': [176.47408603593558], 'temp_values': [19.563128438885332]}
Reward: -6086.758193423204 {'time_elapsed(hours)': 2160.25, 'month': 4, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (18.33, 23.33), 'timestep': 8641, 'reward': -0.0, 'energy_term': -0.0, 'comfort_term': -0.0, 'reward_weight': 0.5, 'abs_energy': 0.0, 'abs_comfort': 0.0, 'energy_values': [0.0], 'temp_values': [21.98606808783697]}
Reward: -8588.569469115886 {'time_elapsed(hours)': 2880.25, 'month': 5, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 11521, 'reward': -0.4531417474460593, 'energy_term': -0.0, 'comfort_term': -0.4531417474460593, 'reward_weight': 0.5, 'abs_energy': 0.0, 'abs_comfort': 0.9062834948921186, 'energy_values': [0.0], 'temp_values': [24.40628349489212]}
Reward: -11765.765367945189 {'time_elapsed(hours)': 3624.25, 'month': 6, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 14497, 'reward': -0.0, 'energy_term': -0.0, 'comfort_term': -0.0, 'reward_weight': 0.5, 'abs_energy': 0.0, 'abs_comfort': 0.0, 'energy_values': [0.0], 'temp_values': [25.900405758648773]}
Reward: -14245.140165597242 {'time_elapsed(hours)': 4344.375, 'month': 7, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (18.33, 23.33), 'timestep': 17377, 'reward': -0.9700071561753898, 'energy_term': -0.0, 'comfort_term': -0.9700071561753898, 'reward_weight': 0.5, 'abs_energy': 0.0, 'abs_comfort': 1.9400143123507796, 'energy_values': [0.0], 'temp_values': [27.94001431235078]}
Reward: -16167.362661542358 {'time_elapsed(hours)': 5088.25, 'month': 8, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 20353, 'reward': -0.0, 'energy_term': -0.0, 'comfort_term': -0.0, 'reward_weight': 0.5, 'abs_energy': 0.0, 'abs_comfort': 0.0, 'energy_values': [0.0], 'temp_values': [25.996330089060972]}
Reward: -18044.632044925856 {'time_elapsed(hours)': 5832.25, 'month': 9, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 23329, 'reward': -0.0, 'energy_term': -0.0, 'comfort_term': -0.0, 'reward_weight': 0.5, 'abs_energy': 0.0, 'abs_comfort': 0.0, 'energy_values': [0.0], 'temp_values': [25.799172085986662]}
Reward: -19572.469153276368 {'time_elapsed(hours)': 6552.3125, 'month': 10, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 26209, 'reward': -1.0005842181877573, 'energy_term': -0.06446283743025183, 'comfort_term': -0.9361213807575055, 'reward_weight': 0.5, 'abs_energy': 1289.2567486050366, 'abs_comfort': 1.872242761515011, 'energy_values': [1289.2567486050366], 'temp_values': [25.37224276151501]}
Reward: -25697.968230973762 {'time_elapsed(hours)': 7296.25, 'month': 11, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 29185, 'reward': -0.004536449977537947, 'energy_term': -0.004536449977537947, 'comfort_term': -0.0, 'reward_weight': 0.5, 'abs_energy': 90.72899955075894, 'abs_comfort': 0.0, 'energy_values': [90.72899955075894], 'temp_values': [21.766705234161947]}
Reward: -28175.99400805497 {'time_elapsed(hours)': 8016.25, 'month': 12, 'day': 1, 'hour': 0, 'is_raining': False, 'action': (15.0, 22.5), 'timestep': 32065, 'reward': -0.004536449977537947, 'energy_term': -0.004536449977537947, 'comfort_term': -0.0, 'reward_weight': 0.5, 'abs_energy': 90.72899955075894, 'abs_comfort': 0.0, 'energy_values': [90.72899955075894], 'temp_values': [20.933681388733177]}
Progress: |***************************************************************************************************| 99%
Episode 0 Mean reward: -0.8548035936270668 Cumulative reward: -29952.317920693455
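Since the per-timestep rewards were stored in the rewards list, a quick cumulative plot helps to inspect the controller's behaviour over the year. This is an optional addition; it assumes matplotlib is installed, which is not otherwise required by this example:

import matplotlib.pyplot as plt

# Cumulative reward over the episode (one point per simulation timestep)
plt.plot(np.cumsum(rewards))
plt.xlabel('Timestep')
plt.ylabel('Cumulative reward')
plt.show()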
Always remember to close the environment:
[5]:
env.close()
[ENVIRONMENT] (INFO) : Environment closed. [5zone-hot-continuous-v1]
Note

For more information about our predefined controllers and how to create a new one, please visit our Controller Documentation.
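Although extending a predefined controller is the easiest route, a controller does not strictly need to inherit from one: the loop above only requires an object with an act method that maps observations to valid actions. A minimal standalone sketch (the class name and the constant setpoints here are ours, for illustration only):

class MyCustomController(object):

    def __init__(self, env: gym.Env) -> None:
        self.env = env

    def act(self, observation: List[Any]) -> Sequence[Any]:
        # Always command fixed setback setpoints, ignoring the observation
        return (18.33, 23.33)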