20. Rule Controller example
First we import all the libraries we will use. Remember to always import `sinergym`, even if your editor flags it as unused: importing the package is what registers the environments with Gymnasium.
[1]:
from typing import List, Any, Sequence
from sinergym.utils.common import get_season_comfort_range
from datetime import datetime
import gymnasium as gym
import numpy as np
import sinergym
Now we can create the environment we want to use; in our case, the Eplus demo environment.
[2]:
env = gym.make('Eplus-demo-v1')
[2023-05-26 08:50:09,459] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating Building model ExternalInterface object if it is not present...
[2023-05-26 08:50:09,461] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating Building model Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2023-05-26 08:50:09,464] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating building model OutPut:Variable and variables XML tree model for BVCTB connection.
[2023-05-26 08:50:09,465] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2023-05-26 08:50:09,466] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up action definition in building model if exists...
For the rule-based controller, have a look at the controllers already defined in Sinergym: there is one for each building. Since the demo environment is based on the 5Zone building, we extend that controller (`RBC5Zone`) and override the action function. Feel free to modify this function to define your own behavior.
[3]:
from sinergym.utils.controllers import RBC5Zone

class MyRuleBasedController(RBC5Zone):

    def act(self, observation: List[Any]) -> Sequence[Any]:
        """Select action based on indoor air temperature and daytime.

        Args:
            observation (List[Any]): Perceived observation.

        Returns:
            Sequence[Any]: Action chosen.
        """
        obs_dict = dict(zip(self.variables['observation'], observation))

        out_temp = obs_dict['Site Outdoor Air Drybulb Temperature(Environment)']

        day = int(obs_dict['day'])
        month = int(obs_dict['month'])
        hour = int(obs_dict['hour'])
        year = int(obs_dict['year'])

        current_dt = datetime(year, month, day)

        # Get the comfort range for the current date
        season_comfort_range = get_season_comfort_range(1991, month, day)

        # Update setpoints
        in_temp = obs_dict['Zone Air Temperature(SPACE1-1)']

        current_heat_setpoint = obs_dict[
            'Zone Thermostat Heating Setpoint Temperature(SPACE1-1)']
        current_cool_setpoint = obs_dict[
            'Zone Thermostat Cooling Setpoint Temperature(SPACE1-1)']

        new_heat_setpoint = current_heat_setpoint
        new_cool_setpoint = current_cool_setpoint

        if in_temp < season_comfort_range[0]:
            new_heat_setpoint = current_heat_setpoint + 1
            new_cool_setpoint = current_cool_setpoint + 1
        elif in_temp > season_comfort_range[1]:
            new_cool_setpoint = current_cool_setpoint - 1
            new_heat_setpoint = current_heat_setpoint - 1

        action = (new_heat_setpoint, new_cool_setpoint)

        # Use a fixed setback on weekends (Saturday and Sunday) and at night
        if current_dt.weekday() >= 5 or hour >= 22 or hour < 6:
            action = (18.33, 23.33)

        return action
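The core of the controller is the setpoint-update rule: nudge both thermostat setpoints one degree toward the comfort range whenever the zone temperature falls outside it. That rule can be isolated as a plain function and tested without running a simulation. This is an illustrative sketch; the function name and the `(lower, upper)` tuple convention are ours, not part of Sinergym's API.

```python
def update_setpoints(in_temp, heat_sp, cool_sp, comfort_range):
    """Nudge both setpoints 1 degree toward the comfort range.

    `comfort_range` is an illustrative (lower, upper) tuple of temperatures.
    """
    if in_temp < comfort_range[0]:
        # Too cold: raise both setpoints
        return (heat_sp + 1, cool_sp + 1)
    elif in_temp > comfort_range[1]:
        # Too hot: lower both setpoints
        return (heat_sp - 1, cool_sp - 1)
    # Within the comfort range: leave setpoints unchanged
    return (heat_sp, cool_sp)

# Zone at 18 degrees with a (20.0, 23.5) comfort range: both setpoints go up
print(update_setpoints(18.0, 20.0, 25.0, (20.0, 23.5)))  # (21.0, 26.0)
```

Because each step moves the setpoints by at most one degree, the controller reacts gradually rather than jumping straight to the comfort bounds.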
Now that our controller is ready, we can use it:
[4]:
# create rule-based controller
agent = MyRuleBasedController(env)

for i in range(1):
    obs, info = env.reset()
    rewards = []
    terminated = False
    current_month = 0
    while not terminated:
        action = agent.act(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
[2023-05-26 08:50:09,837] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:50:09,973] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-demo-v1-res5/Eplus-env-sub_run1
/usr/local/lib/python3.10/dist-packages/opyplus/weather_data/weather_data.py:493: FutureWarning: the 'line_terminator'' keyword is deprecated, use 'lineterminator' instead.
epw_content = self._headers_to_epw(use_datetimes=use_datetimes) + df.to_csv(
Reward: -0.555472460427038 {'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0, 'action': [21.0, 25.0], 'reward': -0.555472460427038, 'reward_energy': -1.110944920854076, 'reward_comfort': -0.0, 'total_energy': 11109.44920854076, 'abs_comfort': 0.0, 'temperatures': [20.99999214301718]}
Reward: -1646.2524485001004 {'timestep': 2976, 'time_elapsed': 2678400, 'year': 1991, 'month': 2, 'day': 1, 'hour': 0, 'action': [20.33, 25.33], 'reward': -0.6319275820302366, 'reward_energy': -1.2638551640604732, 'reward_comfort': -0.0, 'total_energy': 12638.55164060473, 'abs_comfort': 0.0, 'temperatures': [20.32999035838968]}
Reward: -3521.554500576192 {'timestep': 5664, 'time_elapsed': 5097600, 'year': 1991, 'month': 3, 'day': 1, 'hour': 0, 'action': [17.329999923706055, 22.329999923706055], 'reward': -1.493972806005651, 'reward_energy': -0.2963122890182604, 'reward_comfort': -2.6916333229930416, 'total_energy': 2963.122890182604, 'abs_comfort': 2.6916333229930416, 'temperatures': [17.30836667700696]}
Reward: -4625.416944945128 {'timestep': 8640, 'time_elapsed': 7776000, 'year': 1991, 'month': 4, 'day': 1, 'hour': 0, 'action': [18.33, 23.33], 'reward': -0.5765227606488547, 'reward_energy': -0.007756884363910412, 'reward_comfort': -1.145288636933799, 'total_energy': 77.56884363910412, 'abs_comfort': 1.145288636933799, 'temperatures': [18.8547113630662]}
Reward: -5253.588301789213 {'timestep': 11520, 'time_elapsed': 10368000, 'year': 1991, 'month': 5, 'day': 1, 'hour': 0, 'action': [20.33, 25.33], 'reward': -0.17350166808601855, 'reward_energy': -0.3470033361720371, 'reward_comfort': -0.0, 'total_energy': 3470.033361720371, 'abs_comfort': 0.0, 'temperatures': [20.33028428947056]}
Reward: -5804.239857002728 {'timestep': 14496, 'time_elapsed': 13046400, 'year': 1991, 'month': 6, 'day': 1, 'hour': 0, 'action': [20.33, 25.33], 'reward': -1.3575452900204674, 'reward_energy': -0.04504726077090611, 'reward_comfort': -2.670043319270029, 'total_energy': 450.4726077090611, 'abs_comfort': 2.670043319270029, 'temperatures': [20.32995668072997]}
Reward: -7014.056138683364 {'timestep': 17376, 'time_elapsed': 15638400, 'year': 1991, 'month': 7, 'day': 1, 'hour': 0, 'action': [18.33, 23.33], 'reward': -1.14237290405316, 'reward_energy': -0.007776888271558947, 'reward_comfort': -2.2769689198347614, 'total_energy': 77.76888271558947, 'abs_comfort': 2.2769689198347614, 'temperatures': [20.72303108016524]}
Reward: -8347.885443505464 {'timestep': 20352, 'time_elapsed': 18316800, 'year': 1991, 'month': 8, 'day': 1, 'hour': 0, 'action': [23.33, 28.33], 'reward': -0.5777724111276291, 'reward_energy': -1.1555448222552582, 'reward_comfort': -0.0, 'total_energy': 11555.44822255258, 'abs_comfort': 0.0, 'temperatures': [23.33011237958411]}
Reward: -9578.267067274302 {'timestep': 23328, 'time_elapsed': 20995200, 'year': 1991, 'month': 9, 'day': 1, 'hour': 0, 'action': [23.33, 28.33], 'reward': -0.11371544019180026, 'reward_energy': -0.22743088038360051, 'reward_comfort': -0.0, 'total_energy': 2274.308803836005, 'abs_comfort': 0.0, 'temperatures': [23.33010938638332]}
Reward: -10730.280647200174 {'timestep': 26208, 'time_elapsed': 23587200, 'year': 1991, 'month': 10, 'day': 1, 'hour': 0, 'action': [23.33, 28.33], 'reward': -0.3223465283115184, 'reward_energy': -0.6446930566230368, 'reward_comfort': -0.0, 'total_energy': 6446.930566230368, 'abs_comfort': 0.0, 'temperatures': [23.33025334254351]}
Reward: -11936.260758227023 {'timestep': 29184, 'time_elapsed': 26265600, 'year': 1991, 'month': 11, 'day': 1, 'hour': 0, 'action': [20.33, 25.33], 'reward': -0.16612981759986586, 'reward_energy': -0.33225963519973173, 'reward_comfort': -0.0, 'total_energy': 3322.596351997317, 'abs_comfort': 0.0, 'temperatures': [20.33009897623347]}
Reward: -12990.251981433648 {'timestep': 32064, 'time_elapsed': 28857600, 'year': 1991, 'month': 12, 'day': 1, 'hour': 0, 'action': [20.33, 25.33], 'reward': -0.26488400229285347, 'reward_energy': -0.5297680045857069, 'reward_comfort': -0.0, 'total_energy': 5297.68004585707, 'abs_comfort': 0.0, 'temperatures': [20.32991787449022]}
Reward: -14461.222013342354 {'timestep': 35040, 'time_elapsed': 31536000, 'year': 1992, 'month': 1, 'day': 1, 'hour': 0, 'action': [20.33, 25.33], 'reward': -0.7190890740147525, 'reward_energy': -1.438178148029505, 'reward_comfort': -0.0, 'total_energy': 14381.78148029505, 'abs_comfort': 0.0, 'temperatures': [20.33001174568691]}
Episode 0 Mean reward: -0.4127061076867006 Cumulative reward: -14461.222013342354
Always remember to close the environment:
[5]:
env.close()
[2023-05-26 08:50:20,243] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus simulation closed successfully.
Note
For more information about our predefined controllers and how to create a new one, please visit our Controller Documentation.
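As a starting point for your own controller, the pattern is simply an object that stores the environment and exposes an `act(observation)` method returning an action. The sketch below is illustrative only and does not inherit from any Sinergym base class; see the Controller Documentation for the actual classes Sinergym provides.

```python
class ConstantController:
    """Hypothetical minimal controller: always returns the same setpoints."""

    def __init__(self, env=None):
        # Keep a reference to the environment, as Sinergym controllers do
        self.env = env

    def act(self, observation):
        # Ignore the observation and return a fixed
        # (heating setpoint, cooling setpoint) pair
        return (20.0, 25.0)

agent = ConstantController()
print(agent.act([]))  # (20.0, 25.0)
```

Any object with this interface can be dropped into the interaction loop shown above in place of `MyRuleBasedController`.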