19. Logger Wrapper personalization/configuration

This notebook shows how to customize the logger wrapper provided by Sinergym.

[1]:

import gym
import numpy as np
import sinergym
from sinergym.utils.wrappers import (LoggerWrapper, MultiObsWrapper,
                                     NormalizeObservation)
from sinergym.utils.constants import RANGES_5ZONE

/usr/local/lib/python3.10/dist-packages/gym/spaces/box.py:73: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(

19.1. Step 1 Inherit and modify the CSVLogger

First, we subclass CSVLogger and override its _create_row_content method, which builds the list of values written to each row of the file.

[2]:

from sinergym.utils.logger import CSVLogger
from typing import Any, Dict, Optional, Sequence, Tuple, Union, List

class CustomCSVLogger(CSVLogger):

    def __init__(
            self,
            monitor_header: str,
            progress_header: str,
            log_progress_file: str,
            log_file: Optional[str] = None,
            flag: bool = True):
        super().__init__(
            monitor_header, progress_header, log_progress_file, log_file, flag)
        # Window holding the last 10 rewards, initialised with zeros
        self.last_10_steps_reward = [0] * 10

    def _create_row_content(
            self,
            obs: List[Any],
            action: Union[int, np.ndarray, List[Any]],
            reward: Optional[float],
            done: bool,
            info: Optional[Dict[str, Any]]) -> List:

        # Slide the 10-step window forward (reward is None on reset)
        if reward is not None:
            self.last_10_steps_reward.pop(0)
            self.last_10_steps_reward.append(reward)

        if info is None:  # In a reset
            return [0] + list(obs) + list(action) + \
                [0, reward, np.mean(self.last_10_steps_reward), None, None, None, done]
        else:
            return [
                info['timestep']] + list(obs) + list(action) + [
                info['time_elapsed'],
                reward,
                np.mean(self.last_10_steps_reward),
                info['total_power_no_units'],
                info['comfort_penalty'],
                info['abs_comfort'],
                done]
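
The rolling window used above can be tried out on its own, without running a simulation. The following is a minimal sketch rather than Sinergym code: RollingMean is a hypothetical helper introduced only for illustration, and it uses collections.deque instead of the pop/append pair in CustomCSVLogger.

```python
from collections import deque


class RollingMean:
    """Hypothetical helper: mean of the last `size` values, zero-padded at start."""

    def __init__(self, size: int = 10):
        # Start with zeros, like `[0] * 10` in CustomCSVLogger.
        self.window = deque([0.0] * size, maxlen=size)

    def update(self, value: float) -> float:
        # Appending to a full deque drops the oldest value automatically.
        self.window.append(value)
        return sum(self.window) / len(self.window)


rolling = RollingMean(size=10)
first = rolling.update(-10.0)   # window: nine zeros and one -10.0
for _ in range(10):
    last = rolling.update(2.0)  # after ten updates only 2.0 values remain
```

Because the window starts full of zeros, the mean is pulled towards zero during the first nine steps of a run. The same applies to the 10-mean-reward value computed by CustomCSVLogger.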

19.2. Step 2 Instantiate the LoggerWrapper

Now we instantiate the LoggerWrapper, passing the new header for our file and the CSVLogger subclass we want to use.

[3]:

env = gym.make('Eplus-demo-v1')
env = LoggerWrapper(
    env,
    logger_class=CustomCSVLogger,
    # One column name per value returned by _create_row_content
    monitor_header=['timestep'] + env.variables['observation'] +
    env.variables['action'] + ['time (seconds)', 'reward', '10-mean-reward',
    'power_penalty', 'comfort_penalty', 'abs_comfort', 'done'])

[2022-10-07 09:08:57,743] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf ExternalInterface object if it is not present...
[2022-10-07 09:08:57,746] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2022-10-07 09:08:57,749] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Updating idf OutPut:Variable and variables XML tree model for BVCTB connection.
[2022-10-07 09:08:57,751] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2022-10-07 09:08:57,752] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Setting up action definition in building model if exists...
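
The header and the row contents are defined in two different places, so it is easy for them to drift apart. The sketch below shows one way to check that they agree. check_row is a hypothetical helper, the observation and action names are placeholders, and the row holds dummy values. The header here includes an 'abs_comfort' column so that it has one name for each of the seven trailing values that _create_row_content returns.

```python
from typing import Any, List, Sequence


def check_row(header: Sequence[str], row: Sequence[Any]) -> None:
    """Raise ValueError when a CSV row and its header differ in length."""
    if len(header) != len(row):
        raise ValueError(
            f'header has {len(header)} columns but row has {len(row)} values')


# Placeholder names; in the notebook these come from env.variables.
obs_names: List[str] = ['obs_a', 'obs_b']
action_names: List[str] = ['act_a']

header = (['timestep'] + obs_names + action_names +
          ['time (seconds)', 'reward', '10-mean-reward',
           'power_penalty', 'comfort_penalty', 'abs_comfort', 'done'])

# Dummy row with the same layout as the list built in _create_row_content.
row = [1] + [0.0, 0.0] + [0.0] + [900, -0.5, -0.05, -0.3, -0.7, 0.7, False]

check_row(header, row)  # no exception: one value per column
```

A check like this could be run once with a sample row before starting a long simulation.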

Once an episode has run, the Sinergym output folder will contain a progress.csv file, plus a monitor.csv file for each episode.

[4]:

for i in range(1):
    obs = env.reset()
    rewards = []
    done = False
    current_month = 0
    while not done:
        a = env.action_space.sample()
        obs, reward, done, info = env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
    print('Episode ', i, 'Mean reward: ', np.mean(
        rewards), 'Cumulative reward: ', sum(rewards))
env.close()

[2022-10-07 09:08:57,984] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2022-10-07 09:08:57,996] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-demo-v1-res5/Eplus-env-sub_run1
Reward:  -0.5693658209031192 {'timestep': 1, 'time_elapsed': 900, 'year': 1991, 'month': 1, 'day': 1, 'hour': 0, 'total_power': 3780.170717786078, 'total_power_no_units': -0.3780170717786078, 'comfort_penalty': -0.7607145700276305, 'abs_comfort': 0.7607145700276305, 'temperatures': [19.23928542997237], 'out_temperature': 1.8, 'action_': [15, 30]}
Reward:  -2061.064957150696 {'timestep': 2976, 'time_elapsed': 2678400, 'year': 1991, 'month': 2, 'day': 1, 'hour': 0, 'total_power': 22592.29761805248, 'total_power_no_units': -2.259229761805248, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [21.29027708737195], 'out_temperature': -7.0, 'action_': [22, 22]}
Reward:  -4085.7626350724654 {'timestep': 5664, 'time_elapsed': 5097600, 'year': 1991, 'month': 3, 'day': 1, 'hour': 0, 'total_power': 420.968971758518, 'total_power_no_units': -0.042096897175851807, 'comfort_penalty': -0.11870426967686143, 'abs_comfort': 0.11870426967686143, 'temperatures': [19.88129573032314], 'out_temperature': 8.1, 'action_': [15, 30]}
Reward:  -5435.5346266621 {'timestep': 8640, 'time_elapsed': 7776000, 'year': 1991, 'month': 4, 'day': 1, 'hour': 0, 'total_power': 11649.00520907892, 'total_power_no_units': -1.164900520907892, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [21.81204708989529], 'out_temperature': 7.7, 'action_': [22, 23]}
Reward:  -6345.247828294638 {'timestep': 11520, 'time_elapsed': 10368000, 'year': 1991, 'month': 5, 'day': 1, 'hour': 0, 'total_power': 152.4868953414246, 'total_power_no_units': -0.01524868953414246, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.29631550252394], 'out_temperature': 13.0, 'action_': [18, 27]}
Reward:  -7202.603862296734 {'timestep': 14496, 'time_elapsed': 13046400, 'year': 1991, 'month': 6, 'day': 1, 'hour': 0, 'total_power': 257.3532375277869, 'total_power_no_units': -0.025735323752778694, 'comfort_penalty': -2.74339864119381, 'abs_comfort': 2.74339864119381, 'temperatures': [20.25660135880619], 'out_temperature': 18.4, 'action_': [19, 26]}
Reward:  -10096.316306201286 {'timestep': 17376, 'time_elapsed': 15638400, 'year': 1991, 'month': 7, 'day': 1, 'hour': 0, 'total_power': 175.7796775010779, 'total_power_no_units': -0.017577967750107792, 'comfort_penalty': -2.050386500045999, 'abs_comfort': 2.050386500045999, 'temperatures': [20.949613499954], 'out_temperature': 17.7, 'action_': [16, 29]}
Reward:  -13373.916779680616 {'timestep': 20352, 'time_elapsed': 18316800, 'year': 1991, 'month': 8, 'day': 1, 'hour': 0, 'total_power': 12541.97872344346, 'total_power_no_units': -1.254197872344346, 'comfort_penalty': -1.9257929336437414, 'abs_comfort': 1.9257929336437414, 'temperatures': [21.07420706635626], 'out_temperature': 20.6, 'action_': [21, 24]}
Reward:  -16582.229771212274 {'timestep': 23328, 'time_elapsed': 20995200, 'year': 1991, 'month': 9, 'day': 1, 'hour': 0, 'total_power': 2297.770586821443, 'total_power_no_units': -0.22977705868214432, 'comfort_penalty': -2.0045748664060916, 'abs_comfort': 2.0045748664060916, 'temperatures': [20.99542513359391], 'out_temperature': 18.8, 'action_': [21, 24]}
Reward:  -19397.944264214886 {'timestep': 26208, 'time_elapsed': 23587200, 'year': 1991, 'month': 10, 'day': 1, 'hour': 0, 'total_power': 752.5381431017472, 'total_power_no_units': -0.07525381431017472, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [21.17564750687498], 'out_temperature': 13.3, 'action_': [21, 24]}
Reward:  -20424.788544721127 {'timestep': 29184, 'time_elapsed': 26265600, 'year': 1991, 'month': 11, 'day': 1, 'hour': 0, 'total_power': 522.1287670446718, 'total_power_no_units': -0.052212876704467184, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.00072925585593], 'out_temperature': 13.0, 'action_': [20, 25]}
Reward:  -21565.995065534156 {'timestep': 32064, 'time_elapsed': 28857600, 'year': 1991, 'month': 12, 'day': 1, 'hour': 0, 'total_power': 7777.204818523511, 'total_power_no_units': -0.7777204818523511, 'comfort_penalty': -0.0, 'abs_comfort': 0.0, 'temperatures': [20.97813067889673], 'out_temperature': 5.1, 'action_': [21, 21]}
Reward:  -23560.17226791806 {'timestep': 35040, 'time_elapsed': 31536000, 'year': 1992, 'month': 1, 'day': 1, 'hour': 0, 'total_power': 23095.70961516462, 'total_power_no_units': -2.309570961516462, 'comfort_penalty': -0.15004851034096944, 'abs_comfort': 0.15004851034096944, 'temperatures': [19.84995148965903], 'out_temperature': -12.0, 'action_': [21, 21]}
Episode  0 Mean reward:  -0.672379345545623 Cumulative reward:  -23560.17226791806
[2022-10-07 09:09:10,464] EPLUS_ENV_demo-v1_MainThread_ROOT INFO:EnergyPlus simulation closed successfully.
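
Once the simulation has closed, the new column can be read back from monitor.csv. The following is a minimal sketch that assumes the file is plain comma-separated text with a header row. read_float_column is a hypothetical helper, and the location of monitor.csv depends on the output folder of your own run.

```python
import csv
from typing import List


def read_float_column(csv_path: str, column: str) -> List[float]:
    """Return the numeric values of one column, skipping empty cells."""
    values = []
    with open(csv_path, newline='') as f:
        for record in csv.DictReader(f):
            cell = record.get(column)
            # Cells are empty when the logged value was None (e.g. on reset).
            if cell not in (None, ''):
                values.append(float(cell))
    return values


# Hypothetical usage; replace the path with the episode folder of your run:
# means = read_float_column('<episode folder>/monitor.csv', '10-mean-reward')
```

The function returns an empty list if the requested column does not exist, so a misspelled column name does not raise an error.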