Output format

When a simulation is run, a directory called Eplus-env-<env_name>-res<num_simulation> is generated. Its content is the result of the simulation:

Eplus-env-<env_name>-res<num_simulation>
├── Eplus-env-sub_run1
├── Eplus-env-sub_run2
├── Eplus-env-sub_run3
├── ...
├── Eplus-env-sub_runN
│   ├── output/
│   ├── variables.cfg
│   ├── socket.cfg
│   ├── utilSocket.cfg
│   ├── environment.idf
│   ├── weather.epw
│   ├── monitor.csv
│   └── monitor_normalized.csv (optional)
└── progress.csv
  • Eplus-env-sub_run<num_episode> records the results of each episode of the simulation; there is one of these directories per episode.

  • These directories always share the same structure:
    • A copy of the variables.cfg and environment.idf files used during the simulation. environment.idf may differ from the original hosted in the repository, since the building model can be modified to suit the specific weather or to apply extra user-defined settings when the gym environment is built.

    • A copy of weather.epw, which appears only when the weather changes from one episode to another (when using variability, for example). If the weather does not change, the original .epw from the repository is used in every episode.

    • A copy of socket.cfg and utilSocket.cfg, which are used to establish the communication interface with EnergyPlus during the simulation.

    • monitor.csv: This file records all agent-environment interactions during the episode, timestep by timestep. The format is: timestep, observation_values, action_values, simulation_time (seconds), reward, done. It only exists when the environment has been wrapped with the Logger (see Wrappers for more information); an example of reading it appears after this list.

    • monitor_normalized.csv: This file is only generated when the environment is wrapped with both the logger and normalization (see Wrappers). Its structure is the same as monitor.csv, but the observation_values are normalized.

    • output/: This directory contains the EnergyPlus simulation output.

  • progress.csv: This file summarizes the general simulation results, with one row per episode recording the most important data. Currently, the format is: episode_num, cumulative_reward, mean_reward, cumulative_power_consumption, mean_power_consumption, cumulative_comfort_penalty, mean_comfort_penalty, cumulative_power_penalty, mean_power_penalty, comfort_violation (%), length(timesteps), time_elapsed(seconds). It only exists when the environment has been wrapped with the Logger (see Wrappers for more information).
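
For quick inspection, the generated CSVs can be loaded with pandas. The snippet below is only a minimal sketch: the directory and file names are illustrative, and the selected column names follow the formats described above, so check the actual headers of your run.

import pandas as pd

# Illustrative paths: actual names depend on <env_name>, the simulation
# number and the episode number.
run_dir = 'Eplus-env-demo-v1-res1'

# progress.csv: one row per episode with the aggregated metrics listed above.
progress = pd.read_csv(run_dir + '/progress.csv')
print(progress[['episode_num', 'cumulative_reward', 'mean_power_consumption']])

# monitor.csv: one row per timestep of a single episode.
monitor = pd.read_csv(run_dir + '/Eplus-env-sub_run1/monitor.csv')
print(monitor.head())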

Note

For more information about specific EnergyPlus output, visit EnergyPlus documentation.

Logger

The files monitor.csv, monitor_normalized.csv and progress.csv belong to the Sinergym logger, which is a wrapper for the environment (see Wrappers). This logger is responsible for recording all the interactions that take place during a simulation, regardless of the training technique being used or any other external factor.
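
As a minimal usage sketch, assuming the wrapper is LoggerWrapper from sinergym.utils.wrappers and that the environment id 'Eplus-demo-v1' is registered (both as described in the Wrappers and environments documentation of your Sinergym version):

import gym

import sinergym
from sinergym.utils.wrappers import LoggerWrapper

# Any Sinergym environment works here; the id is illustrative.
env = gym.make('Eplus-demo-v1')
env = LoggerWrapper(env)  # monitor.csv and progress.csv will now be generated

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random agent, for illustration
    obs, reward, done, info = env.step(action)  # each step is recorded
env.close()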

Recording is managed by an instance of the class CSVLogger, which is stored as an environment attribute and is called at each timestep and at the end of each episode:

import csv
import os

import numpy as np


class CSVLogger(object):
    """CSV Logger for agent interaction with environment.

        :param monitor_header: CSV header for sub_run_N/monitor.csv, which records interactions step by step.
        :param progress_header: CSV header for res_N/progress.csv, which records main data episode by episode.
        :param log_file: Path for monitor.csv; there will be one CSV per episode.
        :param log_progress_file: Path for progress.csv; there will be only one CSV for the whole simulation.
        :param flag: Activates (True) or deactivates (False) the Logger in real time.
        :param steps_data, rewards, powers, etc.: Lists used to record step data from which the main data for progress.csv is computed later.
        :param total_timesteps: Timesteps executed in the current episode.
        :param total_time_elapsed: Time elapsed in the current episode (simulation seconds).
        :param comfort_violation_timesteps: Timesteps of the current episode whose comfort_penalty != 0.
        :param steps_data: A list of rows; each element holds the data of one step.

    """

    def __init__(
            self,
            monitor_header,
            progress_header,
            log_progress_file,
            log_file=None,
            flag=True):

        self.monitor_header = monitor_header
        self.progress_header = progress_header + '\n'
        self.log_file = log_file
        self.log_progress_file = log_progress_file
        self.flag = flag

        # episode data
        self.steps_data = [self.monitor_header.split(',')]
        self.steps_data_normalized = [self.monitor_header.split(',')]
        self.rewards = []
        self.powers = []
        self.comfort_penalties = []
        self.power_penalties = []
        self.total_timesteps = 0
        self.total_time_elapsed = 0
        self.comfort_violation_timesteps = 0

    def log_step(
            self,
            timestep,
            date,
            observation,
            action,
            simulation_time,
            reward,
            total_power_no_units,
            comfort_penalty,
            power,
            done):
        """Log step information and store it in steps_data param.

        Args:
            timestep (int): Current episode timestep in simulation.
            date (list): Current date [month,day,hour] in simulation.
            observation (list): Values that belong to current observation.
            action (list): Values that belong to current action.
            simulation_time (float): Total time elapsed in current episode (seconds).
            reward (float): Current reward achieved.
            total_power_no_units (float): Power consumption penalty depending on reward function.
            comfort_penalty (float): Temperature comfort penalty depending on reward function.
            power (float): Power consumption in current step (W).
            done (bool): Specifies if this step terminates episode or not.

        """
        if self.flag:
            row_contents = [timestep] + list(date) + list(observation) + \
                list(action) + [simulation_time, reward,
                                total_power_no_units, comfort_penalty, done]
            self.steps_data.append(row_contents)

            # Store step information for episode
            self._store_step_information(
                reward,
                power,
                comfort_penalty,
                total_power_no_units,
                timestep,
                simulation_time)
        else:
            pass

    def log_step_normalize(
            self,
            timestep,
            date,
            observation,
            action,
            simulation_time,
            reward,
            total_power_no_units,
            comfort_penalty,
            done):
        """Log step information for the normalized observation and store it
        in the steps_data_normalized attribute. Arguments match log_step,
        except that power is not recorded again here.
        """
        if self.flag:
            row_contents = [timestep] + list(date) + list(observation) + \
                list(action) + [simulation_time, reward,
                                total_power_no_units, comfort_penalty, done]
            self.steps_data_normalized.append(row_contents)
        else:
            pass

    def log_episode(self, episode):
        """Log episode main information using steps_data param.

        Args:
            episode (int): Current simulation episode number.

        """
        if self.flag:
            # statistics metrics for whole episode
            ep_mean_reward = np.mean(self.rewards)
            ep_cumulative_reward = np.sum(self.rewards)
            ep_cumulative_power = np.sum(self.powers)
            ep_mean_power = np.mean(self.powers)
            ep_cumulative_comfort_penalty = np.sum(self.comfort_penalties)
            ep_mean_comfort_penalty = np.mean(self.comfort_penalties)
            ep_cumulative_power_penalty = np.sum(self.power_penalties)
            ep_mean_power_penalty = np.mean(self.power_penalties)
            try:
                comfort_violation = (
                    self.comfort_violation_timesteps /
                    self.total_timesteps *
                    100)
            except ZeroDivisionError:
                comfort_violation = np.nan

            # Write steps_info to monitor.csv
            with open(self.log_file, 'w', newline='') as file_obj:
                # Create a writer object from csv module
                csv_writer = csv.writer(file_obj)
                # Write every recorded row (header included) to the CSV file
                csv_writer.writerows(self.steps_data)

            # Write normalized steps_info to monitor_normalized.csv
            if len(self.steps_data_normalized) > 1:
                with open(self.log_file[:-4] + '_normalized.csv', 'w', newline='') as file_obj:
                    # Create a writer object from csv module
                    csv_writer = csv.writer(file_obj)
                    # Write every normalized row (header included) to the CSV file
                    csv_writer.writerows(self.steps_data_normalized)

            # Create CSV file with header if it's required for progress.csv
            if not os.path.isfile(self.log_progress_file):
                with open(self.log_progress_file, 'a', newline='\n') as file_obj:
                    file_obj.write(self.progress_header)

            # building episode row
            row_contents = [
                episode,
                ep_cumulative_reward,
                ep_mean_reward,
                ep_cumulative_power,
                ep_mean_power,
                ep_cumulative_comfort_penalty,
                ep_mean_comfort_penalty,
                ep_cumulative_power_penalty,
                ep_mean_power_penalty,
                comfort_violation,
                self.total_timesteps,
                self.total_time_elapsed]
            with open(self.log_progress_file, 'a+', newline='') as file_obj:
                # Create a writer object from csv module
                csv_writer = csv.writer(file_obj)
                # Add contents of list as last row in the csv file
                csv_writer.writerow(row_contents)

            # Reset episode information
            self._reset_logger()
        else:
            pass

    def set_log_file(self, new_log_file):
        """Change log_file path for monitor.csv when an episode ends.

        Args:
            new_log_file (str): New log path depending on simulation.

        """
        if self.flag:
            self.log_file = new_log_file
            if self.log_file:
                with open(self.log_file, 'a', newline='\n') as file_obj:
                    file_obj.write(self.monitor_header)
        else:
            pass

    def _store_step_information(
            self,
            reward,
            power,
            comfort_penalty,
            power_penalty,
            timestep,
            simulation_time):
        """Store relevant data to episode summary in progress.csv.

        Args:
            reward (float): Current reward achieved.
            power (float): Power consumption in current step (W).
            comfort_penalty (float): Temperature comfort penalty depending on reward function.
            power_penalty (float): Power consumption penalty depending on reward function.
            timestep (int): Current episode timestep in simulation.
            simulation_time (float): Total time elapsed in current episode (seconds).

        """
        if reward is not None:
            self.rewards.append(reward)
        if power is not None:
            self.powers.append(power)
        if comfort_penalty is not None:
            self.comfort_penalties.append(comfort_penalty)
        if power_penalty is not None:
            self.power_penalties.append(power_penalty)
        if comfort_penalty != 0:
            self.comfort_violation_timesteps += 1
        self.total_timesteps = timestep
        self.total_time_elapsed = simulation_time

    def _reset_logger(self):
        """Reset relevant data to next episode summary in progress.csv.
        """
        self.steps_data = [self.monitor_header.split(',')]
        self.steps_data_normalized = [self.monitor_header.split(',')]
        self.rewards = []
        self.powers = []
        self.comfort_penalties = []
        self.power_penalties = []
        self.total_timesteps = 0
        self.total_time_elapsed = 0
        self.comfort_violation_timesteps = 0

    def activate_flag(self):
        """Activate Sinergym CSV logger
        """
        self.flag = True

    def deactivate_flag(self):
        """Deactivate Sinergym CSV logger
        """
        self.flag = False
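
For reference, the class above can also be driven directly. The following is an illustrative sketch only: the header strings and step values are made up here, whereas in practice Sinergym builds them from the environment's observation and action variables.

# Illustrative headers matching the monitor.csv and progress.csv formats
# described above.
monitor_header = ('timestep,month,day,hour,obs1,obs2,act1,'
                  'simulation_time,reward,total_power_no_units,'
                  'comfort_penalty,done')
progress_header = ('episode_num,cumulative_reward,mean_reward,'
                   'cumulative_power_consumption,mean_power_consumption,'
                   'cumulative_comfort_penalty,mean_comfort_penalty,'
                   'cumulative_power_penalty,mean_power_penalty,'
                   'comfort_violation (%),length(timesteps),'
                   'time_elapsed(seconds)')

logger = CSVLogger(
    monitor_header=monitor_header,
    progress_header=progress_header,
    log_progress_file='progress.csv',
    log_file='monitor.csv')

# One step of a fictitious episode, then the episode summary.
logger.log_step(
    timestep=1, date=[1, 1, 0], observation=[21.5, 40.0], action=[22.0],
    simulation_time=900.0, reward=-0.5, total_power_no_units=-0.25,
    comfort_penalty=-0.25, power=5000.0, done=False)
logger.log_episode(episode=1)  # writes monitor.csv and appends to progress.csv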

Note

Normalized observation methods are only used when the environment has previously been wrapped with normalization (see Wrappers).

Note

You can activate and deactivate the logger from the environment whenever you want, using its activate and deactivate methods, so there is no need to unwrap the environment.
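
As a hedged sketch of toggling the logger, assuming the wrapped environment exposes its CSVLogger instance as env.logger (the exact attribute and method names depend on your Sinergym version):

# Stop recording, e.g. while debugging or warming up.
env.logger.deactivate_flag()
obs = env.reset()  # steps taken now are not written to monitor.csv

# Resume recording from the next timestep on.
env.logger.activate_flag()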