5. Environments

As mentioned in the introduction, Sinergym follows this structure:

Sinergym backend

Sinergym is composed of three main components: the agent, the communication interface and the simulation. The agent sends actions to and receives observations from the environment through the Gymnasium interface. In turn, the Gymnasium interface communicates with the simulation engine via BCVTB, which provides the socket used to exchange information in EnergyPlus format.

The following image shows that socket connection:

Sinergym backend

5.1. Additional observation information

In addition to the observations returned by the step and reset methods, as shown in the images above, both methods return a Python dictionary with additional information:

  • Reset info: This dictionary has the following keys:

info = {
          'eplus_working_dir': eplus_working_dir,
          'episode_num': self._epi_num,
          'socket_host': addr[0],
          'socket_port': addr[1],
          'init_year': time_info[0],
          'init_month': time_info[1],
          'init_day': time_info[2],
          'init_hour': time_info[3],
          'timestep': 0,
          'time_elapsed': 0
      }

Thus, we can find out where the episode output will be stored, the episode number of the simulation, the socket information, the episode start date, and the timestep and time elapsed (both 0 at reset).

  • Step info: This dictionary has the following keys:

info = {
          'timestep': int(
              curSimTim / self._eplus_run_stepsize),
          'time_elapsed': int(curSimTim),
          'year': time_info[0],
          'month': time_info[1],
          'day': time_info[2],
          'hour': time_info[3],
          'action': action,
          'reward': reward,
          # AND REWARD TERMS
      }

The additional information available in a step is the timestep number, the simulation time elapsed, the step datetime, the action executed in the simulator (which can differ from the step action parameter due to the transformations applied), the reward and the reward terms. The keys of the reward terms depend on the reward class used in the environment.
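As an illustration, the following sketch (assuming the standard Gymnasium reset/step API and the Eplus-demo-v1 environment listed in the table below) runs one episode with random actions and reads part of this additional information:

import gymnasium as gym
import sinergym  # registers the Eplus-* environment IDs

env = gym.make('Eplus-demo-v1')
obs, info = env.reset()
# Reset info: working directory, episode number, socket data, start date...
print(info['episode_num'], info['eplus_working_dir'])

terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)

# Step info: timestep, time elapsed, datetime, executed action, reward terms...
print(info['timestep'], info['time_elapsed'], info['reward'])
env.close()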

5.2. Environments List

The list of available environments is the following:

| Env. name | IDF file | EPW file | Weather variability | Action space | Simulation period |
|---|---|---|---|---|---|
| Eplus-demo-v1 | 5ZoneAutoDXVAV.idf | USA_PA_Pittsburgh-Allegheny.County.AP.725205_TMY3.epw | No | Discrete(10) | 01/01 - 31/03 |
| Eplus-5Zone-hot-discrete-v1 | 5ZoneAutoDXVAV.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-5Zone-mixed-discrete-v1 | 5ZoneAutoDXVAV.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-5Zone-cool-discrete-v1 | 5ZoneAutoDXVAV.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-5Zone-hot-continuous-v1 | 5ZoneAutoDXVAV.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-5Zone-mixed-continuous-v1 | 5ZoneAutoDXVAV.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-5Zone-cool-continuous-v1 | 5ZoneAutoDXVAV.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-5Zone-hot-discrete-stochastic-v1 | 5ZoneAutoDXVAV.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-5Zone-mixed-discrete-stochastic-v1 | 5ZoneAutoDXVAV.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-5Zone-cool-discrete-stochastic-v1 | 5ZoneAutoDXVAV.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-5Zone-hot-continuous-stochastic-v1 | 5ZoneAutoDXVAV.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-5Zone-mixed-continuous-stochastic-v1 | 5ZoneAutoDXVAV.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-5Zone-cool-continuous-stochastic-v1 | 5ZoneAutoDXVAV.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-datacenter-hot-discrete-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-datacenter-hot-continuous-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Box(4) | 01/01 - 31/12 |
| Eplus-datacenter-hot-discrete-stochastic-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-datacenter-hot-continuous-stochastic-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Box(4) | 01/01 - 31/12 |
| Eplus-datacenter-mixed-discrete-stochastic-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-datacenter-mixed-continuous-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Box(4) | 01/01 - 31/12 |
| Eplus-datacenter-mixed-discrete-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-datacenter-mixed-continuous-stochastic-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Box(4) | 01/01 - 31/12 |
| Eplus-datacenter-cool-discrete-stochastic-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-datacenter-cool-continuous-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Box(4) | 01/01 - 31/12 |
| Eplus-datacenter-cool-discrete-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-datacenter-cool-continuous-stochastic-v1 | 2ZoneDataCenterHVAC_wEconomizer.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Box(4) | 01/01 - 31/12 |
| Eplus-warehouse-hot-discrete-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-warehouse-hot-continuous-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Box(5) | 01/01 - 31/12 |
| Eplus-warehouse-hot-discrete-stochastic-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-warehouse-hot-continuous-stochastic-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Box(5) | 01/01 - 31/12 |
| Eplus-warehouse-mixed-discrete-stochastic-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-warehouse-mixed-continuous-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Box(5) | 01/01 - 31/12 |
| Eplus-warehouse-mixed-discrete-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-warehouse-mixed-continuous-stochastic-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Box(5) | 01/01 - 31/12 |
| Eplus-warehouse-cool-discrete-stochastic-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-warehouse-cool-continuous-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Box(5) | 01/01 - 31/12 |
| Eplus-warehouse-cool-discrete-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-warehouse-cool-continuous-stochastic-v1 | ASHRAE9012016_Warehouse_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Box(5) | 01/01 - 31/12 |
| Eplus-office-hot-discrete-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-office-hot-continuous-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-office-hot-discrete-stochastic-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-office-hot-continuous-stochastic-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-office-mixed-discrete-stochastic-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-office-mixed-continuous-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-office-mixed-discrete-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-office-mixed-continuous-stochastic-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-office-cool-discrete-stochastic-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Discrete(10) | 01/01 - 31/12 |
| Eplus-office-cool-continuous-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-office-cool-discrete-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Discrete(10) | 01/01 - 31/12 |
| Eplus-office-cool-continuous-stochastic-v1 | ASHRAE9012016_OfficeMedium_Denver.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-officegrid-cool-continuous-v1 | OfficeGridStorageSmoothing.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Box(4) | 01/01 - 31/12 |
| Eplus-officegrid-mixed-continuous-v1 | OfficeGridStorageSmoothing.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Box(4) | 01/01 - 31/12 |
| Eplus-officegrid-hot-continuous-v1 | OfficeGridStorageSmoothing.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Box(4) | 01/01 - 31/12 |
| Eplus-officegrid-cool-continuous-stochastic-v1 | OfficeGridStorageSmoothing.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Box(4) | 01/01 - 31/12 |
| Eplus-officegrid-mixed-continuous-stochastic-v1 | OfficeGridStorageSmoothing.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Box(4) | 01/01 - 31/12 |
| Eplus-officegrid-hot-continuous-stochastic-v1 | OfficeGridStorageSmoothing.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Box(4) | 01/01 - 31/12 |
| Eplus-shop-cool-continuous-v1 | ShopWithVandBattery.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-shop-mixed-continuous-v1 | ShopWithVandBattery.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-shop-hot-continuous-v1 | ShopWithVandBattery.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | No | Box(2) | 01/01 - 31/12 |
| Eplus-shop-cool-continuous-stochastic-v1 | ShopWithVandBattery.idf | USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-shop-mixed-continuous-stochastic-v1 | ShopWithVandBattery.idf | USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |
| Eplus-shop-hot-continuous-stochastic-v1 | ShopWithVandBattery.idf | USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw | Yes | Box(2) | 01/01 - 31/12 |

Note

For more information about buildings (epJSON column) and weathers (EPW column), please, visit sections Buildings and Weathers respectively.

5.3. Available Parameters

The environment constructor lets us configure the complete context of our environment for experimentation, either starting from one of the predefined environments shown in the table above or creating a new one.

    def __init__(
        self,
        building_file: str,
        weather_file: Union[str, List[str]],
        observation_space: gym.spaces.Box = gym.spaces.Box(
            low=-5e6, high=5e6, shape=(4,), dtype=np.float32),
        observation_variables: List[str] = [],
        action_space: Union[gym.spaces.Box, gym.spaces.Discrete] = gym.spaces.Box(
            low=0, high=0, shape=(0,), dtype=np.float32),
        action_variables: List[str] = [],
        action_mapping: Dict[int, Tuple[float, ...]] = {},
        weather_variability: Optional[Tuple[float]] = None,
        reward: Any = LinearReward,
        reward_kwargs: Optional[Dict[str, Any]] = {},
        act_repeat: int = 1,
        max_ep_data_store_num: int = 10,
        action_definition: Optional[Dict[str, Any]] = None,
        env_name: str = 'eplus-env-v1',
        config_params: Optional[Dict[str, Any]] = None
    ):
        """Environment with EnergyPlus simulator.

        Args:
            building_file (str): Name of the JSON file with the building definition.
            weather_file (Union[str,List[str]]): Name of the EPW file for weather conditions. It can be specified a list of weathers files in order to sample a weather in each episode randomly.
            observation_space (gym.spaces.Box, optional): Gym Observation Space definition. Defaults to an empty observation_space (no control).
            observation_variables (List[str], optional): List with variables names in building. Defaults to an empty observation variables (no control).
            action_space (Union[gym.spaces.Box, gym.spaces.Discrete], optional): Gym Action Space definition. Defaults to an empty action_space (no control).
            action_variables (List[str],optional): Action variables to be controlled in building, if that actions names have not been configured manually in building, you should configure or use action_definition. Default to empty List.
            action_mapping (Dict[int, Tuple[float, ...]], optional): Action mapping list for discrete actions spaces only. Defaults to empty list.
            weather_variability (Optional[Tuple[float]], optional): Tuple with sigma, mu and tao of the Ornstein-Uhlenbeck process to be applied to weather data. Defaults to None.
            reward (Any, optional): Reward function instance used for agent feedback. Defaults to LinearReward.
            reward_kwargs (Optional[Dict[str, Any]], optional): Parameters to be passed to the reward function. Defaults to empty dict.
            act_repeat (int, optional): Number of timesteps that an action is repeated in the simulator, regardless of the actions it receives during that repetition interval.
            max_ep_data_store_num (int, optional): Number of last sub-folders (one for each episode) generated during execution on the simulation.
            action_definition (Optional[Dict[str, Any]): Dict with building components to being controlled by Sinergym automatically if it is supported. Default value to None.
            env_name (str, optional): Env name used for working directory generation. Defaults to eplus-env-v1.
            config_params (Optional[Dict[str, Any]], optional): Dictionary with all extra configuration for simulator. Defaults to None.
        """

        # ---------------------------------------------------------------------------- #
        #                          Energyplus, BCVTB and paths                         #
        # ---------------------------------------------------------------------------- #
        eplus_path = os.environ['EPLUS_PATH']
        bcvtb_path = os.environ['BCVTB_PATH']

        # building file
        self.building_file = building_file
        # EPW file(s) (str or List of EPW's)
        if isinstance(weather_file, str):
            self.weather_files = [weather_file]
        else:
            self.weather_files = weather_file

        # ---------------------------------------------------------------------------- #
        #                             Variables definition                             #
        # ---------------------------------------------------------------------------- #
        self.variables = {}
        self.variables['observation'] = observation_variables
        self.variables['action'] = action_variables

        # Copy to use original variables in step.obs_dict for reward
        self.original_obs = observation_variables
        self.original_act = action_variables

        self.name = env_name

        # ---------------------------------------------------------------------------- #
        #                                   Simulator                                  #
        # ---------------------------------------------------------------------------- #
        self.simulator = EnergyPlus(
            env_name=env_name,
            eplus_path=eplus_path,
            bcvtb_path=bcvtb_path,
            building_file=self.building_file,
            weather_files=self.weather_files,
            variables=self.variables,
            act_repeat=act_repeat,
            max_ep_data_store_num=max_ep_data_store_num,
            action_definition=action_definition,
            config_params=config_params
        )

        # ---------------------------------------------------------------------------- #
        #                       Detection of controllable planners                     #
        # ---------------------------------------------------------------------------- #
        self.schedulers = self.get_schedulers()

        # ---------------------------------------------------------------------------- #
        #        Adding simulation date to observation (not needed in simulator)       #
        # ---------------------------------------------------------------------------- #

        self.variables['observation'] = ['year', 'month',
                                         'day', 'hour'] + self.variables['observation']
        self.original_obs = ['year', 'month',
                             'day', 'hour'] + self.original_obs

        # ---------------------------------------------------------------------------- #
        #                          reset default options                               #
        # ---------------------------------------------------------------------------- #
        self.default_options = {}
        # Weather Variability
        if weather_variability:
            self.default_options['weather_variability'] = weather_variability
        # ... more reset option implementations here

        # ---------------------------------------------------------------------------- #
        #                               Observation Space                              #
        # ---------------------------------------------------------------------------- #
        self._observation_space = observation_space

        # ---------------------------------------------------------------------------- #
        #                                 Action Space                                 #
        # ---------------------------------------------------------------------------- #
        # Action space type
        self.flag_discrete = (
            isinstance(
                action_space,
                gym.spaces.Discrete))

        # Discrete
        if self.flag_discrete:
            self.action_mapping = action_mapping
            self._action_space = action_space
        # Continuous
        else:
            # Defining action values setpoints (one per value)
            self.setpoints_space = action_space

            self._action_space = gym.spaces.Box(
                # continuous_action_def[2] --> shape
                low=np.array(
                    np.repeat(-1, action_space.shape[0]), dtype=np.float32),
                high=np.array(
                    np.repeat(1, action_space.shape[0]), dtype=np.float32),
                dtype=action_space.dtype
            )

        # ---------------------------------------------------------------------------- #
        #                                    Reward                                    #
        # ---------------------------------------------------------------------------- #
        self.reward_fn = reward(**reward_kwargs)

        # ---------------------------------------------------------------------------- #
        #                        Environment definition checker                        #
        # ---------------------------------------------------------------------------- #

        self._check_eplus_env()

Below we describe the available parameters and their purpose.

5.3.1. building file

The parameter building_file is the epJSON file, a JSON adaptation of the IDF (Input Data File) format in which the EnergyPlus building model is defined.

Sinergym initially provides “free” buildings. This means that the epJSON file does not have the external interface defined, nor default components such as the timesteps, the run period, the location or the DesignDays.

Depending on the rest of the parameters that make up the environment, Sinergym updates the building model automatically, changing the components that are necessary, such as the external interface mentioned above.

Once the building is configured, it is copied to the output folder of that particular experiment and used by the simulator during that execution.

5.3.2. EPW file

The parameter weather_file is the name of the EPW (EnergyPlus Weather) file in which the climate conditions during a year are defined.

Depending on the climate set for the environment, some building model components need to be modified so that they are compatible with that weather. Therefore, Sinergym updates the DesignDays and Location fields automatically using the weather data, without the need for user intervention.

This parameter can be a single weather file name (str), as mentioned, or a list of different weather files (List[str]). When a list of files is defined, Sinergym will randomly select an EPW file at the start of each episode and re-adapt the building model accordingly, as in the sketch below. This is done in order to increase the complexity of the environment, if desired.
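For instance, a minimal sketch of an environment that samples among several EPW files at the start of each episode could look like the following (passing constructor arguments such as weather_file through gym.make is assumed here; the chosen environment ID and files come from the table above):

import gymnasium as gym
import sinergym

# One of these EPW files is selected randomly at the beginning of each episode
env = gym.make(
    'Eplus-5Zone-mixed-continuous-v1',
    weather_file=[
        'USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw',
        'USA_AZ_Davis-Monthan.AFB.722745_TMY3.epw',
        'USA_WA_Port.Angeles-William.R.Fairchild.Intl.AP.727885_TMY3.epw',
    ])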

The weather file used in each episode is stored in the Sinergym episode output folder. If variability is defined (see section Weather Variability), the stored EPW will have that noise included.

5.3.3. Weather Variability

Weather variability can be integrated into an environment using the weather_variability parameter.

It implements an Ornstein-Uhlenbeck process in order to introduce noise into the weather data from episode to episode. The parameter is a Python tuple of three values (sigma, mu and tau) whose values define the nature of that noise.

Ornstein-Uhlenbeck process noise with different hyperparameters.
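A minimal sketch of how this parameter could be set follows; the tuple values are only illustrative, and passing the argument through gym.make is assumed:

import gymnasium as gym
import sinergym

# (sigma, mu, tau) of the Ornstein-Uhlenbeck process applied to the weather data
env = gym.make('Eplus-5Zone-hot-continuous-stochastic-v1',
               weather_variability=(1.0, 0.0, 0.001))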

5.3.4. Reward

The parameter called reward is used to define the reward class (see section Rewards) that the environment will use to calculate and return the reward value at each timestep.

5.3.5. Reward Kwargs

Depending on the reward class specified for the environment, different parameters may be required. In addition, if a user creates a new custom reward, it can have new parameters as well.

Moreover, depending on the building used for the environment (epJSON file), the values of these reward parameters may need to differ, such as the comfort range or the energy and temperature variables of the simulation used to calculate the reward.

The parameter called reward_kwargs is a Python dictionary in which we can specify all the reward class parameters needed, for example as in the sketch below. For more information about rewards, visit section Rewards.
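A minimal sketch follows; the import path and keyword names are assumptions based on LinearReward and should be checked against the Rewards section, and the variable names and values are only illustrative:

import gymnasium as gym
import sinergym
from sinergym.utils.rewards import LinearReward

env = gym.make(
    'Eplus-5Zone-hot-continuous-v1',
    reward=LinearReward,
    reward_kwargs={
        # Simulation variables used to compute the reward (illustrative names)
        'temperature_variable': 'Zone Air Temperature (SPACE1-1)',
        'energy_variable': 'Facility Total HVAC Electricity Demand Rate (Whole Building)',
        # Comfort ranges and energy/comfort trade-off (illustrative values)
        'range_comfort_winter': (20.0, 23.5),
        'range_comfort_summer': (23.0, 26.0),
        'energy_weight': 0.5,
    })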

5.3.6. Action Repeat

The parameter act_repeat is the number of timesteps during which an action is repeated in the simulator, regardless of the actions received during that repetition interval. The default value is 1.

5.3.7. Maximum Episode Data Stored in Sinergym Output

Sinergym stores all the output of an experiment in a folder organized into sub-folders, one for each episode (see section Output format for more information). Depending on the value of the parameter max_ep_data_store_num, the experiment will keep the output data of only the last n episodes, where n is the value of the parameter.

In any case, if the Sinergym Logger (see the Logger section) is active, progress.csv will be present with the summary data of each episode.

5.3.8. Observation/action spaces

The structure of the observation and action space is defined directly in the environment constructor. This allows a dynamic definition of these spaces. Let’s see the fields required to do it:

  • observation_variables: List of the observation variables that the simulator will process as an observation. These variable names must follow the structure <variable_name>(<zone_name>) in order to be registered correctly. Sinergym will check for you that the variable names are correct with respect to the building you are trying to simulate (epJSON file). To do this, it will look at the list found in the variables folder of the project (RDD file).

    Note

    In case a new observation variable is added to the default ones for an environment, care must be taken if normalization is to be applied. This is because you have to update the dictionary of value ranges available in constants.py, as discussed in issue #249. You can use the range_getter function of common.py to get these ranges automatically from an experiment output folder.

  • observation_space: Definition of the observation space following the Gymnasium standard. This space is used to represent all the observation variables previously defined. Remember that the year, month, day and hour are added by Sinergym later, so space must be reserved for these fields in the definition. If an inconsistency is found, Sinergym will notify you so that it can be fixed easily.

  • action_variables: List of the action variables that the simulator will process as schedule control actuators in the building model. These variables must be correctly defined in the building model (epJSON file) as an external interface before simulation. You can modify the building file manually or use our action definition field, in which you set what you want to control and Sinergym takes care of modifying this file for you automatically. For more information about this automatic adaptation, see section Action definition.

  • action_space: Definition of the action space following the Gymnasium standard. This definition can be discrete or continuous and must be consistent with the previously defined action variables (Sinergym will report inconsistencies as usual).

Note

In order to make environments more generic for DRL solutions, we have updated the action space for continuous problems. The Gymnasium action space is always defined in [-1, 1], and Sinergym internally maps these values to the action space defined in the environment before sending them to the EnergyPlus simulator. The method in charge of mapping these values from [-1, 1] to the real action space is called _setpoints_transform(action) in sinergym/sinergym/envs/eplus_env.py.

  • action_mapping: Only necessary for discrete action spaces. It is a dictionary that links an index to a specific configuration of values for each action variable, as shown in the sketch after this list.
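Putting these fields together, a sketch of a possible custom definition is shown below; the variable names follow the <variable_name>(<zone_name>) convention discussed above and, like the setpoint ranges, are only illustrative:

import gymnasium as gym
import numpy as np

# Observation: remember to reserve space for year, month, day and hour
new_observation_variables = [
    'Site Outdoor Air Drybulb Temperature (Environment)',
    'Zone Air Temperature (SPACE1-1)',
    'Facility Total HVAC Electricity Demand Rate (Whole Building)',
]
new_observation_space = gym.spaces.Box(
    low=-5e6, high=5e6,
    shape=(len(new_observation_variables) + 4,),  # +4 for the date fields
    dtype=np.float32)

# Action: two external variables (heating and cooling setpoints)
new_action_variables = ['Heating_Setpoint_RL', 'Cooling_Setpoint_RL']

# Continuous case: real setpoint ranges (normalized to [-1, 1] internally)
new_action_space = gym.spaces.Box(
    low=np.array([15.0, 22.5], dtype=np.float32),
    high=np.array([22.5, 30.0], dtype=np.float32),
    dtype=np.float32)

# Discrete case: each index maps to a (heating, cooling) setpoint pair
new_action_mapping = {
    0: (15.0, 30.0),
    1: (18.0, 27.0),
    2: (21.0, 24.0),
}
new_discrete_action_space = gym.spaces.Discrete(len(new_action_mapping))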

As we have mentioned, observation and action spaces are defined dynamically in the Sinergym environment constructor. The environment IDs registered in Sinergym use a default definition set up in constants.py.

As can be seen in the environment observations, the year, month, day and hour are included, even though they are not configured in the default observation variables definition. This is because they are not variables recognizable by the simulator (EnergyPlus) as such; Sinergym computes them and adds them to the states returned as output by the environment. This feature is common to all environments available in Sinergym and all supported building designs. In other words, you do not need to add these variables (year, month, day and hour) to the observation variables, but you do need to include them in the observation space.

As mentioned before, all environment IDs registered in Sinergym use their respective default action and observation spaces, variables and action definition. However, you can change these values, which gives you the possibility of playing with different observation/action spaces in discrete and continuous environments in order to study how this affects the resolution of a building problem.

Sinergym has several checkers to ensure that there are no inconsistencies in the alternative specifications made with respect to the default ones. If the specification offered is wrong, Sinergym will issue messages indicating where the error or inconsistency is located.

Sinergym offers the possibility to create empty action interfaces, so that you can take advantage of all its benefits instead of using the EnergyPlus simulator directly, while control is managed by the default building model schedulers (actuators). For more information, see the usage example Default building control setting up an empty action interface.

Note

variables.cfg is a requirement in order to establish a connection between the Gymnasium environment and the simulator with an external interface (using BCVTB). Since Sinergym version 1.9.0, it is created automatically using the action and observation space definitions during environment construction.

Note

Sinergym backend was initially developed as an extension of Zhiang Zhang and Khee Poh Lam Gym-Eplus project. It has since evolved into a tool with its own identity.

5.3.9. Environment name

The parameter env_name is used to define the name of the working directory generated for the experiment.

5.3.10. Action definition

Creating a new external interface to control different parts of a building is not a trivial task: it requires certain changes in the building model (epJSON), configuration files for the external interface (variables.cfg), etc.

The changes in the building model are complex, since the available zones and actuators depend on the building model being used.

Thus, there is the possibility of adding an action definition to the environment instead of modifying the building model directly, declaring the components or actuators that must be controlled by the external variables specified in Observation/action spaces.

For this purpose, the action_definition parameter is available in environments. Its value is a dictionary with the following structure:

action_definition_example = {
    <original_building_scheduler_name>: {'name': <external_variable_name>,
                                         'initial_value': <value>},
    ...
}
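For instance, a hypothetical definition replacing the heating and cooling setpoint schedulers of a building with two external variables could look like this (the scheduler and variable names depend on the specific building model and are only illustrative):

action_definition_example = {
    # <original scheduler name>: {'name': <external variable>, 'initial_value': <value>}
    'HTGSETP_SCH': {'name': 'Heating_Setpoint_RL', 'initial_value': 21.0},
    'CLGSETP_SCH': {'name': 'Cooling_Setpoint_RL', 'initial_value': 25.0},
}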

For an example about how to use this action definition functionality, visit section Updating the action definition of the environment.

Sinergym obtains the list of schedulers available in the building model loaded in the environment and stores it as an environment attribute. The information in this dictionary has the following structure:

# env.schedulers
{
  <scheduler_name>: {'Type': <scheduler_value_type>,
                     'Object1': {<object_field>: <object_value>, ...},
                     'Object2': {<object_field>: <object_value>, ...},
                     ...
                    },
  ...
}

For each scheduler found, a new entry is created in the dictionary whose key is its name, including its data type and the objects in which the value of the scheduler is used. Sinergym will use this information to perform the automatic changes requested in the action definition we have seen above.

Sinergym replaces the original scheduler with an external interface, created and used in each of the objects the scheduler handled. The data type does not need to be specified, since Sinergym uses the data type of the replaced scheduler.

This allows any component to be handled by an external interface with a simple definition by the user, although it is first necessary to know which components are included in the building.

If you do not want to read the env.schedulers dictionary directly, it is also possible to export a PDF with this information in a better presented form, in order to study the elements you want to manage. For an example of how to use it, see Getting information about building model with Sinergym.

5.3.11. Extra configuration

Some parameters directly associated with the simulator can also be set as extra configuration, such as the number of occupants, the timesteps per simulation hour, the run period, etc.

Since this extra configuration context can grow in the future, it is specified in the config_params field. It is a Python dictionary where these values are specified, for example as in the sketch below. For more information about the extra configuration available for Sinergym, visit section Extra Configuration in Sinergym simulations.
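A minimal sketch follows; the keys shown (timesteps_per_hour, runperiod) are assumptions to be checked in the Extra Configuration section, and the values are only illustrative:

import gymnasium as gym
import sinergym

extra_conf = {
    'timesteps_per_hour': 6,                 # simulation timesteps per hour (assumed key)
    'runperiod': (1, 1, 1991, 31, 3, 1991),  # start day/month/year, end day/month/year (assumed key)
}
env = gym.make('Eplus-5Zone-hot-continuous-v1', config_params=extra_conf)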

5.4. Adding new weathers for environments

Sinergym includes several weather files covering different types of climate in different areas of the world. The aim is to provide the greatest possible diversity for the experiments, taking certain characteristics into account.

However, you may need or want to include a new weather file for an experiment. This section explains how to do it:

  1. Download the EPW file and the DDY file from the EnergyPlus page. The DDY file contains information about the location and the different design days available for that weather.

  2. Both files (EPW and DDY) must have exactly the same name, the only difference being the extension. They should be placed in the weathers folder.

That is all! Sinergym should be able to adapt the SizingPeriod:DesignDays and Site:Location fields in the building model file automatically using the DDY file for that weather.

5.5. Adding new buildings for environments

As we have already mentioned, a user can change the already available environments or even create new environment definitions, including new climates, action and observation spaces, etc. However, perhaps the most complex thing to incorporate into the project is a new building model (epJSON file) beyond the ones we support.

This section provides information for anyone who decides to add new buildings for use with Sinergym. The main steps to follow are:

  1. Add your building file (epJSON) to buildings. EnergyPlus intends to work with the JSON format instead of the IDF format in its building definitions and simulations, so Sinergym works with this format directly from version 2.4.0 onwards. You can download an IDF file and convert it to epJSON using the ConvertInputFormat tool from EnergyPlus. That building model must be “free” as far as the external interface is concerned if you plan to use Sinergym’s action definition, which will modify the model for you automatically before starting the simulation (see section Action definition). Make sure that the epJSON model version is compatible with the EnergyPlus version.

  2. Add your own EPW file for weather conditions (section Adding new weathers for environments) or use one of ours in the environment constructor.

  3. Sinergym will check that the observation variables specified in the environment constructor are available in the simulation before starting it. In order to be able to do these checks, you need to copy an RDD file with the same name as the building model file (except the extension) to variables. To obtain this RDD file, run a simulation with EnergyPlus directly and extract it from the output folder. Make sure that the Output:VariableDictionary object in the building model has the value Regular so that the RDD file has the correct format for Sinergym.

  4. Register your own environment ID here following the same structure as the rest. You will have to specify the environment components you want to control, the action/observation space, etc. We have examples of how to do it (Getting information about building model with Sinergym, Updating the action definition of the environment, etc.).

  5. Now you can use your own environment ID with gym.make(), as in our documentation examples and in the sketch below.
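Once registered, the new environment can be instantiated like any other one (the environment ID below is hypothetical):

import gymnasium as gym
import sinergym

env = gym.make('Eplus-mybuilding-hot-continuous-v1')
obs, info = env.reset()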