21. DRL usage example

In this notebook example, Stable Baselines 3 (SB3) is used to train and load an agent. However, Sinergym is agnostic to the DRL algorithm (although it provides custom callbacks specifically for SB3) and can be used with any DRL library that works with Gymnasium environments.
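
As a rough illustration of what "agnostic" means here, the sketch below drives an environment through the Gymnasium-style reset/step interface with a random policy. ToyThermostatEnv and run_episode are hypothetical stand-ins written only for this example (they are not part of Sinergym), so the snippet runs without EnergyPlus; any object exposing the same two methods could take the environment's place, and any callable could take the policy's place.

```python
import random


class ToyThermostatEnv:
    """Hypothetical stand-in exposing Gymnasium-style reset/step methods."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0
        self.temperature = 20.0

    def reset(self, seed=None, options=None):
        self.t = 0
        self.temperature = 20.0
        return [self.temperature], {}  # observation, info

    def step(self, action):
        # Action 1 heats by one degree, any other action cools by one degree.
        self.temperature += 1.0 if action == 1 else -1.0
        self.t += 1
        reward = -abs(self.temperature - 21.0)  # distance to a 21 C target
        terminated = False
        truncated = self.t >= self.episode_length
        return [self.temperature], reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode with any callable policy(obs) -> action."""
    obs, info = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        steps += 1
        done = terminated or truncated
    return steps, total_reward


env = ToyThermostatEnv(episode_length=5)
steps, total_reward = run_episode(env, policy=lambda obs: random.choice([0, 1]))
print(steps, total_reward)
```

A trained agent would simply replace the random lambda with its own prediction function.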

21.1. Training a model

We are going to rely on the script DRL_battery.py, available in the repository root. This script exposes everything Sinergym offers for working with deep reinforcement learning algorithms and parameterizes all of it, so that the training options can be defined easily through a JSON file when the script is executed.

For more information about how to run DRL_battery.py, please see Train a model.

[1]:

import sys
from datetime import datetime

import gymnasium as gym
import numpy as np
import wandb
from stable_baselines3 import *
from stable_baselines3.common.callbacks import CallbackList
from stable_baselines3.common.logger import HumanOutputFormat, Logger
from stable_baselines3.common.monitor import Monitor

import sinergym
import sinergym.utils.gcloud as gcloud
from sinergym.utils.callbacks import *
from sinergym.utils.constants import *
from sinergym.utils.logger import CSVLogger, WandBOutputFormat
from sinergym.utils.rewards import *
from sinergym.utils.wrappers import *

First, let’s define some variables for the execution.

[2]:

# Environment ID
environment = "Eplus-demo-v1"
# Training episodes
episodes = 4
# Name of the experiment
experiment_date = datetime.today().strftime('%Y-%m-%d_%H:%M')
experiment_name = 'SB3_DQN-' + environment + \
    '-episodes-' + str(episodes)
experiment_name += '_' + experiment_date
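
The naming scheme above can be checked in isolation. The helper build_experiment_name below is a hypothetical convenience function (not part of Sinergym); it takes the timestamp as an argument so that the result is reproducible. Note that the '%H:%M' format puts a colon in the name, which is not a valid character in Windows file names, so a different separator may be preferable if the name ends up in a path on that platform.

```python
from datetime import datetime


def build_experiment_name(algorithm, environment, episodes, when):
    """Hypothetical helper reproducing the naming scheme of this notebook."""
    date = when.strftime('%Y-%m-%d_%H:%M')
    return f'{algorithm}-{environment}-episodes-{episodes}_{date}'


name = build_experiment_name(
    'SB3_DQN', 'Eplus-demo-v1', 4, datetime(2023, 5, 26, 8, 32))
print(name)
```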

We can combine these experiment executions with Weights & Biases (wandb) to host all the information extracted. With wandb, it’s possible to track and visualize the whole DRL training process in real time, register the hyperparameters and details of each experiment, save artifacts such as models and Sinergym output, and compare different executions.

[3]:

# Create wandb.config object in order to log all experiment params
experiment_params = {
    'sinergym-version': sinergym.__version__,
    'python-version': sys.version
}
experiment_params.update({'environment':environment,
                          'episodes':episodes,
                          'algorithm':'SB3_DQN'})

# Get wandb init params (you have to specify your own project and entity)
wandb_params = {"project": 'sinergym',
                "entity": 'alex_ugr'}
# Init wandb entry
run = wandb.init(
    name=experiment_name + '_' + wandb.util.generate_id(),
    config=experiment_params,
    **wandb_params
)

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: alex_ugr. Use `wandb login --relogin` to force relogin

wandb version 0.15.3 is available! To upgrade, please run: $ pip install wandb --upgrade

Tracking run with wandb version 0.15.2

Run data is saved locally in /workspaces/sinergym/examples/wandb/run-20230526_083242-12zkrpfr

Syncing run SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_g5fey5jp to Weights & Biases (docs)

View project at https://wandb.ai/alex_ugr/sinergym

View run at https://wandb.ai/alex_ugr/sinergym/runs/12zkrpfr

Now we are ready to create the Gymnasium environment. Here we use the environment name defined above; remember that you can change the default environment configuration. We will also create an eval_env to interact with during the evaluation episodes. We can overwrite the environment name with the experiment name if we want.

[4]:

env = gym.make(environment, env_name=experiment_name)
eval_env = gym.make(environment, env_name=experiment_name+'_EVALUATION')

[2023-05-26 08:32:44,154] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Updating Building model ExternalInterface object if it is not present...
[2023-05-26 08:32:44,156] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Updating Building model Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2023-05-26 08:32:44,157] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Updating building model OutPut:Variable and variables XML tree model for BVCTB connection.
[2023-05-26 08:32:44,158] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2023-05-26 08:32:44,158] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Setting up action definition in building model if exists...
[2023-05-26 08:32:44,212] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Updating Building model ExternalInterface object if it is not present...
[2023-05-26 08:32:44,213] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Updating Building model Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2023-05-26 08:32:44,214] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Updating building model OutPut:Variable and variables XML tree model for BVCTB connection.
[2023-05-26 08:32:44,215] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2023-05-26 08:32:44,215] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Setting up action definition in building model if exists...

We can also add a wrapper to the environment. We are going to use a LoggerWrapper (an extension of gym.Wrapper), which monitors and logs the interactions with the environment and saves the data into a CSV file. The generated files will be stored as an artifact in wandb too.

[5]:

env = LoggerWrapper(env)
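
To make the idea of a logging wrapper concrete, here is a minimal sketch of the pattern. It is not Sinergym's LoggerWrapper: CsvLoggingWrapper and CountdownEnv are hypothetical names used only for illustration, and the wrapper is duck-typed rather than derived from gym.Wrapper so that the snippet runs with the standard library alone.

```python
import csv
import io


class CountdownEnv:
    """Hypothetical environment: the episode is truncated after `length` steps."""

    def __init__(self, length=3):
        self.length = length
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return [0.0], {}

    def step(self, action):
        self.t += 1
        obs = [float(self.t)]
        reward = -float(action)
        return obs, reward, False, self.t >= self.length, {}


class CsvLoggingWrapper:
    """Forwards reset/step to the wrapped env and records every transition."""

    def __init__(self, env, stream):
        self.env = env
        self.writer = csv.writer(stream)
        self.writer.writerow(['timestep', 'action', 'observation', 'reward'])
        self.timestep = 0

    def reset(self, **kwargs):
        self.timestep = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.timestep += 1
        self.writer.writerow([self.timestep, action, obs[0], reward])
        return obs, reward, terminated, truncated, info


buffer = io.StringIO()  # in-memory file; a real wrapper would write to disk
env = CsvLoggingWrapper(CountdownEnv(length=3), buffer)
env.reset()
done = False
while not done:
    _, _, terminated, truncated, _ = env.step(1)
    done = terminated or truncated
rows = buffer.getvalue().strip().splitlines()
print(rows)
```

Because the wrapper exposes the same reset/step methods as the environment it wraps, the agent does not need to know that logging is taking place.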

At this point, we have the environment set up and ready to be used. We are going to create our learning model (Stable Baselines 3 DQN), but we can use any other algorithm.

[6]:

model = DQN('MlpPolicy', env, verbose=1)

Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.

Now we need to calculate the number of timesteps in each episode, which is used to set the evaluation frequency. Evaluation runs the current model for a given number of episodes to decide whether it is the best version of the model at that point of the training. The generated output will be stored on the wandb server too.

[7]:

n_timesteps_episode = env.simulator._eplus_one_epi_len / \
                      env.simulator._eplus_run_stepsize
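
As a quick sanity check on this division: assuming a one-year episode (365 days) and a simulation step of 900 seconds, i.e. four steps per hour, the arithmetic works out as follows. Both values are assumptions, chosen because they reproduce the ep_length of 35040 reported in the training log later in this section; they are not read from the simulator here.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed values, not read from the simulator.
episode_length_s = 365 * SECONDS_PER_DAY  # one non-leap year
step_size_s = 15 * 60                     # 900-second step

timesteps_per_episode = episode_length_s / step_size_s
print(timesteps_per_episode)  # 35040.0

# True division returns a float; cast if an integer count is needed.
print(int(timesteps_per_episode))  # 35040
```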

We are going to use the LoggerEvalCallback to print and save the best model evaluated during training.

[8]:

callbacks = []

# Set up Evaluation and saving best model
eval_callback = LoggerEvalCallback(
    eval_env,
    best_model_save_path=eval_env.simulator._env_working_dir_parent +
    '/best_model/',
    log_path=eval_env.simulator._env_working_dir_parent +
    '/best_model/',
    eval_freq=n_timesteps_episode * 2,
    deterministic=True,
    render=False,
    n_eval_episodes=1)
callbacks.append(eval_callback)

callback = CallbackList(callbacks)
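
The "keep the best model" behaviour can be pictured with the following sketch. It is a simplified illustration of the general pattern, not the code of LoggerEvalCallback; BestModelTracker is a hypothetical name, and the two rewards are the evaluation results that appear in the training log further down.

```python
class BestModelTracker:
    """Hypothetical helper: remembers the highest mean evaluation reward."""

    def __init__(self):
        self.best_mean_reward = float('-inf')
        self.best_timestep = None

    def update(self, timestep, mean_reward):
        """Return True when this evaluation beats every previous one."""
        if mean_reward > self.best_mean_reward:
            self.best_mean_reward = mean_reward
            self.best_timestep = timestep
            return True  # a real callback would save the model here
        return False


tracker = BestModelTracker()
# (timestep, mean evaluation reward) pairs taken from the log in this section.
for timestep, reward in [(70080, -19618.27), (140160, -19774.41)]:
    improved = tracker.update(timestep, reward)
    print(timestep, reward, 'new best' if improved else 'not improved')
```

Only the first evaluation counts as an improvement, which matches the single "New best mean reward!" message in the log.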

In order to track the whole training process in wandb, it is necessary to create a callback with a compatible wandb output format (which calls the wandb log method during the learning process).

[9]:

# wandb logger and setting in SB3
logger = Logger(
    folder=None,
    output_formats=[
        HumanOutputFormat(
            sys.stdout,
            max_length=120),
        WandBOutputFormat()])
model.set_logger(logger)
# Append callback
log_callback = LoggerCallback()
callbacks.append(log_callback)


callback = CallbackList(callbacks)
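
For intuition, an SB3 output format is essentially an object with a write(key_values, key_excluded, step) method that the logger calls each time it dumps its values. The sketch below mimics that interface with a hypothetical ListOutputFormat that stores records in a list instead of sending them to wandb. It does not import SB3 and is not the implementation of WandBOutputFormat; the exclusion handling in particular is a simplified assumption.

```python
class ListOutputFormat:
    """Hypothetical writer with a write() signature like SB3 output formats."""

    def __init__(self, name='list'):
        self.name = name
        self.records = []

    def write(self, key_values, key_excluded, step=0):
        # Skip keys that were explicitly excluded for this format name.
        kept = {
            key: value
            for key, value in key_values.items()
            if self.name not in (key_excluded.get(key) or ())
        }
        self.records.append({'step': step, **kept})

    def close(self):
        pass  # nothing to release for an in-memory list


fmt = ListOutputFormat()
fmt.write(
    {'rollout/ep_rew_mean': -2.05e4, 'time/fps': 1957},
    {'rollout/ep_rew_mean': None, 'time/fps': ('list',)},
    step=35040,
)
print(fmt.records)
```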

This is the total number of timesteps for the training.

[10]:

timesteps = episodes * n_timesteps_episode
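
Putting the numbers together: with 4 training episodes of 35040 steps each (the episode length reported in the log below) and an evaluation every two episodes, the training budget and the evaluation points can be worked out as in this short sketch. It only reproduces the arithmetic and does not call SB3.

```python
episodes = 4
n_timesteps_episode = 35040  # episode length taken from the training log
eval_freq = 2 * n_timesteps_episode

timesteps = episodes * n_timesteps_episode
eval_points = list(range(eval_freq, timesteps + 1, eval_freq))

print(timesteps)    # 140160
print(eval_points)  # [70080, 140160]
```

These two evaluation points are the same num_timesteps values that appear in the "Eval" lines of the log.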

Now it is time to train the model with the callbacks defined earlier. This may take a few minutes, depending on your computer.

[11]:

model.learn(
    total_timesteps=timesteps,
    callback=callback,
    log_interval=1)

[2023-05-26 08:32:44,713] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:32:44,906] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32-res1/Eplus-env-sub_run1

/usr/local/lib/python3.10/dist-packages/opyplus/weather_data/weather_data.py:493: FutureWarning: the 'line_terminator'' keyword is deprecated, use 'lineterminator' instead.
  epw_content = self._headers_to_epw(use_datetimes=use_datetimes) + df.to_csv(

[2023-05-26 08:33:02,051] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:33:02,052] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:33:02,143] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32-res1/Eplus-env-sub_run2
------------------------------------------------------------------------------------
| action/                                                             |            |
|    Cooling_Setpoint_RL                                              | 25.5       |
|    Heating_Setpoint_RL                                              | 19.1       |
| action_simulation/                                                  |            |
|    Cooling_Setpoint_RL                                              | 25.5       |
|    Heating_Setpoint_RL                                              | 19.1       |
| episode/                                                            |            |
|    comfort_violation_time(%)                                        | 49.8       |
|    cumulative_comfort_penalty                                       | -2.47e+04  |
|    cumulative_power                                                 | 1.64e+08   |
|    cumulative_power_penalty                                         | -1.64e+04  |
|    cumulative_reward                                                | -20503.81  |
|    ep_length                                                        | 35040      |
|    mean_comfort_penalty                                             | -0.703     |
|    mean_power                                                       | 4.67e+03   |
|    mean_power_penalty                                               | -0.467     |
|    mean_reward                                                      | -0.5851544 |
| observation/                                                        |            |
|    Facility Total HVAC Electricity Demand Rate(Whole Building)      | 4.67e+03   |
|    People Air Temperature(SPACE1-1 PEOPLE 1)                        | 21.6       |
|    Site Diffuse Solar Radiation Rate per Area(Environment)          | 73.6       |
|    Site Direct Solar Radiation Rate per Area(Environment)           | 115        |
|    Site Outdoor Air Drybulb Temperature(Environment)                | 11.2       |
|    Site Outdoor Air Relative Humidity(Environment)                  | 71.3       |
|    Site Wind Direction(Environment)                                 | 195        |
|    Site Wind Speed(Environment)                                     | 3.87       |
|    Zone Air Relative Humidity(SPACE1-1)                             | 41         |
|    Zone Air Temperature(SPACE1-1)                                   | 21.6       |
|    Zone People Occupant Count(SPACE1-1)                             | 3.18       |
|    Zone Thermal Comfort Clothing Value(SPACE1-1 PEOPLE 1)           | 0.662      |
|    Zone Thermal Comfort Fanger Model PPD(SPACE1-1 PEOPLE 1)         | 29.8       |
|    Zone Thermal Comfort Mean Radiant Temperature(SPACE1-1 PEOPLE 1) | 21.7       |
|    Zone Thermostat Cooling Setpoint Temperature(SPACE1-1)           | 25.5       |
|    Zone Thermostat Heating Setpoint Temperature(SPACE1-1)           | 19.1       |
|    day                                                              | 15.7       |
|    hour                                                             | 11.5       |
|    month                                                            | 6.53       |
|    year                                                             | 1.99e+03   |
| rollout/                                                            |            |
|    ep_len_mean                                                      | 3.5e+04    |
|    ep_rew_mean                                                      | -2.05e+04  |
|    exploration_rate                                                 | 0.05       |
| time/                                                               |            |
|    episodes                                                         | 1          |
|    fps                                                              | 1957       |
|    time_elapsed                                                     | 17         |
|    total_timesteps                                                  | 35040      |
------------------------------------------------------------------------------------
[2023-05-26 08:33:27,550] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:33:27,551] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:33:27,648] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32-res1/Eplus-env-sub_run3
[2023-05-26 08:33:28,106] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:33:28,211] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION-res1/Eplus-env-sub_run1
[2023-05-26 08:33:42,883] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:33:42,885] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:33:43,002] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION-res1/Eplus-env-sub_run2
Eval num_timesteps=70080, episode_reward=-19618.27 +/- 0.00
Episode length: 35040.00 +/- 0.00
New best mean reward!
------------------------------------------------------------------------------------
| action/                                                             |            |
|    Cooling_Setpoint_RL                                              | 23.8       |
|    Heating_Setpoint_RL                                              | 20.5       |
| action_simulation/                                                  |            |
|    Cooling_Setpoint_RL                                              | 23.8       |
|    Heating_Setpoint_RL                                              | 20.5       |
| episode/                                                            |            |
|    comfort_violation_time(%)                                        | 46.8       |
|    cumulative_comfort_penalty                                       | -1.84e+04  |
|    cumulative_power                                                 | 1.99e+08   |
|    cumulative_power_penalty                                         | -1.99e+04  |
|    cumulative_reward                                                | -19159.918 |
|    ep_length                                                        | 35040      |
|    mean_comfort_penalty                                             | -0.526     |
|    mean_power                                                       | 5.68e+03   |
|    mean_power_penalty                                               | -0.568     |
|    mean_reward                                                      | -0.5468013 |
| eval/                                                               |            |
|    comfort_penalty                                                  | -1.6e+04   |
|    comfort_violation(%)                                             | 34.4       |
|    mean_ep_length                                                   | 3.5e+04    |
|    mean_power_consumption                                           | 2.32e+08   |
|    mean_rewards                                                     | -19618.273 |
|    power_penalty                                                    | -2.32e+04  |
|    std_rewards                                                      | 0.0        |
| observation/                                                        |            |
|    Facility Total HVAC Electricity Demand Rate(Whole Building)      | 5.68e+03   |
|    People Air Temperature(SPACE1-1 PEOPLE 1)                        | 21.8       |
|    Site Diffuse Solar Radiation Rate per Area(Environment)          | 73.6       |
|    Site Direct Solar Radiation Rate per Area(Environment)           | 115        |
|    Site Outdoor Air Drybulb Temperature(Environment)                | 11.2       |
|    Site Outdoor Air Relative Humidity(Environment)                  | 71.3       |
|    Site Wind Direction(Environment)                                 | 195        |
|    Site Wind Speed(Environment)                                     | 3.87       |
|    Zone Air Relative Humidity(SPACE1-1)                             | 39.6       |
|    Zone Air Temperature(SPACE1-1)                                   | 21.8       |
|    Zone People Occupant Count(SPACE1-1)                             | 3.18       |
|    Zone Thermal Comfort Clothing Value(SPACE1-1 PEOPLE 1)           | 0.662      |
|    Zone Thermal Comfort Fanger Model PPD(SPACE1-1 PEOPLE 1)         | 26.2       |
|    Zone Thermal Comfort Mean Radiant Temperature(SPACE1-1 PEOPLE 1) | 21.8       |
|    Zone Thermostat Cooling Setpoint Temperature(SPACE1-1)           | 23.8       |
|    Zone Thermostat Heating Setpoint Temperature(SPACE1-1)           | 20.5       |
|    day                                                              | 15.7       |
|    hour                                                             | 11.5       |
|    month                                                            | 6.53       |
|    year                                                             | 1.99e+03   |
| rollout/                                                            |            |
|    ep_len_mean                                                      | 3.5e+04    |
|    ep_rew_mean                                                      | -1.98e+04  |
|    exploration_rate                                                 | 0.05       |
| time/                                                               |            |
|    episodes                                                         | 2          |
|    fps                                                              | 1188       |
|    time_elapsed                                                     | 58         |
|    total_timesteps                                                  | 70080      |
| train/                                                              |            |
|    learning_rate                                                    | 0.0001     |
|    loss                                                             | 28.3       |
|    n_updates                                                        | 5019       |
------------------------------------------------------------------------------------
[2023-05-26 08:34:19,312] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:34:19,313] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:34:19,429] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32-res1/Eplus-env-sub_run4
-------------------------------------------------------------------------------------
| action/                                                             |             |
|    Cooling_Setpoint_RL                                              | 23.5        |
|    Heating_Setpoint_RL                                              | 20.8        |
| action_simulation/                                                  |             |
|    Cooling_Setpoint_RL                                              | 23.5        |
|    Heating_Setpoint_RL                                              | 20.8        |
| episode/                                                            |             |
|    comfort_violation_time(%)                                        | 34.9        |
|    cumulative_comfort_penalty                                       | -1.54e+04   |
|    cumulative_power                                                 | 2.2e+08     |
|    cumulative_power_penalty                                         | -2.2e+04    |
|    cumulative_reward                                                | -18666.184  |
|    ep_length                                                        | 35040       |
|    mean_comfort_penalty                                             | -0.438      |
|    mean_power                                                       | 6.27e+03    |
|    mean_power_penalty                                               | -0.627      |
|    mean_reward                                                      | -0.53271073 |
| observation/                                                        |             |
|    Facility Total HVAC Electricity Demand Rate(Whole Building)      | 6.27e+03    |
|    People Air Temperature(SPACE1-1 PEOPLE 1)                        | 22.1        |
|    Site Diffuse Solar Radiation Rate per Area(Environment)          | 73.6        |
|    Site Direct Solar Radiation Rate per Area(Environment)           | 115         |
|    Site Outdoor Air Drybulb Temperature(Environment)                | 11.2        |
|    Site Outdoor Air Relative Humidity(Environment)                  | 71.3        |
|    Site Wind Direction(Environment)                                 | 195         |
|    Site Wind Speed(Environment)                                     | 3.87        |
|    Zone Air Relative Humidity(SPACE1-1)                             | 39.4        |
|    Zone Air Temperature(SPACE1-1)                                   | 22.1        |
|    Zone People Occupant Count(SPACE1-1)                             | 3.18        |
|    Zone Thermal Comfort Clothing Value(SPACE1-1 PEOPLE 1)           | 0.662       |
|    Zone Thermal Comfort Fanger Model PPD(SPACE1-1 PEOPLE 1)         | 23          |
|    Zone Thermal Comfort Mean Radiant Temperature(SPACE1-1 PEOPLE 1) | 22.1        |
|    Zone Thermostat Cooling Setpoint Temperature(SPACE1-1)           | 23.5        |
|    Zone Thermostat Heating Setpoint Temperature(SPACE1-1)           | 20.8        |
|    day                                                              | 15.7        |
|    hour                                                             | 11.5        |
|    month                                                            | 6.53        |
|    year                                                             | 1.99e+03    |
| rollout/                                                            |             |
|    ep_len_mean                                                      | 3.5e+04     |
|    ep_rew_mean                                                      | -1.94e+04   |
|    exploration_rate                                                 | 0.05        |
| time/                                                               |             |
|    episodes                                                         | 3           |
|    fps                                                              | 1101        |
|    time_elapsed                                                     | 95          |
|    total_timesteps                                                  | 105120      |
| train/                                                              |             |
|    learning_rate                                                    | 0.0001      |
|    loss                                                             | 7.99        |
|    n_updates                                                        | 13779       |
-------------------------------------------------------------------------------------
[2023-05-26 08:35:01,083] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:35:01,084] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:35:01,206] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32-res1/Eplus-env-sub_run5
[2023-05-26 08:35:06,157] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:35:06,159] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:35:06,283] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION-res1/Eplus-env-sub_run3
[2023-05-26 08:35:21,151] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:35:21,152] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:35:21,253] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION-res1/Eplus-env-sub_run4
Eval num_timesteps=140160, episode_reward=-19774.41 +/- 0.00
Episode length: 35040.00 +/- 0.00
------------------------------------------------------------------------------------
| action/                                                             |            |
|    Cooling_Setpoint_RL                                              | 23.9       |
|    Heating_Setpoint_RL                                              | 20.5       |
| action_simulation/                                                  |            |
|    Cooling_Setpoint_RL                                              | 23.9       |
|    Heating_Setpoint_RL                                              | 20.5       |
| episode/                                                            |            |
|    comfort_violation_time(%)                                        | 35.5       |
|    cumulative_comfort_penalty                                       | -1.63e+04  |
|    cumulative_power                                                 | 2.17e+08   |
|    cumulative_power_penalty                                         | -2.17e+04  |
|    cumulative_reward                                                | -19023.477 |
|    ep_length                                                        | 35040      |
|    mean_comfort_penalty                                             | -0.466     |
|    mean_power                                                       | 6.2e+03    |
|    mean_power_penalty                                               | -0.62      |
|    mean_reward                                                      | -0.5429074 |
| eval/                                                               |            |
|    comfort_penalty                                                  | -1.85e+04  |
|    comfort_violation(%)                                             | 36.9       |
|    mean_ep_length                                                   | 3.5e+04    |
|    mean_power_consumption                                           | 2.1e+08    |
|    mean_rewards                                                     | -19774.412 |
|    power_penalty                                                    | -2.1e+04   |
|    std_rewards                                                      | 0.0        |
| observation/                                                        |            |
|    Facility Total HVAC Electricity Demand Rate(Whole Building)      | 6.2e+03    |
|    People Air Temperature(SPACE1-1 PEOPLE 1)                        | 22.1       |
|    Site Diffuse Solar Radiation Rate per Area(Environment)          | 73.6       |
|    Site Direct Solar Radiation Rate per Area(Environment)           | 115        |
|    Site Outdoor Air Drybulb Temperature(Environment)                | 11.2       |
|    Site Outdoor Air Relative Humidity(Environment)                  | 71.3       |
|    Site Wind Direction(Environment)                                 | 195        |
|    Site Wind Speed(Environment)                                     | 3.87       |
|    Zone Air Relative Humidity(SPACE1-1)                             | 39.5       |
|    Zone Air Temperature(SPACE1-1)                                   | 22.1       |
|    Zone People Occupant Count(SPACE1-1)                             | 3.18       |
|    Zone Thermal Comfort Clothing Value(SPACE1-1 PEOPLE 1)           | 0.662      |
|    Zone Thermal Comfort Fanger Model PPD(SPACE1-1 PEOPLE 1)         | 23.4       |
|    Zone Thermal Comfort Mean Radiant Temperature(SPACE1-1 PEOPLE 1) | 22.1       |
|    Zone Thermostat Cooling Setpoint Temperature(SPACE1-1)           | 23.9       |
|    Zone Thermostat Heating Setpoint Temperature(SPACE1-1)           | 20.5       |
|    day                                                              | 15.7       |
|    hour                                                             | 11.5       |
|    month                                                            | 6.53       |
|    year                                                             | 1.99e+03   |
| rollout/                                                            |            |
|    ep_len_mean                                                      | 3.5e+04    |
|    ep_rew_mean                                                      | -1.93e+04  |
|    exploration_rate                                                 | 0.05       |
| time/                                                               |            |
|    episodes                                                         | 4          |
|    fps                                                              | 892        |
|    time_elapsed                                                     | 157        |
|    total_timesteps                                                  | 140160     |
| train/                                                              |            |
|    learning_rate                                                    | 0.0001     |
|    loss                                                             | 14.1       |
|    n_updates                                                        | 22539      |
------------------------------------------------------------------------------------

[11]:

<stable_baselines3.dqn.dqn.DQN at 0x7fc699c89780>

Now we save the current model, that is, the version obtained when training has finished.

[12]:

model.save(env.simulator._env_working_dir_parent + '/' + experiment_name)

And as always, remember to close the environment.

[13]:

env.close()

/usr/local/lib/python3.10/dist-packages/numpy/core/fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:192: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:269: RuntimeWarning: Degrees of freedom <= 0 for slice
  ret = _var(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:226: RuntimeWarning: invalid value encountered in divide
  arrmean = um.true_divide(arrmean, div, out=arrmean,
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:261: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)

[2023-05-26 08:35:26,449] EPLUS_ENV_SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_MainThread_ROOT INFO:EnergyPlus simulation closed successfully.
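Closing the environment is easy to forget when training is interrupted by an exception. The following is a minimal sketch, not part of Sinergym, of how `contextlib.closing` from the standard library can guarantee the call to `close()`. `StandInEnv` and `run_and_close` are hypothetical names used only for illustration; any object exposing a `close()` method, such as a gymnasium environment, could take the place of the stand-in.

```python
from contextlib import closing


class StandInEnv:
    """Hypothetical stand-in for an environment; only close() matters here."""

    def __init__(self):
        self.closed = False

    def run(self, fail=False):
        # Simulates a training run that may be interrupted
        if fail:
            raise RuntimeError("training interrupted")
        return "finished"

    def close(self):
        self.closed = True


def run_and_close(env, fail=False):
    """Run the stand-in workload and close env even if it raises."""
    try:
        # closing() calls env.close() when the block exits, error or not
        with closing(env):
            return env.run(fail=fail)
    except RuntimeError as err:
        return f"error: {err}"


env_ok, env_bad = StandInEnv(), StandInEnv()
print(run_and_close(env_ok), env_ok.closed)
print(run_and_close(env_bad, fail=True), env_bad.closed)
```

In both cases the environment ends up closed, whether the run finishes normally or raises.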

We have to upload all the Sinergym output as a wandb artifact. This output includes the whole sinergym_output (with the LoggerWrapper CSV files) and the models generated in the training and evaluation episodes.

[14]:

artifact = wandb.Artifact(
    name="training",
    type="experiment1")
artifact.add_dir(
    env.simulator._env_working_dir_parent,
    name='training_output/')
artifact.add_dir(
    eval_env.simulator._env_working_dir_parent,
    name='evaluation_output/')
run.log_artifact(artifact)

# wandb has finished
run.finish()

wandb: Adding directory to artifact (/workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32-res1)... Done. 0.1s
wandb: Adding directory to artifact (/workspaces/sinergym/examples/Eplus-env-SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_EVALUATION-res1)... Done. 0.1s

Waiting for W&B process to finish... (success).

Run history:

action/Cooling_Setpoint_RL	█▂▁▂
action/Heating_Setpoint_RL	▁▇█▇
action_simulation/Cooling_Setpoint_RL	█▂▁▂
action_simulation/Heating_Setpoint_RL	▁▇█▇
episode/comfort_violation_time(%)	█▇▁▁
episode/cumulative_comfort_penalty	▁▆█▇
episode/cumulative_power	▁▅██
episode/cumulative_power_penalty	█▄▁▁
episode/cumulative_reward	▁▆█▇
episode/ep_length	▁▁▁▁
episode/mean_comfort_penalty	▁▆█▇
episode/mean_power	▁▅██
episode/mean_power_penalty	█▄▁▁
episode/mean_reward	▁▆█▇
eval/comfort_penalty	█▁
eval/comfort_violation(%)	▁█
eval/mean_ep_length	▁▁
eval/mean_power_consumption	█▁
eval/mean_rewards	█▁
eval/power_penalty	▁█
eval/std_rewards	▁▁
observation/Facility Total HVAC Electricity Demand Rate(Whole Building)	▁▅██
observation/People Air Temperature(SPACE1-1 PEOPLE 1)	▁▄██
observation/Site Diffuse Solar Radiation Rate per Area(Environment)	▁▁▁▁
observation/Site Direct Solar Radiation Rate per Area(Environment)	▁▁▁▁
observation/Site Outdoor Air Drybulb Temperature(Environment)	▁▁▁▁
observation/Site Outdoor Air Relative Humidity(Environment)	▁▁▁▁
observation/Site Wind Direction(Environment)	▁▁▁▁
observation/Site Wind Speed(Environment)	▁▁▁▁
observation/Zone Air Relative Humidity(SPACE1-1)	█▂▁▂
observation/Zone Air Temperature(SPACE1-1)	▁▄██
observation/Zone People Occupant Count(SPACE1-1)	▁▁▁▁
observation/Zone Thermal Comfort Clothing Value(SPACE1-1 PEOPLE 1)	▁▁▁▁
observation/Zone Thermal Comfort Fanger Model PPD(SPACE1-1 PEOPLE 1)	█▄▁▁
observation/Zone Thermal Comfort Mean Radiant Temperature(SPACE1-1 PEOPLE 1)	▁▃██
observation/Zone Thermostat Cooling Setpoint Temperature(SPACE1-1)	█▂▁▂
observation/Zone Thermostat Heating Setpoint Temperature(SPACE1-1)	▁▇█▇
observation/day	▁▁▁▁
observation/hour	▁▁▁▁
observation/month	▁▁▁▁
observation/year	▁▁▁▁
rollout/ep_len_mean	▁▁▁▁
rollout/ep_rew_mean	▁▅▇█
rollout/exploration_rate	▁▁▁▁
time/episodes	▁▃▆█
time/fps	█▃▂▁
time/time_elapsed	▁▃▅█
time/total_timesteps	▁▃▆█
train/learning_rate	▁▁▁
train/loss	█▁▃
train/n_updates	▁▅█

Run summary:

action/Cooling_Setpoint_RL	23.90848
action/Heating_Setpoint_RL	20.47078
action_simulation/Cooling_Setpoint_RL	23.90848
action_simulation/Heating_Setpoint_RL	20.47078
episode/comfort_violation_time(%)	35.45091
episode/cumulative_comfort_penalty	-16322.24555
episode/cumulative_power	217247078.16777
episode/cumulative_power_penalty	-21724.70782
episode/cumulative_reward	-19023.47656
episode/ep_length	35040
episode/mean_comfort_penalty	-0.46582
episode/mean_power	6199.97369
episode/mean_power_penalty	-0.62
episode/mean_reward	-0.54291
eval/comfort_penalty	-18519.07108
eval/comfort_violation(%)	36.88071
eval/mean_ep_length	35040.0
eval/mean_power_consumption	210295875.7211
eval/mean_rewards	-19774.41211
eval/power_penalty	-21029.58757
eval/std_rewards	0.0
observation/Facility Total HVAC Electricity Demand Rate(Whole Building)	6199.7935
observation/People Air Temperature(SPACE1-1 PEOPLE 1)	22.13362
observation/Site Diffuse Solar Radiation Rate per Area(Environment)	73.57191
observation/Site Direct Solar Radiation Rate per Area(Environment)	115.15166
observation/Site Outdoor Air Drybulb Temperature(Environment)	11.24046
observation/Site Outdoor Air Relative Humidity(Environment)	71.29315
observation/Site Wind Direction(Environment)	194.72517
observation/Site Wind Speed(Environment)	3.86764
observation/Zone Air Relative Humidity(SPACE1-1)	39.50717
observation/Zone Air Temperature(SPACE1-1)	22.13422
observation/Zone People Occupant Count(SPACE1-1)	3.17908
observation/Zone Thermal Comfort Clothing Value(SPACE1-1 PEOPLE 1)	0.66228
observation/Zone Thermal Comfort Fanger Model PPD(SPACE1-1 PEOPLE 1)	23.38245
observation/Zone Thermal Comfort Mean Radiant Temperature(SPACE1-1 PEOPLE 1)	22.07272
observation/Zone Thermostat Cooling Setpoint Temperature(SPACE1-1)	23.90856
observation/Zone Thermostat Heating Setpoint Temperature(SPACE1-1)	20.47075
observation/day	15.72055
observation/hour	11.5
observation/month	6.52603
observation/year	1991.0
rollout/ep_len_mean	35040.0
rollout/ep_rew_mean	-19338.34824
rollout/exploration_rate	0.05
time/episodes	4
time/fps	892
time/time_elapsed	157
time/total_timesteps	140160
train/learning_rate	0.0001
train/loss	14.10438
train/n_updates	22539

View run SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_g5fey5jp at: https://wandb.ai/alex_ugr/sinergym/runs/12zkrpfr
Synced 5 W&B file(s), 0 media file(s), 180 artifact file(s) and 0 other file(s)

Find logs at: ./wandb/run-20230526_083242-12zkrpfr/logs
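Before uploading, it can be useful to check how many files a directory contains, since the whole directory tree is added to the artifact. The sketch below uses only the standard library. `preview_artifact_dir` is a hypothetical helper, not a Sinergym or wandb function, and the file names in the demonstration are placeholders.

```python
import tempfile
from pathlib import Path


def preview_artifact_dir(directory):
    """Return (file_count, total_bytes) for all files under directory."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f'{root} is not a directory')
    # rglob('*') walks the tree recursively; keep regular files only
    files = [p for p in root.rglob('*') if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Demonstration on a temporary directory with placeholder files
with tempfile.TemporaryDirectory() as tmp:
    sub_run = Path(tmp) / 'sub_run1'
    sub_run.mkdir()
    (sub_run / 'monitor.csv').write_text('a,b\n1,2\n')
    (Path(tmp) / 'progress.csv').write_text('x\n')
    file_count, total_bytes = preview_artifact_dir(tmp)
    print(file_count, 'files,', total_bytes, 'bytes')
```

In the notebook, the same helper could be pointed at `env.simulator._env_working_dir_parent` before calling `artifact.add_dir`.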

We have all the experiment results on our local computer, but we can also inspect the execution in wandb:

If we check our projects, we can see the run listed:

Hyperparameters tracked in the training experiment:

Artifacts registered (if evaluation is enabled, the best model is registered too):

Visualization of metrics in real time:

21.2. Loading a model

We will rely on the load_agent.py script available in the repository root. This script exposes the options Sinergym offers for working with loaded deep reinforcement learning models as parameters, so the loading options can be defined easily in a JSON file passed when the script is executed.

For more information about how to run load_agent.py, please see Load a trained model.

First we define the Sinergym environment ID where we want to check the loaded agent and the name of the evaluation experiment.

[15]:

# Environment ID
environment = "Eplus-demo-v1"
# Episodes
episodes = 5
# Evaluation name
evaluation_date = datetime.today().strftime('%Y-%m-%d_%H:%M')
evaluation_name = 'SB3_DQN-EVAL-' + environment + \
    '-episodes-' + str(episodes)
evaluation_name += '_' + evaluation_date
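The naming convention used in the cell above can be wrapped in a small function so that training and evaluation runs are named consistently. This is only a sketch: `build_run_name` is a hypothetical helper, not part of Sinergym, and the fixed timestamp is used solely to make the example reproducible.

```python
from datetime import datetime


def build_run_name(prefix, environment, episodes, when=None):
    """Compose '<prefix>-<environment>-episodes-<n>_<YYYY-MM-DD_HH:MM>'."""
    when = when or datetime.today()
    stamp = when.strftime('%Y-%m-%d_%H:%M')
    return f'{prefix}-{environment}-episodes-{episodes}_{stamp}'


# Fixed timestamp for a reproducible example; omit it to use the current time
name = build_run_name('SB3_DQN-EVAL', 'Eplus-demo-v1', 5,
                      when=datetime(2023, 5, 26, 8, 35))
print(name)
```

Note that the resulting name contains a colon from the time stamp. Since the name is also used for output directories, a different time format may be preferable on file systems that do not accept that character.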

We can also use wandb here. We can register the evaluation of a loaded model in a different project so that it does not get mixed with the training experiments.

[16]:

# Create wandb.config object in order to log all experiment params
experiment_params = {
    'sinergym-version': sinergym.__version__,
    'python-version': sys.version
}
experiment_params.update({'environment':environment,
                          'episodes':episodes,
                          'algorithm':'SB3_DQN'})

# Get wandb init params (you have to specify your own project and entity)
wandb_params = {"project": 'sinergym_evaluations',
                "entity": 'alex_ugr'}
# Init wandb entry
run = wandb.init(
    name=experiment_name + '_' + wandb.util.generate_id(),
    config=experiment_params,
    **wandb_params
)

wandb version 0.15.3 is available! To upgrade, please run: $ pip install wandb --upgrade

Tracking run with wandb version 0.15.2

Run data is saved locally in /workspaces/sinergym/examples/wandb/run-20230526_083542-raly3l0r

Syncing run SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_awf23a8c to Weights & Biases (docs)

View project at https://wandb.ai/alex_ugr/sinergym_evaluations

View run at https://wandb.ai/alex_ugr/sinergym_evaluations/runs/raly3l0r

We create the gymnasium environment and wrap it with LoggerWrapper. We can use the evaluation experiment name to rename the environment.

[17]:

env = gym.make(environment, env_name=evaluation_name)
env = LoggerWrapper(env)

[2023-05-26 08:35:44,274] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Updating Building model ExternalInterface object if it is not present...
[2023-05-26 08:35:44,275] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Updating Building model Site:Location and SizingPeriod:DesignDay(s) to weather and ddy file...
[2023-05-26 08:35:44,277] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Updating building model OutPut:Variable and variables XML tree model for BVCTB connection.
[2023-05-26 08:35:44,277] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Setting up extra configuration in building model if exists...
[2023-05-26 08:35:44,278] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Setting up action definition in building model if exists...

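We load the Stable Baselines 3 DQN model. In this example the model file is downloaded from the wandb artifact registered by the previous training experiment and then loaded from the downloaded local copy. A model already stored on our local computer could be loaded directly with DQN.load instead.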

[18]:

# get wandb artifact path (to load model)
load_artifact_entity = 'alex_ugr'
load_artifact_project = 'sinergym'
load_artifact_name = 'training'
load_artifact_tag = 'latest'
load_artifact_model_path = 'evaluation_output/best_model/model.zip'
wandb_path = load_artifact_entity + '/' + load_artifact_project + \
    '/' + load_artifact_name + ':' + load_artifact_tag
# Download artifact
artifact = run.use_artifact(wandb_path)
artifact.get_path(load_artifact_model_path).download('.')
# Set model path to local wandb file downloaded
model_path = './' + load_artifact_model_path
model = DQN.load(model_path)
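The artifact reference built in the cell above follows the pattern entity/project/name:tag. A minimal sketch of a helper that assembles and validates such a reference is shown below. `artifact_reference` is a hypothetical function written for illustration, and the validation rule (rejecting empty parts and parts containing `/` or `:`) is an assumption chosen to keep the pattern unambiguous, not a rule taken from wandb.

```python
def artifact_reference(entity, project, name, tag='latest'):
    """Build an 'entity/project/name:tag' reference string."""
    parts = {'entity': entity, 'project': project, 'name': name, 'tag': tag}
    for key, value in parts.items():
        # Assumed rule: separators inside a part would make the path ambiguous
        if not value or '/' in value or ':' in value:
            raise ValueError(f'invalid {key}: {value!r}')
    return f'{entity}/{project}/{name}:{tag}'


# Same values as in the notebook cell
wandb_path = artifact_reference('alex_ugr', 'sinergym', 'training')
print(wandb_path)
```

The returned string is what the notebook passes to `run.use_artifact`.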

As we can see, the wandb model we want to load can come from an artifact belonging to a different entity or project from the one used to register the evaluation of the loaded model, as long as it is accessible. The next step is to use the model to predict actions and interact with the environment in order to collect data to evaluate the model.

[19]:

for i in range(episodes):
    obs, info = env.reset()
    rewards = []
    terminated = truncated = False
    current_month = 0
    # An episode ends when it is either terminated or truncated
    while not (terminated or truncated):
        # Ask the loaded model for an action and apply it
        a, _ = model.predict(obs)
        obs, reward, terminated, truncated, info = env.step(a)
        rewards.append(reward)
        # Print the cumulative reward each time the month changes
        if info['month'] != current_month:
            current_month = info['month']
            print(info['month'], sum(rewards))
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
env.close()

[2023-05-26 08:35:45,677] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:35:45,812] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35-res1/Eplus-env-sub_run1

/usr/local/lib/python3.10/dist-packages/opyplus/weather_data/weather_data.py:493: FutureWarning: the 'line_terminator'' keyword is deprecated, use 'lineterminator' instead.
  epw_content = self._headers_to_epw(use_datetimes=use_datetimes) + df.to_csv(

1 -0.830348444547302
2 -1730.6829453012092
3 -3545.7671945926318
4 -4692.961008551935
5 -5469.236536607371
6 -6235.107970886588
7 -8528.284908692733
8 -10984.995142143025
9 -13428.543608769201
10 -15618.815441406427
11 -16695.779938547083
12 -17668.211895692
1 -19226.671112311986
Episode  0 Mean reward:  -0.5487063673605038 Cumulative reward:  -19226.671112311986
[2023-05-26 08:36:03,143] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:36:03,144] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:36:03,257] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35-res1/Eplus-env-sub_run2
1 -0.830348444547302
2 -1747.1947558564166
3 -3579.724841715917
4 -4733.712733481687
5 -5506.35487121334
6 -6267.90523650068
7 -8539.592252228094
8 -10997.893085195392
9 -13418.111770139198
10 -15604.56401582258
11 -16674.272787829606
12 -17645.406855033856
1 -19198.51769070178
Episode  1 Mean reward:  -0.5479029021319058 Cumulative reward:  -19198.51769070178
[2023-05-26 08:36:16,890] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:36:16,891] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:36:17,007] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35-res1/Eplus-env-sub_run3
1 -0.830348444547302
2 -1735.8569049386824
3 -3568.8932851123354
4 -4717.441021523517
5 -5482.645461224451
6 -6237.297471848277
7 -8529.04871666093
8 -11031.1711091575
9 -13442.477955723938
10 -15628.77803902472
11 -16692.899604523893
12 -17666.32788194328
1 -19220.932640063565
Episode  2 Mean reward:  -0.5485425981753338 Cumulative reward:  -19220.932640063565
[2023-05-26 08:36:31,292] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:36:31,293] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:36:31,406] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35-res1/Eplus-env-sub_run4
1 -0.830348444547302
2 -1726.736947340626
3 -3555.194956363846
4 -4700.299221364219
5 -5481.466016498881
6 -6242.313422911042
7 -8528.844295105855
8 -10989.006549771846
9 -13341.503460272028
10 -15538.908847744931
11 -16610.7968756126
12 -17590.823164498117
1 -19146.143077620738
Episode  3 Mean reward:  -0.5464081928544774 Cumulative reward:  -19146.143077620738
[2023-05-26 08:36:44,843] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus episode completed successfully.
[2023-05-26 08:36:44,844] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:Creating new EnergyPlus simulation episode...
[2023-05-26 08:36:44,946] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus working directory is in /workspaces/sinergym/examples/Eplus-env-SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35-res1/Eplus-env-sub_run5
1 -0.830348444547302
2 -1736.656630362896
3 -3548.611853205083
4 -4702.086039366923
5 -5485.046598736376
6 -6250.707072069005
7 -8572.283180745597
8 -11071.885206430767
9 -13520.062110635592
10 -15696.123332977162
11 -16759.107587250033
12 -17730.209753990493
1 -19286.618094543086
Episode  4 Mean reward:  -0.5504171830634469 Cumulative reward:  -19286.618094543086
[2023-05-26 08:36:58,137] EPLUS_ENV_SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35_MainThread_ROOT INFO:EnergyPlus simulation closed successfully.
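The per-episode lines printed above can be summarized with a few lines of standard-library Python. The sketch below copies the mean and cumulative rewards from that output. The variable names are illustrative only. Dividing the cumulative reward by the mean reward recovers the number of steps in each episode, which works as a quick sanity check against the episode length of 35040 reported during training.

```python
from statistics import mean, stdev

# (mean reward, cumulative reward) per episode, copied from the output above
episodes_log = [
    (-0.5487063673605038, -19226.671112311986),
    (-0.5479029021319058, -19198.51769070178),
    (-0.5485425981753338, -19220.932640063565),
    (-0.5464081928544774, -19146.143077620738),
    (-0.5504171830634469, -19286.618094543086),
]

cumulative = [c for _, c in episodes_log]
# cumulative = mean * steps, so the ratio gives the steps per episode
steps = [round(c / m) for m, c in episodes_log]

print('Steps per episode:', steps)
print('Mean cumulative reward:', round(mean(cumulative), 2))
print('Std of cumulative reward:', round(stdev(cumulative), 2))
```

Because `model.predict` is called without `deterministic=True`, the cumulative rewards differ slightly between episodes even though every episode simulates the same year.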

Finally, we register the evaluation data in wandb as an artifact to save it.

[20]:

artifact = wandb.Artifact(
    name="evaluating",
    type="evaluation1")
artifact.add_dir(
    env.simulator._env_working_dir_parent,
    name='evaluation_output/')

run.log_artifact(artifact)

# wandb has finished
run.finish()

wandb: Adding directory to artifact (/workspaces/sinergym/examples/Eplus-env-SB3_DQN-EVAL-Eplus-demo-v1-episodes-5_2023-05-26_08:35-res1)... Done. 0.2s

Waiting for W&B process to finish... (success).

View run SB3_DQN-Eplus-demo-v1-episodes-4_2023-05-26_08:32_awf23a8c at: https://wandb.ai/alex_ugr/sinergym_evaluations/runs/raly3l0r
Synced 5 W&B file(s), 0 media file(s), 101 artifact file(s) and 0 other file(s)

Find logs at: ./wandb/run-20230526_083542-raly3l0r/logs

We have the loaded model results on our local computer, but we can also inspect the execution in wandb:

If we check the wandb project list, we can see that the sinergym_evaluations project has a new run:

Hyperparameters tracked in the evaluation experiment, together with the previous training artifact used to load the model:

Artifact registered with the Sinergym output (and the CSV files generated by the LoggerWrapper):