2. Usage example

If you used our Dockerfile during installation, the try_env.py file should already be in your workspace when you enter the container. If you installed everything directly on your local machine, place it inside the cloned repository. Either way, we assume you have a terminal with the appropriate Python version and a working Sinergym installation.

Let’s start with the simplest use case for Sinergym. The repository root contains the script try_env.py:

import gymnasium as gym
import numpy as np
from gymnasium.wrappers.normalize import NormalizeReward

import sinergym
from sinergym.utils.wrappers import (LoggerWrapper, NormalizeAction,
                                     NormalizeObservation)

env = gym.make('Eplus-demo-v1')
env = NormalizeAction(env)
env = NormalizeObservation(env)
env = NormalizeReward(env)
env = LoggerWrapper(env)

for i in range(1):
    obs, info = env.reset()
    rewards = []
    terminated = truncated = False
    current_month = 0
    # stop when the episode either terminates or is truncated
    while not (terminated or truncated):
        a = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:  # display results every month
            current_month = info['month']
            print('Reward: ', sum(rewards), info)
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
env.close()

At first glance, it may appear that Sinergym is imported but never used. However, importing Sinergym registers all of its environments so that they can be created with gym.make. In this case, Eplus-demo-v1 becomes available with all its features.

We create our environment with gym.make and run the simulation for one episode (for i in range(1)). We collect the rewards returned by the environment and print their running total each time the simulated month changes. At the end of the episode, we print the mean and cumulative reward.
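If a separate average per simulated month is wanted instead of a running total, the rewards can be grouped by month. The following is a minimal sketch that assumes each step yields a (month, reward) pair, for instance built from info['month'] and reward in the loop above. The function name monthly_mean_rewards and the sample data are made up for illustration.

```python
from collections import defaultdict


def monthly_mean_rewards(steps):
    """Group (month, reward) pairs by month and return {month: mean reward}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for month, reward in steps:
        totals[month] += reward
        counts[month] += 1
    return {month: totals[month] / counts[month] for month in totals}


# Made-up sample data standing in for values collected during an episode.
sample = [(1, -1.0), (1, -3.0), (2, -2.0)]
print(monthly_mean_rewards(sample))
```

In the script, the pairs could be collected by appending (info['month'], reward) to a list at each step and passing that list to the function once the episode ends.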

The action taken at each step is sampled at random from the environment's action space, which is defined following the Gymnasium standard. Once the episode has finished and the results have been displayed, we close the environment with env.close().

Important

This is the simplest usage example. More functionality is demonstrated in the Examples section.