2. Usage Example

If you used our Dockerfile for installation, the try_env.py file should already be in your workspace. If you installed everything directly on your local machine, make sure this file is placed inside your clone of the repository. In either case, you should have a terminal ready with the appropriate Python version and a working Sinergym installation.
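As a quick sanity check, opening a Python interpreter and importing the packages should succeed without errors (a minimal sketch, not part of the repository):

import gymnasium as gym

import sinergym

# If both imports succeed, the installation is usable. Importing
# sinergym also registers its environments (more on this below).
print('Gymnasium version:', gym.__version__)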

We’ll begin with the most straightforward use case for Sinergym. In the root of the repository, you’ll find the script try_env.py:

import logging

import gymnasium as gym
import numpy as np
from gymnasium.wrappers.normalize import NormalizeReward

import sinergym
from sinergym.utils.logger import TerminalLogger
from sinergym.utils.wrappers import (LoggerWrapper, NormalizeAction,
                                     NormalizeObservation)

# Optional: terminal logging in the same format as Sinergym's.
# The logger.info calls below can be replaced by print.
terminal_logger = TerminalLogger()
logger = terminal_logger.getLogger(
    name='MAIN',
    level=logging.INFO
)

# Create the environment and apply the wrapper stack: normalized
# actions, observations and rewards, plus Sinergym's logger
env = gym.make('Eplus-demo-v1')
env = NormalizeAction(env)
env = NormalizeObservation(env)
env = NormalizeReward(env)
env = LoggerWrapper(env)

# Execute interactions during 1 episode
for i in range(1):
    # Reset the environment to start a new episode
    obs, info = env.reset()
    rewards = []
    truncated = terminated = False
    current_month = 0
    while not (terminated or truncated):
        # Random action control
        a = env.action_space.sample()
        # Read observation and reward
        obs, reward, terminated, truncated, info = env.step(a)
        rewards.append(reward)
        # If this timestep starts a new month, display partial results
        if info['month'] != current_month:
            current_month = info['month']
            # Print information
            logger.info('Reward: {}'.format(sum(rewards)))
            logger.info('Info: {}'.format(info))
    # Log final episode results
    logger.info('Episode {} - Mean reward: {} - Cumulative Reward: {}'.format(
        i, np.mean(rewards), sum(rewards)))
env.close()

At first glance, it might seem that Sinergym is imported but never used. However, importing Sinergym registers all of its environments with Gymnasium, which is why Eplus-demo-v1 is available to gym.make with all its features.
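For illustration, one way to list the environments that become available after the import is to query Gymnasium's registry (a sketch, not part of try_env.py; it assumes all Sinergym environment ids start with the Eplus prefix, as in the registered names):

import gymnasium as gym

import sinergym  # registers all Eplus-* environments on import

eplus_envs = [env_id for env_id in gym.envs.registry if env_id.startswith('Eplus')]
print(eplus_envs)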

We instantiate the environment with gym.make and run the simulation for a single episode (for i in range(1)). The reward returned at each step is collected; the running total is logged whenever a new month begins, and the mean and cumulative rewards are reported once the episode ends.
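Note that sum(rewards) in the monthly log is the reward accumulated since the start of the episode, not just for that month. If per-month totals are preferred, the accumulator can be reset after each log. A minimal sketch of that variation, assuming (as the script above does) that the info dictionary exposes a 'month' key:

import gymnasium as gym

import sinergym

env = gym.make('Eplus-demo-v1')
obs, info = env.reset()
terminated = truncated = False
current_month = info.get('month', 0)
monthly_rewards = []
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    monthly_rewards.append(reward)
    if info['month'] != current_month:
        # The boundary step is counted in the month that just ended
        print('Month {} reward: {}'.format(current_month, sum(monthly_rewards)))
        current_month = info['month']
        monthly_rewards = []
env.close()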

Each step’s action is sampled at random from the environment’s action space, following the Gymnasium standard. Once the episode concludes and the results have been displayed, we release the simulator’s resources with env.close().
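Random sampling is just a placeholder for a real controller: any value inside the action space is equally valid. For example, a constant mid-range action could be applied instead (a sketch, assuming the wrapped action space is a continuous Box, as in the demo environment):

import gymnasium as gym

import sinergym
from sinergym.utils.wrappers import NormalizeAction

env = NormalizeAction(gym.make('Eplus-demo-v1'))
obs, info = env.reset()

# A fixed mid-range action instead of random sampling
fixed_action = (env.action_space.low + env.action_space.high) / 2.0
obs, reward, terminated, truncated, info = env.step(fixed_action)
print('Reward for one step with a fixed action:', reward)
env.close()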

Important

This represents the most basic usage example. Additional functional examples can be found in the Examples section.