sinergym.utils.wrappers.MultiObjectiveReward
- class sinergym.utils.wrappers.MultiObjectiveReward(env: Env, reward_terms: List[str])
- __init__(env: Env, reward_terms: List[str])
The wrapped environment will return a reward vector with one entry per objective instead of a scalar value.
- Parameters:
env (Env) – Original Sinergym environment.
reward_terms (List[str]) – List of reward term keys that will be included in the reward vector.
Methods
- __init__(env, reward_terms) – The environment will return a reward vector with one entry per objective instead of a scalar value.
- class_name() – Returns the class name of the wrapper.
- close() – Closes the wrapper and env.
- get_wrapper_attr(name) – Gets an attribute from the wrapper and lower environments if name doesn't exist in this object.
- has_wrapper_attr(name) – Checks if the given attribute is within the wrapper or its environment.
- render() – Uses the render() of the env, which can be overwritten to change the returned data.
- reset(*[, seed, options]) – Uses the reset() of the env, which can be overwritten to change the returned data.
- set_wrapper_attr(name, value, *[, force]) – Sets an attribute on this wrapper or lower environment if name is already defined.
- step(action) – Performs the action; the environment returns a reward vector.
- wrapper_spec(**kwargs) – Generates a WrapperSpec for the wrappers.
Attributes
- action_space – Returns the Env action_space unless overwritten, in which case the wrapper action_space is used.
- metadata – Returns the Env metadata.
- np_random – Returns the Env np_random attribute.
- np_random_seed – Returns the base environment's np_random_seed.
- observation_space – Returns the Env observation_space unless overwritten, in which case the wrapper observation_space is used.
- render_mode – Returns the Env render_mode.
- spec – Returns the Env spec attribute with the WrapperSpec if the wrapper inherits from EzPickle.
- unwrapped – Returns the base environment of the wrapper.
- logger = <Logger WRAPPER MultiObjectiveReward (INFO)>
- step(action: int | ndarray) → Tuple[ndarray, List[float], bool, bool, Dict[str, Any]]
Performs the action; the environment returns a reward vector. If a reward term is not present in the info dictionary, it is ignored.
- Parameters:
action (Union[int, np.ndarray]) – Action to be executed in environment.
- Returns:
observation, reward vector, terminated, truncated and info.
- Return type:
Tuple[np.ndarray, List[float], bool, bool, Dict[str, Any]]