sinergym.utils.wrappers.MultiObjectiveReward
- class sinergym.utils.wrappers.MultiObjectiveReward(env: Env, reward_terms: List[str])
- __init__(env: Env, reward_terms: List[str])
The wrapped environment will return a reward vector with one entry per objective instead of a scalar value.
- Parameters:
env (Env) – Original Sinergym environment.
reward_terms (List[str]) – List of reward term keys that will be included in the reward vector.
Methods
- __init__(env, reward_terms) – The environment will return a reward vector with one entry per objective instead of a scalar value.
- class_name() – Returns the class name of the wrapper.
- close() – Closes the wrapper and env.
- get_wrapper_attr(name) – Gets an attribute from the wrapper and lower environments if name doesn't exist in this object.
- has_wrapper_attr(name) – Checks if the given attribute is within the wrapper or its environment.
- render() – Uses the render() of the env, which can be overwritten to change the returned data.
- reset(*[, seed, options]) – Uses the reset() of the env, which can be overwritten to change the returned data.
- set_wrapper_attr(name, value, *[, force]) – Sets an attribute on this wrapper or lower environment if name is already defined.
- step(action) – Performs the action; the environment returns a reward vector.
- wrapper_spec(**kwargs) – Generates a WrapperSpec for the wrappers.
Attributes
- action_space – Returns the Env action_space unless overwritten, in which case the wrapper action_space is used.
- metadata – Returns the Env metadata.
- np_random – Returns the Env np_random attribute.
- np_random_seed – Returns the base environment's np_random_seed.
- observation_space – Returns the Env observation_space unless overwritten, in which case the wrapper observation_space is used.
- render_mode – Returns the Env render_mode.
- spec – Returns the Env spec attribute with the WrapperSpec if the wrapper inherits from EzPickle.
- unwrapped – Returns the base environment of the wrapper.
- logger = <Logger WRAPPER MultiObjectiveReward (INFO)>
- step(action: int | ndarray) → Tuple[ndarray, List[float], bool, bool, Dict[str, Any]]
Performs the action; the environment returns a reward vector. If a reward term is not present in the info dictionary, it is ignored.
- Parameters:
action (Union[int, np.ndarray]) – Action to be executed in environment.
- Returns:
observation, reward vector, terminated, truncated and info.
- Return type:
Tuple[np.ndarray, List[float], bool, bool, Dict[str, Any]]