sinergym.utils.wrappers.MultiObjectiveReward
- class sinergym.utils.wrappers.MultiObjectiveReward(env: Env, reward_terms: List[str])
- __init__(env: Env, reward_terms: List[str])
The environment will return a reward vector with one entry per objective instead of a scalar value.
- Parameters:
env (Env) – Original Sinergym environment.
reward_terms (List[str]) – List of reward term keys that will be included in the reward vector.
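A minimal usage sketch is given below. The environment id and the reward term keys ('energy_term', 'comfort_term') are illustrative assumptions and may differ depending on the installed Sinergym version and the reward function in use.

```python
import gymnasium as gym

import sinergym  # registers Sinergym environments with Gymnasium
from sinergym.utils.wrappers import MultiObjectiveReward

# Assumed environment id and reward term keys; adapt them to your setup.
env = gym.make('Eplus-5zone-mixed-continuous-v1')
env = MultiObjectiveReward(env, reward_terms=['energy_term', 'comfort_term'])
```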
Methods

- __init__(env, reward_terms) – The environment will return a reward vector with one entry per objective instead of a scalar value.
- class_name() – Returns the class name of the wrapper.
- close() – Closes the wrapper and env.
- get_wrapper_attr(name) – Gets an attribute from the wrapper and lower environments if name doesn't exist in this object.
- render() – Uses the render() of the env that can be overwritten to change the returned data.
- reset(*[, seed, options]) – Uses the reset() of the env that can be overwritten to change the returned data.
- step(action) – Performs the action; the environment returns a reward vector.
- wrapper_spec(**kwargs) – Generates a WrapperSpec for the wrappers.

Attributes

- action_space – Returns the Env action_space unless overwritten, in which case the wrapper's action_space is used.
- metadata – Returns the Env metadata.
- np_random – Returns the Env np_random attribute.
- observation_space – Returns the Env observation_space unless overwritten, in which case the wrapper's observation_space is used.
- render_mode – Returns the Env render_mode.
- reward_range – Returns the Env reward_range unless overwritten, in which case the wrapper's reward_range is used.
- spec – Returns the Env spec attribute with the WrapperSpec if the wrapper inherits from EzPickle.
- unwrapped – Returns the base environment of the wrapper.
- logger = <Logger WRAPPER MultiObjectiveReward (INFO)>
- step(action: int | ndarray) → Tuple[ndarray, List[float], bool, bool, Dict[str, Any]]
Performs the action and the environment returns a reward vector. Any requested reward term that is not present in the step's info dictionary is ignored.
- Parameters:
action (Union[int, np.ndarray]) – Action to be executed in environment.
- Returns:
observation, reward vector, terminated flag, truncated flag and info dictionary.
- Return type:
Tuple[np.ndarray, List[float], bool, bool, Dict[str, Any]]
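A hedged sketch of how the vector reward is consumed at each step, continuing the wrapped environment from the example above (the reward term keys remain assumptions):

```python
# Each step now returns a list with one entry per requested reward term
# instead of a single scalar reward.
obs, info = env.reset()
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()
    obs, reward_vector, terminated, truncated, info = env.step(action)
    # reward_vector is e.g. [<energy_term value>, <comfort_term value>]
env.close()
```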