Interface for clipping the reward to a value between the specified minimum and maximum. (Clipping is implemented with mlpack::math::ClampRange().) More...
Public Types

using Action = typename EnvironmentType::Action
  Convenient typedef for action. More...

using State = typename EnvironmentType::State
  Convenient typedef for state. More...

Public Member Functions

RewardClipping(EnvironmentType& environment, const double minReward = -1.0, const double maxReward = 1.0)
  Constructor for creating a RewardClipping instance. More...

EnvironmentType& Environment() const
  Get the environment. More...

EnvironmentType& Environment()
  Modify the environment. More...

State InitialSample()
  The InitialSample method is called by the environment to initialize the starting state. More...

bool IsTerminal(const State& state) const
  Checks whether the given state is a terminal state. More...

double MaxReward() const
  Get the maximum reward value. More...

double& MaxReward()
  Modify the maximum reward value. More...

double MinReward() const
  Get the minimum reward value. More...

double& MinReward()
  Modify the minimum reward value. More...

double Sample(const State& state, const Action& action, State& nextState)
  Dynamics of the environment. More...

double Sample(const State& state, const Action& action)
  Dynamics of the environment. More...
Interface for clipping the reward to a value between the specified minimum and maximum. (Clipping is implemented with mlpack::math::ClampRange().)

Template Parameters:
  EnvironmentType - The type of environment being wrapped.
Definition at line 30 of file reward_clipping.hpp.
using Action = typename EnvironmentType::Action
Convenient typedef for action.
Definition at line 37 of file reward_clipping.hpp.
using State = typename EnvironmentType::State
Convenient typedef for state.
Definition at line 34 of file reward_clipping.hpp.
Constructor for creating a RewardClipping instance.
Parameters:
  environment - An instance of the environment used for actual simulations.
  minReward - Minimum possible value of the clipped reward.
  maxReward - Maximum possible value of the clipped reward.
Definition at line 47 of file reward_clipping.hpp.
Get the environment.
Definition at line 113 of file reward_clipping.hpp.
Modify the environment.
Definition at line 115 of file reward_clipping.hpp.
The InitialSample method is called by the environment to initialize the starting state.
Returns the initial sample produced by the wrapped environment.
Definition at line 62 of file reward_clipping.hpp.
Checks whether the given state is a terminal state.
Returns the result of calling the same method on the wrapped environment.

Parameters:
  state - The state to check.
Definition at line 74 of file reward_clipping.hpp.
Get the maximum reward value.
Definition at line 123 of file reward_clipping.hpp.
Modify the maximum reward value.
Definition at line 125 of file reward_clipping.hpp.
Get the minimum reward value.
Definition at line 118 of file reward_clipping.hpp.
Modify the minimum reward value.
Definition at line 120 of file reward_clipping.hpp.
Dynamics of the environment.
The rewards returned from the base environment are clipped according to the specified maximum and minimum values.

Parameters:
  state - The current state.
  action - The current action.
  nextState - The next state.
Definition at line 88 of file reward_clipping.hpp.
References mlpack::math::ClampRange().
Referenced by RewardClipping< EnvironmentType >::Sample().
Dynamics of the environment.
The rewards returned from the base environment are clipped according to the specified maximum and minimum values.

Parameters:
  state - The current state.
  action - The current action.
Definition at line 106 of file reward_clipping.hpp.
References RewardClipping< EnvironmentType >::Sample().