Implementation for epsilon greedy policy.
Public Types | |
| using | ActionType = typename EnvironmentType::Action |
| Convenient typedef for action. | |
Public Member Functions | |
| GreedyPolicy (const double initialEpsilon, const size_t annealInterval, const double minEpsilon, const double decayRate=1.0) | |
| Constructor for epsilon greedy policy class. | |
| void | Anneal () |
| Exploration probability will anneal at each step. | |
| const double & | Epsilon () const |
| ActionType | Sample (const arma::colvec &actionValue, bool deterministic=false, const bool isNoisy=false) |
| Sample an action based on given action values. | |
Implementation for epsilon greedy policy.
In general, the policy selects an action greedily based on the action values; with probability epsilon, it instead selects a random action to encourage exploration.
| EnvironmentType | The reinforcement learning task. |
Definition at line 31 of file greedy_policy.hpp.
| using ActionType = typename EnvironmentType::Action |
Convenient typedef for action.
Definition at line 35 of file greedy_policy.hpp.
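| GreedyPolicy (const double initialEpsilon, const size_t annealInterval, const double minEpsilon, const double decayRate=1.0) | inline |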
Constructor for epsilon greedy policy class.
| initialEpsilon | The initial probability to explore (select a random action). |
| annealInterval | The number of steps over which the probability to explore will anneal. |
| minEpsilon | Epsilon will never be less than this value. |
| decayRate | Scaling factor applied to the rate at which epsilon anneals (defaults to 1.0). |
Definition at line 48 of file greedy_policy.hpp.
| void Anneal () | inline |
Exploration probability will anneal at each step.
Definition at line 90 of file greedy_policy.hpp.
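The annealing behaviour can be illustrated without mlpack. The following is a minimal sketch, assuming a linear schedule in which epsilon drops by a fixed amount on every call and is clamped at minEpsilon. The struct name `EpsilonSchedule`, the `delta` member, and the way `decayRate` scales the per-step decrement are assumptions made for illustration, not a copy of greedy_policy.hpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the annealing state; not the mlpack class.
struct EpsilonSchedule
{
  double epsilon;     // Current exploration probability.
  double minEpsilon;  // Lower bound for epsilon.
  double delta;       // Amount subtracted from epsilon per Anneal() call.

  // Assumption: epsilon falls linearly from initialEpsilon towards minEpsilon
  // over annealInterval steps, with decayRate scaling the per-step decrement.
  EpsilonSchedule(const double initialEpsilon,
                  const std::size_t annealInterval,
                  const double minEpsilon,
                  const double decayRate = 1.0) :
      epsilon(initialEpsilon),
      minEpsilon(minEpsilon),
      delta((initialEpsilon - minEpsilon) * decayRate / annealInterval)
  {
    assert(annealInterval > 0); // Avoid division by zero.
  }

  // Lower epsilon by one step, never going below minEpsilon.
  void Anneal()
  {
    epsilon = std::max(minEpsilon, epsilon - delta);
  }
};
```

Under these assumptions, `EpsilonSchedule s(1.0, 10, 0.1)` lowers epsilon by 0.09 per call to `s.Anneal()`, and further calls leave it at 0.1 once the lower bound is reached.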
| const double & Epsilon () const | inline |
Returns the current exploration probability.
Definition at line 99 of file greedy_policy.hpp.
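| ActionType Sample (const arma::colvec &actionValue, bool deterministic=false, const bool isNoisy=false) | inline |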
Sample an action based on given action values.
| actionValue | Values for each action. |
| deterministic | If true, always select the action greedily. |
| isNoisy | Specifies whether the network used is noisy. |
Definition at line 65 of file greedy_policy.hpp.
References mlpack::math::RandInt(), and mlpack::math::Random().
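The selection rule can be sketched in plain C++ as follows. This is an illustrative approximation rather than the mlpack implementation: it uses `std::vector<double>` in place of `arma::colvec` and the `<random>` header in place of `mlpack::math::Random()` and `mlpack::math::RandInt()`. The function name `SampleEpsilonGreedy` and the `rng` parameter are hypothetical, and the sketch assumes that random exploration is skipped whenever `deterministic` or `isNoisy` is set.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical helper, not part of mlpack: returns the index of the chosen
// action given one value per action.
template<typename RNG>
std::size_t SampleEpsilonGreedy(const std::vector<double>& actionValue,
                                const double epsilon,
                                RNG& rng,
                                const bool deterministic = false,
                                const bool isNoisy = false)
{
  assert(!actionValue.empty()); // At least one action is required.

  std::uniform_real_distribution<double> unit(0.0, 1.0);

  // Explore: with probability epsilon, pick a uniformly random action.
  // Assumption: exploration is disabled when deterministic or isNoisy is set.
  if (!deterministic && !isNoisy && unit(rng) < epsilon)
  {
    std::uniform_int_distribution<std::size_t> pick(0, actionValue.size() - 1);
    return pick(rng);
  }

  // Exploit: pick the first action with the largest value.
  return static_cast<std::size_t>(std::distance(actionValue.begin(),
      std::max_element(actionValue.begin(), actionValue.end())));
}
```

With `epsilon` set to 0, or with `deterministic` set to true, the sketch always returns the index of the largest action value; with `epsilon` set to 1 and both flags false, the choice is effectively uniform over all actions.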