Implementation for epsilon greedy policy. More...

Public Types

using ActionType = typename EnvironmentType::Action
    Convenient typedef for action. More...

Public Member Functions

GreedyPolicy(const double initialEpsilon, const size_t annealInterval, const double minEpsilon, const double decayRate = 1.0)
    Constructor for epsilon greedy policy class. More...

void Anneal()
    Exploration probability will anneal at each step. More...

const double& Epsilon() const
    Get the current exploration probability. More...

ActionType Sample(const arma::colvec& actionValue, bool deterministic = false, const bool isNoisy = false)
    Sample an action based on given action values. More...
Implementation for epsilon greedy policy.
In general we select an action greedily based on the action value; however, with probability epsilon we select a random action instead, to encourage exploration.

Template Parameters
    EnvironmentType    The reinforcement learning task.
Definition at line 31 of file greedy_policy.hpp.
using ActionType = typename EnvironmentType::Action
Convenient typedef for action.
Definition at line 35 of file greedy_policy.hpp.
GreedyPolicy(const double initialEpsilon, const size_t annealInterval, const double minEpsilon, const double decayRate = 1.0)  [inline]

Constructor for epsilon greedy policy class.
Parameters
    initialEpsilon    The initial probability to explore (select a random action).
    annealInterval    The number of steps over which the exploration probability anneals.
    minEpsilon        Epsilon will never be less than this value.
    decayRate         The rate at which epsilon decays toward minEpsilon at each annealing step.
Definition at line 48 of file greedy_policy.hpp.
void Anneal()  [inline]

Exploration probability will anneal at each step.
Definition at line 90 of file greedy_policy.hpp.
const double& Epsilon() const  [inline]

Get the current exploration probability.

Definition at line 99 of file greedy_policy.hpp.
ActionType Sample(const arma::colvec& actionValue, bool deterministic = false, const bool isNoisy = false)  [inline]

Sample an action based on given action values.
Parameters
    actionValue      Values for each action.
    deterministic    If true, always select the action greedily.
    isNoisy          Specifies whether the network used is noisy.
Definition at line 65 of file greedy_policy.hpp.
References mlpack::math::RandInt(), and mlpack::math::Random().