Implementation of various Q-Learning algorithms, such as DQN and double DQN.

Public Types
using	ActionType = typename EnvironmentType::Action
	Convenient typedef for action.

using	StateType = typename EnvironmentType::State
	Convenient typedef for state.

Public Member Functions
	QLearning (TrainingConfig &config, NetworkType &network, PolicyType &policy, ReplayType &replayMethod, UpdaterType updater=UpdaterType(), EnvironmentType environment=EnvironmentType())
	Create the QLearning object with given settings.

	~QLearning ()
	Clean memory.

const ActionType &	Action () const
	Get the action of the agent.

bool &	Deterministic ()
	Modify the training mode / test mode indicator.

const bool &	Deterministic () const
	Get the indicator of training mode / test mode.

EnvironmentType &	Environment ()
	Modify the environment in which the agent operates.

const EnvironmentType &	Environment () const
	Get the environment in which the agent operates.

double	Episode ()
	Execute an episode.

const NetworkType &	Network () const
	Return the learning network.

NetworkType &	Network ()
	Modify the learning network.

void	SelectAction ()
	Select an action for the agent based on its current state.

StateType &	State ()
	Modify the state of the agent.

const StateType &	State () const
	Get the state of the agent.

size_t &	TotalSteps ()
	Modify the total number of steps taken since the beginning.

const size_t &	TotalSteps () const
	Get the total number of steps taken since the beginning.

void	TrainAgent ()
	Trains the DQN agent (non-categorical).

void	TrainCategoricalAgent ()
	Trains the DQN agent of categorical type.

Detailed Description

template<typename EnvironmentType, typename NetworkType, typename UpdaterType, typename PolicyType, typename ReplayType = RandomReplay<EnvironmentType>>
class mlpack::rl::QLearning< EnvironmentType, NetworkType, UpdaterType, PolicyType, ReplayType >

Implementation of various Q-Learning algorithms, such as DQN and double DQN.

For more details, see the following:

@article{Mnih2013,
 author    = {Volodymyr Mnih and
              Koray Kavukcuoglu and
              David Silver and
              Alex Graves and
              Ioannis Antonoglou and
              Daan Wierstra and
              Martin A. Riedmiller},
 title     = {Playing Atari with Deep Reinforcement Learning},
 journal   = {CoRR},
 year      = {2013},
 url       = {http://arxiv.org/abs/1312.5602}
}
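
The difference between the two variants named above lies in how the bootstrap target is formed. The following is a minimal sketch of the standard textbook targets, not mlpack's internal code; the function names `DqnTarget` and `DoubleDqnTarget` and the plain `std::vector<double>` action values are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the largest element (the first one wins on ties).
inline std::size_t ArgMax(const std::vector<double>& values)
{
  return static_cast<std::size_t>(std::distance(values.begin(),
      std::max_element(values.begin(), values.end())));
}

// DQN target: r + discount * max_a Q_target(s', a).
// The target network both selects and evaluates the next action.
inline double DqnTarget(double reward,
                        double discount,
                        bool terminal,
                        const std::vector<double>& targetNextQ)
{
  if (terminal)
    return reward;  // No bootstrapping from a terminal state.
  return reward + discount * targetNextQ[ArgMax(targetNextQ)];
}

// Double DQN target: r + discount * Q_target(s', argmax_a Q_online(s', a)).
// The online network selects the action; the target network evaluates it.
inline double DoubleDqnTarget(double reward,
                              double discount,
                              bool terminal,
                              const std::vector<double>& onlineNextQ,
                              const std::vector<double>& targetNextQ)
{
  if (terminal)
    return reward;
  return reward + discount * targetNextQ[ArgMax(onlineNextQ)];
}
```

Decoupling selection from evaluation is what reduces the overestimation that a single `max` over noisy estimates tends to produce.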

Template Parameters

EnvironmentType	The environment of the reinforcement learning task.
NetworkType	The network to compute action value.
UpdaterType	How to apply gradients when training.
PolicyType	Behavior policy of the agent.
ReplayType	Experience replay method.

Definition at line 59 of file q_learning.hpp.

Member Typedef Documentation

◆ ActionType

using ActionType = typename EnvironmentType::Action

Convenient typedef for action.

Definition at line 66 of file q_learning.hpp.

◆ StateType

using StateType = typename EnvironmentType::State

Convenient typedef for state.

Definition at line 63 of file q_learning.hpp.

Constructor & Destructor Documentation

◆ QLearning()

QLearning	(	TrainingConfig &	config,
		NetworkType &	network,
		PolicyType &	policy,
		ReplayType &	replayMethod,
		UpdaterType	updater = UpdaterType(),
		EnvironmentType	environment = EnvironmentType()
	)

Create the QLearning object with given settings.

If you pass in an argument that is taken by value (updater or environment) and no longer need the original object, use std::move to avoid an unnecessary copy.

Parameters

config	Hyper-parameters for training.
network	The network to compute action value.
policy	Behavior policy of the agent.
replayMethod	Experience replay method.
updater	How to apply gradients when training.
environment	Reinforcement learning task.
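
To illustrate the std::move advice for by-value parameters, here is a small stdlib-only sketch. `CountingUpdater` and `MoveDemoAgent` are hypothetical stand-ins invented for this example, not mlpack types; the sketch assumes the constructor moves its by-value argument into a member.

```cpp
#include <utility>

// Hypothetical stand-in for a by-value argument such as an updater.
// It counts how often it is copied or moved.
struct CountingUpdater
{
  static int copies;
  static int moves;

  CountingUpdater() = default;
  CountingUpdater(const CountingUpdater&) { ++copies; }
  CountingUpdater(CountingUpdater&&) noexcept { ++moves; }
};

int CountingUpdater::copies = 0;
int CountingUpdater::moves = 0;

// Hypothetical agent that takes its updater by value and moves it into
// a member, mirroring the shape of the constructor documented above.
class MoveDemoAgent
{
 public:
  explicit MoveDemoAgent(CountingUpdater updater) :
      updater(std::move(updater))
  { }

 private:
  CountingUpdater updater;
};
```

Passing an lvalue (`MoveDemoAgent a(u);`) costs one copy plus one move, whereas `MoveDemoAgent b(std::move(u));` costs two moves and no copy. For an object that owns large buffers, that difference can matter.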

◆ ~QLearning()

~QLearning ( )

Clean memory.

Member Function Documentation

◆ Action()

const ActionType& Action ( ) const

inline

Get the action of the agent.

Definition at line 125 of file q_learning.hpp.

◆ Deterministic() [1/2]

bool& Deterministic ( )

inline

Modify the training mode / test mode indicator.

Definition at line 133 of file q_learning.hpp.

◆ Deterministic() [2/2]

const bool& Deterministic ( ) const

inline

Get the indicator of training mode / test mode.

Definition at line 135 of file q_learning.hpp.
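
The paired overloads documented here (and for Environment(), Network(), State() and TotalSteps()) follow a common accessor pattern: the non-const overload returns a mutable reference that can be assigned to, and the const overload gives read-only access. The class below is a hypothetical miniature written only to show the pattern; it is not the mlpack class.

```cpp
#include <cstddef>

// Hypothetical miniature of the accessor pattern used by the class.
class AccessorDemo
{
 public:
  // Non-const overload: returns a mutable reference, so callers can assign.
  bool& Deterministic() { return deterministic; }
  // Const overload: read-only access, usable on const objects.
  const bool& Deterministic() const { return deterministic; }

  // The same pattern for the step counter.
  std::size_t& TotalSteps() { return totalSteps; }
  const std::size_t& TotalSteps() const { return totalSteps; }

 private:
  bool deterministic = false;
  std::size_t totalSteps = 0;
};
```

With this pattern, switching to test mode reads as `agent.Deterministic() = true;`, and a `const` reference to the agent can still query the flag.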

◆ Environment() [1/2]

EnvironmentType& Environment ( )

inline

Modify the environment in which the agent operates.

Definition at line 128 of file q_learning.hpp.

◆ Environment() [2/2]

const EnvironmentType& Environment ( ) const

inline

Get the environment in which the agent operates.

Definition at line 130 of file q_learning.hpp.

◆ Episode()

double Episode ( )

Execute an episode.

Returns: The return of the episode (the reward accumulated over its steps).
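
As a rough illustration of what "return" means here, the sketch below sums per-step rewards with an optional discount factor. `EpisodeReturn` is a hypothetical helper, and whether the library's value is discounted is an assumption this page does not settle; with the default discount of 1.0 the function is a plain sum.

```cpp
#include <vector>

// Hypothetical helper: the return of one episode, computed as the
// (optionally discounted) sum of the rewards collected at each step.
inline double EpisodeReturn(const std::vector<double>& rewards,
                            double discount = 1.0)
{
  double total = 0.0;
  double weight = 1.0;  // discount^t for the current step t.
  for (double reward : rewards)
  {
    total += weight * reward;
    weight *= discount;
  }
  return total;
}
```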

◆ Network() [1/2]

const NetworkType& Network ( ) const

inline

Return the learning network.

Definition at line 138 of file q_learning.hpp.

◆ Network() [2/2]

NetworkType& Network ( )

inline

Modify the learning network.

Definition at line 140 of file q_learning.hpp.

◆ SelectAction()

void SelectAction ( )

Select an action for the agent based on its current state.
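
One plausible shape for action selection is epsilon-greedy over the network's action values, with exploration switched off in deterministic (test) mode. The sketch below is an assumption about the general technique, not a copy of the library's policy code; `SelectActionSketch` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical helper: epsilon-greedy selection over action values.
// In deterministic (test) mode, exploration is skipped and the greedy
// action is always returned. actionValues must not be empty.
template<typename Rng>
std::size_t SelectActionSketch(const std::vector<double>& actionValues,
                               double epsilon,
                               bool deterministic,
                               Rng& rng)
{
  if (!deterministic)
  {
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    if (coin(rng) < epsilon)
    {
      // Explore: pick any action uniformly at random.
      std::uniform_int_distribution<std::size_t> pick(
          0, actionValues.size() - 1);
      return pick(rng);
    }
  }

  // Exploit: pick the action with the highest estimated value.
  return static_cast<std::size_t>(std::distance(actionValues.begin(),
      std::max_element(actionValues.begin(), actionValues.end())));
}
```

Passing the random engine in as a parameter keeps the helper reproducible under a fixed seed.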

◆ State() [1/2]

StateType& State ( )

inline

Modify the state of the agent.

Definition at line 120 of file q_learning.hpp.

◆ State() [2/2]

const StateType& State ( ) const

inline

Get the state of the agent.

Definition at line 122 of file q_learning.hpp.

◆ TotalSteps() [1/2]

size_t& TotalSteps ( )

inline

Modify the total number of steps taken since the beginning.

Definition at line 115 of file q_learning.hpp.

◆ TotalSteps() [2/2]

const size_t& TotalSteps ( ) const

inline

Get the total number of steps taken since the beginning.

Definition at line 117 of file q_learning.hpp.

◆ TrainAgent()

void TrainAgent ( )

Trains the DQN agent (non-categorical).

◆ TrainCategoricalAgent()

void TrainCategoricalAgent ( )

Trains the DQN agent of categorical type.
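
Assuming "categorical" refers to a distributional agent in the style of categorical DQN (C51), the network predicts, for each action, a probability distribution over a fixed set of evenly spaced return values ("atoms"), and the action value is the expectation of that distribution. The sketch below shows only that expectation; `ExpectedValue`, `vMin` and `vMax` are hypothetical names, and the projection step used during training is omitted.

```cpp
#include <cstddef>
#include <vector>

// Expected action value under a categorical return distribution:
// Q = sum_i p_i * z_i, where the atoms z_i are evenly spaced on
// [vMin, vMax] and p_i is the predicted probability of atom i.
inline double ExpectedValue(const std::vector<double>& probabilities,
                            double vMin,
                            double vMax)
{
  const std::size_t atoms = probabilities.size();
  if (atoms == 0)
    return 0.0;

  // Spacing between neighbouring atoms (zero if there is only one atom).
  const double delta = (atoms > 1) ?
      (vMax - vMin) / static_cast<double>(atoms - 1) : 0.0;

  double q = 0.0;
  for (std::size_t i = 0; i < atoms; ++i)
    q += probabilities[i] * (vMin + static_cast<double>(i) * delta);
  return q;
}
```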

The documentation for this class was generated from the following file:

/home/ryan/src/mlpack.org/_src/mlpack-git/src/mlpack/methods/reinforcement_learning/q_learning.hpp
