RandomReplay< EnvironmentType > Class Template Reference

Implementation of random experience replay.

Classes

struct  Transition
 

Public Types

using ActionType = typename EnvironmentType::Action
 Convenient typedef for action.

using StateType = typename EnvironmentType::State
 Convenient typedef for state.

Public Member Functions

 RandomReplay ()

 RandomReplay (const size_t batchSize, const size_t capacity, const size_t nSteps=1, const size_t dimension=StateType::dimension)
 Construct an instance of the random experience replay class.

void GetNStepInfo (double &reward, StateType &nextState, bool &isEnd, const double &discount)
 Get the reward, next state, and terminal boolean for the nth step.

const size_t & NSteps () const
 Get the number of steps for the n-step agent.

void Sample (arma::mat &sampledStates, std::vector< ActionType > &sampledActions, arma::rowvec &sampledRewards, arma::mat &sampledNextStates, arma::irowvec &isTerminal)
 Sample some experiences.

const size_t & Size ()
 Get the number of transitions in the memory.

void Store (StateType state, ActionType action, double reward, StateType nextState, bool isEnd, const double &discount)
 Store the given experience.

void Update (arma::mat, std::vector< ActionType >, arma::mat, arma::mat &)
 Update the priorities of transitions and update the gradients.
 

Detailed Description


template<typename EnvironmentType>
class mlpack::rl::RandomReplay< EnvironmentType >

Implementation of random experience replay.

At each time step, the interaction between the agent and the environment is saved to a memory buffer. When needed, previous experiences are simply sampled from the buffer to train the agent. Here the sample is drawn at random and the memory is a first-in-first-out (FIFO) buffer.

For more information, see the following.

@phdthesis{lin1993reinforcement,
  title  = {Reinforcement learning for robots using neural networks},
  author = {Lin, Long-Ji},
  year   = {1993},
  school = {Carnegie Mellon University}
}

Template Parameters
 EnvironmentType  Desired task.

Definition at line 44 of file random_replay.hpp.
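
A minimal usage sketch follows. DummyEnv and its nested State and Action types are hypothetical stand-ins for a real environment such as mlpack::rl::CartPole; the sketch assumes, as in mlpack's environment API, that State exposes a static dimension and an Encode() method returning the state as an Armadillo column vector, and that Action is copyable.

#include <mlpack/methods/reinforcement_learning/replay/random_replay.hpp>

// Hypothetical environment satisfying the State/Action requirements above.
struct DummyEnv
{
  class State
  {
   public:
    State() : data(dimension, arma::fill::zeros) { }
    arma::colvec Encode() const { return data; }
    static constexpr size_t dimension = 4;
    arma::colvec data;
  };

  struct Action
  {
    int action = 0;
  };
};

int main()
{
  // Batches of 10, sampled from a FIFO memory of 1000 transitions.
  mlpack::rl::RandomReplay<DummyEnv> replay(10, 1000);

  // Save (state, action, reward, nextState, isEnd) at each time step.
  DummyEnv::State state, nextState;
  DummyEnv::Action action;
  for (size_t i = 0; i < 100; ++i)
    replay.Store(state, action, 1.0, nextState, false, 0.99);

  // Later, sample a random batch of previous experiences to train on.
  arma::mat states, nextStates;
  std::vector<DummyEnv::Action> actions;
  arma::rowvec rewards;
  arma::irowvec isTerminal;
  replay.Sample(states, actions, rewards, nextStates, isTerminal);
  // states and nextStates are (State::dimension x batchSize) matrices.
}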

Member Typedef Documentation

◆ ActionType

using ActionType = typename EnvironmentType::Action

Convenient typedef for action.

Definition at line 48 of file random_replay.hpp.

◆ StateType

using StateType = typename EnvironmentType::State

Convenient typedef for state.

Definition at line 51 of file random_replay.hpp.

Constructor & Destructor Documentation

◆ RandomReplay() [1/2]

RandomReplay ( ) [inline]

Definition at line 62 of file random_replay.hpp.

◆ RandomReplay() [2/2]

RandomReplay (const size_t batchSize,
              const size_t capacity,
              const size_t nSteps = 1,
              const size_t dimension = StateType::dimension) [inline]

Construct an instance of the random experience replay class.

Parameters
 batchSize  Number of examples returned at each sample.
 capacity   Total memory size in terms of number of examples.
 nSteps     Number of steps to look into the future.
 dimension  The dimension of an encoded state.

Definition at line 78 of file random_replay.hpp.
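
For example (illustrative values, using the hypothetical DummyEnv from the sketch above), a buffer that returns batches of 32 from a memory of 10000 transitions and builds 3-step returns:

// dimension defaults to DummyEnv::State::dimension.
mlpack::rl::RandomReplay<DummyEnv> replay(32, 10000, /* nSteps */ 3);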

Member Function Documentation

◆ GetNStepInfo()

void GetNStepInfo (double & reward,
                   StateType & nextState,
                   bool & isEnd,
                   const double & discount) [inline]

Get the reward, next state, and terminal boolean for the nth step.

Parameters
 reward     Given reward.
 nextState  Given next state.
 isEnd      Whether the next state is terminal.
 discount   The discount parameter.

Definition at line 151 of file random_replay.hpp.

Referenced by RandomReplay< EnvironmentType >::Store().
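
Conceptually, this folds the rewards of the n buffered one-step transitions into a single discounted return, reward = r_0 + discount*r_1 + ... + discount^(n-1)*r_(n-1), truncating at a terminal state. A sketch of that backward accumulation over an assumed buffer of Step records follows; it mirrors the idea, not necessarily the exact implementation.

#include <vector>

// Hypothetical one-step record, analogous to RandomReplay's Transition.
template<typename StateType>
struct Step { double reward; StateType nextState; bool isEnd; };

template<typename StateType>
void NStepFold(const std::vector<Step<StateType>>& buffer,
               const double discount,
               double& reward, StateType& nextState, bool& isEnd)
{
  // Start from the most recent step and fold backwards.
  reward = buffer.back().reward;
  nextState = buffer.back().nextState;
  isEnd = buffer.back().isEnd;
  for (int i = static_cast<int>(buffer.size()) - 2; i >= 0; --i)
  {
    // r_i + discount * R; a terminal step discards everything after it.
    reward = buffer[i].reward + discount * reward * (1 - buffer[i].isEnd);
    if (buffer[i].isEnd)
    {
      nextState = buffer[i].nextState;
      isEnd = true;
    }
  }
}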

◆ NSteps()

const size_t & NSteps () const [inline]

Get the number of steps for the n-step agent.

Definition at line 228 of file random_replay.hpp.

◆ Sample()

void Sample (arma::mat & sampledStates,
             std::vector< ActionType > & sampledActions,
             arma::rowvec & sampledRewards,
             arma::mat & sampledNextStates,
             arma::irowvec & isTerminal) [inline]

Sample some experiences.

Parameters
 sampledStates      Sampled encoded states.
 sampledActions     Sampled actions.
 sampledRewards     Sampled rewards.
 sampledNextStates  Sampled encoded next states.
 isTerminal         Indicates whether the corresponding next state is terminal.

Definition at line 183 of file random_replay.hpp.
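
A sketch of drawing one training batch (replay is the DummyEnv buffer from the earlier sketch; the output containers are assumed to be sized by Sample() itself):

arma::mat sampledStates, sampledNextStates;    // State::dimension x batchSize
std::vector<DummyEnv::Action> sampledActions;  // batchSize entries
arma::rowvec sampledRewards;                   // 1 x batchSize
arma::irowvec isTerminal;                      // 1 where next state is terminal
replay.Sample(sampledStates, sampledActions, sampledRewards,
    sampledNextStates, isTerminal);
// Column/entry i of every container describes the same sampled transition.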

◆ Size()

const size_t & Size () [inline]

Get the number of transitions in the memory.

Returns
 Actual used memory size.

Definition at line 206 of file random_replay.hpp.

◆ Store()

void Store (StateType state,
            ActionType action,
            double reward,
            StateType nextState,
            bool isEnd,
            const double & discount) [inline]

Store the given experience.

Parameters
 state      Given state.
 action     Given action.
 reward     Given reward.
 nextState  Given next state.
 isEnd      Whether the next state is terminal.
 discount   The discount parameter.

Definition at line 104 of file random_replay.hpp.

References RandomReplay< EnvironmentType >::Transition::action, RandomReplay< EnvironmentType >::GetNStepInfo(), RandomReplay< EnvironmentType >::Transition::isEnd, RandomReplay< EnvironmentType >::Transition::nextState, RandomReplay< EnvironmentType >::Transition::reward, and RandomReplay< EnvironmentType >::Transition::state.
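
A single stored interaction might look as follows (hypothetical DummyEnv again; the reward and the 0.99 discount are illustrative). Note from the cross-references above that when nSteps > 1, Store() uses GetNStepInfo() to fold the buffered steps before a transition is written to memory.

DummyEnv::State state;       // current observation
DummyEnv::Action action;     // action taken by the agent
DummyEnv::State nextState;   // observation after the environment step
replay.Store(state, action, /* reward */ 1.0, nextState,
    /* isEnd */ false, /* discount */ 0.99);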

◆ Update()

void Update (arma::mat /* target */,
             std::vector< ActionType > /* sampledActions */,
             arma::mat /* nextActionValues */,
             arma::mat & /* gradients */) [inline]

Update the priorities of transitions and update the gradients.

Parameters
 target            The learned value.
 sampledActions    Agent's sampled action.
 nextActionValues  Agent's next action.
 gradients         The model's gradients.

Definition at line 219 of file random_replay.hpp.


The documentation for this class was generated from the following file:
  • src/mlpack/methods/reinforcement_learning/replay/random_replay.hpp