Forward declaration of OneStepQLearningWorker.
Public Types

using ActionType = typename EnvironmentType::Action
using StateType = typename EnvironmentType::State
using TransitionType = std::tuple<StateType, ActionType, double, StateType>
Public Member Functions

OneStepQLearningWorker (const UpdaterType &updater, const EnvironmentType &environment, const TrainingConfig &config, bool deterministic)
    Construct a one-step Q-learning worker with the given parameters and environment.

OneStepQLearningWorker (const OneStepQLearningWorker &other)
    Copy another OneStepQLearningWorker.

OneStepQLearningWorker (OneStepQLearningWorker &&other)
    Take ownership of another OneStepQLearningWorker.

~OneStepQLearningWorker ()
    Clean up memory.

void Initialize (NetworkType &learningNetwork)
    Initialize the worker.

OneStepQLearningWorker & operator= (const OneStepQLearningWorker &other)
    Copy another OneStepQLearningWorker.

OneStepQLearningWorker & operator= (OneStepQLearningWorker &&other)
    Take ownership of another OneStepQLearningWorker.

bool Step (NetworkType &learningNetwork, NetworkType &targetNetwork, size_t &totalSteps, PolicyType &policy, double &totalReward)
    The agent executes one step.
Detailed Description

Forward declaration of OneStepQLearningWorker: a one-step Q-learning worker.

Template Parameters
    EnvironmentType | The type of the reinforcement learning task.
    NetworkType | The type of the network model.
    UpdaterType | The type of the optimizer.
    PolicyType | The type of the behavior policy.

Definition at line 147 of file async_learning.hpp.
using ActionType = typename EnvironmentType::Action

Definition at line 40 of file one_step_q_learning_worker.hpp.

using StateType = typename EnvironmentType::State

Definition at line 39 of file one_step_q_learning_worker.hpp.

using TransitionType = std::tuple<StateType, ActionType, double, StateType>

Definition at line 41 of file one_step_q_learning_worker.hpp.
OneStepQLearningWorker (const UpdaterType &updater, const EnvironmentType &environment, const TrainingConfig &config, bool deterministic) [inline]

Construct a one-step Q-learning worker with the given parameters and environment.

Parameters
    updater | The optimizer.
    environment | The reinforcement learning task.
    config | Hyper-parameters.
    deterministic | Whether the worker should be deterministic.

Definition at line 52 of file one_step_q_learning_worker.hpp.
OneStepQLearningWorker (const OneStepQLearningWorker &other) [inline]

Copy another OneStepQLearningWorker.

Parameters
    other | OneStepQLearningWorker to copy.

Definition at line 72 of file one_step_q_learning_worker.hpp.
OneStepQLearningWorker (OneStepQLearningWorker &&other) [inline]

Take ownership of another OneStepQLearningWorker.

Parameters
    other | OneStepQLearningWorker to take ownership of.

Definition at line 102 of file one_step_q_learning_worker.hpp.
~OneStepQLearningWorker () [inline]

Clean up memory.

Definition at line 204 of file one_step_q_learning_worker.hpp.
void Initialize (NetworkType &learningNetwork) [inline]

Initialize the worker.

Parameters
    learningNetwork | The shared network.

Definition at line 215 of file one_step_q_learning_worker.hpp.
OneStepQLearningWorker & operator= (const OneStepQLearningWorker &other) [inline]

Copy another OneStepQLearningWorker.

Parameters
    other | OneStepQLearningWorker to copy.

Definition at line 132 of file one_step_q_learning_worker.hpp.
OneStepQLearningWorker & operator= (OneStepQLearningWorker &&other) [inline]

Take ownership of another OneStepQLearningWorker.

Parameters
    other | OneStepQLearningWorker to take ownership of.

Definition at line 169 of file one_step_q_learning_worker.hpp.
bool Step (NetworkType &learningNetwork, NetworkType &targetNetwork, size_t &totalSteps, PolicyType &policy, double &totalReward) [inline]

The agent executes one step.

Parameters
    learningNetwork | The shared learning network.
    targetNetwork | The shared target network.
    totalSteps | The shared counter for total steps.
    policy | The shared behavior policy.
    totalReward | Set to the episode return if the episode ends after this step; otherwise its value is meaningless.

Definition at line 244 of file one_step_q_learning_worker.hpp.

References TrainingConfig::Discount(), TrainingConfig::GradientLimit(), TrainingConfig::StepLimit(), TrainingConfig::StepSize(), TrainingConfig::TargetNetworkSyncInterval(), and TrainingConfig::UpdateInterval().