Forward declaration of OneStepSarsaWorker.
Public Types
| using | ActionType = typename EnvironmentType::Action |
| using | StateType = typename EnvironmentType::State |
| using | TransitionType = std::tuple< StateType, ActionType, double, StateType, ActionType > |
Public Member Functions

| | OneStepSarsaWorker (const UpdaterType &updater, const EnvironmentType &environment, const TrainingConfig &config, bool deterministic) |
| | Construct a one-step Sarsa worker with the given parameters and environment. |
| | OneStepSarsaWorker (const OneStepSarsaWorker &other) |
| | Copy another OneStepSarsaWorker. |
| | OneStepSarsaWorker (OneStepSarsaWorker &&other) |
| | Take ownership of another OneStepSarsaWorker. |
| | ~OneStepSarsaWorker () |
| | Clean up memory. |
| void | Initialize (NetworkType &learningNetwork) |
| | Initialize the worker. |
| OneStepSarsaWorker & | operator= (const OneStepSarsaWorker &other) |
| | Copy another OneStepSarsaWorker. |
| OneStepSarsaWorker & | operator= (OneStepSarsaWorker &&other) |
| | Take ownership of another OneStepSarsaWorker. |
| bool | Step (NetworkType &learningNetwork, NetworkType &targetNetwork, size_t &totalSteps, PolicyType &policy, double &totalReward) |
| | The agent executes one step. |
Forward declaration of OneStepSarsaWorker.
One step Sarsa worker.
Template Parameters

| EnvironmentType | The type of the reinforcement learning task. |
| NetworkType | The type of the network model. |
| UpdaterType | The type of the optimizer. |
| PolicyType | The type of the behavior policy. |
Definition at line 163 of file async_learning.hpp.
| using ActionType = typename EnvironmentType::Action |
Definition at line 40 of file one_step_sarsa_worker.hpp.
| using StateType = typename EnvironmentType::State |
Definition at line 39 of file one_step_sarsa_worker.hpp.
| using TransitionType = std::tuple<StateType, ActionType, double, StateType, ActionType> |
Definition at line 42 of file one_step_sarsa_worker.hpp.
inline
Construct a one-step Sarsa worker with the given parameters and environment.
Parameters

| updater | The optimizer. |
| environment | The reinforcement learning task. |
| config | Hyper-parameters. |
| deterministic | Whether the worker should act deterministically. |
Definition at line 53 of file one_step_sarsa_worker.hpp.
inline
Copy another OneStepSarsaWorker.
Parameters

| other | OneStepSarsaWorker to copy. |
Definition at line 73 of file one_step_sarsa_worker.hpp.
inline
Take ownership of another OneStepSarsaWorker.
Parameters

| other | OneStepSarsaWorker to take ownership of. |
Definition at line 104 of file one_step_sarsa_worker.hpp.
inline
Clean up memory.
Definition at line 209 of file one_step_sarsa_worker.hpp.
inline
Initialize the worker.
Parameters

| learningNetwork | The shared network. |
Definition at line 220 of file one_step_sarsa_worker.hpp.
inline
Copy another OneStepSarsaWorker.
Parameters

| other | OneStepSarsaWorker to copy. |
Definition at line 135 of file one_step_sarsa_worker.hpp.
inline
Take ownership of another OneStepSarsaWorker.
Parameters

| other | OneStepSarsaWorker to take ownership of. |
Definition at line 173 of file one_step_sarsa_worker.hpp.
inline
The agent executes one step.
Parameters

| learningNetwork | The shared learning network. |
| targetNetwork | The shared target network. |
| totalSteps | The shared counter for total steps. |
| policy | The shared behavior policy. |
| totalReward | Set to the episode return if the episode ends after this step; otherwise its value is meaningless. |
Definition at line 249 of file one_step_sarsa_worker.hpp.
References TrainingConfig::Discount(), TrainingConfig::GradientLimit(), TrainingConfig::StepLimit(), TrainingConfig::StepSize(), TrainingConfig::TargetNetworkSyncInterval(), and TrainingConfig::UpdateInterval().