Implementation of the Acrobot game.
Classes | |
class | Action |
class | State |
Public Member Functions | |
Acrobot (const size_t maxSteps=500, const double gravity=9.81, const double linkLength1=1.0, const double linkLength2=1.0, const double linkMass1=1.0, const double linkMass2=1.0, const double linkCom1=0.5, const double linkCom2=0.5, const double linkMoi=1.0, const double maxVel1=4 * M_PI, const double maxVel2=9 * M_PI, const double dt=0.2, const double doneReward=0) | |
Construct an Acrobot instance using the given constants. | |
arma::colvec | Dsdt (arma::colvec state, const double torque) const |
The ordinary differential equations used to estimate the next state through the RK4 method. | |
State | InitialSample () |
This function randomly initializes the state. | |
bool | IsTerminal (const State &state) const |
This function checks if the acrobot has reached the terminal state. | |
size_t | MaxSteps () const |
Get the maximum number of steps allowed. | |
size_t & | MaxSteps () |
Set the maximum number of steps allowed. | |
arma::colvec | Rk4 (const arma::colvec state, const double torque) const |
This function applies the RK4 iterative method to estimate the next state from the given ordinary differential equations. | |
double | Sample (const State &state, const Action &action, State &nextState) |
Dynamics of the Acrobot System. | |
double | Sample (const State &state, const Action &action) |
Dynamics of the Acrobot System. | |
size_t | StepsPerformed () const |
Get the number of steps performed. | |
double | Torque (const Action &action) const |
This function calculates the torque for a particular action. | |
double | Wrap (double value, const double minimum, const double maximum) const |
The Wrap function is required to wrap the angle value into the range -180 to 180 degrees. | |
Implementation of the Acrobot game.
Acrobot is a 2-link pendulum with only the second joint actuated. Initially, both links point downwards. The goal is to swing the end-effector to a height at least the length of one link above the base. Both links can swing freely and can pass by each other, i.e., they don't collide when they have the same angle.
Definition at line 28 of file acrobot.hpp.
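Acrobot (const size_t maxSteps=500, const double gravity=9.81, const double linkLength1=1.0, const double linkLength2=1.0, const double linkMass1=1.0, const double linkMass2=1.0, const double linkCom1=0.5, const double linkCom2=0.5, const double linkMoi=1.0, const double maxVel1=4 * M_PI, const double maxVel2=9 * M_PI, const double dt=0.2, const double doneReward=0) | inline |
Construct an Acrobot instance using the given constants.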
maxSteps | The number of steps after which the episode terminates. If the value is 0, there is no limit. |
gravity | The gravity parameter. |
linkLength1 | The length of link 1. |
linkLength2 | The length of link 2. |
linkMass1 | The mass of link 1. |
linkMass2 | The mass of link 2. |
linkCom1 | The position of the center of mass of link 1. |
linkCom2 | The position of the center of mass of link 2. |
linkMoi | The moments of inertia for both links. |
maxVel1 | The maximum angular velocity of link 1. |
maxVel2 | The maximum angular velocity of link 2. |
dt | The time step used when integrating the dynamics. |
doneReward | The reward received by the agent on success. |
Definition at line 122 of file acrobot.hpp.
arma::colvec | Dsdt (arma::colvec state, const double torque) const | inline |
The ordinary differential equations used to estimate the next state through the RK4 method.
state | The current state. |
torque | The torque applied. |
Definition at line 249 of file acrobot.hpp.
References M_PI.
Referenced by Acrobot::Rk4().
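For orientation, the derivative function can be sketched with the equations of motion commonly used for the Acrobot benchmark (the textbook formulation by Sutton and Barto). This is a stand-alone illustration under stated assumptions, not the code in acrobot.hpp: the `AcrobotParams` struct, the `Vec4` alias, the state ordering, and the name `DsdtSketch` are all hypothetical, and the sketch assumes angles are measured from the downward vertical.

```cpp
#include <array>
#include <cmath>

// Hypothetical parameter bundle mirroring the documented constructor defaults.
struct AcrobotParams
{
  double gravity = 9.81;
  double linkLength1 = 1.0;
  double linkMass1 = 1.0;
  double linkMass2 = 1.0;
  double linkCom1 = 0.5;
  double linkCom2 = 0.5;
  double linkMoi = 1.0; // Shared by both links.
};

// Assumed state layout: {theta1, theta2, angularVelocity1, angularVelocity2}.
using Vec4 = std::array<double, 4>;

// Returns d(state)/dt for a torque applied at the second joint.
inline Vec4 DsdtSketch(const Vec4& s,
                       const double torque,
                       const AcrobotParams& p = AcrobotParams())
{
  const double pi = std::acos(-1.0);
  const double m1 = p.linkMass1, m2 = p.linkMass2, l1 = p.linkLength1;
  const double lc1 = p.linkCom1, lc2 = p.linkCom2;
  const double moi = p.linkMoi, g = p.gravity;
  const double theta1 = s[0], theta2 = s[1], w1 = s[2], w2 = s[3];

  // Inertia terms.
  const double d1 = m1 * lc1 * lc1 +
      m2 * (l1 * l1 + lc2 * lc2 + 2 * l1 * lc2 * std::cos(theta2)) + 2 * moi;
  const double d2 = m2 * (lc2 * lc2 + l1 * lc2 * std::cos(theta2)) + moi;

  // Gravity and Coriolis terms.
  const double phi2 = m2 * lc2 * g * std::cos(theta1 + theta2 - pi / 2);
  const double phi1 = -m2 * l1 * lc2 * w2 * w2 * std::sin(theta2) -
      2 * m2 * l1 * lc2 * w2 * w1 * std::sin(theta2) +
      (m1 * lc1 + m2 * l1) * g * std::cos(theta1 - pi / 2) + phi2;

  // Angular accelerations of the second and first joint.
  const double a2 = (torque + d2 / d1 * phi1 -
      m2 * l1 * lc2 * w1 * w1 * std::sin(theta2) - phi2) /
      (m2 * lc2 * lc2 + moi - d2 * d2 / d1);
  const double a1 = -(d2 * a2 + phi1) / d1;

  return {{w1, w2, a1, a2}};
}
```

With both links hanging at rest and zero torque, the returned accelerations are numerically zero; a positive torque accelerates the second joint in the positive direction.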
State | InitialSample () | inline |
This function randomly initializes the state.
Definition at line 213 of file acrobot.hpp.
References Acrobot::State::State().
bool | IsTerminal (const State &state) const | inline |
This function checks if the acrobot has reached the terminal state.
state | The current State. |
Definition at line 225 of file acrobot.hpp.
References Log::Info, Acrobot::State::Theta1(), and Acrobot::State::Theta2().
Referenced by Acrobot::Sample().
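The terminal check described above can be sketched as follows. This is a minimal illustration, not the library code: it assumes unit-length links, angles measured from the downward vertical, a strict comparison against one link length, and a step limit that is ignored when `maxSteps` is 0. The names `TipHeight` and `IsTerminalSketch` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Height of the end-effector above the base for two unit-length links,
// with both angles measured from the downward vertical (assumed convention).
inline double TipHeight(const double theta1, const double theta2)
{
  return -std::cos(theta1) - std::cos(theta1 + theta2);
}

// True when the step limit is reached (maxSteps == 0 means no limit) or the
// end-effector is more than one link length above the base.
inline bool IsTerminalSketch(const double theta1,
                             const double theta2,
                             const std::size_t stepsPerformed,
                             const std::size_t maxSteps)
{
  if (maxSteps != 0 && stepsPerformed >= maxSteps)
    return true;

  return TipHeight(theta1, theta2) > 1.0;
}
```

With both links hanging straight down the tip height is -2, so the episode continues until either the limit is hit or the links swing high enough.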
size_t | MaxSteps () const | inline |
Get the maximum number of steps allowed.
Definition at line 350 of file acrobot.hpp.
size_t & | MaxSteps () | inline |
Set the maximum number of steps allowed.
Definition at line 352 of file acrobot.hpp.
arma::colvec | Rk4 (const arma::colvec state, const double torque) const | inline |
This function applies the RK4 iterative method to estimate the next state from the given ordinary differential equations.
state | The current State. |
torque | The torque applied. |
Definition at line 335 of file acrobot.hpp.
References Acrobot::Dsdt().
Referenced by Acrobot::Sample().
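double | Sample (const State &state, const Action &action, State &nextState) |
Dynamics of the Acrobot System.
Computes the reward and the next state from the current state and action. Each step that does not end the episode yields a reward of -1.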
state | The current State. |
action | The action taken. |
nextState | The next state. |
The angular velocities are clamped between their minimum and maximum values.
Definition at line 160 of file acrobot.hpp.
References Acrobot::State::AngularVelocity1(), Acrobot::State::AngularVelocity2(), mlpack::math::ClampRange(), Acrobot::IsTerminal(), M_PI, Acrobot::Rk4(), Acrobot::State::Theta1(), Acrobot::State::Theta2(), Acrobot::Torque(), and Acrobot::Wrap().
Referenced by Acrobot::Sample().
double | Sample (const State &state, const Action &action) |
Dynamics of the Acrobot System.
Computes the reward for the current state and action. This overload calls the three-argument Sample() to estimate the next state and returns the reward for taking the given action.
state | The current State. |
action | The action taken. |
Definition at line 204 of file acrobot.hpp.
References Acrobot::Sample().
size_t | StepsPerformed () const | inline |
Get the number of steps performed.
Definition at line 347 of file acrobot.hpp.
double | Torque (const Action &action) const | inline |
This function calculates the torque for a particular action.
0 : negative torque, 1 : zero torque, 2 : positive torque.
action | Action taken. |
Definition at line 322 of file acrobot.hpp.
References Acrobot::Action::action, and mlpack::math::Random().
Referenced by Acrobot::Sample().
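The action-to-torque mapping listed above can be sketched as a shift of the action index. This is a hedged illustration: the unit torque magnitude is an assumption, and the optional `noise` argument is a hypothetical stand-in for whatever random perturbation the library draws through `mlpack::math::Random()`.

```cpp
// Maps action 0, 1, 2 to torque -1, 0, +1 and adds an optional perturbation.
// The unit magnitude and the noise parameter are assumptions of this sketch.
inline double TorqueSketch(const int action, const double noise = 0.0)
{
  return static_cast<double>(action - 1) + noise;
}
```

Passing the noise in explicitly keeps the sketch deterministic, which makes it easy to check the three base torques.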
double | Wrap (double value, const double minimum, const double maximum) const | inline |
The Wrap function is required to wrap the angle value into the range -180 to 180 degrees.
This function makes sure that the value always lies between minimum and maximum.
value | Scalar value to wrap. |
minimum | Minimum range of wrap. |
maximum | Maximum range of wrap. |
Definition at line 298 of file acrobot.hpp.
Referenced by Acrobot::Sample().
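One way to wrap a value into a range is modular arithmetic, sketched below. This is not necessarily how acrobot.hpp implements `Wrap()`; the name `WrapSketch` is hypothetical, and the sketch maps values into the half-open interval [minimum, maximum), so a value exactly equal to `maximum` comes back as `minimum`.

```cpp
#include <cmath>

// Wraps value into [minimum, maximum) using the length of the interval as
// the period. Assumes maximum > minimum.
inline double WrapSketch(const double value,
                         const double minimum,
                         const double maximum)
{
  const double range = maximum - minimum;

  // std::fmod keeps the sign of its first argument, so negative remainders
  // are shifted up by one period.
  double shifted = std::fmod(value - minimum, range);
  if (shifted < 0.0)
    shifted += range;

  return shifted + minimum;
}
```

For angles in radians, calling `WrapSketch(theta, -pi, pi)` turns 3π/2 into -π/2 and leaves values already inside the range essentially unchanged.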