#include <PredictiveModel.h>
Public Member Functions | |
VERVE_DECL | PredictiveModel (const Observation &obs, bool isDynamicRBFEnabled, unsigned int numActions) |
virtual VERVE_DECL | ~PredictiveModel () |
virtual VERVE_DECL void VERVE_CALL | resetShortTermMemory () |
virtual VERVE_DECL void VERVE_CALL | predictAndTrain (const Observation &actualPrevObs, unsigned int prevAction, const Observation &actualCurrentObs, const real actualCurrentReward, Observation &predCurrentObs, real &predCurrentReward, real &predUncertainty) |
virtual VERVE_DECL void VERVE_CALL | predict (const Observation &actualCurrentObs, unsigned int currentAction, Observation &predNextObs, real &predNextReward, real &predUncertainty, bool allowDynamicRBFCreation) |
virtual VERVE_DECL void VERVE_CALL | changeStepSize (real newValue) |
virtual VERVE_DECL void VERVE_CALL | setDeltaLearningRate (real timeConstant, real stepSize) |
virtual VERVE_DECL real VERVE_CALL | getPredictionMSE () |
Protected Attributes | |
RBFInputData | mStateActionInputData |
real * | mDiscObsTrainingData |
RBFPopulation * | mStateActionRepresentation |
Population * | mDiscObsPredPopulation |
Population * | mContObsPredPopulation |
Population * | mRewardPredPopulation |
Population * | mUncertaintyPredPopulation |
std::vector< Population * > | mAllPopulations |
real | mLatestPredMSE |
real | mDeltaLearningTimeConstant |
real | mDeltaLearningFactor |
Given some Observation and action, it outputs the predicted next Observation and reward. It is trained using prediction errors computed from the actual next Observation and reward.
Definition at line 49 of file PredictiveModel.h.
PredictiveModel::PredictiveModel (const Observation &obs, bool isDynamicRBFEnabled, unsigned int numActions)
Sets up the PredictiveModel to work with the given type of Observation. Applies initial noise to trainable Connection weights.
Definition at line 30 of file PredictiveModel.cpp. References verve::Observation::getDiscreteInputNumOptions(), and verve::Observation::getNumDiscreteInputs().
virtual PredictiveModel::~PredictiveModel ()
Definition at line 120 of file PredictiveModel.cpp. References mAllPopulations, and mDiscObsTrainingData.
virtual void PredictiveModel::changeStepSize (real newValue)
Updates all step size-dependent factors using the new step size.
Definition at line 268 of file PredictiveModel.cpp. References mDeltaLearningTimeConstant, and setDeltaLearningRate(). Referenced by verve::Agent::setStepSize().
virtual real PredictiveModel::getPredictionMSE ()
Returns the most recent combined mean squared error for Observation and reward predictions.
Definition at line 292 of file PredictiveModel.cpp. Referenced by verve::Agent::getModelMSE().
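The page does not say how the Observation and reward errors are combined. As a rough sketch only, the helper below assumes that every Observation component and the reward contribute one squared error each and that the result is their plain average. The name `combinedMSE`, its parameters, and the use of `double` in place of `real` are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the library's implementation.
// Averages the squared prediction errors over every Observation component
// plus the single reward value.
double combinedMSE(const std::vector<double>& actualObs,
                   const std::vector<double>& predObs,
                   double actualReward,
                   double predReward)
{
    // Both Observation vectors must describe the same inputs.
    assert(actualObs.size() == predObs.size());

    double squaredErrorSum = 0.0;
    for (std::size_t i = 0; i < actualObs.size(); ++i)
    {
        const double error = actualObs[i] - predObs[i];
        squaredErrorSum += error * error;
    }

    const double rewardError = actualReward - predReward;
    squaredErrorSum += rewardError * rewardError;

    // One term per Observation component, plus one for the reward.
    return squaredErrorSum / static_cast<double>(actualObs.size() + 1);
}
```

Under this assumption, a perfect prediction yields zero, and a single unit error spread over two terms yields 0.5.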
virtual void PredictiveModel::predict (const Observation &actualCurrentObs, unsigned int currentAction, Observation &predNextObs, real &predNextReward, real &predUncertainty, bool allowDynamicRBFCreation)
Predicts the next Observation and reward based on the given current Observation and action. Long-term changes occur here only if 'allowDynamicRBFCreation' is true.
Definition at line 194 of file PredictiveModel.cpp. References verve::RBFInputData::discInputData, verve::Observation::getDiscreteValue(), verve::Observation::getNumDiscreteInputs(), and mStateActionInputData. Referenced by verve::Agent::planningSequence(), and predictAndTrain().
virtual void PredictiveModel::predictAndTrain (const Observation &actualPrevObs, unsigned int prevAction, const Observation &actualCurrentObs, const real actualCurrentReward, Observation &predCurrentObs, real &predCurrentReward, real &predUncertainty)
Predicts the current Observation and reward based on the previous Observation and action. This also trains the predictors based on the given actual current Observation and actual current reward. On the first step the predicted Observation and reward will simply be set equal to the actual Observation and reward.
Definition at line 146 of file PredictiveModel.cpp. References verve::Observation::getDiscreteInputNumOptions(), verve::Observation::getDiscreteValue(), verve::Observation::getNumDiscreteInputs(), mDiscObsTrainingData, and predict(). Referenced by verve::Agent::update().
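The first-step rule can be illustrated with a toy stand-in. The class below is a hypothetical sketch that predicts a single reward value with one weight trained by the delta rule; `ToyPredictor`, its members, and the assumption that resetting short-term memory makes the next call count as a first step are all illustrative and are not taken from the library.

```cpp
#include <cassert>

// Hypothetical one-weight stand-in for a predictive model. It mirrors only
// the documented first-step behavior and is not the library's code.
class ToyPredictor
{
public:
    explicit ToyPredictor(double learningFactor)
        : mLearningFactor(learningFactor), mWeight(0.0), mFirstStep(true)
    {
        assert(learningFactor > 0.0 && learningFactor <= 1.0);
    }

    // Forgets the step history but keeps the learned weight
    // (assumed behavior for this sketch).
    void resetShortTermMemory()
    {
        mFirstStep = true;
    }

    // Returns the prediction for the current reward, then trains on the
    // actual current reward.
    double predictAndTrain(double actualCurrentReward)
    {
        if (mFirstStep)
        {
            // There is no previous step to predict from, so the prediction
            // is simply set equal to the actual value and nothing is trained.
            mFirstStep = false;
            return actualCurrentReward;
        }

        const double prediction = mWeight;
        // Delta rule: move the weight toward the actual value.
        mWeight += mLearningFactor * (actualCurrentReward - prediction);
        return prediction;
    }

private:
    double mLearningFactor;
    double mWeight;
    bool mFirstStep;
};
```

In this sketch the first call echoes the actual reward, later calls return the learned estimate, and a reset restores the echo behavior without discarding the weight.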
virtual void PredictiveModel::resetShortTermMemory ()
Resets temporary dynamics without affecting learned parameters.
Definition at line 133 of file PredictiveModel.cpp. References mAllPopulations, mLatestPredMSE, mStateActionInputData, and verve::RBFInputData::zeroInputData(). Referenced by verve::Agent::resetShortTermMemory().
virtual void PredictiveModel::setDeltaLearningRate (real timeConstant, real stepSize)
Sets the delta rule learning rate for the predictors. The time constant (which must be greater than zero) specifies how many seconds it takes for prediction errors to be reduced to 37% of their initial values.
Definition at line 273 of file PredictiveModel.cpp. References verve::globals::calcDecayConstant(), verve::RBFPopulation::computeMaxActivationSum(), mDeltaLearningFactor, mDeltaLearningTimeConstant, and mStateActionRepresentation. Referenced by changeStepSize(), and verve::Agent::setModelLearningRate().
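The 37% figure corresponds to a factor of exp(-1), which is what an exponential decay reaches after one time constant. The sketch below shows that relationship under stated assumptions: `decayPerStep` and `remainingErrorFraction` are hypothetical helpers, `double` stands in for `real`, and the body of `verve::globals::calcDecayConstant()` and any scaling by the maximum activation sum are not reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a decay-constant helper; not the library's code.
// Returns the per-step multiplier that shrinks a quantity to exp(-1)
// (about 37%) of its value after `timeConstant` seconds, given update steps
// of `stepSize` seconds.
double decayPerStep(double timeConstant, double stepSize)
{
    assert(timeConstant > 0.0); // the time constant must be greater than zero
    assert(stepSize > 0.0);
    return std::exp(-stepSize / timeConstant);
}

// Applies the per-step decay for `seconds` of simulated time and returns the
// fraction of the initial error that remains.
double remainingErrorFraction(double timeConstant, double stepSize, double seconds)
{
    const double decay = decayPerStep(timeConstant, stepSize);
    const long numSteps = std::lround(seconds / stepSize);

    double fraction = 1.0;
    for (long i = 0; i < numSteps; ++i)
    {
        // Equivalent to a delta rule update with learning factor (1 - decay).
        fraction *= decay;
    }
    return fraction;
}
```

With a time constant of 2 seconds and a step size of 0.01 seconds, 200 steps leave roughly 37% of the initial error. Because the per-step factor depends on the step size, it has to be recomputed whenever the step size changes, which is consistent with changeStepSize() calling setDeltaLearningRate().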
std::vector< Population * > PredictiveModel::mAllPopulations
A list of all Populations.
Definition at line 128 of file PredictiveModel.h. Referenced by resetShortTermMemory(), and ~PredictiveModel().
Population * PredictiveModel::mContObsPredPopulation
The continuous Observation prediction Population.
Definition at line 119 of file PredictiveModel.h.
real PredictiveModel::mDeltaLearningFactor
A precomputed value used for the delta rule learning rate.
Definition at line 137 of file PredictiveModel.h. Referenced by setDeltaLearningRate().
real PredictiveModel::mDeltaLearningTimeConstant
A time constant that determines the delta rule learning rate.
Definition at line 134 of file PredictiveModel.h. Referenced by changeStepSize(), and setDeltaLearningRate().
Population * PredictiveModel::mDiscObsPredPopulation
The discrete Observation prediction Population.
Definition at line 116 of file PredictiveModel.h.
real * PredictiveModel::mDiscObsTrainingData
An array used to store the discrete Observation data used for training. Instead of keeping discrete data as integer values, it must be converted to reals within [-1, 1] before being used for training.
Definition at line 110 of file PredictiveModel.h. Referenced by predictAndTrain(), and ~PredictiveModel().
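The exact conversion is not given on this page. One plausible mapping, shown here purely as an assumption, spreads the option indices evenly across [-1, 1]. The function `discreteToReal` is hypothetical, and `double` stands in for `real`.

```cpp
#include <cassert>

// Hypothetical linear mapping; the library's actual conversion may differ.
// Maps a discrete value in {0, ..., numOptions - 1} onto the range [-1, 1],
// with option 0 at -1 and the last option at +1.
double discreteToReal(unsigned int value, unsigned int numOptions)
{
    assert(numOptions > 0);
    assert(value < numOptions);

    if (numOptions == 1)
    {
        // A single option carries no information; use the midpoint.
        return 0.0;
    }

    return -1.0 + 2.0 * static_cast<double>(value)
                      / static_cast<double>(numOptions - 1);
}
```

For an input with five options, this maps the values 0, 2, and 4 to -1, 0, and 1 respectively.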
real PredictiveModel::mLatestPredMSE
The latest MSE for Observation and reward predictions.
Definition at line 131 of file PredictiveModel.h. Referenced by resetShortTermMemory().
Population * PredictiveModel::mRewardPredPopulation
The reward prediction Population.
Definition at line 122 of file PredictiveModel.h.
RBFInputData PredictiveModel::mStateActionInputData
A convenient data structure used to pass data to the state-action representation.
Definition at line 104 of file PredictiveModel.h. Referenced by predict(), and resetShortTermMemory().
RBFPopulation * PredictiveModel::mStateActionRepresentation
The state-action representation Population.
Definition at line 113 of file PredictiveModel.h. Referenced by setDeltaLearningRate().
Population * PredictiveModel::mUncertaintyPredPopulation
The uncertainty prediction Population.
Definition at line 125 of file PredictiveModel.h.