verve::PredictiveModel Class Reference

A PredictiveModel learns a predictive model of the environment dynamics (transitions) from direct experience.

#include <PredictiveModel.h>

Public Member Functions

VERVE_DECL PredictiveModel (const Observation &obs, bool isDynamicRBFEnabled, unsigned int numActions)
virtual VERVE_DECL ~PredictiveModel ()
virtual VERVE_DECL void VERVE_CALL resetShortTermMemory ()
virtual VERVE_DECL void VERVE_CALL predictAndTrain (const Observation &actualPrevObs, unsigned int prevAction, const Observation &actualCurrentObs, const real actualCurrentReward, Observation &predCurrentObs, real &predCurrentReward, real &predUncertainty)
virtual VERVE_DECL void VERVE_CALL predict (const Observation &actualCurrentObs, unsigned int currentAction, Observation &predNextObs, real &predNextReward, real &predUncertainty, bool allowDynamicRBFCreation)
virtual VERVE_DECL void VERVE_CALL changeStepSize (real newValue)
virtual VERVE_DECL void VERVE_CALL setDeltaLearningRate (real timeConstant, real stepSize)
virtual VERVE_DECL real VERVE_CALL getPredictionMSE ()

Protected Attributes

RBFInputData mStateActionInputData
real * mDiscObsTrainingData
RBFPopulation * mStateActionRepresentation
Population * mDiscObsPredPopulation
Population * mContObsPredPopulation
Population * mRewardPredPopulation
Population * mUncertaintyPredPopulation
std::vector< Population * > mAllPopulations
real mLatestPredMSE
real mDeltaLearningTimeConstant
real mDeltaLearningFactor


Detailed Description

A PredictiveModel learns a predictive model of the environment dynamics (transitions) from direct experience.

Given some Observation and action, it outputs the predicted next Observation and reward. It is trained using prediction errors computed from the actual next Observation and reward.
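The predict-then-correct cycle described above can be sketched with a much simpler stand-in. The class below is hypothetical and is not Verve code: ToyPredictiveModel, its integer state and action indices, and its single-valued observation are all assumptions made for illustration, whereas the real class works on Observation objects through RBF-based Populations. It only shows how a delta rule moves a stored prediction toward the actual outcome in proportion to the prediction error.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a predictive model; not part of Verve.
// States and actions are small integer indices, and the "observation"
// is a single real value.
class ToyPredictiveModel {
public:
    ToyPredictiveModel(std::size_t numStates, std::size_t numActions,
                       double learningRate)
        : mNumActions(numActions),
          mLearningRate(learningRate),
          mObsPred(numStates * numActions, 0.0),
          mRewardPred(numStates * numActions, 0.0) {
        assert(numStates > 0 && numActions > 0);
        assert(learningRate > 0.0 && learningRate <= 1.0);
    }

    // Outputs the predicted next observation and reward for (state, action).
    void predict(std::size_t state, std::size_t action,
                 double& predNextObs, double& predNextReward) const {
        const std::size_t i = index(state, action);
        predNextObs = mObsPred[i];
        predNextReward = mRewardPred[i];
    }

    // Delta rule: move each prediction toward the actual outcome by a
    // fraction (the learning rate) of the prediction error.
    void train(std::size_t state, std::size_t action,
               double actualNextObs, double actualNextReward) {
        const std::size_t i = index(state, action);
        mObsPred[i] += mLearningRate * (actualNextObs - mObsPred[i]);
        mRewardPred[i] += mLearningRate * (actualNextReward - mRewardPred[i]);
    }

private:
    // Flattens (state, action) into one table index.
    std::size_t index(std::size_t state, std::size_t action) const {
        assert(action < mNumActions);
        const std::size_t i = state * mNumActions + action;
        assert(i < mObsPred.size());
        return i;
    }

    std::size_t mNumActions;
    double mLearningRate;
    std::vector<double> mObsPred;
    std::vector<double> mRewardPred;
};
```

With a learning rate of 0.5, each call to train() halves the remaining error for the visited state-action pair and leaves every other pair untouched.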

Definition at line 49 of file PredictiveModel.h.


Constructor & Destructor Documentation

verve::PredictiveModel::PredictiveModel (const Observation & obs, bool isDynamicRBFEnabled, unsigned int numActions)

Sets up the PredictiveModel to work with the given type of Observation.

Applies initial noise to trainable Connection weights.

Definition at line 30 of file PredictiveModel.cpp.

References verve::Observation::getDiscreteInputNumOptions(), and verve::Observation::getNumDiscreteInputs().

verve::PredictiveModel::~PredictiveModel () [virtual]

Definition at line 120 of file PredictiveModel.cpp.

References mAllPopulations, and mDiscObsTrainingData.


Member Function Documentation

void verve::PredictiveModel::changeStepSize (real newValue) [virtual]

Updates all step size-dependent factors using the new step size.

Definition at line 268 of file PredictiveModel.cpp.

References mDeltaLearningTimeConstant, and setDeltaLearningRate().

Referenced by verve::Agent::setStepSize().

real verve::PredictiveModel::getPredictionMSE () [virtual]

Returns the most recent combined mean squared error for Observation and reward predictions.
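How the Observation and reward errors are combined is not spelled out here. One plausible reading, shown purely as an assumption, pools the squared error of every Observation component with the squared reward error and divides by the total number of terms. The function name combinedPredictionMSE and its vector-based parameters are hypothetical and do not appear in Verve.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Verve: pools the squared errors of
// every observation component and the reward into a single mean.
double combinedPredictionMSE(const std::vector<double>& predObs,
                             const std::vector<double>& actualObs,
                             double predReward, double actualReward) {
    assert(predObs.size() == actualObs.size());

    double sumSquaredError = 0.0;
    for (std::size_t i = 0; i < predObs.size(); ++i) {
        const double error = actualObs[i] - predObs[i];
        sumSquaredError += error * error;
    }

    const double rewardError = actualReward - predReward;
    sumSquaredError += rewardError * rewardError;

    // The extra term in the denominator accounts for the reward.
    return sumSquaredError / static_cast<double>(predObs.size() + 1);
}
```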

Definition at line 292 of file PredictiveModel.cpp.

Referenced by verve::Agent::getModelMSE().

void verve::PredictiveModel::predict (const Observation & actualCurrentObs, unsigned int currentAction, Observation & predNextObs, real & predNextReward, real & predUncertainty, bool allowDynamicRBFCreation) [virtual]

Predicts the next Observation and reward based on the given current Observation and action.

Long-term changes occur here only if 'allowDynamicRBFCreation' is true.

Definition at line 194 of file PredictiveModel.cpp.

References verve::RBFInputData::discInputData, verve::Observation::getDiscreteValue(), verve::Observation::getNumDiscreteInputs(), and mStateActionInputData.

Referenced by verve::Agent::planningSequence(), and predictAndTrain().

void verve::PredictiveModel::predictAndTrain (const Observation & actualPrevObs, unsigned int prevAction, const Observation & actualCurrentObs, const real actualCurrentReward, Observation & predCurrentObs, real & predCurrentReward, real & predUncertainty) [virtual]

Predicts the current Observation and reward based on the previous Observation and action.

This also trains the predictors based on the given actual current Observation and actual current reward. On the first step, no previous Observation exists, so the predicted Observation and reward are simply set equal to the actual Observation and reward.

Definition at line 146 of file PredictiveModel.cpp.

References verve::Observation::getDiscreteInputNumOptions(), verve::Observation::getDiscreteValue(), verve::Observation::getNumDiscreteInputs(), mDiscObsTrainingData, and predict().

Referenced by verve::Agent::update().

void verve::PredictiveModel::resetShortTermMemory () [virtual]

Resets temporary dynamics without affecting learned parameters.

Definition at line 133 of file PredictiveModel.cpp.

References mAllPopulations, mLatestPredMSE, mStateActionInputData, and verve::RBFInputData::zeroInputData().

Referenced by verve::Agent::resetShortTermMemory().

void verve::PredictiveModel::setDeltaLearningRate (real timeConstant, real stepSize) [virtual]

Sets the delta rule learning rate for the predictors.

The time constant (which must be greater than zero) specifies how many seconds it takes for the prediction errors to be reduced to 37% (about 1/e) of their initial values. The resulting learning rate also depends on the given step size.
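The 37% figure corresponds to 1/e, which is what an exponential decay reaches after one time constant. The sketch below works through that arithmetic under stated assumptions: the step size is taken to be the duration of one update in seconds, and the decay per step is taken to be exp(-stepSize / timeConstant). The helper names are hypothetical, and verve::globals::calcDecayConstant() may be defined differently. The reference list for this function also names verve::RBFPopulation::computeMaxActivationSum(), which suggests a normalization step that is left out here.

```cpp
#include <cassert>
#include <cmath>

// Assumed decay form; Verve's globals::calcDecayConstant() may differ.
double decayConstant(double timeConstant, double stepSize) {
    assert(timeConstant > 0.0);
    return std::exp(-stepSize / timeConstant);
}

// Fraction of the current prediction error removed by one update.
double deltaLearningFactor(double timeConstant, double stepSize) {
    return 1.0 - decayConstant(timeConstant, stepSize);
}

// Fraction of the initial error left after repeated updates lasting
// 'seconds' in total, starting from an error of 1.
double remainingErrorFraction(double timeConstant, double stepSize,
                              double seconds) {
    assert(stepSize > 0.0);
    const double factor = deltaLearningFactor(timeConstant, stepSize);
    const long numSteps = std::lround(seconds / stepSize);

    double error = 1.0;
    for (long i = 0; i < numSteps; ++i) {
        error -= factor * error;  // delta rule applied to the error itself
    }
    return error;
}
```

Under these assumptions, training for exactly one time constant leaves roughly exp(-1), or about 37%, of the initial error regardless of the step size chosen, which is why the factor has to be recomputed whenever the step size changes.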

Definition at line 273 of file PredictiveModel.cpp.

References verve::globals::calcDecayConstant(), verve::RBFPopulation::computeMaxActivationSum(), mDeltaLearningFactor, mDeltaLearningTimeConstant, and mStateActionRepresentation.

Referenced by changeStepSize(), and verve::Agent::setModelLearningRate().


Member Data Documentation

std::vector<Population*> verve::PredictiveModel::mAllPopulations [protected]
 

A list of all Populations.

Definition at line 128 of file PredictiveModel.h.

Referenced by resetShortTermMemory(), and ~PredictiveModel().

Population* verve::PredictiveModel::mContObsPredPopulation [protected]
 

The continuous Observation prediction Population.

Definition at line 119 of file PredictiveModel.h.

real verve::PredictiveModel::mDeltaLearningFactor [protected]
 

A precomputed value used for the delta rule learning rate.

Definition at line 137 of file PredictiveModel.h.

Referenced by setDeltaLearningRate().

real verve::PredictiveModel::mDeltaLearningTimeConstant [protected]
 

A time constant that determines the delta rule learning rate.

Definition at line 134 of file PredictiveModel.h.

Referenced by changeStepSize(), and setDeltaLearningRate().

Population* verve::PredictiveModel::mDiscObsPredPopulation [protected]
 

The discrete Observation prediction Population.

Definition at line 116 of file PredictiveModel.h.

real* verve::PredictiveModel::mDiscObsTrainingData [protected]
 

An array used to store the discrete Observation data used for training.

Instead of keeping discrete data as integer values, it must be converted to reals within [-1, 1] before being used for training.
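The exact conversion is not given here. One simple mapping that satisfies the stated range, shown as an assumption only, spreads the option indices evenly across [-1, 1]. The function discreteToReal is hypothetical and is not taken from Verve, which may use a different encoding.

```cpp
#include <cassert>

// Hypothetical mapping, not taken from Verve: spreads the option
// indices 0 .. numOptions-1 evenly over the interval [-1, 1].
double discreteToReal(unsigned int value, unsigned int numOptions) {
    assert(numOptions > 0 && value < numOptions);
    if (numOptions == 1) {
        return 0.0;  // a single option carries no information
    }
    return -1.0 + 2.0 * static_cast<double>(value)
                      / static_cast<double>(numOptions - 1);
}
```

For an input with five options, index 0 maps to -1, index 2 to 0, and index 4 to 1.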

Definition at line 110 of file PredictiveModel.h.

Referenced by predictAndTrain(), and ~PredictiveModel().

real verve::PredictiveModel::mLatestPredMSE [protected]
 

The latest MSE for Observation and reward predictions.

Definition at line 131 of file PredictiveModel.h.

Referenced by resetShortTermMemory().

Population* verve::PredictiveModel::mRewardPredPopulation [protected]
 

The reward prediction Population.

Definition at line 122 of file PredictiveModel.h.

RBFInputData verve::PredictiveModel::mStateActionInputData [protected]
 

A convenient data structure used to pass data to the state-action representation.

Definition at line 104 of file PredictiveModel.h.

Referenced by predict(), and resetShortTermMemory().

RBFPopulation* verve::PredictiveModel::mStateActionRepresentation [protected]
 

The state-action representation Population.

Definition at line 113 of file PredictiveModel.h.

Referenced by setDeltaLearningRate().

Population* verve::PredictiveModel::mUncertaintyPredPopulation [protected]
 

The uncertainty prediction Population.

Definition at line 125 of file PredictiveModel.h.


The documentation for this class was generated from the following files:
PredictiveModel.h
PredictiveModel.cpp

Generated on Tue Jan 24 21:46:39 2006 for Verve by doxygen 1.4.6-NO