verve::PredictiveModel Class Reference

A PredictiveModel learns a predictive model of the environment dynamics (transitions) from direct experience.

#include <PredictiveModel.h>

Public Member Functions

VERVE_DECL PredictiveModel (const Observation &obs, bool isDynamicRBFEnabled, unsigned int numActions)

virtual VERVE_DECL ~PredictiveModel ()

virtual VERVE_DECL void VERVE_CALL resetShortTermMemory ()

virtual VERVE_DECL void VERVE_CALL predictAndTrain (const Observation &actualPrevObs, unsigned int prevAction, const Observation &actualCurrentObs, const real actualCurrentReward, Observation &predCurrentObs, real &predCurrentReward, real &predUncertainty)

virtual VERVE_DECL void VERVE_CALL predict (const Observation &actualCurrentObs, unsigned int currentAction, Observation &predNextObs, real &predNextReward, real &predUncertainty, bool allowDynamicRBFCreation)

virtual VERVE_DECL void VERVE_CALL changeStepSize (real newValue)

virtual VERVE_DECL void VERVE_CALL setDeltaLearningRate (real timeConstant, real stepSize)

virtual VERVE_DECL real VERVE_CALL getPredictionMSE ()

Protected Attributes

RBFInputData mStateActionInputData

real * mDiscObsTrainingData

RBFPopulation * mStateActionRepresentation

Population * mDiscObsPredPopulation

Population * mContObsPredPopulation

Population * mRewardPredPopulation

Population * mUncertaintyPredPopulation

std::vector< Population * > mAllPopulations

real mLatestPredMSE

real mDeltaLearningTimeConstant

real mDeltaLearningFactor

Detailed Description

A PredictiveModel learns a predictive model of the environment dynamics (transitions) from direct experience.

Given some Observation and action, it outputs the predicted next Observation and reward. It is trained using prediction errors computed from the actual next Observation and reward.
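
The predict-then-train cycle described above can be illustrated with a small sketch. This is not Verve code: ToyPredictiveModel, its tabular storage, its scalar observations, and its fixed learning rate are all assumptions made for illustration, whereas the class documented here keeps its state-action representation in an RBFPopulation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration only; not part of Verve.
// It predicts a scalar next observation and a reward for each
// (state, action) pair and learns from its prediction errors.
class ToyPredictiveModel
{
public:
    ToyPredictiveModel(std::size_t numStates, std::size_t numActions,
                       double learningRate)
    : mNumActions(numActions),
      mLearningRate(learningRate),
      mNextObs(numStates * numActions, 0.0),
      mReward(numStates * numActions, 0.0)
    {
        assert(numStates > 0 && numActions > 0);
        assert(learningRate > 0.0 && learningRate <= 1.0);
    }

    // Outputs the predicted next observation and reward.
    void predict(std::size_t state, std::size_t action,
                 double& predNextObs, double& predNextReward) const
    {
        const std::size_t i = index(state, action);
        predNextObs = mNextObs[i];
        predNextReward = mReward[i];
    }

    // Moves both predictions toward the actual outcome (delta rule).
    void train(std::size_t state, std::size_t action,
               double actualNextObs, double actualNextReward)
    {
        const std::size_t i = index(state, action);
        mNextObs[i] += mLearningRate * (actualNextObs - mNextObs[i]);
        mReward[i] += mLearningRate * (actualNextReward - mReward[i]);
    }

private:
    // Maps a (state, action) pair to a slot in the flat tables.
    std::size_t index(std::size_t state, std::size_t action) const
    {
        assert(action < mNumActions);
        const std::size_t i = state * mNumActions + action;
        assert(i < mNextObs.size());
        return i;
    }

    std::size_t mNumActions;
    double mLearningRate;
    std::vector<double> mNextObs;
    std::vector<double> mReward;
};
```

With a learning rate of 0.5, calling train(0, 1, 1.0, 2.0) twice moves the observation prediction for that pair from 0 to 0.5 and then to 0.75, so each call removes half of the remaining prediction error.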

Definition at line 49 of file PredictiveModel.h.

Constructor & Destructor Documentation

verve::PredictiveModel::PredictiveModel ( const Observation & obs,

bool isDynamicRBFEnabled,

unsigned int numActions

)

Sets up the PredictiveModel to work with the given type of Observation.
Applies initial noise to trainable Connection weights.

Definition at line 30 of file PredictiveModel.cpp.
References verve::Observation::getDiscreteInputNumOptions(), and verve::Observation::getNumDiscreteInputs().

verve::PredictiveModel::~PredictiveModel ( ) [virtual]

Definition at line 120 of file PredictiveModel.cpp.
References mAllPopulations, and mDiscObsTrainingData.

Member Function Documentation

void verve::PredictiveModel::changeStepSize ( real newValue ) [virtual]

Updates all step size-dependent factors using the new step size.

Definition at line 268 of file PredictiveModel.cpp.
References mDeltaLearningTimeConstant, and setDeltaLearningRate().
Referenced by verve::Agent::setStepSize().

real verve::PredictiveModel::getPredictionMSE ( ) [virtual]

Returns the most recent combined mean squared error for Observation and reward predictions.

Definition at line 292 of file PredictiveModel.cpp.
Referenced by verve::Agent::getModelMSE().
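
The source does not say how the Observation and reward errors are combined into one value. As a rough sketch under that caveat, one simple possibility is to average the squared errors of every Observation element together with the squared reward error. The function name combinedMSE and the equal weighting are assumptions, not Verve's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not Verve's implementation: averages the squared
// errors of every observation element together with the reward error.
inline double combinedMSE(const std::vector<double>& actualObs,
                          const std::vector<double>& predObs,
                          double actualReward, double predReward)
{
    assert(actualObs.size() == predObs.size());

    double sum = 0.0;
    for (std::size_t i = 0; i < actualObs.size(); ++i)
    {
        const double err = actualObs[i] - predObs[i];
        sum += err * err;
    }

    const double rewardErr = actualReward - predReward;
    sum += rewardErr * rewardErr;

    // One term per observation element plus one for the reward.
    return sum / static_cast<double>(actualObs.size() + 1);
}
```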

void verve::PredictiveModel::predict ( const Observation & actualCurrentObs,

unsigned int currentAction,

Observation & predNextObs,

real & predNextReward,

real & predUncertainty,

bool allowDynamicRBFCreation

) [virtual]

Predicts the next Observation and reward based on the given current Observation and action.
Long-term changes occur here only if 'allowDynamicRBFCreation' is true.

Definition at line 194 of file PredictiveModel.cpp.
References verve::RBFInputData::discInputData, verve::Observation::getDiscreteValue(), verve::Observation::getNumDiscreteInputs(), and mStateActionInputData.
Referenced by verve::Agent::planningSequence(), and predictAndTrain().

void verve::PredictiveModel::predictAndTrain ( const Observation & actualPrevObs,

unsigned int prevAction,

const Observation & actualCurrentObs,

const real actualCurrentReward,

Observation & predCurrentObs,

real & predCurrentReward,

real & predUncertainty

) [virtual]

Predicts the current Observation and reward based on the previous Observation and action.
This also trains the predictors using the given actual current Observation and actual current reward. On the first step, the predicted Observation and reward are simply set equal to the actual Observation and reward.

Definition at line 146 of file PredictiveModel.cpp.
References verve::Observation::getDiscreteInputNumOptions(), verve::Observation::getDiscreteValue(), verve::Observation::getNumDiscreteInputs(), mDiscObsTrainingData, and predict().
Referenced by verve::Agent::update().

void verve::PredictiveModel::resetShortTermMemory ( ) [virtual]

Resets temporary dynamics without affecting learned parameters.

Definition at line 133 of file PredictiveModel.cpp.
References mAllPopulations, mLatestPredMSE, mStateActionInputData, and verve::RBFInputData::zeroInputData().
Referenced by verve::Agent::resetShortTermMemory().

void verve::PredictiveModel::setDeltaLearningRate ( real timeConstant,

real stepSize

) [virtual]

Sets the delta rule learning rate for the predictors.
The time constant (which must be greater than zero) specifies how many seconds it takes for prediction errors to be reduced to 37% of their initial values. changeStepSize() calls this function to recompute the rate when the step size changes.

Definition at line 273 of file PredictiveModel.cpp.
References verve::globals::calcDecayConstant(), verve::RBFPopulation::computeMaxActivationSum(), mDeltaLearningFactor, mDeltaLearningTimeConstant, and mStateActionRepresentation.
Referenced by changeStepSize(), and verve::Agent::setModelLearningRate().
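
The 37% figure is 1/e, the fraction that remains after one time constant of exponential decay. The sketch below shows one way a per-step learning factor could be derived from a time constant and a step size so that this holds regardless of the step size. The helper names and the formula 1 - exp(-stepSize / timeConstant) are assumptions; the source only lists calcDecayConstant() and computeMaxActivationSum() as references and does not show how they are combined.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Verve's calcDecayConstant): the fraction of
// an error that remains after one step of length stepSize.
inline double decayPerStep(double timeConstant, double stepSize)
{
    assert(timeConstant > 0.0 && stepSize > 0.0);
    return std::exp(-stepSize / timeConstant);
}

// Fraction of the current error that is corrected on each step.
inline double learningFactor(double timeConstant, double stepSize)
{
    return 1.0 - decayPerStep(timeConstant, stepSize);
}

// Applies the delta rule against a fixed target for numSteps steps and
// returns the fraction of the initial error that is left.
inline double remainingErrorFraction(double timeConstant, double stepSize,
                                     int numSteps)
{
    const double factor = learningFactor(timeConstant, stepSize);
    double error = 1.0;
    for (int i = 0; i < numSteps; ++i)
    {
        error -= factor * error;
    }
    return error;
}
```

With a time constant of 2 seconds, 200 steps of 0.01 seconds and 20 steps of 0.1 seconds both cover 2 seconds and both leave about 1/e (roughly 37%) of the initial error. This is why a factor of this kind has to be recomputed whenever the step size changes.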

Member Data Documentation

std::vector<Population*> verve::PredictiveModel::mAllPopulations [protected]

A list of all Populations.

Definition at line 128 of file PredictiveModel.h.
Referenced by resetShortTermMemory(), and ~PredictiveModel().

Population* verve::PredictiveModel::mContObsPredPopulation [protected]

The continuous Observation prediction Population.

Definition at line 119 of file PredictiveModel.h.

real verve::PredictiveModel::mDeltaLearningFactor [protected]

A precomputed value used for the delta rule learning rate.

Definition at line 137 of file PredictiveModel.h.
Referenced by setDeltaLearningRate().

real verve::PredictiveModel::mDeltaLearningTimeConstant [protected]

A time constant that determines the delta rule learning rate.

Definition at line 134 of file PredictiveModel.h.
Referenced by changeStepSize(), and setDeltaLearningRate().

Population* verve::PredictiveModel::mDiscObsPredPopulation [protected]

The discrete Observation prediction Population.

Definition at line 116 of file PredictiveModel.h.

real* verve::PredictiveModel::mDiscObsTrainingData [protected]

An array used to store the discrete Observation data used for training.
Discrete data cannot be used for training as integer values; it must first be converted to reals within [-1, 1].

Definition at line 110 of file PredictiveModel.h.
Referenced by predictAndTrain(), and ~PredictiveModel().
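
The source does not state which mapping is used for this conversion. Purely as an illustration, the sketch below rescales an option index linearly so that the first option maps to -1 and the last to 1. The function name discreteToReal and the linear mapping itself are assumptions; Verve may use a different encoding (for example, one value per option).

```cpp
#include <cassert>

// Hypothetical conversion, not taken from Verve: maps an option index in
// [0, numOptions - 1] linearly onto the real interval [-1, 1].
inline double discreteToReal(unsigned int value, unsigned int numOptions)
{
    assert(numOptions > 0);
    assert(value < numOptions);

    // A single option carries no information; place it at the midpoint.
    if (numOptions == 1)
    {
        return 0.0;
    }

    return -1.0 + 2.0 * static_cast<double>(value)
        / static_cast<double>(numOptions - 1);
}
```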

real verve::PredictiveModel::mLatestPredMSE [protected]

The latest MSE for Observation and reward predictions.

Definition at line 131 of file PredictiveModel.h.
Referenced by resetShortTermMemory().

Population* verve::PredictiveModel::mRewardPredPopulation [protected]

The reward prediction Population.

Definition at line 122 of file PredictiveModel.h.

RBFInputData verve::PredictiveModel::mStateActionInputData [protected]

A convenient data structure used to pass data to the state-action representation.

Definition at line 104 of file PredictiveModel.h.
Referenced by predict(), and resetShortTermMemory().

RBFPopulation* verve::PredictiveModel::mStateActionRepresentation [protected]

The state-action representation Population.

Definition at line 113 of file PredictiveModel.h.
Referenced by setDeltaLearningRate().

Population* verve::PredictiveModel::mUncertaintyPredPopulation [protected]

The uncertainty prediction Population.

Definition at line 125 of file PredictiveModel.h.

The documentation for this class was generated from the following files:

Generated on Tue Jan 24 21:46:39 2006 for Verve by Doxygen 1.4.6-NO

