verve::Agent Class Reference

An Agent is an autonomous entity that learns from direct interaction with its environment.

#include <Agent.h>

Public Member Functions

VERVE_DECL Agent (const AgentDescriptor &desc)
virtual VERVE_DECL ~Agent ()
virtual VERVE_DECL void VERVE_CALL destroy ()
virtual VERVE_DECL void VERVE_CALL resetShortTermMemory ()
virtual VERVE_DECL unsigned int VERVE_CALL update (real reinforcement, const Observation &obs, real dt)
virtual VERVE_DECL unsigned int VERVE_CALL getNumDiscreteSensors () const
virtual VERVE_DECL unsigned int VERVE_CALL getNumContinuousSensors () const
virtual VERVE_DECL void VERVE_CALL setETraceTimeConstant (real timeConstant)
virtual VERVE_DECL void VERVE_CALL setTDDiscountTimeConstant (real timeConstant)
virtual VERVE_DECL void VERVE_CALL setTDLearningRate (real valueFunctionTimeConstant, real policyLearningMultiplier)
virtual VERVE_DECL void VERVE_CALL setModelLearningRate (real timeConstant)
virtual VERVE_DECL void VERVE_CALL setLearningEnabled (bool enabled)
virtual VERVE_DECL long unsigned int VERVE_CALL getAge () const
virtual VERVE_DECL std::string VERVE_CALL getAgeString () const
virtual VERVE_DECL real VERVE_CALL getTDError () const
virtual VERVE_DECL real VERVE_CALL getModelMSE () const
virtual VERVE_DECL unsigned int VERVE_CALL getLastPlanLength () const
virtual VERVE_DECL real VERVE_CALL computeValueEstimation (const Observation &obs)
virtual VERVE_DECL const AgentDescriptor * VERVE_CALL getDescriptor () const
virtual VERVE_DECL void VERVE_CALL saveValueData (unsigned int continuousResolution, const std::string &filename="")
virtual VERVE_DECL void VERVE_CALL saveStateRBFData (const std::string &filename="")

Protected Member Functions

void setStepSize (real value)
unsigned int planningSequence (const Observation &predCurrentObs, real predCurrentReward, real currentUncertainty)
void incrementAge ()

Protected Attributes

AgentDescriptor mDescriptor
RLModule * mRLModule
PredictiveModel * mPredictiveModel
bool mFirstStep
unsigned int mActionIndex
Observation mActualPrevObs
Observation mPredCurrentObs
Observation mTempPlanningObs
bool mLearningEnabled
real mStepSize
long unsigned int mAgeHours
unsigned int mAgeMinutes
real mAgeSeconds
unsigned int mLastPlanningSequenceLength


Detailed Description

An Agent is an autonomous entity that learns from direct interaction with its environment.

Definition at line 39 of file Agent.h.


Constructor & Destructor Documentation

verve::Agent::Agent (const AgentDescriptor &desc)

Creates an Agent using the given AgentDescriptor.

Adds initial noise to the trainable weights. Never use this to create an Agent dynamically (i.e. never call "new Agent"). Instead, use the global factory functions. This ensures that memory is allocated from the correct heap.

Definition at line 45 of file Agent.cpp.

References verve::CURIOUS_MODEL_RL, verve::AgentDescriptor::getArchitecture(), verve::AgentDescriptor::getNumOutputs(), verve::Observation::init(), verve::AgentDescriptor::isDynamicRBFEnabled(), mActualPrevObs, verve::MODEL_RL, mPredCurrentObs, mPredictiveModel, mRLModule, mTempPlanningObs, and verve::RL.

verve::Agent::~Agent () [virtual]

Note that "delete Agent" should never be called on a dynamically-allocated Agent.

Instead, use the destroy function. This ensures that memory is deallocated from the correct heap.

Definition at line 95 of file Agent.cpp.

References mPredictiveModel, and mRLModule.


Member Function Documentation

real verve::Agent::computeValueEstimation (const Observation &obs) [virtual]

Computes and returns the value estimation for the given observation.

This should not be performed regularly as it is fairly expensive.

Definition at line 610 of file Agent.cpp.

References verve::RLModule::computeValueEstimation(), and mRLModule.

void verve::Agent::destroy () [virtual]

Deallocates a dynamically-allocated Agent.

Use this instead of "delete Agent" to ensure that memory is deallocated from the correct heap.

Definition at line 104 of file Agent.cpp.
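The heap-ownership rule behind the constructor, destructor, and destroy() notes can be illustrated with a minimal sketch. Widget and createWidget are hypothetical stand-ins, not Verve names; this reference does not list the actual factory functions. The sketch also makes the constructor and destructor protected to enforce the rule, which is a choice of the example rather than something the Agent class does.

```cpp
#include <cassert>

// Hypothetical stand-in for a library class; not part of Verve.
class Widget
{
public:
    // Frees the object from inside the library, so the heap that
    // allocated it is also the heap that releases it.
    virtual void destroy() { delete this; }

    // Number of Widgets currently alive (for illustration only).
    static int liveCount() { return sLiveCount; }

protected:
    Widget() { ++sLiveCount; }

    virtual ~Widget()
    {
        assert(sLiveCount > 0);
        --sLiveCount;
    }

    // In this sketch the factory is the only way to construct a Widget.
    friend Widget* createWidget();

private:
    static int sLiveCount;
};

int Widget::sLiveCount = 0;

// Hypothetical global factory function: allocation happens in the same
// module that later deallocates the object in destroy().
Widget* createWidget()
{
    return new Widget();
}
```

Client code would pair every createWidget() call with exactly one destroy() call and never use new or delete on the object directly.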

long unsigned int verve::Agent::getAge () const [virtual]

Returns the age of the Agent in seconds.

Note that the age is only incremented when the Agent is learning.

Definition at line 620 of file Agent.cpp.

References mAgeHours, mAgeMinutes, and mAgeSeconds.

std::string verve::Agent::getAgeString () const [virtual]

Returns the age of the Agent as a string containing the hours, minutes, and seconds.

Note that the age is only incremented when the Agent is learning.

Definition at line 626 of file Agent.cpp.

References mAgeHours, mAgeMinutes, and mAgeSeconds.

const AgentDescriptor * verve::Agent::getDescriptor () const [virtual]

Returns a pointer to the Agent's descriptor.

Definition at line 615 of file Agent.cpp.

References mDescriptor.

Referenced by verve::Observation::init().

unsigned int verve::Agent::getLastPlanLength () const [virtual]

Returns the length of the most recent planning sequence (in number of steps).

Definition at line 650 of file Agent.cpp.

References mLastPlanningSequenceLength.

real verve::Agent::getModelMSE () const [virtual]

Returns the most recent mean squared error from the predictive model.

Returns zero if this Agent was not constructed with a predictive model.

Definition at line 638 of file Agent.cpp.

References verve::PredictiveModel::getPredictionMSE(), and mPredictiveModel.

unsigned int verve::Agent::getNumContinuousSensors () const [virtual]

Returns the number of continuous sensors.

Definition at line 563 of file Agent.cpp.

References verve::AgentDescriptor::getNumContinuousSensors(), and mDescriptor.

unsigned int verve::Agent::getNumDiscreteSensors () const [virtual]

Returns the number of discrete sensors.

Definition at line 558 of file Agent.cpp.

References verve::AgentDescriptor::getNumDiscreteSensors(), and mDescriptor.

real verve::Agent::getTDError () const [virtual]

Returns the most recent TD error.

Definition at line 633 of file Agent.cpp.

References verve::RLModule::getTDError(), and mRLModule.

void verve::Agent::incrementAge () [protected]

Increases the Agent's age by one time step.

Definition at line 666 of file Agent.cpp.

References mAgeHours, mAgeMinutes, mAgeSeconds, and mStepSize.

Referenced by update().
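The age is stored as separate hours, minutes, and seconds components, and getAge() reports it in seconds. A minimal sketch of how such a counter could carry between components is shown below. AgeCounter and its members are hypothetical stand-ins; the carry logic and the truncation of fractional seconds are assumptions, not code taken from Agent.cpp.

```cpp
#include <cassert>

// Hypothetical stand-in mirroring the three age fields listed in this
// reference. The logic below is an assumption, not the Verve source.
struct AgeCounter
{
    unsigned long hours;
    unsigned int minutes;
    double seconds;

    AgeCounter() : hours(0), minutes(0), seconds(0.0) {}

    // Advances the age by one time step of the given size (in seconds),
    // carrying overflow from seconds into minutes and minutes into hours.
    void increment(double stepSize)
    {
        assert(stepSize >= 0.0);
        seconds += stepSize;
        while (seconds >= 60.0)
        {
            seconds -= 60.0;
            ++minutes;
        }
        while (minutes >= 60)
        {
            minutes -= 60;
            ++hours;
        }
    }

    // Total age in whole seconds; the fractional part is truncated.
    unsigned long totalSeconds() const
    {
        return hours * 3600UL + minutes * 60UL
            + static_cast<unsigned long>(seconds);
    }
};
```

Keeping a small seconds component instead of one large floating-point total avoids losing precision when many small step sizes are added over a long run.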

unsigned int verve::Agent::planningSequence (const Observation &predCurrentObs, real predCurrentReward, real currentUncertainty) [protected]

Performs a single planning sequence which trains the RLModule.

This proceeds until either the prediction uncertainty is too high or the sequence length is too long. Returns the length of the planning sequence.

Definition at line 253 of file Agent.cpp.

References verve::Observation::copyInputData(), verve::CURIOUS_MODEL_RL, verve::AgentDescriptor::getArchitecture(), verve::AgentDescriptor::getMaxNumPlanningSteps(), verve::AgentDescriptor::getPlanningUncertaintyThreshold(), mDescriptor, mPredictiveModel, mRLModule, mTempPlanningObs, verve::PredictiveModel::predict(), verve::RLModule::resetShortTermMemory(), and verve::RLModule::update().

Referenced by update().
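The two stopping conditions of a planning sequence can be sketched as a simple loop. Prediction, runPlanningSequence, and the predictStep callback are hypothetical names introduced for illustration; the real method works on Observation objects, the predictive model, and the RLModule, and the exact order of its checks is not documented here.

```cpp
#include <cassert>
#include <functional>

// Hypothetical result of a one-step prediction from a learned model.
struct Prediction
{
    double reward;
    double uncertainty;
};

// Runs simulated steps until the predicted uncertainty exceeds the
// threshold or the step limit is reached. Returns the number of steps
// that were completed.
unsigned int runPlanningSequence(
    const std::function<Prediction(unsigned int)>& predictStep,
    double uncertaintyThreshold,
    unsigned int maxNumSteps)
{
    assert(uncertaintyThreshold >= 0.0);
    unsigned int numSteps = 0;
    while (numSteps < maxNumSteps)
    {
        const Prediction p = predictStep(numSteps);
        if (p.uncertainty > uncertaintyThreshold)
        {
            break; // The model is no longer trusted; stop planning.
        }
        // A real agent would train its RL component on p.reward here.
        ++numSteps;
    }
    return numSteps;
}
```

In the Agent, the corresponding limits come from AgentDescriptor::getPlanningUncertaintyThreshold() and AgentDescriptor::getMaxNumPlanningSteps().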

void verve::Agent::resetShortTermMemory () [virtual]

Resets temporary dynamics without affecting learned parameters.

Definition at line 109 of file Agent.cpp.

References mActionIndex, mActualPrevObs, mFirstStep, mLastPlanningSequenceLength, mPredCurrentObs, mPredictiveModel, mRLModule, mTempPlanningObs, verve::PredictiveModel::resetShortTermMemory(), verve::RLModule::resetShortTermMemory(), and verve::Observation::zeroInputData().

void verve::Agent::saveStateRBFData (const std::string &filename = "") [virtual]

Outputs a data file containing the position of all RBFs in the state representation, including discrete and continuous data.

Passing in an empty filename string will automatically generate a unique filename and save the file in the current working directory. This does nothing if the Agent uses no inputs.

Definition at line 661 of file Agent.cpp.

References mRLModule, and verve::RLModule::saveStateRBFData().

void verve::Agent::saveValueData (unsigned int continuousResolution, const std::string &filename = "") [virtual]

Outputs a data file containing estimated values for every possible state.

The continuousResolution parameter determines how many values to check along each continuous input dimension. Passing in an empty filename string will automatically generate a unique filename and save the file in the current working directory. This does nothing if the Agent uses no inputs. The output file format is:

First line: the number of distinct values along each input dimension.
All other lines: the inputs in each dimension and the value of the corresponding state.

Definition at line 655 of file Agent.cpp.

References mRLModule, and verve::RLModule::saveValueData().
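A reader for the file layout described above might look like the following sketch. It assumes the numbers on each line are separated by whitespace, which this reference does not state; ValueData and readValueData are hypothetical names, not part of Verve.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the parsed file contents.
struct ValueData
{
    std::vector<unsigned int> resolution;      // distinct values per input dimension
    std::vector<std::vector<double> > inputs;  // one row of inputs per state
    std::vector<double> values;                // estimated value per state
};

// Parses the layout described above, ASSUMING whitespace-separated numbers.
ValueData readValueData(std::istream& in)
{
    ValueData data;
    std::string line;

    // First line: number of distinct values along each input dimension.
    if (std::getline(in, line))
    {
        std::istringstream header(line);
        unsigned int count;
        while (header >> count)
        {
            data.resolution.push_back(count);
        }
    }

    // Remaining lines: one input per dimension, followed by the state value.
    const std::size_t expected = data.resolution.size() + 1;
    while (std::getline(in, line))
    {
        std::istringstream row(line);
        std::vector<double> numbers;
        double x;
        while (row >> x)
        {
            numbers.push_back(x);
        }
        if (numbers.size() != expected)
        {
            continue; // Skip blank or malformed rows.
        }
        data.values.push_back(numbers.back());
        numbers.pop_back();
        data.inputs.push_back(numbers);
    }

    assert(data.inputs.size() == data.values.size());
    return data;
}
```

The function takes any std::istream, so it can be fed from a std::ifstream opened on the saved file or from a std::istringstream for testing.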

void verve::Agent::setETraceTimeConstant (real timeConstant) [virtual]

Sets how fast the eligibility traces will decay.

The time constant must be greater than zero.

Definition at line 580 of file Agent.cpp.

References mRLModule, mStepSize, and verve::RLModule::setETraceTimeConstant().

void verve::Agent::setLearningEnabled (bool enabled) [virtual]

Enables and disables learning.

Once the Agent performs adequately, learning can be disabled to improve runtime performance.

Definition at line 605 of file Agent.cpp.

References mLearningEnabled.

void verve::Agent::setModelLearningRate (real timeConstant) [virtual]

Sets the learning rate for the predictive model.

The time constant (which must be greater than zero) specifies how many seconds it takes for the prediction errors to be reduced to 37% of their initial values. This does nothing if this Agent was not constructed with a predictive model.

Definition at line 597 of file Agent.cpp.

References mPredictiveModel, mStepSize, and verve::PredictiveModel::setDeltaLearningRate().
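The 37% figure is the value of e^-1 (about 0.368), which is what remains of an exponentially decaying quantity after one time constant. The sketch below shows that relationship and one way a time constant could be converted into a per-step rate for a given step size. Both function names are hypothetical, and whether Verve uses exactly this conversion internally is an assumption; the reference only shows that the setter also depends on mStepSize.

```cpp
#include <cassert>
#include <cmath>

// Fraction of an initial error that remains after `elapsed` seconds,
// assuming pure exponential decay with the given time constant (seconds).
double remainingErrorFraction(double timeConstant, double elapsed)
{
    assert(timeConstant > 0.0);
    return std::exp(-elapsed / timeConstant);
}

// Per-step rate giving the same decay when applied once every `stepSize`
// seconds: each step removes this fraction of the current error.
// This conversion is an assumption, not taken from the Verve source.
double perStepRate(double timeConstant, double stepSize)
{
    assert(timeConstant > 0.0);
    assert(stepSize >= 0.0);
    return 1.0 - std::exp(-stepSize / timeConstant);
}
```

For example, with a time constant of 2 seconds and a step size of 0.1 seconds, applying the per-step rate 20 times (2 seconds in total) leaves about 37% of the initial error.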

void verve::Agent::setStepSize (real value) [protected]

Sets the size of the time steps used during each simulation step.

Definition at line 568 of file Agent.cpp.

References verve::PredictiveModel::changeStepSize(), verve::RLModule::changeStepSize(), mPredictiveModel, mRLModule, and mStepSize.

Referenced by update().

void verve::Agent::setTDDiscountTimeConstant (real timeConstant) [virtual]

Sets how much future rewards are discounted.

The time constant must be greater than zero.

Definition at line 585 of file Agent.cpp.

References mRLModule, mStepSize, and verve::RLModule::setTDDiscountTimeConstant().

void verve::Agent::setTDLearningRate (real valueFunctionTimeConstant, real policyLearningMultiplier) [virtual]

Sets the TD learning rate for the value function and policy.

The time constant (which must be greater than zero) specifies how many seconds it takes for the value function's prediction errors to be reduced to 37% of their initial values. The policy learning multiplier combined with the value function's learning rate determines the policy's learning rate (the multiplier usually ranges from 1-100).

Definition at line 590 of file Agent.cpp.

References mRLModule, mStepSize, and verve::RLModule::setTDLearningRate().

unsigned int verve::Agent::update (real reinforcement, const Observation &obs, real dt) [virtual]

Gives the Agent reinforcement for the current state, the current observation (i.e. sensory input data from the current state), and how much time has elapsed since the previous update.

Allows the Agent to learn (if learning is enabled). Returns the index of the action to perform. The reinforcement value must be within the range [-1, 1] (this ensures that the reward magnitude is not used to affect the TD learning rate). It is best to pass in the same dt each time this is called (when the dt changes between successive calls, several things must be recomputed internally).

Definition at line 125 of file Agent.cpp.

References verve::AgentDescriptor::getArchitecture(), incrementAge(), mActionIndex, mActualPrevObs, mDescriptor, mFirstStep, mLastPlanningSequenceLength, mLearningEnabled, mPredCurrentObs, mPredictiveModel, mRLModule, mStepSize, planningSequence(), verve::PredictiveModel::predictAndTrain(), verve::RL, setStepSize(), verve::RLModule::update(), and verve::RLModule::updatePolicyOnly().
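Two of the requirements above, a reinforcement within [-1, 1] and a constant dt, can be handled on the caller's side. The sketch below shows one possible approach: a clamp helper and a fixed-step accumulator that converts variable frame times into a whole number of equal-sized steps. clampReinforcement and FixedStepper are hypothetical helpers, not part of Verve.

```cpp
#include <algorithm>
#include <cassert>

// Clamps a raw task reward into the [-1, 1] range that update() expects.
double clampReinforcement(double rawReward)
{
    return std::max(-1.0, std::min(1.0, rawReward));
}

// Hypothetical helper: turns variable frame times into a whole number of
// fixed-size steps so that every update() call can receive the same dt.
class FixedStepper
{
public:
    explicit FixedStepper(double dt) : mDt(dt), mAccumulated(0.0)
    {
        assert(dt > 0.0);
    }

    // Adds the elapsed frame time and returns how many fixed steps to run.
    // Leftover time is kept and counted toward the next call.
    unsigned int advance(double frameTime)
    {
        assert(frameTime >= 0.0);
        mAccumulated += frameTime;
        unsigned int steps = 0;
        while (mAccumulated >= mDt)
        {
            mAccumulated -= mDt;
            ++steps;
        }
        return steps;
    }

    // The constant step size to pass as dt.
    double dt() const { return mDt; }

private:
    double mDt;
    double mAccumulated;
};
```

A caller would run the number of steps returned by advance() each frame, passing clampReinforcement(reward), the current observation, and stepper.dt() to update() on every step.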


Member Data Documentation

unsigned int verve::Agent::mActionIndex [protected]
 

A stored copy of the most recent action index.

Definition at line 217 of file Agent.h.

Referenced by resetShortTermMemory(), and update().

Observation verve::Agent::mActualPrevObs [protected]
 

A copy of the previous actual Observation.

This must be stored across time steps.

Definition at line 221 of file Agent.h.

Referenced by Agent(), resetShortTermMemory(), and update().

long unsigned int verve::Agent::mAgeHours [protected]
 

The hours component of the Agent's age.

Definition at line 238 of file Agent.h.

Referenced by getAge(), getAgeString(), and incrementAge().

unsigned int verve::Agent::mAgeMinutes [protected]
 

The minutes component of the Agent's age.

Definition at line 241 of file Agent.h.

Referenced by getAge(), getAgeString(), and incrementAge().

real verve::Agent::mAgeSeconds [protected]
 

The seconds component of the Agent's age.

Definition at line 244 of file Agent.h.

Referenced by getAge(), getAgeString(), and incrementAge().

AgentDescriptor verve::Agent::mDescriptor [protected]
 

A saved copy of the AgentDescriptor used to create this Agent.

Definition at line 203 of file Agent.h.

Referenced by getDescriptor(), getNumContinuousSensors(), getNumDiscreteSensors(), planningSequence(), and update().

bool verve::Agent::mFirstStep [protected]
 

Used to handle the first step differently.

This is necessary because the Agent is trained using the state representation from the previous step.

Definition at line 214 of file Agent.h.

Referenced by resetShortTermMemory(), and update().

unsigned int verve::Agent::mLastPlanningSequenceLength [protected]
 

The length of the most recent planning sequence (in number of steps).

Definition at line 248 of file Agent.h.

Referenced by getLastPlanLength(), resetShortTermMemory(), and update().

bool verve::Agent::mLearningEnabled [protected]
 

Determines whether the Agent learns.

Definition at line 232 of file Agent.h.

Referenced by setLearningEnabled(), and update().

Observation verve::Agent::mPredCurrentObs [protected]
 

An allocated Observation, mainly used for convenience.

This does not need to be valid across time steps.

Definition at line 225 of file Agent.h.

Referenced by Agent(), resetShortTermMemory(), and update().

PredictiveModel* verve::Agent::mPredictiveModel [protected]
 

The predictive model component.

Definition at line 209 of file Agent.h.

Referenced by Agent(), getModelMSE(), planningSequence(), resetShortTermMemory(), setModelLearningRate(), setStepSize(), update(), and ~Agent().

RLModule* verve::Agent::mRLModule [protected]
 

The main reinforcement learning component.

Definition at line 206 of file Agent.h.

Referenced by Agent(), computeValueEstimation(), getTDError(), planningSequence(), resetShortTermMemory(), saveStateRBFData(), saveValueData(), setETraceTimeConstant(), setStepSize(), setTDDiscountTimeConstant(), setTDLearningRate(), update(), and ~Agent().

real verve::Agent::mStepSize [protected]
 

The current step size being used.

Definition at line 235 of file Agent.h.

Referenced by incrementAge(), setETraceTimeConstant(), setModelLearningRate(), setStepSize(), setTDDiscountTimeConstant(), setTDLearningRate(), and update().

Observation verve::Agent::mTempPlanningObs [protected]
 

An allocated Observation, mainly used for convenience.

This does not need to be valid across time steps.

Definition at line 229 of file Agent.h.

Referenced by Agent(), planningSequence(), and resetShortTermMemory().


The documentation for this class was generated from the following files:

Agent.h
Agent.cpp

Generated on Tue Jan 24 21:46:39 2006 for Verve by  doxygen 1.4.6-NO