SmartEngine 1.6.0
D4PGTrainerCInfo Struct Reference

Data used to construct an ID4PGTrainer instance.

#include <D4PGTrainer.h>
Public Attributes

IGraph * actorGraph = nullptr
    The actor graph to train.
int criticNeuronCount = 32
    How many neurons are in the critic hidden layer.
int criticAtomCount = 32
    How many neurons are in the critic output layer. This is also the granularity of the probability distribution; a higher value gives a finer-grained distribution.
float minValue = -10.0f
    The minimum expected reward we can track in our probability distribution.
float maxValue = 10.0f
    The maximum expected reward we can track in our probability distribution.
int minSampleCount = 10000
    How many samples we wait for before we start training.
int criticPreTrainSteps = 10000
    How many steps to train the critic before the actor begins training.
int lookAheadSteps = 2
    How many actual experiences to use before bootstrapping with an estimate of the remaining episode rewards.
int batchSize = 64
    How many data samples we should try to train at a time.
int syncGenerationCount = 1
    How often we should sync the training network with the actual network.
float syncLerpPercent = 1e-3f
    How much to sync the training network with the actual network per sync step.
Public Attributes inherited from SmartEngine::RLTrainerCInfo

IContext * context = nullptr
    The context to perform graph operations within.
IAgentDataStore * dataStore = nullptr
    The data store used to save experience state.
const char * agentName = ""
    Should be a unique name across the data store.
const char ** policyNodeNames = nullptr
    The names of the output nodes of the actor (the network used to manipulate the environment). Only the output nodes need to be specified. They should match the actions provided by the agent.
int policyNodeNameCount = 0
    The number of elements in the policy node name array.
float gamma = 0.99f
    Reward decay over time.
GradientDescentTrainingInfo trainingInfo
    Gradient descent training parameters.
int sequenceLength = 1
    LSTM sequence length. Can be ignored if there is no LSTM in the graphs.
Public Attributes inherited from SmartEngine::ResourceCInfo

const char * resourceName = nullptr
    Optional resource name that will be used with Load() and Save() if no other name is provided.
Detailed Description

Data used to construct an ID4PGTrainer instance.
Member Data Documentation

IGraph* SmartEngine::D4PGTrainerCInfo::actorGraph = nullptr

The actor graph to train.

int SmartEngine::D4PGTrainerCInfo::batchSize = 64

How many data samples we should try to train at a time.

int SmartEngine::D4PGTrainerCInfo::criticAtomCount = 32

How many neurons are in the critic output layer. This is also the granularity of the probability distribution; a higher value gives a finer-grained distribution.

int SmartEngine::D4PGTrainerCInfo::criticNeuronCount = 32

How many neurons are in the critic hidden layer.

int SmartEngine::D4PGTrainerCInfo::criticPreTrainSteps = 10000

How many steps to train the critic before the actor begins training.

int SmartEngine::D4PGTrainerCInfo::lookAheadSteps = 2

How many actual experiences to use before bootstrapping with an estimate of the remaining episode rewards.
float SmartEngine::D4PGTrainerCInfo::maxValue = 10.0f

The maximum expected reward we can track in our probability distribution.

int SmartEngine::D4PGTrainerCInfo::minSampleCount = 10000

How many samples we wait for before we start training.

float SmartEngine::D4PGTrainerCInfo::minValue = -10.0f

The minimum expected reward we can track in our probability distribution.
int SmartEngine::D4PGTrainerCInfo::syncGenerationCount = 1

How often, in training generations, we should sync the training network with the actual network.

float SmartEngine::D4PGTrainerCInfo::syncLerpPercent = 1e-3f

How much to sync the training network with the actual network per sync step (linear interpolation factor).