struct D4PGTrainerCInfo : RLTrainerCInfo
virtual float GetActorLoss() = 0;
SMARTENGINE_EXPORT ObjPtr D4PGTrainer_CreateInstance(const D4PGTrainerCInfo& cinfo);
SMARTENGINE_EXPORT float D4PGTrainer_GetActorLoss(ObjPtr object);
SMARTENGINE_EXPORT float D4PGTrainer_GetCriticLoss(ObjPtr object);
int minSampleCount
How many samples we wait for before we start training.
Definition: D4PGTrainer.h:49
virtual float GetCriticLoss()=0
Returns the loss in the critic graph.
int criticPreTrainSteps
How many samples to train the critic on before training the actor.
Definition: D4PGTrainer.h:55
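
The two thresholds `minSampleCount` and `criticPreTrainSteps` suggest a simple gating scheme. The sketch below is one hypothetical reading of them, not the SmartEngine implementation: `TrainPhase`, `SelectPhase`, and the counters `samplesCollected` and `criticStepsDone` are names invented for illustration, and it assumes `criticPreTrainSteps` counts critic updates.

```cpp
#include <cassert>

// Hypothetical phases, not part of the SmartEngine API.
enum class TrainPhase { Waiting, CriticOnly, ActorAndCritic };

// Assumed interpretation: no training until minSampleCount samples exist,
// then critic-only updates until criticPreTrainSteps updates have run,
// then both actor and critic are trained.
inline TrainPhase SelectPhase(int samplesCollected, int criticStepsDone,
                              int minSampleCount, int criticPreTrainSteps)
{
    assert(samplesCollected >= 0 && criticStepsDone >= 0);
    if (samplesCollected < minSampleCount)
        return TrainPhase::Waiting;
    if (criticStepsDone < criticPreTrainSteps)
        return TrainPhase::CriticOnly;
    return TrainPhase::ActorAndCritic;
}
```

A caller would evaluate `SelectPhase` once per training step and skip the actor update while the result is `TrainPhase::CriticOnly`.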
float minValue
The minimum expected rewards we can track in our probability distribution.
Definition: D4PGTrainer.h:38
int criticNeuronCount
How many neurons are in the critic hidden layer.
Definition: D4PGTrainer.h:25
float syncLerpPercent
How much to sync the training network with the actual network.
Definition: D4PGTrainer.h:75
float maxValue
The maximum expected rewards we can track in our probability distribution.
Definition: D4PGTrainer.h:44
Base class for all reinforcement learning trainers.
Definition: RLTrainer.h:69
Data used to construct an ID4PGTrainer instance.
Definition: D4PGTrainer.h:16
IGraph * actorGraph
The actor graph to train.
Definition: D4PGTrainer.h:20
Smart pointer to an IObject. Automatic ref counting.
Definition: ObjectPtr.h:16
SMARTENGINE_EXPORT ObjectPtr< ID4PGTrainer > CreateD4PGTrainer(const D4PGTrainerCInfo &cinfo)
Creates an instance of ID4PGTrainer.
int batchSize
How many data samples we should try to train at a time.
Definition: D4PGTrainer.h:65
int criticAtomCount
How many neurons are in the critic output layer. This is also the granularity of the probability dist...
Definition: D4PGTrainer.h:32
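
To make the relationship between `minValue`, `maxValue`, and `criticAtomCount` concrete, here is a minimal sketch of how a categorical value distribution is commonly laid out in distributional reinforcement learning. It assumes the atoms are evenly spaced and include both endpoints; the helper names `BuildAtomSupport` and `ExpectedValue` are hypothetical and are not SmartEngine functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the support values ("atoms") of a categorical value distribution.
// Assumption: atoms are linearly spaced from minValue to maxValue inclusive.
inline std::vector<float> BuildAtomSupport(float minValue, float maxValue,
                                           int criticAtomCount)
{
    assert(criticAtomCount >= 2 && maxValue > minValue);
    const float delta =
        (maxValue - minValue) / static_cast<float>(criticAtomCount - 1);
    std::vector<float> atoms(static_cast<std::size_t>(criticAtomCount));
    for (std::size_t i = 0; i < atoms.size(); ++i)
        atoms[i] = minValue + delta * static_cast<float>(i);
    return atoms;
}

// Expected value of the distribution: sum over atoms of probability * value.
inline float ExpectedValue(const std::vector<float>& atoms,
                           const std::vector<float>& probs)
{
    assert(atoms.size() == probs.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < atoms.size(); ++i)
        sum += atoms[i] * probs[i];
    return sum;
}
```

Under this layout, a larger `criticAtomCount` gives a finer spacing between atoms, and returns outside `[minValue, maxValue]` cannot be represented.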
A graph is a collection of buffers and nodes that together form a neural network. The graph is create...
Definition: Graph.h:61
int lookAheadSteps
How many actual experiences we should look at before using an estimate for total rewards this episode...
Definition: D4PGTrainer.h:60
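
A look-ahead of this kind is usually realized as an n-step return: real rewards are summed for a fixed number of steps and an estimate covers the remainder. The sketch below illustrates that general technique only. The discount factor `gamma`, the `bootstrapValue` argument, and the function name `NStepReturn` are assumptions that do not appear in the header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sums up to lookAheadSteps real rewards, discounted by gamma, then adds a
// discounted estimate of the remaining return (bootstrapValue).
// Pass bootstrapValue = 0 when the episode ended within the window.
inline float NStepReturn(const std::vector<float>& rewards, int lookAheadSteps,
                         float gamma, float bootstrapValue)
{
    assert(lookAheadSteps >= 0);
    const std::size_t n =
        std::min(rewards.size(), static_cast<std::size_t>(lookAheadSteps));
    float total = 0.0f;
    float discount = 1.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += discount * rewards[i];
        discount *= gamma;
    }
    return total + discount * bootstrapValue;
}
```

With more look-ahead steps, more of the return comes from observed rewards and less from the estimate.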
The D4PGTrainer is a reinforcement learning trainer that is composed of two parts: an actor sub graph...
Definition: D4PGTrainer.h:104
int syncGenerationCount
How often we should sync the training network with the actual network.
Definition: D4PGTrainer.h:70
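
The two sync settings, `syncGenerationCount` and `syncLerpPercent`, can be pictured as a periodic soft update between two sets of weights. This is a hedged sketch rather than the library's code: it assumes `syncLerpPercent` is a fraction in [0, 1], that weights can be treated as flat float arrays, and that the sync runs whenever the generation counter is a multiple of `syncGenerationCount`. `ShouldSync` and `LerpSync` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed schedule: sync every syncGenerationCount generations.
inline bool ShouldSync(int generation, int syncGenerationCount)
{
    assert(syncGenerationCount > 0);
    return generation % syncGenerationCount == 0;
}

// Moves each destination weight toward the source weight by syncLerpPercent.
// 0 leaves dst unchanged; 1 copies src into dst.
inline void LerpSync(std::vector<float>& dst, const std::vector<float>& src,
                     float syncLerpPercent)
{
    assert(dst.size() == src.size());
    assert(syncLerpPercent >= 0.0f && syncLerpPercent <= 1.0f);
    for (std::size_t i = 0; i < dst.size(); ++i)
        dst[i] += (src[i] - dst[i]) * syncLerpPercent;
}
```

Which network plays the role of `dst` and which plays `src` is not stated in the header, so the direction here is illustrative only.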