SmartEngine 1.6.0
D4PGTrainerCInfo Struct Reference

Data used to construct an ID4PGTrainer instance.

#include <D4PGTrainer.h>
Public Attributes

IGraph * actorGraph = nullptr
    The actor graph to train.
int criticNeuronCount = 32
    How many neurons are in the critic hidden layer.
int criticAtomCount = 32
    How many neurons are in the critic output layer. This is also the granularity of the probability distribution; a higher value gives a finer-grained distribution.
float minValue = -10.0f
    The minimum expected reward we can track in our probability distribution.
float maxValue = 10.0f
    The maximum expected reward we can track in our probability distribution.
int minSampleCount = 10000
    How many samples we wait for before we start training.
int criticPreTrainSteps = 10000
    How many steps to train the critic before the actor begins training.
int lookAheadSteps = 2
    How many actual experiences to use before bootstrapping with an estimate of the remaining episode rewards.
int batchSize = 64
    How many data samples we should try to train at a time.
int syncGenerationCount = 1
    How often we should sync the training network with the actual network.
float syncLerpPercent = 1e-3f
    How much to sync the training network with the actual network per sync step.
Public Attributes inherited from SmartEngine::RLTrainerCInfo

IContext * context = nullptr
    The context to perform graph operations within.
IAgentDataStore * dataStore = nullptr
    The data store used to save experience state.
const char * agentName = ""
    Should be a unique name across the data store.
const char ** policyNodeNames = nullptr
    The names of the output nodes of the actor (the network used to manipulate the environment). Only the output nodes need to be specified. They should match the actions provided by the agent.
int policyNodeNameCount = 0
    The number of elements in the policy node name array.
float gamma = 0.99f
    Reward decay over time.
GradientDescentTrainingInfo trainingInfo
    Gradient descent training parameters.
int sequenceLength = 1
    LSTM sequence length. Can be ignored if there is no LSTM in the graphs.
Public Attributes inherited from SmartEngine::ResourceCInfo

const char * resourceName = nullptr
    Optional resource name that will be used with Load() and Save() if no other name is provided.
Detailed Description

Data used to construct an ID4PGTrainer instance.
Member Data Documentation

IGraph* SmartEngine::D4PGTrainerCInfo::actorGraph = nullptr

The actor graph to train.

int SmartEngine::D4PGTrainerCInfo::batchSize = 64

How many data samples we should try to train at a time.

int SmartEngine::D4PGTrainerCInfo::criticAtomCount = 32

How many neurons are in the critic output layer. This is also the granularity of the probability distribution; a higher value gives a finer-grained distribution.

int SmartEngine::D4PGTrainerCInfo::criticNeuronCount = 32

How many neurons are in the critic hidden layer.

int SmartEngine::D4PGTrainerCInfo::criticPreTrainSteps = 10000

How many steps to train the critic before the actor begins training.

int SmartEngine::D4PGTrainerCInfo::lookAheadSteps = 2

How many actual experiences to use before bootstrapping with an estimate of the remaining episode rewards.
float SmartEngine::D4PGTrainerCInfo::maxValue = 10.0f

The maximum expected reward we can track in our probability distribution.

int SmartEngine::D4PGTrainerCInfo::minSampleCount = 10000

How many samples we wait for before we start training.

float SmartEngine::D4PGTrainerCInfo::minValue = -10.0f

The minimum expected reward we can track in our probability distribution.
int SmartEngine::D4PGTrainerCInfo::syncGenerationCount = 1

How often, in training generations, we should sync the training network with the actual network.

float SmartEngine::D4PGTrainerCInfo::syncLerpPercent = 1e-3f

How much to sync the training network with the actual network per sync step (linear interpolation factor).