SmartEngine 1.6.0

SmartEngine::A2CTrainerCInfo Struct Reference

Data used to construct an IA2CTrainer instance.
#include <A2CTrainer.h>
Public Attributes

IGraph * graph = nullptr
    The graph we are training. This should contain the policy network and value network.

ICuriosityModule * curiosityModule = nullptr
    Optional curiosity module for additional exploration rewards.

const char * valueNodeName = ""
    The name of the output of the critic node. This node should be a linear layer with one output neuron and no activation function.

float valueCoefficient = 1.0f
    How much weight the value loss contributes to the total loss.

float entropyCoefficient = 0.01f
    How much weight the entropy contributes to the loss. Entropy is a measure of how random our output is. At the beginning of training we want random output, but toward the end we want less randomness so that the agent commits to actions it has learned work well.

int lookAheadSteps = 2
    How many actual experiences we should look at before using an estimate for the remaining rewards this episode.

int minBatchSize = 32
    How many data samples we should try to train at a time.
Public Attributes inherited from SmartEngine::RLTrainerCInfo

IContext * context = nullptr
    The context to perform graph operations within.

IAgentDataStore * dataStore = nullptr
    The data store used to save experience state.

const char * agentName = ""
    Should be a unique name across the data store.

const char ** policyNodeNames = nullptr
    The names of the output nodes of the actor (the network used to manipulate the environment). Only the output nodes need to be specified. They should match the actions provided by the agent.

int policyNodeNameCount = 0
    The number of elements in the policy node name array.

float gamma = 0.99f
    Reward decay over time.

GradientDescentTrainingInfo trainingInfo
    Gradient descent training parameters.

int sequenceLength = 1
    LSTM sequence length. Can be ignored if there is no LSTM in the graph.
Public Attributes inherited from SmartEngine::ResourceCInfo

const char * resourceName = nullptr
    Optional resource name that will be used with Load() and Save() if no other name is provided.
Data used to construct an IA2CTrainer instance.
ICuriosityModule* SmartEngine::A2CTrainerCInfo::curiosityModule = nullptr

Optional curiosity module for additional exploration rewards.

float SmartEngine::A2CTrainerCInfo::entropyCoefficient = 0.01f

How much weight the entropy contributes to the loss. Entropy is a measure of how random our output is. At the beginning of training we want random output, but toward the end we want less randomness so that the agent commits to actions it has learned work well.
IGraph* SmartEngine::A2CTrainerCInfo::graph = nullptr

The graph we are training. This should contain the policy network and value network.

int SmartEngine::A2CTrainerCInfo::lookAheadSteps = 2

How many actual experiences we should look at before using an estimate for the remaining rewards this episode.
int SmartEngine::A2CTrainerCInfo::minBatchSize = 32

How many data samples we should try to train at a time.

float SmartEngine::A2CTrainerCInfo::valueCoefficient = 1.0f

How much weight the value loss contributes to the total loss.

const char* SmartEngine::A2CTrainerCInfo::valueNodeName = ""

The name of the output of the critic node. This node should be a linear layer with one output neuron and no activation function.
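Putting the fields together, the following sketch fills in an A2CTrainerCInfo before constructing the trainer. The struct fields and their defaults come from this page; the node names, agent name, and the origin of the graph, context, and data store pointers are assumptions for illustration, and the trainer factory call itself is omitted because it is not documented here.

```cpp
#include "A2CTrainer.h"

// Sketch only: graph, context, and dataStore are assumed to have been
// created elsewhere with the appropriate SmartEngine factory functions.
SmartEngine::A2CTrainerCInfo MakeTrainerInfo(SmartEngine::IGraph* graph,
                                             SmartEngine::IContext* context,
                                             SmartEngine::IAgentDataStore* dataStore)
{
    // Hypothetical node names; they must match the nodes in your graph.
    static const char* policyNodes[] = { "Policy" };

    SmartEngine::A2CTrainerCInfo info;
    info.context = context;
    info.dataStore = dataStore;
    info.agentName = "MyAgent";              // unique within the data store
    info.graph = graph;                      // contains policy and value networks
    info.policyNodeNames = policyNodes;      // actor outputs, matching agent actions
    info.policyNodeNameCount = 1;
    info.valueNodeName = "Value";            // critic: linear layer, one output neuron
    info.valueCoefficient = 1.0f;
    info.entropyCoefficient = 0.01f;
    info.lookAheadSteps = 2;
    info.minBatchSize = 32;
    return info;
}
```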