SmartEngine 1.6.0
QuickStart

Building and Loading a Graph

The basic unit for getting output out of SmartEngine is a Graph. A graph contains input buffers and connected nodes that produce output.

This code demonstrates how to load and evaluate a graph at the lowest level. The graph definition can alternatively come from a JSON resource file. While the code below will work, it is recommended to use the Network and NetworkController functionality in the SmartEngine.Helpers namespace for graph lifetime control and evaluation. See the example projects for usage.

This code can be found in SmartEngine/Examples/QuickStart.cs

// This is our graph definition.
//
// We add an input buffer to the graph ("Input") that takes in 2 float values
// per row. We also have a buffer called "TrainingOutput" that is not used
// during the evaluation of the graph, but is simply a convenience for
// creating a buffer that can be used as the output we want to train against.
// The training output buffer could alternatively be created directly in code.
//
// There is one "hidden" neuron layer ("LinearLayer1") of 32 neurons that
// connects into an output layer ("Output") of 1 neuron. The hidden layer
// has an activation function of Selu, and the output layer Tanh.
const string cGraphDefinition = @"
{
    ""Nodes"": [
        {
            ""Name"": ""Input"",
            ""Type"": ""BufferInput"",
            ""Parameters"": {
                ""Dimension"": 2
            }
        },
        {
            ""Name"": ""TrainingOutput"",
            ""Type"": ""BufferInput"",
            ""Parameters"": {
                ""Dimension"": 1
            }
        },
        {
            ""Name"": ""LinearLayer1"",
            ""Type"": ""NeuronLayer"",
            ""Parameters"": {
                ""Input"": ""Input"",
                ""Type"": ""Linear"",
                ""ActivationType"": ""Selu"",
                ""NeuronCount"": 32
            }
        },
        {
            ""Name"": ""Output"",
            ""Type"": ""NeuronLayer"",
            ""Parameters"": {
                ""Input"": ""LinearLayer1"",
                ""Type"": ""Linear"",
                ""ActivationType"": ""Tanh"",
                ""NeuronCount"": 1
            }
        }
    ]
}";
// All graph objects need a context. Connected graph nodes must have
// the same context.
Context context = new Context();
// Create the graph using the definition above. All the linear layers will
// have random weights assigned, so if we evaluate now, we'll get random output.
// Alternatively, we can create an empty graph with no nodes by calling
// the constructor directly. In that case, nodes would need to be manually added
// to the graph.
Graph graph;
{
    Graph.CInfo cinfo = new Graph.CInfo();
    cinfo.Context = context;
    graph = Graph.CreateFromDefinitionString(cinfo, cGraphDefinition);
}
// Load the weights of the graph from disk. It is assumed we have previously trained
// and saved out the graph.
graph.Deserialize("MyFolder", "MyGraph");
// Grab the input buffer from the graph and fill it with some data. The dimension
// of the buffer is 2 from the graph definition above, so we must give it a
// number of floats that is a multiple of 2. Each group of 2 becomes a separate
// row of input. Here, our input looks like
// [ 0.5, 0.5 ]
// [ 0.8, -0.6 ]
// In a 2D motorcycle game, these could be the direction the bike is facing
// for 2 AI characters.
BufferInput input = graph.GetNode<BufferInput>("Input");
input.SetValues(new float[]{ 0.5f, 0.5f, 0.8f, -0.6f });
// Grab the output node and evaluate it using the data currently stored
// in the buffers. Since the input has 2 rows of data, we expect two rows
// of output. The output node has a dimension of 1, so the output looks like
// [ Output[0] ]
// [ Output[1] ]
GraphNode output = graph.GetNode<GraphNode>("Output");
output.RetrieveOutput();
// Print out the results. In our 2D motorcycle game example, we could interpret these
// values as the amount of acceleration to apply for 2 AI characters. Since the output
// node has an activation function of Tanh, these values will always be in the range [-1..1].
Debug.Log(string.Format("Graph output: [{0}, {1}]", output.Output[0], output.Output[1]));
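Since buffers can be refilled at any time, the same graph can be evaluated again with a different number of rows. This sketch (reusing the input and output variables from above; the values are arbitrary) feeds three rows and reads back three outputs:

```csharp
// Refill the input buffer with 3 rows (6 floats for a dimension-2 buffer).
input.SetValues(new float[] { 1.0f, 0.0f, 0.0f, 1.0f, -1.0f, 0.0f });
// Re-evaluate; the output node now holds 3 values (dimension 1 per row).
output.RetrieveOutput();
for (int i = 0; i < 3; i++)
{
    Debug.Log("Row " + i + ": " + output.Output[i]);
}
```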

Training a Graph Using Gradient Descent

Once you have a graph, you can create a gradient descent trainer to train it. The idea behind training is that we give the network sample inputs along with the expected output for those inputs. After training, the network will have learned the relationship between the input and output and will be able to extrapolate new outputs from inputs it has never seen. The network will only be as good as its training data, so the more accurate the data and the broader the range it covers, the better.

// Once you have a graph (see the above function for an example on how to create
// and load a graph), you can train it. In this sample, let's assume we have a graph
// using the definition from the above example.
Graph graph = null; // = CreateMyGraph();
// Grab the context of the graph. All the training objects we create must use the
// same context as the graph.
Context context = graph.GetContext();
// Grab a buffer from the graph that will be filled with the expected values we want
// the graph to produce given some inputs we give it. After training, we will be able
// to evaluate the graph with any input.
BufferInput expectedOutput = graph.GetNode<BufferInput>("TrainingOutput");
// Training by gradient descent requires a loss, which lets us compare the expected values
// with what the graph is producing. This will internally be used to adjust the weights
// in the neuron layers so that the input actually does produce the output we want.
Loss loss;
{
    Loss.CInfo cinfo = new Loss.CInfo();
    cinfo.Context = context;
    cinfo.NetworkOutput = graph.GetNode<GraphNode>("Output"); // What the graph is actually producing
    cinfo.TargetValues = expectedOutput; // What we want the graph to output
    loss = new Loss(cinfo);
}
// Create the gradient descent trainer. We'll change the learning rate as an example.
// The learning rate tells us how much to adjust the weights each step. A higher value
// means we learn faster, but too high and it might not learn the subtleties of the
// problem space. A good value is in the range 1e-4 to 1e-3.
GradientDescentTrainer trainer;
{
    GradientDescentTrainer.CInfo cinfo = new GradientDescentTrainer.CInfo();
    cinfo.Context = graph.GetContext();
    cinfo.Loss = loss;
    // The trainer will update all the neuron layers in the graph from output backwards
    // until it reaches an input node or a GradientStop node.
    cinfo.Graph = graph;
    trainer = new GradientDescentTrainer(cinfo);
}
// Tell the trainer how we would like to train over the data. Batch means to train
// over all the data each step. If the batch size is 0, then all the data will be
// processed in one operation, otherwise it'll be processed in smaller chunks.
// Stochastic means take a random sample of our data of size BatchSize. We will train
// just once over this data. Stochastic usually leads to better networks because
// the order of the data does not influence how it trains. The downside is that
// the loss tends to jump around instead of smoothly coming down.
{
    LossTrainingMethodInfo info = new LossTrainingMethodInfo();
    info.Method = LossTrainingMethod.Stochastic;
    info.BatchSize = 32;
    trainer.SetTrainingMethod(info);
}
// Set the sample inputs and expected outputs we want to train. The number of rows
// in each of the inputs and output must match. Here we are giving it two rows
// of data. In real training, give it as much data as possible (thousands of rows)
// and cover as broad a range of inputs as possible.
BufferInput input = graph.GetNode<BufferInput>("Input");
input.SetValues(new float[] { 0.5f, 0.5f, 0.8f, -0.6f });
expectedOutput.SetValues(new float[] { Constants.MaxTanh, 0.5f });
// Start the training process. There are a few ways to know when you should stop.
// You can go for a fixed number of steps, stop when the loss reaches a desired
// point, or monitor it manually and see when the loss has stopped coming down. Expect
// around 100,000 steps for a moderately sized network with tens of thousands of rows.
for (int i = 0; i < 500; i++)
{
    trainer.Train(100);
    // Output the loss to see how we are doing
    Debug.Log("Loss: " + trainer.Loss);
    yield return null;
}
// We're done! Save the graph and we can start using it. See the example above for evaluation.
graph.Serialize("MyFolder", "MyGraph");
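As an alternative to a fixed step count, training can stop once the loss falls below a target. Here is a minimal sketch using only the trainer members shown above; the 0.01 threshold and the 500-iteration safety cap are arbitrary values you would tune for your own problem:

```csharp
const float cTargetLoss = 0.01f;
const int cMaxIterations = 500;
// Train in chunks of 100 steps until the loss is low enough or we hit the cap.
for (int i = 0; i < cMaxIterations && trainer.Loss > cTargetLoss; i++)
{
    trainer.Train(100);
    yield return null;
}
graph.Serialize("MyFolder", "MyGraph");
```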