
Loading and saving models
SharpLearning makes it very easy to load and save models to disk. This is an important part of any machine learning library, and SharpLearning makes it about as simple as it gets.
All models in SharpLearning have a Save and a Load method. These methods do the heavy lifting of saving and loading a model for us.
As an example, here we will save a model that we learned to disk:
model.Save(() => new StreamWriter(@"C:\randomforest.xml"));
If we want to load this model back in, we simply use the Load method:
var loadedModel = RegressionForestModel.Load(() => new StreamReader(@"C:\randomforest.xml"));
Yep, it’s that easy to save and load your models. It is also possible to save models using serialization, which lets you choose between an XML and a binary format. Another very nice design feature of SharpLearning is that you can serialize a model against the IPredictorModel interface rather than its concrete type. This makes replacing your models much easier, as long as each conforms to that interface. Here’s how we would do that:
var xmlSerializer = new GenericXmlDataContractSerializer();
// serialize the model through the IPredictorModel<double> interface
xmlSerializer.Serialize<IPredictorModel<double>>(model,
    () => new StreamWriter(@"C:\randomforest.xml"));
// deserialize it back through the same interface
var loadedModelXml = xmlSerializer
    .Deserialize<IPredictorModel<double>>(() => new StreamReader(@"C:\randomforest.xml"));
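If you prefer the binary format, the usage is analogous. The following is a minimal sketch assuming SharpLearning's GenericBinaryDataContractSerializer, and that it shares the XML serializer's Serialize and Deserialize signatures; the .bin path is just an example:
var binarySerializer = new GenericBinaryDataContractSerializer();
// write the model to disk in binary form, again via the IPredictorModel interface
binarySerializer.Serialize<IPredictorModel<double>>(model,
    () => new StreamWriter(@"C:\randomforest.bin"));
// read it back through the same interface
var loadedModelBinary = binarySerializer
    .Deserialize<IPredictorModel<double>>(() => new StreamReader(@"C:\randomforest.bin"));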

And there you have it; once a model is loaded back in, it is immediately ready to produce predictions, and with them your training and testing errors, again.
Now, let's talk for a second about hyperparameters. Hyperparameters are parameters that control the learning process of a machine learning algorithm. You can adjust them to tune that process and improve performance and reliability. At the same time, you can adjust them incorrectly and end up with something that does not work as intended. Let's look at a few things that can happen with an incorrectly tuned hyperparameter (a quick sketch follows this list):
- If the model is too complex, you can end up with what is known as high variance, or overfitting
- If the model ends up being too simple, you will end up with what is known as high bias, or underfitting
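To make the distinction concrete, here is a hedged sketch using the same RegressionSquareLossGradientBoostLearner we use later in this section; the values are purely illustrative, and the remaining constructor parameters are assumed to keep their defaults:
// Illustrative only: many iterations and very deep trees make the model flexible
// enough to memorize the training data (high variance, overfitting).
var overfitProne = new RegressionSquareLossGradientBoostLearner(
    iterations: 2000, learningRate: 0.2, maximumTreeDepth: 30, runParallel: false);
// Illustrative only: a handful of iterations and stump-like trees leave the model
// too simple to capture the signal (high bias, underfitting).
var underfitProne = new RegressionSquareLossGradientBoostLearner(
    iterations: 5, learningRate: 0.02, maximumTreeDepth: 1, runParallel: false);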
If you have never done it, manually tuning hyperparameters, something that happens in almost every use case, can take a considerable amount of your time. As the number of hyperparameters in a model grows, so do the tuning time and effort. The best way around this is to use an optimizer and let it do the work for you. Here, SharpLearning can be a huge help thanks to the numerous optimizers it ships with. Here is a list of just some of them:
- Grid search
- Random search
- Particle swarm (which we will talk about in Chapter 7, Replacing Back Propagation with PSO)
- Bayesian optimization
- Globalized bounded Nelder-Mead
Let’s start with an example.
Let’s create a learner using the default parameters, which, more than likely, will be good enough to establish a baseline. Once the learner is created, we learn the model and then predict both the training and test sets. Once all of that is complete, we measure the error on the test set and record it:
// create learner with default parameters
var learner = new RegressionSquareLossGradientBoostLearner(runParallel: false);
// learn model with found parameters
var model = learner.Learn(trainSet.Observations, trainSet.Targets);
// predict the training and test set.
var trainPredictions = model.Predict(trainSet.Observations);
var testPredictions = model.Predict(testSet.Observations);
// since this is a regression problem we are using square error as metric
// for evaluating how well the model performs.
var metric = new MeanSquaredErrorRegressionMetric();
// measure the error on the test set.
var testError = metric.Error(testSet.Targets, testPredictions);
And here is our test set error:

With that part complete, we now have our baseline established. Let’s use a RandomSearchOptimizer to tune the hyperparameters and see if we can get better results. To do this, we need to establish the bounds of each hyperparameter, so our optimizer knows what range to search. Let’s look at how we do this:
var parameters = new ParameterBounds[]
{
    // iterations
    new ParameterBounds(min: 80, max: 300,
        transform: Transform.Linear, parameterType: ParameterType.Discrete),
    // learning rate
    new ParameterBounds(min: 0.02, max: 0.2,
        transform: Transform.Logarithmic, parameterType: ParameterType.Continuous),
    // maximumTreeDepth
    new ParameterBounds(min: 8, max: 15,
        transform: Transform.Linear, parameterType: ParameterType.Discrete),
    // subSampleRatio
    new ParameterBounds(min: 0.5, max: 0.9,
        transform: Transform.Linear, parameterType: ParameterType.Continuous),
    // featuresPrSplit
    new ParameterBounds(min: 1, max: numberOfFeatures,
        transform: Transform.Linear, parameterType: ParameterType.Discrete),
};
Did you notice that we used a Logarithmic transform for the learning rate? We did this to get a more even spread of samples across the entire range of values. The minimum and maximum (0.02 -> 0.2) differ by a factor of ten, so a plain linear transform would spend most of its draws on the larger values, while the logarithmic transform samples small and large learning rates about equally often.
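If you want to see the effect, the following snippet (plain C#, not part of the SharpLearning API) draws one learning rate on the raw scale and one in log space:
// illustration only: sampling a learning rate between 0.02 and 0.2
var random = new Random(42);
double min = 0.02, max = 0.2;
// linear: uniform on the raw scale, so only about 1 in 6 draws lands below 0.05
double linearDraw = min + random.NextDouble() * (max - min);
// logarithmic: uniform in log space, so roughly half of the draws land below
// sqrt(min * max), which is about 0.063, giving small learning rates a fair share
double logDraw = Math.Exp(Math.Log(min) + random.NextDouble() * (Math.Log(max) - Math.Log(min)));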
We now need a validation set to help us measure how well the model generalizes to unseen data during optimization. To get one, we further split the training data, leaving our current test set out of the optimization process entirely. If we don’t, we risk an overly optimistic (positively biased) final error estimate, and that is not what we want:
var validationSplit = new RandomTrainingTestIndexSplitter<double>(trainingPercentage: 0.7, seed: 24)
    .SplitSet(trainSet.Observations, trainSet.Targets);
One more thing that the optimizer will need is an objective function. The function will take a double array as input (containing the set of hyperparameters) and return an OptimizerResult that contains the validation error and the corresponding set of hyperparameters:
Func<double[], OptimizerResult> minimize = p =>
{
    // create a candidate learner from the current set of hyperparameters
    var candidateLearner = new RegressionSquareLossGradientBoostLearner(
        iterations: (int)p[0],
        learningRate: p[1],
        maximumTreeDepth: (int)p[2],
        subSampleRatio: p[3],
        featuresPrSplit: (int)p[4],
        runParallel: false);
    // learn on the training part of the validation split
    var candidateModel = candidateLearner.Learn(validationSplit.TrainingSet.Observations,
        validationSplit.TrainingSet.Targets);
    // measure the error on the validation part
    var validationPredictions = candidateModel.Predict(validationSplit.TestSet.Observations);
    var candidateError = metric.Error(validationSplit.TestSet.Targets, validationPredictions);
    return new OptimizerResult(p, candidateError);
};
Once this objective function has been defined, we can now create and run the optimizer to find the best set of parameters. Let’s start out by running our optimizer for 30 iterations and trying out 30 different sets of hyperparameters:
// create our optimizer
var optimizer = new RandomSearchOptimizer(parameters, iterations: 30, runParallel: true);
// find the best hyperparameters for use
var result = optimizer.OptimizeBest(minimize);
var best = result.ParameterSet;
Once we run this, our optimizer should find the best set of hyperparameters. Let’s see what it finds:
- Trees: 277
- learningRate: 0.035
- maximumTreeDepth: 15
- subSampleRatio: 0.838
- featuresPrSplit: 4
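These values live on the result returned by OptimizeBest; if you want to print them yourself, here is a minimal sketch, assuming OptimizerResult exposes the validation error as an Error property alongside ParameterSet:
// print the best validation error and the corresponding hyperparameters
Console.WriteLine($"Best validation error: {result.Error}");
Console.WriteLine("Best hyperparameters: " + string.Join(", ", best));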
Progress. Now that we have the best set of hyperparameters, as measured on our validation set, we can create a learner with these parameters and learn a new model on the full training set:
var learner = new RegressionSquareLossGradientBoostLearner(
    iterations: (int)best[0],
    learningRate: best[1],
    maximumTreeDepth: (int)best[2],
    subSampleRatio: best[3],
    featuresPrSplit: (int)best[4],
    runParallel: false);
var model = learner.Learn(trainSet.Observations, trainSet.Targets);
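To verify the improvement, we can score the tuned model on the same held-out test set we used for the baseline. This is a minimal sketch that reuses the metric and testSet variables from earlier:
// measure the tuned model's error on the held-out test set
var tunedPredictions = model.Predict(testSet.Observations);
var tunedTestError = metric.Error(testSet.Targets, tunedPredictions);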
With our final set of hyperparameters in hand, we pass them to our learner and are able to reduce the test error significantly compared to the baseline. Doing that manually would have taken us an eternity and beyond!
