Error on Training/Test Data over Model Size
Model menu > Model Compression > Error over model size
Note |
---|
This description refers to the error based on the leave-one-out method, but it applies analogously to the errors based on training data (> Error over Model Size (Training Data)) or test data (> Error over Model Size (Test Data)). In addition, the model error can only be determined if the desired output was modeled with the model type Compressed Model. |
This function helps you assess how the number of training data points used influences the model quality. When you call this function, the Output Selection window opens.
In the Output Selection window you can select the output for which you want to determine the model error.
After you have selected the outputs, click OK. Then the Error on training data over Model Size window opens.
Here you specify the model size at which the analysis starts (Start Model Size), the model size at which the analysis ends (End Model Size), the increment in model size between two evaluations (Stepwidth), and the number of subsets used to determine the error at each size (Number of repetitions).
ASCMO-STATIC provides default values for these parameters, which you can restore at any time via Reset.
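As a rough illustration of how the start size, end size, and step width interact, the following sketch lists the model sizes that would be evaluated and the resulting number of error evaluations. The function name `model_sizes` and the example values are hypothetical, and the assumption that the end size is included when the step lands on it exactly is not confirmed by this documentation.

```python
def model_sizes(start_size, end_size, step_width):
    """Return the model sizes at which the error would be evaluated."""
    if step_width <= 0:
        raise ValueError("step_width must be positive")
    if start_size > end_size:
        raise ValueError("start_size must not exceed end_size")
    # Assumption: the end size is included when the step lands on it exactly.
    return list(range(start_size, end_size + 1, step_width))


# Hypothetical settings, not the ASCMO-STATIC defaults.
sizes = model_sizes(start_size=50, end_size=200, step_width=50)
repetitions = 5
evaluations = len(sizes) * repetitions

print(sizes)        # [50, 100, 150, 200]
print(evaluations)  # 20
```

A smaller step width or a higher number of repetitions increases the number of evaluations proportionally, which is worth keeping in mind for large model sizes.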
After you have defined the parameters, click OK. Then the Analyze Model training error for <output> on model size window opens.
Analyze model training/test error for <Output> on model size
This window shows the average model error (RMSE) for each output as a function of the number of training data points used.
This lets you see whether the model improves when more training data are used, or whether the training data set can be reduced because no appreciable improvement is achieved beyond a certain size.
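For reference, the RMSE is conventionally defined as shown below, where $y_i$ is a measured value, $\hat{y}_i$ the corresponding model prediction, and $N$ the number of data points. This is the standard textbook definition; it is assumed, not confirmed here, that ASCMO-STATIC uses exactly this normalization. For the leave-one-out variant, $\hat{y}_i$ is the prediction of a model trained without point $i$.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}
```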
<Number of Repetitions> different subsets of the training data set are selected for the analysis, and the leave-one-out error is determined for each subset. The bar shows the variance of these <Number of Repetitions> results; the solid line shows their mean value.
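The procedure described above can be sketched as follows. This is a minimal illustration, not the ASCMO-STATIC implementation: a simple least-squares line stands in for the compressed model, the data are synthetic, and all function names (`fit_line`, `loo_rmse`, `error_over_model_size`) are hypothetical.

```python
import math
import random
import statistics


def fit_line(points):
    """Least-squares fit of y = a + b*x; a stand-in for the real model."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def loo_rmse(points):
    """Leave-one-out RMSE: refit without each point, then predict that point."""
    squared_errors = []
    for i, (x, y) in enumerate(points):
        intercept, slope = fit_line(points[:i] + points[i + 1:])
        squared_errors.append((intercept + slope * x - y) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def error_over_model_size(data, start, end, step, repetitions, seed=0):
    """Return {size: (mean, variance)} of the LOO RMSE over random subsets."""
    rng = random.Random(seed)
    results = {}
    for size in range(start, min(end, len(data)) + 1, step):
        # Draw `repetitions` random subsets of this size and score each one.
        errors = [loo_rmse(rng.sample(data, size)) for _ in range(repetitions)]
        results[size] = (statistics.mean(errors), statistics.pvariance(errors))
    return results


# Synthetic data: noisy line y = 2x + 1 (illustrative only).
noise = random.Random(42)
data = [(i / 10, 2.0 * (i / 10) + 1.0 + noise.gauss(0.0, 0.5)) for i in range(60)]

curve = error_over_model_size(data, start=10, end=60, step=10, repetitions=5)
for size, (mean_err, var_err) in curve.items():
    print(f"{size:3d} points: mean LOO RMSE = {mean_err:.3f}, variance = {var_err:.5f}")
```

In this sketch the mean corresponds to the solid line and the variance to the bar. When a subset is as large as the full data set, every repetition contains the same points, so the variance collapses to practically zero.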
Note |
---|
The larger the subsets, the more time is required for the calculation. |