Model Types of ASCMO-STATIC

The following model types are available for ASCMO-STATIC:

If the advanced settings are enabled (File > Options > Advanced Settings), the following additional model types are available:

Note  

For an overview of all model types and their best use, see Overview: Model Type Descriptions.

No Model

Only measurement data is displayed.

ASC GP Model

The ASCMO Gaussian Process (ASC GP) model type can be used for data sets with up to 4000 training data points with 15 inputs. This is the default model type. Training time and memory consumption scale especially with the number of data points. For many more data points, other model types may be more suitable.

The following model parameters can be specified:

Automatic Transformation

Activate this checkbox if you want the optimal Box-Cox transformation to be determined automatically during model training.

Output Transformation

Select the transformation type of the output. Using a transformation can improve the model prediction. If the Automatic Transformation checkbox is activated, the transformation type is selected automatically. Not all transformations are available if the training data has negative or zero values.

You can select from the following choices:

  • none: no transformation
  • 1/y: inversion
  • 1/sqrt(y): inverted square root
  • log(y): logarithm
  • sqrt(y): square root
  • Bounded: limited to lower and upper bound

    Click Edit to view the automatically selected bounds or to define the lower and upper bounds manually. To define them manually, deactivate the Automatic checkbox. The bounds must be in the range of the training data.

  • log(y+c): logarithm plus constant

    Click Edit to view the automatically selected log shift or to define a manual shift value. To define it manually, deactivate the Automatic checkbox.
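The listed transformations and their inverses can be sketched in Python as follows. This is a minimal illustration, not the ASCMO implementation: the names TRANSFORMS, log_shift, and round_trip are hypothetical, the shift c is supplied by the caller, and the Bounded transformation is omitted because its formula is not given here.

```python
import math

# Forward/inverse pairs for the listed output transformations.
# All names here are hypothetical and only serve the illustration.
TRANSFORMS = {
    "none": (lambda y: y, lambda t: t),
    "1/y": (lambda y: 1.0 / y, lambda t: 1.0 / t),
    "1/sqrt(y)": (lambda y: 1.0 / math.sqrt(y), lambda t: 1.0 / t**2),
    "log(y)": (math.log, math.exp),
    "sqrt(y)": (math.sqrt, lambda t: t**2),
}


def log_shift(c):
    """Return forward and inverse functions for log(y + c)."""
    return (lambda y: math.log(y + c), lambda t: math.exp(t) - c)


def round_trip(name, y):
    """Transform y and map it back to the original scale."""
    forward, inverse = TRANSFORMS[name]
    return inverse(forward(y))
```

The sketch also shows why some choices are unavailable for negative or zero training data: math.log and math.sqrt raise a ValueError for negative arguments, and 1/y is undefined at zero.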

Advanced

If the advanced settings are enabled (File > Options > Advanced Settings), additional settings for the model training of the output can be selected here.

Advanced parameters for the ASC GP model

Number of Iterations

Enter the number of iterations to be performed during model training. If the model performance does not improve within 10 iterations on the validation data, the training will be aborted. In deep learning this is often referred to as number of epochs.
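The stopping rule can be illustrated with a short Python sketch. It assumes that "no improvement within 10 iterations" means ten consecutive iterations without a new best validation error. The function name and the precomputed list of validation errors are hypothetical stand-ins for a real training loop.

```python
def train_with_early_stopping(validation_errors, max_iterations, patience=10):
    """Return (best_iteration, best_error, last_iteration_run).

    validation_errors holds one validation error per iteration and
    stands in for a real training loop, which is not shown here.
    """
    errors = list(validation_errors)[:max_iterations]
    best_error = float("inf")
    best_iteration = -1
    for iteration, error in enumerate(errors):
        if error < best_error:
            best_error, best_iteration = error, iteration
        elif iteration - best_iteration >= patience:
            # No new best result for `patience` iterations: abort.
            return best_iteration, best_error, iteration
    return best_iteration, best_error, len(errors) - 1
```

With errors that improve for three iterations and then stagnate, the loop stops ten iterations after the last improvement, well before the configured number of iterations is reached.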

Number of Multistarts

Enter the number of training repetitions with different starting values. A higher value can improve the model quality, but the model training then takes more time. The default value is 3.

Plot Likelihood

Activate if you want the values of the logarithmic likelihood function and the Leave-One-Out error to be displayed as a function of the runs during model training.

Max No. of Basis Functions

Enter a value as maximum number of basis functions for the model training. A higher value can improve the model quality, but the model training then takes more time. The default value is 4000. The smaller value of either training data points or subset size is used as the number of basis functions in the model. If more training data points are available, all data is used for training, but the resulting model is constrained to 4000 basis functions. A slightly different training algorithm is used in this case, while model predictions are done in the same way.

Sparse Subset Selection Method

Select the method you want to use to reduce the training data points to a sparse subset when there are more than "Max No. of Basis Functions" data points.

  • Random (Default): Random selection of a subset without specific selection criteria.

  • GP-SCS like: Selection using Gaussian process regression to iteratively add relevant features to a subset based on their impact on predictive performance.

Kernel

Select the kernel function to be used during model training.

  • Squared Exponential (ARD) (Default): Trains models with softer curve characteristics.

  • Matern (ARD): Trains models with harder curve characteristics. Provides sharper resolution of strong nonlinear effects with small noise. This could lead to overfitting.
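The difference between the two kernels can be sketched in Python. This is an illustration under assumptions, not the ASCMO implementation: the function names are hypothetical, the signal amplitude is fixed to 1, and the Matern smoothness of 3/2 is an assumed value because the smoothness is not stated here.

```python
import math


def scaled_distance(x, z, length_scales):
    """Euclidean distance after dividing each input by its length scale (ARD)."""
    return math.sqrt(sum(((a - b) / r) ** 2 for a, b, r in zip(x, z, length_scales)))


def squared_exponential_ard(x, z, length_scales):
    """Squared exponential kernel with one length scale per input."""
    d = scaled_distance(x, z, length_scales)
    return math.exp(-0.5 * d * d)


def matern32_ard(x, z, length_scales):
    """Matern kernel with smoothness 3/2 (the smoothness is an assumption)."""
    d = scaled_distance(x, z, length_scales)
    return (1.0 + math.sqrt(3.0) * d) * math.exp(-math.sqrt(3.0) * d)
```

Both kernels return 1 for identical points and decay with distance. The Matern kernel is less smooth around zero distance, which is consistent with the "harder" curve characteristics described above.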

Automatic Noise

Activate if you want the optimization of the noise level parameter to be performed automatically.

Noise Level

Enter a value for the maximum noise level to be tolerated. The input field is only accessible if the "Automatic Noise" checkbox is deactivated.

Lower Bound Noise

Enter a value as minimum for the noise level parameter. The input field is only accessible if the "Automatic Noise" checkbox is activated.

Linear Extrapolation

Activate if you want a supplementary linear model to be used, which learns the basic tendency of the data and shows this tendency outside the measured area.

Modeling Criterion

Select the modeling criterion to be used during model training.

  • Default: Based on likelihood.

  • Relative Error: Is the quotient of the error and the actual value ((measured data - predicted data) / measured data * 100). During the optimization, a visualization window pops up where you can manually stop the optimization and use the best result so far.

Input Length Scale Squared

Click Edit to open the Input Length Scale Squared window. You can edit the hyperparameter for each input. If Automatic is activated for an input, the respective hyperparameter is set automatically. Otherwise, you can edit the values manually. The hyperparameter per input dimension is the core width of the Gaussian bell. The length scale is r in the following equation, so a smaller value has greater relevance:
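As a hedged illustration, assuming the standard squared exponential ARD kernel (the exact form used by ASCMO may differ), the role of r can be written as shown here. The symbols \sigma_f (signal amplitude) and d (number of inputs) are assumptions introduced for this sketch.

```latex
k(x, x') = \sigma_f^2 \exp\left( -\frac{1}{2} \sum_{i=1}^{d} \frac{(x_i - x'_i)^2}{r_i^2} \right)
```

With a small r_i, the kernel value drops quickly when two points differ in input i, so the model output reacts strongly to that input.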

You can display the relevances graphically in the Relevance of Inputs (ASCMO-STATIC).

MLP Model

The MLP (Multilayer Perceptron) model type can be exported to flatbuffer format for Bosch ECUs.

The following model parameters can be specified:

Automatic Transformation

Activate this checkbox if you want the optimal Box-Cox transformation to be determined automatically during model training.

Output Transformation

Select the transformation type of the output. Using a transformation can improve the model prediction. If the Automatic Transformation checkbox is activated, the transformation type is selected automatically. Not all transformations are available if the training data has negative or zero values.

You can select from the following choices:

  • none: no transformation
  • 1/y: inversion
  • 1/sqrt(y): inverted square root
  • log(y): logarithm
  • sqrt(y): square root
  • Bounded: limited to lower and upper bound

    Click Edit to view the automatically selected bounds or to define the lower and upper bounds manually. To define them manually, deactivate the Automatic checkbox. The bounds must be in the range of the training data.

  • log(y+c): logarithm plus constant

    Click Edit to view the automatically selected log shift or to define a manual shift value. To define it manually, deactivate the Automatic checkbox.

Iterations

Enter the number of iterations to be performed during model training. If the model performance does not improve within 10 iterations on the validation data, the training will be aborted. In deep learning this is often referred to as number of epochs.

Multistart

Enter the number of times to run the optimizer with different starting values during model training. A higher number means a higher probability of finding the optimal model, but it takes more time.

Validations Ratio [%]

Enter the relative number, in percent, of validation samples to be randomly selected from the training data.
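A random validation split of this kind can be sketched as follows. The function name, the rounding of the sample count, and the fixed seed are assumptions made for the illustration.

```python
import random


def split_validation(num_samples, validation_ratio_percent, seed=0):
    """Randomly pick validation indices; the rest remain training indices."""
    num_validation = round(num_samples * validation_ratio_percent / 100.0)
    rng = random.Random(seed)
    validation = set(rng.sample(range(num_samples), num_validation))
    training = [i for i in range(num_samples) if i not in validation]
    return training, sorted(validation)
```

For example, a ratio of 20% on 200 training samples sets aside 40 samples for validation and keeps 160 for training.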

Use Test Data as Validation Data

Activate the checkbox if you want to use the test data as validation data.

Continue Training

Activate the checkbox to continue with existing model training and iterations, if possible, instead of starting a new training. You can change the training properties and continue. For example, train with a complex activation function, then switch to a more efficient one (for the ECU) and continue training seamlessly.

Layers

Configure the layers of the multilayer perceptron. There is always at least one hidden layer, usually 1-3, and exactly one output layer.

Select an activation function from the list:

  • Linear: y = x

  • ReLU: y = max(0, x)

  • LeakyReLU: y = max(0.01 * x, x)

  • Sigmoid: y = 1 / (1 + exp(-x))

  • PrecTanh: y = 2 / (1 + exp(-2 * x)) - 1

  • Elliotsig: y = x / (1 + abs(x))
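The formulas in the list translate directly into Python. The dictionary name ACTIVATIONS is hypothetical; the formulas themselves are taken from the list above.

```python
import math

# Activation functions exactly as given in the list above.
ACTIVATIONS = {
    "Linear": lambda x: x,
    "ReLU": lambda x: max(0.0, x),
    "LeakyReLU": lambda x: max(0.01 * x, x),
    "Sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "PrecTanh": lambda x: 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0,
    "Elliotsig": lambda x: x / (1.0 + abs(x)),
}
```

PrecTanh is mathematically identical to tanh(x). Elliotsig has a similar S-shape with outputs between -1 and 1 but needs no exponential, which makes it cheaper to evaluate.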

Insert: Click to insert a hidden layer.

Delete: Select one or more layers and delete them.

Polynom Model

Note  

The model is only available if you have enabled the advanced settings via File > Options > Advanced Settings (see also Enabling Advanced Settings).

For the Polynom model the following parameters can be specified:

Automatic Transformation

Activate this checkbox if you want the optimal Box-Cox transformation to be determined automatically during model training.

Output Transformation

Select the transformation type of the output. Using a transformation can improve the model prediction. If the Automatic Transformation checkbox is activated, the transformation type is selected automatically. Not all transformations are available if the training data has negative or zero values.

You can select from the following choices:

  • none: no transformation
  • 1/y: inversion
  • 1/sqrt(y): inverted square root
  • log(y): logarithm
  • sqrt(y): square root
  • Bounded: limited to lower and upper bound

    Click Edit to view the automatically selected bounds or to define the lower and upper bounds manually. To define them manually, deactivate the Automatic checkbox. The bounds must be in the range of the training data.

  • log(y+c): logarithm plus constant

    Click Edit to view the automatically selected log shift or to define a manual shift value. To define it manually, deactivate the Automatic checkbox.

Interaction 1/2

  • Interaction of first order:

    xi * xj with i!=j

  • Interaction of second order:

    xi * xj * xk with i=j allowed
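One possible reading of these two definitions is enumerated in the sketch below. It is an interpretation, not the documented ASCMO behavior: the function names are hypothetical, and pure cubic terms (i = j = k) are excluded from the second-order interactions on the assumption that they count as pure terms instead.

```python
from itertools import combinations, combinations_with_replacement


def first_order_interactions(num_inputs):
    """Index pairs (i, j) with i != j, one per product xi * xj."""
    return list(combinations(range(num_inputs), 2))


def second_order_interactions(num_inputs):
    """Index triples for xi * xj * xk; repeated indices are allowed.

    Triples with i == j == k are skipped (assumption: those are pure terms).
    """
    triples = combinations_with_replacement(range(num_inputs), 3)
    return [t for t in triples if len(set(t)) > 1]
```

For three inputs, this gives 3 first-order interactions and 7 second-order interactions, for example (0, 0, 1) for x0 * x0 * x1.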

Order

Maximal order of the polynomial.

Pure terms xi^n.

Stepwise Regression

When this option is enabled, the regression model is built up gradually. Terms with a p-value (significance level) less than 5% are included sequentially. Including new terms may change the significance level of an already included term. If the p-value is greater than 10%, terms are removed. This method is also called forward selection.

ASC Compressed Model

Note  

The model is only available if you have enabled the advanced settings via File > Options > Advanced Settings (see also Enabling Advanced Settings). You will also need the additional license ASCMO_MODEL_COMPRESSION.

This model type allows you to limit the number of basis functions within the model (see Model Compression). The following model parameters can be specified:

Model Size

Enter the number of basis functions of the compressed model.

Number of Iterations 1/2

Enter the number of iterations to be performed during model training. If the model performance does not improve within 10 iterations on the validation data, the training will be aborted. In deep learning this is often referred to as number of epochs.

Multistart

Enter the number of times to run the optimizer with different starting values during model training. A higher number means a higher probability of finding the optimal model, but it takes more time.

Plot Error

Activate the checkbox to display model error information during model training.

To save the RMSE as a bitmap graphic, use View > Save as Bitmap in the main window.

Output Transformation

Select the transformation type of the output. Using a transformation can improve the model prediction. Not all transformations are available if the training data has negative or zero values.

You can select from the following choices:

  • none: no transformation
  • 1/y: inversion
  • 1/sqrt(y): inverted square root
  • log(y): logarithm
  • sqrt(y): square root
  • Bounded: limited to lower and upper bound

    Click Edit to view the automatically selected bounds or to define the lower and upper bounds manually. To define them manually, deactivate the Automatic checkbox. The bounds must be in the range of the training data.

  • log(y+c): logarithm plus constant

    Click Edit to view the automatically selected log shift or to define a manual shift value. To define it manually, deactivate the Automatic checkbox.

Advanced

Click Edit to edit the advanced parameters.

Modeling Criterion

Select the modeling criterion to be used during model training.

  • RMSE: The Root Mean Square Error is a measure of the average size of the error between the predicted and actual values. Assuming normally distributed errors, a second measurement is less than 1 RMSE from the model prediction with about 68% probability (95.5% within 2 RMSE, 99.7% within 3 RMSE). During the optimization, a visualization window pops up where you can manually stop the optimization and use the best result so far.

  • Relative Error: Is the quotient of the error and the actual value ((measured data - predicted data) / measured data * 100). During the optimization, a visualization window pops up where you can manually stop the optimization and use the best result so far.
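Both criteria can be computed with a few lines of Python. The function names are hypothetical. How ASCMO aggregates the per-point relative errors into a single criterion is not stated here, so the sketch returns them per data point.

```python
import math


def rmse(measured, predicted):
    """Root mean square error over paired measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_errors_percent(measured, predicted):
    """Per-point relative error in percent: (measured - predicted) / measured * 100."""
    return [(m - p) / m * 100.0 for m, p in zip(measured, predicted)]
```

The relative error is undefined for measured values of zero, so such points would need separate handling.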

Max. Bump Height

Enter a number to limit the height of the Gaussian bump.

Min. Bump Width

Click Edit and enter a number to limit the width of the Gaussian bump per input.

Classification (GP) Model

Note  

The model is only available if you have enabled the advanced settings via File > Options > Advanced Settings (see also Enabling Advanced Settings).

This type of model allows you to classify existing measurements (inputs) into two classes, 0 or 1. The curve shown in the classification model is equal to the probability that the current input belongs to class 1. For example, you can determine whether the engine knocks in a certain configuration (class 1) or not (class 0).

The following model parameters can be specified:

Number of Iterations

Enter the number of iterations to be performed during model training. If the model performance does not improve within 10 iterations on the validation data, the training will be aborted. In deep learning this is often referred to as number of epochs.

Kernel

Defines which kernel function is used for model training.

  • Squared Exponential (ARD): Trains models with softer curve characteristics.
  • Matern (ARD): Trains models with harder curve characteristics. This can lead to overfitting.

Export/Plot Category

If deactivated, the probability of mapping the input to class 1 is plotted.

If activated, the probability values greater than or equal to the threshold are mapped to class 1. The class membership is plotted.

Category Threshold

In the model evaluation, measurement data will be assigned to class 1 if it meets or exceeds this threshold. The Plot Category checkbox must be activated. The threshold must be a number in [0, 1]; the default is 0.5.
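The mapping from probability to class can be sketched as follows. The function name classify is hypothetical; the comparison and the default follow the description above.

```python
def classify(probability_class_1, threshold=0.5):
    """Map a class-1 probability to class 1 or class 0."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    # A probability that meets or exceeds the threshold belongs to class 1.
    return 1 if probability_class_1 >= threshold else 0
```

Raising the threshold makes the assignment to class 1 more conservative: a probability of 0.7 is class 1 with the default threshold but class 0 with a threshold of 0.8.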

Classification (MLP) Model

The following model parameters can be specified:

Layers

Configure the layers of the multilayer perceptron.

Select an activation function from the list:

  • Linear: y = x

  • ReLU: y = max(0, x)

  • LeakyReLU: y = max(0.01 * x, x)

  • Sigmoid: y = 1 / (1 + exp(-x))

  • PrecTanh: y = 2 / (1 + exp(-2 * x)) - 1

  • Elliotsig: y = x / (1 + abs(x))

Add Layer: Click to insert a layer.

Layer: Select one or more layers and delete them.

Number of Network Parameters

Dynamically shows the number of parameters used by the model training for current settings.
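For a fully connected network, the parameter count can be estimated with the sketch below. It assumes one weight per connection and one bias per neuron; whether ASCMO counts additional parameters is not stated here, and the function name is hypothetical.

```python
def count_mlp_parameters(num_inputs, layer_sizes):
    """Count weights and biases of a fully connected MLP.

    layer_sizes lists the neurons of each hidden layer followed by
    the size of the output layer.
    """
    total = 0
    previous = num_inputs
    for size in layer_sizes:
        total += (previous + 1) * size  # weights plus one bias per neuron
        previous = size
    return total
```

A network with 4 inputs, two hidden layers of 10 neurons, and 1 output has 50 + 110 + 11 = 171 parameters under these assumptions.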

Continue Training

Activate the checkbox to continue with existing model training and iterations, if possible, instead of starting a new training. You can change the training properties and continue. For example, train with a complex activation function, then switch to a more efficient one (for the ECU) and continue training seamlessly.

Number of Multistarts

Enter the number of training repetitions with different starting values. A higher value can improve the model quality, but the model training then takes more time. The default value is 3.

Number of Iterations

Enter the number of iterations to be performed during model training. If the model performance does not improve within 10 iterations on the validation data, the training will be aborted. In deep learning this is often referred to as number of epochs.

Use Test Data as Validation Data

Activate the checkbox if you want to use the test data as validation data.

Validations Ratio [%]

Enter the relative number, in percent, of validation samples to be randomly selected from the training data.

Plot RMSE during Training

Activate if you want the RMSE values for training data and validation data to be displayed during model training.

Activate Detailed Training Settings

Activate the checkbox to display the Detailed Training Settings section.

  • Optimizer

    Select the optimizer used to train the model. If you activate the Continue Training checkbox, it is recommended to select Stochastic Gradient Descent (for continue).

The detailed training settings are adjusted for each iteration. For the first iteration the Start Value is used, for the last iteration the Final Value. The values in between are interpolated.

  • No. of Optimizer Substeps

    Determines how many sequences of length Lookback Length are used for one optimizer update. The default value is 100, which is the batch size used in deep learning. The larger the value, the smaller the batch size and vice versa. If the number is small, the optimizer step is performed less frequently and the training is therefore faster.

  • Learning Rate

    Enter the size of the optimizer steps. The default value is 0.01. Valid value range is [0, 1].

    The larger the learning rate, the faster the training generally will be. However, convergence can be hindered, or even prevented, by large learning rates.
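The interpolation between Start Value and Final Value over the iterations can be sketched as follows. Linear interpolation is an assumption, since the interpolation scheme is not specified here, and the function name is hypothetical.

```python
def interpolate_setting(start_value, final_value, iteration, num_iterations):
    """Value of a training setting at a zero-based iteration index.

    Linear interpolation between start and final value is an assumption.
    """
    if num_iterations <= 1:
        return start_value
    fraction = iteration / (num_iterations - 1)
    return start_value + fraction * (final_value - start_value)
```

For a learning rate that starts at 0.01 and ends at 0.001, the first iteration uses 0.01, the last uses 0.001, and the iterations in between use values along the straight line connecting them.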

Plot Category

If deactivated, the probability of mapping the input to class 1 is plotted.

If activated, the probability values greater than or equal to the threshold are mapped to class 1. The class membership is plotted.

Category Threshold

In the model evaluation, measurement data will be assigned to class 1 if it meets or exceeds this threshold. The Plot Category checkbox must be activated. The threshold must be a number in [0, 1]; the default is 0.5.

ASC GP-SCS Model

Note  

The model is only available if you have enabled the advanced settings via File > Options > Advanced Settings (see also Enabling Advanced Settings). You will also need the additional license ASCMO_MODEL_COMPRESSION.

The ASCMO Gaussian Process Sparse Constant Sigma (ASC GP-SCS) model type should be preferred if the number of training data points is large. You can define the following parameters:

No. Basis Functions

For a faster convergence, use a smaller number than in the actual model training (e.g., 20).

Output Transformation

Select the transformation type of the output. Using a transformation can improve the model prediction. Not all transformations are available if the training data has negative or zero values.

You can select from the following choices:

  • none: no transformation
  • log(y): logarithm
  • Bounded: limited to lower and upper bound

    Click Edit to view the automatically selected bounds or to define the lower and upper bounds manually. To define them manually, deactivate the Automatic checkbox. The bounds must be in the range of the training data.

  • log(y+c): logarithm plus constant

    Click Edit to view the automatically selected log shift or to define a manual shift value. To define it manually, deactivate the Automatic checkbox.

Optimize Positions

If the checkbox is activated, virtual or pseudo inputs at optimized positions are used instead of basis functions at positions of randomly chosen training data.

The number of inputs used for training is given in the "No. Basis Functions" field.

Multistart

Enter the number of times to run the optimizer with different starting values during model training. A higher number means a higher probability of finding the optimal model, but it takes more time.

Input Length Scale Squared

Click Edit to open the Input Length Scale Squared window. You can edit the hyperparameter for each input. If Automatic is activated for an input, the respective hyperparameter is set automatically. Otherwise, you can edit the values manually. The hyperparameter per input dimension is the core width of the Gaussian bell. The length scale is r in the following equation, so a smaller value has greater relevance:

You can display the relevances graphically in the Relevance of Inputs (ASCMO-STATIC).

Default

Restores the default settings for all parameters.

Note  

Models of type ASC GP-SCS cannot be exported to INCA/MDA.

Classification (Random Decision Trees) Model

Random decision trees classification uses an ensemble of multiple binary decision trees. By using multiple decision trees, it reduces the risk of overfitting. Different trees use different subsets of the training data, and the same data points can appear in multiple trees; this is called bagging. In addition to bagging, different trees model different features (sets of inputs). This is the key point of the algorithm and is also called random subspace sampling or feature bagging. You can define the following model parameters:

Global Settings

Number of Trees

Number of trees used by the random decision trees algorithm.

Use Bagging

If activated (recommended), each tree of the random decision trees is trained with a subset of all training data (the size of the subset can be specified by further options).

Bagging reduces the risk of overfitting.

Maximum Number of Samples per Tree

Specifies the number of random draws from the training data to determine the training data of a single tree.

Number of Training Samples

Absolute Number: An exact number can be specified.

Fraction: A value between 0 and 1 can be specified; the number of training data samples for a single tree will then be max(1, fraction * number of training data).

All Training Data: Number of draws is equal to the number of training data.
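The three options and the resulting draw can be sketched in Python. The function names and mode strings are hypothetical. Truncating the fractional result to an integer and drawing with replacement are assumptions, since neither is specified above.

```python
import random


def samples_per_tree(num_training, mode, value=None):
    """Number of random draws used to build the training data of one tree."""
    if mode == "absolute":
        return int(value)
    if mode == "fraction":
        # max(1, fraction * number of training data); truncation is an assumption.
        return max(1, int(value * num_training))
    if mode == "all":
        return num_training
    raise ValueError(f"unknown mode: {mode}")


def draw_bag(num_training, num_draws, seed=0):
    """Draw training indices for one tree, with replacement (assumption)."""
    rng = random.Random(seed)
    return [rng.randrange(num_training) for _ in range(num_draws)]
```

With 1000 training samples and a fraction of 0.25, each tree is trained on 250 draws. The lower limit of 1 ensures that even a very small fraction yields a non-empty bag.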

Maximum Number of Inputs per Split

In the training process, the best split of the data has to be found for each node. To avoid overfitting and to speed up the training process, it can be useful to limit the number of inputs that take part in the split decision (for each split, a subset is randomly selected). Especially for high-dimensional problems, it is recommended to use an option like Square Root of Number of Inputs.

Number of Inputs

All Inputs: All inputs are included.

Absolute Number: Specifies the number of features considered in a split decision, in the range [1, number of features].

Fraction: A fraction (a number in [0, 1]) of the total number of inputs is selected for each split, i.e., max(1, round(fraction * number of inputs)).

Square Root of the Number of Inputs: The total number of inputs for a split is the square root of the number of inputs, i.e., round(sqrt(number of inputs)).

2-Logarithm of Number of Inputs: The total number of inputs for a split is the logarithm to the base 2 of the number of inputs, i.e., round(log2(number of inputs)).
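The formulas for the five options can be collected in one function. This is a sketch with hypothetical names and mode strings. Python's built-in round rounds exact halves to the nearest even integer, which may differ from the rounding used by ASCMO.

```python
import math


def inputs_per_split(num_inputs, mode, value=None):
    """Number of inputs considered in each split decision."""
    if mode == "all":
        return num_inputs
    if mode == "absolute":
        if not 1 <= value <= num_inputs:
            raise ValueError("value must be in [1, number of inputs]")
        return value
    if mode == "fraction":
        return max(1, round(value * num_inputs))
    if mode == "sqrt":
        return round(math.sqrt(num_inputs))
    if mode == "log2":
        return round(math.log2(num_inputs))
    raise ValueError(f"unknown mode: {mode}")
```

For 16 inputs, both the square root and the base-2 logarithm options select 4 inputs per split, compared with 16 for All Inputs.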

Tree Settings

Maximum Depth

Integer value that defines the maximum depth for each decision tree in the random decision trees.

Split Criterion

Gini impurity (between 0 and 0.5).

Shannon information gain (entropy) (between 0 and 1).
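For a two-class problem, both measures can be computed from the fraction of class-1 samples in a node. The sketch below uses hypothetical function names and the standard definitions of Gini impurity and Shannon entropy with a base-2 logarithm.

```python
import math


def gini_impurity(p1):
    """Gini impurity of a node whose class-1 fraction is p1 (binary case)."""
    return 1.0 - p1**2 - (1.0 - p1) ** 2


def shannon_entropy(p1):
    """Shannon entropy in bits of a node whose class-1 fraction is p1."""
    total = 0.0
    for p in (p1, 1.0 - p1):
        if p > 0.0:  # 0 * log2(0) is treated as 0
            total -= p * math.log2(p)
    return total
```

Both measures are 0 for a pure node and reach their maximum for an evenly mixed node: 0.5 for the Gini impurity and 1 for the entropy, which matches the ranges given above.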

Minimum Number of Samples per Split

If the number of samples in a node is less than this value, the node is not split further.

Minimum Samples per Leaf

Splitting a node that results in leaves with fewer samples than this value is considered to be invalid.

Minimum Split Gain

Let this value be v and the total number of training samples S.

For each possible split into a left node with n samples and a right node with m samples, the gain g for the split is calculated internally using the split criterion.

A split is considered to be invalid if (n+m)/S*g is less than or equal to v.
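The validity check can be written as a one-line comparison. The function and parameter names are hypothetical; the gain g is passed in because its internal calculation is not shown here.

```python
def is_valid_split(n, m, total_samples, gain, min_split_gain):
    """Return True if the weighted gain (n + m) / S * g exceeds v."""
    weighted_gain = (n + m) / total_samples * gain
    return weighted_gain > min_split_gain
```

Because the gain is weighted by the share of samples in the node, the same gain can pass the check in a large node near the root and fail it in a small node deep in the tree.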

Plot Settings

Export/Plot Category

If deactivated, the probability of mapping the input to class 1 is plotted.

If activated, the probability values greater than or equal to the threshold are mapped to class 1. The class membership is plotted.

Category Threshold

In the model evaluation, measurement data will be assigned to class 1 if it meets or exceeds this threshold. The Plot Category checkbox must be activated. The threshold must be a number in [0, 1]; the default is 0.5.

See also

Options (ASCMO-STATIC)

Overview: Exports Supported by Model Type

Overview: Model Type Descriptions