train_class_gmm (TrainClassGmm, T_train_class_gmm)

Short description

train_class_gmm: Train a Gaussian Mixture Model.

Signature

train_class_gmm( class_gmm GMMHandle, integer MaxIter, real Threshold, string ClassPriors, real Regularize, out integer Centers, out integer Iter )

Description

train_class_gmm trains the Gaussian Mixture Model (GMM) referenced by GMMHandle. Before the GMM can be trained, all training samples must be stored in the GMM using add_sample_class_gmm, add_samples_image_class_gmm, or read_samples_class_gmm. After the training, new training samples can be added to the GMM and the GMM can be trained again.

During the training, the error of the GMM on the training vectors is minimized with the expectation maximization (EM) algorithm.

MaxIter specifies the maximum number of iterations per class for the EM algorithm. In practice, values between 20 and 200 should be sufficient for most problems. Threshold specifies a threshold for the relative change of the error. If the relative change of the error still exceeds the threshold after MaxIter iterations, the algorithm is canceled for this class. Because the algorithm starts with the maximum specified number of centers (parameter NumCenters in create_class_gmm), the number of centers and the error for this class will not be optimal after such a premature termination. In this case, a new training with different parameters (e.g., another value for RandSeed in create_class_gmm) can be tried.
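The interplay of MaxIter and Threshold can be illustrated with a minimal Python sketch. It is not the operator's implementation: the function name `run_until_converged`, the list of per-iteration error values, and the exact definition of the relative change (absolute difference divided by the previous error) are assumptions made for illustration only.

```python
def run_until_converged(errors, max_iter, threshold):
    """Return (iterations_used, converged) for a list of per-iteration errors.

    `errors` is a hypothetical sequence: errors[0] is the error after the
    first EM iteration, errors[1] after the second, and so on.
    """
    prev = None
    used = 0
    for used, err in enumerate(errors[:max_iter], start=1):
        if prev is not None and prev > 0:
            # Assumed definition of the relative change of the error.
            rel_change = abs(prev - err) / prev
            if rel_change <= threshold:
                return used, True
        prev = err
    # The iteration budget (or the data) ran out before the relative
    # change dropped to the threshold: premature termination.
    return used, False


errors = [10.0, 5.0, 4.0, 3.999]
print(run_until_converged(errors, max_iter=100, threshold=0.001))
print(run_until_converged(errors, max_iter=3, threshold=0.001))
```

With a budget of 100 iterations the sketch stops once the change between two consecutive errors falls to the threshold. With a budget of 3 it stops at the limit without converging, which corresponds to the premature termination described above.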

ClassPriors specifies how the class priors of the GMM are calculated. If 'training' is specified, the priors of the classes are taken from the proportion of the corresponding sample data during training. If 'uniform' is specified, the priors are set to 1/NumClasses for all classes.
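The two modes can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the operator's code: the helper `class_priors` and its arguments are hypothetical names, and class IDs are assumed to run from 0 to NumClasses-1.

```python
from collections import Counter


def class_priors(sample_class_ids, num_classes, mode="training"):
    """Compute per-class priors for the two modes described above."""
    if mode == "uniform":
        # Every class receives the same prior 1/NumClasses.
        return [1.0 / num_classes] * num_classes
    if mode == "training":
        if not sample_class_ids:
            raise ValueError("no training samples available")
        # Prior of a class = its share of the training samples.
        counts = Counter(sample_class_ids)
        total = len(sample_class_ids)
        return [counts.get(c, 0) / total for c in range(num_classes)]
    raise ValueError("mode must be 'training' or 'uniform'")


# Three samples of class 0 and one sample of class 1.
print(class_priors([0, 0, 0, 1], 2, "training"))
print(class_priors([0, 0, 0, 1], 2, "uniform"))
```

For unbalanced training data, as in this example, 'training' favors the more frequent class, while 'uniform' treats both classes equally.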

Regularize is used to regularize (nearly) singular covariance matrices during the training. A covariance matrix might collapse to singularity if it is trained with linearly dependent data. To avoid this, a small value specified by Regularize is added to each main diagonal element of the covariance matrix, which prevents this element from becoming smaller than Regularize. A recommended value for Regularize is 0.0001. If Regularize is set to 0.0, no regularization is performed.
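The effect of the diagonal regularization can be shown with a small Python sketch on a 2x2 matrix. It only illustrates the idea stated above; the helper names `regularize_covariance` and `det2` and the example matrix are assumptions, not part of the operator.

```python
def regularize_covariance(cov, regularize):
    """Return a copy of `cov` with `regularize` added to each diagonal element."""
    return [
        [value + regularize if r == c else value for c, value in enumerate(row)]
        for r, row in enumerate(cov)
    ]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Covariance of linearly dependent data (second feature = 2 * first feature):
# the matrix is singular, so its determinant is zero.
cov = [[1.0, 2.0], [2.0, 4.0]]
print(det2(cov))

# Adding 0.0001 to the main diagonal makes the determinant positive.
reg = regularize_covariance(cov, 0.0001)
print(det2(reg) > 0.0)
```

A value of 0.0 leaves the matrix unchanged, which matches the statement that no regularization is performed in that case.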

The centers are initially randomly distributed. In individual cases, the algorithm yields relatively high errors because the initial random values determined by RandSeed in create_class_gmm lead to local minima. In this case, a new GMM with a different value for RandSeed should be generated to test whether a significantly smaller error can be obtained.

It should be noted that, depending on the number of centers, the type of covariance matrix, and the number of training samples, the training can take from a few seconds to several hours.

On output, train_class_gmm returns in Centers the number of centers per class that have been found to be optimal by the EM algorithm. These values can be used as a reference for NumCenters (in create_class_gmm) for future GMMs. If the number of centers found by training a new GMM on integer training data is unexpectedly high, this might be corrected by adding noise to the training data with the Randomize parameter of add_sample_class_gmm. Iter contains the number of performed iterations per class. If a value in Iter equals MaxIter, the training algorithm has been terminated prematurely (see above).
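A caller may want to check the returned Iter values for prematurely terminated classes. The following Python sketch shows one way to do this on plain lists; the function name `prematurely_terminated` and the example values are hypothetical, and the sketch does not call the operator itself.

```python
def prematurely_terminated(iters, max_iter):
    """Return the indices of classes whose iteration count reached max_iter."""
    # Iter equal to MaxIter signals premature termination; >= is used
    # defensively so that an unexpected larger value is flagged as well.
    return [cls for cls, n in enumerate(iters) if n >= max_iter]


# Hypothetical Iter output for three classes, trained with MaxIter = 100.
iters = [12, 100, 37]
print(prematurely_terminated(iters, 100))
```

If the returned list is not empty, the affected classes are candidates for a new training with different parameters, as described above.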

Execution information

Multithreading type: reentrant (runs in parallel with non-exclusive operators).
Multithreading scope: global (may be called from any thread).
Automatically parallelized on internal data level.

This operator modifies the state of the following input parameter:

GMMHandle

During execution of this operator, access to the value of this parameter must be synchronized if it is used across multiple threads.

Parameters

GMMHandle (input_control, state is modified) class_gmm → (handle)

GMM handle.

MaxIter (input_control) integer → (integer)

Maximum number of iterations of the expectation maximization algorithm.

Default: 100
Suggested values: 10, 20, 30, 50, 100, 200

Threshold (input_control) real → (real)

Threshold for relative change of the error for the expectation maximization algorithm to terminate.

Default: 0.001
Suggested values: 0.001, 0.0001
Restriction: Threshold >= 0.0 && Threshold <= 1.0

ClassPriors (input_control) string → (string)

Mode to determine the a-priori probabilities of the classes.

Default: 'training'
List of values: 'training', 'uniform'

Regularize (input_control) real → (real)

Regularization value for preventing covariance matrix singularity.

Default: 0.0001
Restriction: Regularize >= 0.0 && Regularize < 1.0

Centers (output_control) integer-array → (integer)

Number of found centers per class.

Iter (output_control) integer-array → (integer)

Number of executed iterations per class.

Example

(HDevelop)

create_class_gmm (NumDim, NumClasses, [1,5], 'full', 'none', 0, 42,\
                  GMMHandle)
* Add the training data
read_samples_class_gmm (GMMHandle, 'samples.gsf')
* Train the GMM
train_class_gmm (GMMHandle, 100, 1e-4, 'training', 1e-4, Centers, Iter)
* Write the Gaussian Mixture Model to file
write_class_gmm (GMMHandle, 'gmmclassifier.gmm')

Result

If the parameters are valid, the operator train_class_gmm returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.

Combinations with other operators

Possible predecessors

add_sample_class_gmm, read_samples_class_gmm

Possible successors

evaluate_class_gmm, classify_class_gmm, write_class_gmm, create_class_lut_gmm

Alternatives

read_class_gmm

See also

create_class_gmm

References

Christopher M. Bishop: “Neural Networks for Pattern Recognition”; Oxford University Press, Oxford; 1995.

Mario A.T. Figueiredo: “Unsupervised Learning of Finite Mixture Models”; IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 3; March 2002.

Module

Foundation