create_ocr_class_mlp / CreateOcrClassMlp / T_create_ocr_class_mlp

Short description

create_ocr_class_mlp: Create an OCR classifier using a multilayer perceptron.

Signature

create_ocr_class_mlp( integer WidthCharacter, integer HeightCharacter, string Interpolation, string Features, string Characters, integer NumHidden, string Preprocessing, integer NumComponents, integer RandSeed, out ocr_mlp OCRHandle )

void CreateOcrClassMlp( const HTuple& WidthCharacter, const HTuple& HeightCharacter, const HTuple& Interpolation, const HTuple& Features, const HTuple& Characters, const HTuple& NumHidden, const HTuple& Preprocessing, const HTuple& NumComponents, const HTuple& RandSeed, HTuple* OCRHandle )

static void HOperatorSet.CreateOcrClassMlp( HTuple widthCharacter, HTuple heightCharacter, HTuple interpolation, HTuple features, HTuple characters, HTuple numHidden, HTuple preprocessing, HTuple numComponents, HTuple randSeed, out HTuple OCRHandle )

def create_ocr_class_mlp( width_character: int, height_character: int, interpolation: str, features: MaybeSequence[str], characters: Sequence[str], num_hidden: int, preprocessing: str, num_components: int, rand_seed: int ) -> HHandle

Herror T_create_ocr_class_mlp( const Htuple WidthCharacter, const Htuple HeightCharacter, const Htuple Interpolation, const Htuple Features, const Htuple Characters, const Htuple NumHidden, const Htuple Preprocessing, const Htuple NumComponents, const Htuple RandSeed, Htuple* OCRHandle )

void HOCRMlp::HOCRMlp( Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HTuple& Features, const HTuple& Characters, Hlong NumHidden, const HString& Preprocessing, Hlong NumComponents, Hlong RandSeed )

void HOCRMlp::HOCRMlp( Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HString& Features, const HTuple& Characters, Hlong NumHidden, const HString& Preprocessing, Hlong NumComponents, Hlong RandSeed )

void HOCRMlp::HOCRMlp( Hlong WidthCharacter, Hlong HeightCharacter, const char* Interpolation, const char* Features, const HTuple& Characters, Hlong NumHidden, const char* Preprocessing, Hlong NumComponents, Hlong RandSeed )

void HOCRMlp::HOCRMlp( Hlong WidthCharacter, Hlong HeightCharacter, const wchar_t* Interpolation, const wchar_t* Features, const HTuple& Characters, Hlong NumHidden, const wchar_t* Preprocessing, Hlong NumComponents, Hlong RandSeed ) (Windows only)

public HOCRMlp( int widthCharacter, int heightCharacter, string interpolation, HTuple features, HTuple characters, int numHidden, string preprocessing, int numComponents, int randSeed )

public HOCRMlp( int widthCharacter, int heightCharacter, string interpolation, string features, HTuple characters, int numHidden, string preprocessing, int numComponents, int randSeed )

void HOCRMlp::CreateOcrClassMlp( Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HTuple& Features, const HTuple& Characters, Hlong NumHidden, const HString& Preprocessing, Hlong NumComponents, Hlong RandSeed )

void HOCRMlp::CreateOcrClassMlp( Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HString& Features, const HTuple& Characters, Hlong NumHidden, const HString& Preprocessing, Hlong NumComponents, Hlong RandSeed )

void HOCRMlp::CreateOcrClassMlp( Hlong WidthCharacter, Hlong HeightCharacter, const char* Interpolation, const char* Features, const HTuple& Characters, Hlong NumHidden, const char* Preprocessing, Hlong NumComponents, Hlong RandSeed )

void HOCRMlp::CreateOcrClassMlp( Hlong WidthCharacter, Hlong HeightCharacter, const wchar_t* Interpolation, const wchar_t* Features, const HTuple& Characters, Hlong NumHidden, const wchar_t* Preprocessing, Hlong NumComponents, Hlong RandSeed ) (Windows only)

void HOCRMlp.CreateOcrClassMlp( int widthCharacter, int heightCharacter, string interpolation, HTuple features, HTuple characters, int numHidden, string preprocessing, int numComponents, int randSeed )

void HOCRMlp.CreateOcrClassMlp( int widthCharacter, int heightCharacter, string interpolation, string features, HTuple characters, int numHidden, string preprocessing, int numComponents, int randSeed )

Description

create_ocr_class_mlp creates an OCR classifier that uses a multilayer perceptron (MLP). The handle of the OCR classifier is returned in OCRHandle.

For a description of how an MLP works, see create_class_mlp. create_ocr_class_mlp creates an MLP with OutputFunction = 'softmax'. The length of the feature vector of the MLP (NumInput in create_class_mlp) is determined from the features that are used for the OCR, which are passed in Features. The features are described below. The number of units in the hidden layer is determined by NumHidden. The number of output variables of the MLP (NumOutput in create_class_mlp) is determined from the names of the characters to be used in the OCR, which are passed in Characters. As described with create_class_mlp, the parameters Preprocessing and NumComponents can be used to specify a preprocessing of the data (i.e., the feature vectors). The OCR already approximately normalizes the features. Hence, Preprocessing can typically be set to 'none'. The parameter RandSeed has the same meaning as in create_class_mlp.

As for general MLP classifiers (see create_class_mlp and set_regularization_params_class_mlp), it may be desirable to regularize OCR classifiers. This can be achieved by calling set_regularization_params_ocr_class_mlp before training the OCR classifier.

Likewise (see create_class_mlp and set_rejection_params_class_mlp), it might be desirable to equip the OCR classifier with the capability to reject unknown characters. The rejection class is by convention an additional symbol chr(26) that must be provided in Characters. The parameters of the rejection class can be set by calling set_rejection_params_ocr_class_mlp before training the OCR classifier.
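
As a rough illustration of the 'softmax' output function and of the one-output-per-character layout described above, the following Python sketch maps raw output activations to values that sum to 1. It is not HALCON code: the function name `softmax`, the sample character set, and the activation values are all made up for illustration. The rejection symbol chr(26) is simply appended as one more class.

```python
import math

def softmax(activations):
    """Map raw output-layer activations to positive values that sum to 1."""
    # Subtracting the maximum improves numerical stability without
    # changing the result.
    peak = max(activations)
    exps = [math.exp(a - peak) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical character set: ten digits plus the rejection symbol chr(26).
characters = [str(d) for d in range(10)] + [chr(26)]

# Made-up activations, one per output unit (one unit per character).
activations = [0.1] * len(characters)
activations[3] = 2.5  # pretend the unit for '3' responds most strongly

confidences = softmax(activations)
best = characters[confidences.index(max(confidences))]
print(best, round(sum(confidences), 6))
```

The character with the largest softmax value is taken as the result here; how HALCON derives the returned confidence is described with the classification operators, not by this sketch.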

The features to be used for the classification are determined by Features. Features can contain a tuple of several feature names. Each of these feature names results in one or more features to be calculated for the classifier. Some of the feature names compute gray value features (e.g., 'pixel_invar'). Because a classifier requires a constant number of features (input variables), a character to be classified is transformed to a standard size, which is determined by WidthCharacter and HeightCharacter. The interpolation to be used for the transformation is determined by Interpolation. It has the same meaning as in affine_trans_image. The interpolation should be chosen such that no aliasing effects occur in the transformation. For most applications, Interpolation = 'constant' should be used. The size of the transformed character should not be chosen too large, because the generalization properties of the classifier may deteriorate for large sizes. In particular, with large sizes, small segmentation errors have a large influence on the computed features if gray value features are used. This happens because segmentation errors change the smallest enclosing rectangle of the regions, so the character is zoomed differently than the characters in the training set. In most applications, sizes between 6x8 and 10x14 should be used.
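
To make the "constant number of features" argument concrete, the following Python sketch resamples character images of different sizes to a fixed WidthCharacter x HeightCharacter grid and flattens the result. It is only an illustration under simplifying assumptions: the helper names `zoom_nearest` and `feature_vector` are hypothetical, the input images are synthetic, and plain nearest-neighbor sampling is used for brevity, which is not how HALCON implements 'constant' interpolation.

```python
def zoom_nearest(pixels, width, height):
    """Resample a 2D list of gray values to width x height (nearest neighbor)."""
    src_h = len(pixels)
    src_w = len(pixels[0])
    out = []
    for r in range(height):
        # Source row that contains the center of this target row.
        sr = min(src_h - 1, int((r + 0.5) * src_h / height))
        row = []
        for c in range(width):
            # Source column that contains the center of this target column.
            sc = min(src_w - 1, int((c + 0.5) * src_w / width))
            row.append(pixels[sr][sc])
        out.append(row)
    return out

def feature_vector(pixels, width=8, height=10):
    """Zoom to the standard size and flatten row by row."""
    zoomed = zoom_nearest(pixels, width, height)
    return [value for row in zoomed for value in row]

# Two synthetic "characters" with different bounding-box sizes.
small = [[(r * 7 + c) % 256 for c in range(12)] for r in range(17)]
large = [[(r + c) % 256 for c in range(31)] for r in range(45)]

# Both end up with width * height gray value features.
print(len(feature_vector(small)), len(feature_vector(large)))
```

Because the zoom is driven by the bounding box of the segmented region, a segmentation error that changes the box also changes which source pixels land in each target cell, which is the sensitivity the paragraph above warns about.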

The feature names that can be passed in Features for the classification of the characters are listed with the parameter Features below.

After the classifier has been created, it is trained using trainf_ocr_class_mlp. After this, the classifier can be saved using write_ocr_class_mlp. Alternatively, the classifier can be used immediately after training to classify characters using do_ocr_single_class_mlp or do_ocr_multi_class_mlp.

HALCON provides a number of pretrained OCR classifiers (see the “Solution Guide I”, chapter ‘OCR’, section ‘Pretrained OCR Fonts’). These pretrained OCR classifiers can be read directly with read_ocr_class_mlp and make it possible to read a wide variety of different fonts without the need to train an OCR classifier. Therefore, it is recommended to check first whether one of the pretrained OCR classifiers can be used successfully. If this is the case, it is not necessary to create and train an OCR classifier.

A comparison of the MLP and the support vector machine (SVM) (see create_ocr_class_svm) typically shows that SVMs are faster at training, especially for huge training sets, and achieve slightly better recognition rates than MLPs. The MLP is faster at classification and should therefore be preferred in time-critical applications. Please note that this guideline assumes optimal tuning of the parameters.

Execution information
  • Multithreading type: reentrant (runs in parallel with non-exclusive operators).

  • Multithreading scope: global (may be called from any thread).

  • Processed without parallelization.

This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.

Parameters

WidthCharacter / widthCharacter / width_character (input_control) integer → (integer); HTuple (Hlong); HTuple (int / long); int; Htuple (Hlong)

Width of the rectangle to which the gray values of the segmented character are zoomed.

Default: 8
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Value range: 4 ≤ WidthCharacter ≤ 20

HeightCharacter / heightCharacter / height_character (input_control) integer → (integer); HTuple (Hlong); HTuple (int / long); int; Htuple (Hlong)

Height of the rectangle to which the gray values of the segmented character are zoomed.

Default: 10
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Value range: 4 ≤ HeightCharacter ≤ 20

Interpolation / interpolation (input_control) string → (string); HTuple (HString); HTuple (string); str; Htuple (char*)

Interpolation mode for the zooming of the characters.

Default: 'constant'
List of values: 'bicubic', 'bilinear', 'constant', 'nearest_neighbor', 'weighted'

Features / features (input_control) string(-array) → (string); HTuple (HString); HTuple (string); MaybeSequence[str]; Htuple (char*)

Features to be used for classification.

Default: 'default'
List of values: 'anisometry', 'chord_histo', 'compactness', 'convexity', 'cooc', 'default', 'foreground', 'foreground_grid_16', 'foreground_grid_9', 'gradient_8dir', 'height', 'moments_central', 'moments_gray_plane', 'moments_region_2nd_invar', 'moments_region_2nd_rel_invar', 'moments_region_3rd_invar', 'num_connect', 'num_holes', 'num_runs', 'phi', 'pixel', 'pixel_binary', 'pixel_invar', 'projection_horizontal', 'projection_horizontal_invar', 'projection_vertical', 'projection_vertical_invar', 'ratio', 'width', 'zoom_factor'

Characters / characters (input_control) string-array → (string); HTuple (HString); HTuple (string); Sequence[str]; Htuple (char*)

All characters of the character set to be read.

Default: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

NumHidden / numHidden / num_hidden (input_control) integer → (integer); HTuple (Hlong); HTuple (int / long); int; Htuple (Hlong)

Number of hidden units of the MLP.

Default: 80
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150
Restriction: NumHidden >= 1

Preprocessing / preprocessing (input_control) string → (string); HTuple (HString); HTuple (string); str; Htuple (char*)

Type of preprocessing used to transform the feature vectors.

Default: 'none'
List of values: 'canonical_variates', 'none', 'normalization', 'principal_components'

NumComponents / numComponents / num_components (input_control) integer → (integer); HTuple (Hlong); HTuple (int / long); int; Htuple (Hlong)

Preprocessing parameter: Number of transformed features (ignored for Preprocessing = 'none' and Preprocessing = 'normalization').

Default: 10
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100
Restriction: NumComponents >= 1

RandSeed / randSeed / rand_seed (input_control) integer → (integer); HTuple (Hlong); HTuple (int / long); int; Htuple (Hlong)

Seed value of the random number generator that is used to initialize the MLP with random values.

Default: 42

OCRHandle / ocrhandle (output_control) ocr_mlp → (handle); HTuple (HHandle); HOCRMlp, HTuple (IntPtr); HHandle; Htuple (handle)

Handle of the OCR classifier.
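
The value ranges, restrictions, and value lists given above can be checked before the operator is called. The following Python sketch is one way to do that. The function name `check_ocr_mlp_params` is hypothetical, the checks only mirror what is listed on this page, and HALCON itself may accept or reject values differently.

```python
# Value lists copied from the parameter descriptions above.
INTERPOLATIONS = {"bicubic", "bilinear", "constant", "nearest_neighbor", "weighted"}
PREPROCESSINGS = {"canonical_variates", "none", "normalization", "principal_components"}

def check_ocr_mlp_params(width_character, height_character, interpolation,
                         num_hidden, preprocessing, num_components):
    """Return a list of problems; an empty list means no documented rule is violated."""
    problems = []
    if not 4 <= width_character <= 20:
        problems.append("WidthCharacter must satisfy 4 <= WidthCharacter <= 20")
    if not 4 <= height_character <= 20:
        problems.append("HeightCharacter must satisfy 4 <= HeightCharacter <= 20")
    if interpolation not in INTERPOLATIONS:
        problems.append("Interpolation is not in the list of values")
    if num_hidden < 1:
        problems.append("NumHidden must be >= 1")
    if preprocessing not in PREPROCESSINGS:
        problems.append("Preprocessing is not in the list of values")
    # NumComponents is ignored for 'none' and 'normalization', but the
    # restriction is listed unconditionally, so it is checked in all cases.
    if num_components < 1:
        problems.append("NumComponents must be >= 1")
    return problems

# The documented defaults pass all checks.
print(check_ocr_mlp_params(8, 10, "constant", 80, "none", 10))
```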

Example

(HDevelop)

read_image (Image, 'letters')
* Segment the image.
binary_threshold (Image, Region, 'max_separability', 'dark', UsedThreshold)
dilation_circle (Region, RegionDilation, 3.5)
connection (RegionDilation, ConnectedRegions)
intersection (ConnectedRegions, Region, RegionIntersection)
sort_region (RegionIntersection, Characters, 'character', 'true', 'row')
* Generate the training file.
count_obj (Characters, Number)
Classes := []
for J := 0 to 25 by 1
    Classes := [Classes,gen_tuple_const(20,chr(ord('a')+J))]
endfor
Classes := [Classes,gen_tuple_const(20,'.')]
write_ocr_trainf (Characters, Image, Classes, 'letters.trf')
* Generate and train the classifier.
read_ocr_trainf_names ('letters.trf', CharacterNames, CharacterCount)
create_ocr_class_mlp (8, 10, 'constant', 'default', CharacterNames, 20, \
                      'none', 81, 42, OCRHandle)
trainf_ocr_class_mlp (OCRHandle, 'letters.trf', 100, 0.01, 0.01, Error, \
                      ErrorLog)
* Re-classify the characters in the image.
do_ocr_multi_class_mlp (Characters, Image, OCRHandle, Class, Confidence)
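
The label tuple built by the loop in the example can be easier to follow when written out in another language. The following Python sketch reproduces only that step. It assumes, as the HDevelop code implies, that the sorted regions consist of 20 samples of each letter 'a' to 'z' followed by 20 samples of '.'. The variable name `classes` is chosen here for illustration.

```python
# Mirror of the HDevelop loop: 20 labels per letter 'a'..'z', then 20 of '.'.
classes = []
for j in range(26):
    classes.extend([chr(ord("a") + j)] * 20)
classes.extend(["."] * 20)

# One label per segmented character region, in the sorted region order.
print(len(classes), classes[0], classes[20], classes[-1])
```

In the HDevelop example, write_ocr_trainf pairs each label with the region at the same position, so the label order has to match the order produced by sort_region.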

Result

If the parameters are valid, the operator create_ocr_class_mlp returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.

Combinations with other operators

Possible successors

trainf_ocr_class_mlp, set_regularization_params_ocr_class_mlp, set_rejection_params_ocr_class_mlp

Alternatives

create_ocr_class_svm

See also

do_ocr_single_class_mlp, do_ocr_multi_class_mlp, clear_ocr_class_mlp, create_class_mlp, train_class_mlp, classify_class_mlp

Module

OCR/OCV