select_feature_set_trainf_mlp (Operator)
select_feature_set_trainf_mlp: Selects an optimal combination of features to classify OCR data.
Signature
void SelectFeatureSetTrainfMlp(const HTuple& TrainingFile, const HTuple& FeatureList, const HTuple& SelectionMethod, const HTuple& Width, const HTuple& Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* OCRHandle, HTuple* FeatureSet, HTuple* Score)
HTuple HOCRMlp::SelectFeatureSetTrainfMlp(const HTuple& TrainingFile, const HTuple& FeatureList, const HString& SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
HTuple HOCRMlp::SelectFeatureSetTrainfMlp(const HString& TrainingFile, const HString& FeatureList, const HString& SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
HTuple HOCRMlp::SelectFeatureSetTrainfMlp(const char* TrainingFile, const char* FeatureList, const char* SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
HTuple HOCRMlp::SelectFeatureSetTrainfMlp(const wchar_t* TrainingFile, const wchar_t* FeatureList, const wchar_t* SelectionMethod, Hlong Width, Hlong Height, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* Score)
(Windows only)
static void HOperatorSet.SelectFeatureSetTrainfMlp(HTuple trainingFile, HTuple featureList, HTuple selectionMethod, HTuple width, HTuple height, HTuple genParamName, HTuple genParamValue, out HTuple OCRHandle, out HTuple featureSet, out HTuple score)
HTuple HOCRMlp.SelectFeatureSetTrainfMlp(HTuple trainingFile, HTuple featureList, string selectionMethod, int width, int height, HTuple genParamName, HTuple genParamValue, out HTuple score)
HTuple HOCRMlp.SelectFeatureSetTrainfMlp(string trainingFile, string featureList, string selectionMethod, int width, int height, HTuple genParamName, HTuple genParamValue, out HTuple score)
def select_feature_set_trainf_mlp(training_file: MaybeSequence[str], feature_list: MaybeSequence[str], selection_method: str, width: int, height: int, gen_param_name: Sequence[str], gen_param_value: Sequence[Union[int, str, float]]) -> Tuple[HHandle, Sequence[str], Sequence[float]]
Description
select_feature_set_trainf_mlp selects an optimal combination of features to classify the OCR data given in the training file TrainingFile with a multilayer perceptron (MLP). For details, see create_ocr_class_mlp.
Possible features are all OCR features listed and explained in create_ocr_class_mlp. All candidates to be tested are specified in FeatureList. A subset of these features is returned as the selected features in FeatureSet.
select_feature_set_trainf_mlp is specialized for OCR problems and supports only the features listed in create_ocr_class_mlp. To use other features, use the more general operator select_feature_set_mlp.
The selection method SelectionMethod is either a greedy search 'greedy' (iteratively add the feature with the highest gain) or the dynamically oscillating search 'greedy_oscillating' (add the feature with the highest gain, then test whether any of the already added features can be left out without great loss). The method 'greedy' is generally preferable because it is faster. Only when a large training set is available might 'greedy_oscillating' return better results.
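The two search strategies can be illustrated with a small, self-contained sketch. It is not the operator's internal implementation: the scoring function, the `COVERAGE` table, and the rule "drop a feature only if the score does not decrease" are assumptions made for illustration, and dropped features are not reconsidered.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

ScoreFn = Callable[[FrozenSet[str]], float]


def select_features(candidates: Sequence[str], score: ScoreFn,
                    oscillate: bool = False) -> Tuple[List[str], float]:
    """Greedy forward selection; with oscillate=True, also try to drop features."""
    selected: List[str] = []
    remaining = list(candidates)
    best = score(frozenset())
    while remaining:
        # Forward step: score every candidate added to the current selection.
        scored = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no candidate yields a gain
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
        if oscillate:
            # Backward step: test whether an earlier feature became redundant.
            for feature in selected[:-1]:
                reduced = [f for f in selected if f != feature]
                reduced_score = score(frozenset(reduced))
                if reduced_score >= best:  # assumption: tolerate no loss at all
                    selected, best = reduced, reduced_score
    return selected, best


# Hypothetical toy data: which of 8 samples each feature classifies correctly.
COVERAGE = {
    "pixel": {1, 2, 3, 4},
    "ratio": {1, 2, 5, 6},
    "num_holes": {3, 4, 7, 8},
}


def toy_score(features: FrozenSet[str]) -> float:
    covered = set()
    for name in features:
        covered |= COVERAGE[name]
    return len(covered) / 8


names = ["pixel", "ratio", "num_holes"]
print(select_features(names, toy_score))
print(select_features(names, toy_score, oscillate=True))
```

In this toy setup, the plain greedy search keeps all three features, while the oscillating variant notices that 'pixel' adds nothing once the other two are present and drops it. The extra backward tests are what make the oscillating search slower.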
The optimization criterion is the classification rate of a two-fold cross-validation of the training data. The best achieved value is returned in Score.
The parameters GenParamName and GenParamValue allow the number of hidden neurons in the MLP to be set with 'num_hidden'. The default value is 80. A higher value leads to longer training times but might yield a more expressive classifier.
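A call from Python might look like the following sketch. It follows the Python signature listed under Signature and assumes that MVTec's `halcon` Python package is installed. The helper names `build_call_args` and `run_selection` and the file name `letters.trf` are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_call_args(training_file: str, feature_list: Sequence[str],
                    num_hidden: int = 80, method: str = "greedy",
                    width: int = 15, height: int = 16) -> Tuple:
    """Assemble the positional arguments in the order of the Python signature."""
    if method not in ("greedy", "greedy_oscillating"):
        raise ValueError(f"unknown selection method: {method!r}")
    if num_hidden < 1:
        raise ValueError("num_hidden must be positive")
    gen_param_name: List[str] = ["num_hidden"]
    gen_param_value: List[int] = [num_hidden]
    return (training_file, list(feature_list), method, width, height,
            gen_param_name, gen_param_value)


def run_selection(training_file: str, feature_list: Sequence[str], **options):
    """Run the operator; needs a HALCON installation, so it is not called here."""
    import halcon as ha  # assumption: MVTec's Python bindings are available

    args = build_call_args(training_file, feature_list, **options)
    ocr_handle, feature_set, score = ha.select_feature_set_trainf_mlp(*args)
    return ocr_handle, feature_set, score


# Hypothetical training file and a reduced candidate list.
print(build_call_args("letters.trf", ["ratio", "pixel_invar"], num_hidden=120))
```

Restricting the candidate list and keeping 'num_hidden' moderate are the two most direct ways to limit the run time of the selection.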
Attention
This operator may take considerable time, depending on the size of the data set in the training file and the number of features.
Note that this operator should not be called if only a small set of training data is available. Because of the risk of overfitting, select_feature_set_trainf_mlp may then deliver a classifier with a very high score that nevertheless performs poorly when tested.
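One way to notice this situation is to compare the returned score with the accuracy on samples that were kept out of the training file. The sketch below assumes such a held-out set exists and that predicted labels have already been obtained. The function names and the 0.1 threshold are arbitrary illustrations, not part of the operator.

```python
from typing import Sequence


def holdout_accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of held-out samples whose predicted label matches the truth."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("need two non-empty label sequences of equal length")
    hits = sum(1 for p, e in zip(predicted, expected) if p == e)
    return hits / len(expected)


def looks_overfitted(cv_score: float, holdout: float, max_gap: float = 0.1) -> bool:
    """Flag a cross-validation score that clearly exceeds held-out accuracy."""
    return cv_score - holdout > max_gap


# Hypothetical labels for four held-out characters.
accuracy = holdout_accuracy(["A", "B", "C", "D"], ["A", "B", "C", "E"])
print(accuracy, looks_overfitted(0.99, accuracy))
```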
Execution Information
- Multithreading type: reentrant (runs in parallel with non-exclusive operators).
- Multithreading scope: global (may be called from any thread).
- Automatically parallelized on internal data level.
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.
Parameters
TrainingFile (input_control) filename.read(-array) → HTuple / MaybeSequence[str] / Htuple (string) (string) (HString) (char*)
Names of the training files.
Default: ''
File extension: .trf, .otr
FeatureList (input_control) string(-array) → HTuple / MaybeSequence[str] / Htuple (string) (string) (HString) (char*)
List of features that should be considered for selection.
Default:
['zoom_factor','ratio','width','height','foreground','foreground_grid_9','foreground_grid_16','anisometry','compactness','convexity','moments_region_2nd_invar','moments_region_2nd_rel_invar','moments_region_3rd_invar','moments_central','phi','num_connect','num_holes','projection_horizontal','projection_vertical','projection_horizontal_invar','projection_vertical_invar','chord_histo','num_runs','pixel','pixel_invar','pixel_binary','gradient_8dir','cooc','moments_gray_plane']
List of values: 'anisometry', 'chord_histo', 'compactness', 'convexity', 'cooc', 'default', 'foreground', 'foreground_grid_16', 'foreground_grid_9', 'gradient_8dir', 'height', 'moments_central', 'moments_gray_plane', 'moments_region_2nd_invar', 'moments_region_2nd_rel_invar', 'moments_region_3rd_invar', 'num_connect', 'num_holes', 'num_runs', 'phi', 'pixel', 'pixel_binary', 'pixel_invar', 'projection_horizontal', 'projection_horizontal_invar', 'projection_vertical', 'projection_vertical_invar', 'ratio', 'width', 'zoom_factor'
SelectionMethod (input_control) string → HTuple / str / Htuple (string) (string) (HString) (char*)
Method to perform the selection.
Default: 'greedy'
List of values: 'greedy', 'greedy_oscillating'
Width (input_control) integer → HTuple / int / Htuple (integer) (int / long) (Hlong) (Hlong)
Width of the rectangle to which the gray values of the segmented character are zoomed.
Default: 15
Height (input_control) integer → HTuple / int / Htuple (integer) (int / long) (Hlong) (Hlong)
Height of the rectangle to which the gray values of the segmented character are zoomed.
Default: 16
GenParamName (input_control) string-array → HTuple / Sequence[str] / Htuple (string) (string) (HString) (char*)
Names of generic parameters to configure the selection process and the classifier.
Default: []
List of values: 'num_hidden'
GenParamValue (input_control) number-array → HTuple / Sequence[Union[int, str, float]] / Htuple (real / integer / string) (double / int / long / string) (double / Hlong / HString) (double / Hlong / char*)
Values of generic parameters to configure the selection process and the classifier.
Default: []
Suggested values: 80
OCRHandle (output_control) ocr_mlp → HOCRMlp, HTuple / HHandle / Htuple (handle) (IntPtr) (HHandle) (handle)
Trained OCR-MLP classifier.
Score (output_control) real-array → HTuple / Sequence[float] / Htuple (real) (double) (double) (double)
Achieved score using two-fold cross-validation.
Result
If the parameters are valid, the operator select_feature_set_trainf_mlp returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
Alternatives
select_feature_set_trainf_svm, select_feature_set_trainf_knn
See also
select_feature_set_trainf_mlp_protected, select_feature_set_mlp
Module
OCR/OCV