Operator Reference

create_dl_layer_roi_pooling (Operator)

create_dl_layer_roi_pooling - Create an ROI pooling layer.

Signature

Herror T_create_dl_layer_roi_pooling(const Htuple DLLayerInputImage, const Htuple DLLayerRoI, const Htuple DLLayerFeature, const Htuple DLLayerInstanceIndex, const Htuple LayerName, const Htuple Type, const Htuple GridSize, const Htuple GenParamName, const Htuple GenParamValue, Htuple* DLLayerRoIPooling)

void CreateDlLayerRoiPooling(const HTuple& DLLayerInputImage, const HTuple& DLLayerRoI, const HTuple& DLLayerFeature, const HTuple& DLLayerInstanceIndex, const HTuple& LayerName, const HTuple& Type, const HTuple& GridSize, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* DLLayerRoIPooling)

HDlLayer HDlLayer::CreateDlLayerRoiPooling(const HDlLayer& DLLayerRoI, const HDlLayerArray& DLLayerFeature, const HDlLayer& DLLayerInstanceIndex, const HString& LayerName, const HString& Type, const HTuple& GridSize, const HTuple& GenParamName, const HTuple& GenParamValue) const

HDlLayer HDlLayer::CreateDlLayerRoiPooling(const HDlLayer& DLLayerRoI, const HDlLayer& DLLayerFeature, const HDlLayer& DLLayerInstanceIndex, const HString& LayerName, const HString& Type, const HTuple& GridSize, const HString& GenParamName, const HString& GenParamValue) const

HDlLayer HDlLayer::CreateDlLayerRoiPooling(const HDlLayer& DLLayerRoI, const HDlLayer& DLLayerFeature, const HDlLayer& DLLayerInstanceIndex, const char* LayerName, const char* Type, const HTuple& GridSize, const char* GenParamName, const char* GenParamValue) const

HDlLayer HDlLayer::CreateDlLayerRoiPooling(const HDlLayer& DLLayerRoI, const HDlLayer& DLLayerFeature, const HDlLayer& DLLayerInstanceIndex, const wchar_t* LayerName, const wchar_t* Type, const HTuple& GridSize, const wchar_t* GenParamName, const wchar_t* GenParamValue) const   (Windows only)

def create_dl_layer_roi_pooling(dllayer_input_image: HHandle, dllayer_ro_i: HHandle, dllayer_feature: MaybeSequence[HHandle], dllayer_instance_index: HHandle, layer_name: str, type: str, grid_size: Sequence[int], gen_param_name: MaybeSequence[str], gen_param_value: MaybeSequence[Union[int, float, str]]) -> HHandle

Description

The operator create_dl_layer_roi_pooling creates a region of interest (ROI) pooling layer whose handle is returned in DLLayerRoIPooling. Features within the given ROIs are pooled to a fixed output spatial dimension for further processing. The output spatial dimension is given by GridSize.

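This layer expects several feeding input layers:

  • DLLayerInputImage: Feeding layer containing the network input image.
  • DLLayerRoI: Feeding layer containing the ROI coordinates.
  • DLLayerFeature: Feeding layer or layers containing the features (or ground truth instance masks) to be pooled from.
  • DLLayerInstanceIndex: Feeding layer containing the matched instance index for each ROI.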

The parameter LayerName sets an individual layer name. Note that when creating a model using create_dl_model, each layer of the created network must have a unique name.

The ROI pooling operation works as follows. A grid is laid over each ROI and the features within each bin of the grid are pooled. How this is done in detail depends on the Type:

'roi_pool':

Performs max pooling. For this, the calculated grid coordinates are rounded to pixel-precise coordinates.

'roi_align':

For each sampling point the value is determined by bilinear interpolation of the four neighboring pixel values. The output value for each grid bin is the average of the sampling point values. The number of uniformly distributed sampling points in each output grid bin is determined by 'sampling_ratio'.
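The two pooling types can be illustrated with a minimal pure-Python sketch on a single-channel feature map. This is not the operator's implementation: the function names, the `(row1, col1, row2, col2)` ROI layout, the floor/ceil rounding used for 'roi_pool', and the placement of sampling points at sub-bin centers for 'roi_align' are all assumptions made for illustration.

```python
import math

def bilinear(feat, y, x):
    """Bilinearly interpolate feat (a list of rows) at the real-valued point (y, x)."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feat, roi, grid_size, sampling_ratio):
    """Average sampling_ratio x sampling_ratio bilinear samples per grid bin.

    Assumption: sampling_ratio >= 1 and samples sit at sub-bin centers.
    """
    r1, c1, r2, c2 = roi
    gh, gw = grid_size
    bin_h, bin_w = (r2 - r1) / gh, (c2 - c1) / gw
    n = sampling_ratio
    out = []
    for i in range(gh):
        row = []
        for j in range(gw):
            samples = [
                bilinear(feat,
                         r1 + (i + (a + 0.5) / n) * bin_h,
                         c1 + (j + (b + 0.5) / n) * bin_w)
                for a in range(n) for b in range(n)
            ]
            row.append(sum(samples) / len(samples))
        out.append(row)
    return out

def roi_pool(feat, roi, grid_size):
    """Maximum over the pixels of each grid bin.

    Assumption: bin borders are rounded outwards (floor/ceil) to whole pixels.
    """
    r1, c1, r2, c2 = roi
    gh, gw = grid_size
    h, w = len(feat), len(feat[0])
    bin_h, bin_w = (r2 - r1) / gh, (c2 - c1) / gw
    out = []
    for i in range(gh):
        row = []
        for j in range(gw):
            rs = min(max(int(math.floor(r1 + i * bin_h)), 0), h - 1)
            re = min(max(int(math.ceil(r1 + (i + 1) * bin_h)), rs + 1), h)
            cs = min(max(int(math.floor(c1 + j * bin_w)), 0), w - 1)
            ce = min(max(int(math.ceil(c1 + (j + 1) * bin_w)), cs + 1), w)
            row.append(max(feat[r][c] for r in range(rs, re) for c in range(cs, ce)))
        out.append(row)
    return out

# 4x4 feature map with value 4*row + col.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
pooled_max = roi_pool(feature_map, (0, 0, 4, 4), (2, 2))
pooled_avg = roi_align(feature_map, (0, 0, 2, 2), (1, 1), 2)
```

With a sampling ratio of 2, each bin in this sketch averages four interpolated values, which matches the count of sampling points described for 'sampling_ratio'.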

The pooled features can, for example, be used to predict object masks within the given ROIs. In this case it may be useful to pool from a slightly larger ROI to increase the probability that the object is completely contained in the ROI. The generic parameters 'enlarge_box_factor_long' and 'enlarge_box_factor_short' control the scaling of the longer and shorter box side lengths before pooling.
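As a rough illustration of the box enlargement, the following sketch scales the longer and shorter side of an axis-aligned box. The function name `enlarge_box` is hypothetical, and scaling about the box center is an assumption; the operator's exact behavior is not specified here.

```python
def enlarge_box(row1, col1, row2, col2, factor_long=1.0, factor_short=1.0):
    """Scale the longer and shorter side of an axis-aligned box.

    Assumption: the box is scaled about its center.
    """
    height, width = row2 - row1, col2 - col1
    center_row, center_col = (row1 + row2) / 2.0, (col1 + col2) / 2.0
    if height >= width:
        # Height is the longer side.
        height, width = height * factor_long, width * factor_short
    else:
        # Width is the longer side.
        height, width = height * factor_short, width * factor_long
    return (center_row - height / 2.0, center_col - width / 2.0,
            center_row + height / 2.0, center_col + width / 2.0)

# A 20 x 40 box whose longer side (the width) is enlarged by 50 percent.
enlarged = enlarge_box(10, 20, 30, 60, factor_long=1.5, factor_short=1.0)
```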

For multiple feature maps, the ROIs are distributed over the feature maps according to their size by the following formula:

$$k = \left\lfloor k_0 + \log_2\left(\frac{s}{s_0} + \epsilon\right) \right\rfloor$$

where $k$ is the FPN level the ROI is assigned to and $s$ is the ROI scale, calculated as the square root of the ROI area. $k_0$ is the canonical FPN level and $s_0$ is the canonical FPN scale. The canonical FPN level and scale can be set via the generic parameters 'fpn_roi_canonical_level' and 'fpn_roi_canonical_scale', respectively. $\epsilon$ is added for robustness and set to 1e-6.
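The level assignment can be sketched in a few lines of Python, assuming it follows the FPN rule level = floor(k0 + log2(s / s0 + eps)) with the defaults documented on this page (canonical level 4, canonical scale 224, eps 1e-6). The function name `fpn_level` is hypothetical, and clamping the result to a minimum and maximum level is an assumption about how 'fpn_roi_min_level' and 'fpn_roi_max_level' are applied.

```python
import math

def fpn_level(roi_height, roi_width, canonical_level=4, canonical_scale=224.0,
              min_level=None, max_level=None, eps=1e-6):
    """Return the FPN level assigned to an ROI of the given size."""
    # ROI scale: square root of the ROI area.
    scale = math.sqrt(roi_height * roi_width)
    level = int(math.floor(canonical_level + math.log2(scale / canonical_scale + eps)))
    # Assumed clamping to the range of available pyramid levels.
    if min_level is not None:
        level = max(level, min_level)
    if max_level is not None:
        level = min(level, max_level)
    return level

level_canonical = fpn_level(224, 224)   # ROI at the canonical scale
level_half = fpn_level(112, 112)        # half the canonical scale
level_small = fpn_level(32, 32, min_level=2, max_level=5)
```

Halving the ROI scale lowers the assigned level by one, so small ROIs pool from finer feature maps and large ROIs from coarser ones.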

The following generic parameters GenParamName and the corresponding values GenParamValue are supported:

'enlarge_box_factor_long':

Factor with which the longer side of the box is multiplied before pooling.

Default: 1.0.

'enlarge_box_factor_short':

Factor with which the shorter side of the box is multiplied before pooling.

Default: 1.0.

'fpn_roi_canonical_level':

FPN level to which ROIs with the canonical scale are assigned.

Default: 4.

'fpn_roi_canonical_scale':

ROIs with this scale will be assigned to the canonical level.

Default: 224.

'instance_type':

Type of ROIs. Possible values:

  • 'rectangle1': axis-aligned rectangles.

  • 'rectangle2': oriented rectangles.

Default: 'rectangle1'.

'is_inference_output':

Determines whether apply_dl_model will include the output of this layer in the dictionary DLResultBatch even without specifying this layer in Outputs ('true') or not ('false').

Default: 'false'.

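'mode':

Mode of the layer. Possible values:

  • 'feature': the features of the feeding layers in DLLayerFeature are pooled.

  • 'mask_target': mask targets are generated by pooling from the ground truth instance masks given in DLLayerFeature.

Default: 'feature'.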

'num_classes':

The number of classes to be predicted by the model. This parameter is only available for 'mode' set to 'mask_target'.

Restriction: If set to a value greater than 1, the mask targets are generated class-specifically. This also affects the output shape of the layer, i.e., the depth of the mask targets will be equal to 'num_classes'.

Default: 1.

'sampling_ratio':

Number of sampling points distributed over the bin height and width in one grid bin. E.g., for 'sampling_ratio' set to 2, there are four sampling points in each grid bin. If set to 0, this number is computed automatically.

Default: 0.

'threshold_value':

This value sets a threshold between zero and one for the outputs. Set to -1 in order to switch thresholding off.

Restriction: Only available for 'mode' set to 'mask_target' and Type set to 'roi_align'.

Default: 0.5.
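A minimal sketch of what thresholding the pooled mask targets could look like. The function name `threshold_mask` is hypothetical, and both the binarization to 0/1 and the strict "greater than" comparison are assumptions; only the range of the threshold and the meaning of -1 come from the description above.

```python
def threshold_mask(values, threshold_value=0.5):
    """Binarize pooled mask values, or return a copy unchanged if thresholding is off."""
    if threshold_value == -1:
        # Thresholding switched off: keep the interpolated values.
        return [list(row) for row in values]
    # Assumed rule: values above the threshold become 1, all others 0.
    return [[1.0 if v > threshold_value else 0.0 for v in row] for row in values]

pooled = [[0.2, 0.7], [0.5, 1.0]]
binary = threshold_mask(pooled)                  # default threshold 0.5
soft = threshold_mask(pooled, threshold_value=-1)
```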

Some parameters cannot be set in create_dl_layer_roi_pooling, since they are computed internally from the input DLLayerFeature. These are the following:

'fpn_roi_min_level':

Minimum FPN level used for pooling.

Restriction: Applies only to 'mode' set to 'feature'.

Default: 0.

'fpn_roi_max_level':

Maximum FPN level used for pooling.

Restriction: Applies only to 'mode' set to 'feature'.

Default: 0.

Certain parameters of layers created using create_dl_layer_roi_pooling can be set and retrieved using further operators. The following tables give an overview of which parameters can be set using set_dl_model_layer_param and which ones can be retrieved using get_dl_model_layer_param or get_dl_layer_param. Note that the operators set_dl_model_layer_param and get_dl_model_layer_param require a model created by create_dl_model.

Layer Parameters | set | get
'grid_size' (GridSize) | - | x
'input_layer' (DLLayerInputImage, DLLayerRoI, DLLayerFeature, and/or DLLayerInstanceIndex) | - | x
'name' (LayerName) | x | x
'output_layer' (DLLayerRoIPooling) | - | x
'shape' | - | x
'roi_pooling_type' (Type) | x | x
'type' | - | x

Generic Layer Parameters | set | get
'enlarge_box_factor_long' | x | x
'enlarge_box_factor_short' | x | x
'fpn_roi_canonical_level' | x | x
'fpn_roi_canonical_scale' | x | x
'fpn_roi_max_level' | - | x
'fpn_roi_min_level' | - | x
'is_inference_output' | x | x
'instance_type' | - | x
'mode' | - | x
'num_classes' | - | x
'num_trainable_params' | - | x
'sampling_ratio' | x | x
'threshold_value' | x | x

Execution Information

  • Multithreading type: reentrant (runs in parallel with non-exclusive operators).
  • Multithreading scope: global (may be called from any thread).
  • Processed without parallelization.

Parameters

DLLayerInputImage (input_control)  dl_layer HDlLayer, HTuple / HHandle / HTuple / Htuple (handle) (IntPtr) (HHandle) (handle)

Feeding layer containing the network input image.

Default: 'InputImageLayer'

DLLayerRoI (input_control)  dl_layer HDlLayer, HTuple / HHandle / HTuple / Htuple (handle) (IntPtr) (HHandle) (handle)

Feeding layer containing ROI coordinates.

Default: 'RoILayer'

DLLayerFeature (input_control)  dl_layer(-array) HDlLayer, HTuple / MaybeSequence[HHandle] / HTuple / Htuple (handle) (IntPtr) (HHandle) (handle)

Feeding layers containing the features or ground truth instance masks to be pooled from.

Default: 'FeatureLayers'

DLLayerInstanceIndex (input_control)  dl_layer HDlLayer, HTuple / HHandle / HTuple / Htuple (handle) (IntPtr) (HHandle) (handle)

Feeding layer containing matched instance indices for each ROI.

Default: 'InstanceIndexLayer'

LayerName (input_control)  string HTuple / str / HTuple / Htuple (string) (string) (HString) (char*)

Name of the output layer.

Type (input_control)  string HTuple / str / HTuple / Htuple (string) (string) (HString) (char*)

Type of ROI pooling.

Default: 'roi_pool'

List of values: 'roi_align', 'roi_pool'

GridSize (input_control)  number-array HTuple / Sequence[int] / HTuple / Htuple (integer) (int / long) (Hlong) (Hlong)

Spatial dimensions of the pooling grid, which are also the output spatial dimensions.

Default: [7,7]

GenParamName (input_control)  attribute.name(-array) HTuple / MaybeSequence[str] / HTuple / Htuple (string) (string) (HString) (char*)

Generic input parameter names.

Default: []

List of values: 'enlarge_box_factor_long', 'enlarge_box_factor_short', 'fpn_roi_canonical_level', 'fpn_roi_canonical_scale', 'instance_type', 'is_inference_output', 'mode', 'num_classes', 'sampling_ratio', 'threshold_value'

GenParamValue (input_control)  attribute.value(-array) HTuple / MaybeSequence[Union[int, float, str]] / HTuple / Htuple (string / integer / real) (string / int / long / double) (HString / Hlong / double) (char* / Hlong / double)

Generic input parameter values.

Default: []

Suggested values: 'feature', 'mask_target', 'rectangle1', 'rectangle2', 'true', 'false', 0.5

DLLayerRoIPooling (output_control)  dl_layer HDlLayer, HTuple / HHandle / HTuple / Htuple (handle) (IntPtr) (HHandle) (handle)

ROI pooling layer.

Example (HDevelop)

* Example for create_dl_layer_roi_pooling.
* This model can be trained to classify multiple
* predefined RoIs in an image.
*
* Create simple model.
create_dl_layer_input ('image', [224,224,3], [], [], DLGraphNodeInput)
create_dl_layer_input ('gt_boxes', [1, 5, 5], [], [], DLGraphNodeGTBoxes)
create_dl_layer_input ('rois', [1, 6, 5], [], [], DLGraphNodeRoIs)
*
* Apply two convolution layers to extract features of the image.
create_dl_layer_convolution (DLGraphNodeInput, 'conv1', 3, 1, 2, 32, 1, \
                             'half_kernel_size', 'relu', [], [], \
                             DLGraphNodeConvolution)
create_dl_layer_convolution (DLGraphNodeConvolution, 'conv2', 3, 1, 2, 32, \
                             1, 'half_kernel_size', 'relu', [], [], \
                             DLGraphNodeConvolution2)
*
* Apply RoI pooling to pool the features for each RoI.
GridSize := [7,7]
create_dl_layer_roi_pooling (DLGraphNodeInput, DLGraphNodeRoIs, \
                             DLGraphNodeConvolution2, [], 'roi_pool', \
                             'roi_pool', GridSize, [], [], \
                             DLGraphNodeRoIPooling)
*
* Classify the RoIs according to the pooled features.
NumClasses := 3
create_dl_layer_dense (DLGraphNodeRoIPooling, 'fc1', 64, [], [], \
                       DLGraphNodeDense)
create_dl_layer_activation (DLGraphNodeDense, 'relu1', 'relu', [], \
                            [], Relu1)
create_dl_layer_dense (Relu1, 'cls_score', NumClasses + 1, [], [], \
                       DLGraphNodeScore)
create_dl_layer_softmax (DLGraphNodeScore, 'cls_prob', [], [], \
                          DLGraphNodeSoftMax)
*
* Append a cross entropy loss to train the classifier.
TargetOutputModes := ['cls_target', 'cls_weight']
TargetOutputNames := TargetOutputModes
create_dl_layer_box_targets (DLGraphNodeRoIs, DLGraphNodeGTBoxes, [], \
                             TargetOutputNames, 'box_proposals', \
                             TargetOutputModes, NumClasses, [], [], \
                             DLGraphNodeClsTarget, DLGraphNodeClsWeight, \
                             _, _, _, _, _)
create_dl_layer_loss_cross_entropy (DLGraphNodeSoftMax, \
                                    DLGraphNodeClsTarget, \
                                    DLGraphNodeClsWeight, 'cls_loss', \
                                    1.0, [], [], \
                                    DLGraphNodeLossCrossEntropy)
*
* Append a box proposal layer to get a detection-like output.
GenParamNameBoxProposal := ['input_mode', 'apply_box_regression', \
                        'max_overlap', 'max_overlap_class_agnostic']
GenParamValueBoxProposal := ['dense', 'false', 1.0, 1.0]
create_dl_layer_box_proposals (DLGraphNodeSoftMax, [], DLGraphNodeRoIs, \
                               DLGraphNodeInput, 'box_output', \
                               GenParamNameBoxProposal, \
                               GenParamValueBoxProposal, \
                               DLGraphNodeGenerateBoxProposals)
*
* Create the model.
create_dl_model ([DLGraphNodeLossCrossEntropy, \
                 DLGraphNodeGenerateBoxProposals], \
                 DLModelHandle)
set_dl_model_param (DLModelHandle, 'type', 'detection')
ClassIDs := [1:NumClasses]
set_dl_model_param (DLModelHandle, 'class_ids', ClassIDs)

References

Tsung-Yi Lin, Piotr Dollár, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie, "Feature Pyramid Networks for Object Detection," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pp. 936-944, doi: 10.1109/CVPR.2017.106.

Module

Deep Learning Professional