easy_vision.python.model.faster_rcnn

easy_vision.python.model.faster_rcnn.faster_rcnn
class easy_vision.python.model.faster_rcnn.faster_rcnn.FasterRcnn(model_config, feature_dict, label_dict=None, mode='predict', categories=None)
    Bases: easy_vision.python.model.detection_model.DetectionModel
    __init__(model_config, feature_dict, label_dict=None, mode='predict', categories=None)
        x.__init__(...) initializes x; see help(type(x)) for signature
    classmethod create_class(name)
easy_vision.python.model.faster_rcnn.rcnn_head
class easy_vision.python.model.faster_rcnn.rcnn_head.RCNNHead(feature_dict, head_config, label_dict=None, fpn_config=None, mode='predict', region_feature_extractor=None)
    Bases: easy_vision.python.model.cv_head.CVHead
    Head for the second stage of Faster R-CNN: proposal classification and box refinement.
    __init__(feature_dict, head_config, label_dict=None, fpn_config=None, mode='predict', region_feature_extractor=None)
        Args:
            feature_dict – input dict of features
            head_config – rcnn head config
            label_dict – a dict of labels; may be None during prediction
            fpn_config – config of fpn
            mode – 'train' for the train phase, 'evaluate' for the evaluate phase, 'predict' for the predict phase
            region_feature_extractor – block that reuses part of the backbone to extract box features in the second stage
    build_loss_graph()
        Build the losses of the rcnn stage: a classification loss and a regression loss. The variables involved are proposals, proposal scores, proposal box offsets, groundtruth_boxes, and groundtruth_classes. The key steps are:
            1. find matches between proposal boxes and groundtruth boxes; proposal boxes with large enough IoU with a groundtruth box are assigned its class label, the others are assigned the background class label
            2. for proposals assigned a groundtruth class label, regression targets (i.e. offsets) are computed
            3. the regression and classification losses are computed, normalized by the number of proposals, and then by the batch size
        Returns: rcnn_reg loss, rcnn_cls loss
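The matching step above can be sketched in plain Python with boxes as [ymin, xmin, ymax, xmax] lists; the function names and the 0.5 threshold here are illustrative, not the easy_vision API:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [ymin, xmin, ymax, xmax] boxes."""
    ya, xa = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    yb, xb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, yb - ya) * max(0.0, xb - xa)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def assign_targets(proposals, gt_boxes, gt_classes, iou_thresh=0.5):
    """Assign each proposal the class of its best-matching groundtruth box,
    or the background class (0) when the best IoU is below the threshold."""
    labels = []
    for p in proposals:
        ious = [iou(p, g) for g in gt_boxes]
        best = max(range(len(gt_boxes)), key=lambda i: ious[i])
        labels.append(gt_classes[best] if ious[best] >= iou_thresh else 0)
    return labels
```

In the real graph this runs on tensors via the target assigner, but the assignment rule is the same.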
easy_vision.python.model.faster_rcnn.rcnn_helper
easy_vision.python.model.faster_rcnn.rcnn_helper.add_gts_to_proposal(proposal_boxlists, groundtruth_boxlists)
    Add gts (groundtruth boxes) to the proposals.
    Parameters:
        proposal_boxlists – a list of BoxList containing proposal boxes in absolute coordinates
        groundtruth_boxlists – a list of BoxList containing groundtruth object boxes in absolute coordinates
    Returns:
        sampled_proposal_boxlists – a list of BoxList containing the sampled proposals
        num_proposals – number of sampled proposals per image
        max_num_proposals – max number of sampled proposals
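Conceptually, adding groundtruth boxes to the proposals is a per-image concatenation, so the second stage always sees the true boxes as positives during training. A minimal sketch with plain lists standing in for BoxList (illustrative names, not the easy_vision API):

```python
def add_gts_to_proposals(proposal_boxlists, groundtruth_boxlists):
    """Per image, append the groundtruth boxes to the proposal list."""
    return [props + gts
            for props, gts in zip(proposal_boxlists, groundtruth_boxlists)]
```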
easy_vision.python.model.faster_rcnn.rcnn_helper.build_roi_pooling_fn(initial_crop_size, maxpool_kernel_size=1, maxpool_stride=1, fpn_config=None)
    RoiPooling function builder.
    Parameters:
        initial_crop_size – size of the initial bilinear-interpolation-based crop during ROI pooling
        maxpool_kernel_size – kernel size of the max pool op on the cropped feature map during ROI pooling
        maxpool_stride – stride of the max pool op on the cropped feature map during ROI pooling
        fpn_config – config of fpn
    Returns: a roi_pooling function
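The builder captures the static pooling parameters in a closure and returns a function that only needs the runtime inputs. A framework-free sketch of that pattern, with a toy max pool over a 2-D list standing in for the TensorFlow ops (illustrative names, not the easy_vision API):

```python
def build_pooling_fn(maxpool_kernel_size=1, maxpool_stride=1):
    """Return a pooling function with the kernel/stride baked in."""
    def pooling_fn(feature_map):
        k, s = maxpool_kernel_size, maxpool_stride
        h, w = len(feature_map), len(feature_map[0])
        # Slide a k x k max window with stride s over the 2-D grid.
        return [[max(feature_map[i + di][j + dj]
                     for di in range(k) for dj in range(k))
                 for j in range(0, w - k + 1, s)]
                for i in range(0, h - k + 1, s)]
    return pooling_fn
```

The returned function can then be passed around the model graph without re-threading the config through every call site.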
easy_vision.python.model.faster_rcnn.rcnn_helper.fpn_roi_pooling(features_list_to_crop, proposal_boxes, image_shape, initial_crop_size, maxpool_kernel_size=1, maxpool_stride=1, roi_min_level=2, roi_max_level=5, roi_canonical_level=4, roi_canonical_scale=224)
    Crops a set of proposals from the feature maps for a batch of images.
    Parameters:
        features_list_to_crop – a list of float32 tensors with shape [batch_size, height, width, depth]
        proposal_boxes – a float32 tensor with shape [batch_size, num_proposals, box_code_size] containing proposal boxes in absolute coordinates in the preprocessed image
        image_shape – a 1-D tensor of shape [4] containing the image tensor shape
        initial_crop_size – size of the initial bilinear-interpolation-based crop during ROI pooling
        maxpool_kernel_size – kernel size of the max pool op on the cropped feature map during ROI pooling
        maxpool_stride – stride of the max pool op on the cropped feature map during ROI pooling
        roi_min_level – lowest feature-map level used for ROIs; a level is associated with a feature-map stride of 2 ^ level
        roi_max_level – highest level used for ROIs, corresponding to the lowest-resolution fpn feature map
        roi_canonical_level – roi_canonical_level (k0) parameterizes how proposals are distributed to the feature maps: k = floor(k0 + log2(sqrt(w * h) / roi_canonical_scale))
        roi_canonical_scale – canonical scale used in the formula above; default is 224
    Returns: a float32 tensor with shape [K, new_height, new_width, depth]
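The level-assignment rule above is easy to check numerically. A sketch with the documented defaults (k0 = 4, canonical scale 224, levels clipped to [2, 5]); the function name is illustrative:

```python
import math

def roi_level(w, h, k0=4, canonical_scale=224, min_level=2, max_level=5):
    """FPN level for a proposal of size w x h:
    k = floor(k0 + log2(sqrt(w * h) / canonical_scale)), clipped to range."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical_scale))
    return min(max(k, min_level), max_level)
```

A 224 x 224 proposal lands on the canonical level 4; halving each side drops it one level, and very large or very small proposals are clipped to the max/min level.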
easy_vision.python.model.faster_rcnn.rcnn_helper.roi_pooling(features_list_to_crop, proposal_boxes, image_shape, initial_crop_size, maxpool_kernel_size=1, maxpool_stride=1)
    Crops a set of proposals from the feature map for a batch of images.
    Parameters:
        features_list_to_crop – a list of float32 tensors with shape [batch_size, height, width, depth]
        proposal_boxes – a float32 tensor with shape [batch_size, num_proposals, box_code_size] containing proposal boxes in absolute coordinates in the preprocessed image
        image_shape – a 1-D tensor of shape [4] containing the image tensor shape
        initial_crop_size – size of the initial bilinear-interpolation-based crop during ROI pooling
        maxpool_kernel_size – kernel size of the max pool op on the cropped feature map during ROI pooling
        maxpool_stride – stride of the max pool op on the cropped feature map during ROI pooling
    Returns: a float32 tensor with shape [K, new_height, new_width, depth]
easy_vision.python.model.faster_rcnn.rcnn_helper.sample_minbatch_train_box(proposal_boxlists, groundtruth_boxlists, detector_target_assigner, minibatch_balance_fraction, minibatch_batch_size)
    Samples a minibatch of proposals to be sent to the box classifier.
    Parameters:
        proposal_boxlists – a list of BoxList containing proposal boxes in absolute coordinates
        groundtruth_boxlists – a list of BoxList containing groundtruth object boxes in absolute coordinates
        detector_target_assigner – detection target assigner
        minibatch_balance_fraction – balance fraction of the BalancedPositiveNegativeSampler
        minibatch_batch_size – sample batch size
    Returns:
        sampled_proposal_boxlists – a list of BoxList containing the sampled proposals
        num_proposals – number of sampled proposals per image
        max_num_proposals – max number of sampled proposals
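The balanced sampling idea behind BalancedPositiveNegativeSampler can be sketched in a few lines: keep at most balance_fraction * batch_size positives and fill the rest of the minibatch with negatives. A simplified, framework-free version (illustrative names, not the easy_vision API):

```python
import random

def sample_balanced(labels, batch_size, positive_fraction, seed=0):
    """Return sorted indices of a minibatch with at most
    positive_fraction * batch_size positives (label > 0),
    the remainder filled with negatives (label == 0)."""
    rng = random.Random(seed)
    positives = [i for i, l in enumerate(labels) if l > 0]
    negatives = [i for i, l in enumerate(labels) if l == 0]
    num_pos = min(len(positives), int(batch_size * positive_fraction))
    num_neg = min(len(negatives), batch_size - num_pos)
    return sorted(rng.sample(positives, num_pos) + rng.sample(negatives, num_neg))
```

When there are fewer positives than the quota, the shortfall is made up with extra negatives, which matches the usual behaviour of balanced samplers in two-stage detectors.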
easy_vision.python.model.faster_rcnn.rpn_head

class easy_vision.python.model.faster_rcnn.rpn_head.RPNHead(feature_dict, head_config, label_dict=None, is_fpn=False, mode='predict')
    Bases: easy_vision.python.model.cv_head.CVHead
    Head for the first stage of Faster R-CNN: region proposal.
    __init__(feature_dict, head_config, label_dict=None, is_fpn=False, mode='predict')
        Parameters:
            feature_dict – must include three parts: 1. backbone output features; 2. preprocessed batched image shape (input_shape); 3. preprocessed per-image shapes (image_shapes)
            head_config – protos.rpn_head_pb2.RPNHead
            label_dict – groundtruth_boxes
            is_fpn – whether to use fpn (multiscale features)
            mode – 'train' for the train phase, 'evaluate' or 'predict' otherwise
    build_loss_graph()
        Parameters:
            label_dict – must include two fields: 'groundtruth_boxes' (bounding boxes of each object) and 'num_groundtruth_boxes' (number of objects in each image)
            box objectness scores – self._prediction_dict['cls']
            box offsets – self._prediction_dict['reg']
        Returns:
            loss_dict – a dict of {'rpn_reg': reg_loss, 'rpn_cls': cls_loss}
    build_postprocess_graph()
        Inputs: self._prediction_dict['reg'], self._prediction_dict['cls'], self._prediction_dict['anchors'], self._image_shapes
        Returns: a dict of proposal_boxes, proposal_scores, num_proposals; the results are also merged into self._prediction_dict
        Steps:
            1. box decoding: _batch_decode_boxes
            2. nms and pad to max_num_proposals: batch_multiclass_non_max_suppression; this second step is done image by image
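The per-image NMS in step 2 can be sketched in pure Python: greedily keep the highest-scoring box and drop any remaining box that overlaps it too much. Boxes are [ymin, xmin, ymax, xmax]; the names and threshold are illustrative, not the easy_vision API:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-max suppression; returns kept indices, best score first."""
    def iou(a, b):
        ya, xa = max(a[0], b[0]), max(a[1], b[1])
        yb, xb = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, yb - ya) * max(0.0, xb - xa)
        area = lambda c: (c[2] - c[0]) * (c[3] - c[1])
        return inter / (area(a) + area(b) - inter + 1e-8)
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress remaining boxes that overlap the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```

The real graph additionally pads the survivors to max_num_proposals so every image in the batch yields a fixed-size tensor.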