easy_vision.python.model.ssd

easy_vision.python.model.ssd.ssd

class easy_vision.python.model.ssd.ssd.SSD(model_config, feature_dict, label_dict=None, mode='predict', categories=None)

Bases: easy_vision.python.model.detection_model.DetectionModel

__init__(model_config, feature_dict, label_dict=None, mode='predict', categories=None)

build_loss_graph()

build_predict_graph()

classmethod create_class(name)

get_scopes_of_levels()

Return a list of variable scope lists, ordered by level (outputs -> heads -> backbone -> inputs).

easy_vision.python.model.ssd.ssd_head

class easy_vision.python.model.ssd.ssd_head.SSDHead(feature_dict, head_config, label_dict=None, mode='predict')

Bases: easy_vision.python.model.cv_head.CVHead

SSD head that builds multi-scale feature maps, places anchors of different sizes on each feature map, and calculates loc_loss and cls_loss.

__init__(feature_dict, head_config, label_dict=None, mode='predict')

Parameters:

    feature_dict – must include two parts: (1) backbone output features; (2) the preprocessed batched image shape (preprocessed_input_shape)

    head_config – protos.ssd_pb2.SSDHead

    mode – train/evaluate/predict

build_loss_graph()

Compute scalar loss tensors with respect to provided groundtruth.

Calling this function requires that groundtruth tensors have been provided via the provide_groundtruth function.

Parameters:

self._prediction_dict – a dictionary holding prediction tensors:

    box_encodings: 3-D float tensor of shape [batch_size, num_anchors, box_code_dimension] containing predicted boxes.

    class_predictions_with_background: 3-D float tensor of shape [batch_size, num_anchors, num_classes+1] containing class predictions (logits) for each of the anchors. Note that this tensor includes background class predictions.

Returns:

a dictionary mapping loss keys (localization_loss and classification_loss) to scalar tensors representing the corresponding loss values.
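To make the shapes and the returned dictionary concrete, here is a minimal pure-Python sketch of how two scalar losses could be reduced from per-anchor predictions. It is not the easy_vision implementation. The smooth L1 and softmax cross-entropy penalties, the normalization by the number of matched anchors, and the names ssd_losses, box_targets, and class_targets are all assumptions made for illustration. Only the two loss keys and the tensor layouts come from the description above.

```python
import math


def smooth_l1(diff, delta=1.0):
    """Huber-style penalty for one coordinate difference (assumed loss)."""
    a = abs(diff)
    return 0.5 * a * a if a < delta else delta * (a - 0.5 * delta)


def softmax_cross_entropy(logits, target_index):
    """Cross entropy of one anchor's logits against an integer class id."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target_index]


def ssd_losses(box_encodings, class_logits, box_targets, class_targets):
    """Hypothetical helper: reduce per-anchor predictions to two scalars.

    box_encodings: [batch][num_anchors][box_code_dimension]
    class_logits:  [batch][num_anchors][num_classes + 1]
    box_targets:   same nesting as box_encodings (assumed input)
    class_targets: [batch][num_anchors] integer ids, 0 taken as background
    """
    loc_loss, cls_loss, num_matched = 0.0, 0.0, 0
    for b in range(len(class_targets)):
        for a, target in enumerate(class_targets[b]):
            cls_loss += softmax_cross_entropy(class_logits[b][a], target)
            if target > 0:  # only non-background anchors regress boxes
                num_matched += 1
                loc_loss += sum(
                    smooth_l1(p - t)
                    for p, t in zip(box_encodings[b][a], box_targets[b][a])
                )
    normalizer = max(num_matched, 1)  # assumed normalization
    return {
        "localization_loss": loc_loss / normalizer,
        "classification_loss": cls_loss / normalizer,
    }
```

Real SSD training pipelines usually add steps such as hard negative mining and loss weighting; those are omitted here to keep the mapping from inputs to the two keys easy to follow.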

build_postprocess_graph()

Postprocess the predictions, including box decoding, clipping, and NMS.

Returns:

a dictionary containing the following fields:

    detection_boxes: [batch, max_detections, 4]

    detection_scores: [batch, max_detections]

    detection_classes: [batch, max_detections]

    detection_keypoints: [batch, max_detections, num_keypoints, 2] (if encoded in the prediction_dict ‘box_encodings’)

    num_detections: [batch]

Return type: detections
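The clipping and NMS steps, and the fixed-size output they produce, can be sketched for a single image in plain Python. This is an illustration under stated assumptions, not the library's code. The [ymin, xmin, ymax, xmax] normalized box layout, per-class greedy suppression, zero padding up to max_detections, and the helper name postprocess_single_image are all hypothetical. Box decoding is skipped because the box coder is not specified here.

```python
def clip_box(box, lo=0.0, hi=1.0):
    """Clip one [ymin, xmin, ymax, xmax] box to the assumed window [lo, hi]."""
    return [min(max(v, lo), hi) for v in box]


def iou(a, b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    height = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    width = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = height * width
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess_single_image(boxes, scores, classes,
                             iou_threshold=0.5, max_detections=3):
    """Hypothetical helper: clip, run greedy per-class NMS, pad the output."""
    boxes = [clip_box(b) for b in boxes]
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) == max_detections:
            break
        # Keep a box unless a higher-scoring box of the same class overlaps it.
        if all(classes[i] != classes[j]
               or iou(boxes[i], boxes[j]) <= iou_threshold
               for j in keep):
            keep.append(i)
    pad = max_detections - len(keep)
    return {
        "detection_boxes": [boxes[i] for i in keep]
                           + [[0.0] * 4 for _ in range(pad)],
        "detection_scores": [scores[i] for i in keep] + [0.0] * pad,
        "detection_classes": [classes[i] for i in keep] + [0] * pad,
        "num_detections": len(keep),
    }
```

Padding to a fixed max_detections is what lets a batch of images with different numbers of surviving boxes share the shapes listed above, with num_detections recording how many entries per image are valid.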

build_predict_graph()

Predicts unpostprocessed tensors from the input tensor. This function takes an input batch of images and runs it through the forward pass of the network to yield unpostprocessed predictions.

A side effect of calling the predict method is that self._anchors is populated with a box_list.BoxList of anchors. These anchors must be constructed before the postprocess or loss functions can be called.

Builds the prediction graph, including multi-scale feature map construction, anchor generation and placement, and box location and class prediction.

Returns:

a dictionary holding “raw” prediction tensors:

    preprocessed_inputs: the [batch, height, width, channels] image tensor.

    box_encodings: 3-D float tensor of shape [batch_size, num_anchors, box_code_dimension] containing predicted boxes.

    class_predictions_with_background: 3-D float tensor of shape [batch_size, num_anchors, num_classes+1] containing class predictions (logits) for each of the anchors. Note that this tensor includes background class predictions (at class index 0).

    feature_maps: a list of tensors where the ith tensor has shape [batch, height_i, width_i, depth_i].

    anchors: 2-D float tensor of shape [num_anchors, 4] containing the generated anchors in normalized coordinates.

Return type: prediction_dict
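As a rough illustration of how a [num_anchors, 4] anchor list in normalized coordinates can arise from several feature maps, the sketch below places one anchor per aspect ratio at the center of every feature-map cell. The function name generate_anchors, the per-map scales, the aspect-ratio set, and the [ymin, xmin, ymax, xmax] ordering are assumptions for this example. The anchor generator actually used is selected by the head configuration and may differ.

```python
import math


def generate_anchors(feature_map_shapes, scales, aspect_ratios=(1.0, 2.0, 0.5)):
    """Hypothetical helper: build normalized anchors for multi-scale maps.

    feature_map_shapes: list of (height_i, width_i), one per feature map
    scales: one anchor scale per feature map, as a fraction of the image
    Returns a list of [ymin, xmin, ymax, xmax] boxes, i.e. [num_anchors, 4].
    """
    anchors = []
    for (height, width), scale in zip(feature_map_shapes, scales):
        for row in range(height):
            for col in range(width):
                # Center of the cell in normalized image coordinates.
                cy = (row + 0.5) / height
                cx = (col + 0.5) / width
                for ratio in aspect_ratios:
                    h = scale / math.sqrt(ratio)
                    w = scale * math.sqrt(ratio)
                    anchors.append(
                        [cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2]
                    )
    return anchors
```

With this layout, num_anchors is the sum over feature maps of height_i * width_i * len(aspect_ratios), which is the second dimension shared by box_encodings and class_predictions_with_background.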