easy_vision.python.model.ssd

easy_vision.python.model.ssd.ssd

class easy_vision.python.model.ssd.ssd.SSD(model_config, feature_dict, label_dict=None, mode='predict', categories=None)[source]

Bases: easy_vision.python.model.detection_model.DetectionModel

__init__(model_config, feature_dict, label_dict=None, mode='predict', categories=None)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_loss_graph()[source]
build_predict_graph()[source]
classmethod create_class(name)
get_scopes_of_levels()[source]

Returns a list of variable-scope lists ordered by level (outputs -> heads -> backbone -> inputs).

easy_vision.python.model.ssd.ssd_head

class easy_vision.python.model.ssd.ssd_head.SSDHead(feature_dict, head_config, label_dict=None, mode='predict')[source]

Bases: easy_vision.python.model.cv_head.CVHead

SSD head that builds multi-scale feature maps, places anchors of different sizes on each feature map, and calculates loc_loss and cls_loss.

__init__(feature_dict, head_config, label_dict=None, mode='predict')[source]
Parameters:
  • feature_dict – must include two parts: 1. the backbone output features; 2. the preprocessed batched image shape (preprocessed_input_shape)
  • head_config – protos.ssd_pb2.SSDHead
  • mode – train/evaluate/predict
build_loss_graph()[source]

Compute scalar loss tensors with respect to provided groundtruth.

Calling this function requires that groundtruth tensors have been provided via the provide_groundtruth function.

Parameters:self._prediction_dict – a dictionary holding prediction tensors with
  1. box_encodings: 3-D float tensor of shape [batch_size, num_anchors, box_code_dimension] containing predicted boxes.
  2. class_predictions_with_background: 3-D float tensor of shape [batch_size, num_anchors, num_classes+1] containing class predictions (logits) for each of the anchors. Note that this tensor includes background class predictions.
Returns:
a dictionary mapping loss keys (localization_loss and classification_loss) to scalar tensors representing corresponding loss values.
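
As a rough illustration of how the two loss keys relate to the prediction tensors above, here is a minimal NumPy sketch. It assumes a smooth L1 localization loss and a softmax cross-entropy classification loss, both normalized by the number of positive anchors; the source does not state which losses SSDHead actually uses, and ssd_losses, box_targets and class_targets are hypothetical names introduced only for this example.

```python
import numpy as np

def smooth_l1(pred, target):
    """Element-wise smooth L1 loss (quadratic below 1.0, linear above)."""
    diff = np.abs(pred - target)
    return np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)

def softmax_cross_entropy(logits, labels):
    """Per-anchor cross-entropy; labels are integer class ids (0 = background)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]

def ssd_losses(prediction_dict, box_targets, class_targets):
    """Hypothetical helper: returns the two scalar losses as a dict.

    box_targets:   [batch_size, num_anchors, box_code_dimension] (assumed)
    class_targets: [batch_size, num_anchors] integer ids (assumed)
    """
    box_encodings = prediction_dict["box_encodings"]
    class_logits = prediction_dict["class_predictions_with_background"]

    positive = class_targets > 0               # anchors matched to an object
    num_pos = max(int(positive.sum()), 1)      # avoid division by zero

    # Localization loss is only counted on positive anchors.
    loc = smooth_l1(box_encodings, box_targets).sum(axis=-1)
    localization_loss = (loc * positive).sum() / num_pos

    # Classification loss is counted on every anchor in this sketch
    # (no hard-negative mining).
    cls = softmax_cross_entropy(class_logits, class_targets)
    classification_loss = cls.sum() / num_pos

    return {
        "localization_loss": float(localization_loss),
        "classification_loss": float(classification_loss),
    }
```

The returned dictionary has the same two keys as described above; everything else about the real implementation (matching, mining, loss weights) is outside what this sketch covers.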
build_postprocess_graph()[source]

Postprocess, including box decoding, clipping, and NMS.

Returns:
a dictionary containing the following fields:
  • detection_boxes: [batch, max_detections, 4]
  • detection_scores: [batch, max_detections]
  • detection_classes: [batch, max_detections]
  • detection_keypoints: [batch, max_detections, num_keypoints, 2] (if encoded in the prediction_dict ‘box_encodings’)
  • num_detections: [batch]
Return type:detections
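
The clipping and NMS steps can be sketched for a single image in NumPy as follows. This is an illustration under assumptions, not the library's implementation: boxes are assumed to be already decoded into normalized (ymin, xmin, ymax, xmax) coordinates, and clip_boxes, iou_one_to_many and greedy_nms are hypothetical helper names.

```python
import numpy as np

def clip_boxes(boxes):
    """Clip [N, 4] boxes (ymin, xmin, ymax, xmax) to the unit square."""
    return np.clip(boxes, 0.0, 1.0)

def iou_one_to_many(box, boxes):
    """IoU between one box [4] and many boxes [M, 4]."""
    ymin = np.maximum(box[0], boxes[:, 0])
    xmin = np.maximum(box[1], boxes[:, 1])
    ymax = np.minimum(box[2], boxes[:, 2])
    xmax = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ymax - ymin, 0.0, None) * np.clip(xmax - xmin, 0.0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    union = area + areas - inter
    return inter / np.maximum(union, 1e-12)

def greedy_nms(boxes, scores, iou_threshold=0.5, max_detections=100):
    """Return indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0 and len(keep) < max_detections:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept one too much.
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_threshold]
    return keep

boxes = clip_boxes(np.array([
    [0.00, 0.00, 0.5, 0.5],
    [0.01, 0.01, 0.5, 0.5],   # near-duplicate of the first box
    [0.60, 0.60, 1.2, 1.2],   # extends past the image; gets clipped
]))
scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms(boxes, scores)
```

In the batched output described above, the kept boxes for each image would then be padded to max_detections, with num_detections recording how many entries are valid.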
build_predict_graph()[source]

Predicts unpostprocessed tensors from the input tensor. This function takes an input batch of images and runs it through the forward pass of the network to yield unpostprocessed predictions.

A side effect of calling the predict method is that self._anchors is populated with a box_list.BoxList of anchors. These anchors must be constructed before the postprocess or loss functions can be called.

Builds the prediction graph, including multi-scale feature map construction, anchor generation and placement, and box location and class prediction.
Returns:
a dictionary holding “raw” prediction tensors:
  1. preprocessed_inputs: the [batch, height, width, channels] image tensor.
  2. box_encodings: 3-D float tensor of shape [batch_size, num_anchors, box_code_dimension] containing predicted boxes.
  3. class_predictions_with_background: 3-D float tensor of shape [batch_size, num_anchors, num_classes+1] containing class predictions (logits) for each of the anchors. Note that this tensor includes background class predictions (at class index 0).
  4. feature_maps: a list of tensors where the ith tensor has shape [batch, height_i, width_i, depth_i].
  5. anchors: 2-D float tensor of shape [num_anchors, 4] containing the generated anchors in normalized coordinates.
Return type:prediction_dict
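
To make the relationship between feature_maps and anchors concrete, the following NumPy sketch places anchors on a grid for each feature map and concatenates them into one [num_anchors, 4] array in normalized coordinates. The scales, aspect ratios, (ymin, xmin, ymax, xmax) ordering, and the helper names grid_anchors and multiscale_anchors are assumptions made for illustration; the actual anchor generator is configured through head_config and may differ.

```python
import numpy as np

def grid_anchors(height, width, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Anchors for one feature map.

    Returns [height * width * len(aspect_ratios), 4] boxes as normalized
    (ymin, xmin, ymax, xmax), centered on each feature-map cell.
    """
    ys = (np.arange(height) + 0.5) / height
    xs = (np.arange(width) + 0.5) / width
    cy, cx = np.meshgrid(ys, xs, indexing="ij")
    centers = np.stack([cy.ravel(), cx.ravel()], axis=-1)    # [H*W, 2]

    ratios = np.asarray(aspect_ratios, dtype=float)
    half_h = 0.5 * scale / np.sqrt(ratios)
    half_w = 0.5 * scale * np.sqrt(ratios)
    half = np.stack([half_h, half_w], axis=-1)                # [R, 2]

    mins = centers[:, None, :] - half[None, :, :]             # [H*W, R, 2]
    maxs = centers[:, None, :] + half[None, :, :]
    return np.concatenate([mins, maxs], axis=-1).reshape(-1, 4)

def multiscale_anchors(feature_map_shapes, scales):
    """Concatenate anchors from every feature map into [num_anchors, 4]."""
    per_level = [
        grid_anchors(h, w, s)
        for (h, w), s in zip(feature_map_shapes, scales)
    ]
    return np.concatenate(per_level, axis=0)

# Two feature maps (4x4 and 2x2) with three aspect ratios each.
anchors = multiscale_anchors([(4, 4), (2, 2)], scales=[0.2, 0.5])
```

Under these assumptions num_anchors is the sum over feature maps of height_i * width_i * (anchors per cell), which is the same num_anchors dimension that appears in box_encodings and class_predictions_with_background.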