easy_vision.python.builders

easy_vision.python.builders.anchor_generator_builder

A function to build an object detection anchor generator from config.

easy_vision.python.builders.anchor_generator_builder.build(anchor_generator_config)[source]

Builds an anchor generator based on the config.

Parameters: anchor_generator_config – An anchor_generator.proto object containing the config for the desired anchor generator.
Returns: Anchor generator based on the config.
Raises: ValueError – On empty anchor generator proto.
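
A minimal usage sketch for build(): parse an anchor generator config from text and hand it to the builder. The proto import path and the grid_anchor_generator fields are assumptions here; only the build() signature is taken from this page.

  from google.protobuf import text_format

  from easy_vision.python.builders import anchor_generator_builder
  from easy_vision.python.protos import anchor_generator_pb2  # assumed import path

  config = anchor_generator_pb2.AnchorGenerator()
  # Field names below are assumed to follow anchor_generator.proto.
  text_format.Merge("""
    grid_anchor_generator {
      scales: [0.25, 0.5, 1.0, 2.0]
      aspect_ratios: [0.5, 1.0, 2.0]
    }
  """, config)

  anchor_generator = anchor_generator_builder.build(config)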

easy_vision.python.builders.backbone_builder

easy_vision.python.builders.backbone_builder.build(backbone_config_or_name, is_training=False, **kwargs)[source]
Builds a backbone from a config object or a backbone name.

Parameters:
  • backbone_config_or_name – a backbone config object or a backbone name string.
  • is_training – whether the backbone builds the training graph or the predict graph.
Returns:

The corresponding backbone instance. Currently only resnet_v1 is supported, via ResnetBackbone.
easy_vision.python.builders.backbone_builder.build_block(block_config, is_training=False)[source]
Parameters:
  • block_config – type of protos.Block
  • is_training – training mode or not
Returns:

the corresponding block instance, such as ResnetBlock()

easy_vision.python.builders.backbone_builder.build_with_config(backbone_config, is_training=False)[source]
Builds a backbone from a backbone config object (see the sketch below).

Parameters:
  • backbone_config – a backbone config object.
  • is_training – whether the backbone builds the training graph or the predict graph.
Returns:

The corresponding backbone instance based on backbone_config. Currently only resnet_v1 is supported, via ResnetBackbone.
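
A hedged sketch of both entry points; the backbone name string 'resnet_v1_50' is an assumption, since only resnet_v1 support is documented here.

  from easy_vision.python.builders import backbone_builder

  # Build directly from a backbone name (the exact name string is an assumption).
  backbone = backbone_builder.build('resnet_v1_50', is_training=True)

  # Or build from a backbone config object, e.g. one taken from a model config:
  # backbone = backbone_builder.build_with_config(backbone_config, is_training=True)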

easy_vision.python.builders.box_coder_builder

A function to build an object detection box coder from configuration.

easy_vision.python.builders.box_coder_builder.build(box_coder_config)[source]

Builds a box coder object based on the box coder config.

Parameters: box_coder_config – A box_coder.proto object containing the config for the desired box coder.
Returns: BoxCoder based on the config.
Raises: ValueError – On empty box coder proto.
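
A minimal sketch for build(); the proto import path and the faster_rcnn_box_coder fields are assumptions based on the box_coder.proto layout.

  from google.protobuf import text_format

  from easy_vision.python.builders import box_coder_builder
  from easy_vision.python.protos import box_coder_pb2  # assumed import path

  config = box_coder_pb2.BoxCoder()
  text_format.Merge("""
    faster_rcnn_box_coder {
      y_scale: 10.0
      x_scale: 10.0
      height_scale: 5.0
      width_scale: 5.0
    }
  """, config)

  box_coder = box_coder_builder.build(config)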

easy_vision.python.builders.box_predictor_builder

Function to build box predictor from configuration.

class easy_vision.python.builders.box_predictor_builder.BoxEncodingsClipRange(min, max)

Bases: tuple

max

Alias for field number 1

min

Alias for field number 0
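
BoxEncodingsClipRange is a plain namedtuple holding the clip bounds passed to build_weight_shared_convolutional_box_predictor via box_encodings_clip_range. The bounds below are illustrative only.

  from easy_vision.python.builders.box_predictor_builder import BoxEncodingsClipRange

  # Clip predicted box encodings to [-10, 10].
  clip_range = BoxEncodingsClipRange(min=-10.0, max=10.0)
  print(clip_range.min, clip_range.max)  # -10.0 10.0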

easy_vision.python.builders.box_predictor_builder.build(argscope_fn, box_predictor_config, is_training, num_classes, add_background_class=True)[source]

Builds box predictor based on the configuration.

See box_predictor.proto for configurable options and box_predictor.py for more details.

Parameters:
  • argscope_fn
    A function that takes the following inputs:
    • hyperparams_pb2.Hyperparams proto
    • a boolean indicating if the model is in training mode.

    and returns a tf slim argscope for Conv and FC hyperparameters.

  • box_predictor_config – box_predictor_pb2.BoxPredictor proto containing configuration.
  • is_training – Whether the model is in training mode.
  • num_classes – Number of classes to predict.
  • add_background_class – Whether to add an implicit background class.
Returns:

box_predictor.BoxPredictor object.

Return type:

box_predictor

Raises:

ValueError – On unknown box predictor.
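
A sketch of wiring build() together with hyperparams_builder.build as the argscope_fn. The proto import path and the convolutional_box_predictor field names are assumptions based on box_predictor.proto; only the build() signature is documented here.

  from google.protobuf import text_format

  from easy_vision.python.builders import box_predictor_builder
  from easy_vision.python.builders import hyperparams_builder
  from easy_vision.python.protos import box_predictor_pb2  # assumed import path

  config = box_predictor_pb2.BoxPredictor()
  text_format.Merge("""
    convolutional_box_predictor {
      conv_hyperparams {
        regularizer { l2_regularizer { weight: 0.0004 } }
        initializer { truncated_normal_initializer { stddev: 0.03 } }
      }
      min_depth: 0
      max_depth: 0
      num_layers_before_predictor: 0
      kernel_size: 3
      box_code_size: 4
    }
  """, config)

  # hyperparams_builder.build matches the argscope_fn contract: it takes a
  # Hyperparams proto and is_training and yields the Conv/FC hyperparameters.
  predictor = box_predictor_builder.build(
      hyperparams_builder.build, config,
      is_training=True, num_classes=20, add_background_class=True)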

easy_vision.python.builders.box_predictor_builder.build_convolutional_3d_box_predictor(is_training, num_classes, conv_hyperparams_fn, min_depth, max_depth, num_layers_before_predictor, dropout_keep_prob, kernel_size, box_code_size, add_background_class=True, class_prediction_bias_init=0.0, use_depthwise=False)[source]

Builds the ConvolutionalBoxPredictor from the arguments.

Parameters:
  • is_training – Indicates whether the BoxPredictor is in training mode.
  • num_classes – number of classes. Note that num_classes does not include the background category, so if groundtruth labels take values in {0, 1, ..., K-1}, num_classes=K (and not K+1, even though the assigned classification targets can range from {0, ..., K}).
  • conv_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for convolution ops.
  • min_depth – Minimum feature depth prior to predicting box encodings and class predictions.
  • max_depth – Maximum feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions.
  • num_layers_before_predictor – Number of the additional conv layers before the predictor.
  • dropout_keep_prob – Keep probability for dropout.
  • kernel_size – Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is automatically set to be min(feature_width, feature_height).
  • box_code_size – Size of encoding for each box.
  • add_background_class – Whether to add an implicit background class.
  • class_prediction_bias_init – Constant value to initialize bias of the last conv2d layer before class prediction.
  • use_depthwise – Whether to use depthwise convolutions for prediction steps. Default is False.
Returns:

A ConvolutionalBoxPredictor class.

easy_vision.python.builders.box_predictor_builder.build_convolutional_box_predictor(is_training, num_classes, conv_hyperparams_fn, min_depth, max_depth, num_layers_before_predictor, dropout_keep_prob, kernel_size, box_code_size, add_background_class=True, class_prediction_bias_init=0.0, use_depthwise=False)[source]

Builds the ConvolutionalBoxPredictor from the arguments.

Parameters:
  • is_training – Indicates whether the BoxPredictor is in training mode.
  • num_classes – number of classes. Note that num_classes does not include the background category, so if groundtruth labels take values in {0, 1, ..., K-1}, num_classes=K (and not K+1, even though the assigned classification targets can range from {0, ..., K}).
  • conv_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for convolution ops.
  • min_depth – Minimum feature depth prior to predicting box encodings and class predictions.
  • max_depth – Maximum feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions.
  • num_layers_before_predictor – Number of the additional conv layers before the predictor.
  • dropout_keep_prob – Keep probability for dropout.
  • kernel_size – Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is automatically set to be min(feature_width, feature_height).
  • box_code_size – Size of encoding for each box.
  • add_background_class – Whether to add an implicit background class.
  • class_prediction_bias_init – Constant value to initialize bias of the last conv2d layer before class prediction.
  • use_depthwise – Whether to use depthwise convolutions for prediction steps. Default is False.
Returns:

A ConvolutionalBoxPredictor class.

easy_vision.python.builders.box_predictor_builder.build_mask_rcnn_3d_box_predictor(is_training, num_classes, fc_hyperparams_fn, depth, num_layers_before_predictor, dropout_keep_prob, box_code_size, add_background_class=True, agnostic=False)[source]

Builds and returns a MaskRCNNBoxPredictor class.

Parameters:
  • is_training – Indicates whether the BoxPredictor is in training mode.
  • num_classes – number of classes. Note that num_classes does not include the background category, so if groundtruth labels take values in {0, 1, ..., K-1}, num_classes=K (and not K+1, even though the assigned classification targets can range from {0, ..., K}).
  • fc_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for fully connected ops.
  • depth – feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions.
  • num_layers_before_predictor – Number of the additional fc layers before the predictor.
  • dropout_keep_prob – Keep probability for dropout. A single dropout op is applied prior to both box and class predictions, which stands in contrast to the ConvolutionalBoxPredictor below.
  • box_code_size – Size of encoding for each box.
  • add_background_class – Whether to add an implicit background class.
  • agnostic – Whether to share boxes across classes rather than use a different box for each class.
Returns:

A MaskRCNNBoxPredictor class.

easy_vision.python.builders.box_predictor_builder.build_mask_rcnn_box_predictor(is_training, num_classes, fc_hyperparams_fn, depth, num_layers_before_predictor, dropout_keep_prob, box_code_size, add_background_class=True, agnostic=False)[source]

Builds and returns a MaskRCNNBoxPredictor class.

Parameters:
  • is_training – Indicates whether the BoxPredictor is in training mode.
  • num_classes – number of classes. Note that num_classes does not include the background category, so if groundtruth labels take values in {0, 1, ..., K-1}, num_classes=K (and not K+1, even though the assigned classification targets can range from {0, ..., K}).
  • fc_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for fully connected ops.
  • depth – feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions.
  • num_layers_before_predictor – Number of the additional fc layers before the predictor.
  • dropout_keep_prob – Keep probability for dropout. A single dropout op is applied prior to both box and class predictions, which stands in contrast to the ConvolutionalBoxPredictor below.
  • box_code_size – Size of encoding for each box.
  • add_background_class – Whether to add an implicit background class.
  • agnostic – Whether to share boxes across classes rather than use a different box for each class.
Returns:

A MaskRCNNBoxPredictor class.

easy_vision.python.builders.box_predictor_builder.build_score_converter(score_converter_config, is_training)[source]

Builds score converter based on the config.

Builds one of [tf.identity, tf.sigmoid] score converters based on the config and whether the BoxPredictor is for training or inference.

Parameters:
  • score_converter_config – box_predictor_pb2.WeightSharedConvolutionalBoxPredictor.score_converter.
  • is_training – Indicates whether the BoxPredictor is in training mode.
Returns:

Callable score converter op.

Raises:

ValueError – On unknown score converter.

easy_vision.python.builders.box_predictor_builder.build_weight_shared_convolutional_box_predictor(is_training, num_classes, conv_hyperparams_fn, depth, num_layers_before_predictor, box_code_size, kernel_size=3, add_background_class=True, class_prediction_bias_init=0.0, dropout_keep_prob=0.8, share_prediction_tower=False, apply_batch_norm=True, use_depthwise=False, box_encodings_clip_range=None)[source]

Builds and returns a WeightSharedConvolutionalBoxPredictor class.

Parameters:
  • is_training – Indicates whether the BoxPredictor is in training mode.
  • num_classes – number of classes. Note that num_classes does not include the background category, so if groundtruth labels take values in {0, 1, ..., K-1}, num_classes=K (and not K+1, even though the assigned classification targets can range from {0, ..., K}).
  • conv_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for convolution ops.
  • depth – depth of conv layers.
  • num_layers_before_predictor – Number of the additional conv layers before the predictor.
  • box_code_size – Size of encoding for each box.
  • kernel_size – Size of final convolution kernel.
  • add_background_class – Whether to add an implicit background class.
  • class_prediction_bias_init – constant value to initialize bias of the last conv2d layer before class prediction.
  • dropout_keep_prob – Probability of keeping activations.
  • share_prediction_tower – Whether to share the multi-layer tower between box prediction and class prediction heads.
  • apply_batch_norm – Whether to apply batch normalization to conv layers in this predictor.
  • use_depthwise – Whether to use depthwise separable conv2d instead of conv2d.
  • box_encodings_clip_range – Min and max values for clipping the box_encodings.
Returns:

A WeightSharedConvolutionalBoxPredictor class.

easy_vision.python.builders.box_predictor_builder.build_yolo_box_predictor(is_training, num_classes, conv_hyperparams_fn, num_layers_before_predictor)[source]

Builds the YOLOBoxPredictor from the arguments.

Parameters:
  • is_training – Indicates whether the BoxPredictor is in training mode.
  • num_classes – number of classes.
  • conv_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for convolution ops.
  • num_layers_before_predictor – Number of the additional conv layers before the predictor.
Returns:

A YOLOBoxPredictor class.

easy_vision.python.builders.hyperparams_builder

Builder function to construct tf-slim arg_scope for convolution, fc ops.

easy_vision.python.builders.hyperparams_builder.build(hyperparams_config, is_training)[source]

Builds tf-slim arg_scope for convolution ops based on the config.

Returns an arg_scope to use for convolution ops containing weights initializer, weights regularizer, activation function, batch norm function and batch norm parameters based on the configuration.

Note that if the batch_norm parameters are not specified in the config (i.e. left to default) then batch norm is excluded from the arg_scope.

The batch norm parameters are set for updates based on is_training argument and conv_hyperparams_config.batch_norm.train parameter. During training, they are updated only if batch_norm.train parameter is true. However, during eval, no updates are made to the batch norm variables. In both cases, their current values are used during forward pass.

Parameters:
  • hyperparams_config – hyperparams.proto object containing hyperparameters.
  • is_training – Whether the network is in training mode.
Returns:

A function to construct tf-slim arg_scope containing hyperparameters for ops.

Return type:

arg_scope_fn

Raises:

ValueError – if hyperparams_config is not of type hyperparams.Hyperparams.
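
A sketch of using the returned arg_scope_fn around tf-slim ops. The proto import path and the field names in the text config are assumptions based on hyperparams.proto, and the example assumes the TF 1.x tf.contrib.slim API.

  import tensorflow as tf
  from google.protobuf import text_format

  from easy_vision.python.builders import hyperparams_builder
  from easy_vision.python.protos import hyperparams_pb2  # assumed import path

  slim = tf.contrib.slim

  config = hyperparams_pb2.Hyperparams()
  text_format.Merge("""
    regularizer { l2_regularizer { weight: 0.0004 } }
    initializer { truncated_normal_initializer { stddev: 0.03 } }
    activation: RELU_6
  """, config)

  arg_scope_fn = hyperparams_builder.build(config, is_training=True)
  with slim.arg_scope(arg_scope_fn()):
    # Conv ops inside this scope pick up the configured initializer,
    # regularizer and activation; batch norm is excluded because the
    # config above sets no batch_norm parameters.
    net = slim.conv2d(tf.zeros([1, 32, 32, 3]), 16, [3, 3])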

easy_vision.python.builders.keypoint_predictor_builder

easy_vision.python.builders.keypoint_predictor_builder.build(argscope_fn, keypoint_predictor_config, is_training, num_keypoints, predict_direction=False)[source]

Builds keypoint predictor based on the configuration.

See keypoint_predictor.proto for configurable options and detection_predictors/predictor.py for more details.

Parameters:
  • argscope_fn
    A function that takes the following inputs:
    • hyperparams_pb2.Hyperparams proto
    • a boolean indicating if the model is in training mode.

    and returns a tf slim argscope for Conv and FC hyperparameters.

  • keypoint_predictor_config – keypoint_predictor_pb2.KeypointPredictor proto containing configuration.
  • is_training – Whether the model is in training mode.
  • num_keypoints – Number of keypoints to predict.
  • predict_direction – whether to predict text direction.
Returns:

predictor.KeypointPredictor object.

Return type:

keypoint_predictor

Raises:

ValueError – On unknown keypoint predictor.

easy_vision.python.builders.keypoint_predictor_builder.build_text_resnet_keypoint_predictor(is_training, num_keypoints, conv_hyperparams_fn=None, num_blocks_before_predictor=3, num_units_per_block=2, base_depth_before_predictor=64, se_rate=0, fc_hyperparams_fn=None, keypoint_prediction_num_fc_layers=2, keypoint_prediction_fc_depth=1024, predict_direction=False)[source]

Builds and returns a TextResnetKeypointPredictor class.

Parameters:
  • is_training – Indicates whether the KeypointPredictor is in training mode.
  • num_keypoints – number of keypoints.
  • conv_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for conv ops.
  • num_blocks_before_predictor – Number of additional resnet blocks before the predictor.
  • num_units_per_block – Number of units per resnet block.
  • base_depth_before_predictor – Feature depth of the first resnet block.
  • se_rate – Squeeze-and-excite rate; values less than or equal to zero disable it.
  • fc_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for fc ops.
  • keypoint_prediction_num_fc_layers – Number of fc layers applied to the image_features in keypoint prediction branch.
  • keypoint_prediction_fc_depth – The depth for the fc op applied to the image_features in the keypoint prediction branch.
  • predict_direction – whether to predict text direction.
Returns:

A TextResnetKeypointPredictor class.

easy_vision.python.builders.losses_builder

A function to build localization and classification losses from config.

easy_vision.python.builders.losses_builder.build(loss_config)[source]

Build losses based on the config.

Builds classification, localization losses and optionally a hard example miner based on the config.

Parameters:

loss_config – A losses_pb2.Loss object.

Returns:

  • classification_loss – Classification loss object.
  • localization_loss – Localization loss object.
  • classification_weight – Classification loss weight.
  • localization_weight – Localization loss weight.
  • hard_example_miner – Hard example miner object.
  • random_example_sampler – BalancedPositiveNegativeSampler object.

Raises:
  • ValueError – If hard_example_miner is used with sigmoid_focal_loss.
  • ValueError – If random_example_sampler is getting non-positive value as desired positive example fraction.
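
A sketch of building and unpacking the losses. The returned names follow the Returns list above; the proto import path and the loss field names are assumptions based on losses.proto.

  from google.protobuf import text_format

  from easy_vision.python.builders import losses_builder
  from easy_vision.python.protos import losses_pb2  # assumed import path

  loss_config = losses_pb2.Loss()
  text_format.Merge("""
    classification_loss { weighted_softmax {} }
    localization_loss { weighted_smooth_l1 {} }
    classification_weight: 1.0
    localization_weight: 1.0
  """, loss_config)

  (classification_loss, localization_loss,
   classification_weight, localization_weight,
   hard_example_miner, random_example_sampler) = losses_builder.build(loss_config)
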
easy_vision.python.builders.losses_builder.build_classification_loss(loss_config)[source]

Builds a classification loss based on the loss config.

Parameters: loss_config – A losses_pb2.ClassificationLoss object.
Returns: Loss based on the config.
easy_vision.python.builders.losses_builder.build_faster_rcnn_classification_loss(loss_config)[source]

Builds a classification loss for Faster RCNN based on the loss config.

Parameters: loss_config – A losses_pb2.ClassificationLoss object.
Returns: Loss based on the config.
Raises: ValueError – On invalid loss_config.
easy_vision.python.builders.losses_builder.build_hard_example_miner(config, classification_weight, localization_weight)[source]

Builds hard example miner based on the config.

Parameters:
  • config – A losses_pb2.HardExampleMiner object.
  • classification_weight – Classification loss weight.
  • localization_weight – Localization loss weight.
Returns:

Hard example miner.

easy_vision.python.builders.mask_predictor_builder

Function to build mask predictor from configuration.

easy_vision.python.builders.mask_predictor_builder.build(argscope_fn, mask_predictor_config, is_training, num_classes)[source]

Builds mask predictor based on the configuration.

See mask_predictor.proto for configurable options.

Parameters:
  • argscope_fn
    A function that takes the following inputs:
    • hyperparams_pb2.Hyperparams proto
    • a boolean indicating if the model is in training mode.

    and returns a tf slim argscope for Conv and FC hyperparameters.

  • mask_predictor_config – mask_predictor_pb2.MaskPredictor proto containing configuration.
  • is_training – Whether the model is in training mode.
  • num_classes – Number of classes to predict.
Returns:

mask_predictor.MaskPredictor object.

Return type:

mask_predictor

Raises:

ValueError – On unknown box predictor.

easy_vision.python.builders.mask_predictor_builder.build_mask_rcnn_mask_predictor(is_training, num_classes, conv_hyperparams_fn=None, mask_height=14, mask_width=14, mask_prediction_num_conv_layers=2, mask_prediction_conv_depth=256, masks_are_class_agnostic=False, convolve_then_upsample_masks=False)[source]

Builds and returns a MaskRCNNBoxPredictor class.

Parameters:
  • is_training – Indicates whether the BoxPredictor is in training mode.
  • num_classes – number of classes. Note that num_classes does not include the background category, so if groundtruth labels take values in {0, 1, ..., K-1}, num_classes=K (and not K+1, even though the assigned classification targets can range from {0, ..., K}).
  • conv_hyperparams_fn – A function to generate tf-slim arg_scope with hyperparameters for convolution ops.
  • mask_height – Desired output mask height. The default value is 14.
  • mask_width – Desired output mask width. The default value is 14.
  • mask_prediction_num_conv_layers – Number of convolution layers applied to the image_features in mask prediction branch.
  • mask_prediction_conv_depth – The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the depth of the convolution layers will be automatically chosen based on the number of object classes and the number of channels in the image features.
  • masks_are_class_agnostic – Boolean determining if the mask-head is class-agnostic or not.
  • convolve_then_upsample_masks – Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. Otherwise, mask features are resized to [mask_height, mask_width] using bilinear resizing before applying convolutions.
Returns:

A MaskRCNNMaskPredictor class.
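
A direct-call sketch using only the documented keyword arguments; the values are illustrative, and conv_hyperparams_fn would normally be produced by hyperparams_builder.build.

  from easy_vision.python.builders import mask_predictor_builder

  mask_predictor = mask_predictor_builder.build_mask_rcnn_mask_predictor(
      is_training=True,
      num_classes=20,
      conv_hyperparams_fn=None,  # e.g. hyperparams_builder.build(conv_hyperparams, True)
      mask_height=28,
      mask_width=28,
      mask_prediction_num_conv_layers=4,
      mask_prediction_conv_depth=256,
      masks_are_class_agnostic=False,
      convolve_then_upsample_masks=False)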

easy_vision.python.builders.matcher_builder

A function to build an object detection matcher from configuration.

easy_vision.python.builders.matcher_builder.build(matcher_config)[source]

Builds a matcher object based on the matcher config.

Parameters: matcher_config – A matcher.proto object containing the config for the desired Matcher.
Returns: Matcher based on the config.
Raises: ValueError – On empty matcher proto.
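
A minimal sketch for build(); the proto import path and the argmax_matcher fields are assumptions based on matcher.proto.

  from google.protobuf import text_format

  from easy_vision.python.builders import matcher_builder
  from easy_vision.python.protos import matcher_pb2  # assumed import path

  config = matcher_pb2.Matcher()
  text_format.Merge("""
    argmax_matcher {
      matched_threshold: 0.7
      unmatched_threshold: 0.3
    }
  """, config)

  matcher = matcher_builder.build(config)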

easy_vision.python.builders.optimizer_builder

Functions to build DetectionModel training optimizers.

easy_vision.python.builders.optimizer_builder.build(optimizer_config)[source]

Create optimizer based on config.

Parameters: optimizer_config – An Optimizer proto message.
Returns: An optimizer and a list of variables for summary.
Raises: ValueError – when using an unsupported input data type.
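
A sketch of building a momentum optimizer; the proto import path and the momentum_optimizer / learning_rate field names are assumptions based on the Optimizer proto conventions.

  from google.protobuf import text_format

  from easy_vision.python.builders import optimizer_builder
  from easy_vision.python.protos import optimizer_pb2  # assumed import path

  config = optimizer_pb2.Optimizer()
  text_format.Merge("""
    momentum_optimizer {
      learning_rate { constant_learning_rate { learning_rate: 0.01 } }
      momentum_optimizer_value: 0.9
    }
  """, config)

  # build() returns the optimizer plus a list of variables to summarize.
  optimizer, summary_vars = optimizer_builder.build(config)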

easy_vision.python.builders.post_processing_builder

Builder function for post processing operations.

easy_vision.python.builders.post_processing_builder.build(post_processing_config)[source]

Builds callables for post-processing operations.

Builds callables for non-max suppression and score conversion based on the configuration.

The non-max suppression callable takes boxes, scores, and optionally clip_window, parallel_iterations, masks, and scope as inputs. It returns nms_boxes, nms_scores, nms_classes, nms_masks and num_detections. See post_processing.batch_multiclass_non_max_suppression for the type and shape of these tensors.

Score converter callable should be called with input tensor. The callable returns the output from one of 3 tf operations based on the configuration - tf.identity, tf.sigmoid or tf.nn.softmax. See tensorflow documentation for argument and return value descriptions.

Parameters: post_processing_config – post_processing.proto object containing the parameters for the post-processing operations.
Returns:
  • non_max_suppressor_fn – Callable for non-max suppression.
  • score_converter_fn – Callable for score conversion.
Raises: ValueError – if the post_processing_config is of incorrect type.
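
A sketch of building both callables; the proto import path and the batch_non_max_suppression / score_converter fields are assumptions based on post_processing.proto.

  from google.protobuf import text_format

  from easy_vision.python.builders import post_processing_builder
  from easy_vision.python.protos import post_processing_pb2  # assumed import path

  config = post_processing_pb2.PostProcessing()
  text_format.Merge("""
    batch_non_max_suppression {
      score_threshold: 0.0
      iou_threshold: 0.6
      max_detections_per_class: 100
      max_total_detections: 300
    }
    score_converter: SIGMOID
  """, config)

  non_max_suppressor_fn, score_converter_fn = post_processing_builder.build(config)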

easy_vision.python.builders.preprocessor_builder

Builder for preprocessing steps.

easy_vision.python.builders.preprocessor_builder.build(preprocessor_step_config)[source]

Builds preprocessing step based on the configuration.

Parameters: preprocessor_step_config – PreprocessingStep configuration proto.
Returns: A callable function and an argument map to call the function with.
Return type: function, argmap
Raises: ValueError – On invalid configuration.
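
A sketch of building one preprocessing step; random_horizontal_flip is an assumed step name, and the proto import path is also an assumption.

  from google.protobuf import text_format

  from easy_vision.python.builders import preprocessor_builder
  from easy_vision.python.protos import preprocessor_pb2  # assumed import path

  step_config = preprocessor_pb2.PreprocessingStep()
  text_format.Merge("random_horizontal_flip {}", step_config)

  # build() returns the step function and the keyword arguments to call it with.
  preprocess_fn, arg_map = preprocessor_builder.build(step_config)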

easy_vision.python.builders.text_encoder_builder

easy_vision.python.builders.text_encoder_builder.build(config, is_training)[source]

Builds a text encoder.

Parameters:
  • config – a config containing the encoder config and time_major.
  • is_training – whether to build for training or not (eval/predict).
Returns: An instance of text_encoder.TextEncoder.