easy_vision.python.core.backbones

easy_vision.python.core.backbones.alexnet_backbone

class easy_vision.python.core.backbones.alexnet_backbone.AlexnetBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

Alexnet Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.backbone

class easy_vision.python.core.backbones.backbone.Backbone(backbone_config, is_training=False)[source]

Bases: object

__init__(backbone_config, is_training=False)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of tensors, one for each layer's output

Return type:

activations
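A minimal usage sketch (a concrete subclass such as ResnetBackbone, a populated backbone_config protobuf and TensorFlow 1.x are assumed; this is not part of the documented API):

import tensorflow as tf
from easy_vision.python.core.backbones.resnet_backbone import ResnetBackbone

# backbone_config: a populated protobuf config for the backbone (assumed).
images = tf.placeholder(tf.float32, [None, 224, 224, 3])
backbone = ResnetBackbone(backbone_config, is_training=True)
# With num_classes=None the logits layer is omitted; the result is a dict
# mapping layer names to feature tensors.
feature_dict = backbone.build_graph(images, num_classes=None)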

classmethod create_class(name)
get_scopes_of_levels(with_logits=True)[source]
Parameters:with_logits – whether to include the classification layer scope (logits) or not.
Returns:
a list of variable scope lists, ordered by level from outputs to inputs,
e.g. with logits: [["logits"], ["block4"], ["block3"], ["block2"], ["block1"], ["conv1"]];
without logits: [["block4"], ["block3"], ["block2"], ["block1"], ["conv1"]]
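A hedged sketch of how the returned scope lists could be used, e.g. to group trainable variables level by level (TensorFlow 1.x assumed; the helper name is illustrative):

import tensorflow as tf

def variables_by_level(backbone, with_logits=False):
    # Collect trainable variables for each level, ordered from outputs to inputs,
    # so that individual levels can be frozen or given separate learning rates.
    levels = []
    for scopes in backbone.get_scopes_of_levels(with_logits=with_logits):
        level_vars = []
        for scope in scopes:
            level_vars.extend(
                tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=scope))
        levels.append(level_vars)
    return levels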
global_pool_convfc(inputs, num_classes)[source]

Global pooling and a conv2d used as a fully connected layer to produce the final logits.

Parameters:
  • inputs – the output of the last feature map
  • num_classes – the target number of classes to classify
Returns:

a tensor of [batch_size, num_classes]

Return type:

logits
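A minimal sketch of the global-pool-plus-1x1-conv pattern this method implements (the helper name and layer API below are illustrative, not the backbone's actual code):

import tensorflow as tf

def global_pool_convfc_sketch(inputs, num_classes):
    # inputs: [batch_size, height, width, channels]
    net = tf.reduce_mean(inputs, axis=[1, 2], keepdims=True)   # global average pooling
    logits = tf.layers.conv2d(net, filters=num_classes, kernel_size=1,
                              activation=None, name='logits')  # 1x1 conv acting as fc
    return tf.squeeze(logits, axis=[1, 2])                     # [batch_size, num_classes]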

class easy_vision.python.core.backbones.backbone.Block(block_config, batchnorm_trainable)[source]

Bases: object

Builds a sub-graph such as a ResNet block, e.g. block4 as used in the RCNN head.
__init__(block_config, batchnorm_trainable)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_block_graph(inputs)[source]

Parameters:inputs – usually a tensor of batched images
Returns:
a feature tensor

easy_vision.python.core.backbones.c3d_backbone

class easy_vision.python.core.backbones.c3d_backbone.C3DBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

C3D Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.cifar_backbone

class easy_vision.python.core.backbones.cifar_backbone.CifarNetBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

CifarNet Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.custom_backbone

class easy_vision.python.core.backbones.custom_backbone.CustomBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

Custom Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.custom_layers

easy_vision.python.core.backbones.custom_layers.l2_normalization(*args, **kwargs)[source]

Implements L2 normalization on every feature (i.e. spatial normalization). May be extended in the future to other dimensions, providing a more flexible normalization framework.

Parameters:
  • inputs – a 4-D tensor with dimensions [batch_size, height, width, channels].
  • scaling – whether or not to add a post-scaling operation along the dimensions which have been normalized.
  • scale_initializer – an initializer for the scaling weights.
  • reuse – whether or not the layer and its variables should be reused. To be able to reuse the layer, the scope must be given.
  • variables_collections – optional list of collections for all the variables, or a dictionary containing a different list of collections per variable.
  • outputs_collections – collection to add the outputs to.
  • data_format – NHWC or NCHW data format.
  • trainable – if True, also add variables to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable).
  • scope – optional scope for variable_scope.
Returns:

A Tensor representing the output of the operation.
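A hedged sketch of this style of spatial L2 normalization with an optional learnable per-channel scale (in the spirit of SSD's L2Norm layer; the function name, NHWC assumption and default scale are illustrative):

import tensorflow as tf

def l2_normalization_sketch(inputs, scaling=True, scale_init=20.0, scope='l2_norm'):
    # Normalize each spatial location across the channel axis (NHWC assumed).
    with tf.variable_scope(scope):
        outputs = tf.nn.l2_normalize(inputs, axis=-1)
        if scaling:
            channels = inputs.get_shape().as_list()[-1]
            gamma = tf.get_variable('gamma', shape=[channels],
                                    initializer=tf.constant_initializer(scale_init))
            outputs = outputs * gamma
        return outputs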

easy_vision.python.core.backbones.custom_layers.pad2d(*args, **kwargs)[source]

2D padding layer, adding symmetric padding to the H and W dimensions. Aims to mimic padding in Caffe and MXNet, helping the port of models to TensorFlow. Tries to follow the naming convention of tf.contrib.layers.

Parameters:
  • inputs – 4-D input Tensor.
  • pad – 2-tuple with padding values for the H and W dimensions.
  • mode – padding mode, cf. tf.pad.
  • data_format – NHWC or NCHW data format.
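A minimal sketch of symmetric H/W padding built on tf.pad (the function name is illustrative):

import tensorflow as tf

def pad2d_sketch(inputs, pad=(1, 1), mode='CONSTANT', data_format='NHWC'):
    # Pad H and W symmetrically; batch and channel dimensions are left untouched.
    if data_format == 'NHWC':
        paddings = [[0, 0], [pad[0], pad[0]], [pad[1], pad[1]], [0, 0]]
    else:  # 'NCHW'
        paddings = [[0, 0], [0, 0], [pad[0], pad[0]], [pad[1], pad[1]]]
    return tf.pad(inputs, paddings, mode=mode)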

easy_vision.python.core.backbones.darknet_backbone

class easy_vision.python.core.backbones.darknet_backbone.DarkNet53(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.efficientnet_backbone

class easy_vision.python.core.backbones.efficientnet_backbone.EfficientNetBackbone(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training)[source]

EfficientNet Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.i3d

Inception-v1 Inflated 3D ConvNet used for Kinetics CVPR paper.

The model is introduced in:

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset Joao Carreira, Andrew Zisserman https://arxiv.org/pdf/1705.07750v1.pdf.
class easy_vision.python.core.backbones.i3d.InceptionI3d(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

Inception-v1 I3D architecture.

The model is introduced in:

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset Joao Carreira, Andrew Zisserman https://arxiv.org/pdf/1705.07750v1.pdf.

See also the Inception architecture, introduced in:

Going deeper with convolutions Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. http://arxiv.org/pdf/1409.4842v1.pdf.
VALID_ENDPOINTS = ('Conv3d_1a_7x7', 'MaxPool3d_2a_3x3', 'Conv3d_2b_1x1', 'Conv3d_2c_3x3', 'MaxPool3d_3a_3x3', 'Mixed_3b', 'Mixed_3c', 'MaxPool3d_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e', 'Mixed_4f', 'MaxPool3d_5a_2x2', 'Mixed_5b', 'Mixed_5c', 'Logits')
__init__(backbone_config, is_training=False)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]

Builds the I3D model graph.

Parameters:
  • inputs – Inputs to the model, which should have dimensions batch_size x num_frames x 224 x 224 x num_channels.
  • num_classes – The number of outputs in the logit layer (default 400, which matches the Kinetics dataset).
  • spatial_squeeze – Whether to squeeze the spatial dimensions for the logits before returning (default True).
  • final_endpoint – The model contains many possible endpoints. final_endpoint specifies the last endpoint for the model to be built up to. In addition to the output at final_endpoint, all the outputs at endpoints up to final_endpoint will also be returned, in a dictionary. final_endpoint must be one of InceptionI3d.VALID_ENDPOINTS (default ‘Logits’).
  • name – A string (optional). The name of this module.
Raises:

ValueError – if final_endpoint is not recognized.
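A minimal usage sketch (the video placeholder shape, the number of frames and the backbone_config protobuf are assumptions; TensorFlow 1.x assumed):

import tensorflow as tf
from easy_vision.python.core.backbones.i3d import InceptionI3d

# Video inputs: batch_size x num_frames x 224 x 224 x num_channels.
videos = tf.placeholder(tf.float32, [None, 64, 224, 224, 3])
i3d = InceptionI3d(backbone_config, is_training=True)  # backbone_config: protobuf config (assumed)
end_points = i3d.build_graph(videos, num_classes=400)  # dict of endpoint tensors
mixed_5c = end_points.get('Mixed_5c')                  # one of InceptionI3d.VALID_ENDPOINTS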

classmethod create_class(name)
inception_block(net, b0, b1_reduce, b1, b2_reduce, b2, b3, end_point_name, end_points)[source]
class easy_vision.python.core.backbones.i3d.Unit3D(output_channels, kernel_shape=(1, 1, 1), stride=(1, 1, 1), activation_fn=<function relu>, weight_decay=0.0001, use_batch_norm=True, use_bias=False, name='unit_3d')[source]

Bases: object

Basic unit containing Conv3D + BatchNorm + non-linearity.

__init__(output_channels, kernel_shape=(1, 1, 1), stride=(1, 1, 1), activation_fn=<function relu>, weight_decay=0.0001, use_batch_norm=True, use_bias=False, name='unit_3d')[source]

Initializes Unit3D module.
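A hedged sketch of the Conv3D + BatchNorm + non-linearity pattern this unit implements (the layer API and is_training handling below are illustrative, not the module's actual code):

import tensorflow as tf

def unit3d_sketch(inputs, output_channels, kernel_shape=(1, 1, 1), stride=(1, 1, 1),
                  activation_fn=tf.nn.relu, use_batch_norm=True, use_bias=False,
                  is_training=True, name='unit_3d'):
    with tf.variable_scope(name):
        # 3-D convolution over [batch, time, height, width, channels] inputs.
        net = tf.layers.conv3d(inputs, filters=output_channels, kernel_size=kernel_shape,
                               strides=stride, padding='same', use_bias=use_bias)
        if use_batch_norm:
            net = tf.layers.batch_normalization(net, training=is_training)
        if activation_fn is not None:
            net = activation_fn(net)
        return net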

easy_vision.python.core.backbones.inception_backbone

class easy_vision.python.core.backbones.inception_backbone.InceptionBackbone(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training)[source]

Inception Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.mobilenet_backbone

class easy_vision.python.core.backbones.mobilenet_backbone.MobileNetBackbone(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

Mobilenet backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing

__init__(backbone_config, is_training)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)
get_scopes_of_levels(with_logits=True)[source]
Parameters:with_logits – whether to include the classification layer (logits) or not.
Returns:
a list of variable scope lists, ordered by level from outputs to inputs.

easy_vision.python.core.backbones.net_utils

easy_vision.python.core.backbones.net_utils.reduced_kernel_size_for_small_input(input_tensor, kernel_size)[source]

Define kernel size which is automatically reduced for small input.

If the shape of the input images is unknown at graph construction time, this function assumes that the input images are large enough.

Parameters:
  • input_tensor – input tensor of size [batch_size, height, width, channels].
  • kernel_size – desired kernel size of length 2: [kernel_height, kernel_width]
Returns:

a tensor with the kernel size.

TODO(jrru): Make this function work with unknown shapes. Theoretically, this can be done with the code below. Problems are two-fold: (1) If the shape was known, it will be lost. (2) inception.slim.ops._two_element_tuple cannot handle tensors that define the kernel size.

shape = tf.shape(input_tensor)
return tf.stack([tf.minimum(shape[1], kernel_size[0]),
                 tf.minimum(shape[2], kernel_size[1])])
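A minimal sketch of the static-shape behavior described above (the function name is illustrative):

def reduced_kernel_size_sketch(input_tensor, kernel_size):
    shape = input_tensor.get_shape().as_list()
    if shape[1] is None or shape[2] is None:
        # Spatial size unknown at graph construction time: assume the input is large enough.
        return kernel_size
    return [min(shape[1], kernel_size[0]), min(shape[2], kernel_size[1])]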

easy_vision.python.core.backbones.resnet_3d_backbone

class easy_vision.python.core.backbones.resnet_3d_backbone.RestNet3DBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

3DResNet Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.resnet_backbone

class easy_vision.python.core.backbones.resnet_backbone.ResnetBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of tensors, one for each layer's output

Return type:

activations

classmethod create_class(name)
get_scopes_of_levels(with_logits=True)[source]
Parameters:with_logits – whether to include the classification layer (logits) or not.
Returns:
a list of variable scope lists, ordered by level from outputs to inputs.

class easy_vision.python.core.backbones.resnet_backbone.ResnetBlock(block_config, batchnorm_trainable=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Block

__init__(block_config, batchnorm_trainable=False)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_block_graph(inputs)[source]

Parameters:inputs – usually a tensor of batched images
Returns:
a feature tensor

easy_vision.python.core.backbones.resnext_3d_backbone

class easy_vision.python.core.backbones.resnext_3d_backbone.RestNeXt3DBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

3D ResNeXt Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.text_resnet15_backbone

class easy_vision.python.core.backbones.text_resnet15_backbone.TextResnet15Backbone(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.vgg_backbone

class easy_vision.python.core.backbones.vgg_backbone.FCBlock(block_config, batchnorm_trainable)[source]

Bases: easy_vision.python.core.backbones.backbone.Block

__init__(block_config, batchnorm_trainable)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_block_graph(inputs)[source]

Parameters:inputs – usually a tensor of batched images
Returns:
a feature tensor

class easy_vision.python.core.backbones.vgg_backbone.VggBackbone(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training)[source]

Vgg Backbone

Parameters:
  • backbone_config – a protobuf config object
  • is_training – indicates whether to build the graph for training or for testing
build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)
get_scopes_of_levels(with_logits=True)[source]
Parameters:with_logits – whether to include the classification layer (logits) or not.
Returns:
a list of variable scope lists, ordered by level from outputs to inputs.

easy_vision.python.core.backbones.vgg_bai_backbone

class easy_vision.python.core.backbones.vgg_bai_backbone.VGGBaiBackbone(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)

easy_vision.python.core.backbones.vgg_reduce_fc

class easy_vision.python.core.backbones.vgg_reduce_fc.Vgg16ReduceFc(backbone_config, is_training)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)
get_scopes_of_levels(with_logits=False)[source]
Parameters:with_logits – whether to include the classification layer (logits) or not.
Returns:
a list of variable scope lists, ordered by level from outputs to inputs.

easy_vision.python.core.backbones.vgg_reduce_fc.origin_vgg_arg_scope(is_training, weight_decay=0.0005)[source]

Defines the VGG arg scope.

Parameters:weight_decay – The l2 regularization coefficient.
Returns:An arg_scope.
easy_vision.python.core.backbones.vgg_reduce_fc.revised_vgg_arg_scope(is_training, weight_decay=0.0005)[source]

Defines the VGG arg scope.

Parameters:weight_decay – The l2 regularization coefficient.

Returns:
An arg_scope.
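A hedged sketch of a VGG-style arg_scope (modeled on tf.contrib.slim's vgg_arg_scope; the is_training handling of the easy_vision variants is not shown):

import tensorflow as tf
slim = tf.contrib.slim  # TensorFlow 1.x assumed

def vgg_arg_scope_sketch(weight_decay=0.0005):
    # Apply ReLU, L2 weight regularization and zero bias init to conv/fc layers,
    # and SAME padding to convolutions.
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_regularizer=slim.l2_regularizer(weight_decay),
                        biases_initializer=tf.zeros_initializer()):
        with slim.arg_scope([slim.conv2d], padding='SAME') as arg_sc:
            return arg_sc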

easy_vision.python.core.backbones.xception_backbone

class easy_vision.python.core.backbones.xception_backbone.XceptionBackbone(backbone_config, is_training=False)[source]

Bases: easy_vision.python.core.backbones.backbone.Backbone

__init__(backbone_config, is_training=False)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

build_graph(inputs, inputs_true_shape=None, num_classes=None)[source]
Parameters:
  • inputs – tensor of batched images of shape [batch_size, height, width, channels].
  • inputs_true_shape – true shape of the batched images, [batch_size, 3].
  • num_classes – the number of predicted classes. If set to None, the logits layer is omitted.
Returns:

a dict of feature layers { layer_name: layer_feature, … }

classmethod create_class(name)