easy_vision.python.utils

easy_vision.python.utils.ac_utils

Logic to convert ac_config from proto to CompressConfig

easy_vision.python.utils.ac_utils.convert_ac_conifg(ac_config)[source]

easy_vision.python.utils.category_util

Functions for importing/exporting Object Detection categories.

easy_vision.python.utils.category_util.load_categories_from_csv_file(csv_path)[source]

Loads categories from a csv file.

The CSV file should have one comma-delimited numeric category id and string category name pair per line. For example:

0,"cat"
1,"dog"
2,"bird"
...

Parameters:csv_path – Path to the csv file to be parsed into categories.
Returns:
A list of dictionaries representing all possible categories.
The categories will contain an integer ‘id’ field and a string ‘name’ field.
Return type:categories
Raises:ValueError – If the csv file is incorrectly formatted.
easy_vision.python.utils.category_util.save_categories_to_csv_file(categories, csv_path)[source]

Saves categories to a csv file.

Parameters:
  • categories – A list of dictionaries representing categories to save to file. Each category must contain an ‘id’ and ‘name’ field.
  • csv_path – Path to the csv file to be parsed into categories.
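
A minimal usage sketch of the two functions above (the file path is hypothetical):

from easy_vision.python.utils import category_util

categories = [{'id': 0, 'name': 'cat'},
              {'id': 1, 'name': 'dog'}]
category_util.save_categories_to_csv_file(categories, '/tmp/categories.csv')
loaded = category_util.load_categories_from_csv_file('/tmp/categories.csv')
# loaded is again a list of dicts with integer 'id' and string 'name' fields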

easy_vision.python.utils.classification_vis_util

class easy_vision.python.utils.classification_vis_util.VisualizeClassification(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='ClassificationVis')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for classification visualizations.

__init__(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='ClassificationVis')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters:eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  images: a uint8 tensor of batched images with shape NxHxWxC
  image_shapes: an int64 tensor with shape Nx3
  image_ids: a string tensor with shape N

easy_vision.python.utils.config_util

Functions for reading and updating configuration files.

easy_vision.python.utils.config_util.convert_to_pai_config_template(pipeline_config_path, dst_pipeline_config_path, exp_dir='EXP_DIR')[source]

Convert local configs to configs that can be run on PAI using the public pai-vision-data bucket:
  1. convert local data paths to OSS data paths, oss://pai-vision-data-hz/data/xxx
  2. convert pretrained model paths to OSS paths, oss://pai-vision-data-hz/pretrained_models/xxx
  3. convert the experiment dir path to the template OSS path, oss://EXP_DIR/xxx

easy_vision.python.utils.config_util.create_pipeline_proto_from_configs(configs)[source]

Creates a pipeline_pb2.CVEstimator from configs dictionary.

This function performs the inverse operation of create_configs_from_pipeline_proto().

Parameters:configs – Dictionary of configs. See get_configs_from_pipeline_file().
Returns:A fully populated pipeline_pb2.CVEstimator.
easy_vision.python.utils.config_util.get_configs_from_pipeline_file(pipeline_config_path)[source]

Reads config from a file containing pipeline_pb2.CVEstimator.

Parameters:pipeline_config_path – Path to pipeline_pb2.CVEstimator text proto.
Returns:
Dictionary of configuration objects. Keys are model, train_config, train_input_config, eval_config, eval_input_config. Values are the corresponding config objects.
easy_vision.python.utils.config_util.get_graph_rewriter_config_from_file(graph_rewriter_config_file)[source]

Parses config for graph rewriter.

Parameters:graph_rewriter_config_file – file path to the graph rewriter config.
Returns:graph_rewriter_pb2.GraphRewriter proto
easy_vision.python.utils.config_util.get_learning_rate_type(optimizer_config)[source]

Returns the learning rate type for training.

Parameters:optimizer_config – An optimizer_pb2.Optimizer.
Returns:The type of the learning rate.
easy_vision.python.utils.config_util.get_number_of_classes(model_config)[source]

Returns the number of classes for a detection model.

Parameters:model_config – A model_pb2.DetectionModel.
Returns:Number of classes.
Raises:ValueError – If the model type is not recognized.
easy_vision.python.utils.config_util.get_optimizer_type(train_config)[source]

Returns the optimizer type for training.

Parameters:train_config – A train_pb2.TrainConfig.
Returns:The type of the optimizer
easy_vision.python.utils.config_util.get_spatial_image_size(image_resizer_config)[source]

Returns expected spatial size of the output image from a given config.

Parameters:image_resizer_config – An image_resizer_pb2.ImageResizer.
Returns:A list of two integers of the form [height, width]. height and width are set to -1 if they cannot be determined during graph construction.
Raises:ValueError – If the model type is not recognized.
easy_vision.python.utils.config_util.merge_external_params_with_configs(configs, hparams=None, **kwargs)[source]

Updates configs dictionary based on supplied parameters.

This utility is for modifying specific fields in the object detection configs. Say that one would like to experiment with different learning rates, momentum values, or batch sizes. Rather than creating a new config text file for each experiment, one can use a single base config file, and update particular values.

Parameters:
  • configs – Dictionary of configuration objects. See outputs from get_configs_from_pipeline_file() or get_configs_from_multiple_files().
  • hparams – A HParams.
  • **kwargs – Extra keyword arguments that are treated the same way as attribute/value pairs in hparams. Note that hyperparameters with the same names will override keyword arguments.
Returns:

configs dictionary.
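
A minimal sketch of the edit-and-save round trip these utilities support (the config path is hypothetical, and the learning_rate override key is an assumption about which fields merge_external_params_with_configs recognizes):

from easy_vision.python.utils import config_util

configs = config_util.get_configs_from_pipeline_file('pipeline.config')
# Override a hyperparameter without editing the file by hand.
configs = config_util.merge_external_params_with_configs(
    configs, learning_rate=0.01)
pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_proto, '/tmp/exp_dir')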

easy_vision.python.utils.config_util.save_message(protobuf_message, filename)[source]

Saves a protobuf message to disk as a text file.

Parameters:
  • protobuf_message – The protobuf message to save.
  • filename – Path of the output text file.
easy_vision.python.utils.config_util.save_pipeline_config(pipeline_config, directory, filename='pipeline.config')[source]

Saves a pipeline config text file to disk.

Parameters:
  • pipeline_config – A pipeline_pb2.TrainEvalPipelineConfig.
  • directory – The model directory into which the pipeline config file will be saved.
  • filename – pipeline config filename

easy_vision.python.utils.context_manager

Python context management helper.

class easy_vision.python.utils.context_manager.IdentityContextManager[source]

Bases: object

Returns an identity context manager that does nothing.

This is helpful in setting up a conditional with statement, as below:

with slim.arg_scope(x) if use_slim_scope else IdentityContextManager():
    do_stuff()

easy_vision.python.utils.convert_config_generator

class easy_vision.python.utils.convert_config_generator.ClassificationConvertConfigGenerator(params, data_prefix='')[source]

Bases: easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator

generate convert config for classification

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
set_class_map(config, class_list, class_list_file, default_class)[source]
class easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator(params, data_prefix='')[source]

Bases: object

Base class to generate convert config

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
edit(config)[source]

call all set_* methods in ConfigEditor to edit config with the args in kwargs_in

generate()[source]
Returns:a DataConfig protobuf object
set_converter_class(config, converter_class)[source]
set_error_class(config, error_class)[source]
set_generator_cls(config, generator_class)[source]
set_ignore_class(config, ignore_class)[source]
set_image_format(config, image_format)[source]
set_image_size(config, max_image_size, max_test_image_size)[source]
set_model_type(config, model_type)[source]
set_num_samples_per_tfrecord(config, num_samples_per_tfrecord)[source]
set_oss_config(config, oss_config)[source]
set_parallel_num(config, read_parallel_num, write_parallel_num)[source]
set_queue_size(config, queue_size)[source]
set_separator(config, separator)[source]
set_task_id(config, task_id)[source]
set_test_ratio(config, test_ratio)[source]
set_user_defined_converter_path(config, user_defined_converter_path)[source]
set_user_defined_generator_path(config, user_defined_generator_path)[source]
class easy_vision.python.utils.convert_config_generator.DetectionConvertConfigGenerator(params, data_prefix='')[source]

Bases: easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator

generate convert config for detection and text detection

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
set_class_map(config, class_list, class_list_file, default_class)[source]
set_min_bbox_size(config, min_bbox_size)[source]
class easy_vision.python.utils.convert_config_generator.SegmentationConvertConfigGenerator(params, data_prefix='')[source]

Bases: easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator

generate convert config for segmentation

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
set_class_map(config, class_list, class_list_file, default_class)[source]
class easy_vision.python.utils.convert_config_generator.SelfDefinedConvertConfigGenerator(params, data_prefix='')[source]

Bases: easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator

generate convert config for self-defined

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
set_class_map(config, class_list, class_list_file, default_class)[source]
class easy_vision.python.utils.convert_config_generator.TextConvertConfigGenerator(params, data_prefix='')[source]

Bases: easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator

generate convert config for text

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
set_char_replace_map_path(config, char_replace_map_path)[source]
set_class_map(config, class_list, class_list_file, default_class)[source]
set_default_char_dict_path(config, default_char_dict_path)[source]
class easy_vision.python.utils.convert_config_generator.TextEnd2EndConvertConfigGenerator(params, data_prefix='')[source]

Bases: easy_vision.python.utils.convert_config_generator.TextConvertConfigGenerator

generate convert config for text end2end

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
set_class_map(config, class_list, class_list_file, default_class)[source]
set_ignore_recog_class(config, ignore_recog_class)[source]
set_min_bbox_size(config, min_bbox_size)[source]
class easy_vision.python.utils.convert_config_generator.VideoConvertConfigGenerator(params, data_prefix='')[source]

Bases: easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator

generate convert config for video

__init__(params, data_prefix='')[source]
Parameters:
  • params – a dict of params
  • data_prefix – data path prefix for data and pretrained models, e.g. an OSS path oss://pai-vision-data/ or a local path /home/user/data
set_class_map(config, class_list, class_list_file, default_class)[source]
set_decode_batch_size(config, decode_batch_size)[source]
set_decode_keep_size(config, decode_keep_size)[source]
set_decode_type(config, decode_type)[source]
set_optical_flow(config, optical_flow)[source]
set_reshape_size(config, reshape_height, reshape_width)[source]
set_sample_fps(config, sample_fps)[source]
easy_vision.python.utils.convert_config_generator.create_generator(convert_param_config, data_prefix='')[source]

easy_vision.python.utils.dataset_util

Utility functions for creating TFRecord data sets.

easy_vision.python.utils.dataset_util.bytes_feature(value)[source]
easy_vision.python.utils.dataset_util.bytes_list_feature(value)[source]
easy_vision.python.utils.dataset_util.float_list_feature(value)[source]
easy_vision.python.utils.dataset_util.int64_feature(value)[source]
easy_vision.python.utils.dataset_util.int64_list_feature(value)[source]
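
A minimal sketch of packing one record into a tf.train.Example with the feature helpers above (the feature keys and placeholder values are hypothetical):

import tensorflow as tf
from easy_vision.python.utils import dataset_util

encoded_image = b'...'  # raw JPEG/PNG bytes; placeholder here
example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': dataset_util.bytes_feature(encoded_image),
    'image/height': dataset_util.int64_feature(480),
    'image/width': dataset_util.int64_feature(640),
    'image/object/bbox/ymin': dataset_util.float_list_feature([0.1, 0.4]),
    'image/object/class/label': dataset_util.int64_list_feature([1, 2]),
}))
serialized = example.SerializeToString()
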
easy_vision.python.utils.dataset_util.make_initializable_iterator(dataset)[source]

Creates an iterator, and initializes tables.

This is useful in cases where make_one_shot_iterator wouldn’t work because the graph contains a hash table that needs to be initialized.

Parameters:dataset – A tf.data.Dataset object.
Returns:A tf.data.Iterator.
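
A usage sketch in TF1-style graph mode, assuming the iterator's initializer is registered with the table initializers (which is why tf.tables_initializer() suffices below):

import tensorflow as tf
from easy_vision.python.utils import dataset_util

dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
iterator = dataset_util.make_initializable_iterator(dataset)
next_elem = iterator.get_next()
with tf.Session() as sess:
    sess.run(tf.tables_initializer())  # assumed to also run the iterator initializer
    print(sess.run(next_elem))  # 1
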
easy_vision.python.utils.dataset_util.read_dataset(file_read_func, decode_func, input_files, config)[source]

Reads a dataset, and handles repetition and shuffling.

Parameters:
  • file_read_func – Function to use in tf.data.Dataset.interleave, to read every individual file into a tf.data.Dataset.
  • decode_func – Function to apply to all records.
  • input_files – A list of file paths to read.
  • config – A input_reader_builder.InputReader object.
Returns:

A tf.data.Dataset based on config.

easy_vision.python.utils.dataset_util.read_examples_list(path)[source]

Read list of training or validation examples.

The file is assumed to contain a single example per line where the first token in the line is an identifier that allows us to find the image and annotation xml for that example.

For example, the line: xyz 3 would allow us to find files xyz.jpg and xyz.xml (the 3 would be ignored).

Parameters:path – absolute path to examples list file.
Returns:list of example identifiers (strings).
easy_vision.python.utils.dataset_util.recursive_parse_xml_to_dict(xml)[source]

Recursively parses XML contents to python dict.

We assume that object tags are the only ones that can appear multiple times at the same level of a tree.

Parameters:xml – xml tree obtained by parsing XML file contents using lxml.etree
Returns:Python dictionary holding XML contents.
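
A minimal sketch with an inline Pascal VOC-style snippet (the XML content is hypothetical):

from lxml import etree
from easy_vision.python.utils import dataset_util

xml_str = ('<annotation><filename>xyz.jpg</filename>'
           '<object><name>cat</name></object>'
           '<object><name>dog</name></object></annotation>')
xml = etree.fromstring(xml_str)
data = dataset_util.recursive_parse_xml_to_dict(xml)['annotation']
# data['object'] is a list, since object tags may repeat at the same level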

easy_vision.python.utils.im_util

Utils for images

easy_vision.python.utils.im_util.imdecode(image_data, exifrotate=False, **kwargs)[source]

decode image data

Parameters:
  • image_data – file-like object or byte string
  • exifrotate – If false, do not rotate the image according to EXIF’s orientation flag.
  • kwargs – imdecode params
easy_vision.python.utils.im_util.imsave(dst_file, image_data)[source]

save image data to file

Parameters:
  • dst_file – filename to save
  • image_data – numpy array, RGB format
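
A decode/save round-trip sketch (file names are hypothetical; imdecode is assumed to return an RGB numpy array compatible with imsave):

from easy_vision.python.utils import im_util

with open('input.jpg', 'rb') as f:
    image = im_util.imdecode(f.read(), exifrotate=True)
im_util.imsave('output.jpg', image)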

easy_vision.python.utils.json_utils

Utilities for dealing with writing json strings.

json_utils wraps json.dump and json.dumps so that they can be used to safely control the precision of floats when writing to json strings or files.

class easy_vision.python.utils.json_utils.MyEncoder(skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, encoding='utf-8', default=None)[source]

Bases: json.encoder.JSONEncoder

default(o)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
iterencode(o, _one_shot=False)[source]

Encode the given object and yield each string representation as available.

For example:

for chunk in JSONEncoder().iterencode(bigobject):
    mysocket.write(chunk)
easy_vision.python.utils.json_utils.PrettyParams(**params)[source]

Returns parameters for use with Dump and Dumps to output pretty json.

Example usage:

json_str = json_utils.Dumps(obj, **json_utils.PrettyParams())
json_str = json_utils.Dumps(obj, **json_utils.PrettyParams(allow_nans=False))
Parameters:**params – Additional params to pass to json.dump or json.dumps.
Returns:
Parameters that are compatible with json_utils.Dump and
json_utils.Dumps.
Return type:params
easy_vision.python.utils.json_utils.compat_dumps(data, float_digits=-1)[source]

Handle json dumps of Chinese text and numpy data.

Parameters:
  • data – python data structure
  • float_digits – The number of digits of precision when writing floats out.
Returns:
  a json str; in python2 the str is encoded with utf8, in python3 the str is unicode (python3 str)
easy_vision.python.utils.json_utils.dump(obj, fid, float_digits=-1, **params)[source]

Wrapper of json.dump that allows specifying the float precision used.

Parameters:
  • obj – The object to dump.
  • fid – The file id to write to.
  • float_digits – The number of digits of precision when writing floats out.
  • **params – Additional parameters to pass to json.dumps.
easy_vision.python.utils.json_utils.dumps(obj, float_digits=-1, **params)[source]

Wrapper of json.dumps that allows specifying the float precision used.

Parameters:
  • obj – The object to dump.
  • float_digits – The number of digits of precision when writing floats out.
  • **params – Additional parameters to pass to json.dumps.
Returns:

JSON string representation of obj.

Return type:

output
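
A minimal sketch of controlling float precision (the exact rounding shown in the comment is an assumption):

from easy_vision.python.utils import json_utils

json_str = json_utils.dumps({'score': 0.123456789}, float_digits=3)
# e.g. '{"score": 0.123}'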

easy_vision.python.utils.label_map_util

Label map utility functions.

easy_vision.python.utils.label_map_util.create_category_index(categories)[source]

Creates dictionary of COCO compatible categories keyed by category id.

Parameters:categories – a list of dicts, each of which has the following keys:
  'id': (required) an integer id uniquely identifying this category.
  'name': (required) string representing category name, e.g., 'cat', 'dog', 'pizza'.
  'ignore_recog': (optional) default false; whether to ignore recognition in the end2end model.
Returns:
a dict containing the same entries as categories, but keyed
by the ‘id’ field of each category.
Return type:category_index
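
A minimal sketch of the mapping described above:

from easy_vision.python.utils import label_map_util

categories = [{'id': 1, 'name': 'dog'}, {'id': 2, 'name': 'cat'}]
category_index = label_map_util.create_category_index(categories)
# category_index == {1: {'id': 1, 'name': 'dog'}, 2: {'id': 2, 'name': 'cat'}}
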
easy_vision.python.utils.label_map_util.create_category_index_from_labelmap(label_map_path)[source]

Reads a label map and returns a category index.

Parameters:label_map_path – Path to StringIntLabelMap proto text file.
Returns:A category index, which is a dictionary that maps integer ids to dicts containing categories, e.g. {1: {‘id’: 1, ‘name’: ‘dog’}, 2: {‘id’: 2, ‘name’: ‘cat’}, …}
easy_vision.python.utils.label_map_util.create_class_agnostic_category_index()[source]

Creates a category index with a single object class.

easy_vision.python.utils.label_map_util.create_ignore_recog_class_ids(categories)[source]

Load ignore recognition class ids from a list of categories.

Parameters:categories – a list of dicts, each of which has the following keys:
  'id': (required) an integer id uniquely identifying this category.
  'name': (required) string representing category name, e.g., 'cat', 'dog', 'pizza'.
  'ignore_recog': (optional) default false; whether to ignore recognition in the end2end model.
Returns:a list of ignore recognition class ids
easy_vision.python.utils.label_map_util.get_label_map_dict(label_map_path, use_display_name=False)[source]

Reads a label map and returns a dictionary of label names to id.

Parameters:
  • label_map_path – path to label_map.
  • use_display_name – whether to use the label map items’ display names as keys.
Returns:

A dictionary mapping label names to id.

easy_vision.python.utils.label_map_util.get_max_label_map_index(label_map)[source]

Get maximum index in label map.

Parameters:label_map – a StringIntLabelMapProto
Returns:an integer
easy_vision.python.utils.label_map_util.timeit_verbose(f)

Decorator that prints the execution time of f.

easy_vision.python.utils.np_box_list

Numpy BoxList classes and functions.

class easy_vision.python.utils.np_box_list.BoxList(data)[source]

Bases: object

Box collection.

BoxList represents a list of bounding boxes as numpy array, where each bounding box is represented as a row of 4 numbers, [y_min, x_min, y_max, x_max]. It is assumed that all bounding boxes within a given list correspond to a single image.

Optionally, users can add additional related fields (such as objectness/classification scores).

__init__(data)[source]

Constructs box collection.

Parameters:

data – a numpy array of shape [N, 4] representing box coordinates

Raises:
  • ValueError – if bbox data is not a numpy array
  • ValueError – if invalid dimensions for bbox data
add_field(field, field_data)[source]

Add data to a specified field.

Parameters:
  • field – a string parameter used to specify a related field to be accessed.
  • field_data – a numpy array of [N, …] representing the data associated with the field.
Raises:

ValueError – if the field already exists or the dimension of the field data does not match the number of boxes.

get()[source]

Convenience function for accessing box coordinates.

Returns:a numpy array of shape [N, 4] representing box corners
get_coordinates()[source]

Get corner coordinates of boxes.

Returns:a list of 4 1-d numpy arrays [y_min, x_min, y_max, x_max]
get_extra_fields()[source]

Return all non-box fields.

get_field(field)[source]

Accesses data associated with the specified field in the box collection.

Parameters:field – a string parameter used to specify a related field to be accessed.
Returns:a numpy 1-d array representing data of an associated field
Raises:ValueError – if invalid field
has_field(field)[source]
num_boxes()[source]

Return the number of boxes held in the collection.
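
A minimal usage sketch of the BoxList API above:

import numpy as np
from easy_vision.python.utils import np_box_list

boxes = np.array([[0.0, 0.0, 1.0, 1.0],
                  [0.25, 0.25, 0.75, 0.75]], dtype=np.float32)
boxlist = np_box_list.BoxList(boxes)
boxlist.add_field('scores', np.array([0.9, 0.6], dtype=np.float32))
print(boxlist.num_boxes())          # 2
print(boxlist.get_field('scores'))  # [0.9 0.6]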

easy_vision.python.utils.np_box_list_ops

Bounding Box List operations for Numpy BoxLists.

Example box operations that are supported:
  • Areas: compute bounding box areas
  • IOU: pairwise intersection-over-union scores
class easy_vision.python.utils.np_box_list_ops.SortOrder[source]

Bases: object

Enum class for sort order.

ascend

ascend order.

descend

descend order.

ASCEND = 1
DESCEND = 2
easy_vision.python.utils.np_box_list_ops.area(boxlist)[source]

Computes area of boxes.

Parameters:boxlist – BoxList holding N boxes
Returns:a numpy array with shape [N*1] representing box areas
easy_vision.python.utils.np_box_list_ops.change_coordinate_frame(boxlist, window)[source]

Change coordinate frame of the boxlist to be relative to window’s frame.

Given a window of the form [ymin, xmin, ymax, xmax], changes bounding box coordinates from boxlist to be relative to this window (e.g., the min corner maps to (0,0) and the max corner maps to (1,1)).

An example use case is data augmentation: where we are given groundtruth boxes (boxlist) and would like to randomly crop the image to some window (window). In this case we need to change the coordinate frame of each groundtruth box to be relative to this new window.

Parameters:
  • boxlist – A BoxList object holding N boxes.
  • window – a size 4 1-D numpy array.
Returns:

A BoxList object with N boxes.

easy_vision.python.utils.np_box_list_ops.clip_to_window(boxlist, window)[source]

Clip bounding boxes to a window.

This op clips input bounding boxes (represented by bounding box corners) to a window, optionally filtering out boxes that do not overlap at all with the window.

Parameters:
  • boxlist – BoxList holding M_in boxes
  • window – a numpy array of shape [4] representing the [y_min, x_min, y_max, x_max] window to which the op should clip boxes.
Returns:

a BoxList holding M_out boxes where M_out <= M_in

easy_vision.python.utils.np_box_list_ops.concatenate(boxlists, fields=None)[source]

Concatenate list of BoxLists.

This op concatenates a list of input BoxLists into a larger BoxList. It also handles concatenation of BoxList fields as long as the field tensor shapes are equal except for the first dimension.

Parameters:
  • boxlists – list of BoxList objects
  • fields – optional list of fields to also concatenate. By default, all fields from the first BoxList in the list are included in the concatenation.
Returns:

a BoxList with number of boxes equal to

sum([boxlist.num_boxes() for boxlist in boxlists])

Raises:

ValueError – if boxlists is invalid (i.e., is not a list, is empty, or contains non BoxList objects), or if requested fields are not contained in all boxlists

easy_vision.python.utils.np_box_list_ops.filter_scores_greater_than(boxlist, thresh)[source]

Filter to keep only boxes with score exceeding a given threshold.

This op keeps the collection of boxes whose corresponding scores are greater than the input threshold.

Parameters:
  • boxlist – BoxList holding N boxes. Must contain a ‘scores’ field representing detection scores.
  • thresh – scalar threshold
Returns:

a BoxList holding M boxes where M <= N

Raises:

ValueError – if boxlist not a BoxList object or if it does not have a scores field

easy_vision.python.utils.np_box_list_ops.gather(boxlist, indices, fields=None)[source]

Gather boxes from BoxList according to indices and return new BoxList.

By default, gather returns boxes corresponding to the input index list, as well as all additional fields stored in the boxlist (indexing into the first dimension). However one can optionally only gather from a subset of fields.

Parameters:
  • boxlist – BoxList holding N boxes
  • indices – a 1-d numpy array of type int_
  • fields – (optional) list of fields to also gather from. If None (default), all fields are gathered from. Pass an empty fields list to only gather the box coordinates.
Returns:

a BoxList corresponding to the subset of the input BoxList

specified by indices

Return type:

subboxlist

Raises:

ValueError – if specified field is not contained in boxlist or if the indices are not of type int_

easy_vision.python.utils.np_box_list_ops.intersection(boxlist1, boxlist2)[source]

Compute pairwise intersection areas between boxes.

Parameters:
  • boxlist1 – BoxList holding N boxes
  • boxlist2 – BoxList holding M boxes
Returns:

a numpy array with shape [N*M] representing pairwise intersection area

easy_vision.python.utils.np_box_list_ops.ioa(boxlist1, boxlist2)[source]

Computes pairwise intersection-over-area between box collections.

Intersection-over-area (ioa) between two boxes box1 and box2 is defined as their intersection area over box2’s area. Note that ioa is not symmetric, that is, IOA(box1, box2) != IOA(box2, box1).

Parameters:
  • boxlist1 – BoxList holding N boxes
  • boxlist2 – BoxList holding M boxes
Returns:

a numpy array with shape [N, M] representing pairwise ioa scores.

easy_vision.python.utils.np_box_list_ops.iou(boxlist1, boxlist2)[source]

Computes pairwise intersection-over-union between box collections.

Parameters:
  • boxlist1 – BoxList holding N boxes
  • boxlist2 – BoxList holding M boxes
Returns:

a numpy array with shape [N, M] representing pairwise iou scores.

easy_vision.python.utils.np_box_list_ops.multi_class_non_max_suppression(boxlist, score_thresh, iou_thresh, max_output_size)[source]

Multi-class version of non maximum suppression.

This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. It operates independently for each class for which scores are provided (via the scores field of the input box_list), pruning boxes with score less than a provided threshold prior to applying NMS.

Parameters:
  • boxlist – BoxList holding N boxes. Must contain a ‘scores’ field representing detection scores. This scores field is a tensor that can be 1 dimensional (in the case of a single class) or 2-dimensional, in which case we assume that it takes the shape [num_boxes, num_classes]. We further assume that this rank is known statically and that scores.shape[1] is also known (i.e., the number of classes is fixed and known at graph construction time).
  • score_thresh – scalar threshold for score (low scoring boxes are removed).
  • iou_thresh – scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed).
  • max_output_size – maximum number of retained boxes per class.
Returns:

a BoxList holding M boxes with a rank-1 scores field representing

corresponding scores for each box with scores sorted in decreasing order and a rank-1 classes field representing a class label for each box.

Raises:

ValueError – if iou_thresh is not in [0, 1] or if input boxlist does not have a valid scores field.

easy_vision.python.utils.np_box_list_ops.non_max_suppression(boxlist, max_output_size=10000, iou_threshold=1.0, score_threshold=-10.0)[source]

Non maximum suppression.

This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. In each iteration, the detected bounding box with highest score in the available pool is selected.

Parameters:
  • boxlist – BoxList holding N boxes. Must contain a ‘scores’ field representing detection scores. All scores belong to the same class.
  • max_output_size – maximum number of retained boxes
  • iou_threshold – intersection over union threshold.
  • score_threshold – minimum score threshold. Remove the boxes with scores less than this value. Default value is set to -10. A very low threshold to pass pretty much all the boxes, unless the user sets a different score threshold.
Returns:

a BoxList holding M boxes where M <= max_output_size

Raises:
easy_vision.python.utils.np_box_list_ops.prune_non_overlapping_boxes(boxlist1, boxlist2, minoverlap=0.0)[source]

Prunes the boxes in boxlist1 that overlap less than thresh with boxlist2.

For each box in boxlist1, we want its IOA to be more than minoverlap with at least one of the boxes in boxlist2. If it does not, we remove it.

Parameters:
  • boxlist1 – BoxList holding N boxes.
  • boxlist2 – BoxList holding M boxes.
  • minoverlap – Minimum required overlap between boxes, to count them as overlapping.
Returns:

A pruned boxlist with size [N’, 4].

easy_vision.python.utils.np_box_list_ops.prune_outside_window(boxlist, window)[source]

Prunes bounding boxes that fall outside a given window.

This function prunes bounding boxes that even partially fall outside the given window. See also ClipToWindow which only prunes bounding boxes that fall completely outside the window, and clips any bounding boxes that partially overflow.

Parameters:
  • boxlist – a BoxList holding M_in boxes.
  • window – a numpy array of size 4, representing [ymin, xmin, ymax, xmax] of the window.
Returns:

pruned_corners: a tensor with shape [M_out, 4] where M_out <= M_in
valid_indices: a tensor with shape [M_out] indexing the valid bounding boxes in the input tensor

easy_vision.python.utils.np_box_list_ops.scale(boxlist, y_scale, x_scale)[source]

Scale box coordinates in x and y dimensions.

Parameters:
  • boxlist – BoxList holding N boxes
  • y_scale – float
  • x_scale – float
Returns:

BoxList holding N boxes

Return type:

boxlist

easy_vision.python.utils.np_box_list_ops.sort_by_field(boxlist, field, order=2)[source]

Sort boxes and associated fields according to a scalar field.

A common use case is reordering the boxes according to descending scores.

Parameters:
  • boxlist – BoxList holding N boxes.
  • field – A BoxList field for sorting and reordering the BoxList.
  • order – (Optional) ‘descend’ or ‘ascend’. Default is descend.
Returns:

A sorted BoxList with the field in the specified order.

Return type:

sorted_boxlist

Raises:
  • ValueError – if specified field does not exist or is not of single dimension.
  • ValueError – if the order is not either descend or ascend.
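
A minimal single-class NMS sketch with the ops above (threshold values chosen for illustration):

import numpy as np
from easy_vision.python.utils import np_box_list, np_box_list_ops

boxes = np.array([[0.0, 0.0, 1.0, 1.0],
                  [0.05, 0.05, 1.0, 1.0],
                  [0.5, 0.5, 0.9, 0.9]], dtype=np.float32)
boxlist = np_box_list.BoxList(boxes)
boxlist.add_field('scores', np.array([0.9, 0.8, 0.7], dtype=np.float32))
kept = np_box_list_ops.non_max_suppression(
    boxlist, max_output_size=10, iou_threshold=0.5, score_threshold=0.1)
# the second box overlaps the first with IOU > 0.5 and is pruned
print(kept.num_boxes())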

easy_vision.python.utils.np_box_mask_list

Numpy BoxMaskList classes and functions.

class easy_vision.python.utils.np_box_mask_list.BoxMaskList(box_data, mask_data)[source]

Bases: easy_vision.python.utils.np_box_list.BoxList

Convenience wrapper for BoxList with masks.

BoxMaskList extends the np_box_list.BoxList to contain masks as well. In particular, its constructor receives both boxes and masks. Note that the masks correspond to the full image.

__init__(box_data, mask_data)[source]

Constructs box collection.

Parameters:
  • box_data – a numpy array of shape [N, 4] representing box coordinates
  • mask_data – a numpy array of shape [N, height, width] representing masks with values in {0, 1}. The masks correspond to the full image. The height and the width will be equal to image height and width.
Raises:
  • ValueError – if bbox data is not a numpy array
  • ValueError – if invalid dimensions for bbox data
  • ValueError – if mask data is not a numpy array
  • ValueError – if invalid dimension for mask data
get_masks()[source]

Convenience function for accessing masks.

Returns:a numpy array of shape [N, height, width] representing masks

easy_vision.python.utils.np_box_mask_list_ops

Operations for np_box_mask_list.BoxMaskList.

Example box operations that are supported:
  • Areas: compute bounding box areas
  • IOU: pairwise intersection-over-union scores
easy_vision.python.utils.np_box_mask_list_ops.area(box_mask_list)[source]

Computes area of masks.

Parameters:box_mask_list – np_box_mask_list.BoxMaskList holding N boxes and masks
Returns:a numpy array with shape [N*1] representing mask areas
easy_vision.python.utils.np_box_mask_list_ops.box_list_to_box_mask_list(boxlist)[source]

Converts a BoxList containing ‘masks’ into a BoxMaskList.

Parameters:boxlist – An np_box_list.BoxList object.
Returns:An np_box_mask_list.BoxMaskList object.
Raises:ValueError – If boxlist does not contain masks as a field.
easy_vision.python.utils.np_box_mask_list_ops.concatenate(box_mask_lists, fields=None)[source]

Concatenate list of box_mask_lists.

This op concatenates a list of input box_mask_lists into a larger box_mask_list. It also handles concatenation of box_mask_list fields as long as the field tensor shapes are equal except for the first dimension.

Parameters:
  • box_mask_lists – list of np_box_mask_list.BoxMaskList objects
  • fields – optional list of fields to also concatenate. By default, all fields from the first BoxMaskList in the list are included in the concatenation.
Returns:

a box_mask_list with number of boxes equal to

sum([box_mask_list.num_boxes() for box_mask_list in box_mask_lists])

Raises:

ValueError – if box_mask_lists is invalid (i.e., is not a list, is empty, or contains non box_mask_list objects), or if requested fields are not contained in all box_mask_lists

easy_vision.python.utils.np_box_mask_list_ops.filter_scores_greater_than(box_mask_list, thresh)[source]

Filter to keep only boxes and masks with score exceeding a given threshold.

This op keeps the collection of boxes and masks whose corresponding scores are greater than the input threshold.

Parameters:
  • box_mask_list – BoxMaskList holding N boxes and masks. Must contain a ‘scores’ field representing detection scores.
  • thresh – scalar threshold
Returns:

a BoxMaskList holding M boxes and masks where M <= N

Raises:

ValueError – if box_mask_list not a np_box_mask_list.BoxMaskList object or if it does not have a scores field

easy_vision.python.utils.np_box_mask_list_ops.gather(box_mask_list, indices, fields=None)[source]

Gather boxes from np_box_mask_list.BoxMaskList according to indices.

By default, gather returns boxes corresponding to the input index list, as well as all additional fields stored in the box_mask_list (indexing into the first dimension). However one can optionally only gather from a subset of fields.

Parameters:
  • box_mask_list – np_box_mask_list.BoxMaskList holding N boxes
  • indices – a 1-d numpy array of type int_
  • fields – (optional) list of fields to also gather from. If None (default), all fields are gathered from. Pass an empty fields list to only gather the box coordinates.
Returns:

a np_box_mask_list.BoxMaskList corresponding to the subset

of the input box_mask_list specified by indices

Return type:

subbox_mask_list

Raises:

ValueError – if specified field is not contained in box_mask_list or if the indices are not of type int_

easy_vision.python.utils.np_box_mask_list_ops.intersection(box_mask_list1, box_mask_list2)[source]

Compute pairwise intersection areas between masks.

Parameters:
  • box_mask_list1 – BoxMaskList holding N boxes and masks
  • box_mask_list2 – BoxMaskList holding M boxes and masks
Returns:

a numpy array with shape [N*M] representing pairwise intersection area

easy_vision.python.utils.np_box_mask_list_ops.ioa(box_mask_list1, box_mask_list2)[source]

Computes pairwise intersection-over-area between box and mask collections.

Intersection-over-area (ioa) between two masks mask1 and mask2 is defined as their intersection area over mask2’s area. Note that ioa is not symmetric, that is, IOA(mask1, mask2) != IOA(mask2, mask1).

Parameters:
  • box_mask_list1 – np_box_mask_list.BoxMaskList holding N boxes and masks
  • box_mask_list2 – np_box_mask_list.BoxMaskList holding M boxes and masks
Returns:

a numpy array with shape [N, M] representing pairwise ioa scores.

easy_vision.python.utils.np_box_mask_list_ops.iou(box_mask_list1, box_mask_list2)[source]

Computes pairwise intersection-over-union between box and mask collections.

Parameters:
  • box_mask_list1 – BoxMaskList holding N boxes and masks
  • box_mask_list2 – BoxMaskList holding M boxes and masks
Returns:

a numpy array with shape [N, M] representing pairwise iou scores.

easy_vision.python.utils.np_box_mask_list_ops.multi_class_non_max_suppression(box_mask_list, score_thresh, iou_thresh, max_output_size)[source]

Multi-class version of non maximum suppression.

This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. It operates independently for each class for which scores are provided (via the scores field of the input box_list), pruning boxes with score less than a provided threshold prior to applying NMS.

Parameters:
  • box_mask_list – np_box_mask_list.BoxMaskList holding N boxes. Must contain a ‘scores’ field representing detection scores. This scores field is a tensor that can be 1 dimensional (in the case of a single class) or 2-dimensional, in which case we assume that it takes the shape [num_boxes, num_classes]. We further assume that this rank is known statically and that scores.shape[1] is also known (i.e., the number of classes is fixed and known at graph construction time).
  • score_thresh – scalar threshold for score (low scoring boxes are removed).
  • iou_thresh – scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed).
  • max_output_size – maximum number of retained boxes per class.
Returns:

a box_mask_list holding M boxes with a rank-1 scores field representing

corresponding scores for each box with scores sorted in decreasing order and a rank-1 classes field representing a class label for each box.

Raises:

ValueError – if iou_thresh is not in [0, 1] or if input box_mask_list does not have a valid scores field.

easy_vision.python.utils.np_box_mask_list_ops.non_max_suppression(box_mask_list, max_output_size=10000, iou_threshold=1.0, score_threshold=-10.0)[source]

Non maximum suppression.

This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. In each iteration, the detected bounding box with highest score in the available pool is selected.

Parameters:
  • box_mask_list – np_box_mask_list.BoxMaskList holding N boxes. Must contain a ‘scores’ field representing detection scores. All scores belong to the same class.
  • max_output_size – maximum number of retained boxes
  • iou_threshold – intersection over union threshold.
  • score_threshold – minimum score threshold. Remove the boxes with scores less than this value. Default value is set to -10. A very low threshold to pass pretty much all the boxes, unless the user sets a different score threshold.
Returns:

an np_box_mask_list.BoxMaskList holding M boxes where M <= max_output_size

Raises:
easy_vision.python.utils.np_box_mask_list_ops.prune_non_overlapping_masks(box_mask_list1, box_mask_list2, minoverlap=0.0)[source]

Prunes the boxes in list1 that overlap less than thresh with list2.

For each mask in box_mask_list1, we want its IOA to be more than minoverlap with at least one of the masks in box_mask_list2. If it does not, we remove it. If the masks are not full-size images, the pruning is based on boxes.

Parameters:
  • box_mask_list1 – np_box_mask_list.BoxMaskList holding N boxes and masks.
  • box_mask_list2 – np_box_mask_list.BoxMaskList holding M boxes and masks.
  • minoverlap – Minimum required overlap between boxes, to count them as overlapping.
Returns:

A pruned box_mask_list with size [N’, 4].

easy_vision.python.utils.np_box_mask_list_ops.sort_by_field(box_mask_list, field, order=2)[source]

Sort boxes and associated fields according to a scalar field.

A common use case is reordering the boxes according to descending scores.

Parameters:
  • box_mask_list – BoxMaskList holding N boxes.
  • field – A BoxMaskList field for sorting and reordering the BoxMaskList.
  • order – (Optional) ‘descend’ or ‘ascend’. Default is descend.
Returns:

A sorted BoxMaskList with the field in the specified order.

Return type:

sorted_box_mask_list

easy_vision.python.utils.np_box_ops

Operations for [N, 4] numpy arrays representing bounding boxes.

Example box operations that are supported:
  • Areas: compute bounding box areas
  • IOU: pairwise intersection-over-union scores
easy_vision.python.utils.np_box_ops.area(boxes)[source]

Computes area of boxes.

Parameters:boxes – Numpy array with shape [N, 4] holding N boxes
Returns:a numpy array with shape [N*1] representing box areas
easy_vision.python.utils.np_box_ops.intersection(boxes1, boxes2)[source]

Compute pairwise intersection areas between boxes.

Parameters:
  • boxes1 – a numpy array with shape [N, 4] holding N boxes
  • boxes2 – a numpy array with shape [M, 4] holding M boxes
Returns:

a numpy array with shape [N*M] representing pairwise intersection area

easy_vision.python.utils.np_box_ops.ioa(boxes1, boxes2)[source]

Computes pairwise intersection-over-area between box collections.

Intersection-over-area (ioa) between two boxes box1 and box2 is defined as their intersection area over box2’s area. Note that ioa is not symmetric, that is, IOA(box1, box2) != IOA(box2, box1).

Parameters:
  • boxes1 – a numpy array with shape [N, 4] holding N boxes.
  • boxes2 – a numpy array with shape [M, 4] holding M boxes.
Returns:

a numpy array with shape [N, M] representing pairwise ioa scores.

easy_vision.python.utils.np_box_ops.iou(boxes1, boxes2)[source]

Computes pairwise intersection-over-union between box collections.

Parameters:
  • boxes1 – a numpy array with shape [N, 4] holding N boxes.
  • boxes2 – a numpy array with shape [M, 4] holding M boxes.
Returns:

a numpy array with shape [N, M] representing pairwise iou scores.
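
A minimal sketch (a box has IOU 1.0 with itself; the second pair overlaps partially):

import numpy as np
from easy_vision.python.utils import np_box_ops

boxes1 = np.array([[0.0, 0.0, 2.0, 2.0]], dtype=np.float32)
boxes2 = np.array([[0.0, 0.0, 2.0, 2.0],
                   [1.0, 1.0, 3.0, 3.0]], dtype=np.float32)
# intersection of the second pair is 1.0, union is 4 + 4 - 1 = 7
print(np_box_ops.iou(boxes1, boxes2))  # approx [[1.0, 0.142857]]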

easy_vision.python.utils.np_mask_ops

Operations for [N, height, width] numpy arrays representing masks.

Example mask operations that are supported:
  • Areas: compute mask areas
  • IOU: pairwise intersection-over-union scores
easy_vision.python.utils.np_mask_ops.area(masks)[source]

Computes area of masks.

Parameters:masks – Numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}.
Returns:a numpy array with shape [N*1] representing mask areas.
Raises:ValueError – If masks.dtype is not np.uint8
easy_vision.python.utils.np_mask_ops.intersection(masks1, masks2)[source]

Compute pairwise intersection areas between masks.

Parameters:
  • masks1 – a numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}.
  • masks2 – a numpy array with shape [M, height, width] holding M masks. Masks values are of type np.uint8 and values are in {0,1}.
Returns:

a numpy array with shape [N*M] representing pairwise intersection area.

Raises:

ValueError – If masks1 and masks2 are not of type np.uint8.

easy_vision.python.utils.np_mask_ops.ioa(masks1, masks2)[source]

Computes pairwise intersection-over-area between box collections.

Intersection-over-area (ioa) between two masks, mask1 and mask2 is defined as their intersection area over mask2’s area. Note that ioa is not symmetric, that is, IOA(mask1, mask2) != IOA(mask2, mask1).

Parameters:
  • masks1 – a numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}.
  • masks2 – a numpy array with shape [M, height, width] holding M masks. Masks values are of type np.uint8 and values are in {0,1}.
Returns:

a numpy array with shape [N, M] representing pairwise ioa scores.

Raises:

ValueError – If masks1 and masks2 are not of type np.uint8.

easy_vision.python.utils.np_mask_ops.iou(masks1, masks2)[source]

Computes pairwise intersection-over-union between mask collections.

Parameters:
  • masks1 – a numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}.
  • masks2 – a numpy array with shape [M, height, width] holding M masks. Masks values are of type np.uint8 and values are in {0,1}.
Returns:

a numpy array with shape [N, M] representing pairwise iou scores.

Raises:

ValueError – If masks1 and masks2 are not of type np.uint8.
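
A minimal sketch with uint8 masks (the dtype matters: these ops raise ValueError otherwise):

import numpy as np
from easy_vision.python.utils import np_mask_ops

masks1 = np.zeros((1, 4, 4), dtype=np.uint8)
masks1[0, :2, :] = 1  # top half, area 8
masks2 = np.zeros((1, 4, 4), dtype=np.uint8)
masks2[0, :, :2] = 1  # left half, area 8
# intersection is 4 cells, union is 8 + 8 - 4 = 12
print(np_mask_ops.iou(masks1, masks2))  # [[0.333...]]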

easy_vision.python.utils.profiling

easy_vision.python.utils.profiling.get_func_name(f, args)[source]
easy_vision.python.utils.profiling.timeit(f, verbose=False)[source]

Decorator that prints the execution time of f.

easy_vision.python.utils.profiling.timeit_verbose(f)

Decorator that prints the execution time of f (verbose variant of timeit).
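
A usage sketch, assuming timeit is applied as a plain decorator:

from easy_vision.python.utils.profiling import timeit

@timeit
def load_data():
    return sum(range(1000000))  # stand-in for expensive work

load_data()  # the elapsed time is printed as a side effect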

easy_vision.python.utils.seq_utils

class easy_vision.python.utils.seq_utils.Lcsq(seq0, seq1)[source]

Computes the longest common subsequence of two strings.

Parameters:seq0, seq1 – two strings.
Returns:the length of the longest common subsequence of seq0 and seq1.

__init__(seq0, seq1)[source]
get_match()[source]
parse()[source]
easy_vision.python.utils.seq_utils.sequence_accuracy(dense_anws, dense_pred)[source]
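
For reference, a standalone dynamic-programming sketch of the LCS length that Lcsq computes (illustrative only, not the class's actual implementation):

def lcs_length(seq0, seq1):
    # dp[i][j] = LCS length of seq0[:i] and seq1[:j]
    n, m = len(seq0), len(seq1)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if seq0[i - 1] == seq1[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

assert lcs_length('abcde', 'ace') == 3  # 'ace'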

easy_vision.python.utils.text_vis_utils

class easy_vision.python.utils.text_vis_utils.VisualizeTextAttentionAlignment(char_dict, attention_type='line', max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEvalAttention')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for visualizing text recognition attention alignment.

__init__(char_dict, attention_type='line', max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEvalAttention')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters:eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  images: a uint8 tensor of batched images with shape NxHxWxC
  image_shapes: an int64 tensor with shape Nx3
  image_ids: a string tensor with shape N
class easy_vision.python.utils.text_vis_utils.VisualizeTextDetection(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextDetectionEval')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for text detection visualizations.

__init__(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextDetectionEval')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters:eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  images: a uint8 tensor of batched images with shape NxHxWxC
  image_shapes: an int64 tensor with shape Nx3
  image_ids: a string tensor with shape N
class easy_vision.python.utils.text_vis_utils.VisualizeTextEnd2End(char_dict, category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextEnd2EndEval')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for text spotting visualizations.

__init__(char_dict, category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextEnd2EndEval')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters:eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  images: a uint8 tensor of batched images with shape NxHxWxC
  image_shapes: an int64 tensor with shape Nx3
  image_ids: a string tensor with shape N
class easy_vision.python.utils.text_vis_utils.VisualizeTextEnd2EndFeature(char_dict, max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEnd2EndEvalFeature')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for visualizing text spotting features.

__init__(char_dict, max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEnd2EndEvalFeature')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters:eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  images: a uint8 tensor of batched images with shape NxHxWxC
  image_shapes: an int64 tensor with shape Nx3
  image_ids: a string tensor with shape N
class easy_vision.python.utils.text_vis_utils.VisualizeTextRectification(max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextRectificationEval')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for text rectification visualizations.

__init__(max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextRectificationEval')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters:eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  images: a uint8 tensor of batched images with shape NxHxWxC
  image_shapes: an int64 tensor with shape Nx3
  image_ids: a string tensor with shape N
easy_vision.python.utils.text_vis_utils.draw_batch_gt_text_image(image_batched, image_shape_batched, groundtruth_boxes_batched, groundtruth_classes_batched, groundtruth_keypoints_batched, num_groundtruth_boxes_batched, groundtruth_texts_batched=None, category_index=None, use_normalized_coordinates=False, num_plots=16, n_jobs=16)[source]

Draw batch text detection or end2end groundtruth

Parameters:
  • image_batched – a list of numpy images
  • image_shape_batched – a list of true image shapes
  • groundtruth_boxes_batched – a list of groundtruth bounding boxes
  • groundtruth_classes_batched – a list of groundtruth classes
  • groundtruth_keypoints_batched – a list of groundtruth text keypoints
  • num_groundtruth_boxes_batched – a list of number of groundtruth boxes
  • groundtruth_texts_batched – a list of groundtruth texts
  • category_index – a dictionary that maps integer ids to dicts
  • use_normalized_coordinates – If True, treat coordinates as relative to images
  • num_plots – number of plot images
  • n_jobs – number of parallel jobs
Returns:

vis_images: a list of visualized images
true_shape: a list of the visualized images' true shapes

easy_vision.python.utils.text_vis_utils.draw_batch_text_attention(image_batched, image_shape_batched, attention_batched, attention_shape_batched, groundtruth_text_batched, predict_text_ids_batched, char_dict, num_plots=16, n_jobs=16, attention_type='line')[source]
easy_vision.python.utils.text_vis_utils.draw_batch_text_image(image_batched, image_shape_batched, groundtruth_boxes_batched, groundtruth_classes_batched, groundtruth_keypoints_batched, num_groundtruth_boxes_batched, detection_boxes_batched, detection_classes_batched, detection_keypoints_batched, detection_scores_batched, num_detections_batched, groundtruth_texts_batched=None, detection_texts_ids_batched=None, category_index=None, char_dict=None, use_normalized_coordinates=False, num_plots=16, n_jobs=16)[source]

Draws a batch of detection or end-to-end groundtruth and prediction visualizations.

Parameters:
  • image_batched – a list of numpy images
  • image_shape_batched – a list of true image shapes
  • groundtruth_boxes_batched – a list of groundtruth bounding boxes
  • groundtruth_classes_batched – a list of groundtruth classes
  • groundtruth_keypoints_batched – a list of groundtruth text keypoints
  • num_groundtruth_boxes_batched – a list of number of groundtruth boxes
  • detection_boxes_batched – a list of detection bounding boxes
  • detection_classes_batched – a list of detection classes
  • detection_keypoints_batched – a list of detection text keypoints
  • detection_scores_batched – a list of detection class scores
  • num_detections_batched – a list of number of detections
  • groundtruth_texts_batched – a list of groundtruth texts
  • detection_texts_ids_batched – a list of detection text ids
  • category_index – a dictionary that maps integer ids to dicts
  • char_dict – an instance of vocab_utils.CharDict
  • use_normalized_coordinates – If True, treat coordinates as relative to images
  • num_plots – number of plot images
  • n_jobs – number of parallel jobs
Returns:
  • vis_images – a list of visualized images
  • true_shape – a list of the visualized images' true shapes

easy_vision.python.utils.text_vis_utils.draw_batch_text_recognition_feature(image_batched_dict, feature_batched_dict, points_batched_dict, text_ids_batched=None, char_dict=None, num_plots=16, n_jobs=16, image_size=(32, 100))[source]
easy_vision.python.utils.text_vis_utils.draw_batch_text_rectification_image(image_batched, image_shape_batched, groundtruth_keypoints_batched, prediction_keypoints_batched, num_plots=16, n_jobs=16)[source]
easy_vision.python.utils.text_vis_utils.draw_box_text_keypoint(image, boxlist, classlist, category_index, textlist=None, keypointslist=None, scorelist=None, box_color='red', keypoint_color='blue', use_normalized_coordinates=False, scale=1)[source]

Draws boxes, texts, and keypoints on an image.

Parameters:
  • image – an image as a numpy array
  • boxlist – detection boxes
  • category_index – a dictionary that maps integer ids to dicts
  • classlist – detection class ids
  • textlist – detection texts
  • keypointslist – detection keypoints
  • scorelist – detection scores
  • box_color – color to draw bounding box
  • keypoint_color – color to draw keypoints
  • use_normalized_coordinates – If True, treat coordinates as relative to images
  • scale – image rescale ratio for better visualization
Returns:

a visualized image
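
A hedged usage sketch with synthetic inputs (the box coordinate order is assumed to be [ymin, xmin, ymax, xmax]; the category index follows the format used elsewhere in this package):

>>> import numpy as np
>>> from easy_vision.python.utils.text_vis_utils import draw_box_text_keypoint
>>> image = np.zeros((100, 200, 3), dtype=np.uint8)
>>> boxes = np.array([[10, 20, 40, 120]])            # one text box, absolute pixels
>>> classes = np.array([1])
>>> category_index = {1: {'id': 1, 'name': 'text'}}
>>> vis = draw_box_text_keypoint(image, boxes, classes, category_index,
...     textlist=['hello'], use_normalized_coordinates=False)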

easy_vision.python.utils.text_vis_utils.draw_gt_text_image(image, image_shape, groundtruth_boxes, groundtruth_classes, groundtruth_keypoints, groundtruth_texts=None, category_index=None, use_normalized_coordinates=False)[source]

Draws text detection or end-to-end groundtruth.

Parameters:
  • image – a numpy array image
  • image_shape – true image shape
  • groundtruth_boxes – groundtruth bounding boxes
  • groundtruth_classes – groundtruth classes
  • groundtruth_keypoints – groundtruth text keypoints
  • groundtruth_texts – groundtruth texts
  • category_index – a dictionary that maps integer ids to dicts
  • use_normalized_coordinates – If True, treat coordinates as relative to images
Returns:
  • image – a visualized image
  • true_shape – the visualized image's true shape

easy_vision.python.utils.text_vis_utils.draw_sequence_attention(image, image_shape, attention, attention_shape, groundtruth_text, predict_text_ids, char_dict)[source]
easy_vision.python.utils.text_vis_utils.draw_spatial_attention(image, image_shape, attention, attention_shape, groundtruth_text, predict_text_ids, char_dict)[source]
easy_vision.python.utils.text_vis_utils.draw_text_image(image, image_shape, groundtruth_boxes, groundtruth_classes, groundtruth_keypoints, detection_boxes, detection_classes, detection_keypoints, detection_scores, groundtruth_texts=None, detection_texts_ids=None, category_index=None, char_dict=None, use_normalized_coordinates=False)[source]

Draws text detection or end-to-end groundtruth and predictions.

Parameters:
  • image – a numpy array image
  • image_shape – true image shape
  • groundtruth_boxes – groundtruth bounding boxes
  • groundtruth_classes – groundtruth classes
  • groundtruth_keypoints – groundtruth text keypoints
  • detection_boxes – detection bounding boxes
  • detection_classes – detection classes
  • detection_keypoints – detection text keypoints
  • detection_scores – detection class scores
  • groundtruth_texts – groundtruth texts
  • detection_texts_ids – detection text ids
  • category_index – a dictionary that maps integer ids to dicts
  • char_dict – an instance of vocab_utils.CharDict
  • use_normalized_coordinates – If True, treat coordinates as relative to images
Returns:
  • image – a visualized image
  • true_shape – the visualized image's true shape

easy_vision.python.utils.text_vis_utils.draw_text_recognition_feature(image_dict, feature_dict, points_dict, text_ids=None, char_dict=None, figsize=None, dpi=100)[source]
easy_vision.python.utils.text_vis_utils.draw_text_rectification_image(image, image_shape, groundtruth_keypoints, prediction_keypoints)[source]

easy_vision.python.utils.tf_util

easy_vision.python.utils.tf_util.parse_tf_config()[source]

Parses TF_CONFIG and returns cluster and task info.

Returns:
  • cluster – a dict of cluster info
  • task_type – string; in local mode, 'master' is returned
  • task_index – int; in local mode, 0 is returned
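
A minimal sketch of hypothetical usage, assuming the function returns the documented values in the order (cluster, task_type, task_index):

>>> import os, json
>>> from easy_vision.python.utils import tf_util
>>> os.environ['TF_CONFIG'] = json.dumps({
...     'cluster': {'master': ['host0:2222'], 'worker': ['host1:2222']},
...     'task': {'type': 'worker', 'index': 0}})
>>> cluster, task_type, task_index = tf_util.parse_tf_config()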

easy_vision.python.utils.variables_helper

Helper functions for manipulating collections of variables during training.

easy_vision.python.utils.variables_helper.filter_variables(variables, filter_regex_list, invert=False)[source]

Filters out the variables matching the filter_regex.

Filters out the variables whose names match any of the regular expressions in filter_regex_list and returns the remaining variables. Optionally, if invert=True, the complement set is returned.

Parameters:
  • variables – a list of tensorflow variables.
  • filter_regex_list – a list of string regular expressions.
  • invert – (boolean). If True, returns the complement of the filter set; that is, all variables matching filter_regex are kept and all others discarded.
Returns:

a list of filtered variables.
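
A minimal sketch of filtering out optimizer slot variables (the regex is illustrative, not from the library):

>>> import tensorflow as tf
>>> from easy_vision.python.utils import variables_helper
>>> variables = tf.global_variables()
>>> # drop Adam optimizer slots, keep everything else
>>> kept = variables_helper.filter_variables(variables, ['.*/Adam(_\\d+)?$'])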

easy_vision.python.utils.variables_helper.freeze_gradients_matching_regex(grads_and_vars, regex_list)[source]

Freeze gradients whose variable names match a regular expression.

Parameters:
  • grads_and_vars – A list of gradient to variable pairs (tuples).
  • regex_list – A list of string regular expressions.
Returns:

A list of gradient to variable pairs (tuples) that excludes the variables and gradients matching the regex.

Return type:

grads_and_vars
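
For example, to keep a pretrained backbone fixed during fine-tuning (a hedged sketch; the optimizer, loss, and scope name are hypothetical):

>>> grads_and_vars = optimizer.compute_gradients(loss)
>>> grads_and_vars = variables_helper.freeze_gradients_matching_regex(
...     grads_and_vars, ['^resnet_v1_50/.*'])
>>> train_op = optimizer.apply_gradients(grads_and_vars)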

easy_vision.python.utils.variables_helper.get_variables_available_in_checkpoint(variables, checkpoint_path, include_global_step=True)[source]

Returns the subset of variables available in the checkpoint.

Inspects given checkpoint and returns the subset of variables that are available in it.

TODO(rathodv): force input and output to be a dictionary.

Parameters:
  • variables – a list or dictionary of variables to find in checkpoint.
  • checkpoint_path – path to the checkpoint to restore variables from.
  • include_global_step – whether to include global_step variable, if it exists. Default True.
Returns:

A list or dictionary of variables.

Raises:

ValueError – if variables is not a list or dict.
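
A hedged sketch of restoring only the variables that actually exist in a pretrained checkpoint (the checkpoint path is hypothetical):

>>> variables = tf.global_variables()
>>> available = variables_helper.get_variables_available_in_checkpoint(
...     variables, '/path/to/model.ckpt', include_global_step=False)
>>> saver = tf.train.Saver(available)
>>> # saver.restore(sess, '/path/to/model.ckpt') inside a session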

easy_vision.python.utils.variables_helper.multiply_gradients_matching_regex(grads_and_vars, regex_list, multiplier)[source]

Multiply gradients whose variable names match a regular expression.

Parameters:
  • grads_and_vars – A list of gradient to variable pairs (tuples).
  • regex_list – A list of string regular expressions.
  • multiplier – A (float) multiplier to apply to each gradient matching the regular expression.
Returns:

A list of gradient to variable pairs (tuples).

Return type:

grads_and_vars
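
For instance, to train a newly added head with a 10x effective learning rate relative to the rest of the model (the scope regex is illustrative):

>>> grads_and_vars = variables_helper.multiply_gradients_matching_regex(
...     grads_and_vars, ['^detection_head/.*'], multiplier=10.0)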

easy_vision.python.utils.video_decode

easy_vision.python.utils.video_decode.video_decode(video, video_length, decode_type, sample_fps, reshape_size, decode_batch_size, decode_keep_size)[source]
Parameters:
  • video – input video
  • video_length – input video length
  • decode_type – decode type, 1–Intra only, 2–Keyframe only, 3–Without bidir, 4–Decode all
  • sample_fps – sample rate; default -1 means full sampling.
  • reshape_size – output size of decoded frames.
  • decode_batch_size – batch size of each decode phase.
  • decode_keep_size – leftover size of the last decode phase.
Returns:
  • time_stamp – time stamp of each frame, float
  • image_data – decoded frames, string
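
A hedged call sketch, assuming video holds the encoded video bytes and the documented return order; the argument values (including the reshape_size format) are illustrative:

>>> from easy_vision.python.utils.video_decode import video_decode
>>> # video: encoded video bytes, video_length: its length (assumptions)
>>> time_stamp, image_data = video_decode(
...     video, video_length, decode_type=4, sample_fps=-1,
...     reshape_size=256, decode_batch_size=16, decode_keep_size=0)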

easy_vision.python.utils.video_input_utils

class easy_vision.python.utils.video_input_utils.Kinetics(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]

Bases: object

Kinetics data interface

Parameters:
  • text_line (string) – video path and label
  • sample_duration – video length of each sample
  • initial_scale – the initial scale for multiscale cropping
  • n_scales – the number of scales for multiscale cropping
  • scale_step – the scale step for multiscale cropping
  • train_crop – the cropping method
  • sample_size – the crop size
  • n_samples_for_each_video – number of clips sampled from each video
  • is_temporal_transform – whether to apply a temporal transform
  • is_spatial_transform (callable, optional) – whether to apply a spatial transform, e.g. transforms.RandomCrop

__init__(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

get_clip_and_label()[source]
class easy_vision.python.utils.video_input_utils.UCF101(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]

Bases: object

UCF101 data interface

Parameters:
  • text_line (string) – video path and label
  • sample_duration – video length of each sample
  • initial_scale – the initial scale for multiscale cropping
  • n_scales – the number of scales for multiscale cropping
  • scale_step – the scale step for multiscale cropping
  • train_crop – the cropping method
  • sample_size – the crop size
  • is_spatial_transform (callable, optional) – whether to apply a spatial transform, e.g. transforms.RandomCrop
  • is_training – whether this is for training

__init__(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

get_clip_and_label()[source]
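
A hedged sketch of the data interface (the 'path label' line and the return format are assumptions):

>>> from easy_vision.python.utils.video_input_utils import UCF101
>>> line = '/data/ucf101/v_Basketball_g01_c01 7'  # hypothetical 'path label' line
>>> ds = UCF101(line, sample_duration=16, is_training=True)
>>> result = ds.get_clip_and_label()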
easy_vision.python.utils.video_input_utils.get_frames_data_ucf101(filename, num_frames_per_clip=16)[source]

Given a directory containing extracted frames, returns a video clip of num_frames_per_clip consecutive frames as a list of numpy arrays.

easy_vision.python.utils.video_input_utils.get_mean(norm_value=1, dataset='activitynet')[source]
easy_vision.python.utils.video_input_utils.get_std(norm_value=1)[source]
easy_vision.python.utils.video_input_utils.get_video_frames(text_line, dataset_type='UCF101', sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=True, is_spatial_transform=True, is_training=True)[source]
Parameters:
  • text_line – each line represents one video with label
  • dataset_type – dataset type, e.g. UCF101, kinetics, default is UCF101
  • sample_duration – video length of each sample
  • initial_scale – specifying the initial scale for multiscale cropping
  • n_scales – specifying the number of scales for multiscale cropping
  • scale_step – specifying the scale step for multiscale cropping
  • train_crop – specifying the cropping method
  • sample_size – specifying the crop size
Returns:
  • clip – video clips
  • label – video labels
  • filename – input filename
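
A hedged sketch, assuming the documented return order (clip, label, filename); the input line is illustrative:

>>> from easy_vision.python.utils.video_input_utils import get_video_frames
>>> line = '/data/ucf101/v_Basketball_g01_c01 7'  # hypothetical 'path label' line
>>> clip, label, filename = get_video_frames(line, dataset_type='UCF101',
...     sample_duration=16, sample_size=112)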

easy_vision.python.utils.video_input_utils.merge_list(filelists, is_shuffle=True)[source]
easy_vision.python.utils.video_input_utils.normalize(img, mean_dataset='activitynet')[source]
easy_vision.python.utils.video_input_utils.preprocess(video, dataset_type='UCF101', sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_spatial_transform=True, is_training=True)[source]
easy_vision.python.utils.video_input_utils.preprocess_ucf101_and_kinetics(video, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_spatial_transform=True, is_training=True)[source]
easy_vision.python.utils.video_input_utils.video_loader(video_dir_path, frame_indices)[source]

easy_vision.python.utils.video_transforms

class easy_vision.python.utils.video_transforms.CenterCrop(size)[source]

Bases: object

Crops the given PIL.Image at the center.

Parameters: size – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
__init__(size)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

randomize_parameters()[source]
class easy_vision.python.utils.video_transforms.Compose(transforms)[source]

Bases: object

Composes several transforms together.

Parameters: transforms (list of Transform objects) – list of transforms to compose.

Example

>>> transforms.Compose([
>>>     transforms.CenterCrop(10),
>>>     transforms.ToTensor(),
>>> ])
__init__(transforms)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

randomize_parameters()[source]
class easy_vision.python.utils.video_transforms.CornerCrop(size, crop_position=None)[source]

Bases: object

__init__(size, crop_position=None)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

randomize_parameters()[source]
class easy_vision.python.utils.video_transforms.LoopPadding(size)[source]

Bases: object

__init__(size)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

class easy_vision.python.utils.video_transforms.MultiScaleCornerCrop(scales, size, interpolation=2, crop_positions=['c', 'tl', 'tr', 'bl', 'br'])[source]

Bases: object

Crops the given PIL.Image to a randomly selected size. A crop size is selected from scales of the original size. A crop position is randomly selected from the 4 corners and the center. The crop is finally resized to the given size.

Parameters:
  • scales – cropping scales of the original size
  • size – size of the smaller edge
  • interpolation – Default: PIL.Image.BILINEAR

__init__(scales, size, interpolation=2, crop_positions=['c', 'tl', 'tr', 'bl', 'br'])[source]

x.__init__(…) initializes x; see help(type(x)) for signature

randomize_parameters()[source]
class easy_vision.python.utils.video_transforms.MultiScaleRandomCrop(scales, size, interpolation=2)[source]

Bases: object

__init__(scales, size, interpolation=2)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

randomize_parameters()[source]
class easy_vision.python.utils.video_transforms.RandomHorizontalFlip[source]

Bases: object

Horizontally flip the given PIL.Image randomly with a probability of 0.5.

randomize_parameters()[source]
class easy_vision.python.utils.video_transforms.Scale(size, interpolation=2)[source]

Bases: object

Rescales the input PIL.Image to the given size.

Parameters:
  • size – Desired output size. If size is a sequence like (w, h), the output size will be matched to it. If size is an int, the smaller edge of the image will be matched to this number, i.e. if height > width, the image will be rescaled to (size * height / width, size).
  • interpolation (int, optional) – Desired interpolation. Default is PIL.Image.BILINEAR.
__init__(size, interpolation=2)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

randomize_parameters()[source]
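
A minimal sketch of composing spatial transforms, assuming each transform is callable on a PIL.Image like its torchvision counterpart (an assumption; only __init__ and randomize_parameters are documented here):

>>> from PIL import Image
>>> transform = Compose([Scale(256), CenterCrop(224)])
>>> transform.randomize_parameters()
>>> img = Image.new('RGB', (320, 240))
>>> out = transform(img)  # assumed __call__, mirroring torchvision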
class easy_vision.python.utils.video_transforms.TemporalBeginCrop(size)[source]

Bases: object

Temporally crops the given frame indices at the beginning. If the number of frames is less than the size, the indices are looped as many times as necessary to satisfy the size.

Parameters: size (int) – Desired output size of the crop.

__init__(size)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

class easy_vision.python.utils.video_transforms.TemporalCenterCrop(size)[source]

Bases: object

Temporally crops the given frame indices at the center. If the number of frames is less than the size, the indices are looped as many times as necessary to satisfy the size.

Parameters: size (int) – Desired output size of the crop.

__init__(size)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

class easy_vision.python.utils.video_transforms.TemporalRandomCrop(size)[source]

Bases: object

Temporally crops the given frame indices at a random location. If the number of frames is less than the size, the indices are looped as many times as necessary to satisfy the size.

Parameters: size (int) – Desired output size of the crop.

__init__(size)[source]

x.__init__(…) initializes x; see help(type(x)) for signature
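
A minimal sketch of the looping behavior described above, assuming the crop is callable on a list of frame indices (only __init__ is documented here):

>>> crop = TemporalRandomCrop(16)
>>> indices = crop(list(range(8)))  # fewer frames than size: indices are looped to reach 16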

easy_vision.python.utils.visualization_utils

A set of functions that are used for visualization.

These functions typically receive an image and perform some visualization directly on it. They do not return a value; instead, they modify the image in place.

class easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='evaluation_image')[source]

Bases: object

Abstract base class responsible for visualizations during evaluation. Currently, summary images are not run during evaluation. One way to produce evaluation images in Tensorboard is to provide tf.summary.image strings as value_ops in tf.estimator.EstimatorSpec’s eval_metric_ops. This class is responsible for accruing images (with overlaid detections and groundtruth) and returning a dictionary that can be passed to eval_metric_ops.

__init__(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='evaluation_image')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • visualize_groundtruth – Whether to also visualize groundtruth alongside detections.
  • visualization_export_dir – Directory to export visualization images to.
  • summary_name_prefix – A string prefix for each image summary.
add_images(images, image_shapes, image_ids)[source]

Stores a list of images, each with shape [H, W, C].

Parameters:
  • images – a uint8 tensor of batched images with shape [N, H, W, C]
  • image_shapes – an int64 tensor of batched image shapes with shape [N, 3]
  • image_ids – a string tensor of batched image_ids with shape [N]

clear()[source]
get_estimator_eval_metric_ops(eval_dict)[source]

Returns metric ops for use in tf.estimator.EstimatorSpec.

Parameters: eval_dict – A dictionary that holds an image, groundtruth, and detections for a single example. See eval_util.result_dict_for_single_example() for a convenient method for constructing such a dictionary. The dictionary contains:
  • fields.InputDataFields.original_image – [1, H, W, 3] image.
  • fields.InputDataFields.groundtruth_boxes – [num_boxes, 4] float32 tensor with groundtruth boxes in range [0.0, 1.0].
  • fields.InputDataFields.groundtruth_classes – [num_boxes] int64 tensor with 1-indexed groundtruth classes.
  • fields.InputDataFields.groundtruth_instance_masks – (optional) [num_boxes, H, W] int64 tensor with instance masks.
  • fields.DetectionResultFields.detection_boxes – [max_num_boxes, 4] float32 tensor with detection boxes in range [0.0, 1.0].
  • fields.DetectionResultFields.detection_classes – [max_num_boxes] int64 tensor with 1-indexed detection classes.
  • fields.DetectionResultFields.detection_scores – [max_num_boxes] float32 tensor with detection scores.
  • fields.DetectionResultFields.detection_masks – (optional) [max_num_boxes, H, W] float32 tensor of binarized masks.
  • fields.DetectionResultFields.detection_keypoints – (optional) [max_num_boxes, num_keypoints, 2] float32 tensor with keypoints.
Returns: A dictionary of image summary names to tuple of (value_op, update_op). The update_op is the same for all items in the dictionary, and is responsible for saving a single side-by-side image with detections and groundtruth. Each value_op holds the tf.summary.image string for a given image.
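
A hedged sketch of wiring a concrete subclass into an Estimator (category_index, eval_dict, mode, and loss are assumed to exist):

>>> vis = VisualizeSingleFrameDetections(category_index, max_examples_to_draw=5)
>>> metric_ops = vis.get_estimator_eval_metric_ops(eval_dict)
>>> spec = tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops=metric_ops)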
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  • images – a uint8 tensor of batched images with shape [N, H, W, C]
  • image_shapes – an int64 tensor with shape [N, 3]
  • image_ids – a string tensor with shape [N]
save_image_array(filename, image)[source]
class easy_vision.python.utils.visualization_utils.VisualizeMultiFrameDetections(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=False, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for multi-frame object detection visualizations.

__init__(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=False, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates).
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Parameters: eval_dict – a dict containing the following data:
  • original_image – image tensor with shape [1, H, W, C]
  • key – image_id, string tensor
  • detection_boxes – detection boxes tensor with shape [D, 4], where D is the number of detections
  • detection_scores – detection scores, D-dimensional tensor
  • detection_classes – D-dimensional tensor with type tf.float64
  • groundtruth_boxes – groundtruth boxes tensor with shape [G, 4], where G is the number of groundtruth boxes
  • groundtruth_classes – G-dimensional tensor with type tf.float64
  • groundtruth_difficult – optional, present only if it exists in the groundtruth data; G-dimensional tensor with type tf.bool
Returns:
  • images – a tensor of batched images with shape [N, H, W, C]
  • image_shapes – a tensor with shape [N, 3]
  • image_ids – a string tensor with shape [N]
class easy_vision.python.utils.visualization_utils.VisualizeSingleFrameDetections(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]

Bases: easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization

Class responsible for single-frame object detection visualizations.

__init__(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]

Creates an EvalMetricOpsVisualization.

Parameters:
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_examples_to_draw – The maximum number of example summaries to produce.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
  • summary_name_prefix – A string prefix for each image summary.
images_from_evaluation_dict(eval_dict)[source]

Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.

Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns:
  • images – a uint8 tensor of batched images with shape [N, H, W, C]
  • image_shapes – an int64 tensor with shape [N, 3]
  • image_ids – a string tensor with shape [N]
easy_vision.python.utils.visualization_utils.add_cdf_image_summary(values, name)[source]

Adds a tf.summary.image for a CDF plot of the values.

Normalizes values such that they sum to 1, plots the cumulative distribution function and creates a tf image summary.

Parameters:
  • values – a 1-D float32 tensor containing the values.
  • name – name for the image summary.
easy_vision.python.utils.visualization_utils.add_hist_image_summary(values, bins, name)[source]

Adds a tf.summary.image for a histogram plot of the values.

Plots the histogram of values and creates a tf image summary.

Parameters:
  • values – a 1-D float32 tensor containing the values.
  • bins – bin edges which will be directly passed to np.histogram.
  • name – name for the image summary.
easy_vision.python.utils.visualization_utils.draw_bounding_box_on_image(image, ymin, xmin, ymax, xmax, color='red', thickness=4, display_str_list=(), use_normalized_coordinates=True)[source]

Adds a bounding box to an image. Bounding box coordinates can be specified in either absolute (pixel) or normalized coordinates by setting the use_normalized_coordinates argument. Each string in display_str_list is displayed on a separate line above the bounding box in black text on a rectangle filled with the input 'color'. If the top of the bounding box extends to the edge of the image, the strings are displayed below the bounding box.

Parameters:
  • image – a PIL.Image object.
  • ymin – ymin of bounding box.
  • xmin – xmin of bounding box.
  • ymax – ymax of bounding box.
  • xmax – xmax of bounding box.
  • color – color to draw bounding box. Default is red.
  • thickness – line thickness. Default value is 4.
  • display_str_list – list of strings to display in box (each to be shown on its own line).
  • use_normalized_coordinates – If True (default), treat coordinates ymin, xmin, ymax, xmax as relative to the image. Otherwise treat coordinates as absolute.
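
A minimal sketch (the coordinates and label are illustrative):

>>> from PIL import Image
>>> img = Image.new('RGB', (640, 480))
>>> draw_bounding_box_on_image(img, 0.1, 0.2, 0.5, 0.6,
...     color='red', display_str_list=['dog: 92%'])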
easy_vision.python.utils.visualization_utils.draw_bounding_box_on_image_array(image, ymin, xmin, ymax, xmax, color='red', thickness=4, display_str_list=(), use_normalized_coordinates=True)[source]

Adds a bounding box to an image (numpy array).

Bounding box coordinates can be specified in either absolute (pixel) or normalized coordinates by setting the use_normalized_coordinates argument.

Parameters:
  • image – a numpy array with shape [height, width, 3].
  • ymin – ymin of bounding box.
  • xmin – xmin of bounding box.
  • ymax – ymax of bounding box.
  • xmax – xmax of bounding box.
  • color – color to draw bounding box. Default is red.
  • thickness – line thickness. Default value is 4.
  • display_str_list – list of strings to display in box (each to be shown on its own line).
  • use_normalized_coordinates – If True (default), treat coordinates ymin, xmin, ymax, xmax as relative to the image. Otherwise treat coordinates as absolute.
easy_vision.python.utils.visualization_utils.draw_bounding_boxes_on_image(image, boxes, color='red', thickness=4, display_str_list_list=(), use_normalized_coordinates=True)[source]

Draws bounding boxes on image.

Parameters:
  • image – a PIL.Image object.
  • boxes – a 2 dimensional numpy array of [N, 4]: (ymin, xmin, ymax, xmax). The coordinates are in normalized format between [0, 1].
  • color – color to draw bounding box. Default is red.
  • thickness – line thickness. Default value is 4.
  • display_str_list_list – list of list of strings. a list of strings for each bounding box. The reason to pass a list of strings for a bounding box is that it might contain multiple labels.
  • use_normalized_coordinates – if True (default), treat boxes values as relative to the image. Otherwise treat them as absolute.
Raises:

ValueError – if boxes is not a [N, 4] array

easy_vision.python.utils.visualization_utils.draw_bounding_boxes_on_image_array(image, boxes, color='red', thickness=4, display_str_list_list=(), use_normalized_coordinates=True)[source]

Draws bounding boxes on image (numpy array).

Parameters:
  • image – a numpy array object.
  • boxes – a 2 dimensional numpy array of [N, 4]: (ymin, xmin, ymax, xmax). The coordinates are in normalized format between [0, 1].
  • color – color to draw bounding box. Default is red.
  • thickness – line thickness. Default value is 4.
  • display_str_list_list – list of list of strings. a list of strings for each bounding box. The reason to pass a list of strings for a bounding box is that it might contain multiple labels.
  • use_normalized_coordinates – if True (default), treat boxes values as relative to the image. Otherwise treat them as absolute.
Raises:

ValueError – if boxes is not a [N, 4] array

easy_vision.python.utils.visualization_utils.draw_bounding_boxes_on_image_tensors(images, boxes, classes, scores, category_index, instance_masks=None, keypoints=None, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True)[source]

Draws bounding boxes, masks, and keypoints on batch of image tensors.

Parameters:
  • images – A 4D uint8 image tensor of shape [N, H, W, C].
  • boxes – [N, max_detections, 4] float32 tensor of detection boxes.
  • classes – [N, max_detections] int tensor of detection classes. Note that classes are 1-indexed.
  • scores – [N, max_detections] float32 tensor of detection scores.
  • category_index – a dict that maps integer ids to category dicts, e.g. {1: {'id': 1, 'name': 'dog'}, 2: {'id': 2, 'name': 'cat'}, …}
  • instance_masks – A 4D uint8 tensor of shape [N, max_detection, H, W] with instance masks.
  • keypoints – A 4D float32 tensor of shape [N, max_detection, num_keypoints, 2] with keypoints.
  • max_boxes_to_draw – Maximum number of boxes to draw on an image. Default 20.
  • min_score_thresh – Minimum score threshold for visualization. Default 0.2.
  • use_normalized_coordinates – if True (default), treat boxes values as relative to the image. Otherwise treat them as absolute.
Returns:

4D image tensor of type uint8, with boxes drawn on top.

easy_vision.python.utils.visualization_utils.draw_evaluation_image(eval_dict, category_index, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=False, visualize_groundtruth=True)[source]

Creates a side-by-side image with detections and groundtruth.

Bounding boxes (and instance masks, if available) are visualized on both subimages.

Parameters:
  • eval_dict – The evaluation dictionary returned by eval_util.result_dict_for_single_example().
  • category_index – A category index (dictionary) produced from a labelmap.
  • max_boxes_to_draw – The maximum number of boxes to draw for detections.
  • min_score_thresh – The minimum score threshold for showing detections.
  • use_normalized_coordinates – Whether to use normalized coordinates or absolute ones
  • visualize_groundtruth – if True, visualizes both detections and groundtruth in one image, with detections on the left and groundtruth on the right
Returns:

A [1, H, 2 * W, C] uint8 tensor. The subimage on the left corresponds to detections, while the subimage on the right corresponds to groundtruth.

easy_vision.python.utils.visualization_utils.draw_keypoints_on_image(image, keypoints, color='red', radius=2, use_normalized_coordinates=True)[source]

Draws keypoints on an image.

Parameters:
  • image – a PIL.Image object.
  • keypoints – a numpy array with shape [num_keypoints, 2].
  • color – color to draw the keypoints with. Default is red. If set to 'AUTO', each keypoint is painted with one of STANDARD_COLORS in order.
  • radius – keypoint radius. Default value is 2.
  • use_normalized_coordinates – if True (default), treat keypoint values as relative to the image. Otherwise treat them as absolute.
easy_vision.python.utils.visualization_utils.draw_keypoints_on_image_array(image, keypoints, color='red', radius=2, use_normalized_coordinates=True)[source]

Draws keypoints on an image (numpy array).

Parameters:
  • image – a numpy array with shape [height, width, 3].
  • keypoints – a numpy array with shape [num_keypoints, 2].
  • color – color to draw the keypoints with. Default is red.
  • radius – keypoint radius. Default value is 2.
  • use_normalized_coordinates – if True (default), treat keypoint values as relative to the image. Otherwise treat them as absolute.
easy_vision.python.utils.visualization_utils.draw_mask_on_image_array(image, mask, color='red', alpha=0.4)[source]

Draws mask on an image.

Parameters:
  • image – uint8 numpy array with shape (img_height, img_width, 3)
  • mask – a uint8 numpy array of shape (img_height, img_width) with values of either 0 or 1.
  • color – color to draw the keypoints with. Default is red.
  • alpha – transparency value between 0 and 1. (default: 0.4)
Raises:

ValueError – On incorrect data type for image or masks.
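
A minimal sketch with a synthetic mask:

>>> import numpy as np
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[20:60, 20:60] = 1
>>> draw_mask_on_image_array(image, mask, color='red', alpha=0.4)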

easy_vision.python.utils.visualization_utils.draw_text_on_image_tensors(images, image_ids)[source]

Draws texts on batch of image tensors.

easy_vision.python.utils.visualization_utils.draw_text_on_top_of_image_array(image, display_str, color='black')[source]

Adds text on top of an image. Draws text above the image; the returned image is taller than the input because an additional strip containing the text is added.

Parameters:
  • image – a numpy array.
  • display_str – the string to display.
  • color – color to draw the text. Default is black.
easy_vision.python.utils.visualization_utils.draw_texts_on_image(image, display_str_list, color='red', thickness=4)[source]

Adds texts to an image. Each string in display_str_list is displayed on a separate line in black text on a rectangle filled with the input 'color'.

Parameters:
  • image – a PIL.Image object.
  • display_str_list – list of strings to display.
  • color – color of the rectangle behind the text. Default is red.
  • thickness – line thickness. Default value is 4.

easy_vision.python.utils.visualization_utils.draw_texts_on_image_array(image, display_str_list, color='red', thickness=4)[source]

Adds texts to an image (numpy array).

Parameters:
  • image – a numpy array with shape [height, width, 3].
  • display_str_list – list of strings to display
  • color – color to draw bounding box. Default is red.
  • thickness – line thickness. Default value is 4.
easy_vision.python.utils.visualization_utils.encode_image_array_as_png_str(image)[source]

Encodes a numpy array into a PNG string.

Parameters:image – a numpy array with shape [height, width, 3].
Returns:PNG encoded image string.
easy_vision.python.utils.visualization_utils.get_matplotlib_font(font_name)[source]

Gets a matplotlib FontProperties for the specified font name.

Parameters: font_name – name of a font file, including extension
Returns: a FontProperties for this font
easy_vision.python.utils.visualization_utils.get_pil_font(font_name, font_size=24)[source]

Gets a PIL.ImageFont for the specified font name.

Parameters:
  • font_name – name of a font file, including extension
  • font_size – size of the font
Returns:

PIL.ImageFont of this font

easy_vision.python.utils.visualization_utils.pad_batch(list_of_tensor)[source]
easy_vision.python.utils.visualization_utils.save_image_array_as_png(image, output_path)[source]

Saves an image (represented as a numpy array) to PNG.

Parameters:
  • image – a numpy array with shape [height, width, 3].
  • output_path – path to which image should be written.
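
A minimal sketch (the output path is illustrative):

>>> import numpy as np
>>> image = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
>>> save_image_array_as_png(image, '/tmp/vis.png')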
easy_vision.python.utils.visualization_utils.summary_image(name, tensor, max_outputs=16)[source]

Summarizes an image; can also save summary images to a local directory for debugging.

Parameters:
  • name – A name for the generated node. Will also serve as a series name in TensorBoard.
  • tensor – A 4-D uint8 or float32 Tensor of shape [batch_size, height, width, channels] where channels is 1, 3, or 4.
  • max_outputs – Max number of batch elements to generate images for.
easy_vision.python.utils.visualization_utils.visualize_boxes_and_labels_on_image_array(image, boxes, classes, scores, category_index, instance_masks=None, instance_boundaries=None, keypoints=None, use_normalized_coordinates=False, max_boxes_to_draw=20, min_score_thresh=0.5, agnostic_mode=False, line_thickness=4, groundtruth_box_visualization_color='black', skip_scores=False, skip_labels=False)[source]

Overlay labeled boxes on an image with formatted scores and label names.

This function groups boxes that correspond to the same location and creates a display string for each detection and overlays these on the image. Note that this function modifies the image in place, and returns that same image.

Parameters:
  • image – uint8 numpy array with shape (img_height, img_width, 3)
  • boxes – a numpy array of shape [N, 4]
  • classes – a numpy array of shape [N]. Note that class indices are 1-based, and match the keys in the label map.
  • scores – a numpy array of shape [N] or None. If scores=None, then this function assumes that the boxes to be plotted are groundtruth boxes and plot all boxes as black with no classes or scores.
  • category_index – a dict containing category dictionaries (each holding category index id and category name name) keyed by category indices.
  • instance_masks – a numpy array of shape [N, image_height, image_width] with values ranging between 0 and 1, can be None.
  • instance_boundaries – a numpy array of shape [N, image_height, image_width] with values ranging between 0 and 1, can be None.
  • keypoints – a numpy array of shape [N, num_keypoints, 2], can be None
  • use_normalized_coordinates – whether boxes is to be interpreted as normalized coordinates or not.
  • max_boxes_to_draw – maximum number of boxes to visualize. If None, draw all boxes.
  • min_score_thresh – minimum score threshold for a box to be visualized
  • agnostic_mode – boolean (default: False) controlling whether to evaluate in class-agnostic mode or not. This mode will display scores but ignore classes.
  • line_thickness – integer (default: 4) controlling line width of the boxes.
  • groundtruth_box_visualization_color – box color for visualizing groundtruth boxes
  • skip_scores – whether to skip score when drawing a single detection
  • skip_labels – whether to skip label when drawing a single detection
Returns:

uint8 numpy array with shape (img_height, img_width, 3) with overlaid boxes.
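
A minimal sketch with one synthetic detection:

>>> import numpy as np
>>> image = np.zeros((480, 640, 3), dtype=np.uint8)
>>> boxes = np.array([[0.1, 0.2, 0.5, 0.6]])
>>> classes = np.array([1])
>>> scores = np.array([0.9])
>>> category_index = {1: {'id': 1, 'name': 'dog'}}
>>> _ = visualize_boxes_and_labels_on_image_array(
...     image, boxes, classes, scores, category_index,
...     use_normalized_coordinates=True)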

easy_vision.python.utils.vocab_utils

class easy_vision.python.utils.vocab_utils.CharDict(dict_path, load_raw=False)[source]

Character Dict of Texts

__init__(dict_path, load_raw=False)[source]

Constructor.

Parameters:
  • dict_path – char dict file path
  • load_raw – if load_raw is True, UNK | SOS | EOS will not be inserted into the CharDict
get_char_by_id(id)[source]

Gets the character for an id

get_id_by_char(char)[source]

Gets the id for a character

get_num_char()[source]

Number of characters in the CharDict

to_char_seq(id_seq)[source]

Translate a character id sequence to a string

to_char_seq_batch(id_seq_list)[source]

Translate a batch of character id sequences to a batch of strings

to_id_seq(one_str)[source]

Translate a string to a character id sequence

to_id_seq_batch(str_list)[source]

Translate a batch of strings to a batch of character id sequences
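
A hedged usage sketch (the dict file path is hypothetical):

>>> from easy_vision.python.utils import vocab_utils
>>> char_dict = vocab_utils.CharDict('/path/to/char_dict.txt')
>>> ids = char_dict.to_id_seq('hello')
>>> text = char_dict.to_char_seq(ids)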

easy_vision.python.utils.vocab_utils.remove_batch_eos_ids(batched_seq_ids)[source]

Remove predicted ids after EOS_ID for a batch of sequences

easy_vision.python.utils.vocab_utils.remove_eos_ids(seq_ids)[source]

Remove predicted ids after EOS_ID
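
A minimal sketch of trimming a predicted sequence (the ids are illustrative; the actual EOS_ID value is defined by this module):

>>> seq_ids = [12, 7, 30, 2, 0, 0]  # suppose 2 stands in for EOS_ID here
>>> trimmed = vocab_utils.remove_eos_ids(seq_ids)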