easy_vision.python.utils¶
easy_vision.python.utils.ac_utils¶
Logic to convert ac_config from proto to CompressConfig
easy_vision.python.utils.category_util¶
Functions for importing/exporting Object Detection categories.
-
easy_vision.python.utils.category_util.
load_categories_from_csv_file
(csv_path)[source]¶ Loads categories from a csv file.
The CSV file should have one comma delimited numeric category id and string category name pair per line. For example:
0,"cat"
1,"dog"
2,"bird"
...
Parameters: csv_path – Path to the csv file to be parsed into categories.
Returns: A list of dictionaries representing all possible categories. The categories will contain an integer 'id' field and a string 'name' field.
Return type: categories
Raises: ValueError – If the csv file is incorrectly formatted.
-
easy_vision.python.utils.category_util.
save_categories_to_csv_file
(categories, csv_path)[source]¶ Saves categories to a csv file.
Parameters: - categories – A list of dictionaries representing categories to save to file. Each category must contain an ‘id’ and ‘name’ field.
- csv_path – Path to the csv file to be parsed into categories.
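A minimal usage sketch for the two functions above, using only the fields they document; the CSV path is hypothetical:

    from easy_vision.python.utils import category_util

    categories = [
        {'id': 0, 'name': 'cat'},
        {'id': 1, 'name': 'dog'},
    ]

    # write the id,name pairs out, then parse them back
    category_util.save_categories_to_csv_file(categories, '/tmp/categories.csv')
    loaded = category_util.load_categories_from_csv_file('/tmp/categories.csv')
    for category in loaded:
        print(category['id'], category['name'])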
easy_vision.python.utils.classification_vis_util¶
-
class
easy_vision.python.utils.classification_vis_util.
VisualizeClassification
(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='ClassificationVis')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for classification visualizations.
-
__init__
(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='ClassificationVis')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: - images: a uint8 tensor of batched images with shape NxHxWxC
- image_shapes: an int64 tensor with shape Nx3
- image_ids: a string tensor with shape N
-
easy_vision.python.utils.config_util¶
Functions for reading and updating configuration files.
-
easy_vision.python.utils.config_util.
convert_to_pai_config_template
(pipeline_config_path, dst_pipeline_config_path, exp_dir='EXP_DIR')[source]¶ Converts local configs to configs that can be run on PAI using the public pai-vision-data bucket:
1. convert local data paths to oss data paths, oss://pai-vision-data-hz/data/xxx
2. convert pretrained model paths to oss paths, oss://pai-vision-data-hz/pretrained_models/xxx
3. convert the experiment dir path to a template oss path, oss://EXP_DIR/xxx
-
easy_vision.python.utils.config_util.
create_pipeline_proto_from_configs
(configs)[source]¶ Creates a pipeline_pb2.CVEstimator from configs dictionary.
This function performs the inverse operation of create_configs_from_pipeline_proto().
Parameters: configs – Dictionary of configs. See get_configs_from_pipeline_file(). Returns: A fully populated pipeline_pb2.CVEstimator.
-
easy_vision.python.utils.config_util.
get_configs_from_pipeline_file
(pipeline_config_path)[source]¶ Reads config from a file containing pipeline_pb2.CVEstimator.
Parameters: pipeline_config_path – Path to pipeline_pb2.CVEstimator text proto.
Returns: Dictionary of configuration objects. Keys are model, train_config, train_input_config, eval_config, eval_input_config. Values are the corresponding config objects.
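A short sketch of the round trip between a pipeline file and the configs dictionary, using only functions documented in this module; the file paths are hypothetical:

    from easy_vision.python.utils import config_util

    # parse the text proto into a dict of config objects
    configs = config_util.get_configs_from_pipeline_file('samples/pipeline.config')
    print(configs['train_config'])

    # rebuild the full pipeline proto and write it back out
    pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
    config_util.save_pipeline_config(pipeline_proto, '/tmp/exp_dir')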
-
easy_vision.python.utils.config_util.
get_graph_rewriter_config_from_file
(graph_rewriter_config_file)[source]¶ Parses config for graph rewriter.
Parameters: graph_rewriter_config_file – file path to the graph rewriter config. Returns: graph_rewriter_pb2.GraphRewriter proto
-
easy_vision.python.utils.config_util.
get_learning_rate_type
(optimizer_config)[source]¶ Returns the learning rate type for training.
Parameters: optimizer_config – An optimizer_pb2.Optimizer. Returns: The type of the learning rate.
-
easy_vision.python.utils.config_util.
get_number_of_classes
(model_config)[source]¶ Returns the number of classes for a detection model.
Parameters: model_config – A model_pb2.DetectionModel. Returns: Number of classes. Raises: ValueError
– If the model type is not recognized.
-
easy_vision.python.utils.config_util.
get_optimizer_type
(train_config)[source]¶ Returns the optimizer type for training.
Parameters: train_config – A train_pb2.TrainConfig. Returns: The type of the optimizer
-
easy_vision.python.utils.config_util.
get_spatial_image_size
(image_resizer_config)[source]¶ Returns expected spatial size of the output image from a given config.
Parameters: image_resizer_config – An image_resizer_pb2.ImageResizer. Returns: A list of two integers of the form [height, width]. height and width are set to -1 if they cannot be determined during graph construction. Raises: ValueError
– If the model type is not recognized.
-
easy_vision.python.utils.config_util.
merge_external_params_with_configs
(configs, hparams=None, **kwargs)[source]¶ Updates configs dictionary based on supplied parameters.
This utility is for modifying specific fields in the object detection configs. Say that one would like to experiment with different learning rates, momentum values, or batch sizes. Rather than creating a new config text file for each experiment, one can use a single base config file, and update particular values.
Parameters: - configs – Dictionary of configuration objects. See outputs from get_configs_from_pipeline_file() or get_configs_from_multiple_files().
- hparams – A HParams.
- **kwargs – Extra keyword arguments that are treated the same way as attribute/value pairs in hparams. Note that hyperparameters with the same names will override keyword arguments.
Returns: configs dictionary.
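A hedged sketch of overriding fields at run time; the keyword names used here (learning_rate, batch_size) are illustrative, since the exact set of recognized keys depends on this utility's implementation:

    from easy_vision.python.utils import config_util

    configs = config_util.get_configs_from_pipeline_file('samples/pipeline.config')
    # hypothetical overrides; unrecognized keys may be ignored or rejected
    configs = config_util.merge_external_params_with_configs(
        configs, learning_rate=0.002, batch_size=16)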
-
easy_vision.python.utils.config_util.
save_message
(protobuf_message, filename)[source]¶ Saves a protobuf message to a text file on disk.
Parameters: - protobuf_message – A protobuf message, e.g. a pipeline_pb2.TrainEvalPipelineConfig.
- filename – Path of the output config file.
-
easy_vision.python.utils.config_util.
save_pipeline_config
(pipeline_config, directory, filename='pipeline.config')[source]¶ Saves a pipeline config text file to disk.
Parameters: - pipeline_config – A pipeline_pb2.TrainEvalPipelineConfig.
- directory – The model directory into which the pipeline config file will be saved.
- filename – pipeline config filename
easy_vision.python.utils.context_manager¶
Python context management helper.
easy_vision.python.utils.convert_config_generator¶
-
class
easy_vision.python.utils.convert_config_generator.
ClassificationConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator
generate convert config for classification
-
class
easy_vision.python.utils.convert_config_generator.
ConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
object
Baseclass to generate convert config
-
class
easy_vision.python.utils.convert_config_generator.
DetectionConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator
generate convert config for detection and text detection
-
class
easy_vision.python.utils.convert_config_generator.
SegmentationConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator
generate convert config for segmentation
-
class
easy_vision.python.utils.convert_config_generator.
SelfDefinedConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator
generate convert config for self-defined
-
class
easy_vision.python.utils.convert_config_generator.
TextConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator
generate convert config for text
-
class
easy_vision.python.utils.convert_config_generator.
TextEnd2EndConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
easy_vision.python.utils.convert_config_generator.TextConvertConfigGenerator
generate convert config for text end2end
-
class
easy_vision.python.utils.convert_config_generator.
VideoConvertConfigGenerator
(params, data_prefix='')[source]¶ Bases:
easy_vision.python.utils.convert_config_generator.ConvertConfigGenerator
generate convert config for video
easy_vision.python.utils.dataset_util¶
Utility functions for creating TFRecord data sets.
-
easy_vision.python.utils.dataset_util.
make_initializable_iterator
(dataset)[source]¶ Creates an iterator, and initializes tables.
This is useful in cases where make_one_shot_iterator wouldn’t work because the graph contains a hash table that needs to be initialized.
Parameters: dataset – A tf.data.Dataset object. Returns: A tf.data.Iterator.
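A minimal sketch (TF 1.x style) of driving a dataset through this helper; it assumes, as in similar utilities, that the iterator initializer is registered so that tf.tables_initializer() covers it:

    import tensorflow as tf
    from easy_vision.python.utils import dataset_util

    dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
    iterator = dataset_util.make_initializable_iterator(dataset)
    next_elem = iterator.get_next()

    with tf.Session() as sess:
        # initializes hash tables and (by assumption) the iterator itself
        sess.run(tf.tables_initializer())
        print(sess.run(next_elem))  # 1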
-
easy_vision.python.utils.dataset_util.
read_dataset
(file_read_func, decode_func, input_files, config)[source]¶ Reads a dataset, and handles repetition and shuffling.
Parameters: - file_read_func – Function to use in tf.data.Dataset.interleave, to read every individual file into a tf.data.Dataset.
- decode_func – Function to apply to all records.
- input_files – A list of file paths to read.
- config – An input_reader_builder.InputReader object.
Returns: A tf.data.Dataset based on config.
-
easy_vision.python.utils.dataset_util.
read_examples_list
(path)[source]¶ Read list of training or validation examples.
The file is assumed to contain a single example per line where the first token in the line is an identifier that allows us to find the image and annotation xml for that example.
For example, the line: xyz 3 would allow us to find files xyz.jpg and xyz.xml (the 3 would be ignored).
Parameters: path – absolute path to examples list file. Returns: list of example identifiers (strings).
-
easy_vision.python.utils.dataset_util.
recursive_parse_xml_to_dict
(xml)[source]¶ Recursively parses XML contents to python dict.
We assume that object tags are the only ones that can appear multiple times at the same level of a tree.
Parameters: xml – xml tree obtained by parsing XML file contents using lxml.etree Returns: Python dictionary holding XML contents.
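A sketch of parsing a Pascal VOC style annotation with this helper; the file path and the 'annotation' root tag are assumptions based on the typical layout of such XML files:

    from lxml import etree
    from easy_vision.python.utils import dataset_util

    with open('annotations/xyz.xml', 'rb') as f:
        xml = etree.fromstring(f.read())
    # the helper returns {root_tag: contents}; VOC files use 'annotation'
    data = dataset_util.recursive_parse_xml_to_dict(xml)['annotation']
    print(data['filename'], len(data.get('object', [])))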
easy_vision.python.utils.im_util¶
Utils for images
easy_vision.python.utils.json_utils¶
Utilities for dealing with writing json strings.
json_utils wraps json.dump and json.dumps so that they can be used to safely control the precision of floats when writing to json strings or files.
-
class
easy_vision.python.utils.json_utils.
MyEncoder
(skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, encoding='utf-8', default=None)[source]¶ Bases:
json.encoder.JSONEncoder
-
default
(o)[source]¶ Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
-
-
easy_vision.python.utils.json_utils.
PrettyParams
(**params)[source]¶ Returns parameters for use with Dump and Dumps to output pretty json.
Example usage:

    json_str = json_utils.Dumps(obj, **json_utils.PrettyParams())
    json_str = json_utils.Dumps(obj, **json_utils.PrettyParams(allow_nans=False))
Parameters: **params – Additional params to pass to json.dump or json.dumps.
Returns: Parameters that are compatible with json_utils.Dump and json_utils.Dumps.
Return type: params
-
easy_vision.python.utils.json_utils.
compat_dumps
(data, float_digits=-1)[source]¶ Handles json dumps of Chinese text and numpy data.
Parameters: - data – python data structure
- float_digits – The number of digits of precision when writing floats out.
Returns: a json str; in python2 the str is encoded with utf8, in python3 the str is unicode (the python3 str type).
-
easy_vision.python.utils.json_utils.
dump
(obj, fid, float_digits=-1, **params)[source]¶ Wrapper of json.dump that allows specifying the float precision used.
Parameters: - obj – The object to dump.
- fid – The file id to write to.
- float_digits – The number of digits of precision when writing floats out.
- **params – Additional parameters to pass to json.dumps.
-
easy_vision.python.utils.json_utils.
dumps
(obj, float_digits=-1, **params)[source]¶ Wrapper of json.dumps that allows specifying the float precision used.
Parameters: - obj – The object to dump.
- float_digits – The number of digits of precision when writing floats out.
- **params – Additional parameters to pass to json.dumps.
Returns: JSON string representation of obj.
Return type: output
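A small sketch of the precision control these wrappers provide; the object and path are hypothetical:

    from easy_vision.python.utils import json_utils

    obj = {'score': 0.123456789}
    # floats are written with at most 4 digits of precision
    print(json_utils.dumps(obj, float_digits=4))

    with open('/tmp/result.json', 'w') as fid:
        json_utils.dump(obj, fid, float_digits=4)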
easy_vision.python.utils.label_map_util¶
Label map utility functions.
-
easy_vision.python.utils.label_map_util.
create_category_index
(categories)[source]¶ Creates dictionary of COCO compatible categories keyed by category id.
Parameters: categories – a list of dicts, each of which has the following keys:
- 'id': (required) an integer id uniquely identifying this category.
- 'name': (required) string representing category name, e.g., 'cat', 'dog', 'pizza'.
- 'ignore_recog': (optional) default false, whether to ignore recognition in the end2end model.
Returns: a dict containing the same entries as categories, but keyed by the 'id' field of each category.
Return type: category_index
-
easy_vision.python.utils.label_map_util.
create_category_index_from_labelmap
(label_map_path)[source]¶ Reads a label map and returns a category index.
Parameters: label_map_path – Path to StringIntLabelMap proto text file. Returns: A category index, which is a dictionary that maps integer ids to dicts containing categories, e.g. {1: {‘id’: 1, ‘name’: ‘dog’}, 2: {‘id’: 2, ‘name’: ‘cat’}, …}
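A usage sketch; the label map path and the resulting entries are hypothetical:

    from easy_vision.python.utils import label_map_util

    category_index = label_map_util.create_category_index_from_labelmap(
        'data/label_map.pbtxt')
    # look up the name for class id 1, e.g. 'dog'
    print(category_index[1]['name'])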
-
easy_vision.python.utils.label_map_util.
create_class_agnostic_category_index
()[source]¶ Creates a category index with a single object class.
-
easy_vision.python.utils.label_map_util.
create_ignore_recog_class_ids
(categories)[source]¶ Loads ignore-recognition class ids from a list of categories.
Parameters: categories – a list of dicts, each of which has the following keys:
- 'id': (required) an integer id uniquely identifying this category.
- 'name': (required) string representing category name, e.g., 'cat', 'dog', 'pizza'.
- 'ignore_recog': (optional) default false, whether to ignore recognition in the end2end model.
Returns: a list of ignore-recognition class ids
-
easy_vision.python.utils.label_map_util.
get_label_map_dict
(label_map_path, use_display_name=False)[source]¶ Reads a label map and returns a dictionary of label names to id.
Parameters: - label_map_path – path to label_map.
- use_display_name – whether to use the label map items’ display names as keys.
Returns: A dictionary mapping label names to id.
-
easy_vision.python.utils.label_map_util.
get_max_label_map_index
(label_map)[source]¶ Get maximum index in label map.
Parameters: label_map – a StringIntLabelMapProto Returns: an integer
-
easy_vision.python.utils.label_map_util.
timeit_verbose
(f)¶ Decorator that prints the execution time of f.
easy_vision.python.utils.np_box_list¶
Numpy BoxList classes and functions.
-
class
easy_vision.python.utils.np_box_list.
BoxList
(data)[source]¶ Bases:
object
Box collection.
BoxList represents a list of bounding boxes as a numpy array, where each bounding box is represented as a row of 4 numbers, [y_min, x_min, y_max, x_max]. It is assumed that all bounding boxes within a given list correspond to a single image.
Optionally, users can add additional related fields (such as objectness/classification scores).
-
__init__
(data)[source]¶ Constructs box collection.
Parameters: data – a numpy array of shape [N, 4] representing box coordinates
Raises: - ValueError – if bbox data is not a numpy array
- ValueError – if invalid dimensions for bbox data
-
add_field
(field, field_data)[source]¶ Add data to a specified field.
Parameters: - field – a string parameter used to speficy a related field to be accessed.
- field_data – a numpy array of [N, …] representing the data associated with the field.
Raises: ValueError
– if the field already exists or the dimension of the field data does not match the number of boxes.
-
get
()[source]¶ Convenience function for accessing box coordinates.
Returns: a numpy array of shape [N, 4] representing box corners
-
get_coordinates
()[source]¶ Get corner coordinates of boxes.
Returns: a list of 4 1-d numpy arrays [y_min, x_min, y_max, x_max]
-
get_field
(field)[source]¶ Accesses data associated with the specified field in the box collection.
Parameters: field – a string parameter used to specify a related field to be accessed. Returns: a numpy 1-d array representing data of an associated field Raises: ValueError
– if invalid field
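A minimal sketch of constructing a BoxList and attaching a field, using only the methods documented above:

    import numpy as np
    from easy_vision.python.utils.np_box_list import BoxList

    boxes = np.array([[0.1, 0.1, 0.5, 0.5],
                      [0.2, 0.3, 0.8, 0.9]], dtype=np.float32)
    boxlist = BoxList(boxes)
    boxlist.add_field('scores', np.array([0.9, 0.4], dtype=np.float32))

    print(boxlist.get())                # [N, 4] box corners
    print(boxlist.get_field('scores'))  # the attached scores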
-
easy_vision.python.utils.np_box_list_ops¶
Bounding Box List operations for Numpy BoxLists.
- Example box operations that are supported:
- Areas: compute bounding box areas
- IOU: pairwise intersection-over-union scores
-
class
easy_vision.python.utils.np_box_list_ops.
SortOrder
[source]¶ Bases:
object
Enum class for sort order.
-
ascend
¶ ascend order.
-
descend
¶ descend order.
-
ASCEND
= 1¶
-
DESCEND
= 2¶
-
-
easy_vision.python.utils.np_box_list_ops.
area
(boxlist)[source]¶ Computes area of boxes.
Parameters: boxlist – BoxList holding N boxes Returns: a numpy array with shape [N*1] representing box areas
-
easy_vision.python.utils.np_box_list_ops.
change_coordinate_frame
(boxlist, window)[source]¶ Change coordinate frame of the boxlist to be relative to window’s frame.
Given a window of the form [ymin, xmin, ymax, xmax], changes bounding box coordinates from boxlist to be relative to this window (e.g., the min corner maps to (0,0) and the max corner maps to (1,1)).
An example use case is data augmentation: where we are given groundtruth boxes (boxlist) and would like to randomly crop the image to some window (window). In this case we need to change the coordinate frame of each groundtruth box to be relative to this new window.
Parameters: - boxlist – A BoxList object holding N boxes.
- window – a size 4 1-D numpy array.
Returns: a BoxList object with N boxes.
-
easy_vision.python.utils.np_box_list_ops.
clip_to_window
(boxlist, window)[source]¶ Clip bounding boxes to a window.
This op clips input bounding boxes (represented by bounding box corners) to a window, optionally filtering out boxes that do not overlap at all with the window.
Parameters: - boxlist – BoxList holding M_in boxes
- window – a numpy array of shape [4] representing the [y_min, x_min, y_max, x_max] window to which the op should clip boxes.
Returns: a BoxList holding M_out boxes where M_out <= M_in
-
easy_vision.python.utils.np_box_list_ops.
concatenate
(boxlists, fields=None)[source]¶ Concatenate list of BoxLists.
This op concatenates a list of input BoxLists into a larger BoxList. It also handles concatenation of BoxList fields as long as the field tensor shapes are equal except for the first dimension.
Parameters: - boxlists – list of BoxList objects
- fields – optional list of fields to also concatenate. By default, all fields from the first BoxList in the list are included in the concatenation.
Returns: a BoxList with number of boxes equal to sum([boxlist.num_boxes() for boxlist in boxlists])
Raises: ValueError
– if boxlists is invalid (i.e., is not a list, is empty, or contains non BoxList objects), or if requested fields are not contained in all boxlists
-
easy_vision.python.utils.np_box_list_ops.
filter_scores_greater_than
(boxlist, thresh)[source]¶ Filter to keep only boxes with score exceeding a given threshold.
This op keeps the collection of boxes whose corresponding scores are greater than the input threshold.
Parameters: - boxlist – BoxList holding N boxes. Must contain a ‘scores’ field representing detection scores.
- thresh – scalar threshold
Returns: a BoxList holding M boxes where M <= N
Raises: ValueError
– if boxlist not a BoxList object or if it does not have a scores field
-
easy_vision.python.utils.np_box_list_ops.
gather
(boxlist, indices, fields=None)[source]¶ Gather boxes from BoxList according to indices and return new BoxList.
By default, gather returns boxes corresponding to the input index list, as well as all additional fields stored in the boxlist (indexing into the first dimension). However one can optionally only gather from a subset of fields.
Parameters: - boxlist – BoxList holding N boxes
- indices – a 1-d numpy array of type int_
- fields – (optional) list of fields to also gather from. If None (default), all fields are gathered from. Pass an empty fields list to only gather the box coordinates.
Returns: a BoxList corresponding to the subset of the input BoxList specified by indices
Return type: subboxlist
Raises: ValueError
– if specified field is not contained in boxlist or if the indices are not of type int_
-
easy_vision.python.utils.np_box_list_ops.
intersection
(boxlist1, boxlist2)[source]¶ Compute pairwise intersection areas between boxes.
Parameters: - boxlist1 – BoxList holding N boxes
- boxlist2 – BoxList holding M boxes
Returns: a numpy array with shape [N*M] representing pairwise intersection area
-
easy_vision.python.utils.np_box_list_ops.
ioa
(boxlist1, boxlist2)[source]¶ Computes pairwise intersection-over-area between box collections.
Intersection-over-area (ioa) between two boxes box1 and box2 is defined as their intersection area over box2’s area. Note that ioa is not symmetric, that is, IOA(box1, box2) != IOA(box2, box1).
Parameters: - boxlist1 – BoxList holding N boxes
- boxlist2 – BoxList holding M boxes
Returns: a numpy array with shape [N, M] representing pairwise ioa scores.
-
easy_vision.python.utils.np_box_list_ops.
iou
(boxlist1, boxlist2)[source]¶ Computes pairwise intersection-over-union between box collections.
Parameters: - boxlist1 – BoxList holding N boxes
- boxlist2 – BoxList holding M boxes
Returns: a numpy array with shape [N, M] representing pairwise iou scores.
-
easy_vision.python.utils.np_box_list_ops.
multi_class_non_max_suppression
(boxlist, score_thresh, iou_thresh, max_output_size)[source]¶ Multi-class version of non maximum suppression.
This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. It operates independently for each class for which scores are provided (via the scores field of the input box_list), pruning boxes with score less than a provided threshold prior to applying NMS.
Parameters: - boxlist – BoxList holding N boxes. Must contain a 'scores' field representing detection scores. This scores field is a tensor that can be 1 dimensional (in the case of a single class) or 2-dimensional, in which case we assume that it takes the shape [num_boxes, num_classes]. We further assume that this rank is known statically and that scores.shape[1] is also known (i.e., the number of classes is fixed and known at graph construction time).
- score_thresh – scalar threshold for score (low scoring boxes are removed).
- iou_thresh – scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed).
- max_output_size – maximum number of retained boxes per class.
Returns: a BoxList holding M boxes with a rank-1 scores field representing corresponding scores for each box with scores sorted in decreasing order and a rank-1 classes field representing a class label for each box.
Raises: ValueError
– if iou_thresh is not in [0, 1] or if input boxlist does not have a valid scores field.
-
easy_vision.python.utils.np_box_list_ops.
non_max_suppression
(boxlist, max_output_size=10000, iou_threshold=1.0, score_threshold=-10.0)[source]¶ Non maximum suppression.
This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. In each iteration, the detected bounding box with highest score in the available pool is selected.
Parameters: - boxlist – BoxList holding N boxes. Must contain a ‘scores’ field representing detection scores. All scores belong to the same class.
- max_output_size – maximum number of retained boxes
- iou_threshold – intersection over union threshold.
- score_threshold – minimum score threshold. Remove the boxes with scores less than this value. Default value is set to -10. A very low threshold to pass pretty much all the boxes, unless the user sets a different score threshold.
Returns: a BoxList holding M boxes where M <= max_output_size
Raises: - ValueError – if 'scores' field does not exist
- ValueError – if threshold is not in [0, 1]
- ValueError – if max_output_size < 0
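A sketch of single-class NMS on a scored BoxList; the threshold values are illustrative:

    import numpy as np
    from easy_vision.python.utils import np_box_list, np_box_list_ops

    boxlist = np_box_list.BoxList(np.array(
        [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 0.9, 0.9],
         [0.5, 0.5, 1.0, 1.0]], dtype=np.float32))
    boxlist.add_field('scores', np.array([0.9, 0.8, 0.3], dtype=np.float32))

    # suppress boxes overlapping a higher-scoring box by IOU > 0.5
    kept = np_box_list_ops.non_max_suppression(
        boxlist, max_output_size=10, iou_threshold=0.5, score_threshold=0.1)
    print(kept.get(), kept.get_field('scores'))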
-
easy_vision.python.utils.np_box_list_ops.
prune_non_overlapping_boxes
(boxlist1, boxlist2, minoverlap=0.0)[source]¶ Prunes the boxes in boxlist1 that overlap less than thresh with boxlist2.
For each box in boxlist1, we want its IOA to be more than minoverlap with at least one of the boxes in boxlist2. If it does not, we remove it.
Parameters: - boxlist1 – BoxList holding N boxes.
- boxlist2 – BoxList holding M boxes.
- minoverlap – Minimum required overlap between boxes, to count them as overlapping.
Returns: A pruned boxlist with size [N’, 4].
-
easy_vision.python.utils.np_box_list_ops.
prune_outside_window
(boxlist, window)[source]¶ Prunes bounding boxes that fall outside a given window.
This function prunes bounding boxes that even partially fall outside the given window. See also ClipToWindow which only prunes bounding boxes that fall completely outside the window, and clips any bounding boxes that partially overflow.
Parameters: - boxlist – a BoxList holding M_in boxes.
- window – a numpy array of size 4, representing [ymin, xmin, ymax, xmax] of the window.
Returns: - pruned_corners: a tensor with shape [M_out, 4] where M_out <= M_in
- valid_indices: a tensor with shape [M_out] indexing the valid bounding boxes in the input tensor
-
easy_vision.python.utils.np_box_list_ops.
scale
(boxlist, y_scale, x_scale)[source]¶ Scale box coordinates in x and y dimensions.
Parameters: - boxlist – BoxList holding N boxes
- y_scale – float
- x_scale – float
Returns: BoxList holding N boxes
Return type: boxlist
-
easy_vision.python.utils.np_box_list_ops.
sort_by_field
(boxlist, field, order=2)[source]¶ Sort boxes and associated fields according to a scalar field.
A common use case is reordering the boxes according to descending scores.
Parameters: - boxlist – BoxList holding N boxes.
- field – A BoxList field for sorting and reordering the BoxList.
- order – (Optional) ‘descend’ or ‘ascend’. Default is descend.
Returns: A sorted BoxList with the field in the specified order.
Return type: sorted_boxlist
Raises: - ValueError – if specified field does not exist or is not of single dimension
- ValueError – if the order is not either descend or ascend
easy_vision.python.utils.np_box_mask_list¶
Numpy BoxMaskList classes and functions.
-
class
easy_vision.python.utils.np_box_mask_list.
BoxMaskList
(box_data, mask_data)[source]¶ Bases:
easy_vision.python.utils.np_box_list.BoxList
Convenience wrapper for BoxList with masks.
BoxMaskList extends the np_box_list.BoxList to contain masks as well. In particular, its constructor receives both boxes and masks. Note that the masks correspond to the full image.
-
__init__
(box_data, mask_data)[source]¶ Constructs box collection.
Parameters: - box_data – a numpy array of shape [N, 4] representing box coordinates
- mask_data – a numpy array of shape [N, height, width] representing masks with values in {0,1}. The masks correspond to the full image. The height and the width will be equal to image height and width.
Raises: - ValueError – if bbox data is not a numpy array
- ValueError – if invalid dimensions for bbox data
- ValueError – if mask data is not a numpy array
- ValueError – if invalid dimension for mask data
-
easy_vision.python.utils.np_box_mask_list_ops¶
Operations for np_box_mask_list.BoxMaskList.
- Example box operations that are supported:
- Areas: compute bounding box areas
- IOU: pairwise intersection-over-union scores
-
easy_vision.python.utils.np_box_mask_list_ops.
area
(box_mask_list)[source]¶ Computes area of masks.
Parameters: box_mask_list – np_box_mask_list.BoxMaskList holding N boxes and masks Returns: a numpy array with shape [N*1] representing mask areas
-
easy_vision.python.utils.np_box_mask_list_ops.
box_list_to_box_mask_list
(boxlist)[source]¶ Converts a BoxList containing ‘masks’ into a BoxMaskList.
Parameters: boxlist – An np_box_list.BoxList object. Returns: An np_box_mask_list.BoxMaskList object. Raises: ValueError
– If boxlist does not contain masks as a field.
-
easy_vision.python.utils.np_box_mask_list_ops.
concatenate
(box_mask_lists, fields=None)[source]¶ Concatenate list of box_mask_lists.
This op concatenates a list of input box_mask_lists into a larger box_mask_list. It also handles concatenation of box_mask_list fields as long as the field tensor shapes are equal except for the first dimension.
Parameters: - box_mask_lists – list of np_box_mask_list.BoxMaskList objects
- fields – optional list of fields to also concatenate. By default, all fields from the first BoxMaskList in the list are included in the concatenation.
Returns: a box_mask_list with number of boxes equal to sum([box_mask_list.num_boxes() for box_mask_list in box_mask_lists])
Raises: ValueError
– if box_mask_lists is invalid (i.e., is not a list, is empty, or contains non box_mask_list objects), or if requested fields are not contained in all box_mask_lists
-
easy_vision.python.utils.np_box_mask_list_ops.
filter_scores_greater_than
(box_mask_list, thresh)[source]¶ Filter to keep only boxes and masks with score exceeding a given threshold.
This op keeps the collection of boxes and masks whose corresponding scores are greater than the input threshold.
Parameters: - box_mask_list – BoxMaskList holding N boxes and masks. Must contain a ‘scores’ field representing detection scores.
- thresh – scalar threshold
Returns: a BoxMaskList holding M boxes and masks where M <= N
Raises: ValueError
– if box_mask_list not a np_box_mask_list.BoxMaskList object or if it does not have a scores field
-
easy_vision.python.utils.np_box_mask_list_ops.
gather
(box_mask_list, indices, fields=None)[source]¶ Gather boxes from np_box_mask_list.BoxMaskList according to indices.
By default, gather returns boxes corresponding to the input index list, as well as all additional fields stored in the box_mask_list (indexing into the first dimension). However one can optionally only gather from a subset of fields.
Parameters: - box_mask_list – np_box_mask_list.BoxMaskList holding N boxes
- indices – a 1-d numpy array of type int_
- fields – (optional) list of fields to also gather from. If None (default), all fields are gathered from. Pass an empty fields list to only gather the box coordinates.
Returns: a np_box_mask_list.BoxMaskList corresponding to the subset of the input box_mask_list specified by indices
Return type: subbox_mask_list
Raises: ValueError
– if specified field is not contained in box_mask_list or if the indices are not of type int_
-
easy_vision.python.utils.np_box_mask_list_ops.
intersection
(box_mask_list1, box_mask_list2)[source]¶ Compute pairwise intersection areas between masks.
Parameters: - box_mask_list1 – BoxMaskList holding N boxes and masks
- box_mask_list2 – BoxMaskList holding M boxes and masks
Returns: a numpy array with shape [N*M] representing pairwise intersection area
-
easy_vision.python.utils.np_box_mask_list_ops.
ioa
(box_mask_list1, box_mask_list2)[source]¶ Computes pairwise intersection-over-area between box and mask collections.
Intersection-over-area (ioa) between two masks mask1 and mask2 is defined as their intersection area over mask2’s area. Note that ioa is not symmetric, that is, IOA(mask1, mask2) != IOA(mask2, mask1).
Parameters: - box_mask_list1 – np_box_mask_list.BoxMaskList holding N boxes and masks
- box_mask_list2 – np_box_mask_list.BoxMaskList holding M boxes and masks
Returns: a numpy array with shape [N, M] representing pairwise ioa scores.
-
easy_vision.python.utils.np_box_mask_list_ops.
iou
(box_mask_list1, box_mask_list2)[source]¶ Computes pairwise intersection-over-union between box and mask collections.
Parameters: - box_mask_list1 – BoxMaskList holding N boxes and masks
- box_mask_list2 – BoxMaskList holding M boxes and masks
Returns: a numpy array with shape [N, M] representing pairwise iou scores.
-
easy_vision.python.utils.np_box_mask_list_ops.
multi_class_non_max_suppression
(box_mask_list, score_thresh, iou_thresh, max_output_size)[source]¶ Multi-class version of non maximum suppression.
This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. It operates independently for each class for which scores are provided (via the scores field of the input box_list), pruning boxes with score less than a provided threshold prior to applying NMS.
Parameters: - box_mask_list – np_box_mask_list.BoxMaskList holding N boxes. Must contain a ‘scores’ field representing detection scores. This scores field is a tensor that can be 1 dimensional (in the case of a single class) or 2-dimensional, in which case we assume that it takes the shape [num_boxes, num_classes]. We further assume that this rank is known statically and that scores.shape[1] is also known (i.e., the number of classes is fixed and known at graph construction time).
- score_thresh – scalar threshold for score (low scoring boxes are removed).
- iou_thresh – scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed).
- max_output_size – maximum number of retained boxes per class.
Returns: a box_mask_list holding M boxes with a rank-1 scores field representing corresponding scores for each box with scores sorted in decreasing order and a rank-1 classes field representing a class label for each box.
Raises: ValueError
– if iou_thresh is not in [0, 1] or if input box_mask_list does not have a valid scores field.
-
easy_vision.python.utils.np_box_mask_list_ops.
non_max_suppression
(box_mask_list, max_output_size=10000, iou_threshold=1.0, score_threshold=-10.0)[source]¶ Non maximum suppression.
This op greedily selects a subset of detection bounding boxes, pruning away boxes that have high IOU (intersection over union) overlap (> thresh) with already selected boxes. In each iteration, the detected bounding box with highest score in the available pool is selected.
Parameters: - box_mask_list – np_box_mask_list.BoxMaskList holding N boxes. Must contain a ‘scores’ field representing detection scores. All scores belong to the same class.
- max_output_size – maximum number of retained boxes
- iou_threshold – intersection over union threshold.
- score_threshold – minimum score threshold. Remove the boxes with scores less than this value. Default value is set to -10. A very low threshold to pass pretty much all the boxes, unless the user sets a different score threshold.
Returns: an np_box_mask_list.BoxMaskList holding M boxes where M <= max_output_size
Raises: - ValueError – if 'scores' field does not exist
- ValueError – if threshold is not in [0, 1]
- ValueError – if max_output_size < 0
-
easy_vision.python.utils.np_box_mask_list_ops.
prune_non_overlapping_masks
(box_mask_list1, box_mask_list2, minoverlap=0.0)[source]¶ Prunes the boxes in list1 that overlap less than thresh with list2.
For each mask in box_mask_list1, we want its IOA to be more than minoverlap with at least one of the masks in box_mask_list2. If it does not, we remove it. If the masks are not full size image, we do the pruning based on boxes.
Parameters: - box_mask_list1 – np_box_mask_list.BoxMaskList holding N boxes and masks.
- box_mask_list2 – np_box_mask_list.BoxMaskList holding M boxes and masks.
- minoverlap – Minimum required overlap between boxes, to count them as overlapping.
Returns: A pruned box_mask_list with size [N’, 4].
-
easy_vision.python.utils.np_box_mask_list_ops.
sort_by_field
(box_mask_list, field, order=2)[source]¶ Sort boxes and associated fields according to a scalar field.
A common use case is reordering the boxes according to descending scores.
Parameters: - box_mask_list – BoxMaskList holding N boxes.
- field – A BoxMaskList field for sorting and reordering the BoxMaskList.
- order – (Optional) ‘descend’ or ‘ascend’. Default is descend.
Returns: A sorted BoxMaskList with the field in the specified order.
Return type: sorted_box_mask_list
easy_vision.python.utils.np_box_ops¶
Operations for [N, 4] numpy arrays representing bounding boxes.
- Example box operations that are supported:
- Areas: compute bounding box areas
- IOU: pairwise intersection-over-union scores
-
easy_vision.python.utils.np_box_ops.
area
(boxes)[source]¶ Computes area of boxes.
Parameters: boxes – Numpy array with shape [N, 4] holding N boxes Returns: a numpy array with shape [N*1] representing box areas
-
easy_vision.python.utils.np_box_ops.
intersection
(boxes1, boxes2)[source]¶ Compute pairwise intersection areas between boxes.
Parameters: - boxes1 – a numpy array with shape [N, 4] holding N boxes
- boxes2 – a numpy array with shape [M, 4] holding M boxes
Returns: a numpy array with shape [N*M] representing pairwise intersection area
-
easy_vision.python.utils.np_box_ops.
ioa
(boxes1, boxes2)[source]¶ Computes pairwise intersection-over-area between box collections.
Intersection-over-area (ioa) between two boxes box1 and box2 is defined as their intersection area over box2’s area. Note that ioa is not symmetric, that is, IOA(box1, box2) != IOA(box2, box1).
Parameters: - boxes1 – a numpy array with shape [N, 4] holding N boxes.
- boxes2 – a numpy array with shape [M, 4] holding M boxes.
Returns: a numpy array with shape [N, M] representing pairwise ioa scores.
-
easy_vision.python.utils.np_box_ops.
iou
(boxes1, boxes2)[source]¶ Computes pairwise intersection-over-union between box collections.
Parameters: - boxes1 – a numpy array with shape [N, 4] holding N boxes.
- boxes2 – a numpy array with shape [M, 4] holding M boxes.
Returns: a numpy array with shape [N, M] representing pairwise iou scores.
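A usage sketch on raw arrays; boxes are in [y_min, x_min, y_max, x_max] order:

    import numpy as np
    from easy_vision.python.utils import np_box_ops

    boxes1 = np.array([[0.0, 0.0, 1.0, 1.0]], dtype=np.float32)
    boxes2 = np.array([[0.0, 0.0, 1.0, 1.0],
                       [0.5, 0.5, 1.0, 1.0]], dtype=np.float32)
    # the second box covers a quarter of the first: IOU 0.25
    print(np_box_ops.iou(boxes1, boxes2))  # [[1.0, 0.25]]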
easy_vision.python.utils.np_mask_ops¶
Operations for [N, height, width] numpy arrays representing masks.
- Example mask operations that are supported:
- Areas: compute mask areas
- IOU: pairwise intersection-over-union scores
-
easy_vision.python.utils.np_mask_ops.
area
(masks)[source]¶ Computes area of masks.
Parameters: masks – Numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}. Returns: a numpy array with shape [N*1] representing mask areas. Raises: ValueError
– If masks.dtype is not np.uint8
-
easy_vision.python.utils.np_mask_ops.
intersection
(masks1, masks2)[source]¶ Compute pairwise intersection areas between masks.
Parameters: - masks1 – a numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}.
- masks2 – a numpy array with shape [M, height, width] holding M masks. Masks values are of type np.uint8 and values are in {0,1}.
Returns: a numpy array with shape [N*M] representing pairwise intersection area.
Raises: ValueError
– If masks1 and masks2 are not of type np.uint8.
-
easy_vision.python.utils.np_mask_ops.
ioa
(masks1, masks2)[source]¶ Computes pairwise intersection-over-area between box collections.
Intersection-over-area (ioa) between two masks, mask1 and mask2 is defined as their intersection area over mask2’s area. Note that ioa is not symmetric, that is, IOA(mask1, mask2) != IOA(mask2, mask1).
Parameters: - masks1 – a numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}.
- masks2 – a numpy array with shape [M, height, width] holding M masks. Masks values are of type np.uint8 and values are in {0,1}.
Returns: a numpy array with shape [N, M] representing pairwise ioa scores.
Raises: ValueError
– If masks1 and masks2 are not of type np.uint8.
-
easy_vision.python.utils.np_mask_ops.
iou
(masks1, masks2)[source]¶ Computes pairwise intersection-over-union between mask collections.
Parameters: - masks1 – a numpy array with shape [N, height, width] holding N masks. Masks values are of type np.uint8 and values are in {0,1}.
- masks2 – a numpy array with shape [M, height, width] holding M masks. Masks values are of type np.uint8 and values are in {0,1}.
Returns: a numpy array with shape [N, M] representing pairwise iou scores.
Raises: ValueError
– If masks1 and masks2 are not of type np.uint8.
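A usage sketch; masks must be uint8 arrays with values in {0, 1}:

    import numpy as np
    from easy_vision.python.utils import np_mask_ops

    masks1 = np.zeros((1, 4, 4), dtype=np.uint8)
    masks1[0, :2, :] = 1  # top half of the image
    masks2 = np.zeros((1, 4, 4), dtype=np.uint8)
    masks2[0, :, :2] = 1  # left half of the image

    # intersection 4 pixels, union 12 pixels -> IOU 1/3
    print(np_mask_ops.iou(masks1, masks2))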
easy_vision.python.utils.profiling¶
-
easy_vision.python.utils.profiling.
timeit_verbose
(f)¶ Decorator that prints the execution time of f.
easy_vision.python.utils.seq_utils¶
easy_vision.python.utils.text_vis_utils¶
-
class
easy_vision.python.utils.text_vis_utils.
VisualizeTextAttentionAlignment
(char_dict, attention_type='line', max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEvalAttention')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for visualizing text recognition attention alignment.
-
__init__
(char_dict, attention_type='line', max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEvalAttention')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: - images: a uint8 tensor of batched images with shape NxHxWxC
- image_shapes: an int64 tensor with shape Nx3
- image_ids: a string tensor with shape N
-
-
class
easy_vision.python.utils.text_vis_utils.
VisualizeTextDetection
(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextDetectionEval')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for text detection visualizations.
-
__init__
(category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextDetectionEval')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: - images: a uint8 tensor of batched images with shape NxHxWxC
- image_shapes: an int64 tensor with shape Nx3
- image_ids: a string tensor with shape N
-
-
class
easy_vision.python.utils.text_vis_utils.
VisualizeTextEnd2End
(char_dict, category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextEnd2EndEval')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for text spotting visualizations.
-
__init__
(char_dict, category_index, max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextEnd2EndEval')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: - images: a uint8 tensor of batched images with shape NxHxWxC
- image_shapes: an int64 tensor with shape Nx3
- image_ids: a string tensor with shape N
-
-
class
easy_vision.python.utils.text_vis_utils.
VisualizeTextEnd2EndFeature
(char_dict, max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEnd2EndEvalFeature')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for visualizing text spotting features.
-
__init__
(char_dict, max_examples_to_draw=5, visualization_export_dir='', summary_name_prefix='TextEnd2EndEvalFeature')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: - images: a uint8 tensor of batched images with shape NxHxWxC
- image_shapes: an int64 tensor with shape Nx3
- image_ids: a string tensor with shape N
-
-
class
easy_vision.python.utils.text_vis_utils.
VisualizeTextRectification
(max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextRectificationEval')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for text rectification visualizations.
-
__init__
(max_examples_to_draw=5, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='TextRectificationEval')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: - images: a uint8 tensor of batched images with shape NxHxWxC
- image_shapes: an int64 tensor with shape Nx3
- image_ids: a string tensor with shape N
-
-
easy_vision.python.utils.text_vis_utils.
draw_batch_gt_text_image
(image_batched, image_shape_batched, groundtruth_boxes_batched, groundtruth_classes_batched, groundtruth_keypoints_batched, num_groundtruth_boxes_batched, groundtruth_texts_batched=None, category_index=None, use_normalized_coordinates=False, num_plots=16, n_jobs=16)[source]¶ Draw batch text detection or end2end groundtruth
Parameters: - image_batched – a list of numpy images
- image_shape_batched – a list of true image shapes
- groundtruth_boxes_batched – a list of groundtruth bounding boxes
- groundtruth_classes_batched – a list of groundtruth classes
- groundtruth_keypoints_batched – a list of groundtruth text keypoints
- num_groundtruth_boxes_batched – a list of number of groundtruth boxes
- groundtruth_texts_batched – a list of groundtruth texts
- category_index – a dictionary that maps integer ids to dicts
- use_normalized_coordinates – If True, treat coordinates as relative to images
- num_plots – number of plot images
- n_jobs – number of parallel jobs
Returns: - vis_images: a list of visualized images
- true_shape: a list of the visualized images' true shapes
-
easy_vision.python.utils.text_vis_utils.
draw_batch_text_attention
(image_batched, image_shape_batched, attention_batched, attention_shape_batched, groundtruth_text_batched, predict_text_ids_batched, char_dict, num_plots=16, n_jobs=16, attention_type='line')[source]¶
-
easy_vision.python.utils.text_vis_utils.
draw_batch_text_image
(image_batched, image_shape_batched, groundtruth_boxes_batched, groundtruth_classes_batched, groundtruth_keypoints_batched, num_groundtruth_boxes_batched, detection_boxes_batched, detection_classes_batched, detection_keypoints_batched, detection_scores_batched, num_detections_batched, groundtruth_texts_batched=None, detection_texts_ids_batched=None, category_index=None, char_dict=None, use_normalized_coordinates=False, num_plots=16, n_jobs=16)[source]¶ Draw batch detection or end2end groundtruth and predictions
Parameters: - image_batched – a list of numpy images
- image_shape_batched – a list of true image shapes
- groundtruth_boxes_batched – a list of groundtruth bounding boxes
- groundtruth_classes_batched – a list of groundtruth classes
- groundtruth_keypoints_batched – a list of groundtruth text keypoints
- num_groundtruth_boxes_batched – a list of number of groundtruth boxes
- detection_boxes_batched – a list of detection bounding boxes
- detection_classes_batched – a list of detection classes
- detection_keypoints_batched – a list of detection text keypoints
- detection_scores_batched – a list of detection class scores
- num_detections_batched – a list of number of detections
- groundtruth_texts_batched – a list of groundtruth texts
- detection_texts_ids_batched – a list of detection texts ids
- category_index – a dictionary that maps integer ids to dicts
- char_dict – an instance of vocab_utils.CharDict
- use_normalized_coordinates – If True, treat coordinates as relative to images
- num_plots – number of plot images
- n_jobs – number of parallel jobs
Returns: - vis_images: a list of visualized images
- true_shape: a list of the visualized images' true shapes
-
easy_vision.python.utils.text_vis_utils.
draw_batch_text_recognition_feature
(image_batched_dict, feature_batched_dict, points_batched_dict, text_ids_batched=None, char_dict=None, num_plots=16, n_jobs=16, image_size=(32, 100))[source]¶
-
easy_vision.python.utils.text_vis_utils.
draw_batch_text_rectification_image
(image_batched, image_shape_batched, groundtruth_keypoints_batched, prediction_keypoints_batched, num_plots=16, n_jobs=16)[source]¶
-
easy_vision.python.utils.text_vis_utils.
draw_box_text_keypoint
(image, boxlist, classlist, category_index, textlist=None, keypointslist=None, scorelist=None, box_color='red', keypoint_color='blue', use_normalized_coordinates=False, scale=1)[source]¶ Draw boxes, texts, and keypoints on an image
Parameters: - image – a image numpy array
- boxlist – detection boxes
- category_index – a dictionary that maps integer ids to dicts
- classlist – detection class ids
- textlist – detection texts
- keypointslist – detection keypoints
- scorelist – detection scores
- box_color – color to draw bounding box
- keypoint_color – color to draw keypoints
- use_normalized_coordinates – If True, treat coordinates as relative to images
- scale – image rescale ratio for better visualization
Returns: a visualized image
-
easy_vision.python.utils.text_vis_utils.
draw_gt_text_image
(image, image_shape, groundtruth_boxes, groundtruth_classes, groundtruth_keypoints, groundtruth_texts=None, category_index=None, use_normalized_coordinates=False)[source]¶ Draw text detection or end2end groundtruth
Parameters: - image – a numpy array image
- image_shape – true image shape
- groundtruth_boxes – groundtruth bounding boxes
- groundtruth_classes – groundtruth classes
- groundtruth_keypoints – groundtruth text keypoints
- groundtruth_texts – groundtruth texts
- category_index – a dictionary that maps integer ids to dicts
- use_normalized_coordinates – If True, treat coordinates as relative to images
Returns: - image: a visualized image
- true_shape: the visualized image's true shape
-
easy_vision.python.utils.text_vis_utils.
draw_sequence_attention
(image, image_shape, attention, attention_shape, groundtruth_text, predict_text_ids, char_dict)[source]¶
-
easy_vision.python.utils.text_vis_utils.
draw_spatial_attention
(image, image_shape, attention, attention_shape, groundtruth_text, predict_text_ids, char_dict)[source]¶
-
easy_vision.python.utils.text_vis_utils.
draw_text_image
(image, image_shape, groundtruth_boxes, groundtruth_classes, groundtruth_keypoints, detection_boxes, detection_classes, detection_keypoints, detection_scores, groundtruth_texts=None, detection_texts_ids=None, category_index=None, char_dict=None, use_normalized_coordinates=False)[source]¶ Draw text detection or end2end groundtruth and predictions
Parameters: - image – a numpy array image
- image_shape – true image shape
- groundtruth_boxes – groundtruth bounding boxes
- groundtruth_classes – groundtruth classes
- groundtruth_keypoints – groundtruth text keypoints
- detection_boxes – detection bounding boxes
- detection_classes – detection classes
- detection_keypoints – detection text keypoints
- detection_scores – detection class scores
- groundtruth_texts – groundtruth texts
- detection_texts_ids – detection texts ids
- category_index – a dictionary that maps integer ids to dicts
- char_dict – an instance of vocab_utils.CharDict
- use_normalized_coordinates – If True, treat coordinates as relative to images
Returns: - image: a visualized image
- true_shape: the visualized image's true shape
easy_vision.python.utils.tf_util¶
easy_vision.python.utils.variables_helper¶
Helper functions for manipulating collections of variables during training.
-
easy_vision.python.utils.variables_helper.
filter_variables
(variables, filter_regex_list, invert=False)[source]¶ Filters out the variables matching the filter_regex.
Filters out the variables whose names match any of the regular expressions in filter_regex_list and returns the remaining variables. Optionally, if invert=True, the complement set is returned.
Parameters: - variables – a list of tensorflow variables.
- filter_regex_list – a list of string regular expressions.
- invert – (boolean). If True, returns the complement of the filter set; that is, all variables matching filter_regex are kept and all others discarded.
Returns: a list of filtered variables.
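A short sketch (TF 1.x) of pruning a restore list with a regex; the variable names and pattern are illustrative:

    import tensorflow as tf
    from easy_vision.python.utils import variables_helper

    # hypothetical graph variables
    tf.get_variable('backbone/conv1/weights', shape=[3])
    tf.get_variable('head/BatchNorm/beta', shape=[3])

    # keep everything except BatchNorm variables
    to_restore = variables_helper.filter_variables(
        tf.global_variables(), ['.*BatchNorm.*'])
    print([v.op.name for v in to_restore])  # ['backbone/conv1/weights']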
-
easy_vision.python.utils.variables_helper.
freeze_gradients_matching_regex
(grads_and_vars, regex_list)[source]¶ Freeze gradients whose variable names match a regular expression.
Parameters: - grads_and_vars – A list of gradient to variable pairs (tuples).
- regex_list – A list of string regular expressions.
Returns: a list of gradient-to-variable pairs (tuples) that excludes the pairs whose variables match the regex.
Return type: grads_and_vars
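Example — a minimal sketch of freezing a backbone by dropping its gradient pairs before apply_gradients (TensorFlow 1.x; the loss and variable names are illustrative):

import tensorflow as tf
from easy_vision.python.utils import variables_helper

backbone_w = tf.get_variable('backbone/conv1/weights', shape=[3, 3, 3, 8])
head_w = tf.get_variable('head/logits/weights', shape=[10])
loss = tf.reduce_sum(backbone_w) + tf.reduce_sum(head_w)

opt = tf.train.GradientDescentOptimizer(0.1)
grads_and_vars = opt.compute_gradients(loss)
# Remove (gradient, variable) pairs for the backbone so it stays frozen.
grads_and_vars = variables_helper.freeze_gradients_matching_regex(
    grads_and_vars, ['^backbone/.*'])
train_op = opt.apply_gradients(grads_and_vars)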
-
easy_vision.python.utils.variables_helper.
get_variables_available_in_checkpoint
(variables, checkpoint_path, include_global_step=True)[source]¶ Returns the subset of variables available in the checkpoint.
Inspects the given checkpoint and returns the subset of variables that are available in it.
TODO(rathodv): force input and output to be a dictionary.
Parameters: - variables – a list or dictionary of variables to find in checkpoint.
- checkpoint_path – path to the checkpoint to restore variables from.
- include_global_step – whether to include global_step variable, if it exists. Default True.
Returns: A list or dictionary of variables.
Raises: ValueError – if variables is not a list or dict.
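Example — a minimal sketch of restoring only the variables a checkpoint actually contains (TensorFlow 1.x; the checkpoint path is a placeholder):

import tensorflow as tf
from easy_vision.python.utils import variables_helper

checkpoint_path = '/path/to/model.ckpt'  # placeholder
available = variables_helper.get_variables_available_in_checkpoint(
    tf.global_variables(), checkpoint_path, include_global_step=False)
saver = tf.train.Saver(available)
# later: saver.restore(sess, checkpoint_path)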
-
easy_vision.python.utils.variables_helper.
multiply_gradients_matching_regex
(grads_and_vars, regex_list, multiplier)[source]¶ Multiply gradients whose variable names match a regular expression.
Parameters: - grads_and_vars – A list of gradient to variable pairs (tuples).
- regex_list – A list of string regular expressions.
- multiplier – A (float) multiplier to apply to each gradient matching the regular expression.
Returns: A list of gradient to variable pairs (tuples).
Return type: grads_and_vars
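Example — a minimal sketch of boosting the effective learning rate of a new head by scaling its gradients 10x (TensorFlow 1.x; names are illustrative):

import tensorflow as tf
from easy_vision.python.utils import variables_helper

backbone_w = tf.get_variable('backbone/conv/weights', shape=[3, 3, 3, 8])
head_w = tf.get_variable('head/fc/weights', shape=[10])
opt = tf.train.GradientDescentOptimizer(0.01)
grads_and_vars = opt.compute_gradients(tf.reduce_sum(backbone_w) + tf.reduce_sum(head_w))
# Multiply gradients of head variables by 10; all other pairs are unchanged.
grads_and_vars = variables_helper.multiply_gradients_matching_regex(
    grads_and_vars, ['^head/.*'], multiplier=10.0)
train_op = opt.apply_gradients(grads_and_vars)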
easy_vision.python.utils.video_decode¶
-
easy_vision.python.utils.video_decode.
video_decode
(video, video_length, decode_type, sample_fps, reshape_size, decode_batch_size, decode_keep_size)[source]¶ Parameters: - video – input video
- video_length – input video length
- decode_type – decode type: 1 = intra only, 2 = keyframe only, 3 = without bidirectional frames, 4 = decode all
- sample_fps – sample rate; default -1 (full sampling).
- reshape_size – output size of decoded frames.
- decode_batch_size – batch size of each decode phase.
- decode_keep_size – remaining size of the last decode phase.
Returns: time_stamp – timestamp of each frame (float); image_data – decoded frames (string)
Return type: (time_stamp, image_data)
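Example — a hedged sketch of a keyframe-only decode. Whether video is raw encoded bytes or a tensor depends on the implementation, and the reshape_size/batch values below are illustrative guesses, not recommendations; the helper name decode_keyframes is hypothetical:

from easy_vision.python.utils import video_decode as vd

def decode_keyframes(video, video_length):
  # decode_type=2 keeps keyframes only; sample_fps=-1 means full sampling.
  time_stamp, image_data = vd.video_decode(
      video, video_length, decode_type=2, sample_fps=-1,
      reshape_size=256, decode_batch_size=32, decode_keep_size=0)
  return time_stamp, image_data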
easy_vision.python.utils.video_input_utils¶
-
class
easy_vision.python.utils.video_input_utils.
Kinetics
(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]¶ Bases:
object
Kinetics data interface.
Parameters: - text_line (string) – video path and label
- sample_duration – video length of each sample
- initial_scale – the initial scale for multiscale cropping
- n_scales – the number of scales for multiscale cropping
- scale_step – the scale step for multiscale cropping
- train_crop – the cropping method
- sample_size – the crop size
- n_samples_for_each_video – number of clips per video
- is_temporal_transform – whether to apply a temporal transform
- is_spatial_transform (callable, optional) – whether to apply a spatial transform, e.g. transforms.RandomCrop
-
__init__
(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]¶ x.__init__(…) initializes x; see help(type(x)) for signature
-
-
class
easy_vision.python.utils.video_input_utils.
UCF101
(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]¶ Bases:
object
UCF101 data interface.
Parameters: - text_line (string) – video path and label
- sample_duration – video length of each sample
- initial_scale – the initial scale for multiscale cropping
- n_scales – the number of scales for multiscale cropping
- scale_step – the scale step for multiscale cropping
- train_crop – the cropping method
- sample_size – the crop size
- is_spatial_transform (callable, optional) – whether to apply a spatial transform, e.g. transforms.RandomCrop
- is_training – is training or not
-
__init__
(text_line, sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=None, is_spatial_transform=None, is_training=True)[source]¶ x.__init__(…) initializes x; see help(type(x)) for signature
-
-
easy_vision.python.utils.video_input_utils.
get_frames_data_ucf101
(filename, num_frames_per_clip=16)[source]¶ Given a directory containing extracted frames, returns a video clip of num_frames_per_clip consecutive frames as a list of numpy arrays.
-
easy_vision.python.utils.video_input_utils.
get_video_frames
(text_line, dataset_type='UCF101', sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_temporal_transform=True, is_spatial_transform=True, is_training=True)[source]¶ Parameters: - text_line – each line represents one video with label
- dataset_type – dataset type, e.g. UCF101, kinetics, default is UCF101
- sample_duration – video length of each sample
- initial_scale – specifying the initial scale for multiscale cropping
- n_scales – specifying the number of scales for multiscale cropping
- scale_step – specifying the scale step for multiscale cropping
- train_crop – specifying the cropping method
- sample_size – specifying the crop size
Returns: clip – video clips; label – video labels; filename – input filename
Return type: (clip, label, filename)
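Example — a hedged sketch of sampling clips for one video. The "path label" line format is an assumption based on the text_line description above, and the file path is a placeholder:

from easy_vision.python.utils import video_input_utils

text_line = '/data/ucf101/v_ApplyEyeMakeup_g01_c01.avi 0'  # placeholder
clip, label, filename = video_input_utils.get_video_frames(
    text_line, dataset_type='UCF101', sample_duration=16, sample_size=112)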
-
easy_vision.python.utils.video_input_utils.
preprocess
(video, dataset_type='UCF101', sample_duration=16, initial_scale=1, n_scales=5, scale_step=0.84089641525, train_crop='corner', sample_size=112, n_samples_for_each_video=1, is_spatial_transform=True, is_training=True)[source]¶
easy_vision.python.utils.video_transforms¶
-
class
easy_vision.python.utils.video_transforms.
CenterCrop
(size)[source]¶ Bases:
object
Crops the given PIL.Image at the center.
Parameters: size – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
-
class
easy_vision.python.utils.video_transforms.
Compose
(transforms)[source]¶ Bases:
object
Composes several transforms together.
Parameters: transforms (list of Transform objects) – list of transforms to compose.
Example
>>> transforms.Compose([
>>>     transforms.CenterCrop(10),
>>>     transforms.ToTensor(),
>>> ])
-
class
easy_vision.python.utils.video_transforms.
CornerCrop
(size, crop_position=None)[source]¶ Bases:
object
-
class
easy_vision.python.utils.video_transforms.
MultiScaleCornerCrop
(scales, size, interpolation=2, crop_positions=['c', 'tl', 'tr', 'bl', 'br'])[source]¶ Bases:
object
Crops the given PIL.Image to a randomly selected size. A crop size is selected from scales of the original size, and a crop position is randomly selected from the 4 corners and the center. The crop is finally resized to the given size.
Parameters: - scales – cropping scales of the original size
- size – size of the smaller edge
- interpolation – Default: PIL.Image.BILINEAR
-
class
easy_vision.python.utils.video_transforms.
MultiScaleRandomCrop
(scales, size, interpolation=2)[source]¶ Bases:
object
-
class
easy_vision.python.utils.video_transforms.
RandomHorizontalFlip
[source]¶ Bases:
object
Horizontally flip the given PIL.Image randomly with a probability of 0.5.
-
class
easy_vision.python.utils.video_transforms.
Scale
(size, interpolation=2)[source]¶ Bases:
object
Rescales the input PIL.Image to the given size.
Parameters: - size – Desired output size. If size is a sequence like (w, h), the output size will be matched to it. If size is an int, the smaller edge of the image will be matched to this number; i.e. if height > width, the image will be rescaled to (size * height / width, size).
- interpolation (int, optional) – Desired interpolation. Default is PIL.Image.BILINEAR.
-
class
easy_vision.python.utils.video_transforms.
TemporalBeginCrop
(size)[source]¶ Bases:
object
Temporally crops the given frame indices at the beginning. If the number of frames is less than the size, the indices are looped as many times as necessary to satisfy the size.
Parameters: size (int) – Desired output size of the crop.
-
class
easy_vision.python.utils.video_transforms.
TemporalCenterCrop
(size)[source]¶ Bases:
object
Temporally crops the given frame indices at the center. If the number of frames is less than the size, the indices are looped as many times as necessary to satisfy the size.
Parameters: size (int) – Desired output size of the crop.
-
class
easy_vision.python.utils.video_transforms.
TemporalRandomCrop
(size)[source]¶ Bases:
object
Temporally crops the given frame indices at a random location. If the number of frames is less than the size, the indices are looped as many times as necessary to satisfy the size.
Parameters: size (int) – Desired output size of the crop.
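Example — a minimal sketch composing the transforms above, assuming each transform implements __call__ in the torchvision style (the image and frame indices are synthetic):

from PIL import Image
from easy_vision.python.utils import video_transforms as vt

spatial = vt.Compose([vt.Scale(128), vt.CenterCrop(112)])
frame = Image.new('RGB', (320, 240))
cropped = spatial(frame)  # a 112x112 PIL.Image

temporal = vt.TemporalRandomCrop(16)
indices = temporal(list(range(40)))  # 16 consecutive frame indices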
easy_vision.python.utils.visualization_utils¶
A set of functions that are used for visualization.
These functions typically receive an image and perform some visualization on it. They do not return a value; instead, they modify the image itself.
-
class
easy_vision.python.utils.visualization_utils.
EvalMetricOpsVisualization
(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='evaluation_image')[source]¶ Bases:
object
Abstract base class responsible for visualizations during evaluation. Currently, summary images are not run during evaluation. One way to produce evaluation images in Tensorboard is to provide tf.summary.image strings as value_ops in tf.estimator.EstimatorSpec’s eval_metric_ops. This class is responsible for accruing images (with overlaid detections and groundtruth) and returning a dictionary that can be passed to eval_metric_ops.
-
__init__
(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='evaluation_image')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
add_images
(images, image_shapes, image_ids)[source]¶ Stores a list of images, each with shape [H, W, C].
Parameters: - images – a uint8 tensor of batched images with shape [N, H, W, C]
- image_shapes – an int64 tensor of batched image shapes with shape [N, 3]
- image_ids – a string tensor of batched image_ids with shape [N]
-
get_estimator_eval_metric_ops
(eval_dict)[source]¶ Returns metric ops for use in tf.estimator.EstimatorSpec.
Parameters: eval_dict – A dictionary that holds an image, groundtruth, and detections for a single example. See eval_util.result_dict_for_single_example() for a convenient method for constructing such a dictionary. The dictionary contains:
- fields.InputDataFields.original_image – [1, H, W, 3] image.
- fields.InputDataFields.groundtruth_boxes – [num_boxes, 4] float32 tensor with groundtruth boxes in range [0.0, 1.0].
- fields.InputDataFields.groundtruth_classes – [num_boxes] int64 tensor with 1-indexed groundtruth classes.
- fields.InputDataFields.groundtruth_instance_masks – (optional) [num_boxes, H, W] int64 tensor with instance masks.
- fields.DetectionResultFields.detection_boxes – [max_num_boxes, 4] float32 tensor with detection boxes in range [0.0, 1.0].
- fields.DetectionResultFields.detection_classes – [max_num_boxes] int64 tensor with 1-indexed detection classes.
- fields.DetectionResultFields.detection_scores – [max_num_boxes] float32 tensor with detection scores.
- fields.DetectionResultFields.detection_masks – (optional) [max_num_boxes, H, W] float32 tensor of binarized masks.
- fields.DetectionResultFields.detection_keypoints – (optional) [max_num_boxes, num_keypoints, 2] float32 tensor with keypoints.
Returns: A dictionary of image summary names to tuples of (value_op, update_op). The update_op is the same for all items in the dictionary, and is responsible for saving a single side-by-side image with detections and groundtruth. Each value_op holds the tf.summary.image string for a given image.
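Example — a hedged sketch of merging the produced ops into an Estimator's eval_metric_ops (TensorFlow 1.x; the eval_dict is assumed to come from eval_util.result_dict_for_single_example(), and the helper name add_visualizations is hypothetical):

from easy_vision.python.utils import visualization_utils as vis_utils

def add_visualizations(eval_dict, eval_metric_ops, category_index):
  vis = vis_utils.VisualizeSingleFrameDetections(category_index, max_examples_to_draw=5)
  # Each entry maps an image summary name to a (value_op, update_op) tuple.
  eval_metric_ops.update(vis.get_estimator_eval_metric_ops(eval_dict))
  return eval_metric_ops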
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: images – a uint8 tensor of batched images with shape NxHxWxC; image_shapes – an int64 tensor with shape Nx3; image_ids – a string tensor with shape N
-
-
class
easy_vision.python.utils.visualization_utils.
VisualizeMultiFrameDetections
(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=False, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for multi-frame object detection visualizations.
-
__init__
(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=False, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is False.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Parameters: eval_dict – a dict containing the following data:
- original_image – image tensor with shape 1xHxWxC
- key – image_id, a string tensor
- detection_boxes – detection boxes tensor with shape Dx4, where D is the number of detections
- detection_scores – detection scores, a D-dimensional tensor
- detection_classes – a D-dimensional tensor with type tf.float64
- groundtruth_boxes – groundtruth boxes tensor with shape Gx4, where G is the number of groundtruth boxes
- groundtruth_classes – a G-dimensional tensor with type tf.float64
- groundtruth_difficult – optional, present if it exists in the groundtruth data; a G-dimensional tensor with type tf.bool
Returns: images – a tensor of batched images with shape NxHxWxC; image_shapes – a tensor with shape Nx3; image_ids – a string tensor with shape N
-
-
class
easy_vision.python.utils.visualization_utils.
VisualizeSingleFrameDetections
(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]¶ Bases:
easy_vision.python.utils.visualization_utils.EvalMetricOpsVisualization
Class responsible for single-frame object detection visualizations.
-
__init__
(category_index, max_examples_to_draw=5, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True, visualize_groundtruth=True, visualization_export_dir='', summary_name_prefix='Detections_Left_Groundtruth_Right')[source]¶ Creates an EvalMetricOpsVisualization.
Parameters: - category_index – A category index (dictionary) produced from a labelmap.
- max_examples_to_draw – The maximum number of example summaries to produce.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to assume boxes and keypoints are in normalized coordinates (as opposed to absolute coordinates). Default is True.
- summary_name_prefix – A string prefix for each image summary.
-
images_from_evaluation_dict
(eval_dict)[source]¶ Converts an evaluation dictionary into a list of image tensors. To be overridden by implementations.
Parameters: eval_dict – A dictionary with all the necessary information for producing visualizations.
Returns: images – a uint8 tensor of batched images with shape NxHxWxC; image_shapes – an int64 tensor with shape Nx3; image_ids – a string tensor with shape N
-
-
easy_vision.python.utils.visualization_utils.
add_cdf_image_summary
(values, name)[source]¶ Adds a tf.summary.image for a CDF plot of the values.
Normalizes values such that they sum to 1, plots the cumulative distribution function and creates a tf image summary.
Parameters: - values – a 1-D float32 tensor containing the values.
- name – name for the image summary.
-
easy_vision.python.utils.visualization_utils.
add_hist_image_summary
(values, bins, name)[source]¶ Adds a tf.summary.image for a histogram plot of the values.
Plots the histogram of values and creates a tf image summary.
Parameters: - values – a 1-D float32 tensor containing the values.
- bins – bin edges which will be directly passed to np.histogram.
- name – name for the image summary.
-
easy_vision.python.utils.visualization_utils.
draw_bounding_box_on_image
(image, ymin, xmin, ymax, xmax, color='red', thickness=4, display_str_list=(), use_normalized_coordinates=True)[source]¶ Adds a bounding box to an image. Bounding box coordinates can be specified in either absolute (pixel) or normalized coordinates by setting the use_normalized_coordinates argument. Each string in display_str_list is displayed on a separate line above the bounding box in black text on a rectangle filled with the input ‘color’. If the top of the bounding box extends to the edge of the image, the strings are displayed below the bounding box.
Parameters: - image – a PIL.Image object.
- ymin – ymin of bounding box.
- xmin – xmin of bounding box.
- ymax – ymax of bounding box.
- xmax – xmax of bounding box.
- color – color to draw bounding box. Default is red.
- thickness – line thickness. Default value is 4.
- display_str_list – list of strings to display in box (each to be shown on its own line).
- use_normalized_coordinates – If True (default), treat coordinates ymin, xmin, ymax, xmax as relative to the image. Otherwise treat coordinates as absolute.
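Example — a minimal sketch drawing one normalized box on a synthetic PIL image:

from PIL import Image
from easy_vision.python.utils import visualization_utils as vis_utils

image = Image.new('RGB', (640, 480))
vis_utils.draw_bounding_box_on_image(
    image, ymin=0.25, xmin=0.25, ymax=0.75, xmax=0.75, color='red',
    thickness=4, display_str_list=['cat: 92%'],
    use_normalized_coordinates=True)
# the PIL image is modified in place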
-
easy_vision.python.utils.visualization_utils.
draw_bounding_box_on_image_array
(image, ymin, xmin, ymax, xmax, color='red', thickness=4, display_str_list=(), use_normalized_coordinates=True)[source]¶ Adds a bounding box to an image (numpy array).
Bounding box coordinates can be specified in either absolute (pixel) or normalized coordinates by setting the use_normalized_coordinates argument.
Parameters: - image – a numpy array with shape [height, width, 3].
- ymin – ymin of bounding box.
- xmin – xmin of bounding box.
- ymax – ymax of bounding box.
- xmax – xmax of bounding box.
- color – color to draw bounding box. Default is red.
- thickness – line thickness. Default value is 4.
- display_str_list – list of strings to display in box (each to be shown on its own line).
- use_normalized_coordinates – If True (default), treat coordinates ymin, xmin, ymax, xmax as relative to the image. Otherwise treat coordinates as absolute.
-
easy_vision.python.utils.visualization_utils.
draw_bounding_boxes_on_image
(image, boxes, color='red', thickness=4, display_str_list_list=(), use_normalized_coordinates=True)[source]¶ Draws bounding boxes on image.
Parameters: - image – a PIL.Image object.
- boxes – a 2 dimensional numpy array of [N, 4]: (ymin, xmin, ymax, xmax). The coordinates are in normalized format between [0, 1].
- color – color to draw bounding box. Default is red.
- thickness – line thickness. Default value is 4.
- display_str_list_list – list of list of strings. a list of strings for each bounding box. The reason to pass a list of strings for a bounding box is that it might contain multiple labels.
- use_normalized_coordinates – if True (default), treat boxes values as relative to the image. Otherwise treat them as absolute.
Raises: ValueError – if boxes is not a [N, 4] array.
-
easy_vision.python.utils.visualization_utils.
draw_bounding_boxes_on_image_array
(image, boxes, color='red', thickness=4, display_str_list_list=(), use_normalized_coordinates=True)[source]¶ Draws bounding boxes on image (numpy array).
Parameters: - image – a numpy array object.
- boxes – a 2 dimensional numpy array of [N, 4]: (ymin, xmin, ymax, xmax). The coordinates are in normalized format between [0, 1].
- color – color to draw bounding box. Default is red.
- thickness – line thickness. Default value is 4.
- display_str_list_list – list of list of strings. a list of strings for each bounding box. The reason to pass a list of strings for a bounding box is that it might contain multiple labels.
- use_normalized_coordinates – if True (default), treat boxes values as relative to the image. Otherwise treat them as absolute.
Raises: ValueError – if boxes is not a [N, 4] array.
-
easy_vision.python.utils.visualization_utils.
draw_bounding_boxes_on_image_tensors
(images, boxes, classes, scores, category_index, instance_masks=None, keypoints=None, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=True)[source]¶ Draws bounding boxes, masks, and keypoints on batch of image tensors.
Parameters: - images – A 4D uint8 image tensor of shape [N, H, W, C].
- boxes – [N, max_detections, 4] float32 tensor of detection boxes.
- classes – [N, max_detections] int tensor of detection classes. Note that classes are 1-indexed.
- scores – [N, max_detections] float32 tensor of detection scores.
- category_index – a dict that maps integer ids to category dicts. e.g. {1: {1: ‘dog’}, 2: {2: ‘cat’}, …}
- instance_masks – A 4D uint8 tensor of shape [N, max_detection, H, W] with instance masks.
- keypoints – A 4D float32 tensor of shape [N, max_detection, num_keypoints, 2] with keypoints.
- max_boxes_to_draw – Maximum number of boxes to draw on an image. Default 20.
- min_score_thresh – Minimum score threshold for visualization. Default 0.2.
- use_normalized_coordinates – if True (default), treat boxes values as relative to the image. Otherwise treat them as absolute.
Returns: 4D image tensor of type uint8, with boxes drawn on top.
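Example — a minimal sketch of the batched tensor variant (TensorFlow 1.x; the tensors are synthetic):

import tensorflow as tf
from easy_vision.python.utils import visualization_utils as vis_utils

images = tf.zeros([2, 480, 640, 3], dtype=tf.uint8)
boxes = tf.constant([[[0.1, 0.1, 0.5, 0.5]], [[0.2, 0.2, 0.6, 0.6]]])
classes = tf.constant([[1], [1]], dtype=tf.int64)
scores = tf.constant([[0.9], [0.8]])
category_index = {1: {'id': 1, 'name': 'cat'}}
annotated = vis_utils.draw_bounding_boxes_on_image_tensors(
    images, boxes, classes, scores, category_index)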
-
easy_vision.python.utils.visualization_utils.
draw_evaluation_image
(eval_dict, category_index, max_boxes_to_draw=20, min_score_thresh=0.2, use_normalized_coordinates=False, visualize_groundtruth=True)[source]¶ Creates a side-by-side image with detections and groundtruth.
Bounding boxes (and instance masks, if available) are visualized on both subimages.
Parameters: - eval_dict – The evaluation dictionary returned by eval_util.result_dict_for_single_example().
- category_index – A category index (dictionary) produced from a labelmap.
- max_boxes_to_draw – The maximum number of boxes to draw for detections.
- min_score_thresh – The minimum score threshold for showing detections.
- use_normalized_coordinates – Whether to use normalized coordinates or absolute ones
- visualize_groundtruth – if True, visualizes both detections and groundtruth in one image, with detections on the left and groundtruth on the right
Returns: A [1, H, 2 * W, C] uint8 tensor. The subimage on the left corresponds to detections, while the subimage on the right corresponds to groundtruth.
-
easy_vision.python.utils.visualization_utils.
draw_keypoints_on_image
(image, keypoints, color='red', radius=2, use_normalized_coordinates=True)[source]¶ Draws keypoints on an image.
Parameters: - image – a PIL.Image object.
- keypoints – a numpy array with shape [num_keypoints, 2].
- color – color to draw the keypoints with. Default is red. If AUTO is used, each keypoint is painted with a color from STANDARD_COLORS in order.
- radius – keypoint radius. Default value is 2.
- use_normalized_coordinates – if True (default), treat keypoint values as relative to the image. Otherwise treat them as absolute.
-
easy_vision.python.utils.visualization_utils.
draw_keypoints_on_image_array
(image, keypoints, color='red', radius=2, use_normalized_coordinates=True)[source]¶ Draws keypoints on an image (numpy array).
Parameters: - image – a numpy array with shape [height, width, 3].
- keypoints – a numpy array with shape [num_keypoints, 2].
- color – color to draw the keypoints with. Default is red.
- radius – keypoint radius. Default value is 2.
- use_normalized_coordinates – if True (default), treat keypoint values as relative to the image. Otherwise treat them as absolute.
-
easy_vision.python.utils.visualization_utils.
draw_mask_on_image_array
(image, mask, color='red', alpha=0.4)[source]¶ Draws mask on an image.
Parameters: - image – uint8 numpy array with shape (img_height, img_height, 3)
- mask – a uint8 numpy array of shape (img_height, img_height) with values between either 0 or 1.
- color – color to draw the keypoints with. Default is red.
- alpha – transparency value between 0 and 1. (default: 0.4)
Raises: ValueError – On incorrect data type for image or masks.
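Example — a minimal sketch overlaying a binary mask on a synthetic image:

import numpy as np
from easy_vision.python.utils import visualization_utils as vis_utils

image = np.zeros((100, 100, 3), dtype=np.uint8)
mask = np.zeros((100, 100), dtype=np.uint8)
mask[25:75, 25:75] = 1
vis_utils.draw_mask_on_image_array(image, mask, color='red', alpha=0.4)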
-
easy_vision.python.utils.visualization_utils.
draw_text_on_image_tensors
(images, image_ids)[source]¶ Draws texts on a batch of image tensors.
-
easy_vision.python.utils.visualization_utils.
draw_text_on_top_of_image_array
(image, display_str, color='black')[source]¶ Adds text on top of an image. Draws the text above the image, returning a taller image because an additional strip containing the text is added.
Parameters: - image – a numpy array.
- display_str – string to display.
- color – color to draw the text. Default is black.
-
easy_vision.python.utils.visualization_utils.
draw_texts_on_image
(image, display_str_list, color='red', thickness=4)[source]¶ Adds texts to an image. Each string in display_str_list is displayed on a separate line in black text on a rectangle filled with the input ‘color’.
Parameters: - image – a PIL.Image object.
- display_str_list – list of strings to display.
- color – color to draw. Default is red.
- thickness – line thickness. Default value is 4.
-
easy_vision.python.utils.visualization_utils.
draw_texts_on_image_array
(image, display_str_list, color='red', thickness=4)[source]¶ Adds texts to an image (numpy array).
Parameters: - image – a numpy array with shape [height, width, 3].
- display_str_list – list of strings to display
- color – color to draw. Default is red.
- thickness – line thickness. Default value is 4.
-
easy_vision.python.utils.visualization_utils.
encode_image_array_as_png_str
(image)[source]¶ Encodes a numpy array into a PNG string.
Parameters: image – a numpy array with shape [height, width, 3]. Returns: PNG encoded image string.
-
easy_vision.python.utils.visualization_utils.
get_matplotlib_font
(font_name)[source]¶ Gets a matplotlib FontProperties for the specified font name.
Parameters: font_name – name of a font file, including the extension Returns: the FontProperties of this font
-
easy_vision.python.utils.visualization_utils.
get_pil_font
(font_name, font_size=24)[source]¶ Gets a PIL.ImageFont of the specified name.
Parameters: - font_name – name of a font file, including the extension
- font_size – size of the font
Returns: PIL.ImageFont of this font
-
easy_vision.python.utils.visualization_utils.
save_image_array_as_png
(image, output_path)[source]¶ Saves an image (represented as a numpy array) to PNG.
Parameters: - image – a numpy array with shape [height, width, 3].
- output_path – path to which image should be written.
-
easy_vision.python.utils.visualization_utils.
summary_image
(name, tensor, max_outputs=16)[source]¶ Summarizes an image; can also save summary images to a local directory for debugging.
Parameters: - name – A name for the generated node. Will also serve as a series name in TensorBoard.
- tensor – A 4-D uint8 or float32 Tensor of shape [batch_size, height, width, channels] where channels is 1, 3, or 4.
- max_outputs – Max number of batch elements to generate images for.
-
easy_vision.python.utils.visualization_utils.
visualize_boxes_and_labels_on_image_array
(image, boxes, classes, scores, category_index, instance_masks=None, instance_boundaries=None, keypoints=None, use_normalized_coordinates=False, max_boxes_to_draw=20, min_score_thresh=0.5, agnostic_mode=False, line_thickness=4, groundtruth_box_visualization_color='black', skip_scores=False, skip_labels=False)[source]¶ Overlay labeled boxes on an image with formatted scores and label names.
This function groups boxes that correspond to the same location and creates a display string for each detection and overlays these on the image. Note that this function modifies the image in place, and returns that same image.
Parameters: - image – uint8 numpy array with shape (img_height, img_width, 3)
- boxes – a numpy array of shape [N, 4]
- classes – a numpy array of shape [N]. Note that class indices are 1-based, and match the keys in the label map.
- scores – a numpy array of shape [N] or None. If scores=None, this function assumes that the boxes to be plotted are groundtruth boxes and plots all boxes as black with no classes or scores.
- category_index – a dict containing category dictionaries (each holding a category index ‘id’ and a category name ‘name’) keyed by category indices.
- instance_masks – a numpy array of shape [N, image_height, image_width] with values ranging between 0 and 1, can be None.
- instance_boundaries – a numpy array of shape [N, image_height, image_width] with values ranging between 0 and 1, can be None.
- keypoints – a numpy array of shape [N, num_keypoints, 2], can be None
- use_normalized_coordinates – whether boxes is to be interpreted as normalized coordinates or not.
- max_boxes_to_draw – maximum number of boxes to visualize. If None, draw all boxes.
- min_score_thresh – minimum score threshold for a box to be visualized
- agnostic_mode – boolean (default: False) controlling whether to evaluate in class-agnostic mode or not. This mode will display scores but ignore classes.
- line_thickness – integer (default: 4) controlling line width of the boxes.
- groundtruth_box_visualization_color – box color for visualizing groundtruth boxes
- skip_scores – whether to skip score when drawing a single detection
- skip_labels – whether to skip label when drawing a single detection
Returns: uint8 numpy array with shape (img_height, img_width, 3) with overlaid boxes.
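Example — a minimal sketch overlaying one labeled detection on a synthetic numpy image; the category_index follows the format described above:

import numpy as np
from easy_vision.python.utils import visualization_utils as vis_utils

image = np.zeros((480, 640, 3), dtype=np.uint8)
boxes = np.array([[0.1, 0.1, 0.5, 0.5]], dtype=np.float32)
classes = np.array([1])
scores = np.array([0.9], dtype=np.float32)
category_index = {1: {'id': 1, 'name': 'cat'}}
annotated = vis_utils.visualize_boxes_and_labels_on_image_array(
    image, boxes, classes, scores, category_index,
    use_normalized_coordinates=True, min_score_thresh=0.5)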
easy_vision.python.utils.vocab_utils¶
-
class
easy_vision.python.utils.vocab_utils.
CharDict
(dict_path, load_raw=False)[source]¶ Character dictionary of texts.
-
__init__
(dict_path, load_raw=False)[source]¶ Constructor
Parameters: - dict_path – char dict file path
- load_raw – if load_raw is True, UNK | SOS | EOS will not be inserted into the CharDict
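Example — a hedged sketch of constructing a CharDict; the file path is a placeholder and the dictionary file format is defined by the library:

from easy_vision.python.utils import vocab_utils

char_dict = vocab_utils.CharDict('/path/to/char_dict.txt')  # placeholder
# with load_raw=True, the UNK | SOS | EOS symbols are not inserted
raw_dict = vocab_utils.CharDict('/path/to/char_dict.txt', load_raw=True)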
-