easy_vision.python.core.box_coders

easy_vision.python.core.box_coders.faster_rcnn_box_coder

Faster RCNN box coder.

Faster RCNN box coder follows the coding schema described below:

    ty = (y - ya) / ha
    tx = (x - xa) / wa
    th = log(h / ha)
    tw = log(w / wa)

where x, y, w, h denote the box’s center coordinates, width and height respectively. Similarly, xa, ya, wa, ha denote the anchor’s center coordinates, width and height. tx, ty, tw and th denote the anchor-encoded center, width and height respectively.

See http://arxiv.org/abs/1506.01497 for details.
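
The schema above can be sketched in plain Python as an encode/decode pair. This is a minimal illustration, not the easy_vision implementation: the function names and the [ymin, xmin, ymax, xmax] corner layout of the inputs are assumptions made for the example.

```python
import math


def to_center_form(box):
    """Convert [ymin, xmin, ymax, xmax] (assumed layout) to (y, x, h, w)."""
    ymin, xmin, ymax, xmax = box
    h = ymax - ymin
    w = xmax - xmin
    return ymin + h / 2.0, xmin + w / 2.0, h, w


def encode(box, anchor):
    """Encode a box relative to an anchor as [ty, tx, th, tw]."""
    y, x, h, w = to_center_form(box)
    ya, xa, ha, wa = to_center_form(anchor)
    ty = (y - ya) / ha
    tx = (x - xa) / wa
    th = math.log(h / ha)
    tw = math.log(w / wa)
    return [ty, tx, th, tw]


def decode(code, anchor):
    """Invert encode(): recover [ymin, xmin, ymax, xmax] from a code."""
    ty, tx, th, tw = code
    ya, xa, ha, wa = to_center_form(anchor)
    y = ty * ha + ya
    x = tx * wa + xa
    h = math.exp(th) * ha
    w = math.exp(tw) * wa
    return [y - h / 2.0, x - w / 2.0, y + h / 2.0, x + w / 2.0]
```

Decoding an encoded box against the same anchor should return the original corners up to floating-point error.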

class easy_vision.python.core.box_coders.faster_rcnn_box_coder.FasterRcnnBoxCoder(scale_factors=None)[source]

Bases: easy_vision.python.core.ops.box_coder.BoxCoder

Faster RCNN box coder.

__init__(scale_factors=None)[source]

Constructor for FasterRcnnBoxCoder.

Parameters: scale_factors – List of 4 positive scalars to scale ty, tx, th and tw. If set to None, does not perform scaling. For Faster RCNN, the open-source implementation recommends using [10.0, 10.0, 5.0, 5.0].
code_size
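
The effect of scale_factors on a Faster RCNN code can be sketched as follows. The sketch assumes that scaling means multiplying each encoded value by its factor when encoding and dividing by it when decoding, which is how the TensorFlow Object Detection API coder of the same name behaves. apply_scale and remove_scale are hypothetical helper names, not part of easy_vision.

```python
def apply_scale(code, scale_factors=None):
    """Multiply [ty, tx, th, tw] element-wise by scale_factors (assumed behavior)."""
    if scale_factors is None:
        return list(code)  # None means no scaling
    if len(scale_factors) != len(code) or any(s <= 0 for s in scale_factors):
        raise ValueError("scale_factors must hold one positive scalar per code value")
    return [t * s for t, s in zip(code, scale_factors)]


def remove_scale(code, scale_factors=None):
    """Undo apply_scale() before decoding."""
    if scale_factors is None:
        return list(code)
    return [t / s for t, s in zip(code, scale_factors)]


scaled = apply_scale([-0.1, 0.1, 0.5, -0.5], [10.0, 10.0, 5.0, 5.0])
restored = remove_scale(scaled, [10.0, 10.0, 5.0, 5.0])
```

Under this assumption, larger factors on ty and tx than on th and tw give center offsets more weight than size changes in a regression loss.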

easy_vision.python.core.box_coders.keypoint_box_coder

Keypoint box coder.

The keypoint box coder follows the coding schema described below (this is similar to the FasterRcnnBoxCoder, except that it encodes keypoints in addition to box coordinates):

    ty = (y - ya) / ha
    tx = (x - xa) / wa
    th = log(h / ha)
    tw = log(w / wa)
    tky0 = (ky0 - ya) / ha
    tkx0 = (kx0 - xa) / wa
    tky1 = (ky1 - ya) / ha
    tkx1 = (kx1 - xa) / wa
    …

where x, y, w, h denote the box’s center coordinates, width and height respectively. Similarly, xa, ya, wa, ha denote the anchor’s center coordinates, width and height. tx, ty, tw and th denote the anchor-encoded center, width and height respectively. ky0, kx0, ky1, kx1, … denote the keypoints’ coordinates, and tky0, tkx0, tky1, tkx1, … denote the anchor-encoded keypoint coordinates.

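
The keypoint part of the schema can be sketched on its own, since the box part matches the Faster RCNN coder. The function names, the (ky, kx) tuple layout and the center-form anchor tuple (ya, xa, ha, wa) are assumptions made for this example, not the easy_vision API.

```python
def encode_keypoints(keypoints, anchor):
    """Encode (ky, kx) pairs relative to an anchor given as (ya, xa, ha, wa).

    Returns a flat list [tky0, tkx0, tky1, tkx1, ...].
    """
    ya, xa, ha, wa = anchor
    code = []
    for ky, kx in keypoints:
        code.append((ky - ya) / ha)
        code.append((kx - xa) / wa)
    return code


def decode_keypoints(code, anchor):
    """Invert encode_keypoints(): recover the list of (ky, kx) pairs."""
    ya, xa, ha, wa = anchor
    return [
        (code[i] * ha + ya, code[i + 1] * wa + xa)
        for i in range(0, len(code), 2)
    ]
```

Each keypoint is normalized by the anchor's height and width, in the same way as the box center.
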
class easy_vision.python.core.box_coders.keypoint_box_coder.KeypointBoxCoder(num_keypoints, scale_factors=None)[source]

Bases: easy_vision.python.core.ops.box_coder.BoxCoder

Keypoint box coder.

__init__(num_keypoints, scale_factors=None)[source]

Constructor for KeypointBoxCoder.

Parameters:
  • num_keypoints – Number of keypoints to encode/decode.
  • scale_factors – List of 4 positive scalars to scale ty, tx, th and tw. In addition to scaling ty and tx, the first 2 scalars are used to scale the y and x coordinates of the keypoints as well. If set to None, does not perform scaling.
code_size

easy_vision.python.core.box_coders.mean_stddev_box_coder

Mean stddev box coder.

This box coder uses the following coding schema to encode boxes: rel_code = (box_corner - anchor_corner_mean) / anchor_corner_stddev.
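
A minimal sketch of this schema follows. It assumes that the anchor's corner coordinates serve as the mean and that a single scalar stddev is shared by all four corners; the function names are hypothetical and not taken from easy_vision.

```python
def encode(box, anchor, stddev=0.01):
    """rel_code = (box_corner - anchor_corner_mean) / anchor_corner_stddev."""
    return [(b - a) / stddev for b, a in zip(box, anchor)]


def decode(rel_code, anchor, stddev=0.01):
    """Invert encode(): box_corner = rel_code * stddev + anchor_corner_mean."""
    return [c * stddev + a for c, a in zip(rel_code, anchor)]


anchor = [0.0, 0.2, 0.5, 0.5]
box = [0.1, 0.2, 0.5, 0.6]
rel_code = encode(box, anchor, stddev=0.1)
```

A smaller stddev produces larger code values for the same corner offset.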

class easy_vision.python.core.box_coders.mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.01)[source]

Bases: easy_vision.python.core.ops.box_coder.BoxCoder

Mean stddev box coder.

__init__(stddev=0.01)[source]

Constructor for MeanStddevBoxCoder.

Parameters: stddev – The standard deviation used to encode and decode boxes.
code_size

easy_vision.python.core.box_coders.rc3d_box_coder

RC3D box coder.

The RC3D box coder follows the coding schema described below:

    tc = (c - ca) / la
    tl = log(l / la)

where c and l denote the temporal segment’s center coordinate and length respectively. Similarly, ca and la denote the anchor’s center and length. tc and tl denote the anchor-encoded center and length respectively.

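
The one-dimensional schema can be sketched as follows. The [start, end] layout of a segment and the function names are assumptions made for illustration only.

```python
import math


def encode_segment(segment, anchor):
    """Encode a [start, end] segment relative to a [start, end] anchor as [tc, tl]."""
    c = (segment[0] + segment[1]) / 2.0
    l = segment[1] - segment[0]
    ca = (anchor[0] + anchor[1]) / 2.0
    la = anchor[1] - anchor[0]
    return [(c - ca) / la, math.log(l / la)]


def decode_segment(code, anchor):
    """Invert encode_segment(): recover [start, end] from [tc, tl]."""
    tc, tl = code
    ca = (anchor[0] + anchor[1]) / 2.0
    la = anchor[1] - anchor[0]
    c = tc * la + ca
    l = math.exp(tl) * la
    return [c - l / 2.0, c + l / 2.0]
```

This is the Faster RCNN schema reduced to a single axis, so the code holds two values instead of four.
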
class easy_vision.python.core.box_coders.rc3d_box_coder.RC3DBoxCoder(scale_factors=None)[source]

Bases: easy_vision.python.core.ops.box_coder.BoxCoder

RC3D box coder.

__init__(scale_factors=None)[source]

Constructor for RC3DBoxCoder.

Parameters: scale_factors – List of 2 positive scalars to scale tc and tl. If set to None, does not perform scaling. For RC3D, the open-source implementation recommends using [10.0, 5.0].
code_size

easy_vision.python.core.box_coders.square_box_coder

Square box coder.

Square box coder follows the coding schema described below:

    l = sqrt(h * w)
    la = sqrt(ha * wa)
    ty = (y - ya) / la
    tx = (x - xa) / la
    tl = log(l / la)

where x, y, w, h denote the box’s center coordinates, width, and height, respectively. Similarly, xa, ya, wa, ha denote the anchor’s center coordinates, width and height. tx, ty, tl denote the anchor-encoded center, and length, respectively. Because the encoded box is a square, only one length is encoded.

This has been shown to provide performance improvements over the Faster RCNN box coder when the objects being detected tend to be square (e.g. faces) and when the input images are not distorted via resizing.
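
The square schema can be sketched as below. Boxes and anchors are passed in center form (y, x, h, w), and the function names are hypothetical; both choices are assumptions made for the example.

```python
import math


def encode_square(box, anchor):
    """Encode a center-form box (y, x, h, w) as [ty, tx, tl]."""
    y, x, h, w = box
    ya, xa, ha, wa = anchor
    l = math.sqrt(h * w)      # side of the square with the same area as the box
    la = math.sqrt(ha * wa)   # side of the square with the same area as the anchor
    return [(y - ya) / la, (x - xa) / la, math.log(l / la)]


def decode_square(code, anchor):
    """Decode [ty, tx, tl] to a center-form square (y, x, l, l)."""
    ty, tx, tl = code
    ya, xa, ha, wa = anchor
    la = math.sqrt(ha * wa)
    l = math.exp(tl) * la
    return (ty * la + ya, tx * la + xa, l, l)
```

Decoding always yields a square, so a non-square box is recovered only as the square of equal area around the same center.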

class easy_vision.python.core.box_coders.square_box_coder.SquareBoxCoder(scale_factors=None)[source]

Bases: easy_vision.python.core.ops.box_coder.BoxCoder

Encodes a 3-scalar representation of a square box.

__init__(scale_factors=None)[source]

Constructor for SquareBoxCoder.

Parameters: scale_factors – List of 3 positive scalars to scale ty, tx, and tl. If set to None, does not perform scaling. For Faster RCNN, the open-source implementation recommends using [10.0, 10.0, 5.0].
Raises: ValueError – If scale_factors is not length 3 or contains values less than or equal to 0.
code_size

easy_vision.python.core.box_coders.yolo_box_coder

YOLO box coder.

YOLO box coder follows the coding schema described below:

    sigmoid(ty) = y - ya
    sigmoid(tx) = x - xa
    th = log(h / ha)
    tw = log(w / wa)

where x, y, w, h denote the box’s center coordinates, width and height respectively. Similarly, xa, ya, wa, ha denote the anchor’s center coordinates, width and height. tx, ty, tw and th denote the anchor-encoded center, width and height respectively.

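
The YOLO schema can be sketched as follows. Because sigmoid only produces values strictly between 0 and 1, encoding is defined only when the center offsets y - ya and x - xa fall in that open interval; the sketch raises ValueError otherwise. The center-form tuples (y, x, h, w) and the function names are assumptions made for illustration.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    """Inverse of sigmoid, defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("center offset must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def encode_yolo(box, anchor):
    """Encode a center-form box (y, x, h, w) as [ty, tx, th, tw]."""
    y, x, h, w = box
    ya, xa, ha, wa = anchor
    return [logit(y - ya), logit(x - xa), math.log(h / ha), math.log(w / wa)]


def decode_yolo(code, anchor):
    """Decode [ty, tx, th, tw] to a center-form box (y, x, h, w)."""
    ty, tx, th, tw = code
    ya, xa, ha, wa = anchor
    return (sigmoid(ty) + ya, sigmoid(tx) + xa, math.exp(th) * ha, math.exp(tw) * wa)
```

Unlike the Faster RCNN schema, the center offset is not divided by the anchor size, and the sigmoid bounds the decoded center to within one unit of the anchor position.
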
class easy_vision.python.core.box_coders.yolo_box_coder.YOLOBoxCoder(scale_factors=None)[source]

Bases: easy_vision.python.core.ops.box_coder.BoxCoder

YOLO box coder.

__init__(scale_factors=None)[source]

Constructor for YOLOBoxCoder.

Parameters: scale_factors – List of 4 positive scalars to scale ty, tx, th and tw. If set to None, does not perform scaling. For YOLO V3, the open-source implementation recommends using [1.0, 1.0, 1.0, 1.0].
code_size