# Model Export & Prediction

## 1 Model Export

Run the following Python code to export a model. If `checkpoint_path` is not passed, the latest checkpoint under the `model_dir` set in `pipeline_config_path` is used by default.

```python
import easy_vision

# export_dir, pipeline_config_path and checkpoint_path are supplied by the caller.
easy_vision.export(export_dir, pipeline_config_path, checkpoint_path)
```

## 2 Input and Output Information

To run prediction with a saved_model, we need to know its input and output tensor nodes.

### 2.1 Input placeholder definitions

| name             | Description                                                  | shape                       | type     |
| ---------------- | ------------------------------------------------------------ | --------------------------- | -------- |
| image            | Batched image tensor, channels in **RGB order**              | [batch_size, None, None, 3] | tf.uint8 |
| true_image_shape | True shape of each image; the last dimension is ordered [height, width, channel], e.g. [[224, 224, 3], [448, 448, 3]] | [batch_size, 3] | tf.int32 |

Note: batch_size is the value configured in `export_config` of the pipeline_config when the model is exported. Setting batch_size to -1 enables a dynamic batch size; currently only classification models support a dynamic batch_size.

### 2.2 Output tensor information

The output is a list of JSON results whose length equals the number of input images. Examples and field descriptions of the JSON result for each model follow.

#### feature_extractor

Example result:

```
{"feature": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4583122730255127, 0.0]}
```

Field descriptions

| name    | Meaning        | shape         | type  |
| ------- | -------------- | ------------- | ----- |
| feature | Output feature | [feature_dim] | float |

#### classifier

Example result

```
{
  "class": 3,
  "class_name": "coho4",
  "class_probs": {"coho1": 4.028851974258174e-10, "coho2": 0.48115724325180054, "coho3": 5.116515922054532e-07, "coho4": 0.5188422446937221}
}
```

| name        | Meaning                        | shape         | type                            |
| ----------- | ------------------------------ | ------------- | ------------------------------- |
| class       | Class id                       | []            | int32                           |
| class_name  | Class name                     | []            | string                          |
| class_probs | Probabilities of all classes   | [num_classes] | dict{key: string, value: float} |

#### multilabel_classifier

Example result

```
{
  "class": [3, 4],
  "class_names": ["coho3", "coho4"],
  "class_probs": {"coho1": 4.028851974258174e-10, "coho2": 0.10115724325180054, "coho3": 0.6188422446937221, "coho4": 0.5188422446937221}
}
```

| name        | Meaning                        | shape         | type                            |
| ----------- | ------------------------------ | ------------- | ------------------------------- |
| class       | Class ids                      | [None]        | int32                           |
| class_names | Class names                    | [None]        | string                          |
| class_probs | Probabilities of all classes   | [num_classes] | dict{key: string, value: float} |

#### detector

**Note: instance segmentation is also supported.**

Example result

```
{
  "detection_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797],
                      [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625]],
  "detection_scores": [0.9942291975021362, 0.9940272569656372],
  "detection_classes": [1, 1],
  "detection_class_names": ["text", "text"]
}
```

Field descriptions

| name                   | Meaning                                 | shape                                             | type   | Notes                                      |
| ---------------------- | --------------------------------------- | ------------------------------------------------- | ------ | ------------------------------------------ |
| detection_boxes        | Detected object boxes                   | [num_detections, 4]                               | float  | Coordinate order [top, left, bottom, right] |
| detection_scores       | Detection probabilities                 | [num_detections]                                  | float  |                                            |
| detection_classes      | Class ids of the detected regions       | [num_detections]                                  | int    |                                            |
| detection_class_names  | Class names of the detected regions     | [num_detections]                                  | string |                                            |
| detection_masks        | Segmentation masks of the detected regions | [num_detections, image_height, image_width]    | float  | Optional                                   |
| detection_keypoints    | Keypoints inside the detected regions   | [num_detections, num_keypoints, 2]                | float  | Optional                                   |
| detection_roi_features | Local feature maps of the detected regions | [num_detections, roi_height, roi_width, channels] | float | Optional                                   |

#### detector_with_rpn

Example result

```
{
  "proposal_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797],
                     [243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797]],
  "proposal_scores": [0.88, 0.56],
  "detection_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797],
                      [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625]],
  "detection_scores": [0.9942291975021362, 0.9940272569656372],
  "detection_classes": [1, 1],
  "detection_class_names": ["text", "text"]
}
```

Field descriptions

| name                  | Meaning                             | shape               | type   | Notes                                      |
| --------------------- | ----------------------------------- | ------------------- | ------ | ------------------------------------------ |
| proposal_boxes        | Proposal boxes                      | [num_proposal, 4]   | float  |                                            |
| proposal_scores       | Proposal scores                     | [num_proposal]      | float  |                                            |
| detection_boxes       | Detected object boxes               | [num_detections, 4] | float  | Coordinate order [top, left, bottom, right] |
| detection_scores      | Detection probabilities             | [num_detections]    | float  |                                            |
| detection_classes     | Class ids of the detected regions   | [num_detections]    | int    |                                            |
| detection_class_names | Class names of the detected regions | [num_detections]    | string |                                            |

#### segmentor

Example result

```
{
  "probs": [[[0.8, 0.8], [0.6, 0.7]], [[0.8, 0.5], [0.4, 0.3]]],
  "preds": [[[1, 1], [0, 0]], [[0, 0], [1, 1]]]
}
```

Field descriptions

| name  | Meaning                           | shape                                      | type  |
| ----- | --------------------------------- | ------------------------------------------ | ----- |
| probs | Per-pixel class probabilities     | [output_height, output_width, num_classes] | float |
| preds | Per-pixel class ids               | [output_height, output_width]              | int   |

#### text_detector

Example result

```
{
  "detection_keypoints": [[[243.57516479492188, 198.84210205078125], [243.91038513183594, 247.62425231933594], [385.5513916015625, 246.61660766601562], [385.2197570800781, 197.79345703125]],
                          [[292.2718200683594, 114.44700622558594], [292.2237243652344, 164.684814453125], [571.1962890625, 164.931640625], [571.2444458007812, 114.67433166503906]]],
  "detection_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797],
                      [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625]],
  "detection_scores": [0.9942291975021362, 0.9940272569656372],
  "detection_classes": [1, 1],
  "detection_class_names": ["text", "text"],
  "image_shape": [1024, 968, 3]
}
```

Field descriptions

| name                  | Meaning                                    | shape                               | type   | Notes                                      |
| --------------------- | ------------------------------------------ | ----------------------------------- | ------ | ------------------------------------------ |
| detection_boxes       | Detected text boxes                        | [num_detections, 4]                 | float  | Coordinate order [top, left, bottom, right] |
| detection_scores      | Text detection probabilities               | [num_detections]                    | float  |                                            |
| detection_classes     | Class ids of the text regions              | [num_detections]                    | int    |                                            |
| detection_class_names | Class names of the text regions            | [num_detections]                    | string |                                            |
| detection_keypoints   | Four corner points of each detected text region | [num_detections, 4, 2]         | float  | Each point is given as (y, x)              |
| image_shape           | Input image size                           | [3]: height, width, channel         | list   |                                            |

#### text_recognizer

Example result

```
{
  "sequence_predict_ids": [1, 2, 2008, 12],
  "sequence_predict_texts": "这是示例",
  "sequence_probability": 0.88
}
```

Field descriptions

| name                   | Meaning                                   | shape         | type   |
| ---------------------- | ----------------------------------------- | ------------- | ------ |
| sequence_predict_ids   | Class ids recognized for a single text line | [text_length] | int    |
| sequence_predict_texts | Recognized text of a single line          | []            | string |
| sequence_probability   | Recognition probability of a single line  | []            | float  |

#### text_spotter/text_pipeline_predictor

Example result

```
{
  "detection_keypoints": [[[243.57516479492188, 198.84210205078125], [243.91038513183594, 247.62425231933594], [385.5513916015625, 246.61660766601562], [385.2197570800781, 197.79345703125]],
                          [[292.2718200683594, 114.44700622558594], [292.2237243652344, 164.684814453125], [571.1962890625, 164.931640625], [571.2444458007812, 114.67433166503906]]],
  "detection_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797],
                      [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625]],
  "detection_scores": [0.9942291975021362, 0.9940272569656372],
  "detection_classes": [1, 1],
  "detection_class_names": ["text", "text"],
  "detection_texts_ids": [[1, 2, 2008, 12], [1, 2, 2008, 12]],
  "detection_texts": ["这是示例", "这是示例"],
  "detection_texts_scores": [0.88, 0.88],
  "image_shape": [1024, 968, 3]
}
```

Field descriptions

| name                   | Meaning                                    | shape                              | type   | Notes                                      |
| ---------------------- | ------------------------------------------ | ---------------------------------- | ------ | ------------------------------------------ |
| detection_boxes        | Detected text boxes                        | [num_detections, 4]                | float  | Coordinate order [top, left, bottom, right] |
| detection_scores       | Text detection probabilities               | [num_detections]                   | float  |                                            |
| detection_classes      | Class ids of the text regions              | [num_detections]                   | int    |                                            |
| detection_class_names  | Class names of the text regions            | [num_detections]                   | string |                                            |
| detection_keypoints    | Four corner points of each detected text region | [num_detections, 4, 2]        | float  | Each point is given as (y, x)              |
| detection_texts_ids    | Class ids recognized for each text line    | [num_detections, max_text_length]  | int    |                                            |
| detection_texts        | Recognized text of each line               | [num_detections]                   | string |                                            |
| detection_texts_scores | Recognition probability of each line       | [num_detections]                   | float  |                                            |
| image_shape            | Input image size                           | [3]: height, width, channel        | list   |                                            |

## 3 Local Prediction

EasyVision provides a Python prediction interface that loads a saved model exported by easy-vision and runs prediction on it. See the [API documentation](api/easy_vision.python.inference.html) for details of the prediction API.

A usage demo follows.

```python
import easy_vision as ev
import numpy as np

# Dummy RGB image; uint8 matches the `image` placeholder type in section 2.1.
image = np.zeros([640, 480, 3], dtype=np.uint8)

# Classification
saved_model_path = 'xxx/xxx'
classifier = ev.Classifier(saved_model_path)
output_dict = classifier.predict([image])

# Detection
saved_model_path = 'xxx/xxx'
detector = ev.Detector(saved_model_path)
output_dict = detector.predict([image])

# Text recognition
saved_model_path = 'xxx/xxx'
text_recognizer = ev.TextRecognizer(saved_model_path)
output_dict = text_recognizer.predict([image])

# Text detection
saved_model_path = 'xxx/xxx'
text_detector = ev.TextDetector(saved_model_path)
output_dict = text_detector.predict([image])

# End-to-end text recognition
saved_model_path = 'xxx/xxx'
text_spotter = ev.TextSpotter(saved_model_path)
output_dict = text_spotter.predict([image])

# Basic predictor: batch the images manually, then feed the input dict
saved_model_path = 'xxx/xxx'
predictor = ev.Predictor(saved_model_path)
image_list = [image for _ in range(10)]
batched_images, origin_shapes = predictor.batch(image_list)
input_data = {
    'image': batched_images,
    'true_image_shape': origin_shapes,
}
output_data_dict = predictor.predict(input_data)
```
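
The `image` and `true_image_shape` inputs from section 2.1 can also be assembled by hand when images of different sizes must share one batch. The following is a minimal sketch using only NumPy. The helper name `pad_to_batch` is hypothetical and not part of easy_vision, and zero-padding toward the bottom and right is an assumption; the padding actually applied by `predictor.batch` may differ.

```python
import numpy as np


def pad_to_batch(images):
    """Zero-pad HxWx3 uint8 images to a common size and stack them.

    Hypothetical helper, not part of easy_vision. Returns
    (batched_images, true_image_shapes) with the shapes and dtypes
    listed for the `image` and `true_image_shape` placeholders.
    """
    if not images:
        raise ValueError("images must be a non-empty list")
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    batch = np.zeros((len(images), max_h, max_w, 3), dtype=np.uint8)
    shapes = np.zeros((len(images), 3), dtype=np.int32)
    for i, img in enumerate(images):
        h, w, c = img.shape
        # Copy the image into the top-left corner; the rest stays zero (assumed padding).
        batch[i, :h, :w, :] = img
        # Record the unpadded [height, width, channel] of this image.
        shapes[i] = (h, w, c)
    return batch, shapes


small = np.ones([2, 3, 3], dtype=np.uint8)
tall = np.ones([4, 2, 3], dtype=np.uint8)
batched_images, true_shapes = pad_to_batch([small, tall])
```

The two returned arrays have the layout expected under the `image` and `true_image_shape` keys of the input dict passed to the basic predictor.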