Converting a Trained, Saved Model into a PaddleHub Module for One-Line Loading

Download and installation commands

## Install command for the CPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle

## Install command for the GPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu

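This tutorial aims to help developers turn high-quality projects built on AI Studio into PaddleHub Modules, so that they can be loaded and used for prediction with a single call.

To run this project, go to AI Studio: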
https://aistudio.baidu.com/aistudio/projectdetail/1259178

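I. Training the flower recognition model

The original project is "图像分类-VGG" (Image Classification - VGG) by the developer 笨笨. It uses a public flower dataset. The dataset archive contains five folders, one per kind of flower: daisy, dandelion, rose, sunflower and tulip, with 690-890 images each. The network is VGG.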
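Before training, it can be useful to confirm the per-class image counts described above. The following is a minimal sketch, not part of the original project: the function name `count_images_per_class` and the set of accepted file extensions are assumptions made for illustration. It only assumes that each class lives in its own subfolder of the dataset directory.

```python
from pathlib import Path

# Assumed image extensions; adjust if the dataset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(dataset_root):
    """Return {class_folder_name: number_of_image_files} for each subfolder."""
    root = Path(dataset_root)
    counts = {}
    # Every direct subfolder is treated as one class; plain files are skipped.
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES)
    return counts
```

Once the archive has been unzipped, calling `count_images_per_class("data/data2815")` should list the five flower folders with their image counts.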

1. Unzip the flower dataset and the pretrained parameters

# Unzip the flower dataset
!cd data/data2815 && unzip -qo flower_photos.zip

# Unzip the pretrained parameters
!cd data/data6489 && unzip -qo VGG16_pretrained.zip

2. Data preprocessing

!python work/DataPreprocessing.py

['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

3. Model training

The main goal of this project is the conversion into a Module, so the number of training epochs is set to 1.

!python work/train.py

2020-11-24 19:06:16,867-INFO: create prog success
2020-11-24 19:06:16,867 - train.py[line:460] - INFO: create prog success
2020-11-24 19:06:16,867-INFO: train config: {'input_size': [3, 224, 224], 'class_dim': 5, 'image_count': 2955, 'label_dict': {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}, 'data_dir': 'data/data2815', 'train_file_list': 'train.txt', 'label_file': 'label_list.txt', 'save_freeze_dir': './freeze-model', 'save_persistable_dir': './persistable-params', 'continue_train': False, 'pretrained': True, 'pretrained_dir': 'data/data6489/VGG16_pretrained', 'mode': 'train', 'num_epochs': 1, 'train_batch_size': 24, 'mean_rgb': [127.5, 127.5, 127.5], 'use_gpu': True, 'image_enhance_strategy': {'need_distort': True, 'need_rotate': True, 'need_crop': True, 'need_flip': True, 'hue_prob': 0.5, 'hue_delta': 18, 'contrast_prob': 0.5, 'contrast_delta': 0.5, 'saturation_prob': 0.5, 'saturation_delta': 0.5, 'brightness_prob': 0.5, 'brightness_delta': 0.125}, 'early_stop': {'sample_frequency': 50, 'successive_limit': 3, 'good_acc1': 0.92}, 'rsm_strategy': {'learning_rate': 0.0005, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'momentum_strategy': {'learning_rate': 0.0005, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'sgd_strategy': {'learning_rate': 0.0005, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'adam_strategy': {'learning_rate': 0.0005}}
2020-11-24 19:06:16,867 - train.py[line:461] - INFO: train config: {'input_size': [3, 224, 224], 'class_dim': 5, 'image_count': 2955, 'label_dict': {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}, 'data_dir': 'data/data2815', 'train_file_list': 'train.txt', 'label_file': 'label_list.txt', 'save_freeze_dir': './freeze-model', 'save_persistable_dir': './persistable-params', 'continue_train': False, 'pretrained': True, 'pretrained_dir': 'data/data6489/VGG16_pretrained', 'mode': 'train', 'num_epochs': 1, 'train_batch_size': 24, 'mean_rgb': [127.5, 127.5, 127.5], 'use_gpu': True, 'image_enhance_strategy': {'need_distort': True, 'need_rotate': True, 'need_crop': True, 'need_flip': True, 'hue_prob': 0.5, 'hue_delta': 18, 'contrast_prob': 0.5, 'contrast_delta': 0.5, 'saturation_prob': 0.5, 'saturation_delta': 0.5, 'brightness_prob': 0.5, 'brightness_delta': 0.125}, 'early_stop': {'sample_frequency': 50, 'successive_limit': 3, 'good_acc1': 0.92}, 'rsm_strategy': {'learning_rate': 0.0005, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'momentum_strategy': {'learning_rate': 0.0005, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'sgd_strategy': {'learning_rate': 0.0005, 'lr_epochs': [20, 40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01, 0.002]}, 'adam_strategy': {'learning_rate': 0.0005}}
2020-11-24 19:06:16,868-INFO: build input custom reader and data feeder
2020-11-24 19:06:16,868 - train.py[line:462] - INFO: build input custom reader and data feeder
2020-11-24 19:06:16,869-INFO: build newwork
2020-11-24 19:06:16,869 - train.py[line:475] - INFO: build newwork
W1124 19:06:18.083617   144 device_context.cc:236] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W1124 19:06:18.087874   144 device_context.cc:244] device: 0, cuDNN Version: 7.3.
2020-11-24 19:06:19,710-INFO: load params from pretrained model
2020-11-24 19:06:19,710 - train.py[line:449] - INFO: load params from pretrained model
2020-11-24 19:06:21,383-INFO: current pass: 0, start read image
2020-11-24 19:06:21,383 - train.py[line:504] - INFO: current pass: 0, start read image
2020-11-24 19:06:24,815-INFO: Pass 0, trainbatch 10, loss 1.6219388246536255, acc1 0.0833333358168602, time 0.14 sec
2020-11-24 19:06:24,815 - train.py[line:519] - INFO: Pass 0, trainbatch 10, loss 1.6219388246536255, acc1 0.0833333358168602, time 0.14 sec
2020-11-24 19:06:28,441-INFO: Pass 0, trainbatch 20, loss 1.558526635169983, acc1 0.4583333432674408, time 0.15 sec
2020-11-24 19:06:28,441 - train.py[line:519] - INFO: Pass 0, trainbatch 20, loss 1.558526635169983, acc1 0.4583333432674408, time 0.15 sec
2020-11-24 19:06:31,856-INFO: Pass 0, trainbatch 30, loss 1.574629783630371, acc1 0.3333333432674408, time 0.14 sec
2020-11-24 19:06:31,856 - train.py[line:519] - INFO: Pass 0, trainbatch 30, loss 1.574629783630371, acc1 0.3333333432674408, time 0.14 sec
2020-11-24 19:06:35,593-INFO: Pass 0, trainbatch 40, loss 1.5624138116836548, acc1 0.5, time 0.14 sec
2020-11-24 19:06:35,593 - train.py[line:519] - INFO: Pass 0, trainbatch 40, loss 1.5624138116836548, acc1 0.5, time 0.14 sec
2020-11-24 19:06:39,171-INFO: Pass 0, trainbatch 50, loss 1.6100339889526367, acc1 0.1666666716337204, time 0.14 sec
2020-11-24 19:06:39,171 - train.py[line:519] - INFO: Pass 0, trainbatch 50, loss 1.6100339889526367, acc1 0.1666666716337204, time 0.14 sec
2020-11-24 19:06:39,172-INFO: temp save 50 batch train result, current acc1 0.1666666716337204
2020-11-24 19:06:39,172 - train.py[line:538] - INFO: temp save 50 batch train result, current acc1 0.1666666716337204
2020-11-24 19:06:46,603-INFO: Pass 0, trainbatch 60, loss 1.6188973188400269, acc1 0.2083333283662796, time 0.14 sec
2020-11-24 19:06:46,603 - train.py[line:519] - INFO: Pass 0, trainbatch 60, loss 1.6188973188400269, acc1 0.2083333283662796, time 0.14 sec
2020-11-24 19:06:50,057-INFO: Pass 0, trainbatch 70, loss 1.6400723457336426, acc1 0.125, time 0.14 sec
2020-11-24 19:06:50,057 - train.py[line:519] - INFO: Pass 0, trainbatch 70, loss 1.6400723457336426, acc1 0.125, time 0.14 sec
2020-11-24 19:06:53,692-INFO: Pass 0, trainbatch 80, loss 1.5995646715164185, acc1 0.25, time 0.14 sec
2020-11-24 19:06:53,692 - train.py[line:519] - INFO: Pass 0, trainbatch 80, loss 1.5995646715164185, acc1 0.25, time 0.14 sec
2020-11-24 19:06:57,141-INFO: Pass 0, trainbatch 90, loss 1.539711833000183, acc1 0.3333333432674408, time 0.14 sec
2020-11-24 19:06:57,141 - train.py[line:519] - INFO: Pass 0, trainbatch 90, loss 1.539711833000183, acc1 0.3333333432674408, time 0.14 sec
2020-11-24 19:07:00,644-INFO: Pass 0, trainbatch 100, loss 1.593304991722107, acc1 0.125, time 0.14 sec
2020-11-24 19:07:00,644 - train.py[line:519] - INFO: Pass 0, trainbatch 100, loss 1.593304991722107, acc1 0.125, time 0.14 sec
2020-11-24 19:07:00,645-INFO: temp save 100 batch train result, current acc1 0.125
2020-11-24 19:07:00,645 - train.py[line:538] - INFO: temp save 100 batch train result, current acc1 0.125
2020-11-24 19:07:08,069-INFO: Pass 0, trainbatch 110, loss 1.5976566076278687, acc1 0.3333333432674408, time 0.14 sec
2020-11-24 19:07:08,069 - train.py[line:519] - INFO: Pass 0, trainbatch 110, loss 1.5976566076278687, acc1 0.3333333432674408, time 0.14 sec
2020-11-24 19:07:11,569-INFO: Pass 0, trainbatch 120, loss 1.6223376989364624, acc1 0.125, time 0.14 sec
2020-11-24 19:07:11,569 - train.py[line:519] - INFO: Pass 0, trainbatch 120, loss 1.6223376989364624, acc1 0.125, time 0.14 sec
2020-11-24 19:07:12,698-INFO: training till last epcho, end training
2020-11-24 19:07:12,698 - train.py[line:544] - INFO: training till last epcho, end training

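II. Organizing the model into the PaddleHub Module format

The PaddleHub Module is the basic unit of PaddleHub. A Module can be loaded in one step simply by specifying its name. For example, loading the pretrained ERNIE model takes a single line of code, hub.Module(name='ernie'), which spares you the network-definition code and the tedious parameter loading.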

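1. Required directories and files

Create a top-level directory, and inside it create __init__.py, module.py, processor.py, net.py and the other files.

The directory name is the Module name; here I call it VGG16:

VGG16/
├── assets # resource folder
│   ├── infer_model # model files
│   └── vocab.txt # vocabulary file
├── data_feed.py
├── __init__.py # empty file
├── module.py # main module; provides the Module implementation
├── net.py # implementation of the network architecture
└── processor.py # helper module, e.g. provides the vocabulary-loading function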
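The layout above can be checked programmatically before installing the Module. This is a small sketch under the assumption that exactly the entries shown in the tree are required; the helper name `missing_module_entries` is hypothetical and is not a PaddleHub API.

```python
import os

# Entries taken from the directory tree above.
REQUIRED_FILES = ["__init__.py", "module.py", "net.py", "processor.py",
                  "data_feed.py", os.path.join("assets", "vocab.txt")]
REQUIRED_DIRS = [os.path.join("assets", "infer_model")]


def missing_module_entries(module_dir):
    """Return the required files and directories absent from module_dir."""
    missing = [f for f in REQUIRED_FILES
               if not os.path.isfile(os.path.join(module_dir, f))]
    missing += [d for d in REQUIRED_DIRS
                if not os.path.isdir(os.path.join(module_dir, d))]
    return missing
```

An empty list from `missing_module_entries("VGG16")` means every entry in the tree is present.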

2. infer_model

infer_model holds the model files saved with fluid.io.save_inference_model.

# Create the required folders
!mkdir -p VGG16/assets/infer_model

# Copy the model files into the directory the Module expects
!cp -r freeze-model/* VGG16/assets/infer_model

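3. vocab.txt

For an image classification task, the vocabulary file stores the class names, one per line.

# The with-statement closes the file even if writing fails
with open("VGG16/assets/vocab.txt", "w") as vocab:
    vocab.writelines(['daisy\n', 'dandelion\n', 'roses\n', 'sunflowers\n', 'tulips\n'])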
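The line order in vocab.txt matters, because module.py later looks up a class name by the predicted index. Instead of typing the names by hand, the file can be derived from the `label_dict` printed in the training log. The sketch below is an illustration only; the helper names `labels_in_index_order` and `write_vocab` are assumptions, not part of the original project.

```python
def labels_in_index_order(label_dict):
    """Return class names sorted by their integer label index."""
    return [name for name, _ in sorted(label_dict.items(), key=lambda kv: kv[1])]


def write_vocab(label_dict, vocab_path):
    """Write one class name per line so that line i holds the label with index i."""
    with open(vocab_path, "w") as f:
        for name in labels_in_index_order(label_dict):
            f.write(name + "\n")


# label_dict as it appears in the training configuration log
label_dict = {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}
```

Calling `write_vocab(label_dict, "VGG16/assets/vocab.txt")` would produce the same five lines as the hand-written version.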

4. __init__.py

__init__.py is an empty file; simply create it.

open("VGG16/__init__.py", "w").close()  # create the empty file and close the handle

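5. processor.py

A helper module. processor.py implements reading the vocabulary, as well as any preprocessing the input needs before it is fed to the model.

In this example it is only used to load vocab.txt. Sample code: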

def load_label_info(file_path):
    with open(file_path, 'r') as fr:
        return fr.read().split("\n")[:-1]

open("VGG16/processor.py", "w").close()  # create an empty processor.py; save the code above into this file
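Note that load_label_info drops the last element after splitting on newlines, so it relies on vocab.txt ending with a newline; without one, the final label would be lost. A more forgiving variant might look like the sketch below. The name `load_label_info_lenient` is hypothetical and this is not the function the tutorial's Module uses.

```python
def load_label_info_lenient(file_path):
    """Read one label per line, skipping blank lines.

    Works whether or not the file ends with a trailing newline.
    """
    with open(file_path, "r") as fr:
        return [line.strip() for line in fr if line.strip()]
```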

6. net.py

The implementation of the network architecture, i.e. the network used during training. Sample code:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from paddle import fluid

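class VGGNet(object):
    """
    VGG network class
    """
    def __init__(self, layers=16):
        """
        VGG network constructor
        :param layers: number of weight layers (11, 13, 16 or 19)
        """
        self.layers = layers

    def name(self):
        """
        Return the network name
        :return: the network name
        """
        return 'vgg-net'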

    def net(self, input, class_dim=1000):
        layers = self.layers
        vgg_spec = {
            11: ([1, 1, 2, 2, 2]),
            13: ([2, 2, 2, 2, 2]),
            16: ([2, 2, 3, 3, 3]),
            19: ([2, 2, 4, 4, 4])
        }
        assert layers in vgg_spec.keys(), \
            "supported layers are {} but input layer is {}".format(vgg_spec.keys(), layers)

        nums = vgg_spec[layers]
        conv1 = self.conv_block(input, 64, nums[0], name="conv1_")
        conv2 = self.conv_block(conv1, 128, nums[1], name="conv2_")
        conv3 = self.conv_block(conv2, 256, nums[2], name="conv3_")
        conv4 = self.conv_block(conv3, 512, nums[3], name="conv4_")
        conv5 = self.conv_block(conv4, 512, nums[4], name="conv5_")

        fc_dim = 4096
        fc_name = ["fc6", "fc7", "fc8"]
        fc1 = fluid.layers.fc(
            input=conv5,
            size=fc_dim,
            act='relu',
            param_attr=fluid.param_attr.ParamAttr(name=fc_name[0] + "_weights"),
            bias_attr=fluid.param_attr.ParamAttr(name=fc_name[0] + "_offset"))
        fc1 = fluid.layers.dropout(x=fc1, dropout_prob=0.5)
        fc2 = fluid.layers.fc(
            input=fc1,
            size=fc_dim,
            act='relu',
            param_attr=fluid.param_attr.ParamAttr(name=fc_name[1] + "_weights"),
            bias_attr=fluid.param_attr.ParamAttr(name=fc_name[1] + "_offset"))
        fc2 = fluid.layers.dropout(x=fc2, dropout_prob=0.5)
        out = fluid.layers.fc(
            input=fc2,
            size=class_dim,
            act='softmax',
            param_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_weights"),
            bias_attr=fluid.param_attr.ParamAttr(name=fc_name[2] + "_offset"))

        return out

    def conv_block(self, input, num_filter, groups, name=None):
        conv = input
        for i in range(groups):
            conv = fluid.layers.conv2d(
                input=conv,
                num_filters=num_filter,
                filter_size=3,
                stride=1,
                padding=1,
                act='relu',
                param_attr=fluid.param_attr.ParamAttr(
                    name=name + str(i + 1) + "_weights"),
                bias_attr=fluid.param_attr.ParamAttr(
                    name=name + str(i + 1) + "_offset"))
        return fluid.layers.pool2d(
            input=conv, pool_size=2, pool_type='max', pool_stride=2)

open("VGG16/net.py", "w").close()  # create an empty net.py; save the code above into this file
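The vgg_spec table in net.py lists how many 3x3 convolutions each of the five blocks contains. As a quick sanity check, the sketch below recomputes the depth of each variant (convolutions plus the three fully connected layers) and the size of the feature vector that reaches fc6 for a 224x224 input. The constant and function names are introduced here for illustration only.

```python
# Convolutions per block, copied from vgg_spec in net.py
VGG_SPEC = {
    11: [1, 1, 2, 2, 2],
    13: [2, 2, 2, 2, 2],
    16: [2, 2, 3, 3, 3],
    19: [2, 2, 4, 4, 4],
}
# Output channels of the five conv blocks, as passed to conv_block in net()
BLOCK_FILTERS = [64, 128, 256, 512, 512]


def weight_layer_count(layers):
    """Convolution layers plus the three fully connected layers (fc6, fc7, fc8)."""
    return sum(VGG_SPEC[layers]) + 3


def flattened_feature_size(input_hw=224):
    """Number of values entering fc6 for a square input of side input_hw."""
    side = input_hw
    for _ in BLOCK_FILTERS:
        # 3x3 convs with padding 1 keep the size; each 2x2, stride-2 pool halves it.
        side //= 2
    return BLOCK_FILTERS[-1] * side * side
```

For the 16-layer variant this gives 13 convolutions plus 3 fully connected layers, and a 224x224 input shrinks to 7x7 with 512 channels before fc6.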

7. data_feed.py

Processes images so that they can be fed to the network for prediction. Reference code:

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
from collections import OrderedDict

import cv2
import numpy as np
from PIL import Image, ImageEnhance
from paddle import fluid

DATA_DIM = 224
img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))


def resize_short(img, target_size):
    percent = float(target_size) / min(img.size[0], img.size[1])
    resized_width = int(round(img.size[0] * percent))
    resized_height = int(round(img.size[1] * percent))
    img = img.resize((resized_width, resized_height), Image.LANCZOS)
    return img


def crop_image(img, target_size, center):
    width, height = img.size
    size = target_size
    if center:
        # Integer division keeps the crop box on whole pixels
        w_start = (width - size) // 2
        h_start = (height - size) // 2
    else:
        w_start = np.random.randint(0, width - size + 1)
        h_start = np.random.randint(0, height - size + 1)
    w_end = w_start + size
    h_end = h_start + size
    img = img.crop((w_start, h_start, w_end, h_end))
    return img


def process_image(img):
    img = resize_short(img, target_size=256)
    img = crop_image(img, target_size=DATA_DIM, center=True)
    if img.mode != 'RGB':
        img = img.convert('RGB')
    #img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = np.array(img).astype('float32').transpose((2, 0, 1)) / 255
    img -= img_mean
    img /= img_std
    return img


def test_reader(paths=None, images=None):
    """data generator
    :param paths: path(s) to images.
    :type paths: str or list of str
    :param images: data of images, [N, H, W, C]
    :type images: numpy.ndarray
    """
    img_list = []
    if paths:
        # Accept a single path as well as a list of paths
        if isinstance(paths, str):
            paths = [paths]
        for img_path in paths:
            assert os.path.isfile(
                img_path), "The {} isn't a valid file path.".format(img_path)
            img_list.append(Image.open(img_path))
    if images is not None:
        for img in images:
            img_list.append(Image.fromarray(np.uint8(img)))
    for im in img_list:
        yield process_image(im)
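The preprocessing in data_feed.py resizes the shorter side to 256, takes a 224x224 center crop, then scales pixels to [0, 1] and normalizes them per channel. The arithmetic can be followed without any image library; the sketch below reproduces it with plain numbers. The function names are introduced for illustration, and integer division is used for the crop offsets so that the box coordinates are whole pixels.

```python
def resized_size(width, height, target_short=256):
    """Size after scaling so that the shorter side equals target_short."""
    percent = float(target_short) / min(width, height)
    return int(round(width * percent)), int(round(height * percent))


def center_crop_box(width, height, size=224):
    """(left, upper, right, lower) box of a centered size x size crop."""
    w_start = (width - size) // 2
    h_start = (height - size) // 2
    return (w_start, h_start, w_start + size, h_start + size)


def normalize_pixel(value, mean, std):
    """Map a 0-255 channel value to the normalized range used by process_image."""
    return (value / 255.0 - mean) / std


# Example: a 500x375 photo is resized, then cropped around its center
resized = resized_size(500, 375)
box = center_crop_box(*resized)
```

For the 500x375 example, the resized image is 341x256 and the crop box is (58, 16, 282, 240).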

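8. module.py

module.py holds the Module's entry-point code; the prediction logic must be implemented in it.

open("VGG16/module.py", "w").close()  # create an empty module.py; the code below goes into this file

Import the required modules

When importing a file that lives inside the Module, use its full path, e.g. VGG16.net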

import os
import ast
import argparse

import numpy as np
import paddlehub as hub
import paddle.fluid as fluid
from paddlehub.module.module import moduleinfo, runnable
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
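from paddlehub.io.parser import txt_parser

# Helpers defined in the Module's own files, imported by full path
from VGG16.net import VGGNet
from VGG16.processor import load_label_info
from VGG16.data_feed import test_reader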

Fill in the Module's basic information

The basic information of a PaddleHub Module is declared as follows:

@moduleinfo(
    name="VGG16",
    version="1.0.0",
    type="cv/classification",
    summary=
    "VGG16 is a image classfication model trained with Flower dataset.",
    author="paddlepaddle",
    author_email="[email protected]")

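Implement the prediction logic

module.py must contain a class that inherits from hub.Module. This class implements the prediction logic and uses moduleinfo to fill in the basic information. When the Module is loaded with hub.Module(name="VGG16"), PaddleHub automatically creates an object of this class and returns it.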

class VGG16(hub.Module):
    def _initialize(self):
        self.default_pretrained_model_path = os.path.join(self.directory, "assets/infer_model") # path to the model files
        self.label_names = load_label_info(os.path.join(self.directory, "assets/vocab.txt")) # labels of the image classification task
        self.infer_prog = None
        self.pred_out = None
        self._set_config()

    def get_expected_image_width(self):
        return 224

    def get_expected_image_height(self):
        return 224

    def get_pretrained_images_mean(self):
        im_mean = np.array([0.485, 0.456, 0.406]).reshape(1, 3)
        return im_mean

    def get_pretrained_images_std(self):
        im_std = np.array([0.229, 0.224, 0.225]).reshape(1, 3)
        return im_std

    def _set_config(self):
        """
        predictor config setting
        """
        cpu_config = AnalysisConfig(self.default_pretrained_model_path)
        cpu_config.disable_glog_info()
        cpu_config.disable_gpu()
        cpu_config.switch_ir_optim(False)
        self.cpu_predictor = create_paddle_predictor(cpu_config)

        try:
            _places = os.environ["CUDA_VISIBLE_DEVICES"]
            int(_places[0])
            use_gpu = True
        except (KeyError, IndexError, ValueError):
            # Variable unset, empty, or not starting with a digit
            use_gpu = False
        if use_gpu:
            gpu_config = AnalysisConfig(self.default_pretrained_model_path)
            gpu_config.disable_glog_info()
            gpu_config.enable_use_gpu(memory_pool_init_size_mb=500, device_id=0)
            self.gpu_predictor = create_paddle_predictor(gpu_config)

    def context(self,
                input_image=None,
                trainable=True,
                pretrained=True,
                param_prefix='',
                get_prediction=False,
                extra_block_filters=((256, 512, 1, 2, 3), (128, 256, 1, 2, 3),
                                     (128, 256, 0, 1, 3), (128, 256, 0, 1, 3)),
                normalizations=(20., -1, -1, -1, -1, -1)):
        """Distill the Head Features, so as to perform transfer learning.
        :param input_image: image tensor.
        :type input_image: <class 'paddle.fluid.framework.Variable'>
        :param trainable: whether to set parameters trainable.
        :type trainable: bool
        :param pretrained: whether to load default pretrained model.
        :type pretrained: bool
        :param param_prefix: the prefix of parameters.
        :type param_prefix: str
        :param get_prediction: whether to get prediction.
        :type get_prediction: bool
        :param extra_block_filters: in each extra block, params:
            [in_channel, out_channel, padding_size, stride_size, filter_size]
        :type extra_block_filters: list
        :param normalizations: params list of init scale in l2 norm, skip init
            scale if param is -1.
        :type normalizations: list
        """
        context_prog = input_image.block.program if input_image else fluid.Program(
        )
        startup_program = fluid.Program()
        with fluid.program_guard(context_prog, startup_program):
            image = input_image if input_image else fluid.data(
                name='image',
                shape=[-1, 3, 224, 224],
                dtype='float32',
                lod_level=0)

            backbone = VGGNet(layers=16)
            out = backbone.net(input=image, class_dim=5)
            # out = backbone(image)
            inputs = {'image': image}
            if get_prediction:
                outputs = {'pred_out': out}
            else:
                outputs = {'body_feats': out}

            place = fluid.CPUPlace()
            exe = fluid.Executor(place)
            if pretrained:

                def _if_exist(var):
                    return os.path.exists(
                        os.path.join(self.default_pretrained_model_path,
                                     var.name))

                if not param_prefix:
                    fluid.io.load_vars(
                        exe,
                        self.default_pretrained_model_path,
                        main_program=context_prog,
                        predicate=_if_exist)
            else:
                exe.run(startup_program)
            return inputs, outputs, context_prog

    def classification(self,
                       paths=None,
                       images=None,
                       use_gpu=False,
                       batch_size=1,
                       top_k=1):
        """API of Classification.
        :param paths: the path of images.
        :type paths: list, each element is correspond to the path of an image.
        :param images: data of images, [N, H, W, C]
        :type images: numpy.ndarray
        :param use_gpu: whether to use gpu or not.
        :type use_gpu: bool
        :param batch_size: batch size.
        :type batch_size: int
        :param top_k: result of top k
        :type top_k: int
        """
        if self.infer_prog is None:
            inputs, outputs, self.infer_prog = self.context(
                trainable=False, pretrained=True, get_prediction=True)
            self.infer_prog = self.infer_prog.clone(for_test=True)
            self.pred_out = outputs['pred_out']
        place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
        exe = fluid.Executor(place)
        all_images = []
        paths = paths if paths else []
        for yield_data in test_reader(paths, images):
            all_images.append(yield_data)

        images_num = len(all_images)
        loop_num = int(np.ceil(images_num / batch_size))
        res_list = []
        top_k = max(min(top_k, 1000), 1)
        for iter_id in range(loop_num):
            batch_data = []
            handle_id = iter_id * batch_size
            for image_id in range(batch_size):
                try:
                    batch_data.append(all_images[handle_id + image_id])
                except:
                    pass
            batch_data = np.array(batch_data).astype('float32')
            data_tensor = PaddleTensor(batch_data.copy())
            if use_gpu:
                result = self.gpu_predictor.run([data_tensor])
            else:
                result = self.cpu_predictor.run([data_tensor])
            for i, res in enumerate(result[0].as_ndarray()):
                res_dict = {}
                pred_label = np.argsort(res)[::-1][:top_k]
                for k in pred_label:
                    class_name = self.label_names[int(k)].split(',')[0]
                    max_prob = res[k]
                    res_dict[class_name] = max_prob
                res_list.append(res_dict)
        return res_list

    def add_module_config_arg(self):
        """
        Add the command config options
        """
        self.arg_config_group.add_argument(
            '--use_gpu',
            type=ast.literal_eval,
            default=False,
            help="whether use GPU or not")

        self.arg_config_group.add_argument(
            '--batch_size',
            type=int,
            default=1,
            help="batch size for prediction")

    def add_module_input_arg(self):
        """
        Add the command input options
        """
        self.arg_input_group.add_argument(
            '--input_path', type=str, default=None, help="input data")

        self.arg_input_group.add_argument(
            '--input_file',
            type=str,
            default=None,
            help="file contain input data")

    def check_input_data(self, args):
        input_data = []
        if args.input_path:
            input_data = [args.input_path]
        elif args.input_file:
            if not os.path.exists(args.input_file):
                raise RuntimeError("File %s is not exist." % args.input_file)
            else:
                input_data = txt_parser.parse(args.input_file, use_strip=True)
        return input_data

    @runnable
    def run_cmd(self, argvs):
        self.parser = argparse.ArgumentParser(
            description="Run the {}".format(self.name),
            prog="hub run {}".format(self.name),
            usage='%(prog)s',
            add_help=True)
        self.arg_input_group = self.parser.add_argument_group(
            title="Input options", description="Input data. Required")
        self.arg_config_group = self.parser.add_argument_group(
            title="Config options",
            description=
            "Run configuration for controlling module behavior, not required.")
        self.add_module_config_arg()

        self.add_module_input_arg()
        args = self.parser.parse_args(argvs)
        input_data = self.check_input_data(args)
        if len(input_data) == 0:
            self.parser.print_help()
            exit(1)
        else:
            for image_path in input_data:
                if not os.path.exists(image_path):
                    raise RuntimeError(
                        "File %s does not exist." % image_path)
        return self.classification(
            paths=input_data, use_gpu=args.use_gpu, batch_size=args.batch_size)
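Two pieces of the classification method are easy to study in isolation: splitting the images into batches and turning one probability vector into a top-k result dict. The sketch below mirrors that logic in plain Python. The function names are introduced for illustration, and unlike the method above, top_k is clamped to the number of labels rather than to 1000.

```python
import math


def make_batches(items, batch_size):
    """Split items into consecutive batches; the last batch may be smaller."""
    loop_num = int(math.ceil(len(items) / float(batch_size)))
    return [items[i * batch_size:(i + 1) * batch_size] for i in range(loop_num)]


def top_k_labels(probs, label_names, top_k=1):
    """Return {class_name: probability} for the top_k most probable classes."""
    top_k = max(min(top_k, len(label_names)), 1)
    # Indices sorted by descending probability, the same idea as argsort()[::-1]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    return {label_names[i]: probs[i] for i in order}
```

Slicing avoids the try/except used in the method when the last batch is incomplete, since a slice past the end of a list simply returns fewer items.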

# Inspect the directory structure
!tree VGG16/

VGG16/
├── assets
│   ├── infer_model
│   │   ├── conv1_1_offset
│   │   ├── conv1_1_weights
│   │   ├── conv1_2_offset
│   │   ├── conv1_2_weights
│   │   ├── conv2_1_offset
│   │   ├── conv2_1_weights
│   │   ├── conv2_2_offset
│   │   ├── conv2_2_weights
│   │   ├── conv3_1_offset
│   │   ├── conv3_1_weights
│   │   ├── conv3_2_offset
│   │   ├── conv3_2_weights
│   │   ├── conv3_3_offset
│   │   ├── conv3_3_weights
│   │   ├── conv4_1_offset
│   │   ├── conv4_1_weights
│   │   ├── conv4_2_offset
│   │   ├── conv4_2_weights
│   │   ├── conv4_3_offset
│   │   ├── conv4_3_weights
│   │   ├── conv5_1_offset
│   │   ├── conv5_1_weights
│   │   ├── conv5_2_offset
│   │   ├── conv5_2_weights
│   │   ├── conv5_3_offset
│   │   ├── conv5_3_weights
│   │   ├── fc6_offset
│   │   ├── fc6_weights
│   │   ├── fc7_offset
│   │   ├── fc7_weights
│   │   ├── fc8_offset
│   │   ├── fc8_weights
│   │   └── __model__
│   └── vocab.txt
├── data_feed.py
├── __init__.py
├── module.py
├── net.py
├── processor.py
└── __pycache__
    ├── data_feed.cpython-37.pyc
    ├── __init__.cpython-37.pyc
    ├── module.cpython-37.pyc
    ├── net.cpython-37.pyc
    └── processor.cpython-37.pyc

3 directories, 44 files

III. Testing the Module

Once the Module is written, it can be tested in the following ways.

1. Load via hub.Module(name=…)

Install the Module on the local machine, then load it via hub.Module(name=…)

!hub install VGG16
!hub show VGG16

  Successfully installed VGG16

+-----------------+----------------------------------------------------+
|   ModuleName    |VGG16                                               |
+-----------------+----------------------------------------------------+
|     Version     |1.1.0                                               |
+-----------------+----------------------------------------------------+
|     Summary     |VGG16 is a image classfication model trained with   |
|                 |Flower dataset.                              |
+-----------------+----------------------------------------------------+
|     Author      |paddlepaddle                                        |
+-----------------+----------------------------------------------------+
|  Author-Email   |[email protected]                                |
+-----------------+----------------------------------------------------+
|    Location     |/home/aistudio/.paddlehub/modules/VGG16             |
+-----------------+----------------------------------------------------+

import paddlehub as hub

vgg16_test = hub.Module(name="VGG16")

test_img_path = "data/data2815/tulips/17165583356_38cb1f231d_n.jpg"

# execute predict and print the result
results = vgg16_test.classification(test_img_path)
# print(results)
for result in results:
    print(result)

[2020-11-25 12:37:22,531] [    INFO] - Installing VGG16 module
[2020-11-25 12:37:22,533] [    INFO] - Module VGG16 already installed in /home/aistudio/.paddlehub/modules/VGG16


{'dandelion': 0.24343227}

2. Load directly via hub.Module(directory=…)

import paddlehub as hub

vgg16_test = hub.Module(directory="VGG16/")

test_img_path = "data/data2815/tulips/17165583356_38cb1f231d_n.jpg"

# execute predict and print the result
results = vgg16_test.classification(test_img_path)
# print(results)
for result in results:
    print(result)

{'dandelion': 0.24343227}
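In both runs the tulip image is labeled dandelion with a probability of about 0.24, which is plausibly a consequence of training for only one epoch. To check predictions on more images, the expected class can be read from the image's parent folder. The sketch below is a small helper for that; both function names are hypothetical and it relies only on the dataset layout of one folder per class.

```python
import os


def expected_label_from_path(image_path):
    """The class name is the name of the folder that contains the image."""
    return os.path.basename(os.path.dirname(image_path))


def is_correct(result, image_path):
    """result is one dict returned by classification(), e.g. {'dandelion': 0.24}."""
    predicted = max(result, key=result.get)  # label with the highest probability
    return predicted == expected_label_from_path(image_path)
```

For the test image above, `is_correct({'dandelion': 0.24343227}, test_img_path)` is False, since the image sits in the tulips folder.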

This article is also published on the blog "Mr.郑先生_" (CSDN).