Attention Cluster Video Classification Model Based on PaddlePaddle
source link: https://my.oschina.net/u/4067628/blog/3285386
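The Attention Cluster model was the best sequence model in the ActivityNet Kinetics Challenge 2017. It processes pre-extracted RGB, Flow, and Audio features through Attention Clusters equipped with a Shifting Operation. The structure of an Attention Cluster is illustrated in a figure in the original article.
The Shifting Operation applies an independent, learnable linear transformation to the output of each attention unit and then L2-normalizes the result. This encourages the attention units to learn different components of the features, so that the Attention Cluster can better learn data with different distributions, which improves the representation-learning capacity of the whole network.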
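To make the description of the Shifting Operation concrete, here is a minimal pure-Python sketch. It is not the PaddleVideo implementation. It assumes that each attention unit computes softmax weights over the local features, that the learnable linear transformation is a scalar scale alpha and a scalar offset beta per unit, and that the normalized output is additionally divided by sqrt(N) for N units (this follows the formulation in the Attention Clusters paper as understood here, not anything stated in this article). All function and parameter names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_unit(features, w, b):
    """Weighted sum of local features, with weights softmax(w . x_i + b)."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(a * x[j] for a, x in zip(weights, features)) for j in range(dim)]


def shift(v, alpha, beta, num_units):
    """Shifting operation: scalar linear transform, then L2-normalization.

    The extra 1/sqrt(num_units) factor is an assumption taken from the paper.
    """
    shifted = [alpha * x + beta for x in v]
    norm = math.sqrt(sum(x * x for x in shifted))
    scale = math.sqrt(num_units) * max(norm, 1e-12)  # guard against division by zero
    return [x / scale for x in shifted]


def attention_cluster(features, units):
    """Concatenate the shifted outputs of all attention units."""
    out = []
    for u in units:
        v = attention_unit(features, u["w"], u["b"])
        out.extend(shift(v, u["alpha"], u["beta"], len(units)))
    return out
```

Because every unit has its own alpha and beta, two units that attend to similar features can still produce differently shifted outputs, which is the diversity effect described above. With the assumed 1/sqrt(N) scaling, the concatenated output has unit L2 norm.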
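For details, see the paper Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification.
This example uses the YouTube-8M dataset as updated in 2018. It starts from the official dataset and converts the TFRecord files to pickle files so that PaddlePaddle can read them. YouTube-8M officially provides both frame-level and video-level features. The dataset attached to this example has already been preprocessed. It is a subset of YouTube-8M that contains only 5 video files, and training and testing use the same data, so it mainly serves to demonstrate the model.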
Installation commands
## CPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle
## GPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu
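To train on the full dataset, follow the steps below.
Download the training set and the validation set from the official YouTube-8M links. Each link provides download addresses for 3844 files; you can also use the official download script. When the download completes, you will have 3844 training files and 3844 validation files (in TFRecord format). Assuming the root directory of the video model code repository is Code_Root, enter the dataset/youtube8m directory: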
cd dataset/youtube8m
Create the directories tf/train and tf/val under youtube8m:
mkdir tf && cd tf
mkdir train && mkdir val
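Then place the downloaded train and validate data in these two directories, respectively.
Data format conversion
To make the data usable for PaddlePaddle training, the downloaded TFRecord files must be converted offline to pickle format. Use the conversion script PaddleVideo/tf2pkl.py.
Create the directories pkl/train and pkl/val under dataset/youtube8m: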
cd dataset/youtube8m
mkdir pkl && cd pkl
mkdir train && mkdir val
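The tf/ and pkl/ directory trees described above can also be created in one step from Python, which avoids having to track the current working directory between commands. The following is a minimal sketch; the helper name make_dataset_dirs is hypothetical and not part of PaddleVideo.

```python
from pathlib import Path


def make_dataset_dirs(root):
    """Create tf/{train,val} and pkl/{train,val} under root and return the paths."""
    root = Path(root)
    created = []
    for fmt in ("tf", "pkl"):
        for split in ("train", "val"):
            directory = root / fmt / split
            # parents=True creates tf/ or pkl/ as needed; exist_ok makes reruns safe
            directory.mkdir(parents=True, exist_ok=True)
            created.append(directory)
    return created
```

For example, make_dataset_dirs("dataset/youtube8m") run from Code_Root would create the four directories used in this section.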
Convert the file format (TFRecord -> pkl): enter the dataset/youtube8m directory and run the script:
python tf2pkl.py ./tf/train ./pkl/train
python tf2pkl.py ./tf/val ./pkl/val
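These two commands convert the train and validate datasets to pkl files. tf2pkl.py takes two arguments: the path of the source TFRecord files and the path where the converted pkl files are stored.
Note: Reading TFRecord files requires TensorFlow, so either install TensorFlow first or convert the data in an environment that already has TensorFlow and then copy the result to dataset/youtube8m/pkl. To avoid conflicts with the PaddlePaddle environment, it is recommended to convert the data elsewhere and then copy it over.
Generating file lists
Enter the dataset/youtube8m directory: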
ls $Code_Root/dataset/youtube8m/pkl/train/* > train.list
ls $Code_Root/dataset/youtube8m/pkl/val/* > val.list
This generates two files, train.list and val.list, in the dataset/youtube8m directory. Each line holds the absolute path of one pkl file.
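The same file lists can be produced from Python, which guarantees absolute paths even when Code_Root is given as a relative path. This is a minimal sketch; write_file_list is a hypothetical helper name, and sorting the entries by path is an assumption made here to approximate the order that ls prints.

```python
from pathlib import Path


def write_file_list(pkl_dir, list_path):
    """Write the absolute path of every file in pkl_dir to list_path, one per line."""
    paths = sorted(p.resolve() for p in Path(pkl_dir).iterdir() if p.is_file())
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return len(paths)
```

For example, write_file_list("pkl/train", "train.list") and write_file_list("pkl/val", "val.list"), run from dataset/youtube8m, would write the two lists described above.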
# Unzip the dataset
!cd data/data10073/ && unzip -qo youtube8m.zip
# Install wget
!pip install wget
Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Collecting wget
Downloading https://mirrors.tuna.tsinghua.edu.cn/pypi/web/packages/47/6a/62e288da7bcda82b935ff0c6cfe542970f04e29c756b0e147251b2fb251f/wget-3.2.zip
Building wheels for collected packages: wget
Running setup.py bdist_wheel for wget ... done
Stored in directory: /home/aistudio/.cache/pip/wheels/26/28/0d/cd5205dcdeaca81bf62909a7cfd449eaf6698e8ab18992f71a
Successfully built wget
Installing collected packages: wget
Successfully installed wget-3.2
# Train the model. Model parameters are saved in checkpoints, and the frozen model is saved in freeze_model
!python PaddleVideo/train.py --model_name=AttentionCluster \
--config=PaddleVideo/configs/attention_cluster.txt \
--save_dir=PaddleVideo/checkpoints \
--log_interval=20 \
--use_gpu='True' \
--valid_interval=1
[INFO: train.py: 284]: Namespace(batch_size=None, config='PaddleVideo/configs/attention_cluster.txt', enable_ce=False, epoch=1, learning_rate=None, log_interval=20, model_name='AttentionCluster', no_memory_optimize=True, no_use_pyreader=True, pretrain=None, resume=None, save_dir='PaddleVideo/checkpoints', use_gpu=True, valid_interval=1) [INFO: config.py: 66]: ---------------- Train Arguments ---------------- [INFO: config.py: 68]: TEST: [INFO: config.py: 70]: batch_size:5 [INFO: config.py: 70]: filelist:data/data10073/youtube8m/infer.list [INFO: config.py: 68]: TRAIN: [INFO: config.py: 70]: num_gpus:1 [INFO: config.py: 70]: use_gpu:True [INFO: config.py: 70]: learning_rate:0.001 [INFO: config.py: 70]: batch_size:5 [INFO: config.py: 70]: filelist:data/data10073/youtube8m/train.list [INFO: config.py: 70]: epoch:1 [INFO: config.py: 70]: pretrain_base:None [INFO: config.py: 68]: INFER: [INFO: config.py: 70]: batch_size:1 [INFO: config.py: 70]: filelist:data/data10073/youtube8m/infer.list [INFO: config.py: 68]: MODEL: [INFO: config.py: 70]: drop_rate:0.5 [INFO: config.py: 70]: bone_network:None [INFO: config.py: 70]: feature_dims:[1024, 128] [INFO: config.py: 70]: topk:20 [INFO: config.py: 70]: num_classes:3862 [INFO: config.py: 70]: cluster_nums:[32, 32] [INFO: config.py: 70]: feature_num:2 [INFO: config.py: 70]: name:AttentionCluster [INFO: config.py: 70]: dataset:YouTube-8M [INFO: config.py: 70]: feature_names:['rgb', 'audio'] [INFO: config.py: 70]: seg_num:100 [INFO: config.py: 68]: VALID: [INFO: config.py: 70]: batch_size:5 [INFO: config.py: 70]: filelist:data/data10073/youtube8m/val.list [INFO: config.py: 71]: ------------------------------------------------- W0902 17:43:55.617861 608 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0 W0902 17:43:55.621732 608 device_context.cc:267] device: 0, cuDNN Version: 7.3. 
[WARNING: compiler.py: 239]: You can try our memory optimize feature to save your memory usage: # create a build_strategy variable to set memory optimize option build_strategy = compiler.BuildStrategy() build_strategy.enable_inplace = True build_strategy.memory_optimize = True # pass the build_strategy to with_data_parallel API compiled_prog = compiler.CompiledProgram(main).with_data_parallel( loss_name=loss.name, build_strategy=build_strategy) !!! Memory optimize is our experimental feature !!! some variables may be removed/reused internal to save memory usage, in order to fetch the right value of the fetch_list, please set the persistable property to true for each variable in fetch_list # Sample conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None) # if you need to fetch conv1, then: conv1.persistable = True I0902 17:43:55.677418 608 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies I0902 17:43:55.730741 608 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1 [WARNING: compiler.py: 239]: You can try our memory optimize feature to save your memory usage: # create a build_strategy variable to set memory optimize option build_strategy = compiler.BuildStrategy() build_strategy.enable_inplace = True build_strategy.memory_optimize = True # pass the build_strategy to with_data_parallel API compiled_prog = compiler.CompiledProgram(main).with_data_parallel( loss_name=loss.name, build_strategy=build_strategy) !!! Memory optimize is our experimental feature !!! some variables may be removed/reused internal to save memory usage, in order to fetch the right value of the fetch_list, please set the persistable property to true for each variable in fetch_list # Sample conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None) # if you need to fetch conv1, then: conv1.persistable = True share_vars_from is set, scope is ignored. 
I0902 17:43:55.769202 608 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0902 17:43:55.786227 608 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
[INFO: train_utils.py: 30]: ------- learning rate [0.001], learning rate counter [-] -----
[INFO: metrics_util.py: 67]: [TRAIN] Epoch 0, iter 0 , loss = 2678.043701, Hit@1 = 0.00, PERR = 0.00, GAP = 0.00
[INFO: metrics_util.py: 67]: [TRAIN] Epoch 0, iter 20 , loss = 27.297926, Hit@1 = 0.00, PERR = 0.05, GAP = 0.02
[INFO: metrics_util.py: 67]: [TRAIN] Epoch 0, iter 40 , loss = 54.099422, Hit@1 = 0.00, PERR = 0.00, GAP = 0.00
# Evaluate the trained model using the weights saved in checkpoints
!python PaddleVideo/test.py --model_name="AttentionCluster" --config=PaddleVideo/configs/attention_cluster.txt \
--log_interval=10 --weights=PaddleVideo/checkpoints/ --use_gpu='True'
[INFO: test.py: 151]: Namespace(batch_size=None, config='PaddleVideo/configs/attention_cluster.txt', log_interval=10, model_name='AttentionCluster', use_gpu=True, weights='PaddleVideo/checkpoints/') [INFO: config.py: 66]: ---------------- Test Arguments ---------------- [INFO: config.py: 68]: TRAIN: [INFO: config.py: 70]: learning_rate:0.001 [INFO: config.py: 70]: epoch:5 [INFO: config.py: 70]: filelist:data/data10073/youtube8m/train.list [INFO: config.py: 70]: use_gpu:True [INFO: config.py: 70]: num_gpus:1 [INFO: config.py: 70]: pretrain_base:None [INFO: config.py: 70]: batch_size:5 [INFO: config.py: 68]: MODEL: [INFO: config.py: 70]: name:AttentionCluster [INFO: config.py: 70]: bone_network:None [INFO: config.py: 70]: feature_names:['rgb', 'audio'] [INFO: config.py: 70]: seg_num:100 [INFO: config.py: 70]: num_classes:3862 [INFO: config.py: 70]: feature_dims:[1024, 128] [INFO: config.py: 70]: feature_num:2 [INFO: config.py: 70]: dataset:YouTube-8M [INFO: config.py: 70]: cluster_nums:[32, 32] [INFO: config.py: 70]: topk:20 [INFO: config.py: 70]: drop_rate:0.5 [INFO: config.py: 68]: VALID: [INFO: config.py: 70]: filelist:data/data10073/youtube8m/val.list [INFO: config.py: 70]: batch_size:5 [INFO: config.py: 68]: INFER: [INFO: config.py: 70]: filelist:data/data10073/youtube8m/infer.list [INFO: config.py: 70]: batch_size:1 [INFO: config.py: 68]: TEST: [INFO: config.py: 70]: filelist:data/data10073/youtube8m/infer.list [INFO: config.py: 70]: batch_size:5 [INFO: config.py: 71]: ------------------------------------------------- W0902 17:40:24.814287 470 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0 W0902 17:40:24.817791 470 device_context.cc:267] device: 0, cuDNN Version: 7.3. 
[INFO: metrics_util.py: 67]: [EVAL] Batch 0 , loss = 16.431852, Hit@1 = 0.60, PERR = 0.27, GAP = 0.35
[INFO: metrics_util.py: 67]: [EVAL] Batch 10 , loss = 17.586128, Hit@1 = 0.20, PERR = 0.12, GAP = 0.20
[INFO: metrics_util.py: 67]: [EVAL] Batch 20 , loss = 9.226382, Hit@1 = 0.60, PERR = 0.68, GAP = 0.55
[INFO: metrics_util.py: 67]: [EVAL] Batch 30 , loss = 11.062404, Hit@1 = 0.80, PERR = 0.62, GAP = 0.48
[INFO: metrics_util.py: 67]: [EVAL] Batch 40 , loss = 11.580819, Hit@1 = 0.60, PERR = 0.40, GAP = 0.48
[INFO: metrics_util.py: 67]: [EVAL] Batch 50 , loss = 12.862601, Hit@1 = 0.80, PERR = 0.49, GAP = 0.56
[INFO: metrics_util.py: 67]: [EVAL] Batch 60 , loss = 14.932129, Hit@1 = 0.40, PERR = 0.32, GAP = 0.33
^C current pid is 470, group id is 469
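# Run prediction with the frozen model. Only 10 results are printed here; each result consists of the video_id, the predicted class, and its probability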
!python PaddleVideo/freeze_infer.py --use_gpu='True'
W0902 17:42:27.264341 539 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0902 17:42:27.267810 539 device_context.cc:267] device: 0, cuDNN Version: 7.3.
[b'Eu4t', [5], [0.9748388528823853]]
[b'nC4t', [5], [0.8345569968223572]]
[b'0i4t', [8], [0.6524688005447388]]
[b'kB4t', [1], [0.8780305981636047]]
[b'V04t', [0], [0.8229674696922302]]
[b'mQ4t', [8], [0.2174115777015686]]
[b'kI4t', [1], [0.5383145213127136]]
[b'xr4t', [5], [0.3262545168399811]]
[b'oz4t', [0], [0.5421494841575623]]
[b'1E4t', [2], [0.6699605584144592]]
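Each printed result is a valid Python literal, so the output can be post-processed without custom parsing. The following is a minimal sketch that assumes one result per line in the form [video_id, [class], [probability]]; parse_predictions is a hypothetical helper and not part of freeze_infer.py.

```python
import ast


def parse_predictions(text_lines):
    """Parse lines such as "[b'Eu4t', [5], [0.97]]" into (video_id, class_id, prob)."""
    results = []
    for line in text_lines:
        line = line.strip()
        if not line.startswith("["):
            continue  # skip log lines such as device warnings
        try:
            record = ast.literal_eval(line)
        except (ValueError, SyntaxError):
            continue  # skip bracketed log lines that are not Python literals
        if not (isinstance(record, list) and len(record) == 3):
            continue
        video_id, classes, probs = record
        results.append((video_id.decode("utf-8"), classes[0], probs[0]))
    return results


sample = [
    "W0902 17:42:27.267810 539 device_context.cc:267] device: 0, cuDNN Version: 7.3.",
    "[b'Eu4t', [5], [0.9748388528823853]]",
    "[b'mQ4t', [8], [0.2174115777015686]]",
]
# Keep only predictions with probability of at least 0.5 (an arbitrary threshold)
confident = [r for r in parse_predictions(sample) if r[2] >= 0.5]
```

ast.literal_eval only evaluates literals, so it is safe to apply to captured console output, unlike eval.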
Click the link to try this project hands-on in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/205013
>> Visit the PaddlePaddle official website to learn more.