A Recognition Model for Violent Sorting Activity Based on the ST-AGCN Algorithm

CAO Jingjing, YU Zhou, LI Pengfei, MIN Yanping, HUANG Qixian, ZHAO Qiangwei

Citation: CAO Jingjing, YU Zhou, LI Pengfei, MIN Yanping, HUANG Qixian, ZHAO Qiangwei. A Recognition Model for Violent Sorting Activity Based on the ST-AGCN Algorithm[J]. Journal of Transport Information and Safety, 2023, 41(5): 115-126. doi: 10.3963/j.jssn.1674-4861.2023.05.012


doi: 10.3963/j.jssn.1674-4861.2023.05.012
Funding:

National Natural Science Foundation of China Young Scientists Fund 61502360

Details
    Corresponding author:

    CAO Jingjing (1984—), PhD, Associate Professor. Research interests: machine learning and pattern recognition. E-mail: bettycao@whut.edu.cn

  • CLC number: U495


  • Abstract: Violent sorting by sorting workers is currently widespread in the express logistics industry. Image-based activity recognition can be used to reduce such behavior, but in real scenes it suffers from poor algorithm robustness and the difficulty of obtaining human joint-point data. To address these problems, a video dataset of violent sorting in logistics was built and a recognition model for violent sorting was studied. Sorting videos were collected with Raspberry Pi devices in both indoor and outdoor scenarios, real-time video transmission was implemented with Python's socket module, non-standard data were removed by slice-based screening rules, and joint-point data were extracted with the OpenPose model. Because general human activity recognition networks cannot adequately reflect how strongly individual joints influence violent sorting actions, ST-AGCN, an optimized graph neural network with ST-GCN as its backbone, was developed. A spatial attention mechanism learns the influence of each joint on the various actions and updates the joint weights accordingly, and an added adaptive graph-structure layer jointly optimizes the topology of the human skeleton graph and the network parameters in an end-to-end manner, highlighting the influence of highly correlated joints on action recognition. Comparison and ablation experiments against several deep learning models were conducted on violent sorting videos recorded indoors and outdoors. The results show that ST-AGCN improves the accuracy of recognizing violent sorting in real scenes by 5.6%, 13.82%, 2.36%, and 1.61% over ST-GCN, STA-LSTM, ST-AGCN without the spatial attention mechanism, and ST-AGCN without the adaptive graph-structure layer, respectively, and that it remains applicable in complex logistics sorting scenes with cluttered indoor or outdoor environments and partial occlusion, verifying the superiority of ST-AGCN and the effectiveness of the spatial attention mechanism and the adaptive graph-structure layer.
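
    The abstract notes that the collected video was transmitted in real time with Python's socket module. The sketch below shows one minimal way to stream length-prefixed JPEG frames from a Raspberry Pi to a collection server; it is not the authors' code, and the server address, port, and use of OpenCV for capture and encoding are assumptions for illustration.

import socket
import struct

import cv2  # OpenCV, assumed available on the Pi for capture and JPEG encoding

SERVER_ADDR = ("192.168.1.100", 9000)  # placeholder address of the collection server

def stream_camera(addr=SERVER_ADDR):
    """Capture frames from the default camera and stream them as JPEGs."""
    cap = cv2.VideoCapture(0)
    with socket.create_connection(addr) as sock:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            ok, buf = cv2.imencode(".jpg", frame)  # compress to keep the payload small
            if not ok:
                continue
            data = buf.tobytes()
            # 4-byte big-endian length prefix so the receiver can split the stream
            sock.sendall(struct.pack(">I", len(data)) + data)
    cap.release()

    On the receiving side, the server would read 4 bytes, unpack the frame length with struct.unpack(">I", ...), then read exactly that many bytes per frame.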
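
    For the two model components described above, the following PyTorch sketch gives one possible spatial graph-convolution unit, assuming the common skeleton tensor layout x: (N, C, T, V) = (batch, channels, frames, joints). The per-joint sigmoid gate standing in for the spatial attention mechanism, and the learnable offset B standing in for the adaptive graph-structure layer, are simplified illustrations rather than the paper's released implementation; all names are hypothetical.

import torch
import torch.nn as nn

class AdaptiveSpatialGCN(nn.Module):
    def __init__(self, in_channels, out_channels, A):
        super().__init__()
        self.register_buffer("A", A)                # fixed, normalized skeleton adjacency (V x V)
        self.B = nn.Parameter(torch.zeros_like(A))  # adaptive graph, learned end to end
        self.joint_att = nn.Parameter(torch.ones(A.size(0)))  # one importance score per joint
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):                           # x: (N, C, T, V)
        att = torch.sigmoid(self.joint_att)         # per-joint weights in (0, 1)
        x = x * att.view(1, 1, 1, -1)               # re-weight joints before aggregation
        adj = self.A + self.B                       # fixed topology plus learned offset
        x = torch.einsum("nctv,vw->nctw", x, adj)   # aggregate features over the graph
        return self.conv(x)                         # 1x1 conv mixes channels

    With OpenPose's 18-joint skeleton, A would be an 18 x 18 normalized adjacency matrix, and AdaptiveSpatialGCN(3, 64, A) maps a (batch, 3, frames, 18) joint-coordinate tensor to (batch, 64, frames, 18); a temporal convolution would follow in a full ST-AGCN unit.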

     

  • Figure 1. Camera placement positions

    Figure 2. Data transmission process

    Figure 3. OpenPose processing of the collected data

    Figure 4. Human joint points

    Figure 5. ST-AGCN network structure

    Figure 6. SAGCN network structure

    Figure 7. Results of the comparison experiment

    Figure 8. Results of the attention mechanism ablation experiment

    Figure 9. Adjacency matrices of the centripetal and centrifugal subsets

    Figure 10. Results of the adaptive graph ablation experiment

    Figure 11. Field test results

    Table 1. Outdoor violent sorting scenes and corresponding number of videos (unit: videos)

    Scene           Cluttered environment   Partial occlusion   Incomplete framing   Inside a van
    One person      10                      10                  10                   13
    Two persons     10                      10                  10                   13
    Three persons   10                      10                  10                   13

    Table 2. Indoor violent sorting scenes and corresponding number of videos (unit: videos)

    Scene           Insufficient lighting   Cluttered environment   Partial occlusion   Incomplete framing
    One person      10                      10                      13                  10
    Two persons     10                      10                      13                  10
    Three persons   10                      10                      13                  10

    Table 3. Number of video clips per action type

    Action type     Number of videos
    Normal          490
    Slam            241
    Kick            279
    Smash           272
    Toss            540

    Table 4. Results of the comparison experiment

    Model           Accuracy/%
    STA-LSTM        44.44
    ST-GCN          52.66
    Shift-GCN       57.22
    2s-AGCN         56.46
    ST-AGCN         58.26

    Table 5. Results of the attention mechanism ablation experiment

    Model            Accuracy/%   Average rejection rate/%
    ST-AGCN w/o SA   55.90        12.03
    ST-AGCN          58.26        10.67

    Table 6. Results of the adaptive graph ablation experiment

    Model                        Accuracy/%   Average rejection rate/%
    ST-AGCN w/o adaptive graph   56.65        11.61
    ST-AGCN                      58.26        10.67

    Table 7. Results of the unit stacking number ablation experiment

    Number of ST-AGCN layers   Accuracy/%   Average rejection rate/%   Time/s
    1                          40.36        24.38                      1 312
    3                          45.10        19.70                      2 150
    5                          52.66        14.65                      4 037
    7                          56.46        11.71                      6 808
    10                         58.26        10.67                      8 632
    12                         52.18        14.31                      10 550

    Table 8. Misidentification and rejection rates in the field test (unit: %)

    Action type     Misidentification rate   Rejection rate
    Toss            16.17                    12.45
    Kick            19.00                    12.99
    Normal          21.82                    8.61
    Smash           19.94                    12.92
    Fling           23.08                    6.37
Publication history
  • Received: 2022-09-04
  • Published online: 2024-01-18
