A Method of Real-time Detection for Road Traffic Participants Based on an Improved YOLOv5 Algorithm

ZHANG Yifan (张逸凡); NIE Linzhen (聂琳真); HUANG Haoran (黄灏然); YIN Zhishuai (尹智帅)

doi:10.3963/j.jssn.1674-4861.2024.01.013



Funding: Key Research and Development Program of Hubei Province (No. 2022BAA081)


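Author biography:
ZHANG Yifan (b. 1999), master's student. Research interests: object detection and image processing. E-mail: zyifan0206@163.com

Corresponding author:
NIE Linzhen (b. 1986), Ph.D., associate professor. Research interests: intelligent connected vehicles, driving behavior analysis, etc. E-mail: linzhen_nie@whut.edu.cn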

CLC number: U495
Publication history
- Received: 2023-10-15
- Published online: 2024-05-31

A Method of Real-time Detection for Road Traffic Participants Based on an Improved YOLOv5 Algorithm

ZHANG Yifan^{1, 2},
NIE Linzhen^{1, 2},
HUANG Haoran^{3},
YIN Zhishuai^{1, 2}

1. College of Automotive Engineering, Wuhan University of Technology, Wuhan 430070, China
2. Hubei Engineering and Technology Center of New Energy and Intelligent Connected Vehicle, Wuhan University of Technology, Wuhan 430070, China
3. Dongfeng Commercial Vehicle Technical Center, Dongfeng Commercial Vehicle Co., Ltd., Wuhan 430056, China

Abstract: Rapid and accurate detection of traffic participants in road surveillance images is of great significance for intelligent transportation systems that monitor road targets. To address the low detection accuracy and the missed detections of overlapping targets that the original YOLOv5 algorithm exhibits across multiple types of traffic participants, a real-time detection method for road traffic participants based on an improved YOLOv5 algorithm is proposed. To improve the ability of the shallow network to extract image features, fused mobile inverted bottleneck convolution (FusedMBC) replaces the original convolution structure, which also speeds up inference in the shallow layers, and a self-attention mechanism is used to learn the texture features of traffic participants. To enhance the ability of the backbone network to perceive spatial features of images, the coordinate attention (CA) mechanism is introduced, which makes the backbone network pay more attention to the semantic features of traffic participants in the images. To enable conventional convolution to capture visual layouts and to enhance the sensitivity of the activation space, the funnel activation function (FReLU) is adopted as the activation function of the convolution layers, so that feature vectors can be modeled at the pixel level. To enhance the extraction of spatial features for dense targets, a coordinate attention mechanism is added to the feature fusion network; it captures the spatial and channel feature information of dense targets after fusion so that the network can accurately locate each target. After data augmentation preprocessing of traffic participant images from DAIR-V2X, a dataset for vehicle-infrastructure cooperative autonomous driving, a test set of 2 000 images is constructed to verify the performance of the model. Experimental results show that: ① The improved YOLOv5 algorithm achieves a mean average precision of 82.4%, an average recall of 95%, and an average detection speed of 204 frames/s. ② Compared with the original YOLOv5, its mean average precision increases by 5.8 percentage points and its average detection speed increases by 33.3%. These results verify that the proposed method can detect traffic participants quickly and accurately, which helps intelligent transportation systems supervise traffic participants.

Keywords:
- intelligent transportation
- traffic targets
- traffic participant detection
- YOLOv5
- fused mobile inverted bottleneck convolution
- coordinate attention mechanism
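The abstract describes FReLU as letting the activation depend on the spatial neighborhood of each pixel. A minimal pure-Python sketch of that idea follows, assuming the commonly published form max(x, T(x)), where T is a per-channel 3x3 convolution. The helper names, the averaging kernel that stands in for learned weights, and the omission of batch normalization are illustrative assumptions, not the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]


def depthwise_conv3x3(x: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Apply one 3x3 filter to a single channel with zero padding (same output size)."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = bias
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    # Positions outside the map contribute zero (zero padding).
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def frelu(x: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Funnel activation sketch: element-wise max(x, T(x)), T being the 3x3 filter above."""
    t = depthwise_conv3x3(x, kernel, bias)
    return [[max(a, b) for a, b in zip(row_x, row_t)] for row_x, row_t in zip(x, t)]


# Hypothetical 3x3 averaging kernel standing in for learned weights.
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
feature_map = [
    [-1.0, 2.0, -3.0],
    [4.0, -5.0, 6.0],
    [-7.0, 8.0, -9.0],
]
activated = frelu(feature_map, mean_kernel)
for row in activated:
    print([round(v, 3) for v in row])
```

With an all-zero kernel and zero bias, T(x) is 0 everywhere and the function reduces to ordinary ReLU, which is one way to see FReLU as a spatially conditioned generalization of it.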


Figure 1. Fused-YOLOv5 network structure

Figure 2. FusedMBC structure

Figure 3. SiLU activation function

Figure 4. CA attention mechanism structure

Figure 5. CA attention mechanism visualization results

Figure 6. Dataset annotation

Figure 7. Feature map visualization

Figure 8. Comparison of model detection accuracy

Figure 9. Comparison of detection results
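The CA structure named in Figure 4 pools features along each spatial axis separately, so the resulting weights keep positional information in both directions. The sketch below is a heavily simplified single-channel illustration in pure Python: it pools along width and height, passes the pooled values through a sigmoid, and rescales each pixel by its row gate and column gate. The learned 1x1 convolutions, normalization, and channel mixing of the published coordinate attention module are deliberately left out, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_gates(x: Matrix) -> Tuple[List[float], List[float]]:
    """Average-pool one channel along each axis and squash the results into (0, 1)."""
    h, w = len(x), len(x[0])
    row_means = [sum(row) / w for row in x]                             # pooled along width
    col_means = [sum(x[i][j] for i in range(h)) / h for j in range(w)]  # pooled along height
    return [sigmoid(m) for m in row_means], [sigmoid(m) for m in col_means]


def coordinate_attention(x: Matrix) -> Matrix:
    """Rescale each pixel by the gate of its row and the gate of its column."""
    gate_h, gate_w = coordinate_gates(x)
    return [
        [x[i][j] * gate_h[i] * gate_w[j] for j in range(len(x[0]))]
        for i in range(len(x))
    ]


# Hypothetical 2x2 feature map with a single strong response in the top-left corner.
demo = [[4.0, 0.0], [0.0, 0.0]]
weighted = coordinate_attention(demo)
print(weighted)
```

Because the gates are indexed by row and by column, a strong response raises the weight of its entire row and column, which is the property the full module exploits to localize targets.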

Table 1. Sample classification results

Sample type	Positive sample	Negative sample
Predicted as positive	True Positive (TP)	False Positive (FP)
Predicted as negative	False Negative (FN)	True Negative (TN)
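The four outcomes in Table 1 are the inputs to the precision (P) and recall (R) metrics reported for the detectors. As a small sketch, assuming the standard definitions P = TP / (TP + FP) and R = TP / (TP + FN), the computation can be written as below. The counts are made-up illustrative numbers, not results from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are correct: TP / (TP + FP)."""
    total = tp + fp
    return tp / total if total else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that are found: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


# Hypothetical counts for one class, for illustration only.
tp, fp, fn = 90, 10, 30
print(f"P = {precision(tp, fp):.2%}, R = {recall(tp, fn):.2%}")
```

True negatives do not enter either formula, which is why detection benchmarks can report P and R without counting background regions.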


Table 2. Performance comparison of object detection models

Model	GFLOPs	P/%	R/%	mAP/%	FPS
YOLOv5	17.1	91.2	86.0	76.6	153
CA-YOLOv5	21.8	96.3	88.5	78.8	146
Swin-YOLOv5s	23.4	96.1	88.9	78.3	142
YOLOv5s-EUDSC	31.1	96.0	89.4	78.9	120
YOLOv5+CBAM	56.2	92.8	87.6	77.7	112
Mask RCNN	34.3	97.5	95.4	83.2	87
YOLOv8	29.6	99.2	96.9	84.1	129
Proposed method	11.3	96.9	95.0	82.4	204
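The improvement figures quoted in the abstract can be checked against the first and last rows of Table 2 (the original YOLOv5 and the proposed method). The short sketch below does that arithmetic; it suggests that the mAP gain is an absolute difference in percentage points while the speed gain is a relative change. The variable and function names are illustrative.

```python
# mAP (%) and FPS copied from the YOLOv5 row and the last row of Table 2.
baseline = {"mAP": 76.6, "FPS": 153}
proposed = {"mAP": 82.4, "FPS": 204}


def absolute_gain(new: float, old: float) -> float:
    """Difference in the metric's own unit (percentage points for mAP)."""
    return new - old


def relative_gain_percent(new: float, old: float) -> float:
    """Change relative to the baseline value, expressed in percent."""
    return (new - old) / old * 100.0


map_gain_pp = round(absolute_gain(proposed["mAP"], baseline["mAP"]), 1)
fps_gain_pct = round(relative_gain_percent(proposed["FPS"], baseline["FPS"]), 1)
print(f"mAP gain: {map_gain_pp} percentage points")
print(f"FPS gain: {fps_gain_pct} %")
```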


Table 3. Ablation experiment results

Model	mAP/%	GFLOPs
YOLOv5	76.6	17.1
YOLOv5+FusedMBC	64.5	8.8
YOLOv5+FusedMBC+FReLU	69.2	8.8
YOLOv5+FusedMBC+FReLU+CA	82.4	11.3
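One way to read Table 3 is as a sequence of steps, each adding one component to the previous configuration. The sketch below computes the change in mAP and GFLOPs contributed by each step, using only the values in the table; the function name and tuple layout are illustrative choices.

```python
from typing import List, Tuple

# (configuration, mAP in %, GFLOPs) copied from Table 3.
ablation: List[Tuple[str, float, float]] = [
    ("YOLOv5", 76.6, 17.1),
    ("YOLOv5+FusedMBC", 64.5, 8.8),
    ("YOLOv5+FusedMBC+FReLU", 69.2, 8.8),
    ("YOLOv5+FusedMBC+FReLU+CA", 82.4, 11.3),
]


def stepwise_deltas(rows: List[Tuple[str, float, float]]) -> List[Tuple[str, float, float]]:
    """Return (configuration, mAP change, GFLOPs change) relative to the previous row."""
    deltas = []
    for (_, prev_map, prev_gflops), (name, cur_map, cur_gflops) in zip(rows, rows[1:]):
        deltas.append((name, round(cur_map - prev_map, 1), round(cur_gflops - prev_gflops, 1)))
    return deltas


for name, d_map, d_gflops in stepwise_deltas(ablation):
    print(f"{name}: mAP {d_map:+.1f} points, GFLOPs {d_gflops:+.1f}")
```

Read this way, the table shows FusedMBC alone lowering both the computational cost and mAP, with FReLU and CA recovering and then exceeding the baseline accuracy at a cost that stays below the original 17.1 GFLOPs.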

