
A Method for Predicting the Type and Severity of Freeway Accidents Based on XGBoost

GAO Xuelin, TANG Houjun, SHEN Jiaping, XU Chengcheng, ZHANG Yujie

Citation: GAO Xuelin, TANG Houjun, SHEN Jiaping, XU Chengcheng, ZHANG Yujie. A Method for Predicting the Type and Severity of Freeway Accidents Based on XGBoost[J]. Journal of Transport Information and Safety, 2023, 41(4): 55-63. doi: 10.3963/j.jssn.1674-4861.2023.04.006

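Funding:

National Natural Science Foundation of China, 52172343

Natural Science Foundation of Jiangsu Province, BK20211515

Key Research and Development Program of Jiangsu Province, BE2022080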

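Details
    About the author:

    GAO Xuelin (1997-), master's student. Research interest: transportation planning and management. E-mail: 220213479@seu.edu.cn

    Corresponding author:

    XU Chengcheng (1987-), Ph.D., professor. Research interests: traffic safety, traffic environment, intelligent transportation, etc. E-mail: xuchengcheng@seu.edu.cn

  • CLC number: U491.3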

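  • Abstract: Freeway accidents occur frequently, yet previous studies have not fully revealed how the dynamic characteristics of traffic flow affect accident type and severity. This study therefore investigates a method for predicting the type and severity of freeway accidents from dynamic traffic flow data. Traffic flow data such as volume, density, and speed are extracted from freeway gantry data, combined with temporal features and features describing temporal and spatial non-uniformity, and matched with accident records to form the full sample. Prediction models based on the eXtreme Gradient Boosting (XGBoost) algorithm are built to predict whether an accident occurs, the accident type, and the accident severity. Two accident types (rear-end accidents and other accidents) and two severity levels (casualties and property damage only) are considered. The model results show that: ① accident risk is higher when the upstream-downstream speed difference is large, speeds are low, segment traffic volume is high, and diverging and merging are frequent; ② the risk of a rear-end accident is higher when speeds are low, there are many vehicles on the segment, merging and diverging volumes are large, and the upstream-downstream speed difference is large; ③ low segment traffic volume combined with a rear-end accident occurring on a weekend or at night may increase accident severity. XGBoost is compared with commonly used machine learning algorithms: the area under the ROC curve (Area Under Curve, AUC) of the XGBoost accident type model and accident severity model reaches 0.76 and 0.88, respectively, an average improvement of 0.08 and 0.24 over ordered logistic regression, Gaussian naive Bayes, linear SVM, random forest, and neural networks. This indicates that the XGBoost-based models have good predictive performance. The results provide a reliable means of real-time early warning of traffic flow conditions on freeway segments, which can in turn improve freeway driving safety.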

     

  • Figure 1. Framework of the XGBoost-based accident prediction model

    Figure 2. Overall framework of the paper

    Figure 3. Confusion matrix results

    Figure 4. ROC curves of the models

    Figure 5. SHAP feature analysis

    Figure 6. Analysis of bivariate feature interactions

    Figure 7. Positive-sample feature contribution diagram of the XGBoost prediction models
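
    The confusion matrices and ROC curves named in the figure list rest on two standard computations, sketched below in plain Python. This is an illustrative sketch rather than the authors' evaluation code: the function names `confusion_matrix` and `roc_auc` and the sample labels and scores are hypothetical, and the AUC is computed from its rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one).

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [0, 0, 1, 1]
probabilities = [0.1, 0.4, 0.35, 0.8]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]
```

    Accuracy depends on the chosen probability threshold (0.5 here, an assumption), whereas AUC does not, which is one reason both are commonly reported together.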

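    Table 1. Description of variable meanings

    Name             Meaning
    Night            Time of the sample (1 for night; 0 for daytime)
    DayOfWeek        Date of the sample (1 for weekend; 0 for weekday)
    LossNum          Traffic volume that passes this cross-section but not the downstream cross-section (can be regarded as diverging volume, veh/min)
    NewNum           Traffic volume that does not pass this cross-section but passes the downstream cross-section (can be regarded as merging volume, veh/min)
    Dens             Average density in the corresponding minute [veh/(km·lane)]
    CellFlow         Total segment flow [veh/(h·lane)]
    SecCount         Per-minute cross-section traffic volume (veh/min)
    InputGini        Gini coefficient of vehicle arrivals (a proxy for headway, representing temporal non-uniformity)
    FlowRateRatio    Per-minute cross-section flow rate ratio (per-minute cross-section volume × 60 × 24 / cumulative daily volume)
    DailyCountRatio  Daily cross-section volume ratio (cumulative daily volume divided by its mean)
    CvLaneNum        Coefficient of variation of lane traffic volumes (representing lateral spatial non-uniformity)
    DensNum          Number of vehicles on the segment at the end of the corresponding minute (veh)
    Conservation     Difference between total inflow and total outflow of the segment (veh/min; longitudinal spatial non-uniformity)
    MeanSpeed        Average speed at which vehicles passing this gantry reach the next gantry (km/h)
    SpeedDifference  Speed difference between upstream and downstream (km/h)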
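
    Several of the Table 1 variables are simple ratios or dispersion measures, and the sketch below shows one way they might be computed. It is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, FlowRateRatio follows the formula given in the table, CvLaneNum is assumed to use the population standard deviation, and the Gini coefficient is assumed to be taken over arrival counts in equal sub-intervals, since the page does not state the exact input.

```python
from statistics import mean, pstdev


def flow_rate_ratio(minute_count, daily_count):
    """FlowRateRatio: per-minute volume scaled to a full day, over the daily total."""
    return minute_count * 60 * 24 / daily_count


def coefficient_of_variation(lane_counts):
    """CvLaneNum: standard deviation of lane volumes divided by their mean.

    Assumption: population standard deviation; returns 0.0 for an empty road.
    """
    m = mean(lane_counts)
    return pstdev(lane_counts) / m if m else 0.0


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly even).

    Assumption: `values` are vehicle arrival counts per equal sub-interval.
    """
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def conservation(total_inflow, total_outflow):
    """Conservation: segment inflow minus outflow (veh/min)."""
    return total_inflow - total_outflow
```

    For example, a minute with 10 vehicles on a day with 14 400 vehicles gives a FlowRateRatio of exactly 1, meaning the minute carries the day's average flow rate.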

    Table 2. Feature selection results

    Features used by Model 1  Features used by Model 2  Features used by Model 3
    NewNum                    MeanSpeed                 DensNum
    LossNum                   LossNum                   DayOfWeek
    MeanSpeed                 FlowRateRatio             CellFlow
    CvLaneNum                 InputGini                 Night
    SecCount                  SpeedDifference           DailyCountRatio
    SpeedDifference           NewNum                    LossNum
    DayOfWeek                 Dens                      Conservation

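    Table 3. Optimal parameter values for XGBoost models

    Model    n_estimators  learning_rate  max_depth  min_child_weight  subsample  colsample_bytree  gamma
    Model 1  500           0.13           4          3                 0.88       0.9               0.6
    Model 2  400           0.01           8          6                 0.75       0.75              6
    Model 3  400           0.25           8          3                 0.9        0.8               0.7

    Parameter meanings: n_estimators, number of base learners; learning_rate, learning rate; max_depth, maximum tree depth; min_child_weight, minimum leaf weight; subsample, share of training samples drawn per tree; colsample_bytree, column sampling rate; gamma, minimum loss reduction.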
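
    The parameter values in Table 3 map directly onto keyword arguments of the `XGBClassifier` class in the `xgboost` Python package, so the three configurations can be written down as below. This is a sketch under assumptions, not the authors' training script: the names `XGB_PARAMS`, `FEATURES`, and `build_classifier` are hypothetical, the `binary:logistic` objective is assumed because each model has a two-class outcome, and this page does not state which of Models 1 to 3 predicts occurrence, type, or severity.

```python
# Hyperparameters transcribed from Table 3 (keys are the parameter names in its header).
XGB_PARAMS = {
    "model_1": dict(n_estimators=500, learning_rate=0.13, max_depth=4,
                    min_child_weight=3, subsample=0.88,
                    colsample_bytree=0.9, gamma=0.6),
    "model_2": dict(n_estimators=400, learning_rate=0.01, max_depth=8,
                    min_child_weight=6, subsample=0.75,
                    colsample_bytree=0.75, gamma=6),
    "model_3": dict(n_estimators=400, learning_rate=0.25, max_depth=8,
                    min_child_weight=3, subsample=0.9,
                    colsample_bytree=0.8, gamma=0.7),
}

# Selected features transcribed from Table 2.
FEATURES = {
    "model_1": ["NewNum", "LossNum", "MeanSpeed", "CvLaneNum",
                "SecCount", "SpeedDifference", "DayOfWeek"],
    "model_2": ["MeanSpeed", "LossNum", "FlowRateRatio", "InputGini",
                "SpeedDifference", "NewNum", "Dens"],
    "model_3": ["DensNum", "DayOfWeek", "CellFlow", "Night",
                "DailyCountRatio", "LossNum", "Conservation"],
}


def build_classifier(model_name):
    """Return an unfitted XGBClassifier configured with the Table 3 values.

    xgboost is imported lazily so the configuration above can be inspected
    without the package installed. The objective is an assumption.
    """
    from xgboost import XGBClassifier
    return XGBClassifier(objective="binary:logistic", **XGB_PARAMS[model_name])
```

    With a feature table `X` and binary labels `y` (both hypothetical here), training would then look like `build_classifier("model_1").fit(X[FEATURES["model_1"]], y)`.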

    Table 4. Comparison of the models' results

    Algorithm             Accident risk model   Accident type model   Accident severity model
                          Accuracy   AUC        Accuracy   AUC        Accuracy   AUC
    XGBoost               0.97       0.96       0.72       0.76       0.94       0.88
    Ordered logistic      0.92       0.70       0.65       0.71       0.94       0.72
    Gaussian naive Bayes  0.83       0.67       0.62       0.69       0.84       0.54
    Linear SVM            0.91       0.72       0.64       0.69       0.94       0.61
    Random forest         0.92       0.76       0.61       0.66       0.94       0.67
    Neural network        0.92       0.74       0.61       0.67       0.88       0.66
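
    The average AUC improvements quoted in the abstract (0.08 for accident type and 0.24 for accident severity) can be checked directly against the Table 4 values. The short sketch below does that arithmetic; the dictionary `AUC` and the function `mean_gain` are hypothetical names, and the numbers are transcribed from the table.

```python
# AUC values from Table 4, ordered as (risk model, type model, severity model).
AUC = {
    "XGBoost": (0.96, 0.76, 0.88),
    "Ordered logistic": (0.70, 0.71, 0.72),
    "Gaussian naive Bayes": (0.67, 0.69, 0.54),
    "Linear SVM": (0.72, 0.69, 0.61),
    "Random forest": (0.76, 0.66, 0.67),
    "Neural network": (0.74, 0.67, 0.66),
}


def mean_gain(column):
    """XGBoost AUC minus the mean AUC of the five baseline algorithms."""
    baselines = [v[column] for name, v in AUC.items() if name != "XGBoost"]
    return AUC["XGBoost"][column] - sum(baselines) / len(baselines)


type_gain = round(mean_gain(1), 2)      # accident type model
severity_gain = round(mean_gain(2), 2)  # accident severity model
```

    The baseline means are 0.684 for the type model and 0.64 for the severity model, so the gains round to 0.08 and 0.24, in agreement with the abstract.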
Publication history
  • Received:  2022-08-12
  • Published online:  2023-11-23
