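A Multi-objective Traffic Control Method for Connected and Automated Vehicle at Signalized Intersection Based on Reinforcement Learning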

JIANG Han; ZHANG Jian; ZHANG Haiyan; HAO Wei; MA Changxi

doi:10.3963/j.jssn.1674-4861.2024.01.010

Funding: National Key Research and Development Program of China (2021YFB1600504)

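About the author: JIANG Han (b. 2000), master's student. Research interest: traffic management and control. E-mail: jianghan@seu.edu.cn

Corresponding author: ZHANG Jian (b. 1984), Ph.D., professor. Research interests: urban intelligent transportation, Internet of Vehicles, and vehicle-infrastructure cooperation. E-mail: jianzhang@seu.edu.cn

CLC number: U491.4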
Publication history
- Received: 2023-07-16
- Published online: 2024-05-31

A Multi-objective Traffic Control Method for Connected and Automated Vehicle at Signalized Intersection Based on Reinforcement Learning

JIANG Han^{1, 2}, ZHANG Jian^{1, 2, 3}, ZHANG Haiyan^{1, 2}, HAO Wei^{4}, MA Changxi^{5}

1. Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing 211189, China
2. School of Transportation, Southeast University, Nanjing 211189, China
3. School of Engineering, Tibet University, Lhasa 850000, China
4. School of Traffic and Transportation Engineering, Changsha University of Science and Technology, Changsha 410114, China
5. School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China

Abstract

To address the high energy consumption and low efficiency of connected and autonomous vehicles (CAVs) operating in dynamic traffic environments under traditional control methods, a reinforcement learning-based CAV control approach is proposed, with the aims of reducing energy consumption, improving travel efficiency, and enhancing driving comfort. Considering the information exchange between the CAV and the intersection signal control system, as well as the physical environment, the method collects signal phase and timing (SPaT) together with the speed and position of the preceding vehicle to construct the state space of the reinforcement learning framework. An energy consumption model for the CAV is established with the upper limit of battery energy recovery as a boundary condition, and a multi-objective weighted reward function is designed from key driving indicators: electric energy consumption per unit time, travel distance, and rate of change of acceleration. The weight of each indicator is determined by the analytic hierarchy process; the model is then trained with the deep deterministic policy gradient (DDPG) algorithm, whose parameters are adjusted and updated by gradient descent. Simulation experiments were carried out on the SUMO platform. The results show that the CAV controlled by the proposed algorithm achieves the most balanced overall driving performance: compared with the DQN algorithm, electric energy consumption and the mean rate of change of acceleration are reduced by 9.22% and 18.77%, respectively, and compared with the Krauss car-following model, travel time is shortened by 8.39%. These results support the feasibility and effectiveness of the proposed CAV control approach in reducing energy consumption and improving travel efficiency and driving comfort.

Keywords: traffic engineering; connected and autonomous vehicles; vehicle control; reinforcement learning; signalized intersection
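The abstract describes a multi-objective weighted reward built from energy consumption per unit time, travel distance, and rate of change of acceleration. The sketch below shows one way such a reward could be assembled. It is only an illustration: the weight values, the normalization scales, and the names `RewardWeights` and `step_reward` are assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the paper derives its own via the analytic hierarchy process.
    energy: float = 0.4
    progress: float = 0.3
    comfort: float = 0.3


def _clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def step_reward(
    energy_wh: float,
    distance_m: float,
    jerk_ms3: float,
    w: RewardWeights = RewardWeights(),
    energy_scale_wh: float = 1.0,    # assumed normalization scale
    distance_scale_m: float = 10.0,  # assumed normalization scale
    jerk_scale_ms3: float = 5.0,     # assumed normalization scale
) -> float:
    """Weighted reward for one control step.

    Energy use and jerk are penalized, distance covered is rewarded.
    Energy may be negative when the battery recovers energy, so it is
    clipped to [-1, 1] instead of [0, 1].
    """
    e = _clip(energy_wh / energy_scale_wh, -1.0, 1.0)
    d = _clip(distance_m / distance_scale_m, 0.0, 1.0)
    j = _clip(abs(jerk_ms3) / jerk_scale_ms3, 0.0, 1.0)
    return -w.energy * e + w.progress * d - w.comfort * j
```

Normalizing each term before weighting keeps the weights comparable across indicators with different units, which is the usual reason for this design.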

Figure 1. Schematic diagram of the traffic scene at an intersection

Figure 2. Network structure of the DDPG algorithm

Figure 3. Flow chart of the CAV control algorithm

Figure 4. Architecture of the simulation platform

Figure 5. Simulation scene with multiple signalized intersections

Figure 6. Comparison of simulation results by index

Figure 7. Trajectories of the CAV under different car-following modes

Figure 8. Trajectory of the CAV along the signalized corridor under DDPG control

Figure 9. Variation of CAV speed with time under different car-following modes

Figure 10. Variation of the rate of change of CAV acceleration with time under different car-following modes

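Table 1. State-space parameters and their meaning

Parameter	Meaning
Vehicle speed v(t)	Relates to the vehicle's energy consumption and efficiency
Vehicle travel distance d(t)	Relates to the vehicle's energy consumption and efficiency
Vehicle acceleration a_t	Relates to ride comfort
Speed difference between leading and following vehicles Δv_t	Relates to vehicle safety
Gap between leading and following vehicles Δx_t	Relates to vehicle safety
Remaining green time of the current phase at the intersection σ(t)	Relates to the vehicle's efficiency and safety. If the remaining time is less than the time the vehicle needs to pass through the intersection at the maximum permitted speed, the vehicle should decelerate gently to a stop; otherwise it may accelerate moderately to clear the intersection sooner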
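The six quantities in Table 1 can be packed into a single observation vector, and the rule stated for σ(t) reduces to a simple time comparison. The following is a minimal sketch under stated assumptions: the class `CavState`, the function `can_clear_on_green`, and the argument `dist_to_stop_line_m` (which would have to be derived from d(t) and the known stop-line position) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CavState:
    speed: float            # v(t), m/s
    distance: float         # d(t), m travelled along the corridor
    acceleration: float     # a_t, m/s^2
    speed_diff: float       # Δv_t, m/s relative to the leading vehicle
    gap: float              # Δx_t, m to the leading vehicle
    green_remaining: float  # σ(t), s of green left in the current phase

    def as_vector(self) -> List[float]:
        """Flatten the state in the order used by Table 1."""
        return [
            self.speed,
            self.distance,
            self.acceleration,
            self.speed_diff,
            self.gap,
            self.green_remaining,
        ]


def can_clear_on_green(
    dist_to_stop_line_m: float, green_remaining_s: float, v_max_ms: float
) -> bool:
    """True if driving at the speed limit reaches the stop line before green ends.

    When this is False, Table 1 says the vehicle should decelerate gently to a stop.
    """
    if v_max_ms <= 0:
        raise ValueError("v_max_ms must be positive")
    return dist_to_stop_line_m / v_max_ms <= green_remaining_s
```

For example, at a 30 km/h limit a vehicle 100 m from the stop line needs 12 s, so it can clear the intersection with 15 s of green left but not with 10 s.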

Table 2. Relative importance coefficients of the indexes

Index	Energy consumption	Traffic efficiency	Driving comfort	Safety
Energy consumption	1	3	2	1/3
Traffic efficiency	1/3	1	1/2	1/3
Driving comfort	1/2	2	1	1/3
Safety	3	3	3	1
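Table 2 is a pairwise comparison matrix of the kind used in the analytic hierarchy process (AHP). As an illustration of how weights can be derived from it, the sketch below applies the row geometric-mean approximation and a standard consistency check. The source does not say which AHP computation the authors used, so the method choice and the function names `ahp_weights` and `consistency_ratio` are assumptions.

```python
import math

CRITERIA = ["energy consumption", "traffic efficiency", "driving comfort", "safety"]

# Pairwise comparison matrix copied from Table 2 (row index compared with column index).
A = [
    [1,     3,   2,     1 / 3],
    [1 / 3, 1,   1 / 2, 1 / 3],
    [1 / 2, 2,   1,     1 / 3],
    [3,     3,   3,     1],
]

# Saaty's random consistency index for small matrix sizes.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate the priority vector by normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Estimate lambda_max from (A w)_i / w_i and return CR = CI / RI."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


weights = ahp_weights(A)
cr = consistency_ratio(A, weights)
```

With this approximation, safety receives the largest weight and traffic efficiency the smallest, and the consistency ratio stays below the customary 0.1 threshold. The weights actually used in the paper may differ.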

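Table 3. Simulation parameter settings

Parameter	Value
Total road length L/m	2200
Spacing between adjacent intersections D/m	800
HV design hourly traffic volume q/(veh/h)	1600
HV body length l_HV/m	5
HV body mass m_HV/kg	2000
HV driver proficiency sigma	0.5
HV driver reaction time tau/s	1
CAV body length l_CAV/m	5
CAV body mass m_CAV/kg	2000
CAV frontal area S_CAV/m²	2.600
Vehicle speed v(t)/(km/h)	(0, 30)
Vehicle travel distance d(t)/m	(0, 2200)
Vehicle acceleration a_t/(m/s²)	(-4.500, 4.500)
Relative speed between leading and following vehicles Δv_t/(km/h)	(0, 30)
Gap between leading and following vehicles Δx_t/m	(0, 300)
Remaining green time of the current phase σ(t)/s	(0, 40)
Air drag coefficient c_d	0.250
Rolling resistance coefficient c_r	0.005
Curve resistance coefficient c_c	0.300
Energy recovery factor μ	0.350
Gravitational acceleration g/(m/s²)	9.800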
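The vehicle and resistance parameters in Table 3 are the typical inputs of a longitudinal-dynamics energy model. The paper's own equations are not reproduced on this page, so the following is only a simplified sketch of how such parameters are commonly combined: traction power from inertial, aerodynamic, and rolling forces, with the factor μ applied to negative (braking) power. The air density `RHO_AIR`, the omission of curve resistance and drivetrain losses, the interpretation of μ as the recoverable fraction of braking power, and both function names are assumptions.

```python
RHO_AIR = 1.2  # kg/m^3, assumed air density; not listed in Table 3


def traction_power_w(
    v: float,          # speed, m/s
    a: float,          # acceleration, m/s^2
    m: float = 2000.0, # CAV mass, kg (Table 3)
    c_d: float = 0.25, # air drag coefficient (Table 3)
    s: float = 2.6,    # frontal area, m^2 (Table 3)
    c_r: float = 0.005,  # rolling resistance coefficient (Table 3)
    g: float = 9.8,    # gravitational acceleration, m/s^2 (Table 3)
) -> float:
    """Power at the wheels on a straight, level road (curve resistance ignored)."""
    inertial = m * a
    drag = 0.5 * RHO_AIR * c_d * s * v * v
    rolling = c_r * m * g
    return (inertial + drag + rolling) * v


def battery_energy_wh(v: float, a: float, dt: float = 1.0, mu: float = 0.35) -> float:
    """Energy drawn from (positive) or returned to (negative) the battery over dt seconds.

    When braking makes the power negative, only the fraction mu is assumed
    to be recovered, which caps the recoverable energy.
    """
    p = traction_power_w(v, a)
    if p < 0:
        p *= mu
    return p * dt / 3600.0
```

Under these assumptions, cruising at 8 m/s costs a fraction of a watt-hour per second, while braking at 2 m/s² from the same speed returns energy scaled down by μ.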

Table 4. Simulation data under different car-following modes

Car-following mode	Total electric energy consumption/Wh	Travel time/s	Average speed/(km/h)	Mean rate of change of acceleration/(m/s³)
Krauss	211.326	441	17.959	1.249
DDPG	217.627	404	19.639	1.199
DQN	239.185	442	17.918	1.476
A2C	316.511	478	16.565	2.917
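The relative improvements quoted in the abstract can be recomputed from the Table 4 rows. The short sketch below does this arithmetic; the dictionary layout and the helper `reduction_pct` are illustrative names only. From the tabulated values, the reduction in mean rate of change of acceleration versus DQN (about 18.77%) and the reduction in travel time versus Krauss (about 8.39%) match the abstract. The energy reduction versus DQN comes out at about 9.01% from the tabulated totals, slightly below the 9.22% stated in the abstract, which may reflect unrounded or differently aggregated data.

```python
# Values copied from Table 4.
RESULTS = {
    "Krauss": {"energy_wh": 211.326, "travel_time_s": 441, "mean_jerk": 1.249},
    "DDPG":   {"energy_wh": 217.627, "travel_time_s": 404, "mean_jerk": 1.199},
    "DQN":    {"energy_wh": 239.185, "travel_time_s": 442, "mean_jerk": 1.476},
    "A2C":    {"energy_wh": 316.511, "travel_time_s": 478, "mean_jerk": 2.917},
}


def reduction_pct(baseline: float, value: float) -> float:
    """Percentage by which value is lower than baseline."""
    return (baseline - value) / baseline * 100.0


ddpg = RESULTS["DDPG"]
energy_vs_dqn = reduction_pct(RESULTS["DQN"]["energy_wh"], ddpg["energy_wh"])
jerk_vs_dqn = reduction_pct(RESULTS["DQN"]["mean_jerk"], ddpg["mean_jerk"])
time_vs_krauss = reduction_pct(RESULTS["Krauss"]["travel_time_s"], ddpg["travel_time_s"])
```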
