Journal of Chongqing University of Technology(Natural Science) ›› 2023, Vol. 37 ›› Issue (1): 56-65.
• "Intelligent Vehicle Perception and Control in Complex Environments" special column •
Abstract:
Continuous progress in artificial intelligence has pushed automobiles into an intelligent era. The prevailing autonomous driving schemes adopt a hierarchical perception-decision-control architecture, which suffers from several difficulties: (1) rule-based strategies require extensive manual design, making the development process both complex and costly; (2) such systems struggle to adapt to densely populated, complex urban traffic environments; (3) the lower modules are tightly coupled to the upper modules, making system maintenance cumbersome. To address these problems, this paper uses the Carla urban driving simulator to conduct simulation experiments on the lane-keeping task of intelligent driving with the deep deterministic policy gradient (DDPG) algorithm, aiming to remove the heavy dependence on the traditional upper and lower modules through end-to-end control. Furthermore, because DDPG combines reinforcement learning with deep learning, it requires extensive trial and error during training, which is prohibitively costly for vehicle driving. Therefore, in view of this trial-and-error characteristic of DDPG, a real-time monitor for dangerous vehicle behaviors is designed between the environment and the agent: the monitor constrains and corrects the agent's dangerous behaviors, reducing trial-and-error behavior and improving training efficiency.
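The monitor described above sits between the agent and the environment and filters each proposed action before it is executed. The paper does not publish its rules, so the following is only a minimal sketch of the idea under assumed danger checks (the class name `SafetyMonitor`, the thresholds, and the `obstacle_dist` observation key are all hypothetical):

```python
import numpy as np

class SafetyMonitor:
    """Hypothetical real-time monitor between a DDPG agent and the environment:
    it inspects each proposed (steer, throttle) action and overrides the parts
    that simple danger rules flag, before the action reaches the simulator."""

    def __init__(self, max_steer=0.6, min_obstacle_dist=5.0):
        self.max_steer = max_steer                  # steering magnitude limit
        self.min_obstacle_dist = min_obstacle_dist  # metres

    def filter(self, action, obs):
        steer, throttle = float(action[0]), float(action[1])
        # Rule 1: clamp extreme steering that would throw the car out of lane.
        steer = float(np.clip(steer, -self.max_steer, self.max_steer))
        # Rule 2: cut the throttle when an obstacle is dangerously close.
        if obs.get("obstacle_dist", np.inf) < self.min_obstacle_dist:
            throttle = 0.0
        return np.array([steer, throttle])

# An over-aggressive action near an obstacle gets corrected:
monitor = SafetyMonitor()
safe = monitor.filter(np.array([0.9, 0.8]), {"obstacle_dist": 3.0})
# steering is clipped to 0.6 and the throttle is cut to 0.0
```

During training, the corrected action (rather than the raw one) would be sent to the environment, so the agent never executes the flagged behavior while it is still exploring.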
The DDPG algorithm and the supervised DDPG algorithm were each trained for 70 000 episodes in the Carla simulation environment. The simulation results show that the two algorithms ultimately achieve the same training effect: both can effectively avoid obstacles and drive normally without violating traffic rules, but the latter converges faster. Next, taking the map, the number of dynamic actors, and the weather as control variables, the two models were evaluated on the lane-keeping task under a unified evaluation scheme on the experimental platform. The supervised DDPG algorithm achieves average task completion rates of 98% and 89% in the environments without and with dynamic actors respectively, while the original DDPG achieves 97% and 88% in the same two environments.
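The abstract reports results as an "average task completion" percentage without defining the metric. A common reading, assumed here, is the fraction of the route completed per episode, averaged over all evaluation episodes; a sketch under that assumption (function name and inputs are illustrative):

```python
def average_completion(distances_covered, route_lengths):
    """Assumed metric: per-episode completion is distance covered divided by
    route length (capped at 1.0); the reported figure is the mean over all
    evaluation episodes, expressed as a percentage."""
    ratios = [min(d / l, 1.0) for d, l in zip(distances_covered, route_lengths)]
    return 100.0 * sum(ratios) / len(ratios)

# Three evaluation episodes on 100 m routes:
score = average_completion([95.0, 100.0, 80.0], [100.0, 100.0, 100.0])
# mean of 0.95, 1.0, 0.8 -> about 91.7%
```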
Using the DDPG algorithm to control the autonomous vehicle end to end not only effectively mitigates the heavy dependence on upper and lower modules in the traditional scheme, but also shortens the development cycle. Although the final control effect of supervised reinforcement learning matches that of the original algorithm, it significantly improves the convergence speed and effectively reduces the agent's early trial-and-error frequency. Therefore, combining supervised learning with reinforcement learning offers a new solution for reducing the trial-and-error risk of reinforcement learning, and provides a useful reference for carrying end-to-end intelligent driving with deep reinforcement learning from simulation to practical application.
URL: http://clgzk.qks.cqut.edu.cn/EN/
http://clgzk.qks.cqut.edu.cn/EN/Y2023/V37/I1/56