Journal of Chongqing University of Technology(Natural Science) ›› 2023, Vol. 37 ›› Issue (12): 260-266.
• Intelligent Technology •
Abstract: The Traveling Salesman Problem with Time Windows (TSPTW), a variant of the traveling salesman problem, is widely applied in material distribution. Traditional methods for TSPTW suffer from long solution times and poor generalization. To address these shortcomings and improve solution efficiency, this paper models the solution process as a Markov decision process, defines the state, action and reward, and proposes a Transformer + pointer network model based on deep reinforcement learning. The model encodes the input features through multi-head attention and uses the pointer network to compute the probability distribution over solutions. The network is trained with a reinforcement learning algorithm. Experimental results show that the proposed method obtains higher-quality solutions than traditional heuristic algorithms. Moreover, compared with solvers and traditional heuristic algorithms, it markedly improves the final results, and the trained model transfers easily to problem instances of different scales.
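To make the Markov decision process formulation concrete, the following is a minimal illustrative sketch, not the paper's implementation. The abstract does not give the exact state, action or reward definitions, so everything here is an assumption: the state fields in `TSPTWState`, the reward (negative travel distance minus a lateness penalty weighted by the hypothetical constant `LATE_PENALTY`), and the masked softmax in `pointer_probs`, which stands in for the pointer network's output layer and takes precomputed attention scores as input.

```python
import math
from dataclasses import dataclass

LATE_PENALTY = 10.0  # assumed weight on time-window violations (not from the paper)


@dataclass(frozen=True)
class TSPTWState:
    current: int         # index of the node the salesman is at
    time: float          # clock time at which service starts at `current`
    visited: frozenset   # indices of nodes already visited


def step(state, action, coords, windows):
    """Apply one action (visit node `action`) and return (next_state, reward).

    Assumed dynamics: unit travel speed, waiting is allowed when arriving
    before the window opens, and arriving late is penalised, not forbidden.
    """
    (x0, y0), (x1, y1) = coords[state.current], coords[action]
    dist = math.hypot(x1 - x0, y1 - y0)
    arrival = state.time + dist
    open_t, close_t = windows[action]
    start = max(arrival, open_t)          # wait until the window opens
    late = max(0.0, arrival - close_t)    # amount by which the window is missed
    reward = -dist - LATE_PENALTY * late
    next_state = TSPTWState(action, start, state.visited | {action})
    return next_state, reward


def pointer_probs(scores, visited):
    """Masked softmax over per-node scores: visited nodes get probability 0."""
    masked = [-math.inf if i in visited else s for i, s in enumerate(scores)]
    top = max(masked)
    if top == -math.inf:
        raise ValueError("all nodes are already visited")
    exps = [math.exp(s - top) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    coords = [(0, 0), (3, 4), (6, 8)]
    windows = [(0, 100), (10, 20), (0, 8)]
    s0 = TSPTWState(0, 0.0, frozenset({0}))
    s1, r1 = step(s0, 1, coords, windows)
    print(s1, r1)
    print(pointer_probs([1.0, 2.0, 3.0], s1.visited))
```

In a full model, the scores passed to `pointer_probs` would come from attention between a decoder query and the encoded node features, and the accumulated rewards would drive a policy-gradient update. Both parts are omitted here.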
URL: http://clgzk.qks.cqut.edu.cn/EN/
http://clgzk.qks.cqut.edu.cn/EN/Y2023/V37/I12/260