RMIT University

Learning to Optimise Routing Problems using Policy Optimisation

conference contribution
posted on 2024-11-03, 14:31 authored by Nasrin Sultana, Jeffrey Chan, Tabinda Sarwar, Kyle Qin
Deep reinforcement learning (DRL) has shown promising performance in learning effective heuristics for complex combinatorial optimisation problems via policy networks. However, traditional reinforcement learning (RL) suffers from insufficient exploration, which often leads to premature convergence to poor policies and limits the performance of DRL. To address this, we propose an Entropy Regularised Reinforcement Learning (ERRL) method that promotes exploration by producing more stochastic policies, thereby improving optimisation. ERRL incorporates an entropy term, defined over the policy network's outputs, into the loss function of the policy network. Exploration is thus explicitly encouraged, subject to a balance with reward maximisation, reducing the risk of premature convergence to inferior policies. We implement ERRL on top of two existing DRL algorithms and compare our implementations against those two algorithms, as well as several state-of-the-art heuristic-based non-RL approaches, on three categories of routing problems: the travelling salesman problem (TSP), the capacitated vehicle routing problem (CVRP), and multiple routing with fixed fleet problems (MRPFF). Experimental results show that in most test cases the proposed method finds better solutions, faster, than the state-of-the-art algorithms.
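The core idea in the abstract — adding an entropy term over the policy's action distribution to the policy-gradient loss — can be sketched as follows. This is a minimal, hypothetical illustration of a generic entropy-regularised REINFORCE-style loss, not the paper's actual implementation; the function names and the coefficient `beta` are assumptions for illustration.

```python
import math

def policy_entropy(probs):
    """Shannon entropy H(pi) = -sum_a pi(a) * log pi(a) over the
    action distribution output by the policy network."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def errl_loss(log_prob_action, advantage, action_probs, beta=0.01):
    """Entropy-regularised policy loss (illustrative sketch):
        L = -advantage * log pi(a|s) - beta * H(pi(.|s))
    The -beta * H term rewards more stochastic (higher-entropy)
    policies, trading off exploration against reward maximisation;
    beta controls the balance and is a hypothetical hyperparameter here."""
    return -advantage * log_prob_action - beta * policy_entropy(action_probs)

# A uniform distribution over 4 actions (maximum entropy) is penalised
# less than a near-deterministic one, encouraging continued exploration.
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.97, 0.01, 0.01, 0.01]
loss_uniform = errl_loss(math.log(0.25), 1.0, uniform, beta=0.1)
loss_peaked = errl_loss(math.log(0.97), 1.0, peaked, beta=0.1)
```

With `beta = 0` this reduces to the standard REINFORCE objective; a positive `beta` lowers the loss for high-entropy policies, which is the mechanism the abstract describes for avoiding premature convergence.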

History

Related Materials

  1. DOI - Is published in 10.1109/IJCNN52387.2021.9534010
  2. ISBN - Is published in 9781665445979 (urn:isbn:9781665445979)

Start page

6043

End page

6050

Total pages

8

Outlet

Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN 2021)

Name of conference

IJCNN 2021

Publisher

IEEE

Place published

United States

Start date

2021-07-18

End date

2021-07-22

Language

English

Copyright

© 2021 IEEE

Former Identifier

2006110650

Esploro creation date

2022-02-19
