RMIT University

Enhancing combinatorial optimization through solution prediction using machine learning

thesis
posted on 2024-11-24, 05:19 authored by Yunzhuang Shen
This thesis aims to enhance traditional combinatorial optimization techniques by using machine learning (ML) for solution prediction. Combinatorial optimization is essential for solving problems that affect our daily lives. It supports sound decisions in complex and pressing situations, leading to improved outcomes in fields such as healthcare, transportation, environmental sustainability, energy productivity, and business. However, many of these real-world problems are NP-hard, and their ever-increasing sizes make it challenging to find (near-)optimal solutions.
Devising efficient and effective methods often requires leveraging the unique structure of a problem, yet to date specialized methods have been developed for only a limited number of problems. Many challenging real-world problems must still be tackled with general-purpose solvers. Such a solver typically relies on a sophisticated implementation of a mixed-integer (linear) programming (MIP) framework, and its efficiency depends heavily on the effectiveness of the heuristic rules applied at its critical decision points. For instance, primal heuristics are responsible for finding good feasible solutions, and they have been shown to significantly influence the performance of exact MIP solvers. These heuristics, however, are designed without adequate problem-specific knowledge, primarily because obtaining such knowledge is difficult and prohibitively laborious. Additional tools that make it more efficient to create effective heuristics are therefore important for continued progress in combinatorial optimization. ML holds great potential to address this challenge: it can automatically discover valuable patterns in data and leverage them to produce good solutions for a variety of challenging tasks, such as image classification.
In combinatorial optimization, many real-world applications require solving similar problem instances repeatedly, which yields rich data with exploitable statistical patterns. This provides an opportunity to use ML to discover effective heuristics, either for constructing high-quality solutions directly or for enhancing critical decision-making processes in MIP solvers. The research in this thesis centers on ML-based solution prediction and its integration with traditional combinatorial optimization techniques. Solution prediction involves training an ML model offline on problem instances with known optimal solutions and then using the model to predict optimal solutions for previously unseen, yet similar, problem instances.
While a few recent studies have demonstrated the potential of solution prediction for enhancing heuristic methods, the effectiveness of such approaches depends strongly on the accuracy of the ML predictions and can decline rapidly as problem complexity (e.g., size) increases. In addition, how the use of ML predictions affects solution quality has not been adequately investigated in existing research. The first part of our research addresses these challenges: improving the accuracy of ML predictions and optimizing their use in search processes. Our overarching objective is to develop more effective solution-prediction-based heuristic searches for large-scale, general combinatorial problems.
The second part of this research focuses on accelerating exact combinatorial optimization, which delivers provably optimal solutions. In particular, we investigate the use of ML-based solution prediction to enhance a widely used decomposition method, column generation (CG). CG is primarily used for solving large-scale routing and scheduling applications, which arise in industries such as the airline sector. By effectively exploiting the problem structure, CG establishes tight dual bounds that can significantly enhance exact combinatorial methods. It decomposes and relaxes the problem into a master linear programming (LP) problem and several combinatorial subproblems. The master LP typically contains an exponential number of columns, which correspond to solutions of the subproblems. Hence, the master LP must be solved iteratively, beginning with a restricted subset of columns. In a single CG iteration, the dual values of this restricted master are used to define the combinatorial subproblems, which generate columns with negative reduced costs. At optimality, all the optimal columns (i.e., columns with non-zero values in the optimal LP solution) have been generated, and the dual values have converged to the optimal dual solution.
There are several open challenges for CG, particularly the inefficiency of repeatedly solving combinatorial subproblems and the slow, unstable convergence of the dual values. Existing research addresses the first challenge with efficient heuristic methods, but these often compromise the quality of the generated columns. Furthermore, generating multiple promising columns in a single CG iteration has shown advantages, a factor often overlooked by existing methods. Our first research objective is therefore to develop a heuristic search that better addresses these concerns through solution prediction. In addition, obtaining high-quality dual values that are close to the optimal dual solution is crucial for accelerating CG, yet existing research relies on handcrafted heuristics to estimate the optimal dual solution. Our second research objective is to obtain more accurate predictions of optimal dual solutions via ML. The outcomes of our research can lead to more efficient CG approaches for a specific scheduling problem and have the potential to benefit many real-life applications.
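To make the pricing step of a CG iteration concrete, the following is a minimal sketch, not the method developed in the thesis. It assumes a minimization master LP with covering constraints, so a column with cost c and coefficient vector a has reduced cost c minus the dual-weighted sum of its coefficients. It also assumes the candidate columns are given as an explicit list, whereas in practice they come from solving a combinatorial subproblem. The names `reduced_cost`, `price_columns`, `k`, and `tol` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Column = Tuple[float, Sequence[float]]  # (cost, constraint coefficients)


def reduced_cost(cost: float, column: Sequence[float], duals: Sequence[float]) -> float:
    """Reduced cost of a column: its cost minus the dual-weighted coefficients."""
    return cost - sum(pi * a for pi, a in zip(duals, column))


def price_columns(
    candidates: List[Column],
    duals: Sequence[float],
    k: int = 2,
    tol: float = 1e-9,
) -> List[Tuple[float, int]]:
    """Return up to k (reduced cost, index) pairs with negative reduced cost.

    Assumption: candidates are enumerated explicitly; a real pricing step
    would search a combinatorial subproblem instead.
    """
    scored = [
        (reduced_cost(cost, col, duals), idx)
        for idx, (cost, col) in enumerate(candidates)
    ]
    # Keep only improving columns, most negative reduced cost first.
    improving = sorted(item for item in scored if item[0] < -tol)
    return improving[:k]


# Toy data (illustrative only): four unit-cost columns over three constraints.
candidates = [
    (1.0, [1, 0, 0]),
    (1.0, [1, 1, 0]),
    (1.0, [0, 1, 1]),
    (1.0, [1, 1, 1]),
]
duals = [0.75, 0.5, 0.25]  # dual values from a hypothetical restricted master

selected = price_columns(candidates, duals, k=2)
print(selected)
```

Returning several improving columns per call mirrors the observation above that adding multiple promising columns in one iteration can help. An empty result means no candidate has a negative reduced cost, which is the stopping condition for CG under these assumptions.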

History

Degree Type

Doctorate by Research

Imprint Date

2023-01-01

School name

School of Computing Technologies, RMIT University

Former Identifier

9922263813301341

Open access

  • Yes
