# Surrogate Model Algorithms for Computationally Expensive Black-Box Global Optimization Problems

##### Müller, Juliane (2012)

Müller, Juliane

Tampere University of Technology

2012

Luonnontieteiden ja ympäristötekniikan tiedekunta - Faculty of Science and Environmental Engineering

This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.

**Julkaisun pysyvä osoite on**

https://urn.fi/URN:ISBN:978-952-15-2991-7

##### Tiivistelmä

Surrogate models (also called response surface models or metamodels) have been widely used in the literature to solve continuous black-box global optimization problems that have computationally expensive objective functions. Different surrogate models such as radial basis functions, kriging, multivariate adaptive regression splines, and polynomial regression models have been used in various applications. It is in general however unknown which model will perform best for a given application, and computation time restrictions do not allow trying different models. Thus, in the first part of this thesis, a family of algorithms (SO-M, SO-M-c, SO-M-s) based on using a mixture of surrogate models is developed. The second part of the thesis extends the research in using surrogate models for mixed-integer (algorithm SO-MI) and purely integer (algorithms SO-I) optimization problems. Finally, a real world application problem arising in the agricultural land use management of a watershed is examined (algorithms SO-Ic).

The algorithm SO-M uses Dempster-Shafer theory to combine information derived from various model characteristics in order to determine the influence of individual models in the mixture. Extensions of SO-M with respect to the sampling strategy (algorithms SO-M-c and SO-M-s) have been compared in numerical experiments, and it was found that whenever it is a priori unknown which surrogate model should be used, it is advisable to use a mixture model in order to prevent accidentally selecting the worst model. It could be shown that mixture models containing radial basis function interpolants generally work very well, whereas using only polynomial regression models should be avoided. Moreover, algorithms using mixture models often outperform the algorithms that use only the single models that are contributing to the mixture.

Although there are many computationally expensive black-box optimization applications that have besides continuous also integer variables, or that have only integer variables, algorithms for solving these types of problems are scarce. In the second part of this thesis two algorithms, namely SO-MI for mixed-integer problems, and SO-I for purely integer problems have been developed and were shown to find accurate solutions for computationally expensive problems with black-box objective functions and possibly black-box constraints. The constraints were treated with a penalty approach and numerical experiments showed that the surrogate model based algorithms outperformed commonly used algorithms for (mixed-) integer problems such as branch and bound, and genetic algorithms. Also NOMAD (Nonsmooth Optimization by Mesh Adaptive Direct Search) has been included in the comparison. NOMAD is suitable for integer and mixed-integer black-box problems, but its performance for these problem types has not been studied in the literature. In the numerical experiments, NOMAD also proved superior as compared to branch and bound and the genetic algorithm, but it performed worse than SO-I and SO-MI for most test problems.

Lastly, the algorithm SO-I has been further extended to directly handling constraints with a response surface. The algorithm, SO-Ic, has been developed specifically for a watershed management problem that has only one constraint, but SO-Ic is easily generalizable for problems with more constraints. In the considered application problem parts of the agricultural land in the Cannonsville reservoir watershed in upstate New York have to be retired in order to decrease the total phosphorus runoff to a given limit at minimal cost. A computationally expensive simulation model has to be used to compute the costs and phosphorus runoff. The performance of SO-Ic has been compared to a genetic algorithm, NOMAD, and the discrete dynamically dimensioned search algorithm on three problem instances with different sizes of the feasible region. The surrogate model based algorithm SO-Ic performed also for these problems significantly better than all other algorithms and could be shown to be the most robust.

The algorithm SO-M uses Dempster-Shafer theory to combine information derived from various model characteristics in order to determine the influence of individual models in the mixture. Extensions of SO-M with respect to the sampling strategy (algorithms SO-M-c and SO-M-s) have been compared in numerical experiments, and it was found that whenever it is a priori unknown which surrogate model should be used, it is advisable to use a mixture model in order to prevent accidentally selecting the worst model. It could be shown that mixture models containing radial basis function interpolants generally work very well, whereas using only polynomial regression models should be avoided. Moreover, algorithms using mixture models often outperform the algorithms that use only the single models that are contributing to the mixture.

Although there are many computationally expensive black-box optimization applications that have besides continuous also integer variables, or that have only integer variables, algorithms for solving these types of problems are scarce. In the second part of this thesis two algorithms, namely SO-MI for mixed-integer problems, and SO-I for purely integer problems have been developed and were shown to find accurate solutions for computationally expensive problems with black-box objective functions and possibly black-box constraints. The constraints were treated with a penalty approach and numerical experiments showed that the surrogate model based algorithms outperformed commonly used algorithms for (mixed-) integer problems such as branch and bound, and genetic algorithms. Also NOMAD (Nonsmooth Optimization by Mesh Adaptive Direct Search) has been included in the comparison. NOMAD is suitable for integer and mixed-integer black-box problems, but its performance for these problem types has not been studied in the literature. In the numerical experiments, NOMAD also proved superior as compared to branch and bound and the genetic algorithm, but it performed worse than SO-I and SO-MI for most test problems.

Lastly, the algorithm SO-I has been further extended to directly handling constraints with a response surface. The algorithm, SO-Ic, has been developed specifically for a watershed management problem that has only one constraint, but SO-Ic is easily generalizable for problems with more constraints. In the considered application problem parts of the agricultural land in the Cannonsville reservoir watershed in upstate New York have to be retired in order to decrease the total phosphorus runoff to a given limit at minimal cost. A computationally expensive simulation model has to be used to compute the costs and phosphorus runoff. The performance of SO-Ic has been compared to a genetic algorithm, NOMAD, and the discrete dynamically dimensioned search algorithm on three problem instances with different sizes of the feasible region. The surrogate model based algorithm SO-Ic performed also for these problems significantly better than all other algorithms and could be shown to be the most robust.

##### Kokoelmat

- Väitöskirjat [4793]