Home Curriculum Vitae Current Research Publications Books Software Presentations Photos Contact MEDAL MEDAL Blogging BOA hBOA External links:    hBOATM    IlliGAL    ACM SIGEVO    www.arxiv.org    more... Martin Pelikan - Publication See also How to: Click on the title of the paper of your interest to get more information about the paper, including an abstract, publication data, and a downloadable version (if available). Please send me an email to report any problems with the d ownload or errors in text. 2017Martin Pelikan, Giles Hogben, Ulfar Erlingsson (Google Security and Privacy) Identifying Intrusive Mobile Apps Using Peer Group Analysis 2012Martin Pelikan and Mark W. Hauschild Learn from the past: Improving model-directed optimization by transfer learning based on distance-based bias Alexander E.I. Brownlee, John A. W. McCall, and Martin Pelikan Influence of selection on structure learning in Markov network EDAs: An empirical study B. Hoda Helmi, Martin Pelikan, and Adel Rahmani Linkage learning using the maximum spanning tree of the dependency graph Martin Pelikan, Mark W. Hauschild, and Pier Luca Lanzi Transfer learning, soft distance-based bias, and the hierarchical BOA Martin Pelikan, Mark W. Hauschild, and Fernando G. Lobo Introduction to estimation of distribution algorithms Mark W. Hauschild, Sanjiv Bhatia and Martin Pelikan Image segmentation using a genetic algorithm and hierarchical local search Mark W. Hauschild, Martin Pelikan, Kumara Sastry, and David E. Goldberg Using Previous Models to Bias Structural Learning in the Hierarchical BOA Martin Pelikan and Mark W. Hauschild Distance-based bias in model-directed optimization of additively decomposable problems 2011Mark Hauschild and Martin Pelikan An Introduction and Survey of Estimation of Distribution Algorithms Mark Hauschild and Martin Pelikan Advanced Neighborhoods and Problem Difficulty Measures Martin Pelikan, Mark Hauschild, and Dirk Thierens Pairwise and Problem-Specific Distance Metrics in the Linkage Tree Genetic Algorithm 2010Martin Pelikan Genetic Algorithms Mark W. Hauschild and Martin Pelikan Performance of Network Crossover on NK Landscapes and Spin Glasses Claudio F. Lima, Martin Pelikan, Fernando G. Lobo, and David E. Goldberg Loopy Substructural Local Search for the Bayesian Optimization Algorithm Claudio F. Lima, Fernando G. Lobo, Martin Pelikan, and David E. Goldberg Model Accuracy in the Bayesian Optimization Algorithm Mark W. Hauschild and Martin Pelikan Network Crossover Performance on NK Landscapes and Deceptive Problems Elizabeth Radetic and Martin Pelikan Spurious Dependencies and EDA Scalability 2009Martin Pelikan Evolucne Algoritmy (in Slovak) Tian-Li Yu, David E. Goldberg, Kumara Sastry, Claudio F. Lima, and Martin Pelikan Dependency Structure Matrix, Genetic Algorithms, and Effective Recombination Martin Pelikan, Greg L. Hura, and Michal Hammel Structure and Flexibility within Proteins as Identified through Small Angle X-ray Scattering Mark Hauschild and Martin Pelikan Intelligent Bias of Network Structures in the Hierarchical BOA Martin Pelikan and Helmut G. Katzgraber Analysis of Evolutionary Algorithms on the One-Dimensional Spin Glass with Power-Law Interactions Elizabeth Radetic, Martin Pelikan, and David E. Goldberg Effects of a Deterministic Hill climber on hBOA Martin Pelikan, Kumara Sastry, David E. Goldberg, Martin V. Butz, and Mark Hauschild Performance of Evolutionary Algorithms on NK Landscapes with Nearest Neighbor Interactions and Tunable Overlap Martin Pelikan and Kumara Sastry Initial-Population Bias in the Univariate Estimation of Distribution Algorithm 2008Martin Pelikan, Kumara Sastry, and David E. Goldberg Sporadic Model Building for Efficiency Enhancement of the Hierarchical BOA Claudio F. Lima, Fernando G. Lobo, and Martin Pelikan From Mating Pool Distributions to Model Overfitting Mark Hauschild and Martin Pelikan Enhancing Efficiency of Hierarchical BOA via Distance-Based Model Restrictions Alexander Brownlee, Martin Pelikan, John McCall, and Andrei Petrovski An Application of a Multivariate Estimation of Distribution Algorithm to Cancer Chemotherapy Martin Pelikan, Helmut G. Katzgraber, and Sigismund Kobe Finding Ground States of Sherrington-Kirkpatrick Spin Glasses with Hierarchical BOA and Genetic Algorithms Mark Hauschild, Martin Pelikan, Kumara Sastry, and David E. Goldberg Using Previous Models to Bias Structural Learning in the Hierarchical BOA Martin Pelikan, Kumara Sastry, and David E. Goldberg iBOA: The Incremental Bayesian Optimization Algorithm 2007Claudio F. Lima, Martin Pelikan, David E. Goldberg, Fernando G. Lobo, Kumara Sastry, and Mark Hauschild Influence of Selection and Replacement Strategies on Linkage Learning in BOA Rajiv Kalapala, Martin Pelikan and Alexander K. Hartmann Hybrid Evolutionary Algorithms on Minimum Vertex Cover for Random Graphs Martin Pelikan, Shigeyoshi Tsutsui, and Rajiv Kalapala Dependency Trees, Permutations, and Quadratic Assignment Problem Martin Pelikan and Alexander K. Hartmann Obtaining Ground States of Ising Spin Glasses via Optimizing Bonds Instead of Spins Mark Hauschild, Martin Pelikan, Claudio F. Lima, and Kumara Sastry Analyzing Probabilistic Models in Hierarchical BOA on Traps and Spin Glasses 2006Shigeyoshi Tsutsui, Martin Pelikan, and David E. Goldberg Node Histogram vs. Edge Histogram: A Comparison of PMBGAs in Permutation Domains Shigeyoshi Tsutsui and Martin Pelikan Cunning Ant System: An Extension of Edge Histogram Sampling Algorithms to ACO Claudio F. Lima, Martin Pelikan, Kumara Sastry, Martin Butz, David E. Goldberg, and Fernando G. Lobo Substructural Neighborhoods for Local Search in the Bayesian Optimization Algorithm Tian-Li Yu, Kumara Sastry, David E. Goldberg, and Martin Pelikan Population Sizing for Entropy-Based Model Building in Genetic Algorithms Martin Pelikan and James D. Laury, Jr. Order or Not: Does Parallelization of Model Building in hBOA Affect Its Scalability? Kumara Sastry, Martin Pelikan, and David E. Goldberg Analysis of Ideal Recombination on Random Decomposable Problems Martin Pelikan, Kumara Sastry, Martin V. Butz, and David E. Goldberg Generator and Interface for Random Decomposable Problems in C Martin Pelikan and Alexander K. Hartmann Hierarchical BOA, Cluster Exact Approximation, and Ising Spin Glasses Martin Pelikan, Kumara Sastry, Martin V. Butz, and David E. Goldberg Hierarchical BOA on Random Decomposable Problems Martin Pelikan, Kumara Sastry, and David E. Goldberg Sporadic Model Building for Efficiency Enhancement of hBOA 2005Martin V. Butz, Martin Pelikan, Xavier Llora, and David E. Goldberg Automated Global Structure Extraction for Effective Local Building Block Processing in XCS Martin V. Butz, Martin Pelikan, Xavier Llora, and David E. Goldberg Extracted Global Structure Makes Local Building Block Processing Effective in XCS Martin Pelikan, Kumara Sastry, and David E. Goldberg Multiobjective hBOA, Clustering, and Scalability Kumara Sastry, Martin Pelikan, and David E. Goldberg Decomposable Problems, Niching, and Scalability of Multiobjective Estimation of Distribution Algorithms Radovan Ondas, Martin Pelikan, and Kumara Sastry Scalability of Genetic Programming and Probabilistic Incremental Program Evolution Frank Potthast, Jiri Ocenasek, Dorothea Rutishauser, Martin Pelikan, and Ralph Schlapbach Database Independent Detection of Isotopically Labeled MS/MS Spectrum Peptide Pairs 2004Kumara Sastry, David E. Goldberg, and Martin Pelikan Efficiency Enhancement of Probabilistic Model Building Genetic Algorithms Kumara Sastry, David E. Goldberg, and Martin Pelikan Efficiency Enhancement of Probabilistic Model Building Genetic Algorithms Jiri Ocenasek and Martin Pelikan Parallel Mixed Bayesian Optimization Algorithm: A Scaleup Analysis Kumara Sastry, Martin Pelikan, and David E. Goldberg Efficiency Enhancement of Genetic Algorithms via Building-Block-Wise Fitness Estimation Martin Pelikan, Jiri Ocenasek, Simon Trebst, Matthias Troyer, and Fabien Alet Computational Complexity and Simulation of Rare Events of Ising Spin Glasses Martin Pelikan and Tz-Kai Lin Parameter-less Hierarchical BOA Martin Pelikan and Kumara Sastry Fitness inheritance in the Bayesian optimization algorithm 2003Shigeyoshi Tsutsui, Martin Pelikan, and David E. Goldberg Using edge histogram models to solve permutation problems with probabilistic model-building genetic algorithms Martin Pelikan, David E. Goldberg, Jiri Ocenasek, and Simon Trebst Robust and Scalable Black-Box Optimization, Hierarchy, and Ising Spin Glasses Jiri Ocenasek, Josef Schwarz, and Martin Pelikan Design of Multithreaded Estimation of Distribution Algorithms Martin Pelikan and David E. Goldberg Hierarchical BOA Solves Ising Spin Glasses and MAXSAT 2002Shigeyoshi Tsutsui, David E. Goldberg, and Martin Pelikan Solving Sequence Problems by Building and Sampling Edge Histograms Nazan Khan, David E. Goldberg, Martin Pelikan Multi-objective Bayesian Optimization Algorithm 2001Martin Pelikan, Kumara Sastry, David E. Goldberg Evolutionary Algorithms + Graphical Models = Scalable Black-Box Optimization Alessio Ceroni, Martin Pelikan, David E. Goldberg Convergence-Time Models for the Simple Genetic Algorithm with Finite Population Martin Pelikan, David E. Goldberg, Shigeyoshi Tsutsui Combining the Strengths of the Bayesian Optimization Algorithm and Adaptive Evolution Strategies Shigeyoshi Tsutsui, Martin Pelikan, and David E. Goldberg Evolutionary Algorithm Using Marginal Histogram Models in Continuous Domain Kumara Sastry, David E. Goldberg, and Martin Pelikan Don't Evaluate, Inherit Martin V. Butz and Martin Pelikan Analyzing the Evolutionary Pressures in XCS Martin Pelikan and David E. Goldberg Escaping Hierarchical Traps with Competent Genetic Algorithms 2000Martin Pelikan and David E. Goldberg Genetic Algorithms, Clustering, and the Breaking of Symmetry Martin Pelikan and David E. Goldberg Research on the Bayesian Optimization Algorithm Martin Pelikan and David E. Goldberg Hierarchical Problem Solving and the Bayesian Optimization Algorithm Martin Pelikan, David E. Goldberg, and Erick Cantu-Paz Bayesian Optimization Algorithm, Population Sizing, and Time to Convergence 1999Martin Pelikan, David E. Goldberg, and Kumara Sastry Bayesian Optimization Algorithm, Decision Graphs, and Occam's Razor Martin Pelikan, David E. Goldberg, and Fernando Lobo A Survey of Optimization by Building and Using Probabilistic Models Martin Pelikan and Fernando Lobo Parameter-less Genetic Algorithm: A Worst-case Time and Space Complexity Analysis Martin Pelikan, David E. Goldberg, and Erick Cantu-Paz BOA: The Bayesian optimization algorithm Martin Pelikan and Heinz Muehlenbein The Bivariate Marginal Distribution Algorithm 1998Martin Pelikan, David E. Goldberg, and Erick Cantu-Paz Linkage Problem, Distribution Estimation, and Bayesian Networks Martin Pelikan, and Heinz Muehlenbein Marginal Distributions in Evolutionary Algorithms 1996Vladimir Kvasnicka, Jiri Pospichal, and Martin Pelikan Read's Linear Codes and Evolutionary Computation Over Populations of Rooted Trees Vladimir Kvasnicka, Jiri Pospichal, and Martin Pelikan Stochastic Simulation of Multiagent Models Vladimir Kvasnicka, Martin Pelikan, and Jiri Pospichal Hill Climbing with Learning (An Abstraction of Genetic Algorithm) 2017 Identifying Intrusive Mobile Apps Using Peer Group Analysis Martin Pelikan, Giles Hogben, Ulfar Erlingsson (Google Security and Privacy) Abstract https://security.googleblog.com/2017/07/identifying-intrusive-mobile-apps-using.html ReferenceM. Pelikan, G. Hogben, and U. Erlingsson (2017). Identifying intrusive mobile apps using peer group analysis. Google Security Blog 2017/07/12. https://security.googleblog.com/2017/07/identifying-intrusive-mobile-apps-using.html 2012 Learn from the past: Improving model-directed optimization by transfer learning based on distance-based bias Martin Pelikan and Mark W. Hauschild Abstract For many optimization problems it is possible to define a problem-specific distance metric over decision variables that correlates with the strength of interactions between the variables. Examples of such problems include additively decomposable functions, facility location problems, and atomic cluster optimization. However, the use of such a metric for enhancing efficiency of optimization techniques is often not straightforward. This paper describes a framework that allows optimization practitioners to improve efficiency of model-directed optimization techniques by combining such a distance metric with information mined from previous optimization runs on similar problems. The framework is demonstrated and empirically evaluated in the context of the hierarchical Bayesian optimization algorithm (hBOA). Experimental results provide strong empirical evidence that the proposed approach provides significant speedups and that it can be effectively combined with other efficiency enhancements. The paper demonstrates how straightforward it is to adapt the proposed framework to other model-directed optimization techniques by presenting several examples. ReferenceMartin Pelikan and Mark W. Hauschild (2012). Learn from the past: Improving model-directed optimization by transfer learning based on distance-based bias. MEDAL Report No. 2012007, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Download PS Influence of selection on structure learning in Markov network EDAs: An empirical study Alexander E.I. Brownlee, John A. W. McCall, and Martin Pelikan Abstract Learning a good model structure is important to the efficient solving of problems by estimation of distribution algorithms. In this paper we present the results of a series of experiments, applying a structure learning algorithm for undirected probabilistic graphical models based on statistical dependency tests to three fitness functions with different selection operators, proportions and pressures. The number of spurious interactions found by the algorithm are measured and reported. Truncation selection, and its complement (selecting only low fitness solutions) prove quite robust, resulting in a similar number of spurious dependencies regardless of selection pressure. In contrast, tournament and fitness proportionate selection are strongly affected by the selection proportion and pressure. ReferenceAlexander E.I. Brownlee, John A. W. McCall, and Martin Pelikan (2012). Influence of selection on structure learning in Markov network EDAs: An empirical study. MEDAL Report No. 2012006, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Linkage learning using the maximum spanning tree of the dependency graph B. Hoda Helmi, Martin Pelikan, and Adel Rahmani Abstract The goal of linkage learning in genetic and evolutionary algorithms is to identify the interactions between variables of a problem. Knowing the linkage information helps search algorithms to find the optimum solution efficiently and reliably in hard problems. This paper presents a simple approach for linkage learning based on the graph theory. A graph is used as the structure to keep the the pairwise dependencies between variables of the problem. We call this graph 'the underlying dependency graph of the problem'. Then maximum spanning tree (MST) of the dependency graph is found. It is shown that MST contains all the necessary linkage if the dependency graph is built upon enough population. In this approach, pairwise dependencies calculated based on a perturbation based identification method, are used as the variable dependencies. The proposed approach has the advantage of being capable of learning the linkage without the need for the costly fit-to-data evaluations for model search. It is parameter-less and the algorithm description is simple and straight forward. The proposed technique is tested on several benchmark problems and it is shown to be able to compete with similar approaches. Based on the experimental results it can successfully find the linkage groups in a polynomial number of fitness evaluations. ReferenceB. Hoda Helmi, Martin Pelikan, and Adel Rahmani (2012). Linkage learning using the maximum spanning tree of the dependency graph. MEDAL Report No. 2012005, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Transfer learning, soft distance-based bias, and the hierarchical BOA Martin Pelikan, Mark W. Hauschild, and Pier Luca Lanzi Abstract An automated technique has recently been proposed to transfer learning in the hierarchical Bayesian optimization algorithm (hBOA) based on distance-based statistics. The technique enables practitioners to improve hBOA efficiency by collecting statistics from probabilistic models obtained in previous hBOA runs and using the obtained statistics to bias future hBOA runs on similar problems. The purpose of this paper is threefold: (1) test the technique on several classes of NP-complete problems, including MAXSAT, spin glasses and minimum vertex cover; (2) demonstrate that the technique is effective even when previous runs were done on problems of different size; (3) provide empirical evidence that combining transfer learning with other efficiency enhancement techniques can often yield nearly multiplicative speedups. ReferenceMartin Pelikan, Mark W. Hauschild, and Pier Luca Lanzi (2012). Transfer learning, soft distance-based bias, and the hierarchical BOA. MEDAL Report No. 2012004, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Download PS Introduction to estimation of distribution algorithms Martin Pelikan, Mark W. Hauschild, and Fernando G. Lobo Abstract Estimation of distribution algorithms (EDAs) guide the search for the optimum by building and sampling explicit probabilistic models of promising candidate solutions. However, EDAs are not only optimization techniques; besides the optimum or its approximation, EDAs provide practitioners with a series of probabilistic models that reveal a lot of information about the problem being solved. This information can in turn be used to design problem-specific neighborhood operators for local search, to bias future runs of EDAs on a similar problem, or to create an efficient computational model of the problem. This chapter provides an introduction to EDAs as well as a number of pointers for obtaining more information about this class of algorithms. ReferenceMartin Pelikan, Mark W. Hauschild, and Fernando G. Lobo (2012). Introduction to estimation of distribution algorithms. MEDAL Report No. 2012003, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Download PS Image segmentation using a genetic algorithm and hierarchical local search Mark W. Hauschild, Sanjiv Bhatia and Martin Pelikan Abstract This paper proposes a hybrid genetic algorithm to perform image segmentation based on applying the q-state Potts spin glass model to a grayscale image. First, the image is converted to a set of weights for a q-state spin glass and then a steady-state genetic algorithm is used to evolve candidate segmented images until a suitable candidate solution is found. To speed up the convergence to an adequate solution, hierarchical local search is used on each evaluated solution. The results show that the hybrid genetic algorithm with hierarchical local search is able to efficiently perform image segmentation. The necessity of hierarchical search for these types of problems is also clearly demonstrated. ReferenceMark W. Hauschild, Sanjiv Bhatia and Martin Pelikan (2012). Image Segmentation using a Genetic Algorithm and Hierarchical Local Search. MEDAL Report No. 2012002, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Download PS Using Previous Models to Bias Structural Learning in the Hierarchical BOA Mark W. Hauschild, Martin Pelikan, Kumara Sastry, and David E. Goldberg Abstract Estimation of distribution algorithms (EDAs) are stochastic optimization techniques that explore the space of potential solutions by building and sampling explicit probabilistic models of promising candidate solutions. While the primary goal of applying EDAs is to discover the global optimum or at least its accurate approximation, besides this, any EDA provides us with a sequence of probabilistic models, which in most cases hold a great deal of information about the problem. Although using problem-specific knowledge has been shown to significantly improve performance of EDAs and other evolutionary algorithms, this readily available source of problem-specific information has been practically ignored by the EDA community. This paper takes the first step toward the use of probabilistic models obtained by EDAs to speed up the solution of similar problems in the future. More specifically, we propose two approaches to biasing model building in the hierarchical Bayesian optimization algorithm (hBOA) based on knowledge automatically learned from previous hBOA runs on similar problems. We show that the proposed methods lead to substantial speedups and argue that the methods should work well in other applications that require solving a large number of problems with similar structure. ReferenceMark W. Hauschild, Martin Pelikan, Kumara Sastry, and David E. Goldberg (2012). Using Previous Models to Bias Structural Learning in the Hierarchical BOA. Evolutionary Computation 20(1), 135-160. Download PDF Distance-based bias in model-directed optimization of additively decomposable problems Martin Pelikan and Mark W. Hauschild Abstract For many optimization problems it is possible to define a distance metric between problem variables that correlates with the likelihood and strength of interactions between the variables. For example, one may define a metric so that the dependencies between variables that are closer to each other with respect to the metric are expected to be stronger than the dependencies between variables that are further apart. The purpose of this paper is to describe a method that combines such a problem-specific distance metric with information mined from probabilistic models obtained in previous runs of estimation of distribution algorithms with the goal of solving future problem instances of similar type with increased speed, accuracy and reliability. While the focus of the paper is on additively decomposable problems and the hierarchical Bayesian optimization algorithm, it should be straightforward to generalize the approach to other model-directed optimization techniques and other problem classes. Compared to other techniques for learning from experience put forward in the past, the proposed technique is both more practical and more broadly applicable. ReferenceMartin Pelikan and Mark Hauschild (2012). Distance-based bias in model-directed optimization of additively decomposable problems. MEDAL Report No. 2012001, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Download PS 2011 An Introduction and Survey of Estimation of Distribution Algorithms Mark Hauschild and Martin Pelikan Abstract Estimation of distribution algorithms (EDAs) are stochastic optimization techniques that explore the space of potential solutions by building and sampling explicit probabilistic models of promising candidate solutions. This explicit use of probablistic models in optimization offers some significant advantages over other types of metaheuristics. This paper discusses these advantages and outlines many of the different types of EDAs. In addition, some of the most powerful efficiency enhancement techniques applied to EDAs are discussed. ReferenceMark Hauschild and Martin Pelikan (2011). A Survey of Estimation of Distribution Algorithms. MEDAL Report No. 2011004, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Swarm and Evolutionary Computation, 1(3), 111-128. Download PDF Advanced Neighborhoods and Problem Difficulty Measures Mark Hauschild and Martin Pelikan Abstract While different measures of problem difficulty of fitness landscapes have been proposed, recent studies have shown that many of the common ones do not closely correspond to the actual difficulty of problems when solved by evolutionary algorithms. One of the reasons for this is that most problem difficulty measures are based on neighborhood structures that are quite different from those used in most evolutionary algorithms. This paper examines several ways to increase the accuracy of problem difficulty measures by including linkage information in the measure to more accurately take into account the advanced neighborhoods explored by some evolutionary algorithms. The effects of these modifications of problem difficulty are examined in the context of several simple and advanced evolutionary algorithms. The results are then discussed and promising areas for future research are proposed. ReferenceMark Hauschild and Martin Pelikan (2011). Advanced Neighborhoods and Problem Difficulty Measures. MEDAL Report No. 2011003, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2011), ACM Press, 625-632. Download PDF Download PS Analysis of Epistasis Correlation on NK Landscapes with Nearest-Neighbor Interactions Martin Pelikan Abstract Epistasis correlation is a measure that estimates the strength of interactions between problem variables. This paper presents an empirical study of epistasis correlation on a large number of random problem instances of NK landscapes with nearest neighbor interactions. The results are analyzed with respect to the performance of hybrid variants of two evolutionary algorithms: (1) the genetic algorithm with uniform crossover and (2) the hierarchical Bayesian optimization algorithm. ReferenceMartin Pelikan (2011). Analysis of Epistasis Correlation on NK Landscapes with Nearest-Neighbor Interactions. MEDAL Report No. 2011002, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2011), ACM Press, 1013-1020. Download PDF Download PS Pairwise and Problem-Specific Distance Metrics in the Linkage Tree Genetic Algorithm Martin Pelikan, Mark Hauschild, and Dirk Thierens Abstract The linkage tree genetic algorithm (LTGA) identifies linkages between problem variables using an agglomerative hierarchical clustering algorithm and linkage trees. This enables LTGA to solve many decomposable problems that are difficult with more conventional genetic algorithms. The goal of this paper is two-fold: (1) Present a thorough empirical evaluation of LTGA on a large set of problem instances of additively decomposable problems and (2) speed up the clustering algorithm used to build the linkage trees in LTGA by using a pairwise and a problem-specific metric. ReferenceMartin Pelikan, Mark Hauschild, and Dirk Thierens (2011). Pairwise and Problem-Specific Distance Metrics in the Linkage Tree Genetic Algorithm. Genetic and Evolutionary Computation Conference (GECCO-2011), ACM Press, pp. 1005-1012. Download the final version from ACM Digital Library Preprint also available as MEDAL Report No. 2011001 (click here for a postscript version). 2010 Genetic Algorithms Martin Pelikan Abstract Genetic algorithms are stochastic optimization methods inspired by natural evolution and genetics. Over the last few decades, genetic algorithms have been successfully applied to many problems of business, engineering, and science. Because of their operational simplicity and wide applicability, genetic algorithms are now playing an increasingly important role in computational optimization and operations research. This article provides an introduction to genetic algorithms as well as numerous pointers for obtaining additional information. ReferenceMartin Pelikan (2010). Genetic Algorithms. MEDAL Report No. 20010007, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Accepted by the Encyclopedia of Operations Research and Management Science (EORMS) (Wiley). Download PDF Download PS Performance of Network Crossover on NK Landscapes and Spin Glasses Mark W. Hauschild and Martin Pelikan Abstract This paper describes a network crossover operator based on knowledge gathered from either prior problem-specific knowledge or linkage learning methods such as estimation of distribution algorithms (EDAs). This operator can be used in a genetic algorithm (GA) to incorporate linkage in recombination. The performance of GA with network crossover is compared to that of GA with uniform crossover and the hierarchical Bayesian optimization algorithm (hBOA) on 2D Ising spin glasses, NK landscapes, and SK spin glasses. The results are analyzed and discussed. ReferenceMartin Pelikan (2010). Performance of Network Crossover on NK Landscapes and Spin Glasses. MEDAL Report No. 2001006, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Parallel Problem Solving from Nature (PPSN XI), Springer, pp. 462-471. Download PDF Loopy Substructural Local Search for the Bayesian Optimization Algorithm Claudio F. Lima, Martin Pelikan, Fernando G. Lobo, and David E. Goldberg Abstract This paper presents a local search method for the Bayesian optimization algorithm (BOA) based on the concepts of substructural neighborhoods and loopy belief propagation. The probabilistic model of BOA, which automatically identifies important problem substructures, is used to define the topology of the neighborhoods explored in local search. On the other hand, belief propagation in graphical models is employed to find the most suitable configuration of conflicting substructures. The results show that performing loopy substructural local search (SLS) in BOA can dramatically reduce the number of generations necessary to converge to optimal solutions and thus provides substantial speedups. ReferenceClaudio F. Lima, Martin Pelikan, Fernando G. Lobo, and David E. Goldberg (2010). Loopy Substructural Local Search for the Bayesian Optimization Algorithm. MEDAL Report No. 2010005, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Model Accuracy in the Bayesian Optimization Algorithm Claudio F. Lima, Fernando G. Lobo, Martin Pelikan, and David E. Goldberg Abstract Evolutionary algorithms (EAs) are particularly suited to solve problems for which there is not much information available. From this standpoint, estimation of distribution algorithms (EDAs), which guide the search by using probabilistic models of the population, have brought a new view to evolutionary computation. While solving a given problem with an EDA, the user has access to a set of models that reveal probabilistic dependencies between variables, an important source of information about the problem. However, as the complexity of the used models increases, the chance of overfitting and consequently reducing model interpretability, increases as well. This paper investigates the relationship between the probabilistic models learned by the Bayesian optimization algorithm (BOA) and the underlying problem structure. The purpose of the paper is threefold. First, model building in BOA is analyzed to understand how the problem structure is learned. Second, it is shown how the selection operator can lead to model overfitting in Bayesian EDAs. Third, the scoring metric that guides the search for an adequate model structure is modified to take into account the non-uniform distribution of the mating pool generated by tournament selection. Overall, this paper makes a contribution towards understanding and improving model accuracy in BOA, providing more interpretable models to assist efficiency enhancement techniques and human researchers. ReferenceClaudio F. Lima, Fernando G. Lobo, Martin Pelikan, and David E. Goldberg (2010). Model Accuracy in the Bayesian Optimization Algorithm. MEDAL Report No. 2010004, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published as Claudio F. Lima, Fernando G. Lobo, Martin Pelikan, and David E. Goldberg (2011). Model Accuracy in the Bayesian Optimization Algorithm. Soft Computing, 15(7), 1351-1371. Download PDF Network Crossover Performance on NK Landscapes and Deceptive Problems Mark W. Hauschild and Martin Pelikan Abstract By being able to build and sample models of promising solutions, Estimation of Distribution Algorithms (EDAs) gain many advantages over standard evolutionary algorithms. However, this comes at the cost of an expensive model-building phase. One way to try and eliminate this bottleneck is to incorporate prior knowledge into the model-building, but this can be a difficult task. Another approach is to modify the crossover operator in a standard genetic algorithm (GA) to exploit this prior information. This paper describes a method to build a network crossover operator that can be used in a GA to easily incorporate problem-specific knowledge such that it better respects linkages between bits. The performance of this operator in the simple genetic algorithm (GA) is then compared to other operators as well as the hierarchical Bayesian Optimization Algorithm (hBOA) on several different problem types, all with both elitism replacement and Restricted Tournament Replacement (RTR). The performance of all the algorithms and replacement mechanisms are then analyzed and the results are discussed. ReferenceMark Hauschild and Martin Pelikan (2010). Network Crossover Performance on NK Landscapes and Deceptive Problems. MEDAL Report No. 2010003, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2010), ACM Press, pp. 713-720. Download PDF Spurious Dependencies and EDA Scalability Elizabeth Radetic and Martin Pelikan Abstract Numerous studies have shown that advanced estimation of distribution algorithms (EDAs) often discover spurious (unnecessary) dependencies. Nonetheless, only little prior work exists that would study the effects of spurious dependencies on EDA performance. This paper examines the effects of spurious dependencies on the performance and scalability of EDAs with the main focus on EDAs with marginal product models and the onemax problem. A theoretical model is proposed to analyze the effects of spurious dependencies on the population sizing in EDAs and the theory is verified with experiments. The effects of spurious dependencies on the number of generations are studied empirically. ReferenceElizabeth Radetic and Martin Pelikan (2010). Spurious Dependencies and EDA Scalability. MEDAL Report No. 2010002, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2010), ACM Press, pp. 303-310. Download PDF Download PS NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms Martin Pelikan Abstract This paper presents an empirical study of NK landscapes with the main focus on the relationship between (1) problem parameters, (2) various measures of problem difficulty of fitness landscapes, and (3) performance of hybrid evolutionary algorithms. As the target class of problems, the study considers two types of NK landscapes: (1) Standard, unrestricted NK landscapes and (2) shuffled NK landscapes with nearest-neighbor interactions. As problem difficulty measures, the paper considers the fitness distance correlation, the correlation coefficient, the distance of local and global optima, and the escape rate. Experimental results are presented, analyzed and discussed. Promising avenues for future work are also outlined. ReferenceMartin Pelikan (2010). NK Landscapes, Problem Difficulty, and Hybrid Evolutionary Algorithms. MEDAL Report No. 2010001, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2010), ACM Press, pp. 665-672. Download PDF Download PS 2009 Evolucne Algoritmy (in Slovak) Martin Pelikan ReferenceMartin Pelikan (2009). Evolucne algoritmy (in Slovak). Umela Inteligencia a Kognitivna Veda I. V. Kvasnicka et al. (eds.), pp. 335-353. Download PDF Dependency Structure Matrix, Genetic Algorithms, and Effective Recombination Tian-Li Yu, David E. Goldberg, Kumara Sastry, Claudio F. Lima, and Martin Pelikan Abstract In many different fields, researchers are often confronted by problems arising from complex systems. Simple heuristics or even enumeration works quite well on small and easy problems; however, to efficiently solve large and difficult problems, proper decomposition is the key. In this paper, investigating and analyzing interactions between components of complex systems shed some light on problem decomposition. By recognizing three bare-bone interactions-modularity, hierarchy, and overlap-facetwise models are developed to dissect and inspect problem decomposition in the context of genetic algorithms. The proposed genetic algorithm design utilizes a matrix representation of an interaction graph to analyze and explicitly decompose the problem. The results from this paper should benefit research both technically and scientifically. Technically, this paper develops an automated dependency structure matrix clustering technique and utilizes it to design a model-building genetic algorithm which learns and delivers the problem structure. Scientifically, the explicit interaction model well describes the problem structure and helps researchers gain important insights through the explicitness of the procedure. ReferenceTian-Li Yu, David E. Goldberg, Kumara Sastry, Claudio F. Lima, and Martin Pelikan (in press). Dependency Structure Matrix, Genetic Algorithms, and Effective Recombination. Evolutionary Computation. Structure and Flexibility within Proteins as Identified through Small Angle X-ray Scattering Martin Pelikan, Greg L. Hura, and Michal Hammel Abstract Proteins often adopt flexible multiple domain conformations. These proteins are often not readily amenable to conventional structural analysis such as X-ray crystallography or NMR. Here we perform a use of small angle X-ray scattering (SAXS) measurements to filter candidate protein conformations for the purpose of protein structure prediction in its native state in solution. SAXS is used to validate the constructed atomic models and to determine the ensemble of most probable conformations. In our rigid body modeling strategy "BILBOMD", molecular dynamics (MD) simulations are used to explore conformational space. A common strategy is to perform the MD simulation on the domains connections at very high temperature, where the additional kinetic energy prevents the molecule from becoming trapped in a local minimum. The MD simulations provide an ensemble of molecular models from which a SAXS curve is calculated and compared to the experimental curve. A genetic algorithm is used to identify the minimal ensemble (Minimal Ensemble Search, MES) required to best fit the experimental data. We demonstrate the use of MES in several model and four experimental examples. ReferenceMartin Pelikan, Greg L. Hura, and Michal Hammel (in press). Structure and Flexibility within Proteins as Identified through Small Angle X-ray Scattering. General Physiology and Biophysics, 28 (2), pp. 174-189. Intelligent Bias of Network Structures in the Hierarchical BOA Mark Hauschild and Martin Pelikan Abstract One of the primary advantages of estimation of distribution algorithms (EDAs) over many other stochastic optimization techniques is that they supply us with a roadmap of how they solve a problem. This roadmap consists of a sequence of probabilistic models of candidate solutions of increasing quality. The first model in this sequence would typically encode the uniform distribution over all admissible solutions whereas the last model would encode a distribution that generates at least one global optimum with high probability. It has been argued that exploiting this knowledge should improve EDA performance when solving similar problems. This paper presents an approach to bias the building of Bayesian network models in the hierarchical Bayesian optimization algorithm (hBOA) using information gathered from models generated during previous hBOA runs on similar problems. The approach is evaluated on trap-5 and 2D spin glass problems. ReferenceMark Hauschild and Martin Pelikan (2009). Intelligent Bias of Network Structures in the Hierarchical BOA. MEDAL Report No. 2009005, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2009), ACM Press, pp. 413-420. Download PDF Analysis of Evolutionary Algorithms on the One-Dimensional Spin Glass with Power-Law Interactions Martin Pelikan and Helmut G. Katzgraber Abstract This paper provides an in-depth empirical analysis of several evolutionary algorithms on the one-dimensional spin glass model with power-law interactions. The considered spin glass model provides a mechanism for tuning the effective range of interactions, what makes the problem interesting as an algorithm benchmark. As algorithms, the paper considers the genetic algorithm (GA) with twopoint and uniform crossover, and the hierarchical Bayesian optimization algorithm (hBOA). hBOA is shown to outperform both variants of GA, whereas GA with uniform crossover is shown to perform worst. The differences between the compared algorithms become more significant as the problem size grows and as the range of interactions decreases. Unlike for GA with uniform crossover, for hBOA and GA with twopoint crossover, instances with short-range interactions are shown to be easier. The paper also points out interesting avenues for future research. ReferenceMartin Pelikan and Helmut G. Katzgraber (2009). Analysis of Evolutionary Algorithms on the One-Dimensional Spin Glass with Power-Law Interactions. MEDAL Report No. 2009004, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2009), ACM Press, pp. 843-850. Download PDF Effects of a Deterministic Hill climber on hBOA Elizabeth Radetic, Martin Pelikan, and David E. Goldberg Abstract Hybridization of global and local search algorithms is a well-established technique for enhancing the efficiency of search algorithms. Hybridizing estimation of distribution algorithms (EDAs) has been repeatedly shown to produce better performance than either the global or local search algorithm alone. The hierarchical Bayesian optimization algorithm (hBOA) is an advanced EDA which has previously been shown to benefit from hybridization with a local searcher. This paper examines the effects of combining hBOA with a deterministic hill climber (DHC). Experiments reveal that allowing DHC to find the local optima makes model building and decision making much easier for hBOA. This reduces the minimum population size required to find the global optimum, which substantially improves overall performance. ReferenceElizabeth Radetic, Martin Pelikan, and David E. Goldberg (2009). Effects of a Deterministic Hill climber on hBOA. MEDAL Report No. 2009003, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2009), ACM Press, pp. 437-444. Download PDF Performance of Evolutionary Algorithms on NK Landscapes with Nearest Neighbor Interactions and Tunable Overlap Martin Pelikan, Kumara Sastry, David E. Goldberg, Martin V. Butz, and Mark Hauschild Abstract This paper presents a class of NK landscapes with nearest-neighbor interactions and tunable overlap. The considered class of NK landscapes is solvable in polynomial time using dynamic programming; this allows us to generate a large number of random problem instances with known optima. Several variants of standard genetic algorithms and estimation of distribution algorithms are then applied to the generated problem instances. The results are analyzed and related to scalability theory for selectorecombinative genetic algorithms and estimation of distribution algorithms. ReferenceMartin Pelikan, Kumara Sastry, David E. Goldberg, Martin V. Butz, and Mark Hauschild (2009). Performance of Evolutionary Algorithms on NK Landscapes with Nearest Neighbor Interactions and Tunable Overlap. MEDAL Report No. 2009002, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2009), ACM Press, pp. 851-858. Download PDF Initial-Population Bias in the Univariate Estimation of Distribution Algorithm Martin Pelikan and Kumara Sastry Abstract This paper analyzes the effects of an initial-population bias on the performance of the univariate marginal distribution algorithm (UMDA). The analysis considers two test problems: (1) onemax and (2) noisy onemax. Theoretical models are provided and verified with experiments. Intuitively, biasing the initial population toward the global optimum should improve performance of UMDA, whereas biasing the initial population away from the global optimum should have the opposite effect. Both theoretical and experimental results confirm this intuition. Effects of mutation and sampling are also analyzed and the performance of UMDA is compared to that of the mutation-based hill climbing. While for deterministic onemax the hill climbing is shown to deal with the initial bias very well, for noisy onemax performance of the hill climbing is poor regardless of the bias. ReferenceMartin Pelikan and Kumara Sastry (2009). Initial-Population Bias in the Univariate Estimation of Distribution Algorithm. MEDAL Report No. 2009001, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2009), ACM Press, pp. 429-436. Download PDF 2008 Sporadic Model Building for Efficiency Enhancement of the Hierarchical BOA Martin Pelikan, Kumara Sastry, and David E. Goldberg Abstract Efficiency enhancement techniques---such as parallelization and hybridization---are among the most important ingredients of practical applications of genetic and evolutionary algorithms and that is why this research area represents an important niche of evolutionary computation. This paper describes and analyzes sporadic model building, which can be used to enhance the efficiency of the hierarchical Bayesian optimization algorithm (hBOA) and other estimation of distribution algorithms (EDAs) that use complex multivariate probabilistic models. With sporadic model building, the structure of the probabilistic model is updated once in every few iterations (generations), whereas in the remaining iterations, only model parameters (conditional and marginal probabilities) are updated. Since the time complexity of updating model parameters is much lower than the time complexity of learning the model structure, sporadic model building decreases the overall time complexity of model building. The paper shows that for boundedly difficult nearly decomposable and hierarchical optimization problems, sporadic model building leads to a significant model-building speedup, which decreases the asymptotic time complexity of model building in hBOA by a factor of O(n^0.26) to O(n^0.5), where n is the problem size. On the other hand, sporadic model building also increases the number of evaluations until convergence; nonetheless, if model building is the bottleneck, the evaluation slowdown is insignificant compared to the gains in the asymptotic complexity of model building. The paper also presents a dimensional model to provide a heuristic for scaling the structure-building period, which is the only parameter of the proposed sporadic model-building approach. The paper then tests the proposed method and the rule for setting the structure-building period on the problem of finding ground states of 2D and 3D Ising spin glasses. ReferenceMartin Pelikan, Kumara Sastry, and David E. Goldberg (2008). Sporadic Model Building for Efficiency Enhancement of the Hierarchical BOA. Genetic Programming and Evolvable Machines, 9(1), Springer, pp. 53-84. From Mating Pool Distributions to Model Overfitting Claudio F. Lima, Fernando G. Lobo, and Martin Pelikan Abstract This paper addresses selection as a source of overfitting in Bayesian estimation of distribution algorithms (EDAs). The purpose of the paper is twofold. First, it shows how the selection operator can lead to model overfitting in the Bayesian optimization algorithm (BOA). Second, the metric score that guides the search for an adequate model structure is modified to take into account the non-uniform distribution of the mating pool generated by tournament selection. ReferenceClaudio F. Lima, Fernando G. Lobo, and Martin Pelikan (2008). From Mating Pool Distributions to Model Overfitting. Genetic and Evolutionary Computation Conference (GECCO-2008), ACM Press, pp. 431-438. Enhancing Efficiency of Hierarchical BOA via Distance-Based Model Restrictions Mark Hauschild and Martin Pelikan Abstract This paper analyzes the effects of restricting probabilistic models in the hierarchical Bayesian optimization algorithm (hBOA) by defining a distance metric over variables and disallowing dependencies between variables at distances greater than a given threshold. We argue that by using prior problem-specific knowledge, it is often possible to develop a distance metric that closely corresponds to the strength of interactions between variables. This distance metric can then be used to speed up model building in hBOA. Three test problems are considered: 3D Ising spin glasses, random additively decomposable problems, and the minimum vertex cover. ReferenceMark Hauschild, Martin Pelikan (2008). Enhancing Efficiency of Hierarchical BOA via Distance-Based Model Restrictions. MEDAL Report No. 2008007, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Parallel Problem Solving from Nature (PPSN IX), Springer, pp. 417-427. Download PDF An Application of a Multivariate Estimation of Distribution Algorithm to Cancer Chemotherapy Alexander Brownlee, Martin Pelikan, John McCall, and Andrei Petrovski Abstract Chemotherapy treatment for cancer is a complex optimisation problem with a large number of interacting variables and constraints. A number of different probabilistic algorithms have been applied to it with varying success. In this paper we expand on this by applying two estimation of distribution algorithms to the problem. One is UMDA, which uses a univariate probabilistic model similar to previously applied EDAs. The other is hBOA, the first EDA using a multivariate probabilistic model to be applied to the chemotherapy problem. While instinct would lead us to predict that the more sophisticated algorithm would yield better performance on a complex problem like this, we show that it is outperformed by the algorithms using the simpler univariate model. We hypothesise that this is caused by the more sophisticated algorithm being impeded by the large number of interactions in the problem which are unnecessary for its solution. ReferenceAlexander Brownlee, Martin Pelikan, John McCall, and Andrei Petrovski (2008). An Application of a Multivariate Estimation of Distribution Algorithm to Cancer Chemotherapy. MEDAL Report No. 2008005, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Finding Ground States of Sherrington-Kirkpatrick Spin Glasses with Hierarchical BOA and Genetic Algorithms Martin Pelikan, Helmut G. Katzgraber, and Sigismund Kobe Abstract This study focuses on the problem of finding ground states of random instances of the Sherrington-Kirkpatrick (SK) spin-glass model with Gaussian couplings. While the ground states of SK spin-glass instances can be obtained with branch and bound, the computational complexity of branch and bound yields instances of not more than about 90 spins. We describe several approaches based on the hierarchical Bayesian optimization algorithm (hBOA) to reliably identifying ground states of SK instances intractable with branch and bound, and present a broad range of empirical results on such problem instances. We argue that the proposed methodology holds a big promise for reliably solving large SK spin-glass instances to optimality with practical time complexity. The proposed approaches to identifying global optima reliably can also be applied to other problems and they can be used with many other evolutionary algorithms. Performance of hBOA is compared to that of the genetic algorithm with two common crossover operators. ReferenceMartin Pelikan, Helmut G. Katzgraber, and Sigismund Kobe (2008). Finding Ground States of Sherrington-Kirkpatrick Spin Glasses with Hierarchical BOA and Genetic Algorithms. MEDAL Report No. 2008004, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2008), ACM Press, pp. 447-454. Download PDF Using Previous Models to Bias Structural Learning in the Hierarchical BOA Mark Hauschild, Martin Pelikan, Kumara Sastry, and David E. Goldberg Abstract Estimation of distribution algorithms (EDAs) are stochastic optimization techniques that explore the space of potential solutions by building and sampling explicit probabilistic models of promising candidate solutions. While the primary goal of applying EDAs is to discover the global optimum or at least its accurate approximation, besides this, any EDA provides us with a sequence of probabilistic models, which in most cases hold a great deal of information about the problem. Although using problem-specific knowledge has been shown to significantly improve performance of EDAs and other evolutionary algorithms, this readily available source of problem-specific information has been practically ignored by the EDA community. This paper takes the first step towards the use of probabilistic models obtained by EDAs to speed up the solution of similar problems in future. More specifically, we propose two approaches to biasing model building in the hierarchical Bayesian optimization algorithm (hBOA) based on knowledge automatically learned from previous hBOA runs on similar problems. We show that the proposed methods lead to substantial speedups and argue that the methods should work well in other applications that require solving a large number of problems with similar structure. ReferenceMark Hauschild, Martin Pelikan, Kumara Sastry, and David E. Goldberg (2008). Using Previous Models to Bias Structural Learning in the Hierarchical BOA. MEDAL Report No. 2008003, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2008), ACM Press, 415-422. Download PDF iBOA: The Incremental Bayesian Optimization Algorithm Martin Pelikan, Kumara Sastry, and David E. Goldberg Abstract This paper proposes the incremental Bayesian optimization algorithm (iBOA), which modifies standard BOA by removing the population of solutions and using incremental updates of the Bayesian network. iBOA is shown to be able to learn and exploit unrestricted Bayesian networks using incremental techniques for updating both the structure as well as the parameters of the probabilistic model. This represents an important step toward the design of competent incremental estimation of distribution algorithms that can solve difficult nearly decomposable problems scalably and reliably. ReferenceMartin Pelikan, Kumara Sastry, and David E. Goldberg (2008). iBOA: The Incremental Bayesian Optimization Algorithm. MEDAL Report No. 2008002, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2008), ACM Press, 455-462. Download PDF Analysis of Estimation of Distribution Algorithms and Genetic Algorithms on NK Landscapes Martin Pelikan Abstract This study analyzes performance of several genetic and evolutionary algorithms on randomly generated NK fitness landscapes with various values of n and k. A large number of NK problem instances are first generated for each $n$ and $k$, and the global optimum of each instance is obtained using the branch-and-bound algorithm. Next, the hierarchical Bayesian optimization algorithm (hBOA), the univariate marginal distribution algorithm (UMDA), and the simple genetic algorithm (GA) with uniform and two-point crossover operators are applied to all generated instances. Performance of all algorithms is then analyzed and compared, and the results are discussed. ReferenceMartin Pelikan (2008). Analysis of Estimation of Distribution Algorithms and Genetic Algorithms on NK Landscapes. MEDAL Report No. 2008001, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference (GECCO-2008), ACM Press, pp. 1033-1040. Download PDF 2007 Influence of Selection and Replacement Strategies on Linkage Learning in BOA Claudio F. Lima, Martin Pelikan, David E. Goldberg, Fernando G. Lobo, Kumara Sastry, and Mark Hauschild Abstract The Bayesian optimization algorithm (BOA) uses Bayesian networks to learn linkages between the decision variables of an optimization problem. This paper studies the influence of different selection and replacement methods on the accuracy of linkage learning in BOA. Results on concatenated m-k deceptive trap functions show that the model accuracy depends on a large extent on the choice of selection method and to a lesser extent on the replacement strategy used. Specifically, it is shown that linkage learning in BOA is more accurate with truncation selection than with tournament selection. The choice of replacement strategy is important when tournament selection is used, but it is not relevant when using truncation selection. On the other hand, if performance is our main concern, tournament selection and restricted tournament replacement should be preferred. These results aim to provide practitioners with useful information about the best way to tune BOA with respect to structural model accuracy and overall performance. ReferenceClaudio F. Lima, Martin Pelikan, David E. Goldberg, Fernando G. Lobo, Kumara Sastry, and Mark Hauschild (2007). Influence of Selection and Replacement Strategies on Linkage Learning in BOA. MEDAL Report No. 2007005, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in IEEE International Conference on Evolutionary Computation (CEC-2007), pp. 1083-1090. Download PDF Hybrid Evolutionary Algorithms on Minimum Vertex Cover for Random Graphs Rajiv Kalapala, Martin Pelikan and Alexander K. Hartmann Abstract This paper analyzes the hierarchical Bayesian optimization algorithm (hBOA) on minimum vertex cover for standard classes of random graphs and transformed SAT instances. The performance of hBOA is compared with that of the branch-and-bound problem solver (BB), the simple genetic algorithm (GA) and the parallel simulated annealing (PSA). The results indicate that BB is significantly outperformed by all the other tested methods, which is expected as BB is a complete search algorithm and minimum vertex cover is an NP-complete problem. The best performance is achieved by hBOA; nonetheless, the performance differences between hBOA and other evolutionary algorithms are relatively small, indicating that mutation-based search and recombination-based search lead to similar performance on the tested classes of minimum vertex cover problems. ReferenceRajiv Kalapala, Martin Pelikan and Alexander K. Hartmann (2007). Hybrid Evolutionary Algorithms on Minimum Vertex Cover for Random Graphs. MEDAL Report No. 2007004, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published as Hybrid Evolutionary Algorithms on Minimum Vertex Cover for Random Graphs, Genetic and Evolutionary Computation Conference (GECCO-2007), pp. 547-554. Download PDF Dependency Trees, Permutations, and Quadratic Assignment Problem Martin Pelikan, Shigeyoshi Tsutsui, and Rajiv Kalapala Abstract This paper describes and analyzes an estimation of distribution algorithm based on dependency tree models (dtEDA), which can explicitly encode probabilistic models for permutations. dtEDA is tested on deceptive ordering problems and a number of instances of the quadratic assignment problem. The performance of dtEDA is compared to that of the standard genetic algorithm with the partially matched crossover (PMX) and the linear order crossover (LOX). In the quadratic assignment problem, the robust tabu search is also included in the comparison. ReferenceMartin Pelikan, Shigeyoshi Tsutsui, and Rajiv Kalapala (2007). Dependency Trees, Permutations, and Quadratic Assignment Problem. MEDAL Report No. 2007003, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Obtaining Ground States of Ising Spin Glasses via Optimizing Bonds Instead of Spins Martin Pelikan and Alexander K. Hartmann Abstract Frustrated Ising spin glasses represent a rich class of challenging optimization problems that share many features with other complex, highly multimodal optimization and combinatorial problems. This paper shows that transforming candidate solutions to an alternative representation that is strongly tied to the energy function simplifies the exploration of the space of potential spin configurations and that it significantly improves performance of evolutionary algorithms with simple variation operators on Ising spin glasses. The proposed techniques are incorporated into the simple genetic algorithm, the univariate marginal distribution algorithm, and the hierarchical Bayesian optimization algorithm. ReferenceMartin Pelikan and Alexander K. Hartmann (2007). Obtaining Ground States of Ising Spin Glasses via Optimizing Bonds Instead of Spins. MEDAL Report No. 2007002, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Analyzing Probabilistic Models in Hierarchical BOA on Traps and Spin Glasses Mark Hauschild, Martin Pelikan, Claudio F. Lima, and Kumara Sastry Abstract The hierarchical Bayesian optimization algorithm (hBOA) can solve nearly decomposable and hierarchical problems of bounded difficulty in a robust and scalable manner by building and sampling probabilistic models of promising solutions. This paper analyzes probabilistic models in hBOA on two common test problems: concatenated traps and 2D Ising spin glasses with periodic boundary conditions. We argue that although Bayesian networks with local structures can encode complex probability distributions, analyzing these models in hBOA is relatively straightforward and the results of such analyses may provide practitioners with useful information about their problems. The results show that the probabilistic models in hBOA closely correspond to the structure of the underlying optimization problem, the models do not change significantly in subsequent iterations of BOA, and creating adequate probabilistic models by hand is not straightforward even with complete knowledge of the optimization problem. ReferenceMark Hauschild, Martin Pelikan, Claudio F. Lima, and Kumara Sastry (2007). Analyzing Probabilistic Models in Hierarchical BOA on Traps and Spin Glasses. MEDAL Report No. 2007001, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF 2006 Implementation of the Dependency-Tree Estimation of Distribution Algorithm in C++ Martin Pelikan Abstract This technical report describes how to download, compile and use the C++ source code of the dependency-tree estimation of distribution algorithm, which uses dependency trees to model promising solutions and sample the new ones. Candidate solutions are represented by fixed-length strings with a finite number of values in each string position (the code is not restricted to binary strings). The report also describes how to modify the code to solve new optimization problems. Additional documentation can be found in the provided software package, which includes a detailed documentation in PDF, HTML, and LaTeX formats. ReferenceMartin Pelikan (2006). Implementation of the Dependency-Tree Estimation of Distribution Algorithm in C++. MEDAL Report No. 2006010, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Node Histogram vs. Edge Histogram: A Comparison of PMBGAs in Permutation Domains Shigeyoshi Tsutsui, Martin Pelikan, and David E. Goldberg Abstract Previous papers have proposed an algorithm called the edge histogram sampling algorithm (EHBSA) that models the relative relation between two nodes (edge) of permutation strings of a population within the PMBGA framework for permutation domains. This paper proposes another histogram based model we call the node histogram sampling algorithm (NHBSA). The NHBSA models node frequencies at each absolute position in strings of a population. Sampling methods are similar to that of EHBSA. Performance of NHBSA is compared with that of EHBSA using two types of permutation problems: the FSSP and the quadratic assignment problem (QAP). The results showed that the NHBSA works better than the EHBSA on these problems. ReferenceShigeyoshi Tsutsui, Martin Pelikan, and David E. Goldberg (2006). Node Histogram vs. Edge Histogram: A Comparison of PMBGAs in Permutation Domains. MEDAL Report No. 2006009, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Cunning Ant System: An Extension of Edge Histogram Sampling Algorithms to ACO Shigeyoshi Tsutsui and Martin Pelikan Abstract In this paper, we propose a variant of an ACO algorithm called the cunning Ant System (cAS). In cAS, each ant generates a solution by borrowing a part of a solution which was generated in previous iterations, instead of generating the solution entirely from pheromone density. Thus we named it, cunning ant. This cunning action reduces premature stagnation and exhibits good performance in the search. The experimental results showed cAS worked very well on the TSP and it may be one of the most promising ACO algorithms. ReferenceShigeyoshi Tsutsui and Martin Pelikan (2006). Cunning Ant System: An Extension of Edge Histogram Sampling Algorithms to ACO. MEDAL Report No. 2006008, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Substructural Neighborhoods for Local Search in the Bayesian Optimization Algorithm Claudio F. Lima, Martin Pelikan, Kumara Sastry, Martin Butz, David E. Goldberg, and Fernando G. Lobo Abstract It has been shown that model building in the hierarchical Bayesian optimization algorithm (hBOA) can be efficiently parallelized by randomly generating an ancestral ordering of the nodes of the network prior to learning the network structure and allowing only dependencies consistent with the generated ordering. However, it has not been thoroughly shown that this approach to restricting probabilistic models does not affect scalability of hBOA on important classes of problems. This paper demonstrates that although the use of a random ancestral ordering restricts the structure of considered models to allow efficient parallelization of model building, its effects on hBOA performance and scalability are negligible. ReferenceClaudio F. Lima, Martin Pelikan, Kumara Sastry, Martin Butz, David E. Goldberg, and Fernando G. Lobo (2006). Substructural neighborhoods for local search in the Bayesian optimization algorithm. MEDAL Report No. 2006007, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Parallel Problem Solving from Nature (PPSN IX), pp. 232-241. Download PDF Population Sizing for Entropy-Based Model Building in Genetic Algorithms Tian-Li Yu, Kumara Sastry, David E. Goldberg, and Martin Pelikan Abstract This paper presents a population-sizing model for the entropy-based model building in genetic algorithms. Specifically, the population size required for building an accurate model is investigated. The effect of the selection pressure on population sizing is also incorporated. The proposed model indicates that the population size required for building an accurate model scales as O(mlog m), where m is the number of substructures and proportional to the problem size. Experiments are conducted to verify the derivations, and the results agree with the proposed model. ReferenceTian-Li Yu, Kumara Sastry, David E. Goldberg, and Martin Pelikan (2006). Population Sizing for Entropy-based Model Building in Genetic Algorithms. MEDAL Report No. 2006006, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference 2007 (GECCO-2007), pp. 601-608. Download PDF Order or Not: Does Parallelization of Model Building in hBOA Affect Its Scalability? Martin Pelikan and James D. Laury, Jr. Abstract It has been shown that model building in the hierarchical Bayesian optimization algorithm (hBOA) can be efficiently parallelized by randomly generating an ancestral ordering of the nodes of the network prior to learning the network structure and allowing only dependencies consistent with the generated ordering. However, it has not been thoroughly shown that this approach to restricting probabilistic models does not affect scalability of hBOA on important classes of problems. This paper demonstrates that although the use of a random ancestral ordering restricts the structure of considered models to allow efficient parallelization of model building, its effects on hBOA performance and scalability are negligible. ReferencePelikan, M., Laury, J. D., Jr. (2006). Order or Not: Does Parallelization of Model Building in hBOA Affect Its Scalability?. MEDAL Report No. 2006005, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published in Genetic and Evolutionary Computation Conference 2007 (GECCO-2007), pp. 555-561. Download PDF Analysis of Ideal Recombination on Random Decomposable Problems Kumara Sastry, Martin Pelikan, and David E. Goldberg Abstract This paper analyzes the behavior of a selectorecombinative genetic algorithm (GA) with an ideal crossover on a class of random additively decomposable problems (rADPs). Specifically, additively decomposable problems of order k whose subsolution fitnesses are sampled from the standard uniform distribution U[0,1] are analyzed. The scalability of the selectorecombinative GA is investigated for 10,000 rADP instances. The validity of facetwise models in bounding the population size, run duration, and the number of function evaluations required to successfully solve the problems is also verified. Finally, rADP instances that are easiest and most difficult are also investigated. ReferenceSastry, K., Pelikan, M., Goldberg, D. E. (2006). Analysis of ideal recombination on random decomposable problems. MEDAL Report No. 2006004, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Published as Empirical Analysis of Ideal Recombination on Random Decomposable Problems, Genetic and Evolutionary Computation Conference (GECCO-2007), pp. 1388-1395. Download PDF Generator and Interface for Random Decomposable Problems in C Martin Pelikan, Kumara Sastry, Martin V. Butz, and David E. Goldberg Abstract This technical report describes how to download, compile and use the C source code of the generator of random additively decomposable functions and the functions that enable the use of the generated instances in any standard optimization algorithm. ReferencePelikan, M., Sastry, K., Butz, M. V., Goldberg, D. E. (2006). Generator and interface for random decomposable problems in C. MEDAL Report No. 2006003, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Hierarchical BOA, Cluster Exact Approximation, and Ising Spin Glasses Martin Pelikan and Alexander K. Hartmann Abstract This paper analyzes the hierarchical Bayesian optimization algorithm (hBOA) on the problem of finding ground states of Ising spin glasses with $\pm J$ couplings in two and three dimensions. The performance of hBOA is compared to that of the simple genetic algorithm (GA) and the univariate marginal distribution algorithm (UMDA). The performance of all tested algorithms is improved by incorporating a deterministic hill climber based on single-bit flips and cluster exact approximation (CEA). The results show that hBOA significantly outperforms GA and UMDA on a broad spectrum of spin-glass instances and that CEA enables all tested algorithms to solve larger spin-glass instances. Using advanced hybrid methods created by combining competent genetic and evolutionary algorithms with advanced local searchers thus proves advantageous in this challenging class of problems. ReferencePelikan, M., Hartmann, A. K. (2006). Hierarchical BOA, cluster exact approximation, and Ising spin glasses. MEDAL Report No. 2006002, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Hierarchical BOA on Random Decomposable Problems Martin Pelikan, Kumara Sastry, Martin V. Butz, and David E. Goldberg Abstract The hierarchical Bayesian optimization algorithm (hBOA) can solve nearly decomposable and hierarchical problems scalably and reliably. This paper describes a class of random additively decomposable problems with and without interactions between the subproblems and tests hBOA on a large number of random instances of the proposed class of problems. The performance of hBOA is compared to that of the simple genetic algorithm with standard crossover and mutation operators, the univariate marginal distribution algorithm, and the hill climbing with bit-flip mutation. The results confirm that hBOA achieves quadratic or subquadratic performance on the proposed class of random decomposable problems and that it significantly outperforms all other methods included in the comparison. The results also show that low-order polynomial scalability is retained even when only a small percentage of hardest problems are considered and that hBOA is a robust algorithm because its performance does not change much across the entire spectrum of random problem instances of the same structure. The proposed class of decomposable problems can be used to test other optimization algorithms that address nearly decomposable problems. ReferencePelikan, M., Sastry, K., Butz, M. V., Goldberg, D. E. (2006). Hierarchical BOA on random decomposable problems. MEDAL Report No. 2006001, Missouri Estimation of Distribution Algorithms Laboratory, University of Missouri in St. Louis, MO. Download PDF Sporadic Model Building for Efficiency Enhancement of hBOA Martin Pelikan, Kumara Sastry, and David E. Goldberg Abstract This paper describes and analyzes sporadic model building, which can be used to enhance the efficiency of the hierarchical Bayesian optimization algorithm (hBOA) and other estimation of distribution algorithms (EDAs). With sporadic model building, the structure of the probabilistic model is updated once every few iterations (generations), whereas in the remaining iterations only model parameters (conditional and marginal probabilities) are updated. Since the time complexity of updating model parameters is much lower than the time complexity of learning the model structure, sporadic model building decreases the overall time complexity of model building. The paper shows that for boundedly difficult nearly decomposable and hierarchical optimization problems, sporadic model building leads to a significant model-building speedup that decreases the asymptotic time complexity of model building in hBOA by a factor of O(n^0.26) to O(n^0.65), where n is the problem size. On the other hand, sporadic model building also increases the number of evaluations until convergence; nonetheless, for decomposable problems, the evaluation slowdown is insignificant compared to the gains in the asymptotic complexity of model building. ReferencePelikan, M., Sastry, K., Goldberg, D. E. (2006). Sporadic model building for efficiency enhancement of hBOA. Genetic and Evolutionary Computation Conference (GECCO-2006), 405-412. Also IlliGAL Report No. 2005026. Download PDF Download PS 2005 Automated Global Structure Extraction for Effective Local Building Block Processing in XCS Martin V. Butz, Martin Pelikan, Xavier Llora, and David E. Goldberg Abstract Learning Classifier Systems (LCSs), such as XCS and other accuracy-based classifier systems, evolve distributed problem solutions represented by a population of rules. During evolution, features are specialized, propagated, and recombined to provide increasingly accurate subsolutions. Recently, it was shown that, like in conventional genetic algorithms (GAs), some problems require the processing of subsets of features as opposed to individual features to find the problem solution efficiently. In such problems, standard variation operators of genetic and evolutionary algorithms used in LCSs suffer from potential disruption of groups of interacting features, resulting in poor performance. This paper introduces two competent crossover operators to XCS by incorporating techniques derived from competent GAs: the extended compact GA (ECGA) and the Bayesian optimization algorithm (BOA). Instead of simple crossover operators such as uniform crossover or one-point crossover, here a probabilistic model of the global population is built and sampled to generate offspring classifiers locally. The distinction between the global and local problem structure is an additional challenge since the local problem structure may differ from the global structure. Two offspring generation methods are introduced and evaluated. The results show that it is possible to achieve performance similar to that of informed crossover operators, which are specifically designed to provide ideal exploration on tested problems. Thus, by detecting dependency structures online, we create the first competent LCSs, XCS/ECGA and XCS/BOA, that are able to identify and propagate lower-level dependency structures effectively without any information about these structures given in advance. ReferenceButz, M.V., Pelikan, M., Llora, X., Goldberg, D.E. (2005). Automated global structure extraction for effective local building block processing in XCS. IlliGAL Report No. 2005011, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, IL. Download PDF Download PS Extracted Global Structure Makes Local Building Block Processing Effective in XCS Martin V. Butz, Martin Pelikan, Xavier Llora, and David E. Goldberg Abstract Learning Classifier Systems (LCSs), such as the accuracy-based XCS system, evolve distributed problem solutions represented by a population of rules. Recently, it was shown that decomposable problems may require effective processing of feature subsets as opposed to individual features, which cannot be generally assured with standard crossover operators. Although a number of competent crossover operators capable of effective identification and processing of arbitrary subsets of variables or string positions were proposed for genetic and evolutionary algorithms, none of these operators were incorporated into LCSs. This paper introduces two competent crossover operators to XCS by incorporating techniques from competent genetic algorithms (GAs): the extended compact GA (ECGA) and the Bayesian optimization algorithm (BOA). Instead of applying standard crossover operators, here a probabilistic model of the global population is built and sampled to generate offspring classifiers locally. Two offspring generation methods are introduced and evaluated. Results indicate that the performance of the proposed learning classifier systems XCS/ECGA and XCS/BOA is similar to that of XCS with informed crossover operators that is given all information about problem structure on input and exploits this knowledge using problem-specific crossover operators. ReferenceButz, M.V., Pelikan, M., Llora, X., Goldberg, D.E. (2005). Extracted global structure makes local building block processing effective in XCS. Genetic and Evolutionary Computation Conference (GECCO-2005), pp. 655-662. Also IlliGAL Report No. 2005010. Download PDF Download PS Multiobjective hBOA, Clustering, and Scalability Martin Pelikan, Kumara Sastry, and David E. Goldberg Abstract This paper describes a scalable algorithm for solving multiobjective decomposable problems by combining the hierarchical Bayesian optimization algorithm (hBOA) with the nondominated sorting genetic algorithm (NSGA-II) and clustering in the objective space. It is first argued that for good scalability, clustering or some other form of niching in the objective space is necessary and the size of each niche should be approximately equal. Multiobjective hBOA (mohBOA) is then described that combines hBOA, NSGA-II and clustering in the objective space. The algorithm mohBOA differs from the multiobjective variants of BOA and hBOA proposed in the past by including clustering in the objective space and allocating an approximately equally sizedportion of the population to each cluster. The algorithm mohBOA is shown to scale up well on a number of problems on which standard multiobjective evolutionary algorithms perform poorly. ReferencePelikan, M., Sastry, K., Goldberg, D.E. (2005). Multiobjective hBOA, clustering, and scalability. Genetic and Evolutionary Computation Conference (GECCO-2005), pp. 663-670. Also IlliGAL Report No. 2005005. Download PDF Download PS Decomposable Problems, Niching, and Scalability of Multiobjective Estimation of Distribution Algorithms Kumara Sastry, Martin Pelikan, and David E. Goldberg Abstract The paper analyzes the scalability of multiobjective estimation of distribution algorithms (MOEDAs) on a class of boundedly-difficult additively-separable multiobjective optimization problems. The paper illustrates that even if the linkage is correctly identified, massive multimodality of the search problems can easily overwhelm the nicher and lead to exponential scale-up. Facetwise models are subsequently used to propose a growth rate of the number of differing substructures between the two objectives to avoid the niching method from being overwhelmed and lead to polynomial scalability of MOEDAs. ReferenceSastry, K., Pelikan, M., Goldberg, D.E. (2005). Decomposable problems, niching, and scalability of multiobjective estimation of distribution algorithms, IlliGAL Report No. 2005004. Published as Limits of scalability of multiobjective estimation of distribution algorithms, IEEE International Conference on Evolutionary Computation (CEC-2005), pp. 2217-2224. Download PDF Download PS Scalability of Genetic Programming and Probabilistic Incremental Program Evolution Radovan Ondas, Martin Pelikan, and Kumara Sastry Abstract This paper discusses scalability of standard genetic programming (GP) and the probabilistic incremental program evolution (PIPE). To investigate the need for both effective mixing and linkage learning, two test problems are considered: ORDER problem, which is rather easy for any recombination-based GP, and TRAP or the deceptive trap problem, which requires the algorithm to learn interactions among subsets of terminals. The scalability results show that both GP and PIPE scale up polynomially with problem size on the simple ORDER problem, but they both scale up exponentially on the deceptive problem. This indicates that while standard recombination is sufficient when no interactions need to be considered, for some problems linkage learning is necessary. These results are in agreement with the lessons learned in the domain of binary-string genetic algorithms (GAs). Furthermore, the paper investigates the effects of introducing unnecessary and irrelevant primitives on the performance of GP and PIPE. ReferenceOndas, R., Pelikan, M., Sastry, K. (2005). Scalability of genetic programming and probabilistic incremental program evolution, arXiv:cs/0502029. Published as Genetic Programming, Probabilistic Incremental Program Evolution, and Scalability, WSC10: 10th Online World Conference on Soft Computing in Industrial Applications, pp. 363-372. Download PDF Download PS Hierarchical Bayesian Optimization Algorithm: Toward a New Generation of Evolutionary Algorithms Martin Pelikan Abstract This book provides a framework for the design of competent optimization techniques by combining advanced evolutionary algorithms with state-of-the-art machine learning techniques. The primary focus of the book is on two algorithms that replace traditional variation operators of evolutionary algorithms, by learning and sampling Bayesian networks: the Bayesian optimization algorithm (BOA) and the hierarchical BOA (hBOA). They provide a scalable solution to a broad class of problems. The book provides an overview of evolutionary algorithms that use probabilistic models to guide their search, motivates and describes BOA and hBOA in a way accessible to a wide audience, and presents numerous results confirming that they are revolutionary approaches to black-box optimization. ReferencePelikan, M. (2005). Hierarchical Bayesian optimization algorithm: Toward a new generation of evolutionary algorithms. Springer. The book can be ordered from Springer here. Database Independent Detection of Isotopically Labeled MS/MS Spectrum Peptide Pairs Frank Potthast, Jiri Ocenasek, Dorothea Rutishauser, Martin Pelikan, and Ralph Schlapbach Abstract Mass spectrometry data generated in differential profiling of complex protein samples are classically exploited using database searches. In addition, quantitative profiling is performed by various methods, one of them using isotopically coded affinity tags, where one typically uses a light and a heavy tag. Here, we present a new algorithm, ICATcher, which detects pairs of light/heavy peptide MS/MS spectra independent of sequence databases. We also give a framework on how to organize spectrum pairs into links, clusters, and hyperclusters. The method can be used for de novo sequencing and detection of posttranslational modifications. ICATcher is distributed as open source software. ReferencePotthast, F., Ocenasek, J., Rutishauser, D., Pelikan, M., Schlapbach, R. (2005). Database Independent Detection of Isotopically Labeled MS/MS Spectrum Peptide Pairs. Journal of Chromatography B, 817 (2), pp. 225-230. 2004 Efficiency Enhancement of Probabilistic Model Building Genetic Algorithms Kumara Sastry, David E. Goldberg, and Martin Pelikan Abstract This paper presents two different efficiency-enhancement techniques for probabilistic model building genetic algorithms. The first technique proposes the use of a mutation operator which performs local search in the sub-solution neighborhood identified through the probabilistic model. The second technique proposes building and using an internal probabilistic model of the fitness along with the probabilistic model of variable interactions. The fitness values of some offspring are estimated using the probabilistic model, thereby avoiding computationally expensive function evaluations. The scalability of the aforementioned techniques are analyzed using facetwise models for convergence time and population sizing. The speed-up obtained by each of the methods is predicted and verified with empirical results. The results show that for additively separable problems the competent mutation operator requires O(k 0.5 logm)--where k is the building-block size, and m is the number of building blocks--less function evaluations than its selectorecombinative counterpart. The results also show that the use of an internal probabilistic fitness model reduces the required number of function evaluations to as low as 1-10% and yields a speed-up of 2--50. ReferenceSastry, K., Goldberg, D.E., Pelikan, M. (2004). Efficiency Enhancement of of Probabilistic Model Building Genetic Algorithms. IlliGAL Report No. 2004020, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, IL. Download PDF Download PS Efficiency Enhancement of Probabilistic Model Building Genetic Algorithms Kumara Sastry, David E. Goldberg, and Martin Pelikan Abstract This paper presents two different efficiency-enhancement techniques for probabilistic model building genetic algorithms. The first technique proposes the use of a mutation operator which performs local search in the sub-solution neighborhood identified through the probabilistic model. The second technique proposes building and using an internal probabilistic model of the fitness along with the probabilistic model of variable interactions. The fitness values of some offspring are estimated using the probabilistic model, thereby avoiding computationally expensive function evaluations. The scalability of the aforementioned techniques are analyzed using facetwise models for convergence time and population sizing. The speed-up obtained by each of the methods is predicted and verified with empirical results. The results show that for additively separable problems the competent mutation operator requires O(k 0.5 logm)--where k is the building-block size, and m is the number of building blocks--less function evaluations than its selectorecombinative counterpart. The results also show that the use of an internal probabilistic fitness model reduces the required number of function evaluations to as low as 1-10% and yields a speed-up of 2--50. ReferenceSastry, K., Goldberg, D.E., Pelikan, M. (2004). Efficiency Enhancement of of Probabilistic Model Building Genetic Algorithms. IlliGAL Report No. 2004020, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, IL. Download PDF Download PS Parallel Mixed Bayesian Optimization Algorithm: A Scaleup Analysis Jiri Ocenasek and Martin Pelikan Abstract Estimation of Distribution Algorithms have been proposed as a new paradigm for evolutionary optimization. This paper focuses on the parallelization of Estimation of Distribution Algorithms. More specifically, the paper discusses how to predict performance of parallel Mixed Bayesian Optimization Algorithm (MBOA) that is based on parallel construction of Bayesian networks with decision trees. We determine the time complexity of parallel Mixed Bayesian Optimization Algorithm and compare this complexity with experimental results obtained by solving the spin glass optimization problem. The empirical results fit well the theoretical time complexity, so the scalability and efficiency of parallel Mixed Bayesian Optimization Algorithm for unknown instances of spin glass benchmarks can be predicted. Furthermore, we derive the guidelines that can be used to design effective parallel Estimation of Distribution Algorithms with the speedup proportional to the number of variables in the problem. ReferencePelikan, M., Ocenasek, J. (2004). Parallel mixed Bayesian optimization algorithm: A scaleup analysis. Workshop Proceedings of the Genetic and Evolutionary Computation Conference 2004 (GECCO-2004). Download PDF Download PS Efficiency Enhancement of Genetic Algorithms via Building-Block-Wise Fitness Estimation Kumara Sastry, Martin Pelikan, and David E. Goldberg Abstract This paper studies fitness inheritance as an efficiency enhancement technique for a class of competent genetic algorithms called estimation distribution algorithms. Probabilistic models of important sub-solutions are developed to estimate the fitness of a proportion of individuals in the population, thereby avoiding computationally expensive function evaluations. The effect of fitness inheritance on the convergence time and population sizing are modeled and the speed-up obtained through inheritance is predicted. The results show that a fitness-inheritance mechanism which utilizes information on building-block fitnesses provides significant efficiency enhancement. For additively separable problems, fitness inheritance reduces the number of function evaluations to about half and yields a speed-up of about 1.75--2.25. ReferenceSastry, K., Pelikan, M., Goldberg, D.E. (2004). Efficiency Enhancement of Genetic Algorithms via Building-Block-Wise Fitness Estimation. IlliGAL Report No. 2004010, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, IL. Published in IEEE International Conference on Evolutionary Computation (CEC-2004), IEEE Press, pp. 720-727. Download PDF Download PS Computational Complexity and Simulation of Rare Events of Ising Spin Glasses Martin Pelikan, Jiri Ocenasek, Simon Trebst, Matthias Troyer, and Fabien Alet Abstract We discuss the computational complexity of random 2D Ising spin glasses, which represent an interesting class of constraint satisfaction problems for black box optimization. Two extremal cases are considered: (1) the +/- J spin glass, and (2) the Gaussian spin glass. We also study a smooth transition between these two extremal cases. The computational complexity of all studied spin glass systems is found to be dominated by rare events of extremely hard spin glass samples. We show that complexity of all studied spin glass systems is closely related to Frechet extremal value distribution. In a hybrid algorithm that combines the hierarchical Bayesian optimization algorithm (hBOA) with a deterministic bit-flip hill climber, the number of steps performed by both the global searcher (hBOA) and the local searcher follow Frechet distributions. Nonetheless, unlike in methods based purely on local search, the parameters of these distributions confirm good scalability of hBOA with local search. We further argue that standard performance measures for optimization algorithms---such as the average number of evaluations until convergence---can be misleading. Finally, our results indicate that for highly multimodal constraint satisfaction problems, such as Ising spin glasses, recombination-based search can provide qualitatively better results than mutation-based search. ReferencePelikan, M., Ocenasek, J., Trebst, S., Troyer, M., Alet, F. (2004). Computational Complexity and Simulation of Rare Events of Ising Spin Glasses.Genetic and Evolutionary Computation Conference 2004 (GECCO-2004), Springer-Verlag, pp. 36-47. Download PDF Download PS Parameter-less Hierarchical BOA Martin Pelikan and Tz-Kai Lin Abstract The parameter-less hierarchical Bayesian optimization algorithm (hBOA) enables the use of hBOA without the need for tuning parameters for solving each problem instance. There are three crucial parameters in hBOA: (1) the selection pressure, (2) the window size for restricted tournaments, and (3) the population size. Although both the selection pressure and the window size influence hBOA performance, performance should remain low-order polynomial with standard choices of these two parameters. However, there is no standard population size that would work for all problems of interest and the population size must thus be eliminated in a different way. To eliminate the population size, the parameter-less hBOA adopts the population-sizing technique of the parameter-less genetic algorithm. Based on the existing theory, the parameter-less hBOA should be able to solve nearly decomposable and hierarchical problems in quadratic or subquadratic number of function evaluations without the need for setting any parameters whatsoever. A number of experiments are presented to verify scalability of the parameter-less hBOA. ReferencePelikan, M., Tz-Kai Lin (2004). Parameter-less hierarchical BOA. Genetic and Evolutionary Computation Conference 2004 (GECCO-2004), Springer-Verlag, pp. 24-35. Download PDF Download PS Fitness inheritance in the Bayesian optimization algorithm Martin Pelikan and Kumara Sastry Abstract This paper describes how fitness inheritance can be used to estimate fitness for a proportion of newly sampled candidate solutions in the Bayesian optimization algorithm (BOA). The goal of estimating fitness for some candidate solutions is to reduce the number of fitness evaluations for problems where fitness evaluation is expensive. Bayesian networks used in BOA to model promising solutions and generate the new ones are extended to allow not only for modeling and sampling candidate solutions, but also for estimating their fitness. The results indicate that fitness inheritance is a promising concept in BOA, because population-sizing requirements for building appropriate models of promising solutions lead to good fitness estimates even if only a small proportion of candidate solutions is evaluated using the actual fitness function. This can lead to a reduction of the number of actual fitness evaluations by a factor of 30 or more. ReferenceMartin Pelikan and Kumara Sastry (2004). Fitness inheritance in the Bayesian optimization algorithm. Genetic and Evolutionary Computation Conference 2004 (GECCO-2004), Springer-Verlag, pp. 48-59. Also IlliGAL Report No. 2004009. Download PDF Download PS 2003 Using edge histogram models to solve permutation problems with probabilistic model-building genetic algorithms Shigeyoshi Tsutsui, Martin Pelikan, and David E. Goldberg Abstract Recently, there has been a growing interest in probabilistic model-building genetic algorithms (PMBGAs), which replace traditional variation operators of genetic and evolutionary algorithms by building and sampling a probabilistic model of promising solutions. In this paper we propose a PMBGA that uses edge histogram based sampling algorithms (EHBSAs) to solve problems with candidate solutions represented by permutations. Two sampling algorithmsthe sampling without template (EHBSA/WO) and the sampling with template (EHBSA/WT)are presented. The proposed algorithms are tested on several instances of the traveling salesman problem (TSP). The results show that EHBSA/WT works fairly well even with a small population size on all tested problem instances and that it outperforms popular two-parent recombination operators for permutations and other PMBGAs for permutation problems. Combining EHBSA with a simple local heuristic for solving TSP called 2-OPT improves the performance of the algorithm, enabling efficient solution to problems of hundreds of cities. Nonetheless, unlike most other TSP solvers, EHBSA is not limited to solving TSP instances, but it can be applied to any problem defined on permutations. ReferenceShigeyoshi Tsutsui, Martin Pelikan, & David E. Goldberg (2003). Using edge histogram models to solve permutation problems with probabilistic model-building genetic algorithms. IlliGAL Report No. 2003022. Download PDF Download PS Robust and Scalable Black-Box Optimization, Hierarchy, and Ising Spin Glasses Martin Pelikan, David E. Goldberg, Jiri Ocenasek, and Simon Trebst Abstract One of the most important challenges in computational optimization is the design of advanced black-box optimization techniques that would enable automated, robust, and scalable solution to challenging optimization problems. This paper describes an advanced black-box optimizer---the hierarchical Bayesian optimization algorithm (hBOA)---that combines techniques of genetic and evolutionary computation, machine learning, and statistics to create a widely applicable tool for solving real-world optimization problems. The paper motivates hBOA, describes its basic procedure, and provides an in-depth empirical analysis of hBOA on the class of random 2D and 3D Ising spin glass problems. The results on Ising spin glasses indicate that even without much problem-specific knowledge, hBOA can provide competitive or better results than techniques specialized in solving the particular problem or class of problems. Furthermore, hBOA can solve a large class of nearly decomposable and hierarchical problems for which there exists no other scalable solution. ReferenceMartin Pelikan, David E. Goldberg, Jiri Ocenasek, & Simon Trebst (2003). Robust and scalable black-box optimization, hierarchy, and Ising spin glasses. IlliGAL Report No. 2003019. Download PDF Download PS Design of Multithreaded Estimation of Distribution Algorithms Jiri Ocenasek, Josef Schwarz, and Martin Pelikan Abstract Estimation of Distribution Algorithms (EDAs) use a probabilistic model of promising solutions found so far to obtain new candidate solutions of an optimization problem. This paper focuses on the design of parallel EDAs. More specifically, the paper describes a method for parallel construction of Bayesian networks with local structures in form of decision trees in the Mixed Bayesian Optimization Algorithm. The proposed Multithreaded Mixed Bayesian Optimization Algorithm (MMBOA) is intended for implementation on a cluster of workstations that communicate by Message Passing Interface (MPI). Communication latencies between workstations are eliminated by multithreaded processing, so in each workstation the high-priority model-building thread, which is communication demanding, can be overlapped by low-priority model sampling thread when necessary. High performance of MMBOA is verified via simulation in TRANSIM tool. ReferenceJiri Ocenasek, Josef Schwarz, & Martin Pelikan (2003). Design of multithreaded estimation of distribution algorithms. Genetic and Evolutionary Computation Conference 2003 (GECCO-2003), Springer, pp. 1247-1258. Download PDF Download PS Hierarchical BOA Solves Ising Spin Glasses and MAXSAT Martin Pelikan and David E. Goldberg Abstract Theoretical and empirical evidence exists that the hierarchical Bayesian optimization algorithm (hBOA) can solve challenging hierarchical problems and anything easier. This paper applies hBOA to two important classes of real-world problems: Ising spin-glass systems and maximum satisfiability (MAXSAT). The paper shows how easy it is to apply hBOA to real-world optimization problems---in most cases hBOA can be applied without any prior problem analysis, it can acquire enough problem-specific knowledge automatically. The results indicate that hBOA is capable of solving enormously difficult problems that cannot be solved by other optimizers and still provide competitive or better performance than problem-specific approaches on other problems. The results thus confirm that hBOA is a practical, robust, and scalable technique for solving challenging real-world problems. ReferenceMartin Pelikan & David E. Goldberg (2003). Hierarchical BOA solves Ising spin glasses and MAXSAT. Genetic and Evolutionary Computation Conference 2003 (GECCO-2003), Springer, pp. 1271-1282. Also IlliGAL Report No. 2003001. Download PDF Download PS 2002 Solving Sequence Problems by Building and Sampling Edge Histograms Shigeyoshi Tsutsui, David E. Goldberg, and Martin Pelikan Abstract Recently, there has been a growing interest in developing evolutionary algorithms based on probabilistic modeling. In this scheme, the offspring population is generated according to the estimated probability density model of the parent instead of using recombination and mutation operators. In this paper, we have proposed probabilistic model-building genetic algorithms (PMBGAs) in permutation representation domain using edge histogram based sampling algorithms (EHBSAs). Two types of sampling algorithms, without template (EHBSA/WO) and with template (EHBSA/WT), are presented. The results were tested in the TSP and showed EHBSA/WT worked fairly well with a small population size in the test problems used. It also worked better than well-known traditional two-parent recombination operators. ReferenceShigeyoshi Tsutsui, David E. Goldberg, & Martin Pelikan (2002). Solving sequence problems by building and sampling edge histograms. IlliGAL Report No. 2002024, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Download PDF Download PS Bayesian Optimization Algorithm: From Single Level to Hierarchy Martin Pelikan Abstract There are four primary goals of this dissertation. First, design a competent optimization algorithm capable of learning and exploiting appropriate problem decomposition by sampling and evaluating candidate solutions. Second, extend the proposed algorithm to enable the use of hierarchical decomposition as opposed to decomposition on only a single level. Third, design a class of difficult hierarchical problems that can be used to test the algorithms that attempt to exploit hierarchical decomposition. Fourth, test the developed algorithms on the designed class of problems and several real-world applications. The dissertation proposes the Bayesian optimization algorithm (BOA), which uses Bayesian networks to model the promising solutions found so far and sample new candidate solutions. BOA is theoretically and empirically shown to be capable of both learning a proper decomposition of the problem and exploiting the learned decomposition to ensure robust and scalable search for the optimum across a wide range of problems. The dissertation then identifies important features that must be incorporated into the basic BOA to solve problems that are not decomposable on a single level, but that can still be solved by decomposition over multiple levels of difficulty. Hierarchical BOA extends BOA by incorporating those features for robust and scalable optimization of hierarchically decomposable problems. A class of problems called hierarchical traps is then proposed to test the ability of optimizers to learn and exploit hierarchical decomposition. Hierarchical BOA passes the test and is shown to solve hierarchical traps and other hierarchical problems in a scalable manner. Finally, the dissertation applies hierarchical BOA to two important classes of problems of statistical physics and artificial intelligence---Ising spin-glass systems and maximum satisfiability. Experiments show that even without requiring any prior problem-specific knowledge about the structure of the problem at hand or its properties, hierarchical BOA is capable of achieving comparable or better performance than other state-of-the-art methods specializing in solving the examined classes of problems. ReferenceMartin Pelikan (2002). Bayesian optimization algorithm: From single level to hierarchy. Ph.D. thesis, University of Illinois at Urbana-Champaign, Urbana, IL. Also IlliGAL Report No. 2002023. Download PDF Multi-objective Bayesian Optimization Algorithm Nazan Khan, David E. Goldberg, Martin Pelikan Abstract This paper proposes a competent multi-objective genetic algorithm called the multi-objective Bayesian optimization algorithm (mBOA). mBOA incorporates the selection method of the non-dominated sorting genetic algorithm-II (NSGA-II) into the Bayesian optimization algorithm (BOA). The proposed algorithm has been tested on an array of test functions which incorporate deception and loose-linkage and the results are compared to those of NSGA-II. Results indicate that mBOA outperforms NSGA-II on large loosely linked deceptive problems. ReferenceNazan Khan, David E. Goldberg, & Martin Pelikan (2002). Multi-objective Bayesian Optimization Algorithm. IlliGAL Report No. 2002009, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Download PDF Download PS 2001 Evolutionary Algorithms + Graphical Models = Scalable Black-Box Optimization Martin Pelikan, Kumara Sastry, David E. Goldberg Abstract To solve a wide range of different problems, the research in black-box optimization faces several important challenges. One of the most important challenges is the design of methods capable of automatically discovering the regularities in the problem and utilizing these to ensure efficient and reliable search. This paper discusses the Bayesian optimization algorithm that uses Bayesian networks to model promising solutions and guide exploration of the search space. Using Bayesian networks in combination with population-based genetic and evolutionary search allows the algorithm to discover and utilize regularities in the form of a problem decomposition. The paper analyzes the applicability of the methods for learning Bayesian networks in context of genetic and evolutionary search and concludes that the combination of the two approaches yields a robust, efficient, and accurate search. ReferenceMartin Pelikan, Kumara Sastry, & David E. Goldberg (2001). Evolutionary Algorithms + Graphical Models = Scalable Black-Box Optimization. IlliGAL Report No. 2001029, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published as Scalability of the Bayesian optimization algorithm, 2002, International Journal of Approximate Reasoning, 31 (3), pp. 221-258. Download PDF Download PS Convergence-Time Models for the Simple Genetic Algorithm with Finite Population Alessio Ceroni, Martin Pelikan, David E. Goldberg Abstract This paper presents various convergence models for the simple genetic algorithm (SGA) in the case of finite population. A piecewise convergence-time model is derived using ideas from two existing convergence models. The factors affecting the convergence with small population are explained and used to construct a correct model of the variance in fitness for the OneMax problem. This knowledge is included in the existing asymptotic model to derive the embedded convergence-time model. The model is extended to a different environment and modified to include an unexpected behavior that makes the SGA converge solely by genetic drift. ReferenceAlessio Ceroni, Martin Pelikan, & David E. Goldberg (2001). Convergence-time models for the simple genetic algorithm with finite population. IlliGAL Report No. 2001028, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Download PDF Download PS Combining the Strengths of the Bayesian Optimization Algorithm and Adaptive Evolution Strategies Martin Pelikan, David E. Goldberg, Shigeyoshi Tsutsui Abstract A method which combines competent genetic algorithms working in discrete domains with adaptive evolution strategies working in continuous domains is described. Discretization is used to transform solution between the two domains. Experiments with Bayesian optimization algorithm, sigma-self-adaptive mutation, and three different discretization methods are presented. The results suggest that the algorithm scales up well on all tested problems. ReferenceMartin Pelikan, David E. Goldberg, & Shigeyoshi Tsutsui (2001). Combining the Strengths of the Bayesian Optimization Algorithm and Adaptive Evolution Strategies. IlliGAL Report No. 2001023, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Extended version published as Getting the best of both worlds: Discrete and continuous genetic and evolutionary algorithms in concert, Information Sciences, 156(3-4), pp. 147-171. Download PDF Download PS Evolutionary Algorithm Using Marginal Histogram Models in Continuous Domain Shigeyoshi Tsutsui, Martin Pelikan, and David E. Goldberg ReferenceShigeyoshi Tsutsui, Martin Pelikan, & David E. Goldberg (2001). Evolutionary Algorithm Using Marginal Histogram Models in Continuous Domain. IlliGAL Report No. 2001019, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Genetic and Evolutionary Computation Conference 2001 (GECCO-2001), pp. 230-233. Download PDF Download PS Don't Evaluate, Inherit Kumara Sastry, David E. Goldberg, and Martin Pelikan Abstract This paper studies fitness inheritance as an efficiency enhancement technique for genetic and evolutionary algorithms. Convergence and population sizing models are derived and compared with experimental results. These models are optimized for greatest speed-up and the optimal inheritance proportion to obtain such a speed-up is derived. Results also show that when the inheritance effects are considered in the population sizing model, the number of function evaluations are reduced by 20% with the use of fitness inheritance. Results indicate that for a fixed population size, the number of function evaluations can be reduced by 70\% using a simple fitness inheritance technique. ReferenceKumara Sastry, David E. Goldberg, & Martin Pelikan (2001). Don't Evaluate, Inherit. IlliGAL Report No. 2001013, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Genetic and Evolutionary Computation Conference 2001 (GECCO-2001), pp. 551-558. Download PDF Download PS Analyzing the Evolutionary Pressures in XCS Martin V. Butz and Martin Pelikan Abstract After an increasing interest in learning classifier systems and the XCS classifier system in particular, this paper locates and analyzes the distinct evolutionary pressures in XCS. Combining several of the pressures, an equation is derived that validates the generalization hypothesis which was stated by Wilson(1995). A detailed experimental study of the equation exhibits its applicability in predicting the change in specificity in XCS as well as reveals several other specificity influences. ReferenceMartin V. Butz, & Martin Pelikan (2001). Analyzing the Evolutionary Pressures in XCS. IlliGAL Report No. 2001009, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Genetic and Evolutionary Computation Conference 2001 (GECCO-2001), pp. 935-942. Download PDF Download PS Escaping Hierarchical Traps with Competent Genetic Algorithms Martin Pelikan and David E. Goldberg Abstract To solve hierarchical problems, one must be able to learn the linkage, represent partial solutions efficiently, and assure effective niching. Linkage learning results in a good problem solver on a single level. Niching and efficient representation of partial solutions ensure that the algorithm maintains enough alternative solutions on each level to compose the solutions on a higher level. We combine the Bayesian optimization algorithm, which has been shown to solve problems on a single level efficiently, with a powerful niching technique based on crowding and restricted tournament selection. Decision graphs are used as local structures to encode information about the relationships among the variables in a problem. The proposed algorithm is called the hierarchical Bayesian optimization algorithm. Additionally, we propose a new class of hierarchically decomposable problems that are deceptive on each level and show that the proposed algorithm scales up subquadratically on all test problems. The proposed class of problems is called hierarchical traps. Empirical results are in agreement with our recent convergence and population sizing theory. ReferenceMartin Pelikan, & David E. Goldberg (2001). Escaping Hierarchical Traps with Competent Genetic Algorithms. IlliGAL Report No. 2001003, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Genetic and Evolutionary Computation Conference 2001 (GECCO-2001), pp. 511-518. Download PDF Download PS 2000 A C++ Implementation of the Bayesian Optimization Algorithm with Decision Graphs Martin Pelikan Abstract The paper explains how to download, compile, and use the C++ implementation of the Bayesian optimization algorithm (BOA) with decision graphs (Pelikan, Goldberg, & Sastry, 2000). It provides the instructions for creating input files for the BOA to solve various problems with various parameter settings and for adding new test functions into the existing code. Outputs of an example experiment are discussed. ReferenceMartin Pelikan (2000). A C++ Implementation of the Bayesian Optimization Algorithm with Decision Graphs. IlliGAL Report No. 2000025, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Download PDF Download PS Genetic Algorithms, Clustering, and the Breaking of Symmetry Martin Pelikan and David E. Goldberg Abstract This paper introduces clustering as a tool to improve the effects of recombination and incorporate niching in evolutionary algorithms. Instead of processing the entire set of parent solutions, the set is first clustered and the solutions in each of the clusters are processed separately. This alleviates the problem of symmetry which is often a major difficulty of many evolutionary algorithms in combinatorial optimization. Furthermore, it incorporates niching into genetic algorithms and, for the first time, the probabilistic model-building genetic algorithms. The dynamics and performance of the proposed method are illustrated on example problems. ReferenceMartin Pelikan, & David E. Goldberg (2000). Genetic Algorithms, Clustering, and the Breaking of Symmetry. IlliGAL Report No. 2000013, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Parallel Problem Solving from Nature (PPSN VI), 2000, Springer, pp. 385-394. Download PDF Download PS Research on the Bayesian Optimization Algorithm Martin Pelikan and David E. Goldberg Abstract This paper summarizes our recent research on the Bayesian optimization algorithm (BOA) and outlines the directions our research in this area has been following. It settles the algorithm in the problem decomposition framework used often to understand the complex behavior of genetic algorithms. It provides the most important research issues to tackle and reviews our recent progress in each of these areas. ReferenceMartin Pelikan, & David E. Goldberg (2000). Research on the Bayesian Optimization Algorithm. IlliGAL Report No. 2000010, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Workshop Proceedings of the Genetic and Evolutionary Computation Conference 2000 (GECCO-2000), pages 216-219. Download PDF Download PS Hierarchical Problem Solving and the Bayesian Optimization Algorithm Martin Pelikan and David E. Goldberg ReferenceMartin Pelikan, & David E. Goldberg (2000). Hierarchical Problem Solving and the Bayesian Optimization Algorithm. IlliGAL Report No. 2000002, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Genetic and Evolutionary Computation Conference 2000 (GECCO-2000), pp. 267-274. Download PDF Download PS Bayesian Optimization Algorithm, Population Sizing, and Time to Convergence Martin Pelikan, David E. Goldberg, and Erick Cantu-Paz Abstract This paper analyzes convergence properties of the Bayesian optimization algorithm (BOA). It settles the BOA into the framework of problem decomposition used frequently in order to model and understand the behavior of simple genetic algorithms. The growth of the population size and the number of generations until convergence with respect to the size of a problem is theoretically analyzed. The theoretical results are supported by a number of experiments. ReferenceMartin Pelikan, David E. Goldberg, & Erick Cantu-Paz (2000). Bayesian Optimization Algorithm, Population Sizing, and Time to Convergence. IlliGAL Report No. 2000001, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, 13 pages. Published in Genetic and Evolutionary Computation Conference 2000 (GECCO-2000), pp. 275-282. Download PDF Download PS 1999 Bayesian Optimization Algorithm, Decision Graphs, and Occam's Razor Martin Pelikan, David E. Goldberg, and Kumara Sastry Abstract This paper discusses the use of various scoring metrics in the Bayesian optimization algorithm (BOA) which uses Bayesian networks to model promising solutions and generate the new ones. The use of decision graphs in Bayesian networks to improve the performance of the BOA is proposed. To favor simple models, a complexity measure is incorporated into the Bayesian-Dirichlet metric for Bayesian networks with decision graphs. The presented algorithms are compared on a number of interesting problems. ReferenceMartin Pelikan, David E. Goldberg, & Kumara Sastry (2000). Bayesian Optimization Algorithm, Decision Graphs, and Bayesian Networks. IlliGAL Report No. 2000020, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Genetic and Evolutionary Computation Conference 2001 (GECCO-2001), pp. 519-526. Download PDF Download PS A Survey of Optimization by Building and Using Probabilistic Models Martin Pelikan, David E. Goldberg, and Fernando Lobo Abstract This paper summarizes the research on population-based probabilistic search algorithms based on modeling promising solutions by estimating their probability distribution and using the constructed model to guide the further exploration of the search space. It settles the algorithms in the field of genetic and evolutionary computation where they have been originated. All methods are classified into a few classes according to the complexity of the class of models they use. Algorithms from each of these classes are briefly described and their strengths and weaknesses are discussed. ReferenceMartin Pelikan, David E. Goldberg, and Fernando Lobo (1999). A Survey of Optimization by Building and Using Probabilistic Models. IlliGAL Report No.99018, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, 11 pages. Published in Computational Optimization and Applications, 21(1), 5-20. Kluwer Academic Publishers, 2002. Download PDF Download PS Parameter-less Genetic Algorithm: A Worst-case Time and Space Complexity Analysis Martin Pelikan and Fernando Lobo Abstract In this paper, the worst-case analysis of the time and space complexity of the parameter-less genetic algorithm versus the genetic algorithm with an optimal population size is provided and the results of the analysis are discussed. Since the assumptions in order for the analysis to be correct are very weak, the result is applicable to a wide range of problems. Various configurations of the parameter-less genetic algorithm are considered and the results of their time and space complexity are compared. ReferenceMartin Pelikan, Fernando Lobo (1999). Parameter-less Genetic Algorithm: A Worst-case Time and Space Complexity Analysis. IlliGAL Report No. 99014, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Download PDF Download PS A Simple Implementation of the Bayesian Optimization Algorithm (BOA) in C++ (version 1.0) Martin Pelikan Abstract The paper explains how to download, compile, and use the simple implementation of the Bayesian optimization algorithm (BOA) (Pelikan et al., 1998, 1999), version 1.0, written in C++. It provides the instructions for creating input files for the BOA to solve various problems with various parameter settings and for adding new test functions into the existing code. Outputs of an example experiment are discussed. ReferenceMartin Pelikan (1999). A Simple Implementation of the Bayesian Optimization Algorithm (BOA) in C++ (version 1.0). IlliGAL Report No. 99011, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Download PDF Download PS BOA: The Bayesian optimization algorithm Martin Pelikan, David E. Goldberg, and Erick Cantu-Paz Abstract In this paper, an algorithm based on the concepts of genetic algorithms that uses an estimation of the joint distribution of promising solutions in order to generate new candidate solutions is proposed. The proposed algorithm is called the Bayesian optimization algorithm (BOA). To estimate the distribution of promising solutions, techniques for modeling multivariate data by Bayesian networks are used. The proposed algorithm identifies, reproduces and mixes building blocks up to a specified order. It is independent of the ordering of the variables in the strings representing the solutions. Moreover, prior information about the problem can be incorporated into the algorithm. However, the prior information is not essential. Preliminary experiments show that the BOA outperforms the simple genetic algorithm even on decomposable functions with tight building blocks as the problem size grows. ReferenceMartin Pelikan, David E. Goldberg, & Erick Cantu-Paz, E. (1999). BOA: The Bayesian optimization algorithm. Genetic and Evolutionary Computation Conference (GECCO-99), I, 525-532. Also IlliGAL Report No. 99003. Download PDF Download PS The Bivariate Marginal Distribution Algorithm Martin Pelikan and Heinz Muehlenbein Abstract The paper deals with the Bivariate Marginal Distribution Algorithm (BMDA). It is an extension of the Univariate Marginal Distribution Algorithm (UMDA). The algorithm uses the dependencies of the genes in order to improve algorithms that use simple univariate marginal distributions. BMDA is a special case of the Factorization Distribution Algorithm, but without any a problem specific knowledge in the initial stage. The dependencies of the genes are discovered during the optimization process itself. In this paper, BMDA is described in detail. BMDA is compared to different algorithms including the simple genetic algorithm with different crossover methods and UMDA. For some fitness functions the relation between problem size and the number of fitness evaluations until convergence is shown. ReferenceMartin Pelikan, & Heinz Muehlenbein (1999). Advances in Soft Computing - Engineering Design and Manufacturing, 521-535. Download PS 1998 Linkage Problem, Distribution Estimation, and Bayesian Networks Martin Pelikan, David E. Goldberg, and Erick Cantu-Paz Abstract In this paper, an algorithm based on the concepts of genetic algorithms that uses an estimation of the joint distribution of promising solutions in order to generate new candidate solutions is proposed. The algorithm is settled into the context of evolutionary computation and the algorithms based on the estimation of distributions. The proposed algorithm is called the Bayesian optimization algorithm (BOA). To estimate the distribution of promising solutions, the techniques for modeling multivariate data by Bayesian networks are used. The proposed algorithm identifies, reproduces and mixes building blocks up to a specified order. It is independent of the ordering of the variables in strings representing the solutions. Moreover, prior information about the problem can be incorporated into the algorithm. However, the prior information is not essential. Experiments were done with additively decomposable problems. The proposed algorithm was able to solve all tested problems in linear or close to linear time with respect to the problem size without the use of any prior knowledge about the problem. ReferenceMartin Pelikan, David E. Goldberg, and Erick Cantu-Paz (1998). Linkage Problem, Distribution Estimation, and Bayesian Networks. IlliGAL Report No.98013, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL. Published in Evolutionary Computation, 8(3), 2002, pp. 311-341. Download PDF Download PS Marginal Distributions in Evolutionary Algorithms Martin Pelikan, and Heinz Muehlenbein Abstract In this paper, the description of two gene pool recombination operators is described. Both operators are based on the estimation of the distribution of the parents and its use to generate new individuals. The Univariate Marginal Distribution Algorithm (UMDA) uses simple univariate distributions. It is a perfect algorithm for linear problems. The Bivariate Marginal Distribution Algorithm (BMDA) is an extension of UMDA. In BMDA, the most important pair dependencies are taken into account. The dependencies are measured by the Pearson's chi-square statistics. The structure of a problem is being discovered during an optimization. BMDA works well for linear problems as well as for problems with interacting genes. ReferenceMartin Pelikan, & Heinz Muehlenbein (1998). Marginal Distributions in Evolutionary Algorithms. In: Proceedings of the International Conference on Genetic Algorithms Mendel '98, Brno, Czech Republic, pp. 90-95. Download PS 1996 Read's Linear Codes and Evolutionary Computation Over Populations of Rooted Trees Vladimir Kvasnicka, Jiri Pospichal, and Martin Pelikan Abstract A theoretical base for efficient coding of the part of evolutionary computation, mostly represented by genetic programming, is given. The efficiency of the application of this theoretical approach was already confirmed in programs for symbolic regression [Martin Pelikan, Genetic Programming, Student Case Study, 1996 (in Slovak)]. The Read's coding of rooted trees is outlined, followed by methods of reverse translation of the code into a graph (so-called parsing). Theorem for checking, whether for a given code there exists a corresponding rooted tree, is used by an algorithm for efficient generation of a correct code. Algorithm which keeps correctness of this code of tree, while handling changes and swaps of its part, is presented. Compression os the code of trees corresponding to Koza's "automatic defined functions" is formalized. All the mentioned algorithms are given in a pseodocode. ReferenceVladimir Kvasnicka, Jiri Pospichal, & Martin Pelikan (1996). Read's Linear Codes and Evolutionary Computation Over Populations of Rooted Trees. In: Proceedings of the International Conference on Neural Networks Intelligent Technologies '96, Kosice, Slovakia, pp. 141-154. Stochastic Simulation of Multiagent Models Vladimir Kvasnicka, Jiri Pospichal, and Martin Pelikan Abstract This paper proposes an analytic framework for the analysis of evolutionary mechanisms of agents. Population dynamics in a simplified computational ecosystem comprised of many agents is studied. The goal is a simulation and a prediction of population development of artificial agents, which are able to reproduce proportionally to the density of their population, and are removed by a constant speed. One-type-agent and two-types-agent models are investigated, with and without competition for the living space. Application of the models can range from computer science (modeling distributed genetic algorithm optimization requiring high concentration of agents for crossover, where agents can be represented by neural networks) to economy (competition of products or technologies on the market, when a more spread product has lower production costs, gets better advertisement and therefore spreads even more) or biology (modeling an epidemic of a contagious incurable disease). ReferenceVladimir Kvasnicka, Jiri Pospichal, & Martin Pelikan (1996). Stochastic Simulation of Multiagent Models. In: Proceedings of the International Conference on Neural Networks Intelligent Technologies '96, Kosice, Slovakia, pp. 165-184. Hill Climbing with Learning (An Abstraction of Genetic Algorithm) Vladimir Kvasnicka, Martin Pelikan, and Jiri Pospichal Abstract Hill Climbing with Learning - a simple modification of standard hill climbing algorithm by taking into account learning features. Basic concept of this approach is the so-called probability vector, its single entries determine probabilities of appearance of '1' entries in n-bit vectors. This vector is used for the random generation of n-bit vectors that form a neighbourhood. Within the neighbourhood a few best solutions are used for updating probability vector by a formal analogue of Hebbian learning rule. The process is repeated until the probability vector entries are close either to zero or to one. The resulting probability vector unambigously determines an n-bit vector which may be interpreted as an optimal solution of the given optimization task. Resemblance with genetic algorithms is discussed. Effectiveness of the proposed method is illustrated by an example of looking for global minima of a highly multimodal function. ReferenceVladimir Kvasnicka, Martin Pelikan, & Jiri Pospichal (1996). Hill Climbing with Learning (An Abstraction of Genetic Algorithm). Neural Network World, 6, pp. 773-796. Last update: Thu Aug 9 19:53:49 CDT 2012 by Martin Pelikan