Genetic programming

Statistics: Scientific method · Research methods · Experimental design · Undergraduate statistics courses · Statistical tests · Game theory · Decision theory

In artificial intelligence, genetic programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task. It is a specialization of genetic algorithms (GA) where each individual is a computer program. It is a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task.

History[]

In 1954, GP began with the evolutionary algorithms first used by Nils Aall Barricelli applied to evolutionary simulations. In the 1960s and early 1970s, evolutionary algorithms became widely recognized as optimization methods. Ingo Rechenberg and his group were able to solve complex engineering problems through evolution strategies as documented in his 1971 PhD thesis and the resulting 1973 book. John Holland was highly influential during the 1970s.

In 1964, Lawrence J. Fogel, one of the earliest practitioners of the GP methodology, applied evolutionary algorithms to the problem of discovering finite-state automata. Later GP-related work grew out of the learning classifier system community, which developed sets of sparse rules describing optimal policies for Markov decision processes. The first statement of modern "tree-based" Genetic Programming (that is, procedural languages organized in tree-based structures and operated on by suitably defined GA-operators) was given by Nichael L. Cramer (1985).^[1] This work was later greatly expanded by John R. Koza, a main proponent of GP who has pioneered the application of genetic programming in various complex optimization and search problems.^[2] Gianna Giavelli, a student of Koza's, later pionered the use of genetic programming as a technique to model DNA expression.^[3]

In the 1990s, GP was mainly used to solve relatively simple problems because it is very computationally intensive. Recently GP has produced many novel and outstanding results in areas such as quantum computing, electronic design, game playing, sorting, and searching, due to improvements in GP technology and the exponential growth in CPU power.^[4] These results include the replication or development of several post-year-2000 inventions. GP has also been applied to evolvable hardware as well as computer programs.

Developing a theory for GP has been very difficult and so in the 1990s GP was considered a sort of outcast among search techniques. But after a series of breakthroughs in the early 2000s, the theory of GP has had a formidable and rapid development. So much so that it has been possible to build exact probabilistic models of GP (schema theories, Markov chain models and meta-optimization algorithms). ^{[citation needed]}^{[dubious — see talk page]}

Program representation[]

File:Genetic Program Tree.png

A function represented as a tree structure.

GP evolves computer programs, traditionally represented in memory as tree structures.^[5] Trees can be easily evaluated in a recursive manner. Every tree node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. Thus traditionally GP favors the use of programming languages that naturally embody tree structures (for example, Lisp; other functional programming languages are also suitable).

Non-tree representations have been suggested and successfully implemented, such as linear genetic programming which suits the more traditional imperative languages [see, for example, Banzhaf et al. (1998)]. The commercial GP software Discipulus uses automatic induction of binary machine code ("AIM")^[6] to achieve better performance. µGP^[7] uses directed multigraphs to generate programs that fully exploit the syntax of a given assembly language.

Genetic operators[]

The main operators used in evolutionary algorithms such as GP are crossover and mutation.

Crossover[]

Crossover is applied on an individual by simply switching one of its nodes with another node from another individual in the population. With a tree-based representation, replacing a node means replacing the whole branch. This adds greater effectiveness to the crossover operator. The expressions resulting from crossover are very much different from their initial parents.

Mutation[]

Mutation affects an individual in the population. It can replace a whole node in the selected individual, or it can replace just the node's information. To maintain integrity, operations must be fail-safe or the type of information the node holds must be taken into account. For example, mutation must be aware of binary operation nodes, or the operator must be able to handle missing values.

Other approaches[]

The basic ideas of genetic programming have been modified and extended in a variety of ways:

Extended Compact Genetic Programming (ECGP)
Embedded Cartesian Genetic Programming (ECGP)
Probabilistic Incremental Program Evolution (PIPE)

MOSES[]

Meta-Optimizing Semantic Evolutionary Search (MOSES) is a meta-programming technique for evolving programs by iteratively optimizing genetic populations.^[8] It has been shown to strongly outperform genetic and evolutionary program learning systems, and has been successfully applied to many real-world problems, including computational biology, sentiment evaluation, and agent control.^[9] When applied to supervised classification problems, MOSES performs as well as, or better than support vector machines (SVM), while offering more insight into the structure of the data, as the resulting program demonstrates dependencies and is understandable in a way that a large vector of numbers is not.^[9]

MOSES is able to out-perform standard GP systems for two important reasons. One is that it uses estimation of distribution algorithms (EDA) to determine the Markov blanket (that is, the dependencies in a Bayesian network) between different parts of a program. This quickly rules out pointless mutations that change one part of a program without making corresponding changes in other, related parts of the program. The other is that it performs reduction to reduce programs to normal form at each iteration stage, thus making programs smaller, more compact, faster to execute, and more human readable. Besides avoiding spaghetti code, normalization removes redundancies in programs, thus allowing smaller populations of less complex programs, speeding convergence.

Meta-Genetic Programming[]

Meta-Genetic Programming is the proposed meta learning technique of evolving a genetic programming system using genetic programming itself. It suggests that chromosomes, crossover, and mutation were themselves evolved, therefore like their real life counterparts should be allowed to change on their own rather than being determined by a human programmer. Meta-GP was formally proposed by Jürgen Schmidhuber in 1987,.^[10] Doug Lenat's Eurisko is an earlier effort that may be the same technique. It is a recursive but terminating algorithm, allowing it to avoid infinite recursion.

Critics of this idea often say this approach is overly broad in scope. However, it might be possible to constrain the fitness criterion onto a general class of results, and so obtain an evolved GP that would more efficiently produce results for sub-classes. This might take the form of a Meta evolved GP for producing human walking algorithms which is then used to evolve human running, jumping, etc. The fitness criterion applied to the Meta GP would simply be one of efficiency.

For general problem classes there may be no way to show that Meta GP will reliably produce results more efficiently than a created algorithm other than exhaustion. The same holds for standard GP and other search algorithms.

Implementations[]

Possibly most used:

ECJ - Evolutionary Computation/Genetic Programming research system (Java)
Lil-Gp Genetic Programming System (C).
Beagle - Open BEAGLE, a versatile EC framework (C++ with STL)
EO Evolutionary Computation Framework (C++ with static polymorphism)

Other:

Implementation	Description	License	Language
BraneCloud Evolution	Fork of ECJ for .NET 4.0	Apache License	C#
RobGP	Robust Genetic Programming System	GNU GPL	C++
EvoJ	Evolutionary computations framework	Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License	Java
JEF	JAVA Evolution Framework	GNU Lesser GPL	Java
GPC++	Genetic Programming C++ Class Library	GNU GPL	C++
TinyGP	A tiny genetic programming system.		C and Java
GeneXproTools	Commercial gene expression programming software for logistic regression, classification and regression	Proprietary	C++
GenPro	Reflective Object Oriented Genetic Programming. Open Source Framework. Extend with POJO's, generates plain Java code.	Apache License 2.0	Java
deap	Distributed Evolutionary Algorithms in Python	GNU Lesser GPL	Python
pySTEP	Python Strongly Typed gEnetic Programming	MIT License	Python
Pyevolve		Modified PSF	Python
JAGA	Extensible and pluggable open source API for implementing genetic algorithms and genetic programming applications		Java
RMIT GP	A Genetic Programming Package with support for Automatically Defined Functions		C++
GPE	Framework for conducting experiments in Genetic Programming		.NET
DGPF	simple Genetic Programming research system		Java
JGAP	Java Genetic Algorithms and Genetic Programming, an open-source framework		Java
n-genes	Java Genetic Algorithms and Genetic Programming (stack oriented) framework		Java
PMDGP	object oriented framework for solving genetic programming problems		C++
DRP	Directed Ruby Programming, Genetic Programming & Grammatical Evolution Library		Ruby
GPLAB	A Genetic Programming Toolbox for MATLAB		MATLAB
MOEA Framework	Multiobjective Evolutionary Algorithm Framework	GNU LGPL	Java
GPTIPS	Genetic Programming & Symbolic Regression Toolbox for MATLAB. Aimed at performing multigene symbolic regression.	GNU GPL	MATLAB
PyRobot	Evolutionary Algorithms (GA + GP) Modules, Open Source		Python
PerlGP	Grammar-based genetic programming in Perl	GNU GPL	Perl
Discipulus	Commercial Genetic Programming Software from RML Technologies, Inc		Generates code in most high level languages
GAlib	Object oriented framework with 4 different GA implementations and 4 representation types (arbitrary derivations possible)	GAlib License	C++
Java GALib	Source Forge open source Java genetic algorithm library, complete with Javadocs and examples (see bottom of page)		Java
LAGEP	Supporting single/multiple population genetic programming to generate mathematical functions. Open Source, OpenMP used.	GNU GPL	C/C++
PushGP	a strongly typed, stack-based genetic programming system that allows GP to manipulate its own code (auto-constructive evolution)		Java / C++ / Javascript / Scheme / Clojure / Lisp
Groovy	Groovy Java Genetic Programming	GNU GPL	Java
GEVA	Grammatical Evolution in Java		Java
jGE	Java Grammatical Evolution	GNU GPL v3	Java
ECF	Evolutionary Computation Framework. different genotypes, parallel algorithms, tutorial		C++
JCLEC	Evolutionary Computation Library in Java, expression tree encoding, syntax tree encoding	GNU GPL	Java
HeuristicLab	A Paradigm-Independent and Extensible Environment for Heuristic Optimization, rich graphical user interface, open source, plugin-based architecture		C#
PolyGP	Strong typing and lambda abstractions		Haskell
PonyGE	a small, one source file implementation of GE, with an interactive graphics demo application	GNU GPL v3	Python
inspyred	Biologically inspired computation encompasses a broad range of algorithms including evolutionary computation, swarm intelligence, and neural networks	GNU GPL v3	Python
MicroGP (uGP)	General purpose tool, mostly exploited for assembly language generation	GNU GPL	C++
DCTG-GP	Grammar-guided GP using logic-based attribute grammars		Prolog
Eureqa/Formulize	GP based Symbolic Regression	Commercial	Not Open source
Watchmaker Framework	Genetic Computation Framework	Apache License 2.0	Java

References and notes[]

↑ Nichael Cramer's HomePage
↑ genetic-programming.com-Home-Page
↑ The Genetic Coding of Behavioral Attributes in Cellular Automata. Artificial Life at Stanford 1994 Stanford, California, 94305-3079 USA.
↑ humancompetitive
↑ Cramer, 1985
↑ (Peter Nordin, 1997, Banzhaf et al., 1998, Section 11.6.2-11.6.3)
↑ MicroGP page on SourceForge, complete with tutorials and wiki
↑ OpenCog MOSES
↑ ^9.0 ^9.1 Moshe Looks (2006), Competent Program Learning, PhD Thesis, http://metacog.org/doc.html
↑ 1987 THESIS ON LEARNING HOW TO LEARN, METALEARNING, META GENETIC PROGRAMMING, CREDIT-CONSERVING MACHINE LEARNING ECONOMY

Bibliography[]

Banzhaf, W., Nordin, P., Keller, R.E., and Francone, F.D. (1998), Genetic Programming: An Introduction: On the Automatic Evolution of Computer Programs and Its Applications, Morgan Kaufmann
Barricelli, Nils Aall (1954), Esempi numerici di processi di evoluzione, Methodos, pp. 45–68.
Brameier, M. and Banzhaf, W. (2007), Linear Genetic Programming, Springer, New York
Crosby, Jack L. (1973), Computer Simulation in Genetics, John Wiley & Sons, London.
Cramer, Nichael Lynn (1985), "A representation for the Adaptive Generation of Simple Sequential Programs" in Proceedings of an International Conference on Genetic Algorithms and the Applications, Grefenstette, John J. (ed.), Carnegie Mellon University
Fogel, David B. (2000) Evolutionary Computation: Towards a New Philosophy of Machine Intelligence IEEE Press, New York.
Fogel, David B. (editor) (1998) Evolutionary Computation: The Fossil Record, IEEE Press, New York.
Forsyth, Richard (1981), BEAGLE A Darwinian Approach to Pattern Recognition Kybernetes, Vol. 10, pp. 159–166.
Fraser, Alex S. (1957), Simulation of Genetic Systems by Automatic Digital Computers. I. Introduction. Australian Journal of Biological Sciences vol. 10 484-491.
Fraser, Alex and Donald Burnell (1970), Computer Models in Genetics, McGraw-Hill, New York.
Holland, John H (1975), Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor
Korns, Michael (2007), Large-Scale, Time-Constrained, Symbolic Regression-Classification, in Genetic Programming Theory and Practice V. Springer, New York.
Korns, Michael (2009), Symbolic Regression of Conditional Target Expressions, in Genetic Programming Theory and Practice VII. Springer, New York.
Korns, Michael (2010), Abstract Expression Grammar Symbolic Regression, in Genetic Programming Theory and Practice VIII. Springer, New York.
Koza, J.R. (1990), Genetic Programming: A Paradigm for Genetically Breeding Populations of Computer Programs to Solve Problems, Stanford University Computer Science Department technical report STAN-CS-90-1314. A thorough report, possibly used as a draft to his 1992 book.
Koza, J.R. (1992), Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press
Koza, J.R. (1994), Genetic Programming II: Automatic Discovery of Reusable Programs, MIT Press
Koza, J.R., Bennett, F.H., Andre, D., and Keane, M.A. (1999), Genetic Programming III: Darwinian Invention and Problem Solving, Morgan Kaufmann
Koza, J.R., Keane, M.A., Streeter, M.J., Mydlowec, W., Yu, J., Lanza, G. (2003), Genetic Programming IV: Routine Human-Competitive Machine Intelligence, Kluwer Academic Publishers
Langdon, W. B., Genetic Programming and Data Structures, Springer ISBN 0-7923-8135-1
Langdon, W. B., Poli, R. (2002), Foundations of Genetic Programming, Springer-Verlag ISBN 3-540-42451-2
Nordin, J.P., (1997) Evolutionary Program Induction of Binary Machine Code and its Application. Krehl Verlag, Muenster, Germany.
Poli, R., Langdon, W. B., McPhee, N. F. (2008). A Field Guide to Genetic Programming, Lulu.com, freely available from the internet.
Rechenberg, I. (1971): Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Evolution (PhD thesis). Reprinted by Fromman-Holzboog (1973).
Schmidhuber, J. (1987). Evolutionary principles in self-referential learning. (On learning how to learn: The meta-meta-... hook.) Diploma thesis, Institut f. Informatik, Tech. Univ. Munich.
Smith, S.F. (1980), A Learning System Based on Genetic Adaptive Algorithms, PhD dissertation (University of Pittsburgh)
Smith, Jeff S. (2002), Evolving a Better Solution, Developers Network Journal, March 2002 issue
Shu-Heng Chen et al. (2008), Genetic Programming: An Emerging Engineering Tool,International Journal of Knowledge-based Intelligent Engineering System, 12(1): 1-2, 2008.
Weise, T, Global Optimization Algorithms: Theory and Application, 2008

External links[]

Riccardo Poli, William B. Langdon,Nicholas F. McPhee, John R. Koza, "A Field Guide to Genetic Programming" (2008)
DigitalBiology.NET Vertical search engine for GA/GP resources
Aymen S Saket & Mark C Sinclair
The Hitch-Hiker's Guide to Evolutionary Computation
GP bibliography
People who work on GP

This page uses Creative Commons Licensed content from Wikipedia (view authors).

[1] Nichael Cramer's HomePage

[2] tic-programming.com-Home-Page

[3] The Genetic Coding of Behavioral Attributes in Cellular Automata. Artificial Life at Stanford 1994 Stanford, California, 94305-3079 USA.

[4] umancompetitive

[5] Cramer, 1985

[6] (Peter Nordin, 1997, Banzhaf et al., 1998, Section 11.6.2-11.6.3)

[7] MicroGP page on SourceForge, complete with tutorials and wiki

[8] OpenCog MOSES

[looks-9] 9.0 ^9.1 Moshe Looks (2006), Competent Program Learning, PhD Thesis, http://metacog.org/doc.html

[10] 1987 THESIS ON LEARNING HOW TO LEARN, METALEARNING, META GENETIC PROGRAMMING, CREDIT-CONSERVING MACHINE LEARNING ECONOMY

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]