fungp

0.3.2


Genetic programming in Clojure

dependencies

org.clojure/clojure
1.3.0



(this space intentionally left almost blank)
 

Mike Vollmer, 2012, GPL

Downloading and Installing

This project is hosted on GitHub.

To install it, you can clone the git repository or grab the zip/tar file from the above link. You'll need Clojure installed, and you'll probably want Leiningen (the latter will take care of the former). See below for how to run the samples included with this library.

What is this?

fungp is a parallel genetic programming library implemented in the Clojure programming language, pronounced fun-gee-pee. The "fun" comes from functional, and because genetic programming can be fun! Also I'm bad at naming things.

There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.

--- Paraphrased from Phil Karlton

The library is in its early stages right now, but it's usable. Currently it has the following features:

  • Custom evaluation and reporting logic
  • Parallelism: subpopulations run in native threads
  • Evolve and test functions of multiple arities
  • Evolve subroutines
  • Evolve code that interacts with Java or other JVM languages

How do I use it?

Call the run-genetic-programming function with a map containing these keyword parameters:

  • iterations : number of iterations between migrations
  • migrations : number of migrations
  • num-islands : number of islands
  • population-size : size of the populations
  • tournament-size : size of the tournaments (default 5)
  • mutation-probability : probability of mutation (default 0.2)
  • mutation-depth : depth of mutated trees (default 6)
  • max-depth : maximum depth of trees
  • terminals : terminals used in tree building
  • numbers : number literals to use as terminals (default [])
  • functions : functions used in tree building, in the form [function arity]
  • fitness : a fitness function that takes a tree and returns an error number, lower is better
  • report : a reporting function passed [best-tree best-fit] at each migration
  • adf-count : number of automatically defined functions (default 0)
  • adf-arity : number of arguments for automatically defined functions (default 1)

Most fitness functions will eval the trees of code they are passed. In Clojure, eval'ing a list of code will compile it to JVM bytecode.

If you're interested primarily in how to use fungp, you can skip to the example files.

Notes on mutable variables and loops

fungp is fully capable of evolving code that uses mutable variables, side-effects, and loops. Currently it does not support Koza's architecture-altering operations so you have to determine some of the architecture beforehand (like, the names of the variables, whether you need a loop, etc).

There also is no built-in mechanism for managing mutable variables --- I would like there to be, but there are a few obstacles. Clojure doesn't really support mutable local variables. That's certainly not a bad thing, but it makes it slightly more complicated to create local variables (or something that acts like them) at runtime. Because of this, I decided to leave the management of local variables up to the user, which (thanks to dynamic Vars) is fairly straightforard. See the compile-ants sample for an example of this.

Environment, distribution

I highly recommend you use Leiningen. fungp is set up as a Leiningen project. Leiningen (or "lein") is available for all major operating systems and is trivial to install.

By itself fungp has no "main" method, so running it won't do anything. The samples can be run individually, and each has a "test-*" function that launches it. I recommend you run them from the REPL, as they have some settable parameters.

Here's how you would run the cart example, starting from the fungp directory:

> lein repl
nREPL server started on port 54110
user=> (use 'fungp.sample.cart)
nil
user=> (test-cart 3 6)
...

At that point the cart example will run.

The code

The code here in the core.clj file is rather brief (but dense), and it should be readable in one sitting. To actually use it, you'll likely have to at least read the sample code, and probably read most of this code as well.

This is the namespace declaration. It is the start of the core of the library.

(ns fungp.core
  (:use fungp.util)
  (:use fungp.defined-branches))

Tree creation

My method of random tree generation is a combination of the "grow" and "fill" methods of tree building, similar to Koza's "ramped half-and-half" method.

Return a random terminal or number.

(defn terminal
  [terminals numbers]
  (if (and (flip 0.5)
           (not (empty? numbers)))
    (rand-nth numbers)
    (rand-nth terminals)))

Build a tree of source code, given a mutation depth, terminal sequence, function sequence, and type keyword. The type can be either :grow or :fill. The terminal sequence should consist of symbols or quoted code, while elements in the function sequence should contain both the function and a number representing its arity, in this form: [function arity].

(defn create-tree
  [mutation-depth terminals numbers functions gtype]
  ;; conditions: either return terminal or create function and recurse
  (cond (zero? mutation-depth) (terminal terminals numbers)
        (and (= gtype :grow)
             (flip 0.5))
        (terminal terminals numbers)
        :else (let [[func arity] (rand-nth functions)]
                (cons func (repeatedly arity
                                       #(create-tree (dec mutation-depth)
                                                     terminals
                                                     numbers
                                                     functions
                                                     gtype))))))

Branches

This is a good time to introduce one of the defining features of the code evolved in fungp: it consists of multiple ordered branches, the last of which is the Result Defining Branch, or the branch that actually returns the final result. Other branches in the evolved code are stored in a let statement, which in Clojure binds and evaluates sequentially.

For example:

(let [a 5
      b (fn [x] (* a x))]
  (b a))

The above code would bind a to 5, and b to the anonymous function. The reference to a in b will resolve to 5, because elements in let are evaluated and bound in order. Finally, the expression (b a) would evaluate to 25.

Most likely the type of branch you'll find most useful are Automatically Defined Functions. These are branches that are separate functions. They are exposed to the main branch, so the main branch can call the separately evolved functions, passing arguments to them.

In addition to the ADFs, I implement a couple of other types of architecture described by Koza. I don't, however, implement architecture altering operations, so whatever architecture your individuals have must be determined beforehand.

I also don't have support for automatically defined memory. See compiled-ants for an example of how to support mutable local variables (a feature that doesn't really exist in Clojure). For more information on the different types of branches in fungp see fungp.defined-branches.

This is a version of create-tree that handles multiple branches. It builds a let form that has a (potentially empty) vector for automatically defined functions and loops, and it has a main branch that is the result defining branch.

(defn create-module-tree
  [mutation-depth terminals numbers functions adf-arity adf-count adl-count adl-limit type]
  (list 'let
        ;; create the vector of branches
        (build-branch-vector create-tree mutation-depth terminals numbers functions
                             adf-arity adf-count adl-count adl-limit)
        ;; create the main branch
        (create-tree mutation-depth (concat terminals (gen-adl-terminals adl-count)) numbers
                     ;; add the branches to the function vector
                     (concat functions (gen-adf-func adf-count adf-arity))
                     type)))

Populations

A population is a list of individuals. Creating the population involves randomly generating a list of individuals as a starting point. Islands, which will be implemented further down, are lists of populations

Creates a population of trees given a population size, mutation depth, terminal sequence, and function sequence. It uses a variation of Koza's "ramped half-and-half" method: a coin flip determines whether to use the "fill" or "grow" method, and the mutation depth is a randomly chosen number between 1 and the specified max mutation depth.

(defn create-population
  [population-size mutation-depth terminals numbers functions adf-arity adf-count adl-count adl-limit]
  (if (zero? population-size) []
    (conj (create-population (dec population-size) mutation-depth terminals numbers functions
                             adf-arity adf-count adl-count adl-limit)
          (create-module-tree (inc (rand-int mutation-depth)) terminals numbers functions
                              adf-arity adf-count adl-count adl-limit
                              (if (flip 0.5) :grow :fill)))))

Tree manipulation

rand-subtree and replace-subtree are two of the most important functions. They define how the trees are modified.

The basic idea for how I implemented both of them is that recursion can be used to reduce the problem at each step: given a tree (or a subtree, all of which have the same form), recurse on a random subtree, along with a reduced value of n. The base case is when n is zero or the function hits a leaf.

Additionally, replace-subtree uses concat to reconstruct the tree on its way back up the stack.

Return a random subtree of a list. Takes an optional second parameter that limits the depth to go before selecting a crossover point.

(defn rand-subtree
  ([tree]
    (rand-subtree tree (rand-int (inc (max-tree-height tree)))))
  ([tree n]
    (if (or (zero? n) (and (seq? tree) (= (count tree) 1)) ;; don't split up (leaf)
                           (not (seq? tree))) tree
      (recur (rand-nth (rest tree))
             (rand-int n)))))

Replace a random subtree with a given subtree. Takes an optional second parameter that limits the depth to go before selecting a crossover point.

(defn replace-subtree
  ([tree sub]
    (replace-subtree tree sub (rand-int (+ 1 (max-tree-height tree)))))
  ([tree sub n]
    (if (or (zero? n) (and (seq? tree) (= (count tree) 1)) ;; don't split up (leaf)
                           (not (seq? tree))) sub
      (let [r (+ 1 (rand-int (count (rest tree))))]
        (concat (take r tree)
                (list (replace-subtree
                        (nth tree r) sub
                        (rand-int n)))
                (nthrest tree (inc r)))))))

Prevent trees from growing too big by lifting a subtree if the tree height is greater than the max tree height.

(defn truncate
  [tree height]
  (if (> (max-tree-height tree) height)
    (recur (rand-subtree tree) height)
    tree))

A version of truncate that handles branches.

(defn truncate-module
  [tree height]
  (list (first tree)
        (truncate-branch tree truncate height)
        (truncate (nth tree 2) height)))

Mutation, crossover, and selection

With rand-subtree and replace-subtree out of the way, the rest of the single-generation pass is pretty simple. Mutation and crossover both can easily be written in terms of rand-subtree and replace-subtree.

Mutation takes a tree and (occasionally) randomly changes part of it. The idea, like the rest of the fundamental aspects of genetic algorithms, is taken from nature; when DNA is copied, there is a slight chance of "mistakes" being introduced in the copy. This can lead to beneficial changes and increases genetic diversity.

Mutate a tree. Mutation takes one of three forms, chosen randomly: replace a random subtree with a newly generated tree, replace a random subtree with a terminal, or "lift" a random subtree to replace the root. The function takes a tree, mutation rate, a mutation depth (max size of new subtrees), terminals, and functions.

(defn mutate-tree
  [tree mutation-probability mutation-depth terminals numbers functions]
  (if (flip mutation-probability)
    (let [coin (rand)] ;; random number between 0 and 1
      (cond (< coin 0.33)
            (replace-subtree tree (create-tree mutation-depth terminals numbers functions :grow))
            (< coin 0.66)
            (replace-subtree tree (rand-nth terminals))
            :else (rand-subtree tree)))
    tree))

A version of mutate-tree that handles branches. It applies the mutation operation not only to the result defining branch but to the automatically defined branches in the let statement, and it preserve the overall structure of the tree (the let form).

(defn mutate-module-tree
  [module-tree mutation-probability mutation-depth terminals numbers functions]
  (if (or (flip 0.5)
          (zero? (count (second module-tree))))
    ;; mutate the main branch
    (concat (take 2 module-tree)
            (list (mutate-tree (nth module-tree 2) mutation-probability
                               mutation-depth terminals numbers functions)))
    (concat (list (nth module-tree 0))
            (list (vec (map (fn [letf]
                         ;; mutate the branches
                         (mutate-branch letf mutate-tree
                                        mutation-probability mutation-depth
                                        terminals numbers functions))
                            (nth module-tree 1))))
            (list (nth module-tree 2)))))

Apply mutation to every tree in a population. Similar arguments to mutate-tree.

(defn mutate-population
  [population mutation-probability mutation-depth terminals numbers functions]
  (map #(mutate-module-tree % mutation-probability mutation-depth terminals
                            numbers functions) population))

Crossover is the process of combining two parents to make a child. It involves copying the genetic material (in this case, lisp code) from the two parents, combining them, and returning the result of the combination.

The crossover function is simple to define in terms of replace-subtree and rand-subtree. Basically, crossing over two trees involves selecting a random subtree from one tree, and placing it randomly in the other tree.

(defn crossover
  [tree1 tree2] (replace-subtree tree1 (rand-subtree tree2)))

Crossover that preserves the let form and branches.

(defn crossover-module
  [tree1 tree2]
  (if (or (flip 0.5)
          (zero? (count (second tree1))))
    (let [new-subtree (crossover (nth tree1 2) (nth tree2 2))]
      (list (first tree1) (vec (second tree1)) new-subtree))
    (let [cross-branch (+ 1 (* 2 (rand-int (/ (count (second tree1))))))]
      (list (first tree1)
            (vec (crossover-branch cross-branch crossover
                                   (second tree1) (second tree2)))
            (nth tree1 2)))))

Now it's time to get into functions that operate on populations.

Selection is the process in which more fit individuals are "selected," or more likely to breed (be involved in a crossover), while less fit individuals are less likely to breed.

To carry out the selection phase, it's necessary to determine how fit the individuals are. The following functions use the fitness function to give the individual trees a grade (I sometimes refer to it as "error"). Lower grades are better. Then, in the selection phase, individuals with lower error are more likely to be selected for crossover, and thus pass on their genetic material to the next generation.

Note that the fitness function is expected to return lower numbers for better results, with 0 being "perfect." Execution will stop once 0 is reached.

Apply truncate to all individuals in a population.

(defn truncate-population
  [population height]
  (map #(truncate-module % height) population))

Compute the fitness of all the trees in the population, and map the trees to their population in a seq of a zipmap.

(defn fitness-zip
  [population fitness]
  (seq (zipmap population (map fitness population))))

Use tournament selection to create a new generation. In each tournament the two best individuals in the randomly chosen group will reproduce to form a child tree. A larger tournament size will lead to more selective pressure. The function takes a population, tournament size, "fitness-zip" or sequence of the zip of trees and fitness scores, and the max depth, and it returns a new population.

(defn tournament-selection
  [population tournament-size fitness-zip]
  (let [child
        (fn []
          (let [selected (map first (sort-by second
                                             (repeatedly tournament-size
                                                         #(rand-nth fitness-zip))))]
            (crossover-module (first selected)
                              (second selected))))]
    (repeatedly (count population) child)))

Takes the fitness zip and finds the best in the population.

(defn get-best-fitness
  [fitness-zip]
  (first (sort-by second fitness-zip)))

Put the best-seen individual back in the population.

(defn elitism
  [population best]
  (conj (rest population) best))

Putting it together

This takes care of all the steps necessary to complete one generation of the algorithm. The process can be extended to multiple generations with a simple tail recursive function.

There are some extra considerations here. The function should:

  • stop when a perfect individual has been found, meaning fitness is zero

  • be resumable, meaning the search can halt, returning information, and that information can be passed back in to start the search at the same place

Runs n generations of a population, and returns the population and the best tree in the form [population best-tree fitness]. Takes a long list of parameters. This function is meant to be called by island-generations, which in turn is called by run-genetic-programming.

(defn generations
  [n population tournament-size mutation-probability mutation-depth max-depth terminals
   numbers functions fitness]
  (loop [n (int n) population population] ;; optimize inner loop
    (let [computed-fitness (fitness-zip population fitness)
          [best-tree best-fit] (get-best-fitness computed-fitness)]
      (if (or (zero? n) (zero? best-fit)) ;; terminating condition
        [population best-tree best-fit]   ;; return
        (recur (- n 1)                    ;; else recurse
               (-> population
                   (tournament-selection tournament-size computed-fitness)
                   (mutate-population mutation-probability mutation-depth terminals numbers functions)
                   (truncate-population max-depth)
                   (elitism best-tree)))))))

Islands

The above code works for running generations of a single population. The concept of islands is to have multiple separated populations evolving in parallel, and cross over between them.

Thanks to clojure's parallelism features, the islands actually do run in parallel. To an extent, anyway: they're processed with a thread pool of reasonable size ("reasonable size" being the key phrase -- clojure decides based on your availabe resources).

So, islands do two things.

  • Better results: by separating individuals for part of their evolution, and recombining them occasionally, we get (hopefully) more diversity.

  • Better performance: by exploiting multiple CPU cores with native threads, the program can process more individuals quickly.

Create a list of populations (islands).

(defn create-islands
  [num-islands population-size mutation-depth terminals numbers functions adf-arity adf-count
   adl-count adl-limit]
  (repeatedly num-islands #(create-population population-size mutation-depth terminals numbers
                                              functions adf-arity adf-count adl-count adl-limit)))

Individuals migrate between islands.

(defn island-crossover
  [islands]
  (let [cross (map rand-nth islands)]
    (map (fn [[island selected]] (conj (rest (shuffle island)) selected))
         (zipmap islands cross))))

And... drum roll

Run generations on all the islands and cross over between them. See the documentation for the generations function. Returns with the form [island best-tree best-fit].

(defn island-generations
  [n1 n2 islands tournament-size mutation-probability mutation-depth max-depth terminals
   numbers functions fitness report]
  (loop [n (int n1) islands islands] ;; optimize inner loop
    (let [islands-fit (pmap #(generations n2 % tournament-size mutation-probability
                                          mutation-depth max-depth terminals numbers
                                          functions fitness)
                            (if (> (count islands) 1) (island-crossover islands) islands))
          islands (map first islands-fit)
          [_ best-tree best-fit] (first (sort-by #(nth % 2) islands-fit))]
      (if (or (zero? n) (zero? best-fit))
        [islands best-tree best-fit]
        (do (report best-tree best-fit)
            (recur (- n 1) islands))))))

Wrap it up

This is the final function, the one to be called from outside. It takes a key->value hash as a parameter, so the options can be provided in any order and some have default values. See the top of this file for a complete explanation of each of the keyword parameters.

This is the entry function. Call this with a map of the parameters to run the genetic programming algorithm.

(defn run-genetic-programming
  [{:keys [iterations migrations num-islands population-size tournament-size mutation-probability
           mutation-depth max-depth terminals functions numbers fitness report adf-arity adf-count
           adl-count adl-limit]
    ;; the :or block here specifies default values for some of the arguments
   :or {tournament-size 3 mutation-probability 0.1 mutation-depth 6 adf-arity 1 adf-count 0
        adl-count 0 adl-limit 25 numbers []}}]
  ;; some basic type checking: most of the parameters must be integers
  (map #(assert (integer? %)) [iterations migrations num-islands population-size tournament-size mutation-probability
                               mutation-depth max-depth adf-arity adf-count adl-count adl-limit])
  ;; report and fitness should be functions
  (map #(assert (fn? %)) [report fitness])
  ;; call island generations
  (island-generations migrations iterations
                      (create-islands num-islands population-size mutation-depth terminals
                                      numbers functions adf-arity adf-count adl-count adl-limit)
                      tournament-size mutation-probability mutation-depth max-depth terminals
                      numbers functions fitness report))

And that's it! For the core of the library, anyway.

 

Branches and Modular Structure

fungp is capable of evolving multiple branches in an individual. In addition to the main branch, or the "Result Defining Branch," it can evolve functions (Automatically Defined Functions, or ADFs).

Some of the handling for these branches was covered in fungp.core, but the details are handled here.

(ns fungp.defined-branches)

Automatically Defined Functions

ADFs are functions of arbitrary arity that can be called from the main branch. The following functions are helper functions for creating individuals with ADF branches.

Generate arguments vector for ADF

(defn gen-adf-arg
  [adf-arity]
  (vec (map #(symbol (str "arg" %))
            (range adf-arity))))

Generate ADF function names

(defn gen-adf-func
  [adf-count adf-arity]
  (when (> adf-arity 0)
    (vec (map (fn [n] [(symbol (str "adf" n))
                       adf-arity])
              (range adf-count)))))

Build the ADF code

(defn make-adf
  [create-tree mutation-depth terminals numbers functions adf-arity num]
  (let [adf-args (gen-adf-arg adf-arity)]
  [(symbol (str "adf" num))
   (list 'fn adf-args
         (create-tree mutation-depth
                      (concat terminals adf-args)
                      numbers
                      functions
                      :grow))]))

Make the ADF branch

(defn make-adf-branch
  [create-tree mutation-depth terminals numbers functions adf-arity adf-count]
  (if (zero? adf-count) []
      (concat (make-adf-branch create-tree mutation-depth terminals numbers
                               functions adf-arity (dec adf-count))
              (make-adf create-tree mutation-depth terminals numbers
                        (concat functions
                                (gen-adf-func (dec adf-count)
                                              adf-arity))
                        adf-arity (dec adf-count)))))

Automatically Defined Loops

ADLs are loops that evolve the different stages of the loop as distinct branches. The following functions are helper functions for creating individuals with ADL branches.

Currently, ADLs are only partially implemented and not tested or stable.

Build the ADL code

(defn make-adl
  [create-tree mutation-depth terminals numbers functions adl-limit num]
  (let [make-branch #(create-tree mutation-depth
                                  terminals
                                  numbers
                                  functions
                                  :grow)]
    [(symbol (str "adl" num))
     (list 'fungp.util/do-loop
           ;; do-loop has 4 branches of code and one number literal
           (make-branch)
           (make-branch)
           (make-branch)
           (make-branch)
           adl-limit)]))

Make the ADL branch

(defn make-adl-branch
  [create-tree mutation-depth terminals numbers functions adl-count adl-limit]
  (if (zero? adl-count) []
    (concat (make-adl-branch create-tree mutation-depth terminals numbers
                             functions (dec adl-count) adl-limit)
            (make-adl create-tree mutation-depth terminals numbers
                      functions ; TODO: include ADFs and previous ADLs, if any
                      adl-limit (dec adl-count)))))

Generate a vector of symbols to reference the result of the ADLs.

(defn gen-adl-terminals
  [adl-count]
  (vec (map #(symbol (str "adl" %)) (range adl-count))))

Putting it together

This function builds the vector that goes into the let statement for each individual. It is intended to be called by create-module-tree.

(defn build-branch-vector
  [create-tree mutation-depth terminals numbers functions adf-arity adf-count adl-count adl-limit]
  (vec (concat (if (zero? adf-count) '()
                   (make-adf-branch create-tree mutation-depth terminals numbers
                                    functions adf-arity adf-count))
               (if (zero? adl-count) '()
                   (make-adl-branch create-tree mutation-depth terminals numbers
                                    functions adl-count adl-limit)))))

Takes an element of the branch vector, a mutation function, and the tunable parameters the mutation function needs, and applies mutation to each of the branches.

(defn mutate-branch
  [letf mutate-tree mutation-probability mutation-depth terminals numbers functions]
  (cond (and (list? letf)
             (= 'fn (first letf)))
        (list (first letf) (second letf)
              ;; call mutate-tree on the ADF body
              (mutate-tree (nth letf 2)
                           mutation-probability
                           mutation-depth terminals
                           numbers functions))
        (and (list? letf)
             (= 'fungp.util/do-loop (first letf)))
        ;; it's a loop! mutate each branch individually
        (list (first letf)
              (mutate-tree (nth letf 1)
                           mutation-probability
                           mutation-depth terminals
                           numbers functions)
              (mutate-tree (nth letf 2)
                           mutation-probability
                           mutation-depth terminals
                           numbers functions)
              (mutate-tree (nth letf 3)
                           mutation-probability
                           mutation-depth terminals
                           numbers functions)
              (mutate-tree (nth letf 4)
                           mutation-probability
                           mutation-depth terminals
                           numbers functions)
              (nth letf 5))
        :else letf))

Call truncate on branches. Takes a tree, truncate function, and a height.

(defn truncate-branch
  [tree truncate height]
  (vec (map
        #(cond (and (list? %)
                    (= 'fn (first %)))
               (list (first %) (second %)
                     (truncate (nth % 2) height))
               (and (list? %)
                    (= 'fungp.util/do-loop (first %)))
               (list (first %)
                     (truncate (nth % 1) height)
                     (truncate (nth % 2) height)
                     (truncate (nth % 3) height)
                     (truncate (nth % 4) height)
                     (nth % 5))
               :else %)
        (second tree))))

Crossover function for the vectors of branches. Takes the cross point, the crossover function, and the two vectors.

(defn crossover-branch
  [cross crossover vec1 vec2]
  (let [tree1 (nth vec2 cross)
        tree2 (nth vec1 cross)
        newtree (cond (= (first tree1) 'fn)
                      (list (first tree1) (second tree1)
                            (crossover (nth tree1 2) (nth tree2 2)))
                      (= (first tree1) 'fungp.util/do-loop)
                      (list (nth tree1 0)
                            (crossover (nth tree1 1) (nth tree2 1))
                            (crossover (nth tree1 2) (nth tree2 2))
                            (crossover (nth tree1 3) (nth tree2 2))
                            (crossover (nth tree1 4) (nth tree2 4))
                            (nth tree1 5)))]
    (concat (take cross vec1)
            (list newtree)
            (drop (+ cross 1) vec1))))
 

Cart problem

This is an example of evolving a function that takes parameters and returns a number. In this case, the function acts as a controller for and imaginary cart.

The cart is on a track and has a rocket attached to it. The goal of the controller is to aim the rocket either forward and backward and eventually center the cart. The controller function takes two parameters: the current velocity, v, and the current position, x. When v is positive the cart is moving to the right, and when v is negative the cart is moving to the left. The evolved function will use these inputs to return either a positive or negative number, which designates which direction to point the rocket; negative points it to the left, and positive points it to the right. The rocket applies a constant force (1 unit/second^2) in whichever direction it is facing on every tick of the clock.

Evolved programs will be tested with a variety of random starting configurations (position and velocity) to see if the cart can be centered every time.

To test an evolved program, we need to simulate a cart and let the controller programs experiment by pushing it around. As they do this they should "learn" how to center the cart. The simulation will work like this: the cart is placed in a random location with a random velocity, and, in a loop, the simulation function will call the evolved function with the state of the cart as input, and use the output of the program to update the state of the cart. Remember, the cart's state is its location and velocity. The simulation function will do this repeatedly until the cart is centered or we hit a pre-defined time limit.

(ns fungp.sample.cart
  (:use fungp.core)
  (:use fungp.util)
  (:use clojure.pprint))

Some constants

Max time limit

Step increment

Number of test cases

Add this to fitness if program runs out of time

Stop if fitness is lower than this value.

(def MAX_T  5)
(def STEP_INC  0.02)
(def TEST_POINTS  40)
(def PENALTY  200)
(def CRITERIA  65)

The simulation functions

These functions are designed to test a given cart controlling function to see if it is able to center the cart.

Is the cart at rest? This function tests its location, x, and velocity, v, to see if it is stopped in the center of the track.

(defn cart-at-rest?
  [x v]
  (and (<= (Math/abs x) 0.01) (<= (Math/abs v) 0.01)))

Has the simulation hit the maximum time limit? This function compares the amount of time, t, in the simulation so far to the maximum time, MAX_T.

(defn out-of-time?
  [t]
  (>= t MAX_T))

Get the thrust of the cart by running the generated function with velocity v and position x as inputs.

(defn get-thrust
  [f v x]
  (if (> (f v x) 0) 1.0 -1.0))

Update the t (time) value in the simulation by incrementing it by STEP_INC.

(defn update-t
  [t]
  (+ t STEP_INC))

Update the v (velocity) value in the simulation given the current thrust.

(defn update-v
  [v thrust]
  (+ v (* thrust STEP_INC)))

Update the x (position) value in the simulation given the current x and v (velocity)

(defn update-x
  [x v]
  (+ x (* v STEP_INC)))

Simulate cart movement with the evolved function used to calculate thrust. Parameters are: t (time), v (velocity), x (position), and f (function). This function loops until it is out of time or the cart is centered and at rest. On each iteration of the loop it updates the values of t, v, and x.

(defn move-cart
  [t v x f]
  (loop [t (float t) v (float v) x (float x)]
    ;; Explicitly calling "float" like this allows the Clojure compiler to unbox
    ;; the floating point values in the loop body.
    (cond (cart-at-rest? x v) t
          (out-of-time? t) PENALTY
          :else (recur (update-t t)
                       (update-v v (get-thrust f v x))
                       (update-x x v)))))

Test data

Range of starting position or velocity

(defn rand-spread
  [l] (repeatedly l #(- (* 1.5 (rand)) 0.75)))

Functions available to the evolved programs

(def cart-functions 
  '[[+ 2][* 2][- 2][fungp.util/abs 1][fungp.util/gt 2]])

Terminals available to the evolved programs

(def cart-terminals 
  '[x v -1])

Calculate error using move-cart, given a function and sequences of starting velocities and positions.

(defn cart-error
  [f velocity position]
  (reduce + (map (fn [v x] (move-cart 0 v x f))
                 velocity
                 position)))

Test an evolved function on random starting configurations.

(defn cart-test
  [f] (let [err (cart-error f (rand-spread TEST_POINTS) (rand-spread TEST_POINTS))]
        (if (> err CRITERIA) err 0)))

Some existing solutions

These are some examples to compare your outputs to.

The solution according to physics

(defn cart-optimal 
  [v x] (gt (* -1 x) (* v (abs v))))

A non-optimal evolved program

(defn cart-suboptimal 
  [v x] (gt (- x x) (+ (+ x (+ v x)) x)))

Initializing fungp

Compute fitness for cart problem.

(defn cart-fitness
  [tree] (let [f (eval (list 'fn '[v x] tree))]
           (cart-test f)))

Reporting function. Prints out the tree and its score

(defn cart-report
  [tree fitness]
  (pprint (list 'fn '[v x] tree))
  (print (str "Error:\t" fitness "\n\n"))
  (flush))

Run the cart problem

(defn test-cart
  [n1 n2]
  (println "Cart Problem")
  (println "Error is number of seconds taken total across all the tests.")
  (println (str "If a program times out it is assigned " PENALTY " seconds."))
  (println (str "Because the simulation randomizes the test inputs, "
                "error values will go up and down during a search."))
  (let [options {:iterations n1 :migrations n2 :num-islands 4
                 :population-size 30 :terminals cart-terminals
                 :tournament-size 3 :mutation-probability 0.15
                 :functions cart-functions :fitness cart-fitness
                 :report cart-report
                 :max-depth 4 :mutation-depth 3}
        [tree score] (rest (run-genetic-programming options))]
    (do (println "Done!")
        (cart-report tree score))))
 

Santa Fe Ant Trail

This is an implementation of Koza's Santa Fe Ant Trail that uses fungp's local variables and result wrappers. Unlike the interpret-ants implementation, this one takes advantage of the JVM by compiling the individuals into JVM bytecode.

(ns fungp.sample.compile-ants
  (:use fungp.core)
  (:use fungp.util)
  (:use clojure.pprint))

First some constants need to be established.

Maximum number of steps before penalty

(def MAXSTEPS 
  50)

Penalty applied to individuals that don't eat all the food

(def PENALTY 
  50)

Maximum number of iterations (length of loop in fitness function)

(def MAXITER 
  50)

Locations of food to be eaten.

(def STARTFOOD1 
  [[1 1][1 2][1 3][1 4][1 5][2 5][3 5][4 5][5 5][5 4]
   [5 3][6 3][7 3][8 3][8 4][7 4][6 4][6 2][7 2][8 3]
   [8 4][9 4][9 5][9 6][9 8][8 8][8 7][9 7]])

Second set of locations of food to be eaten.

(def STARTFOOD2 
  [[0 1][1 1][1 2][2 2][3 2][3 3][4 3][5 4][5 5][5 6]
   [6 6][7 6][8 6][9 6][9 7][9 8][9 9][8 9][7 9][6 9]
   [6 8][5 8][5 7][4 7][4 6][4 5][4 4][4 3]])

New set of food to test for generalization

(def TESTFOOD 
  [[1 0][2 0][3 0][4 0][5 0][6 1][6 2][6 3][6 4][6 5]
   [6 6][5 6][4 6][4 7][3 7][2 7][1 7][0 7][0 6][0 5]
   [1 5][2 5][2 4][2 3][2 1][3 1][3 2][3 3]])

Vars in Clojure are thread-local, which is a useful property. Here a bunch of dynamic vars are declared, and they can be referenced by the rest of the code in this file. Once fungp has been started, there will be separate versions of these vars in each thread, and so the code that references them will refer to the local copy in the respective thread.

Basically, you can write your application as if these were global variables, and they will work in a multithreaded environment automatically, like magic, as long as you include the "binding" statement below in the fitness function

(def ^:dynamic ant-x)
(def ^:dynamic ant-y)
(def ^:dynamic ant-dir)
(def ^:dynamic food)
(def ^:dynamic eaten)
(def ^:dynamic steps)

Then some helper functions.

(defn sense-food [dir x y food]
  (cond (= dir 0) (some #(= % [x (inc y)]) food)
        (= dir 1) (some #(= % [(inc x) y]) food)
        (= dir 2) (some #(= % [x (dec y)]) food)
        (= dir 3) (some #(= % [(dec x) y]) food)
        :else (throw (Throwable. "Invalid direction"))))
(defmacro sense [x y]
  `(if (sense-food ant-dir ant-x ant-y food)
    ~x ~y))
(defn new-direction [elm ant-dir]
  (cond (= elm 'left)
        (cond (= ant-dir 0) 3
              (= ant-dir 1) 0
              (= ant-dir 2) 1
              (= ant-dir 3) 2)
        (= elm 'right)
        (cond (= ant-dir 0) 1
              (= ant-dir 1) 2
              (= ant-dir 2) 3
              (= ant-dir 3) 0)
        :else ant-dir))
(defn left []
  (set! ant-dir (new-direction 'left ant-dir)))
(defn right []
  (set! ant-dir (new-direction 'right ant-dir)))
(defn new-x [ant-dir ant-x]
  (cond (= ant-dir 1) (inc ant-x)
        (= ant-dir 3) (dec ant-x)
        :else ant-x))
(defn new-y [ant-dir ant-y]
  (cond (= ant-dir 0) (inc ant-y)
        (= ant-dir 2) (dec ant-y)
        :else ant-y))
(defn new-food [ant-x ant-y food]
  (remove #(= % [ant-x ant-y]) food))
(defn new-eaten [ant-x ant-y food eaten]
  (if (some #(= % [ant-x ant-y]) food) (inc eaten) eaten))
(defn move []
  (do
    (set! steps (inc steps))
    (set! ant-x (new-x ant-dir ant-x))
    (set! ant-y (new-y ant-dir ant-y))
    (set! eaten (new-eaten ant-x ant-y food eaten))
    (set! food (new-food ant-x ant-y food))))
(def ant-terminals '[(left) (right) (move)])
(def ant-functions '[[sense 2][do 2][do 3]])

This evaluates the fitness of the individuals. It uses a binding block to create thread-local bindings, evaluates the evolved code, and runs it in a loop.

(defn run-ant
  [tree init-food]
  (binding [ant-x 0 ant-y 0 ant-dir 0 food init-food eaten 0 steps 0]
    (let [func (eval (list 'fn [] tree))]
      (loop [iter MAXITER]
        (do
          (func) ;; run the code
          (if (zero? (count food))
            steps
            (if (or (zero? iter) (zero? steps) (> steps MAXSTEPS))
              (+ MAXSTEPS PENALTY (count food))
              (recur (dec iter)))))))))

Tests the fitness of an ant using run-ant and STARTFOOD.

(defn ant-fitness
  [tree] (+ (run-ant tree STARTFOOD1) (run-ant tree STARTFOOD2)))

Reporting function. Prints out the tree and its score

(defn ant-report
  [tree fitness]
  (pprint (nth tree 2))
  (println (str "Error:\t" fitness "\n\n")))
(defn test-compile-ants [n1 n2]
  (println "Ant problem, from Koza")
  (println (str "Generations: " (* n1 n2)))
  (println "Training is done on two data sets, each with a penalty of 50 and 50 max steps.")
  (let [options {:iterations n1 :migrations n2 :num-islands 8
                 :tournament-size 3
                 :population-size 20 :max-depth 5
                 :terminals ant-terminals :fitness ant-fitness
                 :functions ant-functions :report ant-report}
        [tree score] (rest (run-genetic-programming options))]
    (do (println "Done!")
        (ant-report tree score)
        (println "Testing for generalization (one data set, same penalty and max steps)")
        (println (str "Score: " (run-ant tree TESTFOOD))))))
 

Interpreter-based Santa Fe Ant Trail

This implementation of Koza's Santa Fe Ant Trail problem evolves an s-expression that is treated as a tree rather than as literal Clojure code. It is passed to a function that interprets it by traversing the evolved tree.

This is one of the two examples of the Ant Trail program included with fungp. This example shows how you might approach a GP problem by constructing a machine and evolving a tree as a data structure to be understood by that machine, as opposed to evolving literal Clojure code. There are benefits and downsides to both approaches.

(ns fungp.sample.interpret-ants
  (:use fungp.core)
  (:use fungp.util)
  (:use clojure.pprint))

First some constants need to be established.

(def MAXSTEPS 40)
(def STARTFOOD [[1 0][1 1][1 2][1 3][1 4][1 5][2 5][3 5][3 6][4 6]
                [5 6][6 6][7 7][7 8][8 8][9 8][9 9][10 9][10 10][9 10]
                [8 10][8 9][11 9][11 10]])
(def NUMFOOD (count STARTFOOD))

Then some helper functions.

(defn sense-food [dir x y food]
  (cond (= dir 0) (some #(= % [x (inc y)]) food)
        (= dir 1) (some #(= % [(inc x) y]) food)
        (= dir 2) (some #(= % [x (dec y)]) food)
        (= dir 3) (some #(= % [(dec x) y]) food)
        :else (throw (Throwable. "Invalid direction"))))
(defn new-direction [elm ant-dir]
  (cond (= elm 'left)
        (cond (= ant-dir 0) 3
              (= ant-dir 1) 0
              (= ant-dir 2) 1
              (= ant-dir 3) 2)
        (= elm 'right)
        (cond (= ant-dir 0) 1
              (= ant-dir 1) 2
              (= ant-dir 2) 3
              (= ant-dir 3) 0)
        :else ant-dir))
(defn new-x [elm ant-dir ant-x]
  (if (= elm 'move)
    (cond (= ant-dir 1) (inc ant-x)
          (= ant-dir 3) (dec ant-x)
          :else ant-x)
    ant-x))
(defn new-y [elm ant-dir ant-y]
  (if (= elm 'move)
    (cond (= ant-dir 0) (inc ant-y)
          (= ant-dir 2) (dec ant-y)
          :else ant-y)
    ant-y))
(defn new-food [ant-x ant-y food]
  (remove #(= % [ant-x ant-y]) food))
(defn new-eaten [ant-x ant-y food eaten]
  (if (some #(= % [ant-x ant-y]) food) (inc eaten) eaten))

Simulate the ant movements by repeatedly interpreting the movement function.

(defn simulate-ants
  [iter full-tree tree ant-dir ant-x ant-y food eaten steps]
  (cond (zero? iter) (+ MAXSTEPS (count food))
        (and (seq? tree) (empty? tree))
        (recur (dec iter) full-tree full-tree ant-dir ant-x ant-y food eaten (inc steps))
        (empty? food) steps ;; ant ate all the food
        :else
        (cond (not (seq? tree))
              (recur iter full-tree '()
                     (new-direction tree ant-dir)
                     (new-x tree ant-dir ant-x)
                     (new-y tree ant-dir ant-y)
                     (new-food ant-x ant-y food)
                     (new-eaten ant-x ant-y food eaten)
                     steps)
              (= (first tree) 'sense)
              (let [tree (if (sense-food ant-dir ant-x ant-y food)
                           (nth tree 1) (nth tree 2))]
                (recur iter full-tree tree ant-dir ant-x ant-y food eaten steps))
              (= (first tree) 'do)
              (if (= (count (rest tree)) 1)
                (recur iter full-tree (nth tree 1) ant-dir ant-x ant-y food eaten steps)
                (recur iter full-tree (cons 'do (rest (rest tree)))
                       (new-direction (nth tree 1) ant-dir)
                       (new-x (nth tree 1) ant-dir ant-x)
                       (new-y (nth tree 1) ant-dir ant-y)
                       (new-food ant-x ant-y food)
                       (new-eaten ant-x ant-y food eaten)
                       steps))
              :else (throw (Throwable. (str "Unexpected element in tree: " tree))))))
(def ant-terminals '[move left right])
(def ant-functions '[[sense 2][do 2][do 3]])
(defn ant-fitness [tree]
  (simulate-ants MAXSTEPS (nth tree 2) (nth tree 2) 0 0 0 STARTFOOD 0 0))

Reporting function. Prints out the tree and its score

(defn ant-report
  [tree fitness]
  (pprint (nth tree 2))
  (print (str "Error:\t" fitness "\n\n"))
  (flush))
(defn test-ants [n1 n2]
  (println "Ant problem\n")
  (println (str "Food count: " NUMFOOD "\n"))
  (let [options {:iterations n1 :migrations n2 :num-islands 8 :population-size 50
                 :max-depth 5 :mutation-depth 1
                 :terminals ant-terminals :fitness ant-fitness
                 :functions ant-functions :report ant-report}
        [tree score] (rest (run-genetic-programming options))]
    (do (println "Done!")
        (ant-report tree score))))
 

Robot Grid

Here we want to control a robot on a grid and have it collect coins.

The robot has three state variables: x, y, and direction. These values are not passed as parameters to functions, but stored outside the function in variables. The evolved programs mutate these variables, and the simulation function checks these changes to determine the fitness of the evolved code.

Each evolved program is tested on a series of grids with coins scattered randomly around. The robot can move forward, rotate left, rotate right, and sense whether a coin is in front of it.

A sample evolved program might be:

 (detect-coin (move-forward) 
              (do (left) (move-forward)))

There are 5 total state variables: robot-x, robot-y, robot-dir, ticks, and coins-available. The first two are obvious (x and y coordinates of the robot). The third is a number corresponding to the direction the robot is facing. The fourth is the number of clock ticks that have gone by (each movement of the robot takes one click). If too much time has gone by, the simulation stops and the program is given a penalty. The last one, coins-available, is a list of (x,y) pairs corresponding to coin locations. It is randomly generated. Once the robot picks up a coin, that pair is removed from the list.

(ns fungp.sample.robot
    (:use fungp.core)
    (:use fungp.util)
    (:use clojure.pprint))

Some constants

Maximum number of ticks

Points added when time is up

Number of coins on grid

Length of one side of the grid

Acceptable minimum value

Number of trials for each program

(def MAX_T  100)
(def PENALTY  200)
(def COIN_COUNT  5)
(def GRID_SIZE  4)
(def CRITERIA  120)
(def TRIALS  4)

State variables

State variables are dynamic vars. They are thread-local in Clojure, which is useful because populations in fungp run in parallel threads.

When these variables are referenced in the functions below, they will be re-bound at runtime to the appropriate thread-local values.

(def ^:dynamic robot-x)
(def ^:dynamic robot-y)
(def ^:dynamic robot-dir)
(def ^:dynamic ticks)
(def ^:dynamic coins-available)

Generate a list of coins consisting of [x y] pairs.

(defn generate-coins-list
  []
  (take COIN_COUNT (shuffle (for [x (range GRID_SIZE)
                                  y (range GRID_SIZE)]
                               [x y]))))

Simulation functions

The robot direction is stored as an integer.

  • 0 is left
  • 1 is up
  • 2 is right
  • 3 is down

Return a new x coordinate given a direction for movement.

(defn new-x
     [dir x]
     (mod 
       (case dir
         0 (dec x),
         1 x,
         2 (inc x),
         3 x) 
      4))

Return a new y coordinate given a direction for movement.

(defn new-y
     [dir y]
     (mod 
       (case dir
         0 y,
         1 (inc y),
         2 y,
         3 (dec y)) 
       4))

Check if a coin exists at a certain location on the grid. Expects x and y coordinates and the coins sequence.

(defn coin-at?
     [x y coins] (some #(= % [x y]) coins))

Robot sensor to detect if a coin is in front of the robot.

(defn coin-ahead?
     [] (coin-at? (new-x robot-dir robot-x) 
                  (new-y robot-dir robot-y) 
                  coins-available))

Use robot's sensors to look for a coin, and if it's found execute the first branch, otherwise execute the second branch.

(defmacro detect-coin
  [found not-found]
  `(if (coin-ahead?)
     ~found
     ~not-found))

Make the robot turn left.

(defn left
  [] 
  (set! robot-dir (mod (- robot-dir 1) 4))
  (set! ticks (inc ticks)))

Make the robot turn right.

(defn right
  [] 
  (set! robot-dir (mod (+ robot-dir 1) 4))
  (set! ticks (inc ticks)))

Make the robot move forward one step in whatever direction it is facing.

(defn move-forward
  [] 
  (set! robot-x (new-x robot-dir robot-x))
  (set! robot-y (new-y robot-dir robot-y))
  (set! ticks (inc ticks))
  (when (coin-at? robot-x robot-y coins-available)
    (set! coins-available (remove #(= % [robot-x robot-y]) coins-available))))
(def robot-terminals '[(left) (right) (move-forward)])
(def robot-functions '[[detect-coin 2] [do 2]])

This function repeatedly runs the evolved code and checks its side effects (the dynamic variables) to see if the robot has run out of time or collected all of the coins.

(defn simulate-robot
  [f]
  (binding [robot-x (int (/ GRID_SIZE 2)), robot-y (int (/ GRID_SIZE 2)),
            robot-dir 0, ticks 0,
            coins-available (generate-coins-list)]
           (loop []
                 (f)
                 (cond (> ticks MAX_T) (+ PENALTY ticks (count coins-available))
                       (empty? coins-available) ticks
                       :else (recur)))))

Calculate the error of the evolved program.

(defn robot-error
  [tree] 
  (let [f (eval (list 'fn [] tree))] 
    (apply + (repeatedly TRIALS #(simulate-robot f)))))

Calculate the fitness, taking the criteria into account.

(defn robot-fitness
  [tree]
  (let [error (robot-error tree)]
    (if (> error CRITERIA) error 0)))
(defn robot-report
  [tree fitness]
  (pprint (nth tree 2))
  (println (str "Error:\t" fitness "\n\n")))
(defn test-robots [n1 n2]
  (println "Robot Grid Movement Control Problem")
  (let [options {:iterations n1 :migrations n2 :num-islands 2
                 :tournament-size 3 :population-size 50 :max-depth 4
                 :terminals robot-terminals :fitness robot-fitness
                 :functions robot-functions :report robot-report
                 :mutation-depth 3
        }
        [tree score] (rest (run-genetic-programming options))]
    (do (println "Done!")
        (robot-report tree score))))
 

Sinbowl regression problem

(ns fungp.sample.sinbowl
  (:use fungp.core)
  (:use fungp.util)
  (:use clojure.pprint))

The function to match

(defn sinbowl-function
  [x] (- (* 0.1 (abs x)) (sin x)))

Functions and their arities

(def sinbowl-functions
  '[[+ 2][- 2][* 2][fungp.util/abs 1]
    [Math/sin 1][fungp.util/sdiv 2][inc 1][dec 1]])

Only one parameter for sinbowl

(def sinbowl-parameters
  '[x])

Lots of number literals to use as terminals

(def sinbowl-numbers
  '[-1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 10])

The input test cases

(def test-range
  (map #(float (* (- % 7) 3)) (range 14)))

The output test cases

(def sinbowl-actual
  (map sinbowl-function test-range))

Fitness function. Takes a tree, evals it, and returns a fitness/error score.

(defn sinbowl-fitness
  [tree]
  (try
    (let [f (eval (list 'fn 'testfunc '[x] tree))
          results (map f test-range)]
      (reduce + (map off-by-sq sinbowl-actual results)))
    (catch Exception e (println e) (println tree))))

Reporting function. Prints out the tree and its score

(defn sinbowl-report
  [tree fitness]
  (pprint (list 'fn '[a] tree))
  (print (str "Error:\t" fitness "\n\n"))
  (flush))
(defn test-sinbowl [n1 n2]
  (println (str "Test inputs: " (vec test-range)))
  (println (str "Test outputs: " (vec sinbowl-actual)))
  (println (str "Max generations: " (* n1 n2)))
  (println)
  (let [options {:iterations n1 :migrations n2 :num-islands 6 
                 :population-size 30
                 :max-depth 5 :terminals sinbowl-parameters 
                 :numbers sinbowl-numbers :fitness sinbowl-fitness
                 :functions sinbowl-functions :report sinbowl-report}
        [tree score] (rest (run-genetic-programming options))]
    (do (println "Done!")
        (sinbowl-report tree score))))
 

Getting Started Tutorial

The simplest problem for GP is regression, or coming up with a function to map pairs of input and output. This tutorial goes through how to set up fungp to evolve a function to match a series of points.

If you don't already have fungp set up, look at the explanation at the top of fungp.core. Basically, all you need to do is install lein and download the fungp code.

Namespace for sample use of fungp.

(ns fungp.tutorial
  (:use fungp.core) ;; include the core framework
  (:use fungp.util) ;; include utility functions
  (:use clojure.pprint))

The general structure here is very simple, and probably similar to what you'll want if you're using fungp. We define a bunch of values and use them at the bottom when invoking the library. In creating these values we have to think a little bit about the problem we're trying to solve.

Choosing functions and terminals

The trees evolved in fungp consist of functions and terminals. The functions take an arbitrary number of arguments.

For example, the expression:

(+ a (* b c))

Is a tree with two functions: + and *. Each function takes two parameters. Functions can take the result of other functions as parameters, as well as terminals.

Here's a vector of vectors consisting of [symbol arity] pairs. The symbol must resolve to a function, and the arity number is an integer representing how many arguments that function takes.

(def sample-functions
  '[[+ 2][- 2][* 2][fungp.util/abs 1]
    [fungp.util/sdiv 2][inc 1][dec 1]])

We also need to pick the terminals. In the above expression, a b and c were terminals. Terminals can be anything, but fungp separates them into two groups: numbers and everything else. This is just a suggestion, though, not a requirement. You can put anything you want in your ordinary terminals list, including numbers.

For this example, the only terminals are the function parameter (which I'll call a) and a range of numbers.

This defines the parameters (or, in this case, parameter) to be used to eval the generated functions.

(def sample-parameters
  ['a])

This generates floating point numbers up to 10 as number literals to be used in the code.

(def number-literals
  (map float (range 10)))

Testing data

The following code is used to test the evolved functions.

This defines the range of input to use as training input. The first argument for map here uses a shortcut for single-variable anonymous functions in Clojure.

(def training-range
  (map #(* 2 (- % 5)) (range 10)))

For sake of convenience, we can define a function to generate the outputs we're attempting to match.

(defn match-func
  [a] (abs (* 3 (* a a))))

This defines the actual outputs we are trying to match.

(def actual-output
  (map float (map match-func training-range)))

Computing and reporting fitness

The fitness function; it takes a tree, evals it, and returns a fitness/error score.

(defn sample-fitness
  [tree]
  (try
    (let [f (compile-tree tree sample-parameters) ;; compile using compile-tree
          results (map f training-range)] ;; map the function to the test range
      ;; then we compare the test results to the actual expected output
      ;; off-by-sq is a utility function that calculates difference squared
      (reduce + (map off-by-sq actual-output results)))
    ;; not necessary here, but this is how you might catch a possible exception
    (catch Exception e (println e) (println tree))))

Reporting function. Prints out the tree and its score

(defn sample-report
  [tree fitness]
  (pprint tree)
  (println (str "Error:\t" fitness "\n"))
  (flush))

Running it!

This is the function that launches fungp and starts the evolution. It takes iteration and migration counts as parameters.

(defn test-genetic-program
  [n1 n2]
  (println "\nfungp :: Functional Genetic Programming in Clojure")
  (println "Mike Vollmer, 2012")
  (println (str "Test inputs: " (vec training-range)))
  (println (str "Test outputs: " (vec actual-output)))
  (println (str "Max generations: " (* n1 n2)))
  (println)
  ;; These keyword arguments specify the options for fungp. They should be self-explanatory,
  ;; but you can read more about them in fungp.core
  (let [options {:iterations n1 :migrations n2 :num-islands 6 :population-size 40
                 :tournament-size 5 :mutation-probability 0.1
                 :max-depth 10 :terminals sample-parameters
                 :numbers number-literals :fitness sample-fitness
                 :functions sample-functions :report sample-report }
        ;; the data returned by run-genetic-programming is as follows:
        ;; [population [best-tree score]]
        ;; since we don't care about keeping the whole population
        ;; around, we can save off the tree and score like this
        [tree score] (rest (run-genetic-programming options))]
    ;; that's it!
    (do (println "Done!")
        (sample-report tree score))))

To run test-genetic-program you can load this file in the repl and run the function:

user=> (use 'fungp.tutorial)
nil
user=> (test-genetic-program 15 15)
...

The two parameters correspond to the iterations between migrations and the number of migrations. This also results in the reporting rate -- results are printed to the screen at each migration. The total number of generations is the result of these two numbers multiplied together.

 
(ns fungp.util)

Useful functions for evolved code

Inverts a number. If it's 0, return 0

(defn inv
  [x]
  (if (zero? x) 0
      (/ 1 x)))

Divide two numbers. Returns 0 on attempt to divide by 0

(defn sdiv
  [x y]
  (if (zero? y) 0
      (/ x y)))

Conditionals can be done in terms of 4 arity functions

(defn ifeq [a b c d] (if (= a b) c d))
(defn ifnoteq [a b c d] (if (not (= a b)) c d))
(defn gte [a b c d] (if (>= a b) c d))
(defn gt [x y] (if (> x y) 1 -1))
(defn sin [x] (Math/sin x))
(defn cos [x] (Math/cos x))
(defn tan [x] (Math/tan x))
(defn abs [x] (if (< x 0) (* -1 x) x))
(defn dub [x] (* x x))
(defn half [x] (sdiv x 2))
(defn sqrt [x] (if (x > 0) (Math/sqrt x) 0))

Misc. manipulation/utility functions

Find the maximum height of a tree. The max height is the distance from the root to the deepest leaf.

(defn max-tree-height
  [tree]
  (if (not (seq? tree)) 0
    (+ 1 (reduce max (map max-tree-height tree)))))

Checks the type on a tree to make sure it's a list, symbol, or number.

(defn valid-tree?
  [tree] (or (list? tree) (symbol? tree) (number? tree)))

Compiles a tree into a Clojure function, and thus into JVM bytecode. Takes a tree and the parameters the function should have.

(defn compile-tree
  [tree parameters] (eval (list 'fn parameters tree)))

Convenience function. Generates true with a probablilty of the given parameter (a number between 0 and 1)

(defn flip
  [chance] (< (rand) chance))

Find the entry in the function sequence for a given operator.

(defn find-op
  [op funcs] (first (filter (fn [x] (= (:op x) op)) funcs)))
(defn average
  [list]
  (let [sum (reduce + list)
        length (count list)]
    (/ sum length)))

Calculate error.

(defn off-by
  [x y] (abs(- x y)))

Calculate error squared.

(defn off-by-sq
  [x y] (let [error (off-by x y)]
          (* error error)))

true if seq contains elm

(defn in?
  [seq elm]
  (some #(= elm %) seq))

The macro used to define loops. Used by ADLs. Takes four code branches and one integer.

(defmacro do-loop
  [init break body update limit]
    `(do ~init
       (loop [loop-counter# ~limit value# 0]
         (if (or (zero? loop-counter#)
                 (< ~break 0))
           value#
           (let [newvalue# ~body]
             (do ~update
               (recur (dec loop-counter#) newvalue#)))))))

Reporting function. Prints out the tree and its score

(defn generic-report
  [tree fitness]
  (println tree)
  (println (str "Error:\t" fitness "\n"))
  (flush))