[![Build Status](https://travis-ci.org/yaricom/goNEAT.svg?branch=master)](https://travis-ci.org/yaricom/goNEAT) [![GoDoc](https://godoc.org/github.com/yaricom/goNEAT/neat?status.svg)](https://godoc.org/github.com/yaricom/goNEAT/neat)

## Overview
This repository provides an implementation of the [NeuroEvolution of Augmenting Topologies (NEAT)][1] method written in Go.

Neuroevolution (NE) is the artificial evolution of Neural Networks (NN) using genetic algorithms to find
optimal NN parameters and topology. Neuroevolution of a NN may involve a search for the optimal weights of the connections between
NN nodes as well as a search for the optimal topology of the resulting NN. The NEAT method implemented in this work searches for
both: optimal connection weights and topology for the given task (number of NN nodes per layer and their interconnections).

#### System Requirements
The source code is written and compiled against Go 1.9.x.

## Installation
Make sure that you have at least a Go 1.8.x environment installed on your system and execute the following command:
```bash
go get github.com/yaricom/goNEAT
```

## Performance Evaluations
The basic system performance is evaluated by two kinds of experiments:
1. The XOR experiment, which tests whether topology augmentation actually happens during NEAT algorithm evaluation. To build an XOR-solving
network, the NEAT algorithm should grow a new hidden unit in the provided start genome.
2. The pole-balancing experiments, which are classic Reinforcement Learning experiments allowing us to estimate the performance
of the NEAT algorithm against results proven by many other algorithms, i.e. we can benchmark NEAT performance against
other algorithms and find out whether it performs better or worse.


### 1. The XOR Experiments
Because XOR is not linearly separable, a neural network requires hidden units to solve it. The two inputs must be
combined at some hidden unit, as opposed to only at the output node, because there is no function over a linear
combination of the inputs that can separate the inputs into the proper classes. These structural requirements make XOR
suitable for testing NEAT’s ability to evolve structure.
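
For illustration, an XOR fitness function of the kind commonly used in NEAT experiments can be sketched as follows. This is a simplified, hypothetical sketch, not the executor's actual code: the network is abstracted as a plain function, and the squared-error fitness form `(4 - errorSum)^2` is an assumption borrowed from the classic NEAT XOR setup.

```go
package main

import (
	"fmt"
	"math"
)

// xorFitness scores a candidate network on the four XOR patterns.
// The network is abstracted as a function from the two inputs to a
// single output in [0, 1]; fitness is (4 - sum of absolute errors)^2,
// a common choice in NEAT XOR experiments (assumption, see lead-in).
func xorFitness(activate func(a, b float64) float64) float64 {
	inputs := [4][2]float64{{0, 0}, {0, 1}, {1, 0}, {1, 1}}
	targets := [4]float64{0, 1, 1, 0}

	errSum := 0.0
	for i, in := range inputs {
		out := activate(in[0], in[1])
		errSum += math.Abs(targets[i] - out)
	}
	return (4.0 - errSum) * (4.0 - errSum)
}

func main() {
	// A perfect XOR solver reaches the maximal fitness of 16.
	perfect := func(a, b float64) float64 {
		if a != b {
			return 1
		}
		return 0
	}
	fmt.Println(xorFitness(perfect)) // 16
}
```

A genome that needs a hidden unit to drive `errSum` toward zero is exactly what the experiment checks NEAT can evolve.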

#### 1.1. The XOR Experiment with connected inputs in start genome
In this experiment we use a start (seed) genome with inputs connected to the output. Thus it mostly checks the
ability of NEAT to grow the new hidden unit necessary for solving the XOR problem.

To run this experiment execute the following commands:
```bash
cd $GOPATH/src/github.com/yaricom/goNEAT
go run executor.go -out ./out/xor -context ./data/xor.neat -genome ./data/xorstartgenes -experiment XOR
```
Where: `./data/xor.neat` is the configuration of the NEAT execution context and `./data/xorstartgenes` is the start genome
configuration.

This will execute 100 trials of the XOR experiment, each within 100 generations. As a result, several 'gen_x' files
with population snapshots will be stored in the ./out directory every 'print_every'
generations or when a winner solution is found. The same directory will also contain 'xor_winner' with the winner genome and
'xor_optimal' with the optimal XOR solution, if any was found (it has exactly 5 units).

By examining the resulting 'xor_winner' from a series of experiments you will find that at least one hidden unit was grown by NEAT
to solve the XOR problem, which is proof that it works as expected.

The XOR experiment with connected inputs in the start genes almost never fails (over at least 100 simulations).

The experiment results will be similar to the following:

```
Average
	Winner Nodes:	5.0
	Winner Genes:	6.0
	Winner Evals:	7753.0
Mean
	Complexity:	10.6
	Diversity:	19.8
	Age:		34.6
```

Where:
- **Winner nodes/genes** is the number of units and links between them in the produced Neural Network which was able to solve the XOR problem.
- **Winner evals** is the number of evaluations of intermediate organisms/genomes before the winner was found.
- **Mean Complexity** is the average complexity (number of nodes + number of links) of the best organisms per epoch, over all epochs.
- **Mean Diversity** is the average diversity (number of species) per epoch, over all epochs.
- **Mean Age** is the average age of surviving species per epoch, over all epochs.
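
The three "Mean" metrics above are simple per-epoch averages over all epochs of a trial. As an illustrative sketch (not the library's actual reporting code, and with hypothetical sample values):

```go
package main

import "fmt"

// mean averages one per-epoch metric (complexity, diversity, or age)
// over all epochs of a trial.
func mean(perEpoch []float64) float64 {
	if len(perEpoch) == 0 {
		return 0
	}
	sum := 0.0
	for _, v := range perEpoch {
		sum += v
	}
	return sum / float64(len(perEpoch))
}

func main() {
	// Hypothetical per-epoch diversity (species count) samples.
	diversity := []float64{18, 20, 21, 20}
	fmt.Println(mean(diversity)) // 19.75
}
```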

#### 1.2. The XOR experiment with disconnected inputs in start genome
This experiment uses a start genome with disconnected inputs in order to check the ability of the algorithm not only to grow
the needed hidden nodes, but also to build the missing connections between the input nodes and the rest of the network.

To run this experiment execute the following commands:
```bash
cd $GOPATH/src/github.com/yaricom/goNEAT
go run executor.go -out ./out/xor_disconnected -context ./data/xor.neat -genome ./data/xordisconnectedstartgenes -experiment XOR
```

This will execute 100 trials of the XOR (disconnected) experiment, each within 100 generations. The results of the experiment execution
will be saved into the ./out directory as in the previous experiment.

The experiment will sometimes fail to produce an XOR solution within 100 generations, but most of the time a solution will be found. This
confirms that the algorithm is able not only to grow the needed hidden units, but also to restore the input connections as needed.

The example output of the command is as follows:
```
Average
	Winner Nodes:	5.7
	Winner Genes:	9.2
	Winner Evals:	9347.7
Mean
	Complexity:	7.8
	Diversity:	20.0
	Age:		46.7
```

### 2. The single pole-balancing experiment
The pole-balancing or inverted pendulum problem has long been established as a standard benchmark for artificial learning
systems. It is one of the best early examples of a reinforcement learning task under conditions of incomplete knowledge.

![alt text][single_pole-balancing_scheme]

Figure 1.

##### System Constraints
1. The pole must remain upright within ±r, the pole failure angle.
2. The cart must remain within ±h of the origin.
3. The controller must always exert a non-zero force F.

Where r is the pole failure angle (±12° from 0) and h is the track limit (±2.4 meters from the track centre).

The simulation of the cart ends when either the pole exceeds the failure angle or the cart exceeds the limit of the track.
The objective is to devise a controller that can keep the pole balanced for a defined length of simulation time.
The controller must always output a force at full magnitude in either direction (bang-bang control).
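
The termination rule above can be sketched as a simple predicate. This is a minimal illustration assuming the ±12° / ±2.4 m limits stated earlier, not the actual simulator code:

```go
package main

import (
	"fmt"
	"math"
)

const (
	failureAngle = 12.0 * math.Pi / 180.0 // pole failure angle r, in radians
	trackLimit   = 2.4                    // track limit h, in meters
)

// failed reports whether the single pole-balancing episode must end:
// the pole exceeded the failure angle or the cart left the track.
func failed(x, theta float64) bool {
	return math.Abs(theta) > failureAngle || math.Abs(x) > trackLimit
}

func main() {
	fmt.Println(failed(0.0, 0.1)) // within both limits: false
	fmt.Println(failed(2.5, 0.0)) // cart beyond ±2.4 m: true
	fmt.Println(failed(0.0, 0.3)) // pole beyond ~0.21 rad: true
}
```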

In this experiment a Genome is considered a winner if it is able to keep the single pole balanced for at least 500’000 time
steps (10’000 simulated seconds).
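
The two figures in the winner criterion imply a simulation step of 0.02 s (10’000 s spread over 500’000 steps):

```go
package main

import "fmt"

func main() {
	const winnerSteps = 500000.0  // time steps required to win
	const simulatedSecs = 10000.0 // corresponding simulated seconds
	// Implied length of one integration step of the simulation.
	fmt.Println(simulatedSecs / winnerSteps) // 0.02
}
```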

To run this experiment with a population size of 150, execute the following commands:
```bash
cd $GOPATH/src/github.com/yaricom/goNEAT
go run executor.go -out ./out/pole1 -context ./data/pole1_150.neat -genome ./data/pole1startgenes -experiment cart_pole
```

This will execute 100 trials of the single pole-balancing experiment, each within 100 generations and over a population of 150
organisms. The results of the experiment execution will be saved in the ./out directory under the specified folder.

To run this experiment with a population size of 1’000, execute the following commands:
```bash
cd $GOPATH/src/github.com/yaricom/goNEAT
go run executor.go -out ./out/pole1 -context ./data/pole1_1000.neat -genome ./data/pole1startgenes -experiment cart_pole
```

The example output of the command for a population of 1000 organisms is as follows:
```
Average
	Winner Nodes:	7.0
	Winner Genes:	10.2
	Winner Evals:	1880.0
Mean
	Complexity:	17.1
	Diversity:	25.3
	Age:		2.2
```

This will execute 100 trials of the single pole-balancing experiment, each within 100 generations and over a population of 1’000
organisms.

The results demonstrate that a winning Genome can be found on average within 2 generations among a population of 1’000 organisms (which
belong to 17 species on average) and within 30 generations for a population of 150 organisms.

It is interesting to note that for a population of 1000 organisms the winning solution is often found in the null generation,
i.e. within the initial random population. Thus the next experiment, with the double pole-balancing setup, seems more interesting
for performance testing.

In both single pole-balancing configurations described above the optimal winner organism has 7 nodes and 10 genes.

The seven network nodes have the following meaning:
* node #1 is a bias
* nodes #2-5 are sensors receiving the system state: X position, velocity along X, pole angle, and pole angular velocity
* nodes #6, 7 are output nodes signaling what action should be applied to the system to balance the pole at
each simulation step, i.e. the force direction to be applied. The applied force direction depends on the relative strength of the
activations of both output neurons. If the activation of the first output neuron (node #6) is greater than the activation of the second
neuron (node #7), the positive force direction is applied. Otherwise the negative force direction is applied.
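
The force-direction rule above can be sketched as follows. This is a simplified illustration of the comparison, not the executor's actual code; in the real bang-bang controller the returned sign would scale the fixed-magnitude force F:

```go
package main

import "fmt"

// forceDirection implements the described rule: compare the activations
// of the two output neurons (nodes #6 and #7) and return the sign of
// the force to apply to the cart.
func forceDirection(out6, out7 float64) float64 {
	if out6 > out7 {
		return 1.0 // positive force direction
	}
	return -1.0 // negative force direction
}

func main() {
	fmt.Println(forceDirection(0.8, 0.3)) // 1
	fmt.Println(forceDirection(0.2, 0.9)) // -1
}
```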

The TEN genes is exactly the number of links required to connect FIVE input sensor nodes with TWO output neuron nodes (5x2).

### 3. The double pole-balancing experiment

This is an advanced version of pole-balancing which assumes that the cart has two poles, of different mass and length, to be balanced.

![alt text][double_pole-balancing_scheme]

Figure 2.

We will consider two types of this problem for benchmarking:
* the Markovian, with the full system state known (including velocities);
* the Non-Markovian, without velocity information.

The former is fairly simple while the latter is quite challenging.

##### System Constraints
1. Both poles must remain upright within ±r, the pole failure angle.
2. The cart must remain within ±h of the origin.
3. The controller must always exert a non-zero force F.

Where r is the pole failure angle (±36° from 0) and h is the track limit (±2.4 meters from the track centre).

#### 3.1. The double pole-balancing Markovian experiment (with known velocity)

In this experiment the agent receives at each time step the full system state, including the velocity of the cart and of both poles. The
winner solution is determined as the one which is able to perform double pole-balancing for at least 100’000 time steps, or
1’000 simulated seconds.

To run the experiment execute the following command:
```bash
cd $GOPATH/src/github.com/yaricom/goNEAT
go run executor.go -out ./out/pole2_markov -context ./data/pole2_markov.neat -genome ./data/pole2_markov_startgenes -experiment cart_2pole_markov
```

This will execute 10 trials of the double pole-balancing experiment, each within 100 generations and over a population of 1’000
organisms.

The example output of the command:

```
Average
	Winner Nodes:	16.6
	Winner Genes:	35.2
	Winner Evals:	35593.2
Mean
	Complexity:	29.4
	Diversity:	686.9
	Age:		14.7
```

The winner solution can be found approximately within 13 generations, with the complexity of the resulting genome nearly doubled
compared to the seed genome. The seed genome has eight nodes, where nodes #1-6 are sensors for x, x', θ1, θ1', θ2, and θ2'
correspondingly, node #7 is a bias, and node #8 is an output signaling what action should be applied at each time step.


#### 3.2. The double pole-balancing Non-Markovian experiment (without velocity information)

In this experiment the agent receives at each time step a partial system state, excluding velocity information about the cart and both poles.
Only the horizontal cart position X and the angles of both poles θ1 and θ2 are provided to the agent.

The best individual (i.e. the one with the highest fitness value) of every generation is tested for
its ability to balance the system for a longer time period. If a potential solution passes this test
by keeping the system balanced for 100’000 time steps, the so-called generalization score (GS) of this
particular individual is calculated. This score measures the potential of a controller to balance the
system starting from different initial conditions. It is calculated with a series of experiments, running
over 1000 time steps, starting from 625 different initial conditions.

The initial conditions are chosen by assigning each value of the set Ω = \[0.05, 0.25, 0.5, 0.75, 0.95\] to
each of the states x, ∆x/∆t, θ1 and ∆θ1/∆t, scaled to the range of the corresponding variables. The short pole
angle θ2 and its angular velocity ∆θ2/∆t are set to zero. The GS is
then defined as the number of successful runs from the 625 initial conditions; an individual
is defined as a solution if it reaches a generalization score of 200 or more.
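
The 625 initial conditions are the Cartesian product of Ω over the four states (5⁴ = 625). A hypothetical sketch of their generation, where the variable ranges are placeholders for illustration (the actual ranges used by the benchmark are not given here):

```go
package main

import "fmt"

// initialConditions builds the generalization-test start states: each of
// the four states (x, dx/dt, theta1, dtheta1/dt) takes every value of
// omega scaled to that variable's range, giving 5^4 = 625 combinations.
// theta2 and dtheta2/dt are set to zero and omitted from the state here.
func initialConditions(ranges [4][2]float64) [][4]float64 {
	omega := []float64{0.05, 0.25, 0.5, 0.75, 0.95}
	var states [][4]float64
	for _, a := range omega {
		for _, b := range omega {
			for _, c := range omega {
				for _, d := range omega {
					fractions := [4]float64{a, b, c, d}
					var s [4]float64
					for i, f := range fractions {
						lo, hi := ranges[i][0], ranges[i][1]
						s[i] = lo + f*(hi-lo) // scale omega to the variable's range
					}
					states = append(states, s)
				}
			}
		}
	}
	return states
}

func main() {
	// Placeholder ranges for x, dx/dt, theta1, dtheta1/dt.
	ranges := [4][2]float64{{-2.4, 2.4}, {-1, 1}, {-0.6, 0.6}, {-1, 1}}
	fmt.Println(len(initialConditions(ranges))) // 625
}
```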

To run the experiment execute the following command:
```bash
cd $GOPATH/src/github.com/yaricom/goNEAT
go run executor.go -out ./out/pole2_non-markov -context ./data/pole2_non-markov.neat -genome ./data/pole2_non-markov_startgenes -experiment cart_2pole_non-markov
```

This will execute 10 trials of the double pole-balancing Non-Markovian experiment, each within 100 generations and over a population
of 1’000 organisms.

The example output of the command:

```
Average
	Winner Nodes:	5.5
	Winner Genes:	11.5
	Winner Evals:	44049.3
Mean
	Complexity:	13.9
	Diversity:	611.1
	Age:		18.9
```

The maximal generalization score achieved in this test run is about 347, for a very simple genome comprising five nodes
and five genes (links). It has the same number of nodes as the seed genome and only grew one extra recurrent gene connecting the
output node to itself. And this happened to be the most useful configuration among the others.

The most fit organism's genome from the non-Markovian test run, with the maximal generalization score of 347:

```
genomestart 46
trait 1 0.2138946217467174 0.06002354203083418 0.0028906105813590443 0.2713154623271218 0.35856547551861573 0.15170654613596346 0.14290235997205494 0.28328098202348406
node 1 1 1 1
node 2 1 1 1
node 3 1 1 1
node 4 1 1 3
node 5 1 0 2
gene 1 1 5 -0.8193796421450023 false 1 -0.8193796421450023 true
gene 1 2 5 -6.591284446514077 false 2 -6.591284446514077 true
gene 1 3 5 5.783520492164443 false 3 5.783520492164443 true
gene 1 4 5 0.6165420336465628 false 4 0.6165420336465628 true
gene 1 5 5 -1.240555929293543 true 50 -1.240555929293543 true
genomeend 46
```

## Conclusion

The experiments described in this work confirm that the implemented NEAT method is able to evolve new structures in ANNs (XOR
experiment) and is able to solve reinforcement learning tasks under conditions of incomplete knowledge (pole-balancing).

## Credits

* The original C++ NEAT implementation was created by Kenneth Stanley, see: [NEAT][1]
* Other NEAT implementations may be found in the [NEAT Software Catalog][2]

This source code is maintained and managed by [Iaroslav Omelianenko][3]


[1]:http://www.cs.ucf.edu/~kstanley/neat.html
[2]:http://eplex.cs.ucf.edu/neat_software/
[3]:https://io42.space

[single_pole-balancing_scheme]: https://github.com/yaricom/goNEAT/blob/master/contents/single_pole-balancing.jpg "The single pole-balancing experimental setup"
[double_pole-balancing_scheme]: https://github.com/yaricom/goNEAT/blob/master/contents/double_pole-balancing.png "The double pole-balancing experimental setup"