Life (and neural networks) needs noise/chaos

  • Tags: #[[the complexity of this universe]]
  • This is how a system that learns challenges itself and its assumptions. Without chaos, there is only stasis.
  • Alex St John, Life Needs "Noise"
    • In a game world of traditional Newtonian physics calculations, time generally flows as a smooth continuum. When attempting to simulate complicated collision physics this way, one often gets annoying “trapped” states in which objects become wedged between or partially inside one another and end up stuck, oscillating forever. This problem manifests in another way when trying to simulate extremely complex physics systems: run your simulation for a while and it seems to collapse to some relatively stable, uninteresting state, punctuated by local oscillations. A famous example of this kind of modeling problem is **climate models** that always produce runaway climate conditions and apparently precarious “tipping points” (*extremely finely tuned parameters that cause the complex ecosystem to explode or collapse*). The problem with physics models constructed this way is that the only thing the researcher can know for certain about them is that they are wrong, but if you run enough of them with varying parameters, sometimes you can get a result that coincidentally appears predictive.
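    • To make the “trapped” state concrete, here is a toy of my own (not from the article), assuming a naive solver that resolves penetrations one at a time: a ball wedged exactly on the line between two overlapping circles gets pushed out of one circle straight into the other, forever; a tiny random jitter breaks the symmetry and lets it pop free.
```python
import math
import random

# Two large circular obstacles, radius 1.0, overlapping in the middle.
CIRCLES = [(-0.9, 0.0), (0.9, 0.0)]
R_SUM = 1.0 + 0.15  # obstacle radius + ball radius

def resolve(ball, jitter=0.0, max_iters=200):
    """Sequentially push the ball out of any circle it penetrates.
    Returns the number of passes taken, or None if still stuck."""
    x, y = ball
    for i in range(max_iters):
        stuck = False
        for cx, cy in CIRCLES:
            dx, dy = x - cx, y - cy
            d = math.hypot(dx, dy)
            if d < R_SUM:  # penetrating: push out along the center line
                x = cx + dx / d * R_SUM
                y = cy + dy / d * R_SUM
                stuck = True
        if not stuck:
            return i
        # the "noise": a tiny random displacement after each solver pass
        x += random.uniform(-jitter, jitter)
        y += random.uniform(-jitter, jitter)
    return None  # oscillated until we gave up

random.seed(1)
print("no jitter:  ", resolve((0.0, 0.0), jitter=0.0))   # None: trapped forever
print("with jitter:", resolve((0.0, 0.0), jitter=1e-3))  # escapes in a few passes
```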
    • Trivia question: What is the difference between a climate model and a climate simulation?
    • Answer: A climate model does not attempt to extrapolate real-world data into a future forecast. A simulation actually attempts to predict a real-world future scenario from input data.
    • __We don’t have the science, data or computing power to simulate our climate__
    • Another example of a relatively famous modeling problem that I have long believed to be related to this issue is the apparent computational intractability of protein folding. Although proteins fold consistently into the same forms under controlled conditions at incredible speed, it can take many weeks of computational time to accurately fold even the simplest molecules. The most recent progress in the area of computational protein folding was achieved through a change in approach that apparently took “noise” into consideration:
    • image
    • “The researchers employed two novel strategies, continuously variable temperature and single-copy simulation.”
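    • The article doesn’t spell out the researchers’ actual method. As a loose illustration of why a continuously variable temperature helps, here is a minimal Metropolis-style sketch on a toy rugged energy landscape (the landscape and every parameter below are my own assumptions): at a fixed low temperature the walker freezes in a nearby local minimum, while a temperature that keeps cycling up and down shakes it loose.
```python
import math
import random

def energy(x):
    # toy "rugged landscape": a global basin studded with local minima
    return x * x + 3.0 * math.sin(5.0 * x)

def metropolis(temperature, steps=20_000, seed=0):
    """Random-walk Metropolis search; temperature(t) may vary with step t."""
    rng = random.Random(seed)
    x = best = 4.0
    for t in range(steps):
        T = temperature(t)
        x_new = x + rng.gauss(0.0, 0.3)
        dE = energy(x_new) - energy(x)
        if dE < 0 or rng.random() < math.exp(-dE / T):
            x = x_new
        if energy(x) < energy(best):
            best = x
    return best

constant = lambda t: 0.05                                            # always cold: freezes
variable = lambda t: 0.05 + 2.0 * (0.5 + 0.5 * math.cos(t / 500.0))  # keeps cycling

for name, temp in [("constant T", constant), ("variable T", variable)]:
    found = [metropolis(temp, seed=s) for s in range(10)]
    print(name, "best energies found:", sorted(round(energy(x), 2) for x in found)[:3])
```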
    • I believe the problem lies in the observation that classical mathematics generally produces nice continuous functions in the time domain. The Universe we live in isn’t actually noise free; in fact, the math of quantum mechanics describes a constantly boiling foam of energy at the quantum level. The same is true of thermodynamics at the larger molecular level. The temperature is never actually constant; rather, it is chaotically fluctuating, jostling molecules to react faster or slower and causing them to interact in ways they wouldn’t under constant conditions. Because the math required to simulate this environment accurately can be incalculably complex, it is almost always sacrificed for more simplified continuous calculations… that produce the wrong result.
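    • One concrete way to see how averaging away the fluctuations produces the wrong result: chemical reaction rates depend exponentially on temperature (the Arrhenius law), so the mean rate at a fluctuating temperature is not the rate at the mean temperature. A short sketch (the numbers are merely illustrative):
```python
import math
import random

# Arrhenius law: k(T) = A * exp(-Ea / (R * T)). Because k is strongly convex
# in T at ordinary temperatures, the average rate under a fluctuating
# temperature exceeds the rate at the average temperature, so a
# "constant temperature" simplification systematically under-predicts.

A = 1.0
EA_OVER_R = 6000.0  # activation energy / gas constant, in kelvin

def rate(T):
    return A * math.exp(-EA_OVER_R / T)

random.seed(0)
T_MEAN, T_JITTER = 300.0, 15.0
samples = [T_MEAN + random.uniform(-T_JITTER, T_JITTER) for _ in range(100_000)]

k_constant = rate(T_MEAN)
k_fluctuating = sum(rate(T) for T in samples) / len(samples)

print(f"rate at a constant 300 K:    {k_constant:.3e}")
print(f"mean rate at 300 K +/- 15 K: {k_fluctuating:.3e}")
print(f"under-prediction factor:     {k_fluctuating / k_constant:.2f}x")
```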
    • Terms like “noise” and “random” can be dangerous to throw around casually without better qualification. When we talk about adding “noise” in computational simulations, we aren’t talking about adding truly random data; often what we’re really dealing with is highly structured data that is too hard to compute efficiently, or a pattern that we simply don’t know how to predict or simulate. Here are a few interesting examples of everyday situations in which “structured noise” is necessary to solve a problem:
    • Change Sorting Machines
    • image
    • Change Sorting Machine
    • Imagine having to build a simple, inexpensive machine to sort and count change. Obviously it’s easy to imagine building a device with cameras programmed to visually identify each kind of coin and little robotic hands designed to pick them out of a heap of coins and drop them in the correct buckets, but then we’d already be talking about an engineering feat nearly beyond our present means. Clearly there is an easier way: most coin sorting machines rely on a centrifuge in which coins are whirled around, colliding with one another, until they effectively sort themselves into slots that correspond to the correct coin dimensions and mass. What’s fascinating about coin sorters is that they are performing a kind of computation with chaos. If the centrifuge spun too slowly, the coins wouldn’t collide with enough energy to sort themselves. If the centrifuge spun too fast, the coins would be pinned against the walls, immobile. Just the right velocity is required to achieve the optimal turbulence in the coin sorter for the device to work. If the “random” collisions between coins did not obey very specific Newtonian laws, the coins would not sort correctly. In other words, although “noise” is required for the device to work, the noise must have very specific properties to be the correct noise to solve the sorting problem.
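    • A toy model of that Goldilocks agitation level (entirely my own construction; the exponential energy distribution and both thresholds are assumptions, not real coin-sorter physics): suppose each collision gives a coin kinetic energy drawn from an exponential distribution whose mean is the agitation level, and a coin makes sorting progress only when that energy is above the “unclump” threshold but below the “pinned to the wall” threshold. Sorting throughput then peaks at an intermediate amount of noise.
```python
import math

E_STICK, E_PIN = 1.0, 8.0  # assumed thresholds: break clumping vs. get pinned

def progress_probability(agitation):
    # P(E_STICK < E < E_PIN) for E ~ Exponential(mean=agitation)
    return math.exp(-E_STICK / agitation) - math.exp(-E_PIN / agitation)

for a in [0.2, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 64.0]:
    p = progress_probability(a)
    print(f"agitation {a:5.1f}: P(sorting progress) = {p:.3f} {'#' * int(50 * p)}")
```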
    • Color Image Screening
    • In my youth I worked on publishing technologies for high resolution printing and was an expert in color science and image screening. In that era a four-color process was needed to print high quality color images. The science of how each pass of color was arranged in tiny rosette patterns to produce a high quality image was extremely precise and elaborate.
    • image
      image
    • An endlessly challenging problem at the time was that getting everything exactly right usually produced the wrong result: a high resolution color image featuring a prominent moiré pattern your eye could not help but notice across the finished page. Simply increasing the resolution of the printer or the screen pattern didn’t make the problem go away. No matter how tiny the rosettes were, across a large image at high resolution they would always produce obvious moiré patterns. It was quickly recognized that the addition of noise to the rosette structure could remedy the problem, but simply adding random noise to the rosettes made the image blurry, hardly a desirable outcome for high quality printing. I spent many months working on algorithms that would produce noise functions that preserved image detail locally, eliminated the moiré globally, and were computable in real time on the processors of that era.
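    • The moiré is a beat between the screen’s period and the sampling grid, and noise trades that coherent low-frequency error for incoherent local error (which is exactly why naive random noise reads as blur, and why the hard part was structuring it). A 1D toy of the first half of that trade-off (my construction, not the algorithms I worked on): halftone a constant 40% gray with a screen whose period slightly mismatches the pixel grid, and measure how far local density strays from the true gray.
```python
import random

N, GRAY, WINDOW = 4000, 0.4, 100

def worst_local_error(thresholds):
    """Worst deviation of windowed dot density from the true gray level."""
    pixels = [1 if GRAY > t else 0 for t in thresholds]
    means = (sum(pixels[i:i + WINDOW]) / WINDOW for i in range(N - WINDOW))
    return max(abs(m - GRAY) for m in means)

random.seed(0)
ordered = [(i * 0.499) % 1.0 for i in range(N)]  # screen period ~2.004 pixels
noisy   = [random.random() for _ in range(N)]    # white-noise thresholds

# The ordered screen's density swings wildly at a low "beat" frequency the
# eye can't miss; the noisy screen stays near 40% everywhere (at the cost
# of incoherent local error -- the blur the article mentions).
print("ordered screen, worst local density error:", worst_local_error(ordered))
print("noisy screen,   worst local density error:", worst_local_error(noisy))
```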
    • Neural Networks
    • This one is a little more esoteric but all the more important for its revealing nature. It has been found that **neural networks perform better when noise is added to them**. Basically, like most models of complex systems, neural networks stabilize and become “unimaginative” without a little noise to give them “new ideas”. Properly structured noise has also been found to improve the speed at which neural networks converge on solutions to specific problems.
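    • The article gives no specifics, but the underlying mechanism is easy to show in miniature: noise lets an optimizer escape a stable-but-mediocre solution. Below, plain gradient descent on a two-well loss freezes in whichever well it starts in, while the same descent with Gaussian noise injected into each step (Langevin-style) hops the barrier into the deeper well. The loss function and all parameters are my inventions, not a real network.
```python
import math
import random

def loss(x):
    return (x * x - 1.0) ** 2 + 0.3 * x       # the well near x = -1 is deeper

def grad(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(noise, steps=50_000, lr=0.01, seed=42):
    rng = random.Random(seed)
    x = best = 1.0                             # start in the shallow well
    for _ in range(steps):
        x = x - lr * grad(x) + noise * rng.gauss(0.0, 1.0)
        if loss(x) < loss(best):
            best = x
    return best

for label, noise in [("plain GD", 0.0), ("noisy GD", 0.05)]:
    b = descend(noise)
    print(f"{label}: best solution x = {b:+.3f}, loss = {loss(b):+.3f}")
```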
    • Although fanciful, it is interesting to make the leap to the observation that neural “noise” may also be the source of human creativity and inspiration.
    • Conway Life Needs Noise
    • For the record, I don’t spend my time tinkering with Conway’s famous Game of Life. As much fun as that can be, I have found another, more practical use for Conway’s algorithm. When trying to simulate extremely complex large-scale systems, it is often very difficult to debug them, because when they are working correctly they are also not behaving predictably. I often use Conway’s Life to test my data structures because I’m familiar enough with what a correct Conway Life simulation should look like to tell whether the particular data structure I’m working with is behaving correctly on a large scale. Of course, I’ve learned a lot over the years from the many unfortunate situations in which Conway’s Life has shown that my structure was not working correctly.
    • Conway’s Game of Life is generally known for the enormous richness of life-like patterns it produces when “cells” living on a grid adhere to four simple rules:
    • 1. *Any live cell with fewer than two live neighbors dies, as if caused by under-population.*
    • 2. *Any live cell with two or three live neighbors lives on to the next generation.*
    • 3. *Any live cell with more than three live neighbors dies, as if by overcrowding.*
    • 4. *Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.*
    • With a few famously interesting exceptions, large simulations of Conway’s game eventually collapse into boring, stable systems. There is, however, a 5th unspoken rule to Life that is seldom discussed yet is critical to the algorithm’s success.
    • 5. *All rules are applied to the next generation of cells at the same time.*
    • In other words, all rules must be computed at the same rate. If the rate at which each rule is computed differs, Conway’s Life becomes a very different game. Specifically, if the rate at which rules compute varies randomly within a very narrow range, then Conway’s Life never stabilizes. It should be noted that the same kind of result occurs if 3% of the cells are allowed to randomly break the rules in any given turn. Too much noise and the system exhibits no life-like behavior; too little noise and it eventually stabilizes.
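    • A sketch of that experiment as I read it, in Python with numpy. I approximate “random variation in the rate at which rules compute” as a small per-cell chance, each generation, of failing to update (keeping the previous state); that reading is one of several consistent with the article. A stabilized board’s population barely changes from generation to generation, while the noisy board keeps churning.
```python
import numpy as np

def life_step(grid, skip_fraction=0.0, rng=None):
    """One generation of Conway's Life on a toroidal grid. With
    skip_fraction > 0, that share of cells (chosen at random every
    generation) fails to update this turn and keeps its old state --
    a crude stand-in for rules computing at slightly different rates."""
    neighbors = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
    new = ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(np.uint8)
    if skip_fraction > 0.0:
        lag = rng.random(grid.shape) < skip_fraction
        new[lag] = grid[lag]  # laggards keep last generation's state
    return new

rng = np.random.default_rng(0)
seed_grid = rng.integers(0, 2, size=(200, 200), dtype=np.uint8)  # random soup

for label, frac in [("lockstep rules", 0.0), ("0.3% laggards ", 0.003)]:
    g, pops = seed_grid.copy(), []
    for _ in range(3000):
        g = life_step(g, frac, rng)
        pops.append(int(g.sum()))
    print(f"{label}: population range over last 100 generations: "
          f"{min(pops[-100:])}..{max(pops[-100:])}")
```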
    • image
    • This 1,000,000 cell Life simulation stabilizes after ~6600 generations starting from a symmetrical seed pattern. Not surprisingly, the resulting pattern is also symmetrical.
    • image
    • This 1,000,000 cell Life simulation starting from the same symmetrical pattern never stabilizes, even after millions of generations. The difference is a 0.3% random variation in the rate at which rules compute. The resulting pattern is, not surprisingly, asymmetrical. Even Conway Life needs a little noise to “live.”
    • Stephen Wolfram (creator of Mathematica) concluded in his book *A New Kind of Science* that “simultaneity” was not essential to producing emergent complexity from simple rules. Although the observation is effectively true, it does not mean that the flow of time in a cellular automaton can be fully ignored, because cellular automata inherently assume that all rules are applied simultaneously. For academic purposes this is not a big deal, because the definition of how much computation can occur in a unit of time is implicit in the rules. When you are trying to harness cellular automaton techniques to simulate real biological systems, it quickly becomes evident that your automaton needs to include a concept of computational rate per rule, because rules based on real physics don’t conveniently all compute at the same rate in the real world. It also means that you need a model for time that is independent of your computing architecture, or your simulation will produce different results on different computer configurations.
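    • One practical way to get a model for time that is independent of the computing architecture is to make every random decision a pure function of (cell, generation) using a counter-based hash, so any machine, thread count, or cell-evaluation order reproduces the identical “noise.” A small sketch (the mixing constants come from published integer hashes; the rest is my assumption):
```python
def cell_noise(x, y, generation, salt=0x9E3779B9):
    """Deterministic per-cell, per-generation 'random' value in [0, 1).
    Because the value is a pure function of simulation coordinates, the
    same seed pattern produces bit-identical runs on any hardware, with
    any number of threads, in any cell-update order."""
    h = (x * 73856093) ^ (y * 19349663) ^ (generation * 83492791) ^ salt
    h &= 0xFFFFFFFF
    # avalanche rounds (constants from a published 32-bit integer hash)
    h ^= h >> 16; h = (h * 0x7FEB352D) & 0xFFFFFFFF
    h ^= h >> 15; h = (h * 0x846CA68B) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 2**32

# e.g. deciding, reproducibly, whether cell (17, 42) lags on generation 1000:
print(cell_noise(17, 42, 1000), cell_noise(17, 42, 1000) < 0.003)
```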
    • When we think about how living organisms like ourselves form naturally, we know that cells growing and replicating on opposite ends of the human body are not following a rigid blueprint, nor are they in intimate real-time communication with one another. The mechanism of development for living systems must, by its nature, allow cells to follow a relatively loose set of rules or guidelines that will produce a working organism from a wide range of possible solutions, all functional but not identical at a microscopic scale. Hence, two identical twins can appear indistinguishable from a distance but have remarkably different internal structures and personalities. For this to happen, it is likely that our biology is not only able to adapt to many tiny variations in temperature that vary the rate (and probability) of chemical interactions seemingly randomly; like the change sorting machine, it may in fact RELY on these fluctuations to compute at all (*actually I’m certain it does, but I’ll make that case in a subsequent article*). It is thus little wonder that the mysteries of life are so difficult to unravel. Without extremely powerful computers and a subtle understanding of the relationship of structured noise to organic processing, the actual mechanism of biological computation is hidden from us. The computers we use and understand work almost exactly the opposite way of organic systems, requiring the elimination of all possible noise in order to function correctly. The paradox is that once a computer has been “purged” of errant noise in order to be comprehensible and programmable by humans, it becomes extremely difficult to add the noise back correctly in order to use these computers to model organic systems like ourselves. In this respect, the task of using a modern computer to model life is not unlike trying to cut bread with a hammer.