  1. Thinking on the Information Plane
    1. We think in theories; the universe runs on laws
      1. Laws describe what happens; theories try to explain why
        1. Examples of laws vs. theories
      2. Superiority of laws
        1. generalize
        2. extrapolate to novel circumstances
      3. Conservation laws are a recurring motif
        1. conservation of matter, momentum, charge, energy
        2. describe conservation of energy in terms of free energy
    2. Free energy minimization is an overarching, unifying theme of nature
      1. classical physics
      2. dynamic chemical equilibrium
      3. blood sugar
      4. neuronal free energy signaling hypothesis
      5. cost estimation (cognitive psychology)
      6. even intelligence
        1. Information theory applies energy minimization to probabilistic systems
          1. Introduce self information with a distribution
          2. Entropy, cross-entropy, KL divergence, mutual information
          3. Action and Perception as Divergence Minimization uses this framework to build intelligence
        2. Neuronal energy homeostasis theory
    3. Don’t rely on heuristics to learn intelligence. Use the framework of information theory
      1. Some heuristics are useful, yet they are imperfect and may deceive us; they break down under
        1. Weight regularization
        2. Batch/layer/group/instance normalization
        3. Backpropagation and
        4. more examples
        5. Graphs are a very flexible language - more than linear
      2. Information-theoretic solutions are superior
        1. Mutual-information regularization
        2. solutions to more examples
        3. We are still looking for a better language
    4. Pay attention to the plumbing of deep architectures and training
      1. how the information flows within the network
        1. Dense networks
        2. GRU/LSTM
        3. ResNet
        4. Inception
        5. Pyramid Point
        6. Transformers
        7. Graph Networks
        8. Neural Turing machine
      2. how information travels from the objective to the network
        1. supervised learning
        2. semi-supervised learning (fine-tuning)
          1. contrastive learning
        3. reinforcement learning
        4. corrupt feedback RL
  2. Intelligence by Generality
    1. We need metrics of intelligence to improve
      1. Intelligence is domain-specific skill-acquisition efficiency w.r.t. information (On the Measure of Intelligence)
      2. “our most significant measurement of progress is an agent's ability to achieve goals in a wide range of environments” (Using Unity to Help Solve Intelligence)
    2. The question is how general an agent's intelligence is
      1. Virtual particles
      2. Chemical equilibrium
      3. Biological adaptation
        1. Imagine DNA as living through creatures of its kind
      4. The immune system has a few experts, but is able to quickly train millions more
        1. Resting, it sits at a “generally diverse” critical point in state space
    3. Those examples also highlight superior convergence of “blind” optimizers
      1. Broaden-and-build / low motivational intensity cognitive and behavioral development
      2. Tight / loose social norms transition when necessary (hopefully)
      3. Homogeneous components become specialized
    4. Is it possible to build superhuman narrow AIs in continuous succession?
  3. Alignment is Possible by a Heterogeneous Blend of Methods
    1. Safety contains the criminal; security protects the president
      1. We cannot totally contain AI. Perception and action together are essential for learning
      2. We must prepare it for world integration
    2. Methods of AGI behavioral analysis and forecasting
      1. Human observation is deceptive
        1. Neural networks are prone to adversarial attack
        2. Qualitative functional analysis is probably not sufficient either
          1. Poor intuition for high dimensionality and numerical sensitivity
      2. XAI
        1. Research and list more methods
        2. Simulation
      3. Biological methods
        1. Function and structure
      4. Psychological methods
    3. Deterministic complex systems can effectively behave nondeterministically
      1. Averaging corrupts information flow
        1. The brain’s action evoked potential is really an average of phase shifts
        2. Corrupting the information mapping from stimulus to response improves security
      2. We cannot influence these systems without knowledge of internal state
        1. With hidden information, we shape behavior better
    4. We must exert energy to shape behavior during training
      1. The world’s distributions are dangerous
        1. word-embedding debiasing
        2. GPT-3 error
        3. more examples
    5. A mixture of methods will be necessary to train
      1. Open and closed optimizers
        1. Behavioral psychology: openness to experience; closed to destructive behavior
        2. Affective psychology: broaden and build / motivational intensity
        3. Tight / loose social norm enforcement
      2. Value shaping
        1. behavior follows values
    6. What values should we instill?
      1. justice
      2. more values
      3. integrity
        1. value consistency
        2. frame as energy minimization
    7. Democratic value selection
      1. Expert panel may not be able to adequately
      2. BlenderBot’s maximum certainty method may help identify suitable values
      3. Ideally, a general intelligence can learn to autonomously identify values
  4. Copy themes of human intelligence
    1. The human body is innately intelligent
      1. unsupervised homeostasis learning
      2. innate and adaptive immune system
      3. lower sensorimotor pathways reduce information load
        1. lower brain activation in mathematicians than in laypeople
    2. Neurons are genetically equipped for learning
    3. The connectome is predisposed to human-level (HL) intelligence and skills
    4. Development
    5. Emotions
    6. Language
    7. Social Organization
  5. Keep AGI Open
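
The information-theoretic quantities named in section 1 (self-information, entropy, cross-entropy, KL divergence, mutual information) can be made concrete with a small sketch. This is a minimal illustration for discrete distributions, measured in bits; the function names and the example distributions are assumptions chosen for illustration and do not come from the outline or from any particular library.

```python
import math

def self_information(p):
    """Surprise, in bits, of an outcome that has probability p (0 < p <= 1)."""
    return -math.log2(p)

def entropy(p):
    """Expected self-information of distribution p."""
    return sum(pi * self_information(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Expected surprise when outcomes follow p but we model them with q.

    Assumes q[i] > 0 wherever p[i] > 0; otherwise the value is infinite
    and math.log2 raises ValueError.
    """
    return sum(pi * self_information(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """Extra bits paid for modelling p with q: H(p, q) - H(p)."""
    return cross_entropy(p, q) - entropy(p)

def mutual_information(joint):
    """KL divergence between a joint table and the product of its marginals.

    `joint` is a list of rows; joint[i][j] = P(X = i, Y = j).
    """
    px = [sum(row) for row in joint]            # marginal over rows
    py = [sum(col) for col in zip(*joint)]      # marginal over columns
    flat_joint = [pxy for row in joint for pxy in row]
    flat_indep = [a * b for a in px for b in py]
    return kl_divergence(flat_joint, flat_indep)

# Hypothetical example distributions.
p = [0.5, 0.25, 0.25]
q = [1 / 3, 1 / 3, 1 / 3]
perfectly_coupled = [[0.5, 0.0], [0.0, 0.5]]    # Y always equals X
independent = [[0.25, 0.25], [0.25, 0.25]]      # Y tells nothing about X

print(entropy(p))                               # 1.5 bits
print(kl_divergence(p, q))                      # small positive number
print(mutual_information(perfectly_coupled))    # 1.0 bit
print(mutual_information(independent))          # 0.0 bits
```

Writing KL divergence as cross-entropy minus entropy, and mutual information as a KL divergence, shows how the four quantities build on self-information alone.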