Exactly how intelligent would you say humans are? Using Legg and Hutter’s definition of intelligence as the ability to achieve a wide range of goals in many environments, it is safe to say that humans are highly intelligent compared to all other observed goal-seeking systems. In fact, it is extremely difficult to identify goals they cannot pursue within the framework of our universe. However, unreachable goals do 'exist', as identified by Gödel's incompleteness theorems and Turing's halting problem. Those formal proofs established the unprovability, undecidability, and intractability of problems that had occupied mathematicians for decades and centuries, and their acceptance, eager or reluctant, refocused the ambitions of the mathematical and computational research that followed.
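For reference, Legg and Hutter make that definition precise with a universal intelligence measure: the agent pi's expected performance across all computable environments, weighted toward simpler ones, where K(mu) is the Kolmogorov complexity of environment mu and V is the agent's expected total reward in it. A compact statement of it:

```latex
% Legg and Hutter's universal intelligence measure
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
```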

I see certain fields of AI today reaching for a similarly intangible goal: human-level artificial intelligence. You see, ‘human-level’ AI research often begins with the speculation: “I see that human intelligence uniquely does this or that, so if I make an AI system with those features, it must be ‘human-level’ artificial intelligence.” Consider three examples of this thinking in action:

While these theses provide concrete measures that are useful feedback signals for comparing AI systems, I find them insufficient to fully define the meaning of ‘human-level’ intelligence. It is now common to read statements from GPT-3 and other large language models that pass our subjective ‘Turing test’. The employment test is probably the most general of the above three measures of human-level AI, since it intrinsically demands few-shot, on-the-job learning, but it is a slow feedback signal, better suited as an auxiliary development metric than a primary optimization objective. Finally, the language acquisition test is trivially solved by merely reducing the complexity of the action-value function Q. Learned optimizers have already been shown capable of optimizing themselves (“Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves” and “Training Learned Optimizers with Randomly Initialized Learned Optimizers”), and in the language of reinforcement learning, you might say they update their own action-value function. However, I would not suggest that those optimizers represent categorically ‘human-level’ artificial intelligence.
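To make the self-optimization claim concrete, here is a minimal, hypothetical sketch of the idea rather than the architecture or training setup from the cited papers: a toy learned optimizer (two parameters, a log learning rate and a momentum coefficient) is meta-trained with a simple evolution-strategies gradient estimate, and that meta-update is applied by the same learned optimizer, so it is, in a loose sense, training itself.

```python
import numpy as np

# Toy "learned optimizer": maps a gradient to a parameter update using its own
# parameters theta = (log_lr, beta). Hypothetical and minimal; not the
# architecture from the papers cited above.
def learned_update(theta, grad, state):
    log_lr, beta = theta
    state = beta * state + grad              # momentum-style accumulator
    return -np.exp(log_lr) * state, state    # proposed update, new state

def inner_train(theta, steps=20):
    """Train a toy model with the learned optimizer; return its final loss."""
    w, state = np.zeros(1), np.zeros(1)
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)               # gradient of (w - 3)^2
        update, state = learned_update(theta, grad, state)
        w = w + update
    return float(np.sum((w - 3.0) ** 2))     # meta-loss: final inner loss

def meta_step(theta, meta_state, sigma=0.05, n=32):
    """One evolution-strategies meta-update of theta, applied by theta itself."""
    eps = np.random.randn(n, theta.size) * sigma
    losses = np.array([inner_train(theta + e) for e in eps])
    meta_grad = (eps * (losses - losses.mean())[:, None]).mean(0) / sigma**2
    update, meta_state = learned_update(theta, meta_grad, meta_state)
    return theta + update, meta_state

theta = np.array([np.log(0.05), 0.5])        # initial (log_lr, beta)
meta_state = np.zeros_like(theta)
for _ in range(200):
    theta, meta_state = meta_step(theta, meta_state)   # the optimizer updates itself
print("meta-trained params:", theta, "inner loss:", inner_train(theta))
```

The cited papers use far richer optimizer architectures and thousands of inner tasks; the point here is only the feedback structure, in which the optimizer's own update rule consumes its own meta-gradients.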

Those are just three examples where pivotal components of ‘human-level’ intelligence are idolized as if they defined it. Even the brain as a whole is often exalted too high above its underlying physiological interface to the world and its history of environmental interaction. These other two players guide the development of the brain and the intelligence it expresses. Reciprocally, the brain acts as a forcing function, maintaining order over its body and interacting with its environment. The body and environment play key roles in grounding the brain’s internal oscillations into metaphors of cognition, and without all three, there is no human-level intelligence. Individuals raised in stimulation-poor environments, or those with underlying physiological limitations, show statistically diminished potential for cultivating intelligence.

There are many other examples where an average AI researcher’s prior on “human-level” artificial intelligence diverges from the real deal, and I hypothesize that those seemingly sparse cases are actually uncountable. As with physics’ models of the universe, the brain has been subjected to numerous comparisons over the ages: the oscillator, the clock, the steam engine, the formal proof machine, the computer, and even the neural network. However, neuroscience continually reminds us that the brain is something else, and while we AI researchers may surpass it on various complexity, accuracy, and recall metrics, we are still not capturing its ‘human-level’ intelligence.

These thoughts are not new, and even researchers in the field of human-level artificial intelligence acknowledge them. They may justify their use of the term ‘human-level’ as a means to communicate their objective to non-specialists, including review boards, funding agencies, and executives. However, I argue that this is where the term is abused at its worst. We scientists recognize the nuances and history underlying our terminology, so we can often afford to use vague, ambiguous terms like thinking, attention, and perception to describe the activity occurring in a brain or an artificial neural network. I know what you’re reaching for when you say “I’m developing a human-level AI system”. However, when that term is taken out of context, a person may be introduced to human-level AI with the idea that it can do everything a human can. Of course the realistic engineer expects to find flaws in their artificial system and eagerly looks for them, but when a funding agency, review board, or the general public is surprised by those same discrepancies, they are not amused. Funding for everybody gets cut; public interest declines; and the anticipation built up for ‘human-level’ artificial intelligence has been abused. If you want to help AI continue to evolve and avoid a third AI winter, don't use the buzzword "human-level" to describe your artificial intelligence.

I see two paradigms driving the advancement of artificial intelligence. The first uses natural language, philosophy, and logic to describe and reason about intelligence. It says, "intelligence involves a collection of discrete processes like attention, perception, memory, etc." The second paradigm also acknowledges that biological systems rely on a heterogeneous set of mechanisms to achieve their goals, but it does not express them in natural language. Instead, it uses formal descriptions, statistical tools, and algorithms to describe intelligence. Both involve programming and experimentation. However, in the first paradigm the flow of information from experimental results to the next iteration's parameters must pass through a human mind, while in the second it can be optimized seamlessly on a computer, as in the sketch below.
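As a minimal illustration of that second, closed loop, assume any experiment whose outcome can be scored numerically; the noisy quadratic below is a hypothetical stand-in. The results feed directly back into the next iteration's parameters with no human interpretation in between.

```python
import numpy as np

def run_experiment(params):
    """Stand-in for any measurable experiment; here, a noisy quadratic score."""
    return -np.sum((params - 1.5) ** 2) + np.random.normal(scale=0.01)

# Closed loop: result -> next parameters, entirely on the computer.
params = np.zeros(3)
best_score = run_experiment(params)
for step in range(1000):
    candidate = params + np.random.normal(scale=0.1, size=params.shape)
    score = run_experiment(candidate)          # experimental result
    if score > best_score:                     # feedback selects the next iterate
        params, best_score = candidate, score  # no human in the loop
print(params)                                  # approaches the optimum near [1.5, 1.5, 1.5]
```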

Admittedly, those are extremes, and machine learning research usually sits somewhere in the middle. However, as we develop increasingly advanced artificial systems, it becomes necessary to acquire and use a more mathematically oriented framework of intelligence, rather than reasoning from philosophically defined axioms. When the time comes to program a cognitive architecture, there are no universal off-the-shelf `perceive(observation)`, `decide(thoughts)`, or `attention` functions. However, shuttling mathematical and statistical statements between the computer and the brain is a much more straightforward task, and it is easy to find or define precise building blocks like `mutual_information`, `entropy`, or `powerlaw_coef`.
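For instance, a few of those building blocks can be written down in a handful of lines. This is a minimal sketch; the function names mirror the identifiers above, and the estimators are the simplest plug-in versions, not the only choices.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy H(X) in bits of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    return -np.sum(p * np.log2(p + eps))

def mutual_information(joint, eps=1e-12):
    """I(X;Y) in bits from a joint probability table p(x, y)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    return np.sum(joint * (np.log2(joint + eps) - np.log2(px * py + eps)))

def powerlaw_coef(x, y):
    """Exponent a of a power-law fit y ~ c * x**a (least squares in log-log space)."""
    a, _ = np.polyfit(np.log(x), np.log(y), 1)
    return a

print(entropy([0.5, 0.5]))                                       # fair coin: ~1.0 bit
print(mutual_information(np.outer([0.5, 0.5], [0.25, 0.75])))    # independent: ~0.0
print(powerlaw_coef(np.arange(1.0, 100.0), np.arange(1.0, 100.0) ** -2.0))  # ~ -2.0
```

Contrast that with trying to write a universally agreed-upon `attention` or `perceive(observation)` function.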

It should be clear to AI researchers who sit closer to the former extreme that advancing the intellectual capacity of artificial systems demands a basic understanding of the mathematical and statistical tools used to represent real intelligence, not mere contentment with applying one or two of them in one's research. Just as the brain maintains an ensemble of predictive models, I encourage the actively learning researcher to study many principles of neuronal structure and function. Please consider starting with the following (listed in the order you may find easiest to understand):