Building an artificial general intelligence: The current paradigm of AI
(Editor's note: This is the third installment of a five-part series on building artificial intelligence.)
As the manipulation of social media in the 2016 U.S. election shows, it is sometimes difficult to tell the difference between human and artificial intelligence (AI) behavior. In fact, at least since the original ELIZA application, a natural language processing computer program demonstrated by Joseph Weizenbaum in 1964, some people have ascribed human-like qualities and capabilities to computer applications.
By today’s standards, ELIZA was a very simple program that would hardly qualify as AI. ELIZA replicated conversations by using a simple set of rules, pattern matching, and word substitution. It had no information about the meaning of specific words; instead, it relied on matching word patterns in the input and producing a scripted response. The most famous version of ELIZA simulated a non-directive Rogerian psychotherapist, a role that sidestepped the problem that computers could not understand conversational language: a therapist of this kind can plausibly turn a patient’s statements back into questions without knowing what they mean.
Back then, users interacted with computers by typing because speech recognition was not yet available. The “patient” might begin by typing “Hello ELIZA” and the computer might respond with “What brings you here to talk to me?” The patient then might respond, “I’m feeling a little depressed,” and the computer would reply with something like “Why do you think that you are feeling depressed?”
The code used to run ELIZA is so simple that it is easy to see that there is no deep intelligence in it. In essence, it finds keywords and repeats them, as in the example above. Nevertheless, when Weizenbaum’s secretary wanted to use the program, she made Weizenbaum leave the room so that she could talk privately to her “therapist.” Many people today make similar assumptions when they attribute personality to Siri or Alexa.
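The keyword-and-substitution approach described above can be sketched in a few lines of Python. This is only an illustration: the rules, the default reply, and the function name respond are hypothetical and are not taken from Weizenbaum’s original script.

```python
import re

# Hypothetical rules for illustration; not Weizenbaum's original script.
# Each rule pairs a pattern to look for with a response template.
RULES = [
    (re.compile(r"\bhello\b", re.IGNORECASE),
     "What brings you here to talk to me?"),
    (re.compile(r"\bi(?:'m| am) feeling (?:a little |very )?(\w+)", re.IGNORECASE),
     "Why do you think that you are feeling {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE),
     "Tell me more about your {0}."),
]

# Used when no rule matches (also a hypothetical choice).
DEFAULT_REPLY = "Please go on."


def respond(text: str) -> str:
    """Return the reply for the first rule whose pattern matches the input."""
    # Normalize the curly apostrophe so both apostrophe styles match.
    text = text.replace("\u2019", "'")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Word substitution: echo the captured keyword back to the user.
            return template.format(*match.groups())
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["Hello ELIZA", "I'm feeling a little depressed"]:
        print(respond(line))
```

Nothing in the sketch knows what “depressed” means; the word is simply captured and pasted into a template, which is the point of the example.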
People generally find it easy to attribute more intelligence to an AI program than it actually deserves. Alan Turing famously proposed that the ability to hold a conversation was good evidence of intelligence at work, but it turns out that, unless they are specifically trying to distinguish a machine from a human conversational partner, most people tend to give AI the benefit of the doubt. Even computer scientists, who have more direct access to the inner workings of AI agents, tend to over-interpret the intelligence of AI.
At about the same time that ELIZA was being written, mathematician I.J. Good speculated that, at some point in the future, we would be able to create a machine that could surpass the intellectual activities of any human, and among its powers would be the ability to design even better machines. Although such an achievement might be possible someday, today’s approach to AI via machine learning does not offer any mechanism that could deliver that capability.
In the current paradigm of machine learning, advancements in a machine’s capabilities depend on the genius of human developers to find more efficient methods and appropriate representations of the problems that the machine is intended to solve. The intelligence of artificial intelligence is due to its designers finding ways to reduce complex intellectual activities to simpler processes that are more easily implemented on a machine.
That reduction is not an alternative to intelligence, as many, including computer scientist John McCarthy, have complained—rather, it is the mechanism of intelligence. It is the same mechanism that people have been using for the last 50,000 years to bring us from primitive cave painting to today’s technology. Clever people have designed clever methods and clever representations to make their own complex intellectual tasks easier. Some of these can be implemented in a machine, but they are not qualitatively different from the ones that have been implemented in people’s brains and behaviors.
The actual mechanisms of machine learning and AI are very simple. Pedro Domingos, a professor of computer science and engineering at the University of Washington, divides machine learning into three parts: a representation, an evaluation process and an optimization process.
The representation specifies the structure of the solution: how the input and output data are cast into numeric form and the kind of machine learning that will be used. When we structure a chess game or a game of Go as a tree, that is part of the representation.
The evaluation function describes how machine learning will be judged. Typically, it specifies an objective for the machine learning to achieve and an estimate of how close or far the system is from achieving that objective. Lastly, the optimization process specifies how the current state of the machine learning system is to be modified to reduce the discrepancy between the current state of the system and the objective. Learning is the application of the optimization process to the model representation to achieve the objective.
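As a rough illustration of these three parts, the sketch below fits a straight line to a handful of points. The choices here are assumptions made for the example, not part of Domingos’s description: the representation is a line with two parameters, the evaluation is mean squared error, and the optimization is plain gradient descent. The data, learning rate, and step count are made up.

```python
# Toy data (made up for illustration): points on the line y = 2x + 1.
DATA = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def predict(params, x):
    """Representation: the model is a straight line, y = w * x + b."""
    w, b = params
    return w * x + b


def loss(params, data):
    """Evaluation: mean squared error between predictions and targets."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


def step(params, data, learning_rate=0.05):
    """Optimization: one gradient-descent update of the two parameters."""
    w, b = params
    n = len(data)
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (predict(params, x) - y) * x for x, y in data) / n
    grad_b = sum(2 * (predict(params, x) - y) for x, y in data) / n
    return (w - learning_rate * grad_w, b - learning_rate * grad_b)


def learn(data, steps=2000):
    """Learning: apply the optimization step repeatedly to the model."""
    params = (0.0, 0.0)
    for _ in range(steps):
        params = step(params, data)
    return params


if __name__ == "__main__":
    w, b = learn(DATA)
    print(f"w={w:.3f}, b={b:.3f}, loss={loss((w, b), DATA):.6f}")
```

The person writing the code chose the line, the error measure, and the update rule; the program only adjusts w and b.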
Another way of describing machine learning is as three groups of numbers. One group of numbers represents the inputs to the system (e.g., how the events or objects of concern are cast into numbers). The second group of numbers represents the outputs of the system, and the third represents the model parameters of the system. For example, the model parameters may be the connection weights of a neural network. In this view, learning consists exclusively of changing the model parameters. The structures of these three groups of numbers are fixed by the design of the learning system.
No current machine learning system can change that structure; it can only change the values of the parameters within that structure. Therefore, using the current paradigm, there is no way that a machine could design better machines that would design better machines. That is a “thought” that no current machine could even entertain, let alone execute. Designing better machines means designing better representations, and so far, we do not know how to build a machine that can do that.
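The three groups of numbers, and the fixed structure around them, can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration: the inputs, the targets, a two-weight linear unit as the model, and a simple error-correcting update. Training changes the values in the weights list but never its length.

```python
# Three groups of numbers; all values are made up for illustration.
INPUTS = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # objects cast into numbers
TARGETS = [0.0, 1.0, 1.0, 2.0]  # desired outputs (here, the sum of each input pair)
INITIAL_WEIGHTS = [0.0, 0.0]    # model parameters; the designer fixed their number


def output(weights, x):
    """Compute the system's output as a weighted sum of the inputs."""
    return sum(w * xi for w, xi in zip(weights, x))


def update(weights, inputs, targets, learning_rate=0.1):
    """Return new parameter values after one pass over the data."""
    new_weights = list(weights)
    for x, t in zip(inputs, targets):
        error = output(new_weights, x) - t
        # Nudge each weight against the error, in proportion to its input.
        new_weights = [w - learning_rate * error * xi
                       for w, xi in zip(new_weights, x)]
    return new_weights


def train(weights, inputs, targets, epochs=200):
    """Learning changes only the values of the weights, not their structure."""
    for _ in range(epochs):
        weights = update(weights, inputs, targets)
    return weights


if __name__ == "__main__":
    trained = train(INITIAL_WEIGHTS, INPUTS, TARGETS)
    print(len(INITIAL_WEIGHTS), len(trained), trained)
```

No step in train can add a third weight or swap the weighted sum for a different kind of model; that kind of change has to come from whoever edits the code.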
Machine learning produces artificial intelligence because some person has structured a potential solution in such a way that it can be achieved through optimization, that is, through changes in the values of the model parameters. The restriction of machine learning to changes in parameter values does not diminish the value or even the intelligence of current machine learning approaches. But it does mean that only problems that can be cast into an amenable form are susceptible to the approach of the current paradigm.
In the next part of this series, we will consider the different types of problems that may have to be addressed for general intelligence.