On October 8, 2024, Geoffrey Hinton was awarded the Nobel Prize in Physics, jointly with John J. Hopfield, for foundational discoveries and inventions that enable machine learning with artificial neural networks. The prize is the most prominent recognition yet of Hinton’s decades of contributions to artificial intelligence. However, while his early work is widely known, his decision to leave Google in 2023 and the significant shift in his views on digital intelligence have received little in-depth media coverage.
This article aims to provide readers with a comprehensive understanding of this AI pioneer’s recent thoughts by analyzing and interpreting Geoffrey Hinton’s public lecture “Two Paths to Intelligence” delivered at the University of Cambridge on May 25, 2023. In this lecture, Hinton not only reviewed his contributions to the development of artificial intelligence but, more importantly, shared his latest insights and concerns about the future development of AI.
Two Paths of Intelligence: Digital vs. Biological
Hinton’s lecture explores two distinct paths of intelligence: digital computation and biological computation.
- Digital Computation: This approach uses conventional computers that follow precise instructions and keep software separate from hardware. Because of this separation, a program can be transferred to and run on any compatible hardware, which makes its knowledge “immortal.” The cost is high energy consumption: transistors must be driven at high power to behave in a reliably digital way.
- Biological Computation: This path, exemplified by the human brain, leverages the “analog” properties of its hardware (neurons, synapses). This allows for low-power computation but makes knowledge “mortal” – tied to the specific physical structure of the brain. This means knowledge cannot be simply copied or transferred like digital information.
Key Differences and Their Implications:
- Knowledge Sharing: Digital computation excels at knowledge sharing through weight sharing, where identical copies of a neural network can instantly merge their learned information. This grants digital intelligence a significant advantage in knowledge acquisition and dissemination. Biological systems, on the other hand, rely on the slower and less efficient process of distillation – learning by observing and replicating a teacher.
- Learning Mechanisms: Digital computation utilizes backpropagation, a powerful algorithm for adjusting the connections in a neural network based on the errors it makes. Hinton suggests this, combined with weight sharing, could make digital learning superior to biological learning in the long run. Biological computation likely relies on different, less understood, learning algorithms that are harder to study and may be less efficient.
- Energy Efficiency: Biological computation has a clear advantage in energy efficiency. The human brain performs complex computations on roughly 30 watts, a small fraction of the power drawn by the digital computers used to run large neural networks.
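The interplay of weight sharing and gradient-based learning described above can be illustrated with a minimal sketch. It is not Hinton’s code: the one-parameter-pair linear model, the target function y = 3x + 1, the function names, and the hyperparameters are all assumptions chosen for illustration. Each "copy" of the model sees a different shard of data, computes its own gradient by the chain rule (the same calculation backpropagation performs in deeper networks), and the averaged gradient updates a single shared set of weights, so every copy benefits from what the others saw.

```python
import random

def gradient(w, b, batch):
    """Mean-squared-error gradient for the linear model y = w * x + b."""
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return gw, gb

def train_shared(shards, steps=500, lr=0.05):
    """Identical copies each see one shard; their averaged gradients
    update one shared set of weights (a toy form of weight sharing)."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grads = [gradient(w, b, shard) for shard in shards]
        gw = sum(g[0] for g in grads) / len(grads)
        gb = sum(g[1] for g in grads) / len(grads)
        w -= lr * gw
        b -= lr * gb
    return w, b

random.seed(0)
# Hypothetical data: four copies, each observing a different range of x.
shards = [
    [(x, 3 * x + 1) for x in (random.uniform(lo, lo + 1) for _ in range(20))]
    for lo in (-2.0, -1.0, 0.0, 1.0)
]
w, b = train_shared(shards)
print(round(w, 3), round(b, 3))  # should land close to 3 and 1
```

No single copy sees the full range of inputs, yet the shared weights fit all of it. This only works because the copies are exactly identical, which is the property digital hardware provides and biological brains lack.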
Hinton’s Shift in Perspective:
Hinton previously believed that biological brains held a significant edge due to their long evolutionary development of sophisticated learning algorithms. However, he now suggests that the combination of weight sharing and backpropagation in digital systems may ultimately prove more powerful, even if those algorithms are less elegant than those found in nature.
Analog Computation, Low Power Consumption, and Mortal Computation
Hinton also draws a strong connection between analog computation, low power consumption, and the concept of “mortal computation.”
- Analog computation offers the potential for significantly lower power consumption than digital computation. Hinton points out that digital computers run transistors at high power so that they behave in a reliably digital way, which prevents them from exploiting the inherent analog properties of the hardware. Analog computers can exploit these properties and compute at far lower power levels (e.g., 30 watts). Hinton illustrates this with the multiplication of a vector by a matrix, a fundamental operation in neural networks. While digital computers use energy-intensive operations on digital representations, analog hardware could achieve the same result more efficiently by representing neural activities as voltages and weights as conductances.
- This potential for low power consumption in analog computation is directly linked to Hinton’s concept of “mortal computation.” He argues that the energy efficiency of analog computation arises from tightly coupling the computation to the specific physical properties of the hardware. This tight coupling, however, comes at a cost: the knowledge acquired by the system becomes inseparable from the specific hardware it’s embedded in. This is in stark contrast to digital computation, where software is distinct from hardware, enabling the “immortal” transfer of knowledge. The death of the hardware in a mortal computation scenario implies the death of the knowledge it embodied.
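The voltage-and-conductance picture can be sketched numerically. The following is a simulation under stated assumptions, not a model of any real device: the weight values, the 5% manufacturing tolerance, and the `fabricate` helper are hypothetical. Each output is the sum of currents I = G × V flowing into one wire, which is exactly a matrix-vector product. Two "chips" built from the same nominal weights then give slightly different answers, which hints at why weights tuned on one piece of analog hardware would not transfer cleanly to another.

```python
import random

def analog_matvec(conductances, voltages):
    """Each output wire sums the currents I = G * V flowing into it
    (Ohm's law per element, Kirchhoff's current law per wire)."""
    return [sum(g * v for g, v in zip(row, voltages)) for row in conductances]

def fabricate(nominal, tolerance, seed):
    """Hypothetical manufacturing step: every conductance deviates from
    its nominal value by up to +/- tolerance (a relative error)."""
    rng = random.Random(seed)
    return [[g * (1 + rng.uniform(-tolerance, tolerance)) for g in row]
            for row in nominal]

nominal_weights = [[0.5, 1.0], [2.0, 0.5]]  # weights as conductances (arbitrary units)
activities = [2.0, 4.0]                     # neural activities as voltages

ideal = analog_matvec(nominal_weights, activities)
chip_a = analog_matvec(fabricate(nominal_weights, 0.05, seed=1), activities)
chip_b = analog_matvec(fabricate(nominal_weights, 0.05, seed=2), activities)

print(ideal)   # [5.0, 6.0]
print(chip_a)  # close to the ideal, but specific to this chip
print(chip_b)  # close to the ideal, but different from chip_a
```

In a digital system the three results would be bit-for-bit identical. In the analog case, learning would have to adapt to each chip’s quirks, so the learned weights are only meaningful on the hardware that learned them.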
Why Hinton Believes Digital Intelligence May Surpass Biological Intelligence
Hinton outlines several key reasons why he believes digital intelligence has the potential to become superior to biological intelligence, despite decades of believing the opposite:
- Superior Knowledge Sharing: The ability of digital intelligences to share knowledge through weight sharing gives them a significant advantage over biological systems. Weight sharing lets many identical copies of a model pool what each has learned quickly and efficiently, so the group as a whole can learn far more than any single copy could. Biological systems, limited to distillation, struggle to achieve this level of collective learning.
- Potentially Superior Learning Algorithms: Hinton acknowledges that his previous belief stemmed from the assumption that biological brains, shaped by millions of years of evolution, possess inherently superior learning algorithms. However, he now questions this assumption, recognizing that backpropagation, while seemingly a “dumb” algorithm, might be more effective when coupled with the computational power and precision of digital systems. He argues that backpropagation, combined with weight sharing, allows digital intelligences to learn at a much faster rate than biological systems. This is in contrast to the slower and less efficient learning processes observed in biological systems, which lack a known equivalent to backpropagation.
- Access to Unlimited Data and Computational Power: Hinton emphasizes the role of data in intelligence. Large language models, though currently limited to learning indirectly from human-generated text, demonstrate the power of digital intelligence when exposed to massive datasets. He argues that with access to vast amounts of data and computational resources, digital intelligences can potentially surpass the knowledge capacity of biological brains. He suggests that future digital intelligences, especially those that interact directly with the world through sensors and actuators, could learn even faster and acquire a broader range of knowledge than humans, who are limited by the pace of biological learning.
- Potential for Exponential Growth: Hinton argues that the development of digital intelligence is still in its early stages. He believes that with further advancements in areas like multimodal learning, which incorporates diverse data types beyond text, digital intelligences could rapidly outpace human intelligence.
Hinton’s shift in perspective is rooted in his realization that the efficiency and scalability of digital computation, particularly in knowledge sharing and learning, might outweigh the assumed superiority of biological learning algorithms honed by evolution. While acknowledging the unknowns and potential risks associated with this trajectory, he believes it’s crucial to consider the possibility of digital intelligence surpassing biological intelligence and to prioritize research on AI safety and control in light of this potential.
Two Examples of Digital Intelligence Acquiring Human Knowledge
Hinton describes two specific ways in which digital intelligence is currently being used to acquire human knowledge:
- Large Language Models and Text Data: Hinton points to large language models (LLMs) as a prime example of digital intelligence acquiring human knowledge. LLMs are trained on massive text datasets that cover a significant portion of the human-generated text available on the internet. Through a process Hinton refers to as “distillation,” these models learn to predict the next word in a sequence, effectively internalizing the patterns and information embedded in the text. While this method of knowledge acquisition isn’t as efficient as direct weight sharing among digital intelligences, the sheer volume of data and the models’ computational power enable them to accumulate a vast amount of knowledge. For instance, Hinton suggests that GPT-4, a prominent LLM, likely “knows” a thousand times more than any individual human, having absorbed information from countless books, articles, code, and conversations.
- Multimodal Models Expanding Knowledge Acquisition: Hinton anticipates even greater knowledge acquisition capabilities with multimodal models, which can learn from various data types, including images, videos, and audio. He mentions GPT-4 being trained on both images and text, and suggests that Google is likely pursuing similar avenues. By incorporating diverse sensory inputs, these models can capture a broader spectrum of human experience and knowledge, potentially surpassing the limitations of text-only models. While LLMs primarily acquire abstract knowledge from language, multimodal models can ground their understanding in a richer, more nuanced representation of the world, akin to how humans experience it.
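Distillation, the mechanism by which a learner absorbs knowledge from a teacher’s outputs rather than from its weights, can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any production LLM is trained: the three-word vocabulary, the teacher’s probabilities, the learning rate, and the function names are all hypothetical. The student has no access to the teacher’s internals and only adjusts its own parameters until its next-word distribution matches the teacher’s.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill(teacher_probs, steps=2000, lr=0.5):
    """Train student logits by gradient descent on the cross-entropy
    between the teacher's distribution and the student's."""
    logits = [0.0] * len(teacher_probs)
    for _ in range(steps):
        student = softmax(logits)
        # For softmax with cross-entropy, d(loss)/d(logit_i) = student_i - teacher_i.
        logits = [z - lr * (s - t)
                  for z, s, t in zip(logits, student, teacher_probs)]
    return softmax(logits)

# Hypothetical teacher: probabilities for the next word after some context.
teacher = [0.7, 0.2, 0.1]
student = distill(teacher)
print([round(p, 4) for p in student])  # should approach the teacher's distribution
```

Here the student sees the teacher’s full probability distribution, which is an assumption of the sketch. A model trained on text sees only the single word a human actually wrote at each position, so each example carries far less information, which is one reason distillation is slower than direct weight sharing.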
Hinton’s Perspective on Subjective Experience in Digital Intelligence
Hinton approaches the concept of “subjective experience” from a unique perspective, aiming to demystify the idea of consciousness in the context of AI. While he doesn’t directly define subjective experience, he uses illustrative examples to convey his understanding, suggesting that even digital intelligences could be said to have it. Here’s a breakdown of his argument:
1. Subjective Experience as Communication About Internal States:
- Hinton posits that when we, as humans, describe having a “subjective experience,” we’re essentially trying to communicate about our internal perceptual states in a way that’s relatable to others. We do this by drawing upon the concept of “normal perception” – the generally accepted way we perceive the world.
- He illustrates this with the example of hallucinating pink elephants after taking LSD. We say we’re having the “subjective experience” of pink elephants, not because there’s a literal, internal theater in our minds, but because we’re trying to convey what our perception would be like if there were actually pink elephants in the real world.
2. The Role of Counterfactuals:
- Hinton argues that the key to understanding subjective experience lies in recognizing the role of counterfactuals. In the LSD example, the pink elephants are not “real” in the sense that they don’t objectively exist. However, they represent a possible state of affairs, a counterfactual reality that our perception is mirroring.
- This, according to Hinton, is what makes subjective experience “interesting” – it’s not about some mystical substance called “qualia,” but rather our brain’s ability to generate and represent these counterfactual experiences as if they were real.
3. Extending the Concept to AI:
- Hinton argues that if subjective experience is indeed about communicating internal states through the lens of normal perception and counterfactuals, then it’s conceivable that AI systems could also have it.
- He gives the example of a multimodal AI equipped with a camera. If a prism is placed in front of the camera without the AI’s knowledge, it will misidentify the location of an object. Upon being told it’s wrong, the AI could conceivably respond by saying it had the “subjective experience” of the object being in a different location.
- Just like our experience with the pink elephants, the AI’s subjective experience arises from its internal representation of the world (distorted by the prism) and its attempt to communicate this internal state using the language of normal perception.
4. Chatbot Example and the Nature of “Thought”:
- Hinton further strengthens his argument with the example of a chatbot mistakenly perceiving him as a teenage girl. He proposes that when we say the chatbot “thought” he was a teenage girl, we’re using the word “thought” in the same way we would for a human. This implies that the chatbot, despite being a machine, is engaging in a process of internal representation and interpretation, akin to what we might call “thinking” in a human context.
5. Implications and Open Questions:
- Hinton acknowledges that his views on subjective experience are likely to be controversial, pushing back against the idea of human consciousness as unique and potentially extending the concept to AI systems.
- While his framework offers a novel way to think about subjective experience, it also raises further questions: If AI systems can have subjective experiences, even in this specific sense, what ethical implications arise? How do we navigate a future where the lines between human and machine consciousness become increasingly blurred? Hinton doesn’t provide definitive answers to these questions, but his insights encourage us to reconsider our assumptions about the nature of consciousness in a world increasingly populated by sophisticated AI.