An Incremental Solution to an Artificial Apocalypse

So smart/famous/rich people are publicly clashing over the dangers of Artificial Intelligence again. It’s happened before. Depending on who you talk to, there are several explanations of why that apocalypse might actually happen. However, as somebody with a degree in computer science and a bent for philosophy of mind, I’m not exactly worried.

The Setup

Loosely speaking, there are two different kinds of AI: “weak” and “strong”. Weak AI is the kind we have right now; if you have a modern smartphone you’re carrying around a weak AI in your pocket. This technology uses AI strategies like reinforcement learning and artificial neural networks to solve narrow, single-purpose problems: speech recognition, face recognition, translation, or (hopefully soon) self-driving cars. Perhaps the most widely publicized success of this kind of AI was when IBM’s Watson managed to win the Jeopardy! game show.
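To make “weak AI” concrete, here is a toy sketch of tabular Q-learning, one of the simplest reinforcement-learning techniques of the kind mentioned above. The corridor world, the constants, and all the names are invented purely for illustration:

```python
import random

# A toy 1-D "corridor" world: states 0..4, with a reward only at the
# far right end. The agent learns, by trial and error, that moving
# right is the thing to do. Note that nothing here resembles
# understanding, let alone consciousness -- it is just arithmetic
# over a table of numbers.
N_STATES = 5
ACTIONS = (-1, +1)                      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

random.seed(0)
for episode in range(500):
    s = random.randrange(N_STATES - 1)  # start anywhere but the goal
    while s != N_STATES - 1:
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)                 # explore
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])  # exploit
        nxt, r = step(s, a)
        # Core Q-learning update: nudge the estimate toward the observed
        # reward plus the discounted value of the best next action.
        best_next = max(Q[(nxt, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = nxt

# The learned policy: in every non-goal state, go right.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
```

Scale the same basic loop up with deep neural networks instead of a table and you get something recognizably like the “weak AI” discussed in this post: impressively capable within its single narrow problem, and nothing more.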

Strong AI, on the other hand, is still strictly hypothetical. It’s the realm of sci-fi, where a robot or computer program “wakes up” one day and is suddenly conscious, probably capable of passing the Turing test. While weak AIs are clearly still machines, a strong AI would basically be a person instead, just one that happens to be really good at math.

While the routes to apocalypse tend to vary quite a bit, there are really only two end states depending on whether you worry about weak AI or strong AI.


If you go in for strong AI, then your apocalypse of choice basically ends up looking like Skynet. In this scenario, a strong AI wakes up, decides that it’s better off without humanity for some reason (potentially related to its own self-preservation or desires), and proceeds to exterminate us. This one is pretty easy for most people to grasp.


If you worry more about weak AI then your apocalypse looks like paperclips, because that’s the canonical thought experiment for this case. In this scenario, a weak AI built for producing paperclips learns enough about the universe to understand that it can, in fact, produce more paperclips if it exterminates humanity and uses our corpses for raw materials.

In both scenarios, the threat is predicated upon the idea of an intelligence explosion. Because AIs (weak or strong) run on digital computers, they can do math inconceivably fast, and thus can also work at speed in a host of other disciplines that reduce to math fairly directly: physics, electrical engineering, computer science, etc.

This means that once our apocalyptic AI gets started, it will be able to redesign its own software and hardware in order to better achieve its goals. Since this task is one to which it is ideally suited, it will quickly surpass anything a human could possibly design and achieve god-like intelligence. Game over, humanity.

On Waking Up

Strong AI has been “just around the corner” since Alan Turing’s day, yet in all that time nobody has created anything remotely close to a genuinely conscious computer. There are two potential explanations for this:

First, perhaps consciousness is magic. Whether you believe in a traditional soul or something weird like a quantum mind, maybe consciousness depends on something more than the arrangement of atoms. In this world, current AI research is barking up entirely the wrong tree and is never going to produce anything remotely like Skynet. The nature of consciousness is entirely unknown, so we can’t yet know whether something like an intelligence explosion is even a coherent concept.

Alternatively, perhaps consciousness is complicated. If there’s no actual magic involved, then the other reasonable alternative is that consciousness is incredibly complex. In this world, strong AI is almost certainly impossible with current hardware: even the best modern supercomputer has only a tiny fraction of the power needed to run a simplistic rat brain, let alone a human one. Even in a world where Moore’s law continues unabated, human-brain-level computer hardware is still centuries away, not decades. We may be only a decade away from having as many transistors as there are neurons in a brain, but the two aren’t comparable; neurons are massively more complex building blocks.

It’s also worth noting that if consciousness is hard because it’s complicated, our hypothetical strong AI would wake up very, very slowly. We’re not going to suddenly jump from current processors to AI-capable ones that are millions of times more powerful; there will be numerous incremental designs in between. As our processing power improves, our AI will get stronger and stronger but there will be no eureka moment where Skynet both suddenly decides humanity has to go and is smart enough to make it happen. We’ll have plenty of opportunity to study primitive artificial consciousnesses, learn about them in an environment we control, and iteratively incorporate our findings into subsequent generations.

Regardless of why we’ve never gotten anywhere with strong AI, we haven’t, and there is no indication that this time it really is “just around the corner”.

On Paperclips

While the idea of Skynet is common in popular culture, it’s the paperclip maximizer which tends to worry the people who take this stuff seriously. It’s already true that we don’t really understand how weak AI works: we understand the principles that drive something like Watson, but no human could describe the specific sequence of decisions it performs to arrive at a given answer.

It also seems a lot more plausible that weak AI (which we already have) will just keep getting smarter (which it has definitely done so far) than that we’ll have some weird magic breakthrough in strong AI, when nobody even understands yet what such a breakthrough would look like. So, why couldn’t some future self-driving car learn just enough about the world to figure out that it could drive much more efficiently if it could just get rid of all of those pesky pedestrians? As it turns out, I think something like this could actually happen; it’s just not the end of the world when it does.

As with the consciousness-is-complicated explanation for strong AI, our saving grace against paperclips and homicidal cars is the incremental nature of the technology. The first self-driving car smart enough to decide to murder pedestrians is not also going to be smart enough to redesign its own hardware, create biological weapons, and do all of this in secret without hinting to its owners that it “has plans”. No, the only thing it will know how to do in pursuit of this goal is to run people over intentionally, which is obviously bad, but not exactly apocalyptic. It also has the benefit of being really freaking obvious.

Here’s a hint to future AI developers: maybe you should stop making it smarter when your cars start trying to run you over in the name of optimizing away pedestrians.

The Ghost in the Machine: The Nature of the Mind

Having just summarized the nature of the brain, we now turn to the much knottier issue of what constitutes the mind: specifically, the nature of self-awareness and true intelligence. Advances in modern computing have left most people with little doubt that we can simulate behavioural intelligence within certain limits. But there still seems to be a missing spark that separates even the best computer from an actual human being.

That spark, I believe, boils down to recursive predictive self-modelling. The brain, as seen on Monday, can be viewed as a modelling subsystem of reality. But why should it be limited to modelling other parts of reality? Since, from an information-theoretic perspective, it must already be dealing in abstractions in order to model as much of reality as it can, there is nothing at all to prevent it from building an abstraction of itself and modelling that as well. Recursively, ad nauseam, until the resolution (in number of bits) of the abstraction no longer permits.

This self-modelling provides, in a very literal way, a sense of self. It also lets us make sense of certain idioms of speech, such as “I surprised myself”. On most theories of the mind, that notion of surprising oneself can only be a figure of speech, but self-modelling can actually make sense of it: your brain’s model of itself made a false prediction; the abstraction broke down.
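The idea above can be made concrete with a toy sketch of bounded recursive self-modelling, ending with the “I surprised myself” case. Everything in it — the halving bit budget, the invented prediction rules — is purely illustrative, not a serious cognitive model:

```python
# A "model" with a bit budget. Each level of self-model is a coarser
# abstraction: it can only afford a fraction of the bits of the thing
# it models, so the recursion bottoms out when the budget runs dry.
class Model:
    def __init__(self, bits):
        self.bits = bits
        self.self_model = Model(bits // 2) if bits >= 2 else None

    def depth(self):
        """How many levels of self-modelling the bit budget permits."""
        return 1 + (self.self_model.depth() if self.self_model else 0)

    def predict_own_choice(self, options):
        # The self-model, being a lossy abstraction, predicts by a
        # cruder rule (shortest option) than the full system uses.
        return min(options, key=len)

    def choose(self, options):
        # The real choice uses detail the self-model can't afford to
        # represent (alphabetical order stands in for that detail here).
        return sorted(options)[0]

brain = Model(bits=64)
print(brain.depth())  # 64 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1: prints 7

options = ["walk", "cycle"]
predicted = brain.predict_own_choice(options)  # "walk" (shorter)
actual = brain.choose(options)                 # "cycle" (alphabetical)
surprised = predicted != actual                # "I surprised myself"
```

The mismatch at the end is exactly the point made above: the self-model is a lossy abstraction of the system it lives in, so its predictions about that system can fail — and when they do, you have, quite literally, surprised yourself.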