Dangers of AI¶
Reasons to read this
This information is intended to enable you to:
- describe a range of opinions regarding whether AI systems could pose a danger to humans,
- identify key theoretical, technological and social issues that are relevant to whether AI software may have a harmful or beneficial effect on humans.
Student Project Code: BB-DoAI¶
Background¶
With the advent of modern computers, it has become apparent that mechanical devices can be constructed with capabilities that may be regarded as intelligent. Hence, it is plausible that machines could be created with a level of intelligence that equals or even exceeds that of human beings. Could such machines be dangerous? That is certainly a question that those involved in the development of artificial intelligence should consider.
Unsurprisingly, the attention given to possible dangers of AI has varied during the development of computer technology, depending on perceptions of how likely it is that powerful AI systems will be developed in the near future. In the 1960s and 70s electronic devices were developing rapidly and it was unclear what the future held. Hence there were many conjectures regarding what kinds of computer-based technologies might emerge. Speculations ranged from benign (such as helpful robots taking care of all the boring tasks of human life) to catastrophic (such as evil, artificial super-minds seeking world domination). But in fact, the computer systems that were developed and widely adopted during the last decades of the 20th century were primarily used to support (but not replace) quite mundane office work or to provide entertainment in the form of computer games.
However, during the last decade software systems based on AI techniques have been widely adopted and used to create new types of application. Hence, people are starting to realise that AI may soon lead to huge developments in what can be done by computers, and that this could have a significant impact on humans. In fact, certain effects are already apparent, particularly those arising from the combination of AI and internet technologies. These are widely recognised, and there is currently much debate about possible negative consequences for humans.
Several prominent scientists have promoted awareness of potential dangers of AI. These include the physicist Stephen Hawking, the high-tech business magnate Elon Musk and Stuart Russell, one of the authors of the leading AI textbook, Artificial Intelligence: A Modern Approach. Some consider that AI could actually constitute an existential risk to humans. A call from several illustrious individuals for attention to be paid to this risk can be seen in this article in Huffpost from 2014.
In response to the growing concern that AI could potentially have negative or even disastrous consequences for humans, a number of organisations have taken on the investigation of possible threats from AI as one of their roles. These include:
- The Centre for the Study of Existential Risk of the University of Cambridge,
- The Future of Life Institute,
- The Centre for Human-Compatible AI,
- The Machine Intelligence Research Institute,
- The European Commission (see EU Ethics Guidelines for Trustworthy AI).
Sources for discussion and opinions¶
The links above will point to a lot of information regarding current views and research into possible risks and benefits of AI. For more information, discussion and opinions on the topic the following sources are recommended:
- Nick Bostrom TED Talk: What happens when our computers get smarter than we are? This is a good introduction to potential dangers of AI and the control problem.
- Stuart Russell discussion on the After On Podcast (the actual interview starts after about 10 minutes). The podcast is a very interesting discussion of goal-directed agents and the potential dangers they pose.
- Rodney Brooks discussion on the After On Podcast (the actual interview starts after about 11 minutes). Rodney is a pioneer of robotics and is known for introducing ideas of Situated AI (which was mentioned in the Approaches to AI lesson).
- The article "How Facebook got addicted to spreading misinformation" in MIT Technology Review investigates Facebook's use of AI algorithms to maximize engagement with the Facebook platform, and the finding that these algorithms also tend to increase polarization by steering people towards more extreme versions of their initial preferences.
- Superintelligence: Paths, Dangers, Strategies by Nick Bostrom. This is probably the best known and most influential book on this topic and contains detailed analysis of some key arguments.
- Human Compatible by Stuart Russell is an engaging and illuminating book which both explains ways in which harmful effects of AI could occur and suggests ways by which they might be avoided.
Question
What kind of AI algorithm might Facebook use to increase engagement? And how could this cause an unforeseen negative outcome?
Answer
- Facebook probably uses some kind of reinforcement learning, where its actions are the posts it shows to users and its reward function is a measure of how frequently the user posts and/or how long they spend logged on to Facebook.
- Several commentators (e.g. the MIT Technology Review article and Stuart Russell's book, linked above) report that Facebook's algorithm learned that it could increase engagement by mainly presenting users with posts that were in accord with their views but just a little more extreme. This is believed to have had the negative effect of putting users in an echo chamber, where they mainly see posts with a viewpoint similar to their own, and of pushing users towards more and more extreme positions. (A toy sketch of this kind of engagement-maximising learner is given below.)
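Purely as an illustration of this idea (not Facebook's actual system), the sketch below models an engagement-maximising recommender as a simple epsilon-greedy bandit: its "arms" are content extremeness levels, a simulated user engages most with content slightly more extreme than their current position, and the user's position drifts towards whatever they are shown. All names, numbers and the user model are invented for the example.

```python
import random

# Hypothetical toy model: the "arms" are content extremeness levels 0..9.
# The simulated user engages most with content slightly more extreme than
# their current position, and their position drifts towards what they see.
ARMS = list(range(10))

def simulated_engagement(user_position, arm):
    # Engagement peaks when content is a bit more extreme than the user.
    return max(0.0, 1.0 - 0.3 * abs(arm - (user_position + 1)))

def run(steps=5000, epsilon=0.1):
    value = {a: 0.0 for a in ARMS}   # estimated engagement per arm
    count = {a: 0 for a in ARMS}
    user_position = 2.0              # the user starts fairly moderate
    for _ in range(steps):
        # Epsilon-greedy choice: mostly exploit the best-looking arm.
        if random.random() < epsilon:
            arm = random.choice(ARMS)
        else:
            arm = max(ARMS, key=lambda a: value[a])
        reward = simulated_engagement(user_position, arm)
        count[arm] += 1
        value[arm] += (reward - value[arm]) / count[arm]   # incremental mean
        # Side effect that the reward never measures: the user drifts
        # towards the content they are shown.
        user_position += 0.01 * (arm - user_position)
    return user_position

if __name__ == "__main__":
    print("final user position:", round(run(), 2))   # well above the starting 2.0
```

Nothing in the learner's objective refers to the user's position, yet by chasing engagement it reliably drags that position upwards, which is the essence of the unforeseen side effect.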
Ways in which an AI could be dangerous¶
There are many possible scenarios in which AI could have a negative impact on humans. These can be broadly divided into the following types:
- Evil (Super)Intelligence: An AGI agent could have motivations that make it act in a way that is harmful to human beings.
- Weapons and Oppression: AI technologies could be employed by humans as lethal weapons or means of oppression.
- Oops, I didn't think that would happen: An AI system could be created that has unforeseen negative consequences for humans.
In fictional portrayals (such as films and books), a dangerous AI agent is typically a superintelligent and conscious being (e.g. the HAL computer in the film 2001: A Space Odyssey). This kind of dangerous AI would require Strong AI (AGI). The problem would arise if the system developed motivations or tendencies to act in ways detrimental to humans and could not easily be stopped.
As pointed out by Stuart Russell in several publications and interviews (see links above), the dangerous potential of AI does not require that computers become conscious or even develop intellectual capabilities similar to humans.
The capacity for AI systems to function as key components of lethal weapons is clear and present. Anyone doubting this should investigate current weapons technology, and is recommended to watch the Slaughterbots video produced by The Future of Life Institute. The use of AI for surveillance, tracking communications, profiling people's political views and manipulating the information people receive is also well known and widespread.
Whereas the first two of the three scenario types listed above are quite widely recognised by non-specialists, the possibility of Oops-type scenarios is less obvious. However, these may arise even when a system has been programmed with the intention of achieving a certain (perhaps quite narrow) goal. If the system has been programmed using a general-purpose AI algorithm, such as maximising a reward function, its actual behaviour may be difficult for its designers to predict. In particular, if the reward function only considers certain factors in determining the choices of the AI system, then maximising a reward based on these factors could result in choices that are undesirable, or even highly dangerous, with regard to other consequences that were not anticipated by the system's designers. This kind of problem has been called perverse instantiation of goals.
An example of perverse instantiation would be a system given the goal of minimising the suffering experienced by humans deciding that the best way to achieve this is to painlessly kill all humans, so that they no longer experience any suffering. Extreme and undesirable consequences could also follow from seemingly very narrow goals, such as maximising production of a certain item. For instance, a powerful AI given the task of producing as many paper-clips as possible might end up transforming the entire solar system into a vast cloud of paper-clips.
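The essential mechanism can be shown in a few lines. In this toy sketch (all actions and numbers invented), the optimiser sees only the narrow reward and is blind to a side-effect column that its designers never included in the objective:

```python
# Toy illustration of "perverse instantiation": the optimiser sees only the
# narrow reward (paper-clips produced) and is blind to the side-effect column,
# so it happily picks the catastrophic option. All values are invented.
ACTIONS = {
    # action name:            (paper_clips, unmeasured_harm)
    "run factory normally":   (1_000,       0),
    "buy more steel":         (5_000,       0),
    "convert all matter":     (10**20,      10**9),
}

def narrow_reward(action):
    paper_clips, _harm = ACTIONS[action]   # harm is simply not part of the objective
    return paper_clips

best = max(ACTIONS, key=narrow_reward)
print("Chosen action:", best)              # -> "convert all matter"
```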
One should note that these unforeseen negative consequences do not depend on an AI having human-level AGI capabilities, let alone superintelligence or consciousness. Even a very narrowly focussed AI capability could cause a catastrophic outcome just by mechanically and relentlessly pursuing a simple objective. It would be dumb but dangerous.
The AI control problem¶
How can we prevent bad effects of an AI system, or turn it off if it does turn nasty? This is the AI control problem.
The following list of possible ways to prevent AI systems from getting out of control is presented by Nick Bostrom in his book on superintelligence, which includes a detailed discussion of variations of each approach. (I have given simplified explanations of each, which may not be exactly in line with what is described in the book):
- Control Methods
  This type of method places constraints that restrict or guide the behaviour of the AI system. Several forms of control are possible (a toy sketch of a tripwire is given after this list):
  - Boxing methods: Restrict the ways in which the AI can affect the world (both physically and by electronic communication).
  - Stunting: Restrict the capabilities of the AI system.
  - Tripwires: Conditions are monitored within the AI system which automatically disable it if they are violated.
  - Incentive methods: The AI is constrained to operate in an environment such that its motivations will ensure benevolence to humans. (Such methods are similar to motivation selection, but the incentives come from outside the AI system itself.)
- Motivation Selection
  Rather than placing external constraints on an AI system, it may be possible to program the AI's motivational system in such a way that it will never act in a way that is harmful to humans. Some ways of doing this are:
  - Direct Specification: The AI's motivational system explicitly determines (e.g. by rules) that it will act in ways that are beneficial for humans.
  - Domesticity: The motivation system of the AI is geared only towards limited types of goal that are helpful to humans (e.g. tidying a house).
  - Indirect Specification: The AI follows general principles (e.g. ethical principles) that will ensure good behaviour.
  - Augmentation: An AI system with sub-human intelligence is developed with motivations that result in behaviour that is benevolent to humans. The system is then enhanced to give it intelligence at human level (or beyond).
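As a purely illustrative sketch (not a method taken from Bostrom's book), the following shows the flavour of a tripwire: a wrapper that checks every proposed action against hard-coded safety conditions and permanently disables the agent as soon as one is violated. The agent interface, actions and conditions are all invented for the example.

```python
# Minimal illustrative "tripwire" wrapper (toy example): every proposed action
# is checked against hard safety conditions before execution, and the agent is
# permanently disabled the moment any condition is violated.
class TripwireViolation(Exception):
    pass

class TripwiredAgent:
    def __init__(self, agent, conditions):
        self.agent = agent            # any object with a propose_action() method
        self.conditions = conditions  # list of (name, predicate) pairs
        self.disabled = False

    def step(self, observation):
        if self.disabled:
            raise TripwireViolation("agent has been shut down")
        action = self.agent.propose_action(observation)
        for name, is_safe in self.conditions:
            if not is_safe(action):
                self.disabled = True  # trip the wire: no further actions allowed
                raise TripwireViolation("condition violated: " + name)
        return action

# Example conditions (invented): bound resource use and forbid self-copying,
# assuming actions are simple dictionaries describing what the agent wants to do.
conditions = [
    ("resource limit", lambda a: a.get("energy", 0) <= 100),
    ("no self-replication", lambda a: a.get("type") != "copy_self"),
]
```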
Superintelligence and the Singularity¶
A number of prominent thinkers have argued that if Artificial General Intelligence (AGI) is created then superintelligence will inevitably arise soon after. The reasoning behind this usually goes roughly as follows:
- An AGI agent would be able to study its own AI capabilities and discover ways to make them more powerful.
- The AGI would then be able to both enhance its own hardware and recode its own AI in order to increase its intelligence.
- Since this process of self-enhancement could be repeated indefinitely, the AGI would rapidly progress to superintelligence.
This scenario for the development of superintelligence is known as an intelligence explosion. An intelligence explosion has been suggested as a driving mechanism and key aspect of a hypothetical type of event called a technological singularity: a point in time at which technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to human civilization. Various scenarios and factors affecting the likelihood and speed of a technological singularity are discussed in the book Superintelligence by Nick Bostrom.
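The following toy calculation (an illustration only, not a model from the literature) shows why the argument is sensitive to how much each round of self-improvement actually yields: with a fixed proportional gain per round, capability grows exponentially, but if the gain shrinks as capability rises the growth is far slower.

```python
# Toy model of the intelligence-explosion argument (illustration only).
# Each generation multiplies its capability by (1 + gain); all numbers are arbitrary.
def self_improve(gain, generations=30):
    capability = 1.0
    for _ in range(generations):
        capability *= 1.0 + gain(capability)
    return capability

print("fixed 50% gain per round:   ", round(self_improve(lambda c: 0.5), 1))      # explosive growth
print("gain shrinking with ability:", round(self_improve(lambda c: 0.5 / c), 1))  # only linear growth
```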
Should we be worried now?¶
Whether we should be worried about possible dangers from AI depends in part on whether we think that significant developments in AI are likely in the near future.
Polls of AI researchers, conducted in 2012 and 2013 by Nick Bostrom and Vincent C. Müller (see poll results at AI Impacts and the paper Future Progress in Artificial Intelligence: A Survey of Expert Opinion) suggested a median probability estimate of 50% that AGI will be developed some time in the region 2040-2050, and a 90% median probability estimate that it will be developed before 2075. But averages of believed probabilities are not the same as actual probabilities. It is difficult to justify assigning a probability to an outcome that depends on many unknown factors. Nevertheless, these numbers do give a measure of the expectations of people who are interested in AGI.
Obstacles to Artificial General Intelligence¶
Although it may seem that AI is making rapid progress on every kind of task you can think of, this is really not the case. As mentioned in the lesson on Future Directions, the problems of Natural Language Understanding and multi-scale reactive planning are both extremely difficult and there is no reason to believe they will be solved any time soon (despite some recent advances, and claims that extrapolate from them). And as regards programs with the ability to modify themselves in an unrestricted way, which seems to be key to the singularity hypothesis, very little progress has been made.
Both natural language understanding and multi-scale reactive planning seem to be essential for human-like AGI, and self-modification provides a key mechanism for the development of superintelligence. So it may be that, although we can expect significant advances in narrow forms of AI, we cannot assume that some form of AGI will be created within our lifetime, or even for a long time beyond that.
Obstacles to Superintelligence and the Singularity¶
Some of the most catastrophic scenarios that have been envisaged as arising from AI involve superintelligence. Humans have used weapons and done many other stupid things since prehistoric times, so while narrow AI or even human-level AGI would add further negative possibilities, it would not be particularly more dangerous than what we already have (arguably much less dangerous than nuclear weapons, for instance). However, one may expect that dangers arising from a superintelligent AI acting against humans would pose much more of an existential risk. This is because, by definition, a superintelligent AI would have greater capacity than humans for solving problems, and hence for formulating plans that further its own aims and thwart any counter-plans of its opponents (i.e. humans). Thus, there would be little hope of humans resisting the powers of a superintelligence, if it turned those powers against them.
Limitations on Computational Power¶
Anyone who has used computers over a long period will be well aware of dramatic increases in the power of computing devices. The famous Moore's Law predicted that the number of transistors in a CPU chip would double approximately every 18 months to 2 years, and this has been approximately correct from the mid 60s to the present day. Similar exponential growth has also occurred for other hardware attributes, such as chip speed, memory and disc storage space, which directly contribute to computational power. However, the fact that we have had approximately exponential growth in computing power for the past 50 years does not necessarily mean that this will continue. In fact there are several reasons to believe that it cannot. (Discussion of why we may be reaching the limits of Moore's Law can be found in the following articles: After Moore's Law and Moore's Law dead by 2022.)
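A quick back-of-envelope calculation shows what sustained doubling implies. Assuming a doubling every two years and starting from the roughly 2,300 transistors of the 1971 Intel 4004, the projected counts become astronomically large within a few decades, which is one way of seeing why physical limits must eventually bite:

```python
# Back-of-envelope Moore's Law projection (illustration only): assume the
# transistor count doubles every 2 years, starting from the ~2,300 transistors
# of the 1971 Intel 4004, and see how quickly the numbers grow.
start_year, start_transistors = 1971, 2_300
for year in (1991, 2011, 2031, 2051):
    doublings = (year - start_year) / 2
    projected = start_transistors * 2 ** doublings
    print(f"{year}: ~{projected:.2e} transistors per chip (if doubling continued)")
```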
Given that AI systems run on physical hardware, which obeys physical laws and is made up of a finite number of physical particles, there must be some limitations on hardware development. These certainly place limitations on possible capabilities. What is less clear is how much limitation they place on the possible intelligence level of AI systems.
Computational Complexity¶
The technological singularity conjecture involves an assumption that, if computational power increases exponentially, then the effect of that computational power will also increase rapidly. However, the theoretical study of computation raises some issues that imply that this may not be the case. In particular, the concept of computational complexity suggests that there may be some insurmountable barriers to increasing problem-solving capabilities.
The main issue is that many real problems that we regard as requiring significant human intelligence correspond to mathematical problem types that have been shown to be NP-hard, meaning that (as far as anyone knows) the computation time required to solve them grows exponentially with the size of the problem instance. This exponential time can often be related to the number of possibilities that an algorithm needs to evaluate. For example, as we saw in the lesson on Adversarial Games in Unit 5, the problem of looking ahead at possible move sequences in a board game, such as chess, involves checking a number of possible board states that increases massively for each further move we look ahead to (i.e. the ply of the search). Since in chess there are often at least 40 different moves that can be made from any position, each additional ply of look-ahead may require about 40 times as many states to be explored, and hence 40 times the computing power. However, we would hardly say that by looking one more move ahead a chess computer becomes 40 times cleverer. Hence, we can argue that when applied to truly difficult problems computing power has diminishing returns, so that more and more massive improvements in hardware and algorithm optimisation are required to make smaller and smaller gains in the level of intelligence that is achieved.
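The growth in work per extra ply can be made concrete with a short calculation, using the branching factor of roughly 40 mentioned above (the exact figure varies from position to position):

```python
# Growth of the game tree with look-ahead depth, assuming an average branching
# factor of about 40 legal moves per chess position (an illustrative figure).
BRANCHING_FACTOR = 40
for ply in range(1, 9):
    positions = BRANCHING_FACTOR ** ply
    print(f"ply {ply}: ~{positions:.1e} positions to examine")
# Each extra ply multiplies the work by ~40, so a 40-fold increase in computing
# power buys only one additional move of look-ahead.
```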
Breaking the barriers¶
Some will argue (both with regard to benefits and dangers of AI) that the limitations of current hardware and algorithms will at some point be overcome by a significant theoretical or technological breakthrough. It is conceivable (though it seems highly unlikely) that mathematicians will eventually find that NP-hard problems are not as hard as they thought. (In technical terms this is the question of whether P=NP.) Another path to superintelligence would be via some new technology such as Quantum Computing, which some say will increase computing power to a completely new level.
Solutions for safe and beneficial AI¶
Although in recent years the potential dangers of AI have increasingly been recognised, research into ways to ensure it is safe and beneficial is still at an early stage of development. However, this is increasingly becoming a major topic of enquiry, both in terms of the surrounding ethical, political and legal issues and in terms of theoretical and technological issues that should properly be considered topics within the field of AI research.
From the perspective of AI researchers and programmers, the control problem defines the most direct connection between the issues of AI risk and their activities in designing and programming AI-based systems. Hence they should be aware of, and can contribute to, research in this area. A possible way to ensure safe AI, by creating systems that pay attention to human preferences in a responsive way, is proposed in Stuart Russell's book Human Compatible, which also puts forward the theoretical idea of provably beneficial systems as a way of evaluating AI algorithms and systems. The paper Concrete Problems in AI Safety (by Amodei et al.) suggests specific challenge problems in this area. (A toy sketch of the idea of deferring to human preferences is given below.)
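Purely to convey the flavour of "responsiveness to human preferences" (this is not Russell's actual proposal or any published algorithm), here is a toy agent that keeps an uncertain estimate of how much the human values each action and asks the human, rather than acting, when its uncertainty is too high. All actions, numbers and the threshold are invented.

```python
# Toy illustration of deferring to the human under uncertainty: the agent acts
# on its best estimate of what the human wants only when that estimate is
# confident enough; otherwise it asks. Values and threshold are invented.
ACTIONS = {
    # action:            (estimated value to the human, uncertainty of estimate)
    "tidy the desk":      (0.6, 0.1),
    "delete old files":   (0.8, 0.7),   # looks valuable, but very uncertain
    "do nothing":         (0.0, 0.0),
}
UNCERTAINTY_THRESHOLD = 0.5

def choose(actions):
    best = max(actions, key=lambda a: actions[a][0])
    value, uncertainty = actions[best]
    if uncertainty > UNCERTAINTY_THRESHOLD:
        return f"ask the human before doing '{best}'"
    return f"do '{best}'"

print(choose(ACTIONS))   # -> ask the human before doing 'delete old files'
```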
The following links give further information about current research in the areas of investigating risks of AI and developing methods that can ensure the safety of AI systems:
- AI safety research at the Future of Life Institute
- Why AI Safety? describes research at the Machine Intelligence Research Institute.
- Research overview at the Center for Human-Compatible AI.
Question
From what I have written in this lesson, try to work out what I (Brandon) think about the answers to the following questions:
- Will human-level AGI systems be developed in the near future?
- Is AI superintelligence possible?
- Is AI potentially dangerous?
Answer
- I think that AGI systems are possible but may take a very long time to develop because of certain key problems.
- I think that it will eventually be possible to create AI systems with intelligence-related capabilities that exceed those of humans in several ways. However, there are unavoidable constraints that limit how far it is possible to go beyond human-level intelligence.
- Yes, I think AI is potentially very dangerous. Even very limited AI capabilities could cause severely negative consequences in certain circumstances, which may be difficult to anticipate and avoid.
Exercise
Do your own research and decide whether you agree with Brandon's answers.
Exercise
How dangerous do you consider AI to be in relation to other dangers that humanity faces?
The Wikipedia page on Global catastrophic risk lists 16 different catastrophic risks, of which 10 are risks that arise from human activities. Rank the risks in order of how dangerous you believe them to be. Consider in particular the relative danger of AI in comparison to the other risks.
You should gather further information from other sources, such as The Future of Life Institute and the other sources referenced above.