The field of artificial intelligence (AI) is poised to enter a new frontier—one where machines shift from being tools that supplement the human user to being autonomous intelligent agents that can set their own goals, decide what to learn, decide how to learn, and more. The potential of highly intelligent systems to transform the world is akin to the changes brought about by previous industrial revolutions. The question isn’t whether intelligent systems will continue to transform our lives; the question is in what ways and to what extent.
AI Safety Engineering (or just “AI Safety”) is a proposed framework for AI development that combines machine ethics with psychology, economics, finance, and other fields to:
A burgeoning area of research, AI Safety has come about for a few reasons. First, the success of AI is not measured only in terms of accomplishing a goal; a successful AI is one that accomplishes goals in ways that align with human values and preferences. With 60-plus years of AI development to reflect on, we can see that misalignment between machine goals and human values and preferences is what, sooner or later, causes AIs to fail. As explored in this series, targeting this misalignment as a key vulnerability is central to developing Safe AI.
Second, recent advancements in AI have begun to reach the boundaries of artificial narrow intelligence systems, which perform single or narrowly defined tasks in a given context. Advances in sensors, big data, processing, and machine learning in particular have made these systems increasingly human-like and have expanded their capabilities and uses. With that in mind, reaching the next level of artificial intelligence—artificial general intelligence—is on the horizon, as are the potential consequences if Safe AI isn’t the priority.
At the core of Safe AI is the assumption that artificial general intelligence presents a risk to humanity. Rather than approaching this problem by trying to impart human values and preferences to machines at the task or goal level—a potentially impossible feat—AI Safety aims to:
In doing so, we would ensure that AI processes and goals respect humans at a macro level, rather than trying to achieve the same at a micro level—giving machines a predisposition to be friendly toward us as part of the core of their intelligence.
As an engineering development philosophy, AI Safety treats AI system design like product design, where every angle of product liability is examined and tested, including uses, misuses, and potential vulnerabilities. Figure 1 illustrates AI Safety’s emerging principles and recommendations.
Figure 1: AI Safety Engineering emphasizes developing intelligent systems capable of proving that they are safe even under recursive self-improvement.
AI Safety Engineering is a burgeoning discipline with much to be researched, discussed, and codified. Mouser Electronics is pleased to present this blog series to expose AI engineers to key concepts and encourage participation in its ongoing development:
Part 2 of this series highlights what we’ve learned from the past 60-plus years of AI development: AIs fail because of misalignment between machine goals and human values and preferences. It also discusses why imparting human values and preferences to machines would be an unsolvable problem, and it points to the need for Safe AI.
Part 3 discusses another reason why AI Safety is needed: AI advancements are pushing the boundaries of artificial narrow intelligence (ANI) systems and putting artificial general intelligence (AGI) in view.
Part 4 explores additional challenges in implementing AI Safety: unpredictability, unexplainability, and incomprehensibility.
Part 5 describes ways that AI Safety will change engineering. Developing deep use cases that get to the core of user values and examining intelligence vulnerabilities are two key topics here.
Part 6 concludes with discussions about using “artificial stupidity” to help us develop Safe AI. Limiting machine capabilities as well as understanding cognitive biases are key themes here.
Dr. Roman V. Yampolskiy is a Tenured Associate Professor in the Department of Computer Science and Engineering at the University of Louisville. He is the founding and current director of the Cyber Security Lab and the author of many books, including Artificial Superintelligence: A Futuristic Approach.