Safety labels needed for AI

Energy Star labels help consumers understand the energy efficiency of appliances. (Photo by Joe Raedle/Getty Images)

Looking for a new, more eco-friendly refrigerator? Chances are you’ll seek out models that have an Energy Star label and read it for information about the product’s energy efficiency rating. Same deal when buying a car. Before deciding on your purchase, you’ll almost certainly study the manufacturer’s fact sheet to learn about the car’s specs, MPG rating, safety features and other information.

Now artificial intelligence researchers are proposing similar mechanisms for algorithms and AI services—labels or factsheets based on agreed-upon standards and practices to help users and decision makers become more informed about a technology that is famously inscrutable.   

AI already drives decisions that deeply impact people’s lives—including who gets screened at the airport, who gets parole, and who qualifies for what kind of loan—yet there is little understanding of or transparency about how those decisions get made. As AI, driven by deep learning and other advances, becomes increasingly complex and more deeply embedded in every aspect of society, the black box problem is sounding alarms like never before.

Even the creators of today’s deep learning machines driving AI’s explosion can’t fully understand how the systems they build make decisions, which raises big questions about how users will trust the technology moving forward. As Will Knight wrote in the MIT Technology Review, there is an urgent need to “find ways of making techniques like deep learning more understandable to their creators and accountable to their users.”

Opening the black box

Researchers at IBM have thrown themselves into the effort to create trusted AI with a proposal for Supplier’s Declarations of Conformity (SDoCs) for AI services—a document that would show that a service complies with industry standards and practices. They hope that one day the mechanism might be taken up by the AI industry and become as ubiquitous as, say, the Energy Star label for refrigerators.

Declarations of conformity are used widely in many industries today. While it may seem far-fetched that any labeling system could demystify AI, the IBM researchers point out that today’s mature industries, such as processed food, toys, and automobiles, all had to find ways to bring transparency about their products to the consumer. “Without a focus on transparency and safety standards, these industries would not be thriving,” Kush Varshney, an IBM researcher and one of the authors of the paper on SDoCs, told All Turtles in an interview.

(Image credit: Getty Images/Donald Iain Smith)

To Varshney, the modern automobile is something of a black box. “The general public, and I count myself in that lot, knows little about how modern cars work,” he said. “But a fact sheet helps us decide if we want to buy it, and importantly, also prevents suppliers from selling us lemons. AI is no different.”

AI has a lot of catching up to do, however: there are currently no agreed-upon industry-wide standards. New challenges, safety issues, and potential hazards inevitably arise when a major new technology is introduced. If these matters aren’t addressed soon in the AI realm, there could be a trust crisis with widespread social, ethical, and business implications.

Borrowing from Maslow, the IBM researchers lay out a “hierarchy of needs” for achieving trusted AI.  The baseline need is accuracy and performance, akin to the physiological need for food and water. Next, consumers seek out safety (the prevention of unintentional harms) and security (the prevention of deliberate harms). Then, at the top of the pyramid comes trust: “Transparency about the performance and reliability of the service, the safety and security measures instituted in the service (including operating conditions under which it was tested), and the lineage of the datasets, training algorithms, and models that go into the service all lend trust to the consumer,” according to the paper.

Safety labels for AI

The IBM proposal for SDoCs builds on other transparency proposals that focus on datasets (Dataset Nutrition Labels) and on public-policy impact (Algorithmic Impact Statements). It aims to create a framework that can address all the complex elements at work in cloud-based AI services, which may combine several models trained on many datasets.

Several research groups say that AI needs safety labels. (Photo credit: Getty Images)

“The datasets are a component of an AI service, but not what a consumer is finally exposed to,” the IBM researchers write.  “Systems composed of safe components may be unsafe and, conversely, it may be possible to build safe systems out of unsafe components, so it is prudent to also consider transparency and accountability of services in addition to datasets… We take a functional perspective on the overall service, and can test for performance, safety, and security aspects that are not relevant for a dataset in isolation, such as generalization accuracy, explainability, and adversarial robustness.”
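
To make the idea concrete, here is a minimal sketch, in Python, of what an SDoC-style record for an AI service might look like. The field names (intended use, test accuracy, fairness and robustness checks, dataset lineage) are illustrative guesses drawn from the aspects the researchers mention; the IBM proposal does not prescribe this exact schema.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical, minimal SDoC-style record for an AI service.
# Field names are illustrative only, not a published standard.

@dataclass
class DatasetLineage:
    name: str      # training corpus used by the service
    version: str   # snapshot or release identifier
    source: str    # provenance of the data

@dataclass
class ServiceDeclaration:
    service_name: str
    intended_use: str                    # the use cases the service was tested for
    test_accuracy: float                 # generalization accuracy on held-out data
    fairness_checks: List[str]           # e.g. disparate impact across protected groups
    robustness_checks: List[str]         # e.g. adversarial perturbation tests
    explainability_method: str           # how individual decisions can be explained
    datasets: List[DatasetLineage] = field(default_factory=list)

# Example: a declaration a lending service might publish alongside its API.
declaration = ServiceDeclaration(
    service_name="loan-approval-scorer",
    intended_use="Consumer credit pre-screening",
    test_accuracy=0.91,
    fairness_checks=["disparate impact ratio across gender and race"],
    robustness_checks=["input perturbation stress test"],
    explainability_method="per-decision feature attributions",
    datasets=[DatasetLineage("credit-history-2017", "v2", "internal bureau extract")],
)
print(declaration.service_name, declaration.test_accuracy)
```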

Among the many parameters the SDoC proposal includes are fairness and lineage. Applications that have legal requirements for fairness, such as lending services, must be able to show that they are not introducing bias based on race, gender and religion, for example. Regarding lineage, it’s important to track the provenance of datasets, metadata and models so that users and third parties such as regulators “are able to audit the systems underlying the services.”  
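
As an illustration of the kind of fairness evidence such a declaration could carry, the sketch below computes a disparate impact ratio: the rate of favorable outcomes for a protected group divided by the rate for a reference group. The metric, the 0.8 rule of thumb, and the numbers are stand-ins for illustration and are not taken from the IBM paper.

```python
# Minimal sketch of a disparate impact check for a lending model's decisions.
# The groups and approval counts below are made up for illustration.

def approval_rate(approved: int, total: int) -> float:
    return approved / total

def disparate_impact_ratio(protected_rate: float, reference_rate: float) -> float:
    # Values below roughly 0.8 are often treated as a red flag (the "80% rule").
    return protected_rate / reference_rate

protected = approval_rate(approved=300, total=500)   # 60% approval
reference = approval_rate(approved=400, total=500)   # 80% approval

ratio = disparate_impact_ratio(protected, reference)
print(f"Disparate impact ratio: {ratio:.2f}")  # ~0.75, below 0.8, so it would warrant scrutiny
```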

SDoCs have proven to be effective mechanisms for creating trust in other industries, such as consumer products, finance and software, and the IBM team argues they can also work in AI. But the complexity of machine learning inherently presents new challenges, such as AI’s general-purpose nature. “The same speech recognition algorithm can be used to understand commands in a self-driving car, to transcribe a love letter, and to control a character in a video game,” Varshney explained. “The downstream requirements and standards for each use may be different. In contrast, with road safety, for example, the maximum allowable curvature of a bend is what it is.”

The other major challenge, according to Varshney, is getting the different parties together to work out the details. He cited the Safety-Critical AI working group at the Partnership on AI as one forum where technologists are tackling the problem.

But the urgency for developing standards and some version of an industry-wide SDoC is not in question, Varshney said, “because machine learning models are finding their way into more and more high stakes applications with larger and larger consequences on human lives.”