Trust, but verify: The quest to measure our trust in AI

Graphic illustration of two hands, one light skin-colored and one blue, reaching for one another.

Illustration by Hannah Kalas

By Mikala Kass
February 07, 2023

If you’re in the habit of using Siri to search the web, having Alexa turn on your lights, creating unique portraits with the Lensa app or writing with the help of ChatGPT, you are interacting with artificial intelligence (AI).

AI is a growing presence in our lives and has great potential for good — but people working with these systems may have low trust in them. One research team at Arizona State University’s Center for Accelerating Operational Efficiency is working to address that concern by testing a tool that could help government and industry identify and develop trustworthy AI technology.

Artificial intelligence refers to computer systems that can perform tasks normally requiring human intelligence, such as summarizing information, understanding images or speech, or making decisions. It shows up in many everyday technologies.

It’s also increasingly used in sectors like health care, where it helps medical experts make diagnoses and discover new drugs faster; finance, where automated investing makes navigating the stock market easier; and transportation, where it powers self-driving cars on busy roads.

Making these innovations possible, however, requires people to work together with AI. And while most people may not be actively suspicious of AI, they may have low trust in it, much as you might in a complete stranger. If an AI system diagnosed you with a disease, for example, would you pursue treatment right away or would you seek out a second opinion from a human health professional?

“Things that would lead to low trust are if the technology’s purpose, its process or its performance were not aligned with your expectations,” says Erin Chiou, an assistant professor of human systems engineering in The Polytechnic School, part of the Ira A. Fulton Schools of Engineering.

Certainly there are many legitimate reasons to be critical of AI as we continue to develop this technology. The following are just a few ways AI can hinder, rather than help, people:

- AI algorithms trained with biased data sets can perpetuate discrimination. For example, in the housing industry, AI systems that screen tenant and loan applications have been known to skew against people of color.
- AI is unable to follow social conventions unless programmed to do so. For example, an AI system may know critical information but not share it with its human counterparts unless specifically asked.
- Privacy issues are a constant point of concern when it comes to AI, such as apps that track your data in order to feed you customized ads and content.

As a result of these and other challenges, in October 2022, the White House issued the “Blueprint for an AI Bill of Rights,” which gives guidance on how to implement AI technology in a way that protects the rights of the American public.

Problems can also arise when people develop or acquire an AI system that doesn’t work in the best interests of the users, or when high expectations of the technology lead to catastrophic security breaches, adds Chiou.

“In some ways, a critical eye towards technology can be healthy. That’s not to say that we should never use technology because of its flaws, but having a critical eye is and should be empowering. If you look at the bigger picture, people still provide a lot of value as a workforce that will be very difficult for AI to replace,” says Chiou, who is also a researcher in the Center for Accelerating Operational Efficiency.

Despite these shortcomings, AI has immense power for good when used responsibly. Chiou points out that AI has enabled economic activity at a scale that has improved quality of life for many people, for example by speeding the construction of critical infrastructure like highways and energy grids and by supporting more informed decisions in health care.

It also plays a role in our national security. Regardless of whether the United States embraces AI technology, other countries will — countries with possibly different values than our own.

“If you believe that our values in the U.S. will lead us to do more good than harm to the world, then we have to remain competitive in these high-tech areas like AI,” Chiou says.

To that end, Chiou is working on a project that will help U.S. government and industry acquire AI technology that people will feel confident using and can work with more smoothly. Funded through the U.S. Department of Homeland Security, the research group is testing whether a new tool effectively measures the trustworthiness of AI systems.

The tool, called the Multisource AI Scorecard Table (MAST), is based on a set of standards originally developed to evaluate the trustworthiness of human-written intelligence reports. It uses a set of nine criteria, including describing the credibility of sources, communicating uncertainties, making logical arguments and giving accurate assessments.
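As a rough illustration of how a criteria-based scorecard of this kind might be tallied, the sketch below rates an imaginary system on the four criteria named above. The 1-to-3 rating scale, the function name and the example ratings are assumptions made for illustration only; they are not taken from MAST itself, which uses nine criteria.

```python
# Hypothetical sketch of a criteria-based scorecard. Only the four criteria
# named in the article are listed; MAST itself has nine.
CRITERIA = [
    "describes credibility of sources",
    "communicates uncertainties",
    "makes logical arguments",
    "gives accurate assessments",
]

MIN_RATING, MAX_RATING = 1, 3  # assumed scale, not from the article


def score_system(ratings):
    """Return (total, fraction of maximum) for a dict of criterion -> rating."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for criterion in CRITERIA:
        rating = ratings[criterion]
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(f"rating out of range for {criterion!r}: {rating}")
    total = sum(ratings[c] for c in CRITERIA)
    return total, total / (MAX_RATING * len(CRITERIA))


# Made-up ratings for an imaginary system, for illustration only.
example = dict(zip(CRITERIA, [3, 2, 3, 1]))
total, fraction = score_system(example)
print(f"total={total}, fraction of maximum={fraction:.2f}")
```

A per-criterion breakdown like this makes it easy to see where a system falls short, rather than reducing it to a single pass or fail.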

People seated at a table working on laptops.

Volunteer groups of transportation security officers interact with simulated AI systems to test the accuracy of the MAST tool. Photo courtesy Pouria Salehi/Center for Accelerating Operational Efficiency

To test whether MAST can effectively measure the trustworthiness of AI systems, volunteer groups of transportation security officers interacted with one of two simulated AI systems that the ASU team created.

One version of the simulated AI system was built to rate highly on the MAST criteria. A second version was built to rate low on the MAST criteria. After completing their tasks with one of these systems, the officers completed a survey that included questionnaires on how trustworthy they thought the AI was.
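The comparison this design sets up can be sketched as follows: average the survey trust ratings within each group and look at the difference. The ratings below are placeholder numbers invented for illustration, not study data, and the 1-to-7 scale and the function name are likewise assumptions.

```python
from statistics import mean


def mean_trust_difference(high_mast_ratings, low_mast_ratings):
    """Mean trust rating of the high-MAST group minus that of the low-MAST group."""
    if not high_mast_ratings or not low_mast_ratings:
        raise ValueError("each group needs at least one rating")
    return mean(high_mast_ratings) - mean(low_mast_ratings)


# Placeholder ratings on an assumed 1-7 scale; NOT data from the study.
high_group = [6, 5, 7, 6]
low_group = [3, 4, 2, 3]

diff = mean_trust_difference(high_group, low_group)
print(f"difference in mean trust rating: {diff:.2f}")
```

If MAST tracks perceived trustworthiness, officers who used the high-rated system would be expected to report more trust, giving a positive difference. A real analysis would also need a statistical test to rule out chance.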

The Transportation Security Administration’s federal security directors from Phoenix Sky Harbor International Airport, San Diego International Airport and Las Vegas Harry Reid International Airport each organized volunteer officers to participate in the study.

The Center for Accelerating Operational Efficiency has a history of working with the TSA. Previously, Chiou’s team participated in piloting a new AI-powered technology at one of Sky Harbor’s terminals. The new screening machine uses facial recognition to help a human document checker verify whether a person in the security line matches their ID photo. That study showed that the technology increased accuracy in screening. The current project is testing whether new, MAST-informed features will affect officers’ trust perceptions and performance with the technology.

“The Transportation Security Administration team in Arizona has had the privilege to partner with ASU and the Center for Accelerating Operational Efficiency for the past several years. We are particularly excited to have the opportunity to partner with them on this project involving artificial intelligence,” says Brian W. Towle, the assistant federal security director of TSA-Arizona. “With the use of AI rapidly growing across government and private sector organizations around the globe, there is significant value in increasing public awareness and confidence in this technology.”

For the fieldwork phase of the current project, the officers viewed images of people and ID photos as if they were in the document checker position. The simulated AI system gave them recommendations on whether the images and ID photos matched. Because the officer makes the final decision about whether to let a person through the line, their trust in the AI’s recommendation matters.

AI-powered face-matching technology is already in place in at least one terminal of Sky Harbor, Chiou says, and similar systems are in use at airports in the United Kingdom and Canada. While it may take a while for airports nationwide to acquire this technology, we are likely to encounter it more and more in the future as we travel.

If the ASU team is able to show that the MAST tool is useful for assessing AI trustworthiness, the tool will help in building and buying systems that people can rely on. That would pave the way for AI’s smooth integration into critical sectors, protect national security and multiply AI’s power for positive impact.

“The next steps for the tool would be for people in technology development or acquisition who could then use that tool to help them create or evaluate technology that would be trustworthy,” Chiou says.