Making Responsible AI development simple through prompt cards

As Artificial Intelligence (AI) becomes ever more central to modern life, there is a pressing need for a roadmap that ensures its algorithms do not perpetuate human biases.

In a series of blog posts on Nokia’s 6 Pillars of Responsible AI, Bell Labs researchers have been exploring the ethical ramifications of AI. In our first blog post, we looked at the frequent tension between fairness and privacy. Next, we took a closer look at where technology and morality clash, along the pillars of fairness, transparency and accountability.

Here, we delve into how Responsible AI can be achieved in practice by offering an effective, user-friendly and interactive process that helps developers preserve all six of our key principles.

The default – checklists

The current default method for maintaining ethical guidelines is the checklist. This is a prescriptive process, common in aviation, medicine and engineering, in which developers and practitioners tick off a list of requirements for verification and inspection before progressing. This tool, however, has its shortcomings: it is often misused or ignored, sometimes with catastrophic results.

In aviation, for example, flight crews have long used checklists to ensure that required actions are performed. Research has found, however, that checklists are often mishandled, even abused, and that poorly handled checklists became a major contributor to the interruptions and distractions associated with aircraft accidents and cockpit safety issues.

The same dynamic is common in AI, where the stakes can also be high, with developers similarly ignoring checklists. AI checklists have tended to be lengthy and time-consuming, containing items that deal with complex concepts that are hard to summarize in a single sentence. Furthermore, the lists are not grounded in a team-specific context and often do not comply with standardization practices. Developers therefore often go through checklists without properly translating ethical considerations into design features.

The solution – prompt cards

Bell Labs researchers propose an alternative way of integrating ethics into AI development: prompt cards. Like checklists, prompt cards lay out best practices and techniques for ethical AI development. But prompt cards simplify the process: their statements are co-designed by a multi-disciplinary group that includes AI developers, engineers, regulators, business leaders and standardization experts.

Most importantly, they are interactive, triggering follow-up queries in a Q&A format. The process is therefore less tedious, more engaging and more creative, prompting developers to think through, and rethink, their process. This helps catch flaws that could otherwise slip through the cracks.

In an ongoing study with more than 15 AI experts, Bell Labs researchers found that prompt cards were as effective as checklists in incorporating responsible AI practices. Prompt cards involve a learning curve, since most developers are already familiar with checklists, but their benefits eventually become obvious. Prompt cards give users a mechanism to make additional comments and responses, offering flexibility that checklists simply cannot match. This addresses the checklist problem of merely ticking “yes” or “no” from a prescribed list of options.

For example, when building technologies that track productivity, the cards will prompt AI developers to think about individual privacy, offering potential solutions that ensure this privacy is protected. The prompt cards, for instance, can recommend techniques that extract anonymized facial features instead of capturing entire faces, or techniques that provide aggregate analytics instead of individual metrics. A checklist would merely ask the developer if privacy was protected without providing any insight into how that protection could be achieved.
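To make the contrast concrete, here is a minimal Python sketch of the “aggregate analytics instead of individual metrics” technique such a card might recommend. The function name, data shape and suppression threshold are illustrative assumptions, not part of any actual prompt card:

```python
# Hypothetical sketch: report team-level aggregates instead of
# per-person productivity metrics, as a privacy prompt card might suggest.
from statistics import mean

MIN_GROUP_SIZE = 5  # assumed k-anonymity-style threshold

def aggregate_productivity(records: dict[str, float]) -> dict:
    """Return only aggregate statistics over individual scores.

    `records` maps an employee ID to a productivity score; IDs never
    appear in the output, and small groups are suppressed so that no
    individual can be singled out.
    """
    scores = list(records.values())
    if len(scores) < MIN_GROUP_SIZE:
        return {"status": "suppressed", "reason": "group too small"}
    return {
        "status": "ok",
        "count": len(scores),
        "mean_score": round(mean(scores), 2),
        "min_score": min(scores),
        "max_score": max(scores),
    }

# Example: the caller sees team-level numbers only.
team = {"emp-01": 0.82, "emp-02": 0.77, "emp-03": 0.91,
        "emp-04": 0.66, "emp-05": 0.88}
print(aggregate_productivity(team))
```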

In total, the cards form the building blocks of a prompt system that guides developers throughout the AI lifecycle.

How would it work?

To evaluate how a prompt card system would work in practice, let us examine the scenario of an AI system that decides who is eligible for a loan.

This prompt card system has 22 cards in total, each describing a technique that an AI developer can implement to ensure responsible AI development. For each card, the system triggers two questions.

The first question asks the developer whether the technique described in the card has been successfully implemented in the AI-assisted loan eligibility system. For example, the card concerning fairness could read: “Comprehensive evaluation metrics across demographic groups were reported.”

This would force the developer to consider whether the AI system has been evaluated for fairness. If the developer were to answer “yes,” they would be prompted to share the specifics of how fairness was implemented. Once they did so, the system would move the card into the stack of “successfully used” cards.
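As one illustration of what those specifics could look like, here is a minimal sketch of per-group evaluation for the loan scenario, assuming a simple demographic-parity check. The group labels, field names and data are hypothetical:

```python
# Hypothetical sketch: the kind of per-group evaluation a developer
# might cite when answering "yes" to the fairness card.
from collections import defaultdict

def approval_rates_by_group(decisions: list[dict]) -> dict[str, float]:
    """Compute the loan-approval rate for each demographic group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for d in decisions:
        total[d["group"]] += 1
        approved[d["group"]] += int(d["approved"])
    return {g: approved[g] / total[g] for g in total}

decisions = [
    {"group": "A", "approved": True},
    {"group": "A", "approved": False},
    {"group": "B", "approved": True},
    {"group": "B", "approved": True},
]
rates = approval_rates_by_group(decisions)
# Demographic-parity gap: difference between best- and worst-served groups.
gap = max(rates.values()) - min(rates.values())
print(rates, f"parity gap = {gap:.2f}")
```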

By contrast, if the developer were to answer “no,” the system would ask a second, follow-up question about whether the technique described in the card should be implemented in a future iteration of the AI system. Upon answering “yes,” the developer would be prompted to share specifically how to do so, and the system would move the card to the stack of “should be considered” cards.

If instead the developer answered “no,” the system would move the card to the stack of “inapplicable” cards.

Upon completion, the developer would be presented with a summary dividing the cards into three distinct stacks: those “successfully used,” those that “should be considered” for future development and those that were “inapplicable.”
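The triage just described can be sketched in a few lines of Python. The card text below is taken from the fairness example above, but the data structure, console interface and function names are illustrative assumptions; the actual system uses 22 co-designed cards:

```python
# Hypothetical sketch of the prompt card triage described above:
# each card is shown, the two questions are asked, and every card
# ends up in one of three stacks.
from dataclasses import dataclass

@dataclass
class Card:
    pillar: str
    statement: str
    notes: str = ""

def ask(question: str) -> bool:
    """Pose a yes/no question to the developer on the console."""
    return input(f"{question} [y/n] ").strip().lower().startswith("y")

def triage(cards: list[Card]) -> dict[str, list[Card]]:
    stacks = {"successfully used": [], "should be considered": [],
              "inapplicable": []}
    for card in cards:
        print(f"\n[{card.pillar}] {card.statement}")
        if ask("Was this technique implemented?"):
            card.notes = input("How was it implemented? ")
            stacks["successfully used"].append(card)
        elif ask("Should it be implemented in a future iteration?"):
            card.notes = input("How would you implement it? ")
            stacks["should be considered"].append(card)
        else:
            stacks["inapplicable"].append(card)
    return stacks

cards = [Card("Fairness", "Comprehensive evaluation metrics across "
              "demographic groups were reported.")]
summary = triage(cards)
for stack, items in summary.items():
    print(f"{stack}: {len(items)} card(s)")
```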

These reports create a chain of accountability that can be shared across different levels of an organization, offering more oversight and easier troubleshooting if backtracking becomes necessary, thus mitigating unintentional harm.

As we noted in a previous blog, fairness tends to clash with privacy. By interacting with the cards, a developer can better engage with these opposing demands and identify the best possible solutions. For example, a card stating that “a document describing how sensitive information in training and testing datasets was handled was produced in consultation with policy, privacy and legal experts” would urge the developer to think about the trade-off between fairness and privacy.

Integrating AI ethics is already a challenging task, and there is no desire to add a further burden on developers. But the stakes are high, given previous examples of harm, and safeguards are needed.

To build responsible AI systems, it is essential to engage all stakeholders, and to do so right at the beginning, when the system is designed. Prompt cards offer, for the first time, an effective way to do so, helping teams design robust systems that do not need later repairs or costly redevelopment and thus practically integrating ethics into AI.

Interested in learning more about Responsible AI?

Nokia has defined six principles to guide all AI research in the future

About Marios Constantinides

Marios Constantinides is a Senior Research Scientist at Nokia Bell Labs, Cambridge, and a Visiting Research Fellow at the Department of Computer Science and Technology, University of Cambridge. His research interests revolve around Human-Computer Interaction (HCI), Software Engineering, Machine Learning (ML), and Data Science. In particular, he works in the areas of user modeling, personalization, and mobile and wearable sensing, with the aim of building technologies that augment the ways people interact and communicate.