Can artificial intelligence be both competitive and ethical?


When people voice their concerns about artificial intelligence, the most commonly discussed scenario is a world in which humans are no longer the smartest beings on the planet, and we worry whether peaceful coexistence with our sentient AI overlords will be possible.

This view is understandable, if gravely misguided. Media and popular culture have led us to believe that we are headed towards some bleak, Terminator-style future. The truth is that we have far more prosaic issues to contend with: accountability, transparency and truthfulness surrounding the collection, use and storage of the single most important ingredient for AI—data.

The ethical issues of collecting and storing data

Whether or not you have strong opinions about the ethics of data and AI, we are all directly affected by how these questions are resolved. Every time we use a smartphone or a computer to kill time on YouTube, or to manage important aspects of our lives—such as personal finances, relationships or filing taxes—we create data. Every click, search, comment, like, email and chat message generates data, which is a highly valuable asset for both government and commercial organizations looking to enhance their algorithms or train their AI models.

Even though you and I are the creators of this asset, we often have little to no control over what data is actually being collected, and what happens to that data afterwards. While the introduction of the GDPR last year gave many people stronger legal protection of their personal information, it did little to give them real control over which data they actually share.

In November, Edward Snowden spoke via video link at the Web Summit technology conference in Lisbon about the widespread collection of data by governments and corporations. The former National Security Agency contractor who blew the whistle on numerous global surveillance programs criticized the GDPR because, according to him, the regulation and protection of data presumed that the collection of data in the first place was proper. He went on to claim that the most powerful institutions in society have become the least accountable.

Building trust and steering AI in the right direction

Conversational AI, the technology behind speech-based assistants and chatbots that engage with people in a humanlike way by understanding speech and intent, places vendors in a precarious position: they are in the business of helping governments and large corporations strategically automate customer service. Thanks to the instantaneous, low-threshold interaction that a virtual agent offers, the potential for data collection on a large scale is enormous.

Vendors need to ensure that their technology is not used for political, unethical or illegal purposes. Like Gartner, I believe that digital ethics is key to success with artificial intelligence. When it comes to ethical conversational AI, I believe we should look to the Ethics Guidelines for Trustworthy Artificial Intelligence that were presented in April 2019 by a European Union high-level expert group on artificial intelligence. These guidelines present three components—lawful, ethical and robust AI—in a framework for achieving ‘trustworthy AI’.

On the subject of ethical AI, the guidelines state that artificial intelligence systems should adhere to the basic ethical principles of respect for human autonomy, prevention of harm, fairness and explicability.

These principles should be rooted in everything we do, but in light of increasing concerns about accountability, the principle of explicability is especially important: it is crucial for building and maintaining users’ trust in AI systems. We need to ensure that every process is transparent, that the capabilities and purpose of AI systems are openly communicated, and that decisions, to the best possible extent, are explainable to the clients, partners and end users who are directly and indirectly affected.

From ethics to action: minimizing the amount of data needed

The question remains, however, whether it is actually possible to develop, train and apply AI systems that comply with regulations and ethical standards. And if so, can an AI startup still compete in a global market alongside tech giants that aren’t exactly known for accountability or respect for privacy?

The answer is yes: competitive, robust AI systems and data privacy do not have to be mutually exclusive—particularly in the case of conversational AI.

Conversational AI is, ultimately, about understanding and then finding the right answers to a user’s questions. While data is key in training the underlying algorithms that power a virtual agent, it is not necessary to collect and store as much data as possible from your users in order to maximize the value of the technology.

Using synthetic data when training an AI model can often be better than using raw data from conversations between users and virtual agents. This is because when people articulate questions or requests, they often do so in a way that is neither efficient nor optimal for the purposes of training an AI model.
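As a minimal sketch of what that looks like in practice, the example below trains a small intent classifier on clean, deliberately written synthetic utterances rather than raw chat logs. The intents, phrasings and library choice (scikit-learn) are illustrative assumptions for this article, not any particular vendor’s pipeline:

```python
# Illustrative sketch: train an intent classifier on synthetic, curated
# utterances instead of stored end-user conversations. The intents and
# phrasings below are invented for the example.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Synthetic training data: clean paraphrases authored by AI trainers,
# one label per intent. No raw user data needs to be retained.
synthetic_examples = [
    ("how do I reset my password", "reset_password"),
    ("I forgot my password", "reset_password"),
    ("help me change my login password", "reset_password"),
    ("what is my account balance", "check_balance"),
    ("show me how much money I have", "check_balance"),
    ("can you tell me my balance", "check_balance"),
]
texts, labels = zip(*synthetic_examples)

# A small, interpretable model: TF-IDF features + logistic regression.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

# The classifier still generalizes to messier, unseen phrasings at run time.
print(model.predict(["i cant remember my password at all"]))  # -> reset_password
```

The point is not the specific model but the data discipline: a handful of well-chosen synthetic examples per intent can outperform a large pile of noisy transcripts.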

Vendors can also invest in making their algorithms sophisticated enough that they don’t require huge sets of training data to begin with.

A useful way of illustrating this approach to training AI systems is the classic ‘needle in a haystack’ analogy. If training conversational AI to find the right answer is like learning to find a needle in a haystack, then collecting and adding more raw data is the equivalent of piling up more and more haystacks. Because raw data is unstructured, this only makes the proverbial needle (the right answer) harder to find. Feeding more unstructured (i.e., bad) training data to the algorithm doesn’t yield the desired result and can actually make the AI worse at understanding the end user.

A smarter approach is to use synthetic data to train the models, letting the raw, unstructured data from conversations act as a guide to the AI trainers responsible for building and maintaining the AI. The raw data can be used to identify potential areas where a virtual agent requires additional training and then used as a basis to build synthetic training data that can be fed to the model. This makes it possible to respect end-user privacy by lowering the retention time on conversation data.
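A rough sketch of that review loop, assuming a confidence-scored model like the one in the earlier example (the `author_synthetic` hook is a hypothetical stand-in for the human AI trainers):

```python
def review_and_retrain(model, raw_utterances, synthetic_corpus,
                       author_synthetic, threshold=0.6):
    """Sketch of a privacy-aware training loop (hypothetical helper names).

    Raw conversations guide the trainers but are never kept as training
    data; the model is retrained on synthetic examples only.
    """
    # 1. Use raw conversations only as a guide: flag utterances the model
    #    currently handles with low confidence.
    flagged = [text for text in raw_utterances
               if max(model.predict_proba([text])[0]) < threshold]

    # 2. Trainers author clean synthetic paraphrases inspired by the
    #    flagged phrasings; the synthetic corpus grows, the raw text does not.
    synthetic_corpus.extend(author_synthetic(flagged))

    # 3. Retrain on synthetic data only, then honor a short retention
    #    window by discarding the raw conversation logs.
    texts, labels = zip(*synthetic_corpus)
    model.fit(texts, labels)
    raw_utterances.clear()
    return model
```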

With this approach, companies can still implement an artificially intelligent virtual agent powered by conversational AI while minimizing the need to collect data on their customers. It also allows a startup from a small country like Norway to compete in the global marketplace and help create a world of ethical, trustworthy AI.

Technology is neither inherently good nor evil. It is the ethical choices made by humans and companies that define its outcome. For AI to thrive as a force for positive change in the world, businesses need to pay attention, constructively voice their concerns, and continue to demand accountability from those building the technology.
