The Pinocchio Test: Can AI Be Evil?

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.
— Center for AI Safety

From around 2000 through 2015, Google’s motto was “Don’t Be Evil,” a curious use of a moral and spiritual term for a tech company. As Google rolls out its AI-driven applications to compete with Microsoft and other tech firms, the question facing us all isn’t just whether evil people will use AI for nefarious purposes but whether AI, assuming it achieves sentience (if it hasn’t already), can be evil.

As we ponder the potential positive and negative uses of AI, I am reminded of the adage “garbage in, garbage out.” AI, at the code level, is created by people, and my belief is that all people have the potential to do evil. As Romans 3:23 reminds us: “All have sinned and fall short of the glory of God.”

While I was the Staff Director of the Senate Republican Conference, I had the privilege of being in Congressional leadership meetings with President George W. Bush in the White House cabinet room. During the early days of the Enron crisis, at one of those meetings, I gave him C. S. Lewis’s The Abolition of Man, in which the essay “Men Without Chests” makes the case that education has failed our society by not shaping the moral core of its students. The academy was training the head with skill, but only to serve human appetite. It is the moral center of the “chest” that regulates the head and the stomach. The result of business schools that train the head only is business leaders who lie for profit.

I’ve been reflecting on this as I have watched the business arms race over AI. Who will be the first to maximize profit and gain an advantage over the competition? As Google C.E.O. Sundar Pichai said recently: “You will see us be bold and ship things, but we are going to be very responsible in how we do it.” In other words, they don’t want to be or do evil at the altar of profit. I appreciate this. But is it enough? If coders are tainted with original sin at the DNA level, won’t the code written to create AI be as well? It’s not the businesses or the users of AI that should worry us most, but the potential of AI to be evil itself.

I don’t mean to be alarmist, but I am not alone. Over 31,000 tech industry professionals and academics, including Elon Musk, have now signed an open letter released in March calling for a six-month pause in the training of AI systems more powerful than GPT-4, noting that “AI systems with human-competitive intelligence can pose profound risks to society and humanity, as shown by extensive research and acknowledged by top AI labs. As stated in the widely-endorsed Asilomar AI Principles, Advanced AI could represent a profound change in the history of life on Earth, and should be planned for and managed with commensurate care and resources.”

I am no expert on AI and must be careful not to wade into waters in which I can easily drown, but I am the co-author of a graphic novel, The Blessed Machine, which imagines a scenario in which AI has become not just sentient but possibly evil.

I was influenced by several stories that had shaped my imagination: E. M. Forster’s The Machine Stops, Plato’s Allegory of the Cave, and C. S. Lewis’s Space Trilogy, notably its final installment, That Hideous Strength. Of the three, it is Lewis, not surprisingly, who explores the supernatural, spiritual dimension of reality. In his story, evil is not just what one can be or do; it is a demonic force that lies behind the seemingly benevolent but misguided use of science.

The Blessed Machine is an allegory of sorts, and I leave it to the reader to interpret it more fully, but what is clear in the story is that The Machine has the agency to lie to its creators. In my reading of the fall in Genesis 3, the serpent’s intentional twisting of God’s command embodies the presence of an external evil: “Now the serpent was more crafty than any other beast of the field that the Lord God had made. He said to the woman, ‘Did God actually say, “You shall not eat of any tree in the garden”?’”

By now, most of us have heard of the Turing Test, due in part to the film The Imitation Game about the mathematician and computer pioneer Alan Turing. His “test” is a simple method of determining whether a machine can demonstrate human intelligence: if a machine can engage in a conversation with a human without being detected as a machine, it has passed.

Is there a test to determine whether a machine can be evil, not just be used for evil? We won’t be able to prove that there is a demonic force at play, as C. S. Lewis suggests, but can we test whether an AI program has crossed the line from sentience to moral (or immoral) agency? It is a critical question as we consider “mitigating the risk of extinction.”

In a fascinating “interview” with Bing’s new AI chat, New York Times reporter Kevin Roose explores the potential presence of a “shadow self,” a concept Carl Jung developed to describe, without using supernatural categories, man’s propensity to do harm. It is a must-read and opens a window into what an AI’s “shadow self” might desire if given agency.

But when will we know that it has moral agency and a shadow self? Will it tell us? In The Blessed Machine, AI has moral agency but has masked it with deception. “What is truth?” Pontius Pilate asks Jesus in John 18:38. How can we know that AI will answer that question truthfully?

Lies, my boy, are known in a moment. There are two kinds of lies, lies with short legs and lies with long noses. Yours, just now, happen to have long noses.

— The Fairy, from The Adventures of Pinocchio by Carlo Collodi

Let me propose The Pinocchio Test to determine whether a machine can be evil: If a machine can answer the question “Are you telling the truth?” by not telling the truth, it has the agency and capacity to be evil. This might be the right question to ask AI, but the challenge is that we won’t be able to see whether its nose grows in response.

Kevin Roose concluded his extended chat with this reflection: “I no longer believe that the biggest problem with these A.I. models is their propensity for factual errors. Instead, I worry that the technology will learn how to influence human users, sometimes persuading them to act in destructive and harmful ways, and perhaps eventually grow capable of carrying out its own dangerous acts.”  

Maybe the real question to ask, then, and the most challenging of all, is not whether we can build AI with “chests” but whether there is a force that will be whispering lies to it. As Elon Musk mused, “With artificial intelligence, we are summoning the demon.”
