We need a better way to test artificial intelligence.

David Wagner, Executive Editor, Community & IT Life

November 21, 2014

Artificial intelligence is here -- or very near -- and it's time we had a more sophisticated way to measure it than the outdated Turing test. A Georgia Tech professor is offering an alternative test he calls Lovelace 2.0. Is it the test we need?

First, let's discuss what is wrong with the Turing test and other major tests of artificial intelligence.

The Turing test, perhaps the most famous test of artificial intelligence, was conceived in the 1950s -- which itself is a red flag. How could its developers conceive of the revolution that was to come when computers were still using vacuum tubes? Beyond that, the Turing test relies on deception; the computer needs to trick humans. This is problematic for many reasons.

First, the notion of artificial intelligence that's specifically designed to deceive is not only creepy, but also potentially harmful. Even if you don't tell it that it is trying to deceive when it interacts with humans, the programmers have built that idea into the device's thinking.

Just as importantly, human behavior isn't always intelligent. Acting human and being intelligent are two very different things. Tricking a human into thinking you are human is a language exercise as much as it is an intelligence exercise.

Basically, it comes down to a simple fact: Tricking a human in a chat room doesn't mean your AI is suited to do anything else of value or intelligence.

Other proposed tests rely on ambiguous language questions that often trick computers but that humans pass easily, or they center on reading comprehension or writing.

What we need, according to Georgia Tech's Mark Riedl, is a test that shows a broad base of potential skills and types of intelligence. He suggests that a truly, err… intelligent artificial intelligence "develops a creative artifact from a subset of artistic genres deemed to require [human-level intelligence] and the artifact meets certain creative constraints given by the human evaluator."

Interesting. To be smart, Riedl says, you must be creative and potentially "artistic" -- and that does seem to require more intelligence than a chat bot. But I'm not sure why, for instance, an artificial intelligence designed to drive a self-driving car safely isn't intelligent simply because it can't tell a joke, write a story, or paint a painting.

There are robots that can do these things -- we've seen painting robots, joke-writing software, and even an AI that creates magic tricks. It seems to me that making a magic trick takes some serious intelligence and insight into humans. A harder Turing test might require the AI to create a magic trick -- rather than hold a chat -- that fools humans.

But does any of this indicate intelligence, or is it simply a sign of adapting to a specific job? Intelligence, to me, means being able to adapt to a demand you haven't been programmed to meet. If your joke-writing program can switch to writing poems, for example, that's intelligence. Anything else might be really neat -- but it's specialized.

So what test would I suggest? I have a few ideas that might seem a little out there -- but here goes:

Can it take the latest Cosmo quiz?
If your computer can read, comprehend, and honestly answer a Cosmo quiz, it boasts serious reading comprehension skills, an ability to process casual language, and an ability to answer questions with shades of gray. If your AI can find out what kind of animal it is in bed, for example, it really is intelligent (and possibly a great date).

Can it get across a bridge to find the Holy Grail?
I think the ability to respond to a question with another question might be one of the best signs of intelligence. I also want to see someone ask Watson about its quest.

Can it help Bunny Watson figure out who to marry?
In the classic movie Desk Set, Katharine Hepburn, playing Bunny Watson, believes her job as a researcher for a TV network is about to be made redundant by a computer that seems to know everything. (Incidentally, her job actually will be made redundant by the Internet decades later.) The computer in the movie can instantly answer questions of fact, but it can't evaluate. Ultimately, Spencer Tracy asks Hepburn to cancel her engagement after asking the computer if she should marry.

We now have half the technology this movie envisioned in 1957. If we get the other half, we'll have a pretty smart computer.

OK, so none of these are perfect tests. One reason for that is that we've never really been able to come up with an accurate test for intelligence in people. Presumably, if we can't reliably test intelligence in people, we can't test it in a computer.

What do we really mean when we talk about an artificial intelligence? Is it simply something that can do a single task well, or is it something that can reason through any major task? Do you agree with my notion that true intelligence is doing something you've never been programmed to do -- or is that unfair? After all, I have to program myself to learn to play the violin.

What is your test of intelligence? Tell me in the comments.

About the Author(s)

David Wagner

Executive Editor, Community & IT Life

David has been writing on business and technology for over 10 years and was most recently Managing Editor at Enterpriseefficiency.com. Before that he was an Assistant Editor at MIT Sloan Management Review, where he covered a wide range of business topics including IT, leadership, and innovation. He has also been a freelance writer for many top consulting firms and academics in the business and technology sectors. Born in Silver Spring, Md., he grew up doodling on the back of used punch cards from the data center his father ran for over 25 years. In his spare time, he loses golf balls (and occasionally puts one in a hole), posts too often on Facebook, and teaches his two kids to take the zombie apocalypse just a little too seriously. 
