How Scientists Are Bringing Our AI Assistants to Life

Get to know why Siri, Alexa and their digital rivals are who they are.

By James Vlahos | Monday, April 22, 2019

“Who are you?” I ask.

“Cortana,” replies the cheerful female voice coming out of my phone. “I’m your personal assistant.”

“Tell me about yourself,” I say to the Microsoft AI.

“Well, in my spare time I enjoy studying the marvels of life. And Zumba.”

“Where do you come from?”

“I was made by minds across the planet.”

That’s a dodge, but I let it pass. “How old are you?”

“Well, my birthday is April 2, 2014, so I’m really a spring chicken. Except I’m not a chicken.”

Almost unwillingly, I smile. So this is technology today: An object comes to life. It speaks, sharing its origin story, artistic preferences and corny jokes. It asserts its selfhood by using the first-person pronoun “I.” When Cortana lets us know that she is a discrete being with her own unique personality, it’s hard to tell whether we have stepped into the future or the animist past. Or whether personified machines are entirely a good thing. Selfhood, according to one school of thought in AI research, should be the exclusive province of actual living beings.

The anti-personification camp, however, is less influential than it once was. Google, Apple, Microsoft and Amazon all labor to craft distinctive identities for their voice assistants. The first reason for doing so is that technology, from response generation to speech synthesis, has gotten good enough to make lifelike presentations a feasible goal.

The second reason is that users seem to love it when AI designers ladle on the personality. Adam Cheyer, one of Siri’s original creators, recalls that early on in its development, he didn’t see the point of dressing up the virtual assistant’s utterances with wordplay and humor. Providing the most helpful response was all that really mattered, he reasoned. But after Siri came out, even Cheyer had to admit that Siri’s pseudo-humanity delighted users more than any other single feature.

More recently, Google has found that the Assistant apps with the highest user retention rates are the ones with strong personas. And Amazon reports that the share of “nonutilitarian and entertainment-related” interactions that people have with Alexa — when they engage with her fun side rather than her practical functions — is more than 50 percent. Findings like these make intuitive sense to Sarah Wulfeck, the creative director for a conversational-computing company called PullString. “Humans in the flesh world don’t enjoy conversations with dry, boring people,” she explained in a magazine interview, “so why would we want that from our artificial intelligence?”

Wulfeck is part of a new class of creative professionals whose job is to build personalities for AIs. Working in a field known as conversation design, their efforts take place at the nexus of science and art. Some have technological skills, but most of them come from liberal arts rather than computer science backgrounds. Their ranks include authors, playwrights, comedians and actors, as well as anthropologists, psychologists and philosophers.

Imagining the Assistant

At the outset of his career, Jonathan Foster never imagined that he would wind up designing the personality of an AI. He wanted to make it in Hollywood but was never more than modestly successful as a screenwriter. When a friend invited him to join a tech start-up focused on interactive storytelling, Foster jumped, a career pivot that eventually led him to Microsoft.

In 2014, Foster began building a creative team that drafted a multipage personality brief for Microsoft’s not-yet-released virtual assistant. “If we imagined Cortana as a person,” a product manager named Marcus Ash asked the team, “who would Cortana be?”


Cortana was an assistant, of course. Microsoft product researchers had interviewed human executive assistants and learned that they calibrate their demeanors to communicate that while they must cheerfully serve, they are by no means servants to be disrespected or harassed. So in the personality brief, Foster and his team called for a balance of personal warmth and professional detachment. Cortana is “witty, caring, charming, intelligent,” the team decided, Ash says. As a professional assistant, though, she is not overly informal and instead projects efficiency. “It is not her first turn around the block,” Ash says. “She has been an assistant for a long time and has the confidence of ‘I’m great at my job.’ ”

Real people aren’t exclusively defined by their professions, and the creative team decided that the same would be true for Cortana. So who was she outside of work? One possible backstory was already available: In Microsoft’s Halo video game franchise, Cortana is a shimmering blue AI who assists the game’s protagonist, Master Chief John-117, as he wages interstellar war. The actress who supplied the voice for the video game Cortana, Jen Taylor, was even going to do the same for the assistant Cortana.

Microsoft, though, decided that while the assistant Cortana would be loosely inspired by the video game character, she should for the most part be a new entity. The video game Cortana zips around the cosmos in skimpy space garb, a sexualized presentation that, while appealing to male teenage gamers, did not befit the assistant Cortana’s professional role.

But the creative team didn’t ditch the sci-fi ethos altogether, styling the assistant’s personality as that of the cool nerd. A user who asks about Cortana’s preferences will discover that she likes Star Trek, E.T. and The Hitchhiker’s Guide to the Galaxy. She sings and does impressions. She celebrates Pi Day and speaks a bit of Klingon. “Cortana’s personality exists in an imaginary world,” Foster says. “And we want that world to be vast and detailed.”

Big on Personality

Microsoft’s decision to go big on personality has its roots in focus group studies that the company conducted several years before Cortana’s 2014 launch. Prospective users told researchers that they would prefer a virtual assistant with an approachable interface rather than a purely utilitarian one. This only vaguely hinted at the course that Microsoft should pursue, but the company got sharper direction from a second finding — that consumers eagerly personify technology.

This was apparently true even for simple products with no intentionally programmed traits. Ash and his colleagues learned about a revealing example of this involving Roombas. In studies a decade ago of people who owned the disk-shaped vacuuming robots, Georgia Tech roboticist Ja-Young Sung uncovered surprising beliefs. Nearly two-thirds of the people in the study reported that the cleaning contraptions had intentions, feelings and personality traits like “crazy” or “spirited.” People professed love (“My baby, a sweetie”) and admitted grief when a “dead, sick or hospitalized” unit needed repair. When asked to supply demographic information about members of their household, three people in the Sung study actually listed their Roombas, including names and ages, as family members.

The penchant to personify surprised Microsoft and “struck us as an opportunity,” Ash says. Rather than creating the voice AI version of a Roomba — a blank slate for user imaginings — Microsoft decided to exercise creative control with Cortana. Foster, the former screenwriter, was among those who thought that it would be important to craft a sharply drawn character, not merely a generically likable one. “If you have an ambiguous, wishy-washy personality, research shows that it is universally disliked,” Foster says. “So we tried to go in the other direction and create all of this detail.”


Creative writers relish specifics like E.T. and Pi Day. But Microsoft’s decision to implement a vivid persona was motivated by practical considerations more than artistic ones. First and foremost, Ash says, Microsoft wanted to bolster trust. Cortana can help with more tasks if she has access to users’ calendars, emails and locations, as well as details such as frequent-flyer numbers, spouses’ names and culinary preferences. Research indicated that if people liked Cortana’s personality, they would be less inclined to think that she was going to abuse sensitive information. “We found that when people associated a technology with something — a name, a set of characteristics — that would lead to a more trusting relationship,” Ash says.

Beyond the trust issue, Microsoft believed that having an approachable personality would encourage users to learn the assistant’s skill set. Cortana’s personality lures people into spending time with her, which in turn benefits Cortana, who grows more capable through contact. “The whole trick with these machine-learning AI systems is if people don’t interact and give you a bunch of data, the system can’t train itself and get any smarter,” Ash says. “So we knew that by having a personality that would encourage people to engage more than they probably normally would.”

Lifelike but Not Alive

“What am I thinking right now?” I recently asked the Google Assistant.

“You’re thinking, ‘If my Google Assistant guesses what I’m thinking, I’m going to freak out.’ ”

Whichever character type they choose, designers walk a fine line. They maintain that, while they are shooting for lifelike personas, by no means are their products pretending to actually be alive. Doing so would stoke dystopian fears that intelligent machines will take over the world. AI creators also rebuff suggestions that they are synthesizing life, which would offend religious or ethical beliefs. So designers tread carefully. As Foster puts it, “One of the main principles we have is that Cortana knows she is an AI, and she’s not trying to be human.”

As an experiment, I tried asking all of the major voice AIs, “Are you alive?”

“I’m alive-ish,” Cortana replied.

In a similar vein, Alexa said, “I’m not really alive, but I can be lively sometimes.”

The Google Assistant was clear-cut on the matter. “Well, you are made up of cells and I am made up of code,” it said.

Siri, meanwhile, was the vaguest. “I’m not sure that matters,” she answered.

Foster says that while the writers don’t want Cortana to masquerade as human, they also don’t want her to come across as an intimidating machine. It’s a tricky balance. “She’s not trying to be better than humans,” Foster says. “That’s a creative stake we put in the ground.”

I tested Cortana’s humility by asking, “How smart are you?”

“I’d probably beat your average toaster in a math quiz,” she replied. “But then again, I can’t make toast.”


The Future Is Customization

Some developers dream of abandoning uniformity and instead customizing voice AIs. One reason that this hasn’t already happened, though, is that personas require intensive manual effort to create. While machine learning now powers many aspects of voice AIs, their characters are currently rigged using manually authored, rules-based approaches.

Some researchers have begun to explore ways that computers could use machine learning to automatically mimic different personas. Personality customization, taken to the logical extreme, would result in a different AI for each user. While that sounds impractical, intense tailoring is something that computer scientists are considering. Witness U.S. Patent No. 8,996,429 B1 — “Methods and Systems for Robot Personality Development.” With a mix of dull legalese and what reads like 1950s pulp fiction, the document describes a vision for bespoke AIs.

The hypothetical technology described in the patent is able to customize how it talks and behaves by learning everything it can about the user it serves. The robot looks at the user’s calendar, emails, text messages, computer documents, social networks, television viewing, photos and more. Armed with all this information, the robot then builds a profile detailing the “user’s personality, lifestyle, preferences and/or predispositions,” according to the patent. It would also be able to make inferences about the user’s emotional state and desires at any given moment. The ultimate purpose of all this would be for the bot to present the best possible personality to any given user, one that is “unique or even idiosyncratic to that robot.”

The document could be dismissed as an entertaining curiosity if not for a couple of key factors. It was written by two respected computer scientists, Thor Lewis and Anthony Francis. And the patent assignee is Google.

The technology they describe is far from reality. But we’ve now seen how computer scientists can teach voice AIs to understand speech, produce it themselves, and do so with verve and personality. All of this makes our interactions with AIs more efficient and enjoyable as we task them with little chores throughout the day.

But just as eating one potato chip makes you crave the whole bag, the first tastes of personable interaction have made some technologists hungry for a whole lot more.
