Go ahead. Stare.
Inspect the weathered face of senior citizen Dr. Sid, with its backcountry topography of forked veins, crow's-feet, and liver spots. Lose yourself in the freckles and windblown hair of lovely Aki. Watch fear corrugate Neil's brow, and note that his sweat dribbles in complex courses, tracking every ridge of flesh.
The objects of such scrutiny don't mind, because they aren't real. These characters in the movie Final Fantasy: The Spirits Within look, move, and speak like human beings, but they are computer-generated animations, as artificial as Bart Simpson or Mickey Mouse.
As such, they are surprisingly unsettling. One last bastion of human discrimination— our ability to tell whether the person on the screen is real or simulated— is about to crumble. When the film, a production of Square Pictures, opens July 11, it will signal a change not only in cinema but also in television, computer monitors, handheld displays, and the other ubiquitous screens of modern life. Already thousands of people worldwide are swapping talking-head e-mails, getting news from synthetic broadcasters, even holding eye-to-eye conversations with databases. From now on, expect technology to literally have a human face.
Step 1: Skeletal Framework
The creation of a computer model for close-up shots of a virtual actor begins with what animators call a wire frame. The underlying structure of Dr. Sid's head is a malleable grid. "This defines every edge and every border," says artist Francisco Cortina. "To change the shape, you pick the points where the lines intersect and move them." Photo courtesy of Square Pictures
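For readers who want the flavor of what Cortina describes, here is a minimal Python sketch of a wire frame as a grid of movable 3-D points. The grid, names, and numbers are purely illustrative, not Square Pictures' actual tools.

```python
import numpy as np

# A coarse 10-by-10 grid of 3-D vertices on a sphere-like patch,
# standing in for one region of Dr. Sid's head.
rows, cols = 10, 10
u, v = np.meshgrid(np.linspace(0, np.pi, rows), np.linspace(0, 2 * np.pi, cols))
vertices = np.stack([np.sin(u) * np.cos(v),
                     np.sin(u) * np.sin(v),
                     np.cos(u)], axis=-1)        # shape: (10, 10, 3)

def move_vertex(verts, i, j, offset):
    """Sculpt the mesh by displacing one grid intersection point."""
    verts = verts.copy()
    verts[i, j] += np.asarray(offset, dtype=float)
    return verts

# Pull one intersection outward, the way Cortina describes reshaping
# an edge or border of the face.
vertices = move_vertex(vertices, 4, 5, offset=(0.0, 0.0, 0.15))
```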
Step 2: Working Canvas
Once Dr. Sid's basic facial features are sculpted to the satisfaction of the film's character designers, animators cover the wire frame model with a layer of gray-toned flesh, using a special algorithm that fills in gaps in the grid and simulates light from various angles. "At this point, you essentially have a canvas on which you can paint," says Cortina. Photo courtesy of Square Pictures
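The article doesn't name Square's gap-filling shader, but the simplest way to "simulate light from various angles" on untextured gray flesh is Lambertian (diffuse) shading: brightness is proportional to the cosine of the angle between the surface normal and the light. A minimal sketch under that assumption:

```python
import numpy as np

def lambert_shade(normal, light_dir, albedo=0.5):
    """Gray-tone brightness of one surface point under a distant light."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    return albedo * max(float(np.dot(n, l)), 0.0)   # clamp: no negative light

# The same point reads differently as the light moves around the head.
print(lambert_shade(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0])))  # lit head-on: 0.5
print(lambert_shade(np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 1.0])))  # raking light: ~0.35
```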
Step 3: Photo-Real Character
The final and by far the most difficult stage in creating a virtual actor is digitally painting the face. "Getting that last 20 percent of detail takes 80 percent of the creation time," says Cortina. Complex software tools add realistic imperfections to the digital skin. "The wrinkles are rendered in three dimensions, not two, so that they shade properly." Photo courtesy of Square Pictures
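Why must wrinkles be rendered in three dimensions to "shade properly"? One standard technique, bump mapping, perturbs the surface normal so the lighting responds to each groove; a painted-on line never changes with the light. A hedged sketch of that idea (the article doesn't say which method Square used):

```python
import numpy as np

def wrinkle_normal(base_normal, x, depth=0.02, frequency=40.0):
    """Perturb a normal with a sinusoidal groove of height depth*sin(frequency*x)."""
    slope = depth * frequency * np.cos(frequency * x)   # d(height)/dx
    n = np.asarray(base_normal, dtype=float) + np.array([-slope, 0.0, 0.0])
    return n / np.linalg.norm(n)

light = np.array([1.0, 0.0, 0.3])
light = light / np.linalg.norm(light)
for x in (0.0, 0.04, 0.08):            # three points across one groove
    n = wrinkle_normal([0.0, 0.0, 1.0], x)
    print(round(max(float(np.dot(n, light)), 0.0), 3))
# The two walls of the groove catch the light differently, which is
# what sells a wrinkle as geometry rather than paint.
```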
Through films like Final Fantasy, Hollywood is leading the charge. Computer-generated, photo-real human actors— "synthespians" in special-effects parlance— have been creeping toward us from movie animators' dim cloisters for more than a decade. From dinosaurs in Jurassic Park to apes in Mighty Joe Young to tumbling-overboard passengers on the sinking Titanic, computer-graphic beings have steadily evolved from the weird to the intimate. "A close-up, photo-real human being is the Holy Grail," says Hoyt Yeatman, senior visual-effects supervisor for Disney's special-effects house— The Secret Lab— in Burbank, California. (Disney owns Discover magazine.) "It's the most difficult thing to do because it's the most familiar to us. The tiniest flaw, and the viewer will instantly realize it's not right."
"The human face is full of what I call micromovements," says animation director Rob Coleman of Industrial Light and Magic in San Rafael, California, the effects house behind the Star Wars films. "The little twitches of the bottom eyelid alone are incredibly complex. Even Industrial Light and Magic isn't in a place where we can capture that yet."
The Final Fantasy team isn't quite there yet, either. While Dr. Sid and his crew are startlingly realistic, they are a shade short of photo-real. A 17-minute preview, shown at Square Pictures' studio in Honolulu, reveals skin that's a tad too opaque and faces that are slightly too stiff. Still, they're so close to perfect (call it 95 percent) that they prove the next jump is achievable. "The acid test will be having a computer-animated person and a real person on-screen at the same time and have it be impossible to tell which is which," says Yeatman, who thinks that triumph "will happen within a year."
The implications will further blur the once-sharp line between fantasy and reality. The dead, for example, will live again in movies and television, allowing studios and heirs to milk their star power endlessly: Bruce Lee could rumble with John Wayne, Marilyn Monroe could seduce Elvis, the Three Stooges could eye-poke Harpo Marx. Actors could remain 24 forever, or age in a film sans pounds of latex. More significantly, hordes of custom-made, photo-real synthespians will soon colonize the Internet. Coupled to text-to-speech and speech-recognition software, digital creatures may become our companions, taskmasters, corporate spokesmen, dream dates, slaves, emissaries, and resurrected loved ones.
Ananova, a green-haired Web newscaster owned by Britain's Orange mobile communications group (www.ananova.com), provides an early glimpse of what such creatures can do. Facemail, a software package from LifeFX (www.lifefx.com) of Newton, Massachusetts, allows users to send and receive talking-head e-mails. Today the service offers a limited stable of icons, but chief marketing officer Bill Clausen promises that "soon you'll be able to have a digital representation of your own face scanned from a photograph, and it will speak using a recording of your own voice." For a view of this future, check out inventor Ray Kurzweil's female alter ego, Ramona, who was created with LifeFX software and makes somewhat dim-witted conversation with visitors at www.kurzweilai.com.
Even as digital humans multiply, photo-real synthespians will not become ubiquitous until they are a lot easier to create. It took 200 animators, directors, producers, and software mavens from 22 countries four years to make the actors in Final Fantasy, at an astounding cost of $115 million, twice the average bill for a Hollywood studio production. Says Square Pictures' lighting department supervisor, David Seager: "There are 140,000 frames in this movie, and every one of them has to be assembled and checked. Calling it a 'vast amount of work' is an understatement."
"The problem is what might be called the fluid dynamics of the facial tissues," says Eric Haseltine, senior vice president of Disney research and development. "Some elements of the face are essentially water; others are much less mobile, because they are bone or muscle under tension. This kind of thing is incredibly difficult to computer-model." Movies have an advantage because animators can pre-render details on high-speed computers before printing the results on film. Web and video-game designers, on the other hand, must shoehorn data into the constraints of real-time computation. That's why Final Fantasy's Aki consists of several million polygons (collections of triangles in three-dimensional space), while characters in the cutting-edge PlayStation 2 video games average just 1,000 polygons. Photo-real humans won't inhabit computers, games, and the Internet until memories become bigger, data pipes fatter, and processors speedier.
Compounding the challenge, says Haseltine, is that evolution has "wired us to pick up tiny facial cues. Our survival depends upon our social skills. So even if an animator can control the kinematics of facial tissues, he has to do it in precisely the way a real person does it, or people will recognize it as false."
To make Aki's hair appear realistic as she moves her head, animators adjust on-screen toggle switches to reflect calculations of gravity, friction, and wind speed. Photo courtesy of Square Pictures
For overall body motion, a technique called motion capture is the best method yet for imparting humanlike movement, and the technology employed in Final Fantasy was state of the art. In a warehouse near Honolulu's Diamond Head, staff members in black leotards studded with 35 light-reflective balls mimed scenes on a black floor. Sixteen cameras, each pulsing a red light 60 times per second, recorded the balls' locations as the stand-ins moved in sync with a voice track recorded by Hollywood stars, including Alec Baldwin and James Woods. This created a database of points that move on the x, y, and z axes. "It looks like a swarm of bees," says motion-capture line producer Remington Scott, indicating the dancing points on his video screen. "This gets the body movement, but not the hands and face."

Final Fantasy designers subsequently hung faces, hands, and flesh on the motion-capture points, banking on the computing muscle of more than 200 Silicon Graphics Octane workstations and 1,110 custom-designed, Linux-based CPUs housed on the top 4½ floors of the upscale Harbor Court office building in Honolulu. The scene in this studio is a typical geek-chic "cube farm," festooned with Star Wars toys and pizza boxes, but virtually every workstation also features a mirror. "We use them to see our expressions," says lead character designer Steve Giesler. "They say an artist always puts something of himself in his work."
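Whatever the studio's in-house format, the database Scott describes has a simple shape: one (x, y, z) position per marker per 1/60-second camera pulse. A toy sketch, with fabricated motion standing in for real triangulated data:

```python
import numpy as np

FRAME_RATE = 60        # camera pulses per second, per the article
NUM_MARKERS = 35       # reflective balls on each stand-in's leotard
frames = 2 * FRAME_RATE                       # two seconds of capture

# capture[f, m] holds the (x, y, z) of marker m at frame f:
# the "swarm of bees" on Scott's screen.
t = np.arange(frames) / FRAME_RATE
capture = np.zeros((frames, NUM_MARKERS, 3))
capture[:, :, 1] = np.linspace(0.0, 1.8, NUM_MARKERS)           # markers up the body (meters)
capture[:, :, 0] = 0.05 * np.sin(2 * np.pi * 0.5 * t)[:, None]  # a gentle sway

def marker_velocity(capture, m):
    """Finite-difference velocity of one marker, e.g., for smoothing checks."""
    return np.diff(capture[:, m, :], axis=0) * FRAME_RATE
```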
Giesler, who describes his job as "making people on the computer," says he quickly discovered an irony long known to directors of traditional films— quotidian reality seldom looks real enough for the heightened reality movies demand. He "tried using digital photos of skin" to cover his three-dimensional humanoid figures, "but it never looked right, and it ended up being more trouble than it was worth. Everything here is hand painted."
From the internal framework to the smallest skin blemish, simply creating (as opposed to animating) a character can take two months. Giesler notes that "old people are easier to do." The reason: The wealth of detail in an elderly person's skin draws the eye, making any imperfections in head shape or motion less noticeable.
Once the character design got director Hironobu Sakaguchi's nod, animators took over. Lead animator Roy Sato was responsible for creating Aki. "She's essentially a big 3-D puppet," he says. Using software from Toronto-based Alias Wavefront souped up with Square Pictures' proprietary plug-ins, Sato manipulated several hundred on-screen toggles and sliders that direct virtually every movable part of her face, including nose flare, throat constriction, and pupil dilation. While her body followed the rough guide of the motion-capture points, Sato crafted her face and hands frame by frame, including more than 100 independently operable clumps of hair. Each clump has its own settings for gravity strength, wind speed and direction, even "collision objects." If Aki puts her head on a compatriot's shoulder, notes Sato, "the hair will lie on it instead of going through it."
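A minimal sketch of per-clump dynamics in the spirit Sato describes, where each clump carries its own gravity, wind, damping, and collision settings. The class and numbers are illustrative, not Square's Alias Wavefront plug-ins:

```python
import numpy as np

class HairClump:
    """One controllable clump, with its own gravity, wind, and collisions."""
    def __init__(self, tip, gravity=9.8, wind=(0.0, 0.0, 0.0), damping=2.0):
        self.tip = np.array(tip, dtype=float)     # free end of the clump
        self.vel = np.zeros(3)
        self.gravity = gravity
        self.wind = np.array(wind, dtype=float)
        self.damping = damping

    def step(self, dt=1 / 24, collision_objects=()):   # film runs at 24 frames/second
        force = np.array([0.0, -self.gravity, 0.0]) + self.wind - self.damping * self.vel
        self.vel += force * dt
        self.tip += self.vel * dt
        # "Collision objects": keep the tip outside any shoulder-like sphere,
        # so hair lies on a surface instead of passing through it.
        for center, radius in collision_objects:
            d = self.tip - center
            dist = np.linalg.norm(d)
            if 0 < dist < radius:
                self.tip = center + d / dist * radius
                self.vel[:] = 0.0

shoulder = (np.array([0.0, 0.0, 0.0]), 0.15)
clump = HairClump(tip=[0.02, 0.3, 0.0], wind=(0.5, 0.0, 0.0))
for _ in range(48):                                # two seconds of animation
    clump.step(collision_objects=[shoulder])
print(clump.tip)   # resting on, not inside, the shoulder sphere
```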
All of these parameters are driven by physics engines— sets of computer instructions that attempt to mimic the real-world forces that affect humans and objects in motion. But physics engines don't solve every motion problem. One of the great mysteries of computer animation is that even when the software says the physics are mathematically perfect, "it can still look weird," says Square's compositing supervisor, James Rogers. So ultimately, he says, animators must watch "each shot over and over and over again and fiddle until it looks believable. You have to cheat the physics all the time."
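What "cheating the physics" can look like in practice, as an illustrative knob rather than Square's actual tool: keep the engine's output, but scale it by eye until the shot reads right.

```python
def cheat_physics(engine_positions, scale=0.7):
    """Keep the engine's motion but damp it by eye until the shot reads right."""
    return [p * scale for p in engine_positions]

# The solver says the hair overshoots; the animator watches the shot
# "over and over" and dials the (illustrative) scale knob down to taste.
overshoot = [0.30, 0.18, 0.10, 0.05, 0.02]
print(cheat_physics(overshoot))
```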
For body movement, stand-ins in black bodysuits mime action from the script. The suits' reflective balls, tracked by 16 cameras, generate a database that animators use to guide their characters. "It's the end of typecasting," says line producer Remington Scott. "Looks don't matter for these actors." Photo courtesy of Square Pictures
To do this well, Rogers invokes master cheaters. "I asked all of the people on my team here to look at Rembrandt and Vermeer. The Dutch painters understood better than anyone how to render the three-dimensional world in two dimensions." As it turns out, many of Square's animators have traditional arts backgrounds in oils, sculpture, and even theater, and fall back on that training daily. "Our philosophy is, we can always teach people how to use computers. But we can't teach people how to be artists," says producer Jun Aida.
"The process is simultaneously technical and artistic, so it brings together people from two different worlds," says Disney's Yeatman. "You are literally working with rocket scientists who do graphic modeling for the Jet Propulsion Laboratory, and then with these crazy artists. Keeping the two groups from killing each other is the challenge, and the fun of it."
Movie animators must keep in mind that they are not emulating real humans— they are emulating filmed humans. So one of the last processes is to use software to impose the characteristic blur and grain of conventional film on otherwise ultrasharp digital images. "Film is swimming in grain. If you look at a bit of it under a microscope, you will see all the colors of the rainbow in each tiny section," says Yeatman. "You have to replicate that, or it just doesn't look right."
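A hedged sketch of that final filmic pass: soften the ultrasharp frame slightly, then add per-pixel, per-channel noise. Real pipelines match measured grain profiles; the Gaussian noise here is only a stand-in.

```python
import numpy as np

def filmify(frame, blur=0.5, grain_strength=0.03, seed=None):
    """frame: H x W x 3 array of floats in [0, 1]."""
    rng = np.random.default_rng(seed)
    # Cheap blur: mix each interior pixel with its four neighbors.
    soft = frame.copy()
    soft[1:-1, 1:-1] = (1 - blur) * frame[1:-1, 1:-1] + blur * 0.25 * (
        frame[:-2, 1:-1] + frame[2:, 1:-1] + frame[1:-1, :-2] + frame[1:-1, 2:]
    )
    # Grain varies per pixel and per channel: "all the colors of the
    # rainbow in each tiny section," as Yeatman puts it.
    grain = rng.normal(0.0, grain_strength, frame.shape)
    return np.clip(soft + grain, 0.0, 1.0)

frame = np.full((480, 640, 3), 0.5)      # flat gray test frame
print(filmify(frame, seed=1).std())      # nonzero: the grain is in
```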
As long as photo-real digital people depend upon artistic prowess, trade-offs in realism will have to be made. Top-notch artists will most likely always be rare, time will always be short, budgets always tight. Some Internet entrepreneurs are simply bypassing drawing altogether. "We use some motion capture, some mapping of real-life images, and proprietary software," says Bill Clausen of the talking-head e-mails designed by LifeFX. "We are doing this with science, not art."
Of course that raises the question of whether a proliferation of digital talking heads and photo-real characters will make the average person's life any better. "It's difficult to bond just with text," contends Clausen. "We're offering a way to bring humanity to the Internet." Maybe. So far, the Web-based digital avatars are more creepy than comforting. With their robotic intonations, blurry mouths, and Max Headroom stutters, the best one can say is that they show promise.
Making Aki's hand posed a painterly challenge to animators. "Light doesn't just bounce off skin," says Cortina. "It also picks up color from the blood." Photo courtesy of Square Pictures
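A crude way to approximate what Cortina describes, assuming a simple additive model rather than a full subsurface-scattering renderer: add a blood-tinted scattered term to the ordinary surface reflection.

```python
import numpy as np

def skin_color(surface_rgb, light_intensity, subsurface_strength=0.15):
    """Surface reflection plus a reddish term for light scattered by blood."""
    blood_tint = np.array([1.0, 0.25, 0.2])   # illustrative tint, not measured
    reflected = np.array(surface_rgb) * light_intensity
    scattered = subsurface_strength * light_intensity * blood_tint
    return np.clip(reflected + scattered, 0.0, 1.0)

print(skin_color([0.8, 0.65, 0.55], light_intensity=0.9))  # warmer than pure reflection
```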
So movies, with their pre-rendering advantage, will likely lead the photo-real synthetic-human charge for some time to come. The animators in Hawaii know that they have opened new territory with Final Fantasy, and as production winds down, they seem more than a little awed by their feat. "I thought it was a crazy thing to try at first. I still think it's crazy to some extent," says animation director Andy Jones. "But this works. It takes you to a different place."
Digital Cloning
Don't be surprised if someday soon you find yourself staring at your virtual twin on a movie or computer screen. Matthew Brand of the Mitsubishi Electric Research Labs in Cambridge, Massachusetts, has created Voice Puppet, a software system that watches and listens as you talk on videotape, then learns to match sounds with your unique facial mannerisms. "All these things, as far as I'm concerned, are just what are called mapping problems," explains Brand. "We want to map from one thing to another. We want to map from voice to facial motion." Once the Voice Puppet knows how you move your face as you make a particular sound, it can animate an image of you to any voice track. So far the Voice Puppet can't match the realism of the best animation artists, but that may change soon. "I'm in the business of machine learning," says Brand. "If you want total realism, learning will beat any artist in the long run."
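Brand's framing suggests the general recipe: learn a map from synchronized voice features and facial parameters, then apply it to a new voice track. Voice Puppet itself learns a more sophisticated probabilistic model; the plain least-squares regression below, on fabricated data with invented feature names, shows only the shape of the idea.

```python
import numpy as np

rng = np.random.default_rng(0)
# Training data: per video frame, audio features (say, loudness and pitch)
# paired with face parameters (say, jaw drop and lip width).
audio = rng.normal(size=(300, 2))                 # 300 frames, 2 voice features
true_map = np.array([[0.8, 0.1], [-0.2, 0.6]])    # the unknown voice-to-face map
face = audio @ true_map + rng.normal(0.0, 0.05, size=(300, 2))

# Least-squares fit: W maps voice features to facial motion.
W, *_ = np.linalg.lstsq(audio, face, rcond=None)

def puppet(new_audio_frame):
    """Animate the face from one frame of a brand-new voice track."""
    return new_audio_frame @ W

print(puppet(np.array([1.0, 0.0])))  # recovers roughly the first row of the map
```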
Indeed, the more video the Voice Puppet analyzes, the better it gets. The problem is that it can learn only so much from normal video: at standard rates of roughly 30 frames per second, some transient phenomena in speech, such as plosives (p and b) and tongue flaps (th), happen entirely within the 33-millisecond gap between frames. Working around the resulting motion blur can be difficult. "Real motion has a texture with spikes and roughness," says Brand. "Synthetic motion often lacks that texture. People can kind of sense that." But Brand believes it's only a matter of time before improvements in video technology and machine learning deliver full realism.
Brand's latest mapping software creates a 3-D digital sculpture of a human head from video. Within a year, he hopes to use it to render the head of a dead movie star, like Charlie Chaplin, then use the Voice Puppet to make it talk. For Brand, building a believable computer-generated human isn't the real issue. "Maybe I could make you a virtual Richard Nixon that was really, really good," he says. "You'd look at it for five minutes and then you'd say, 'All right, that's nice, but what can you do with Richard Nixon that's really new?' " — David S. Cohen
Virtual Huckster
When Motorola introduced new call-screening and message-taking software recently, the winsome spokesperson chosen for the national advertising campaign had an unusual pedigree: She was created by a computer. Fred Raimondi and his associates at Digital Domain, a Venice, California, special-effects company, set out to make a perfect digital woman— Mya. Animation artists patterned her movements after a model hired for the part. New software simulated the drape and flow of Mya's glittering gown. The special-effects team pushed the software envelope. "This type of project," says Raimondi, "comes around once in a lifetime, where somebody says okay, 'Here we go, we're going to do it, and we have the money and the time to do it right.'"
Ironically, the digital woman turned out to be too perfect. When Digital Domain unveiled the first iteration of Mya, she was so realistic that the reps from Motorola's ad agency, McCann-Erickson, didn't believe she was computer generated. They insisted Raimondi make Mya less realistic so viewers would know immediately that she was a cyber character. "If I had my druthers, I would have made her look as real as you possibly could have," laments Raimondi, "but we had a different task."— David S. Cohen
More examples of art from Final Fantasy: The Spirits Within can be found at the movie's official Web site: www.finalfantasy.com.
Digital news broadcaster Ananova resides at www.ananova.com.
Download Facemail and other digital-human software created by LifeFX at www.lifefx.com.
Digital avatar Ramona converses intelligently about the technology of inventor Ray Kurzweil— and rather dim-wittedly about all other subjects— at www.kurzweilai.com.