It was 1970 when marine biologist Roger Payne first brought the mesmerizing sound of humpback whales to a wide audience via the celebrated Songs of the Humpback Whale album. Back then, the possibility of deciphering those eerie vocalizations seemed a far-out idea plucked straight from science fiction.
But breakthroughs in artificial intelligence, which have come at an extraordinary pace over the last decade as computer processing power burgeoned and algorithms grew more sophisticated, have made this a realistic prospect. Last April, an interdisciplinary group of scientists and experts embarked on a five-year effort, dubbed Project CETI (Cetacean Translation Initiative), that aims to tap these technological advances and decode the language of one of the world’s largest predators: the sperm whale.
“Sperm whales are incredibly intelligent and highly socially aware creatures,” says David Gruber, a marine biologist at City University of New York and the leader of Project CETI. “We believe that by bringing humans closer to an animal species whose behavior is more similar to our culture and intellect than any other living being, we can help them care more for every form of life on earth.”
An Idea Worth Exploring
Project CETI emerged in 2017 when Gruber was a fellow at Harvard University’s Radcliffe Institute. One day, he was listening to recordings of sperm whale codas — as the patterned clicks that these marine mammals use to communicate to each other are called — when another Radcliffe fellow, cryptography expert Shafi Goldwasser, came by.
Intrigued, Goldwasser asked Gruber to share some recordings with her group and suggested that they could set up a project to understand what those haunting, otherworldly noises actually meant. Although she meant that more as a joke, Gruber thought it was an idea worth exploring. He approached machine learning expert Michael Bronstein, also a Radcliffe fellow, to learn whether artificial intelligence could be used for that purpose.
Bronstein, who specializes in natural language processing (NLP), a subfield of AI that uses algorithms to interpret written and spoken language, was convinced that sperm whale vocalizations might be suitable for this sort of analysis due to their Morse-like structure, which could be easily translated into ones and zeros.
Conveniently, Gruber knew a Canadian biologist named Shane Gero who had spent over a decade studying a large clan of some 40 families of sperm whales in the eastern Caribbean as part of his Dominica Sperm Whale Project, recording the sounds of hundreds of individuals along the way. He had also painstakingly annotated tracks with exhaustive notes describing which whales were talking, whom they were with, and how the animals were behaving at the time.
That was enough for a proof of concept. Using Gero’s dataset, Bronstein tasked an NLP algorithm with detecting whales by their clicks and ran it on a sample of recordings. The algorithm correctly identified the specific whales more than 94 percent of the time, validating the researchers’ initial hunch. Thrilled, Gruber assembled a team to build off this encouraging result.
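To give a flavor of how clicks could identify individual whales, here is a deliberately simplified sketch. It is not CETI’s actual model (the article does not describe the algorithm’s internals, and the real work uses far more sophisticated machine learning); this toy nearest-centroid classifier merely illustrates the underlying idea that each whale’s inter-click intervals form a distinguishable rhythmic signature. All whale names and timing values below are invented for illustration.

```python
# Toy illustration (NOT Project CETI's method): identify a whale by
# comparing a coda's inter-click intervals (ICIs, in seconds) to the
# average rhythm of each known whale.

def centroid(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(labeled_codas):
    """Map each whale ID to the centroid of its ICI feature vectors."""
    return {whale: centroid(vecs) for whale, vecs in labeled_codas.items()}

def identify(model, coda):
    """Return the whale whose average rhythm is closest to the coda."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda whale: dist(model[whale], coda))

# Synthetic training data: two hypothetical whales with distinct rhythms
training = {
    "whale_A": [[0.10, 0.12, 0.11], [0.11, 0.13, 0.10]],
    "whale_B": [[0.30, 0.28, 0.31], [0.29, 0.30, 0.32]],
}
model = train(training)
print(identify(model, [0.11, 0.12, 0.12]))  # matches whale_A's rhythm
```

A real system would work from raw audio rather than hand-made interval lists, but the principle is the same: patterned clicks translate naturally into numeric sequences that a classifier can learn to tell apart.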
Collecting Clicks
With funding from the Audacious Project run by TED, the conference organization, and a number of other institutions around the world, CETI’s team is now at work on the project’s first priority: growing the whale recording collection.
“The problems for which we can train machine learning models data efficiently are those where we understand a lot about the structure of the data,” says Jacob Andreas, an NLP expert at the Massachusetts Institute of Technology and a member of Project CETI. “Modern neural networks, by contrast, don't need all that structure, but then the information about how a given problem domain works has to come from somewhere else, and in practice this means we need much bigger datasets to learn from.”
Identifying patterns in whale talk, estimated Andreas, will take “roughly one billion clicks, or 100 to 200 million codas.” Gero’s Dominica Sperm Whale Project database currently contains around 100,000 entries.
To accomplish this whale-sized task, a pool of experts is developing a series of non-invasive video and audio recording devices. These include free-swimming robotic fish that can reach depths of thousands of feet and record visuals for hours; high-resolution hydrophones that can record all day long; and electronic tags that attach to deep-sea buoy arrays (and the whales themselves).
“Since we're studying sperm whales in their natural habitat, it's extremely important to make sure we design these experiments in a way that doesn't cause disruption or long-term confusion,” says Andreas.
Once a sufficient pool of data is gathered, CETI’s researchers will need to incorporate behavioral and social context into this basic architecture of sperm whale chatter by correlating each sound with a specific situation.
“There might be, for instance, a peculiar sequence of clicks that these animals make prior to going hunting, or a set that occurs when a whale is sick, pregnant or trying to attract a mate,” he says. “We hope that recording whales in different settings and doing different things will help build a comprehensive picture of their ‘vocabulary.’”
Talking Back?
If and when CETI cracks the code of whale language, researchers might even attempt to chat with the animals. This would involve developing a predictive conversation program that can generate vocalizations with estimated meanings, then broadcasting them to sperm whales to assess whether the response is predictable.
Of course, no one knows if the sperm whales would ever accept humans as a conversational partner. But for Gruber that’s not an issue, nor is it CETI’s ultimate goal. “This whole thing is not about getting whales to better understand humans,” he says. “It’s rather about the idea that we are listening deeply to what these magical animals are saying — that we respect them.”