Google created a new AI model for talking to dolphins

Dolphins are generally regarded as some of the smartest creatures on the planet. Research has shown they can cooperate, teach each other new skills, and even recognize themselves in a mirror. For decades, scientists have attempted to make sense of the complex collection of whistles and clicks dolphins use to communicate. Researchers might make a little headway on that front soon with the help of Google’s open AI model and some Pixel phones.

Google has been finding ways to work generative AI into everything else it does, so why not its collaboration with the Wild Dolphin Project (WDP)? This group has been studying dolphins since 1985 using a non-invasive approach to track a specific community of Atlantic spotted dolphins. The WDP creates video and audio recordings of dolphins, along with correlating notes on their behaviors.

One of the WDP’s main goals is to analyze the way dolphins vocalize and how that can affect their social interactions. With decades of underwater recordings, researchers have managed to connect some basic activities to specific sounds. For example, Atlantic spotted dolphins have signature whistles that appear to be used like names, allowing two specific individuals to find each other. They also consistently produce “squawk” sound patterns during fights.

WDP researchers believe that understanding the structure and patterns of dolphin vocalizations is necessary to determine if their communication rises to the level of a language. “We do not know if animals have words,” says WDP’s Denise Herzing.

An overview of DolphinGemma

The ultimate goal is to speak dolphin, if indeed there is such a language. The pursuit of this goal has led WDP to create a massive, meticulously labeled data set, which Google says is perfect for analysis with generative AI.

Meet DolphinGemma

The large language models (LLMs) that have become unavoidable in consumer tech are essentially predicting patterns. You provide them with an input, and the models predict the next token over and over until they have an output. When a model has been trained effectively, that output can sound like it was created by a person. Google and WDP hope it’s possible to do something similar with DolphinGemma for marine mammals.

DolphinGemma is based on Google’s Gemma open AI models, which are themselves built on the same foundation as the company’s commercial Gemini models. The dolphin communication model uses a Google-developed audio technology called SoundStream to tokenize dolphin vocalizations, allowing the sounds to be fed into the model as they’re recorded.