Skip to content

Google’s DeepMind unveils new video-to-audio technology

Capable of creating soundscapes, effects, and dialogues for a variety of video footage.

Photo credit: Deep Mind – Website

DeepMind, the artificial intelligence division of Google, has taken a significant leap forward in video-to-audio technology. It has unveiled a novel feature, aptly named “synchronized audiovisual generation.”

This technology is capable of producing an array of soundscapes, effects, and dialogues for diverse video footage, a tool that is currently undergoing rigorous safety testing before being released to the public.

This technology from Google is adept at crafting audio that seamlessly corresponds with the mood and personality of any video content. The company recently disclosed this in a blog post, providing the public a glimpse into its latest innovation.

Remarkably, Google’s technology possesses the ability to produce limitless soundtracks for any video input. This represents a significant shift from existing video-to-audio solutions. The software can understand raw pixels, eliminating the requirement for text prompts, a feature that sets it apart from its competition.

The blog post underscored the swift progress of video generation models. Yet, it also pointed out a prevalent shortcoming in many current systems – the lack of capacity to generate anything other than silent output.

Watch the test clip shared by DeepMind below, and find additional information about the tool here.

SHARE THIS
Back To Top