Google DeepMind’s New AI Tool Can Generate Audio, Dialogue For Muted Videos

Published On: June 19, 2024

Google has embraced the AI era through its developments in Gemini and other tools. The company has already showcased VideoPoet and Veo, which can generate videos from text input. Its DeepMind AI unit has now unveiled a video-to-audio (V2A) technology that creates contextual audio for silent videos. In simple terms, the technology can create dialogue and a soundtrack for a video based on the scene. Here are the details.

Google’s V2A Technology Explained

Google DeepMind’s video-to-audio technology combines the pixels in a video with natural-language text prompts, which help the tool understand the video content better. Using this data and Google’s in-house AI models, the V2A tool creates high-quality sound effects that match the video.

V2A can also be paired with Veo, Google’s video generation tool, to add realistic sound effects to generated videos. It also tries to match the tone of specific subjects in the video. The new tool can create audio for animated content and stock footage, as well as for videos featuring human subjects. The company has shared several examples on its website showcasing the potential of this technology.

V2A by Google uses a diffusion-based technique, similar to most multimedia-related AI tools. It creates the final audio file by pairing a series of encoders and decoders with a trained diffusion model. Like AI chatbots such as Gemini and ChatGPT, the tool’s effectiveness should improve as it is trained on additional data.
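
For readers curious about what such a pipeline looks like in outline, here is a toy Python sketch of the general flow: compress the video into a compact signal, start from random noise, refine that noise step by step, then convert the result into audio samples. It is not Google’s implementation. Every function name and parameter below is hypothetical, and the hand-written `denoise_step` is only a stand-in for the trained diffusion model, which Google has not released.

```python
import random


def encode_video(frames):
    """Toy encoder (assumption): reduce each frame, given as a list of
    pixel intensities in [0, 1], to its mean brightness."""
    return [sum(frame) / len(frame) for frame in frames]


def denoise_step(latent, condition, rate):
    """Stand-in for a trained denoiser: nudge every latent value a little
    closer to the conditioning signal derived from the video."""
    return [x + rate * (c - x) for x, c in zip(latent, condition)]


def decode_audio(latent):
    """Toy decoder: clip latent values to [0, 1] and map them to signed
    16-bit sample values."""
    samples = []
    for x in latent:
        clipped = min(1.0, max(0.0, x))
        samples.append(int(round((clipped * 2.0 - 1.0) * 32767)))
    return samples


def generate_audio(frames, steps=50, rate=0.2, seed=0):
    """Encode the video, refine random noise step by step, then decode."""
    rng = random.Random(seed)
    condition = encode_video(frames)
    # Start from pure Gaussian noise, one value per video frame.
    latent = [rng.gauss(0.0, 1.0) for _ in condition]
    for _ in range(steps):
        latent = denoise_step(latent, condition, rate)
    return decode_audio(latent)


if __name__ == "__main__":
    # Three tiny "frames": dark, mid-grey, bright.
    frames = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
    print(generate_audio(frames))
```

In this toy version, dark frames end up as strongly negative sample values and bright frames as strongly positive ones. A real system would learn a far richer relationship between what is on screen and what should be heard.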

V2A is currently a concept project and is not publicly available. Google says that further research is still underway to improve the V2A technology and achieve realistic results. Several text-to-video and text-to-audio generators are already available on the Internet, but Google’s V2A stands apart because it creates audio for a video provided by the user.

Google has not shared any timelines for the public rollout of V2A. Veo appears to be the company’s priority project in rivalling Sora, an AI video generator by OpenAI.
