- Google announced it developed an AI bot that generates music based on text descriptions.
- The tech will not be released to the public due to existing technical issues and risks.
- The tech giant is ramping up its AI efforts after issuing a “code red” following the explosion of ChatGPT.
In the artificial intelligence race, Google announced it developed a bot that creates music based on text prompts — but don’t expect to be able to use it any time soon.
In a research paper published Thursday, Google researchers described MusicLM as “a model generating high-fidelity music from text descriptions such as ‘a calming violin melody backed by a distorted guitar riff.’”
“We demonstrate that MusicLM can be conditioned on both text and a melody in that it can transform whistled and hummed melodies according to the style described in a text caption,” the paper reads.
Per the study, users can enter descriptions like “enchanting jazz song with a memorable saxophone solo and a solo singer” or “Berlin 90s techno with a low bass and strong kick,” and receive corresponding results. Examples shared on Google’s GitHub page pair similar prompts with the audio the model generated for them.
The debut of MusicLM comes during the meteoric rise of OpenAI’s buzzy chatbot ChatGPT, which prompted Google to issue a “code red” — what The New York Times described in December as “akin to pulling the fire alarm” for the tech giant.
In an attempt to compete, the company is revving up the release of 20 new products, as well as a version of Google Search with AI chatbot features, per the Times.
Still, Google said it does not intend to release MusicLM to the public, citing a variety of risks, including programming biases that may lead to a lack of representation and cultural appropriation, technological glitches, and, most notably, “the potential misappropriation of creative content.”
Per the study, identifiable existing songs were found in approximately 1% of examples, pointing to potential copyright infringement.
“We strongly emphasize the need for more future work in tackling these risks associated to music generation — we have no plans to release models at this point,” the study states.
The study also notes the technology’s existing limitations, including trouble handling negations and temporal ordering in text prompts, as well as vocal quality. Looking ahead, researchers said they intend to work toward “modeling of high-level song structure like introduction, verse, and chorus.”