DailyPost 2429

If this is not a revolution, then there has never been a revolution and there never will be one. A vortex of tectonic change is being unleashed all around us, as never before in human history. This is the Artificial Intelligence Revolution; the AI Revolution, we can call it. Any person or organization that believes it can remain unchanged is making the biggest mistake of its life. That would lead to decimation, sooner rather than later. You can’t wish it away. As the world has already experienced, it will either suck you in or leave you on the periphery, serving no purpose. ChatGPT is just the beginning.

ChatGPT is not just a product or a service; it is the beginning of a new life with Large Multi-Modal Language Models. They have come to complement and supplement smart professionals and regular users in ways never before imagined. OpenAI took the lead with ChatGPT, but Google, being the market leader, cannot afford to be left behind. Recently there has been a string of product announcements, demos and work-in-progress developments, each with the promise of making Google the market leader in this area in the days to come. You would have heard of Gemini and PaLM 2, and maybe even of Med-PaLM 2 and its cyber security counterpart, Sec-PaLM. And now on the cards is the most innovative and insanely creative product in the making from the stables of the famed Google: MusicLM.

MusicLM is a text-to-music generator AI tool that is bound to disrupt the music industry well beyond what happened with the onset of the Internet. It takes a text prompt and produces a realistic-sounding music track. A recently published paper, “MusicLM: Generating Music From Text”, describes how it works. It says, “We introduce MusicLM, a model generating high-fidelity music from text descriptions such as ‘a calming violin melody backed by a distorted guitar riff’.” The demo starts with audio generation from rich captions. It is very accurate in this totally new area. The music remains consistent over several minutes. You don’t need a deep understanding of music to generate a coherent track. With screenwriters already up in arms against ChatGPT, how will the copyright issue be handled for music generated by MusicLM?

The tool is quite good at electronic music, but on the vocal side it is taking a beating as of now. It also has a story mode, which takes multiple prompts in sequence while keeping the construct of a single music track. The transitions are very smooth. Serious work is in progress across the entire project. Another interesting task is Painting Caption Conditioning, which generates music from the caption of a painting. Is it not surreal? What would an image sound like? Added to that is conditioning on Musician Experience Level, which would be of immense use for musicians at different stages of professional proficiency. When fully released to the public, it would be the ChatGPT of music creation.

Sanjay Sahay
