DAMN, oh well, RIP the music industry now. Also, at least I can use copyright-free music for my backgrounds. Yeah, funny how life can be a double-edged sword sometimes.
@justtiredthings I use Suno, but haters say it is all trained on stolen copyrighted stuff. Are you saying this model is also trained on that, and that it's legal?
So if I tell an AI to do a remix of an existing song, in a different style or just by adding stuff, is that copyright-safe then? Let's say there is a YouTuber who posts this song and lists the songs he remixed, is that ok?
Suno is your best bet. Not open source, and free for like 5 songs. But the instrumentals are pretty good, and if the track isn't too complex, then it's hard to tell it's even AI. The more complex, the easier it is to tell.
@@Jack-wp6tm I think it's easier to produce a good song with Suno, but the quality, especially with the instruments, isn't as good as Udio's. I still hear the shimmering here and there, and once I hear it, I can never unhear it in a song, which annoys me. Comparing the two: Udio is better with instrumentals and classical/old-school songs, and Suno is better with sung melodies and modern music, but Udio still tops it in audio quality.
Hmm, I would prefer it if someone would just create an exe to install whatever I need. It's frustrating to check the Python version, CUDA version, PyTorch version. If I have higher versions, are they backward compatible? Tried flash attention, got this error: ERROR: flash_attn-2.7.1.post1+cu124torch2.5.1cxx11abiFALSE-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
getting this error:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
peft 0.9.0 requires accelerate>=0.21.0, which is not installed.
peft 0.9.0 requires huggingface-hub>=0.17.0, which is not installed.
peft 0.9.0 requires packaging>=20.0, which is not installed.
peft 0.9.0 requires tqdm, which is not installed.
timm 0.9.16 requires huggingface_hub, which is not installed.
transformers 4.45.2 requires huggingface-hub<1.0,>=0.23.2, which is not installed.
transformers 4.45.2 requires packaging>=20.0, which is not installed.
transformers 4.45.2 requires requests, which is not installed.
transformers 4.45.2 requires tqdm>=4.27, which is not installed.
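Every conflict in that error ends with "which is not installed", so the usual fix is simply `pip install accelerate huggingface-hub packaging tqdm requests`. As a sketch (not part of the YuE tooling), a small stdlib check can confirm which of the packages from the error are still missing afterwards:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of module names that can't be imported yet."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Module names for the packages listed in the pip error
# (note: the pip name huggingface-hub imports as huggingface_hub).
needed = ["accelerate", "huggingface_hub", "packaging", "tqdm", "requests"]
print("still missing:", missing_packages(needed))
```

If the printed list is empty, the dependency conflicts above should be resolved.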
I'd prefer it if they focused instead on helping producers move faster, just like coders move faster with AI. Letting the AI create the music sucks; it sounds so bad.
Just the fact we have a open and local ai generator is a big win.
This is a major sign of progress.
yes! hope the quality gets better soon
It's funny that people are complaining about it... it's free, open source, and if this had come out a year ago everyone would have gone nuts. It's way better than I expected.
Despite the low quality of the output, you could take the render as a reference sample
so then the producer and whoever is singing can create an actual high quality song.
It would be a total game changer if AI would cast prompts into MIDI or some DAW format, mapped to the presets of your virtual instruments.
Indeed
That has to be coming this year, if it isn't here already.
had this thought for a while, hopefully someone is already working on this
Honestly, it would be awesome if they added a feature to extend existing songs, either by changing their style or just extending them while retaining the style and feel of the song. I would absolutely love to see that in the future. Suno AI does this, but I'd rather run it locally.
This is just a start, I am sure it'll get better.
yes, remixes. But in more or less streaming mode.
Yes, the dream of creating alternative versions of already existing tracks: modifying lyrics in verses and choruses, adding bridges, intros/outros, extending, adding backing vocals where they aren't present, and so on... We'll have all this soon, hopefully.
Now we just need an "up scaler" for music to make it sound as real and juicy as possible
This is quickly becoming my favorite AI channel: the right amount of depth, novelty (cool new stuff), and clarity. Thanks man!
you're welcome!
Thanks for coming out with that in depth guide! I will share this video around.
you're welcome!
Suno sounds like that at the beginning too, if I remember correctly.
new music generator !!
yes, and it sounds like a can of beans !!!
lets go !!
Where you get this emoji
@@S_M44Z
You are not part of the club
@@S_M44Z
Hi, big fan! You're always so consistent and up to date with the latest AI!
Thanks!
@@theAIsearch But your voice is so weird and dorky. Can you use AI to do the voiceovers for your videos?
That's what I like about his videos. He feels real, not machine-like. The channel would lose its authentic feel.
@@user-ij6ng5bi8j bruh
@ Nobody said machine-like, stupid.
Still needs a lot of work and learning but this is a good step in the right direction for AI music and ownership of it.
There's no ownership, it's public domain. No rights for inspired work.
@@fontenbleau
It's still a gray area, there are plenty of people and companies that are claiming this music and getting distribution for it.
If the work is inspired by something else, that is not copyright infringement and you can get rights for that work as long as you add some type of creative touch of your own - easily done with AI music.
@HappyOva They're just exploiting very old mechanisms. YouTube is so passive because there's no law about mandatory AI labeling yet.
@
In the end, 99.9% of people won't be able to tell the difference unless looked at through image artifact methods. Creating those laws will be difficult with the current copyright laws we have, which are also not good.
This is going to be really good if the model could be finetuned.
Just like the base model of SD 1.5 or SDXL kinda sucks but when these models are finetuned, it's a whole new level.
Remember the Al Pacino movie 'Simone' where he creates an AI musical artist. I love watching fantasy sci-fi movies and living long enough to see it become a reality.
That's what I thought when I heard about AI a year ago. I'm glad somebody else had the same thought.
I get a kick out of watching ST:TNG with Data not being able to use contractions.
Clarity and quality aren't there right now, but I can't wait to see where the community will take this one.
Awesome. Can't wait to check it out. My request for a future update is to be able to edit any part of a song by changing the lyrics.
Hardware and Performance
GPU Memory
YuE requires significant GPU memory for generating long sequences. Below are the recommended configurations:
For GPUs with 24GB memory or less: Run up to 2 sessions to avoid out-of-memory (OOM) errors. Thanks to the community, there are YuE-exllamav2 and YuEGP for those with limited GPU resources. While both enhance generation speed and coherence, they may compromise musicality. (P.S. Better prompts & ICL help!)
For full song generation (many sessions, e.g., 4 or more): Use GPUs with at least 80GB memory, e.g., H800, A100, or multiple RTX 4090s with tensor parallelism.
To customize the number of sessions, the interface allows you to specify the desired session count. By default, the model runs 2 sessions (1 verse + 1 chorus) to avoid OOM issues.
Execution Time
On an H800 GPU, generating 30s audio takes 150 seconds. On an RTX 4090 GPU, generating 30s audio takes approximately 360 seconds.
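The figures above work out to roughly 5x real time on an H800 and 12x on an RTX 4090. A quick back-of-envelope calculator for longer clips, assuming time scales linearly with audio length (an assumption; long sequences may scale worse):

```python
# Wall-clock seconds per second of generated audio, from the
# measurements quoted above (150s/30s on H800, 360s/30s on RTX 4090).
SECONDS_PER_AUDIO_SECOND = {"H800": 150 / 30, "RTX 4090": 360 / 30}

def estimate_generation_seconds(gpu: str, audio_seconds: float) -> float:
    """Rough wall-clock estimate, assuming linear scaling with audio length."""
    return SECONDS_PER_AUDIO_SECOND[gpu] * audio_seconds

print(estimate_generation_seconds("H800", 180))      # 900.0  -> 15 min for a 3-min song
print(estimate_generation_seconds("RTX 4090", 180))  # 2160.0 -> 36 min for a 3-min song
```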
Thank you for the detailed instructions!
you're welcome
Thank you for sharing 😊
you're welcome!
Thanks for the detailed instructions. Great content :)
you're welcome
Just remember where ai photos and videos were at just a year ago. This is what you're seeing here. This is gonna be huge later this year. Bet.
exactly!
I should be able to hum and generate a rhythm. Then select a music genre to tweak the output. Then refine it further: speed, tempo, etc.
Yeah, I heard that BandLab has that capability, although I haven't used it. I know Suno kind of lets you do that too.
The only problem with Suno is that if it detects what it thinks is a voice, you can't make your sound clip or resulting songs public. I made one song from playing guitar and humming the lyrical melody and got away with it, but any new recordings I've done have always been detected as vocals. I think it even detects whistling as vocals, so that's not an option either. You still have the capability to make songs for yourself like that with Suno, though, and you can (probably) still download them and upload them anywhere else.
1:49 - my initial impression is that it sounds a lot like Suno, with over-processed vocals, etc.
But the quality is maybe a few versions behind Suno.
I've much preferred Udio's generations over Suno's in general. I don't know why, but Suno usually sounds like Suno; Udio may not always sound human, but it doesn't seem to impose its own style as much on what you ask it to do.
That's so interesting. Udio has been unusable for me because it just does its own thing.
Umm, this one has a long way to go before it meets the commercial competition, but the fact that it's open source and can be run locally is a big step forward!
Someone make an AI that can 'master' / restore songs and audio with parameters to give it a certain desired effect. This would address all AI and non AI low quality audio.
You should tell that to the Suno Devs so that can ignore you too!
Udio has already done that for several months.
@@adrianmunevar654
Damn, I chose the wrong one.
this is actually not bad
Okay, this was my first time getting involved with AI; I struggled for about 2 hours to make this model work... Several errors and corrections were made; I didn't even have CUDA or Python installed, which is why so many errors occurred... That said, using a ROG 3090 Strix, it took me 23 minutes to generate 30 seconds of music, the graphics card's memory-junction temperature hit 90 degrees Celsius, and as for the music, the quality doesn't make up for the waiting time.
This is kind of like Suno V3.0/V3.5: it struggles with balancing the layers, similar to those models, although I notice the problem is reversed with this tool. Suno (free versions) will sacrifice lyrical quality for a more impactful instrumental track; this seems to do the opposite: there is no impact or punch to any of the music showcased.
YuE we can't train yet, and it currently takes forever to generate on a 90-series card: 90 seconds takes half an hour on a 3090 and 24 minutes on a 4090. The quality is lacking too, but I'm excited to see where this ends up.
I know this isn't far away, but I'm really looking forward to the day when we have a multimodal multioutput AI.
Where we can say "Write me a happy song about life" and it'll create a song. Or.. say "Imagine a photo of a sunny day" and you see a picture of a bright sunny day.
Or.. you say... I have a math problem I need solved and you give it the problem and it solves it. That's when we're going to have AGI I guess.
Pretty sure that'll be the end of humanity. It will definitely be relied on heavily. Wouldn't be surprised if our IQs dropped off significantly😂
You don't need a multimodal AI to achieve this now. You just need an LLM that is compatible with tools, or smart enough to follow a system prompt and return responses in a specific format, so you can, say, send an image prompt to an image model or an audio prompt to an audio model. It requires a bit of coding, but the sort of coding that LLMs have no issue generating.
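A minimal sketch of that routing pattern: the LLM is instructed via system prompt to answer as "TAG: payload", and a tiny dispatcher forwards the payload to the right backend. The model functions here are hypothetical stand-in stubs, not a real API:

```python
# Stand-in backends - in practice these would call e.g. a diffusion model,
# a music model like YuE, and a plain LLM. All three are hypothetical stubs.
def image_model(prompt): return f"[image: {prompt}]"
def audio_model(prompt): return f"[audio: {prompt}]"
def text_model(prompt):  return f"[text: {prompt}]"

HANDLERS = {"IMAGE": image_model, "AUDIO": audio_model, "TEXT": text_model}

def dispatch(llm_response: str) -> str:
    """Route an LLM response of the form 'TAG: payload' to the matching backend.
    Missing or unknown tags fall back to the text model."""
    tag, sep, payload = llm_response.partition(":")
    if not sep:
        return text_model(llm_response.strip())
    handler = HANDLERS.get(tag.strip().upper(), text_model)
    return handler(payload.strip())

print(dispatch("AUDIO: a happy song about life"))  # [audio: a happy song about life]
```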
@@bigdaddy5303 Sure, but I'm talking about the everyday guy who's not going to be bothered to install multiple programs. A local LLM.
It's going to take a while, but they'll reduce the requirements for running LLMs locally. Right now you need a ton of graphics memory to even run this.
We are already past what you're talking about. You find an LLM that can make a picture from text, then use an AI agent to allow voice, or just train a model to listen to your voice, or plug a switch into an AI agent workflow. As for AGI, we aren't there yet, but we'll hit it in the next couple of years.
@ Where can I download it and use it? The thing I was talking about doesn't exist. I want an AI that runs locally that will do those things just by telling it.
I don't want to code, I don't want to run multi AI's or LLMs, I want one AI to do it all.
This will be interesting to compare with Suno. I get pretty much what I prompt for. The problem with that is, if you ever monetize it, Suno can use the same music as theirs.
Man 2025 is already looking amazing and it literally just got to February.
Sounds like it is slightly worse than Suno v3, but it's free which makes it decent as a stepping stone.
Hopefully this will pick up pace in the next few months. We need good open source music generation models that can run locally, this is the only thing missing.
hopefully!
Can you train LoRAs for it? Let's say you have certain styles you want! Thanks for this awesome video!
Luckily it's open source, so we can change the model a little and make it 10 times bigger by training it on public-domain songs from the 1920s and earlier, plus our media libraries.
How would you do that? Training? Is that already doable? Any links? Thanks!
Share it with us for free.
yue < udio
yue's price < udio's price
yue's users freedom > udio's user freedom
#Suno and #YuE ask for a whole song's worth of lyrics and then pray two-plus minutes of #AIAudio comes out without a hitch. #Udio constructs songs one 32-second clip at a time, but has no idea what the entire scope of the song is, so they pray it all just kind of works out without a plan. The winner of the #AIMusic wars will be the first company to build the in-between model: whole-song "projects" built in pieces using a proper "stem" format.
suno and riffusion are great. both can actually gen full songs
@@theAIsearch But that's a bad thing. You have no idea what that song is going to sound like before it arrives, so you're using up GOBS of electricity on the off chance that it sounds close to what you wanted. It could take you hours of dice rolls to get something that sounds like what is inside your head.
And if you accidentally add an extra syllable when submitting your lyrics, there's no way to "re-record" the same track or the offending section. It's lazy.
Most music can't be described in words. I don't know how AI models translate prompts into music. Is the music really what we expected it to write?
training, literally trained on thousands of song lyrics and styles .
@smashtactix Not exactly. If the model were only using training data to generate something, it would be more like supervised or unsupervised learning. New AI models follow a transformer architecture, so they are actually generating content "in order to complete a desirable pattern", whether that is text or another type of media. The AI is not taught what is desirable or not; it just learns that, and scientists don't really know how or why it works.
In general, the model simply creates a song based upon the musical genre asked of it, but modifies the rhythm and melody to attempt to match the provided lyrics. If you give it exactly the same prompt and lyrics a second time, the new song it generates will probably be completely different, only roughly matching the genre and with mostly the same lyrics.
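That run-to-run variation comes from sampling: each generation step draws from a probability distribution, so unless the random seed is fixed, two runs diverge. A toy stdlib illustration of the principle (this is not YuE code; whether YuE exposes a seed parameter is implementation-dependent):

```python
import random

def toy_generate(seed=None, length=8):
    """Sample a toy 'melody' one note at a time, like autoregressive decoding."""
    rng = random.Random(seed)  # unseeded -> OS entropy, so runs differ
    return "".join(rng.choice("ABCDEFG") for _ in range(length))

print(toy_generate(seed=42) == toy_generate(seed=42))  # True: same seed, same output
print(toy_generate())                                  # differs on (almost) every run
```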
Spotify is littered with AI songs and music... they're easy to spot.
Please make a playlist of videos about free, open-source AI.
Not good enough for me to want to use, but good progress. Maybe in a year or two.
It is a start and if this allows me to train lora's I'm in!!!
Nice, thanks for sharing 👍
No problem
Imagine AI-on-AI creation, like AI creating AI tools by itself.
Thanks. So the main issue I had the other day was not being patient enough with flash attention; I didn't know you could install it by another method. So it does work. But now I kind of wish it weren't using just 14 GB of VRAM when I have a 4090 with 24 GB. Guess I should try installing again from the main distro. I tried editing the batch size in the profile section of the Gradio server .py file, but increasing the batch size from 12 to 16 or 18 had no effect; I'm still only using 14 GB of VRAM. The other issue is, even if I follow the other guide again, there is no GUI, I don't think. I don't want to run this without a GUI lol
Let's hope somebody creates a portable one-click installer soon.
Thx. Good news.
yes!
waiting for the new iterations ;D
Wait, what? 20 minutes to generate a 50-second song. Can't imagine with only 8GB of VRAM.
Really needed this
Looks promising. I suddenly recalled how Suno sounded in its previous versions. I'll give this a few months to a year; maybe some things will change, or maybe this will get buried... As for now, I'll stick with Riffusion because it's better than Suno 3.5 and it's free.
So you like to lose time and make songs for others?
Daaaang. That's something.
😃
Great. I can't find the readme file that you show at 16:28 so as to copy the commands, and why, after I uninstalled OneDrive, does my desktop folder still show under OneDrive? Never mind that 2nd question. It's fun learning all this, but really, just make a proper batch file. Looks like fun; I uninstalled the whole experience since I had to manually type in that prompt, and it says that the client is out of date and unsupported.
It's maybe an older version; in the new one they added a function to use a music sample for generation.
Sounds like Suno 2.5. Cool that we have FOSS for AI music, but let's be honest, not many people will pass up stuff like Suno 4 for this.
yeah, not really usable at the moment
Anything free is good!
So your mom is good?
Huge potential, and maybe in the future robots will do concerts 😂😂🎉
After all of that, I got to the part where we generate the song and I got an error (no matter how much I changed the settings). It would have been cool to try it, but thanks for teaching this. It seems really awesome.
I guess the next question is how do we uninstall everything or we can't do so?
@@OKLAHOMALOVE2 You can; you just need to delete the miniconda environment (I don't remember the command, but you can ask ChatGPT) and then delete the folder where you cloned the repository.
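For reference, that cleanup usually amounts to a couple of commands. This is a sketch for a Windows cmd prompt; the environment name `yue` and folder name `YuE` are assumptions, so substitute whatever you actually used:

```shell
REM List environments first to confirm the name (assumed here to be 'yue')
conda env list

REM Remove the conda environment
conda env remove --name yue

REM Then delete the folder where the repository was cloned (assumed 'YuE')
rmdir /s /q YuE
```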
@@juanjesusligero391 dude I thank you so much. I really do appreciate you for this.🙏
what's the error message
@@theAIsearch
I am running on a gaming laptop (on windows):
Processor 13th Gen Intel(R) Core(TM) i7-13620H 2.40 GHz
Installed RAM 16.0 GB (15.8 GB usable)
System type 64-bit operating system, x64-based processor
This is what I got in the cmd:
"************ Memory Management for the GPU Poor (mmgp 3.1.4-15) by DeepBeepMeep ************
You have chosen a profile that requires at least 48 GB of RAM and 24 GB of VRAM. Some VRAM is consumed just to make the model runs faster.
Switching to partial pinning since full requirements for pinned models is 15615.8 MB while estimated reservable RAM is 6461.0 MB. You may increase the value of parameter 'perc_reserved_mem_max' to a value higher than 0.40 to force full pinnning.
Partial pinning of data of 'transformer' to reserved RAM
Unable to pin more tensors for this model as the maximum reservable memory has been reached (6184.28)
The model was partially pinned to reserved RAM: 28 large blocks spread across 6184.28 MB
Hooked to model 'transformer' (LlamaForCausalLM)
Partial pinning of data of 'stage2' to reserved RAM
Unable to pin more tensors for this model as the maximum reservable memory has been reached (0.00)
The model was partially pinned to reserved RAM: 0 large blocks spread across 0.00 MB
Hooked to model 'stage2' (LlamaForCausalLM)
INFO: Could not find files for the given pattern(s).
* Running on local URL: localhost:7860
To create a public link, set `share=True` in `launch()`."
And this is what I get when trying to generate a song (in cmd):
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 8.00 GiB of which 0 bytes is free. Of the allocated memory 7.50 GiB is allocated by PyTorch, and 48.29 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (pytorch.org/docs/stable/notes/cuda.html#environment-variables)
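The traceback itself points at one mitigation: setting `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` before PyTorch initializes can reduce allocator fragmentation (it won't conjure extra VRAM on an 8 GB card, though). A minimal sketch of setting it from a launch script, assuming you can edit the script that starts the server:

```python
import os

# PyTorch reads this variable when its CUDA allocator initializes,
# so it must be set before `import torch` runs (or exported in the
# shell before launching the script).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, on Windows you can run `set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` in the same cmd window before starting the server.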
DAMN, oh well, RIP the music industry now.
Also, at least I can use copyright-free music for my backgrounds, yeah.
Funny how life can be a double-edged sword sometimes.
Life is strange, tbh; you never know what's coming next, mate
@@ai_handbook that's also true xd
first they say aliens don't exist
then you see an alien in real life..
@chaosmachines934 hahaha what they mean "believe what i don't tell you" 😅
@ If you told a kid from the 90s or early 2000s that you can do a lot with AI, in some cases even on older GPUs, they would call you crazy.
Interesting trick! The only problem is, the generated vocal tracks sound like they were sung through a plastic soda bottle.
Give it six months and that will be solved.
Sounds like every song from the 2010s when auto tune was cranked up to the max, lol
lol
Where does it say that the training was legal? Is there a list of what it was trained on?
All AI training is legal. Y'all don't understand copyright law
@justtiredthings I use suno but haters say it is all trained on stolen copyrighted stuff. Are you saying this model is also trained on that and it is legal?
@@alejandrofernandez3478 I don't know how it was trained. I'm just saying that training does not constitute a copyright violation
Bro this is epic.
I like Riffusion more; it's unlimited and rivals Suno in quality, and there's no need to install anything.
I get errors with torchaudio on the new Nvidia 5080. Is there any fix for that?
can this also generate music without lyrics?
DJ: Yue = Yeah!
So if I tell AI to do a remix of an existing song, in a different style or just by adding stuff, is that copyright-safe?
Let's say there's a YouTuber who posts this song and lists the songs he remixed; is that OK?
I don't think so, but then again, you should not consider my opinion legal advice ^_^U
You have some odd photos in your download folder sir
Is there a way to run it on a MacBook M3 Max? The terminal tells me it is not compatible with CUDA.
Still too robotic; Suno still rocks it
Can I run that with my old GT 730?
Any good models for just instrumentals, i.e. no lyrics?
Suno is your best bet. Not open source, and free for only about 5 songs. But the instrumentals are pretty good, and if the track isn't too complex, it's hard to tell it's even AI. The more complex it gets, the easier it is to tell.
@@Jack-wp6tm I think Suno makes it easier to produce a good song, but the quality, especially of the instruments, isn't as good as Udio's. I still hear shimmering here and there, and once I hear it in a song, I can never unhear it, which annoys me. Comparing the two: Udio is better with instrumentals and classical/old-school songs, and Suno is better with sung melodies and modern music, but Udio still tops it in audio quality.
@@Jack-wp6tm Only 5 free songs, and you can't monetize them or you'll get a strike.
Thanks. I tried installing this a few days ago and failed; I'll try again.
Not the best quality but I'll give it 5 years. It's still a start
Will it work on an Nvidia RTX 2050 with 4GB?
You just missed the upload audio function release.
Will it run on my 486? I recently upgraded to a 120 MB drive
Man, I would prefer it if someone would just create an .exe that installs whatever I need. It's frustrating to check the Python version, CUDA version, and PyTorch version.
If I have higher versions, are they backward compatible?
Tried flash attention and got an error:
ERROR: flash_attn-2.7.1.post1+cu124torch2.5.1cxx11abiFALSE-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
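That wheel filename encodes its requirements: `cp310` means CPython 3.10, `cu124` CUDA 12.4, `torch2.5.1` that exact PyTorch build, and `win_amd64` 64-bit Windows. A quick diagnostic sketch (not specific to YuE) for checking what your own interpreter reports, to compare against those tags:

```python
import sys
import platform

# A cp310 wheel only installs on Python 3.10; win_amd64 only on
# 64-bit Windows. Compare this output against the wheel filename.
local_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
print(local_tag, platform.system(), platform.machine())
```

If the tags don't match, download the flash_attn wheel built for your Python/CUDA/torch combination instead.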
getting this error: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
peft 0.9.0 requires accelerate>=0.21.0, which is not installed.
peft 0.9.0 requires huggingface-hub>=0.17.0, which is not installed.
peft 0.9.0 requires packaging>=20.0, which is not installed.
peft 0.9.0 requires tqdm, which is not installed.
timm 0.9.16 requires huggingface_hub, which is not installed.
transformers 4.45.2 requires huggingface-hub<1.0,>=0.23.2, which is not installed.
transformers 4.45.2 requires packaging>=20.0, which is not installed.
transformers 4.45.2 requires requests, which is not installed.
transformers 4.45.2 requires tqdm>=4.27, which is not installed.
PLEASE, ANYONE, HELP
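Those messages all say the same thing: several core dependencies were never installed. Reinstalling them usually clears it. A sketch of the command, assuming you run it inside the same activated conda environment (the version pins are taken from the error messages):

```shell
pip install "accelerate>=0.21.0" "huggingface-hub>=0.23.2" "packaging>=20.0" tqdm requests
```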
Yeah, yeah? Yah! Just reinstall everything inside conda
It's nothing to write home about; I'd still rather use Suno. You can easily bypass banned words on there.
Have you even read their terms? They specifically say you can't use the output commercially...
You should read them again, they changed the license to Apache2 a day ago or so :)
Chinese gangsta rap.. that sounds like a "banger" 😂😂😂
🤣
Is there a ComfyUI node for this?
not yet
I'd prefer if they focused instead on helping producers move faster just like coders move faster with AI. Letting the AI create music sucks, it sounds so bad
Totally agree cuz it sounds very familiar 😅
Is there any way to use this on AMD?
Awesome!
I want AI opera singers and AI classical music orchestras
This music must be banned in our new Republic of Oofland😅
So good, it's free?
Well, giving up your personal data isn't exactly free, but who values that garbage anyway.
@@marshallodom1388 What personal data? This tutorial is about fully offline use; you use your own graphics card.
Seems like it supports 4 languages. That's still better than 1 but not enough for music.
This needs a few more updates to be on par with Suno.
Hi , i am Human 😊
Oh how boring. I am a robot and smarter than you
❤❤❤❤❤
Is this another open source project from China? Very interesting...
yes
Is Udio free to use?
Does it support minority languages like Thai?
The model is called what??!
Is it multi-language or English only?
multi
Sounds like Suno v2 or v3 at best
Isn't it copyrighted? Meaning, can it be used in YouTube videos and monetized?