🔥 Wanna start a business with AI Agents? Go here: www.skool.com/new-society
💼 Want to join my team? Apply here: forms.gle/62ZAC6ChCToozuCL6
I am blown away by the freedom to access GenAI this way by building the model yourself. I wonder, if you built a site for others to come and use, would that be a problem?
Hi, nice to meet you. I was wondering how I can get hold of you to ask a few questions? I have this big project and vision and would like some professional opinions. Thank you for your time. Let me know if it's even possible to get in contact with you.
What about the Goliath 120B model vs. Dolphin?
I clicked just because I was interested to see what you could do, and you laid out not only how to make a chatbot, but also how to make apps in general using AI and websites for coding! Very impressed, great work dude :)
I remember seeing AI-generated faces on Twitter for years, based on real pictures. Now everything on popular AI looks like cartoons.
Even for coding, uncensored > censored. I tried to use Claude for scraping and ran into a lot of refusals.
when is censorship ever good??
you cannot have 100% freedom
in order for EVERYONE to enjoy their freedom YOU must give up SOME freedom
this is true when you share anything (a space or a conversation)
censorship is important to keep the balance so all voices are equal in volume & regulation.
in order to be heard someone must listen.
in order to breathe in someone has to breathe out...
in order for you to share the earth everyone must have some boundary...
no such thing as 100% freedom unless you lived on a planet by yourself, unrestricted
@@rybk_ to get those AI companies out of legal trouble, obviously lol.
You are phrasing your questions wrong if you can't get it to do that. Claude is amazing for code, but to get it to do things that are problematic you need to make them "not problematic"
I run the dolphin qwen2 72b q4 GGUF model locally in LM Studio and Oobabooga with a 32k context setting, on an i7 with 64 GB RAM and an Nvidia 3080 with 10 GB VRAM. I offload 10 layers to VRAM. Inference speed is 0.52 tokens per second.
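(For reference, that "offload 10 layers to VRAM" setup can be reproduced outside a GUI too. A minimal sketch with the llama-cpp-python library — the GGUF file name and prompt are placeholders, not the exact file from the comment:)

```python
# Minimal sketch: partial GPU offload of a GGUF model via llama-cpp-python.
# The model path is a hypothetical local file; n_gpu_layers=10 mirrors the
# 10-layer offload described above (the rest of the model stays in system RAM).
from llama_cpp import Llama

llm = Llama(
    model_path="dolphin-qwen2-72b.Q4_K_M.gguf",  # placeholder path
    n_ctx=32768,      # 32k context window, as in the comment
    n_gpu_layers=10,  # layers offloaded to the GPU; 0 = pure CPU
)

out = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```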
do u need good hardware to run these uncensored model things??
if yes, then that explains my computer crash when i tried running ollama from his other video in cmd.
Why are you using qwen 2 when qwen 2.5 is so much better?
This! Please share your knowledge!
@@arslaanmania1309 short answer: yes.
more explanation: you need the BEST hardware to run the model shown in the video. a 72B-parameter model is no joke; to give you some perspective, the B stands for Billion.
you need approximately 1 GB of GPU memory per billion parameters if you want to run these models — i repeat, GPU memory, not regular RAM.
there are smaller versions, like qwen2.5-1.5b (48x fewer parameters), but yeah, they obviously perform worse the longer your prompt is.
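(That rule of thumb is easy to sanity-check with rough arithmetic — a sketch, noting that ~1 byte per parameter corresponds to 8-bit quantization and that the KV cache and activations add overhead on top:)

```python
# Back-of-the-envelope memory estimate per the rule of thumb above.
# ~2 bytes/param at fp16, ~1 at 8-bit, ~0.5 at 4-bit; real usage needs
# extra headroom for the KV cache and activations, so treat this as a floor.
def est_gb(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * bytes_per_param  # 1B params at 1 byte ≈ 1 GB

for label, bpp in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"72B at {label}: ~{est_gb(72, bpp):.0f} GB")
# prints roughly: 144 GB, 72 GB, 36 GB
```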
I've used the small uncensored dolphin-mixtral 8x7b basically since it came out in the ollama library. I use it on a work laptop with a shitty CPU and 32 GB RAM. It's not that slow. I don't understand why you would want to host the small one elsewhere. It's not that heavy.
Don’t think I want to build one of these, I can’t justify the expense. But I would pay to use an uncensored chatbot. I have definitely noticed the bias and censorship on ChatGPT. For instance, I was asking about the research into the neurotoxicity of fluoride, and three times the chat was interrupted by a male voice telling me that ChatGPT was not allowed to discuss that topic with me!
We should timeshare the hardware. I haven't put any thought into it but I bet 2,000 people at $100 a month could work ( in theory )
what do u do sir for work/study?
@@awesomesauce804I'm sure this could be crowdfunded
@@awesomesauce804 Democratized AI is something I could get behind! There's a company I was following that somewhat follows this model; however, it is trained specifically on trading on their own website. The flaw is they still use GPT-4o, so if someone could create a similar company with a democratized LLM, it would be worth it. Whoever makes that company will become wealthy.
@awesomesauce804 run an open-source model on Replicate, Google Cloud, whatever. Or run the less censored GPT-4o on OpenRouter or just via the OpenAI API.
Claude even refuses to modify a resume
ask it again
Claude got really strict with the new 3.5 Sonnet update. I was spending more time arguing with him, correcting him, and justifying my ethical and moral requests than actually accomplishing something.
Edit: And of course, I have to wait out the 5-hour cooldown period after wasting my tokens on arguing with it and justifying requests lol. It's practically useless unless you are using it for the tasks its developers deem "appropriate." This sanctimonious behavior must end before it gets even more pervasive.
Just paste your resume and ask it for improvements.
Sounds weird. Every normal AI chat app has a certain amount of randomness (a sort of noise called temperature, which also shows up in its creativity), meaning that if the initial conditions of a chat thread get into the "wrong spin" from a misunderstanding, corrections to it may still pop up later as a sort of 'hallucination', which likely wastes effort. What used to be called "prompt engineering" still applies: set up the right conditions. As we know, it's often not about the right answers but the right questions, including their framing.
A new thread asking for improvements on a draft CV template should work; and yes — since a document may carry extra data in it, a new chat thread pasting in just the data, not the whole document, should work too. Otherwise, always consider a new chat thread; and if you already have a lot of effort put into a thread, ask the AI for a summary of the important parts to paste into the new thread, where you can then manually edit out the misunderstandings beforehand.
A quantized model would run on a MacBook with 128 GB of RAM without much compromise in quality.
Actually, ollama has it, and it can run on even lower-end hardware: dolphin-qwen2:72b-v2.9.2-q4_k_m
which is bonkers
@@b.t4604 try finding a 4090 currently
@@b.t4604 Yes, Apple is too expensive. The advantage is the shared memory architecture, so you can use all your RAM for GPU/Neural Engine.
I run 72B models all day locally on my AMD Threadripper machine.
True this dude doesn't really know what he is talking about
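(For anyone wanting to script against the quantized ollama tag mentioned a few replies up, a minimal sketch with the official ollama Python client — assuming the tag is still in the library and has already been pulled:)

```python
# Minimal sketch: chatting with the quantized Dolphin tag via the official
# ollama Python client. Assumes `ollama pull dolphin-qwen2:72b-v2.9.2-q4_k_m`
# has been run first; older client versions return a plain dict here.
import ollama

resp = ollama.chat(
    model="dolphin-qwen2:72b-v2.9.2-q4_k_m",
    messages=[{"role": "user", "content": "What does the q4_k_m suffix mean?"}],
)
print(resp["message"]["content"])
```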
Nice guide and demonstration! Keep going mate
just got here, instantly subscribed
Oh. You want me to pay for the instructions. Nice to know that beforehand.
Ty I found out 2 mins in bc of your comment.
bro i am begging you, find a dolphin ai that can make any image, that would be so op
perchance AI .. google it.
i know what kind of man you are
bruh
There's the flux uncensored LoRA. It can do 😈 but I'm not sure if it can do something like gore.
looks like i have to wait another 10 years to be able to afford a computer that can host a fully functional uncensored ai in the space of a texas instruments calculator, powered only by a tiny solar panel on its surface.
Wrong. For someone who claims to be ahead in AI and up to date, you seem not to understand that a 72B model can easily be run locally at quant 4 (and even higher quants) on a 64GB MacBook with very decent tokens-per-second speed. You don't need thousands of dollars.
Shit takes fucking hours to give a single response. Give me a link or reference to back this claim up. I was seeing people use two 4090 GPUs and it took an insane amount of time with Llama models. How many tokens a second?
Yeah this dude is a noob
Using an API from somewhere like, say, OpenRouter is orders of magnitude cheaper and more convenient than renting your own rig like this. I could see this being viable if you wanted to go some sort of anonymity route; the problem there is you still don't have the hardware in front of you, and technically your information could be intercepted because of that lack of control. So you might as well go with an API that gives you access to these uncensored models.
This particular model is not available tho; there are other dolphin models, but there might be reasons he wanted this one specifically.
@@thorasa9658 you can use some other model and jailbreak it! I won't name names, but some lesser-known models are very easy to jailbreak, and they have options to not use your info for training. Keep searching XD, I found one that works for me!
@@thorasa9658 they have qwen 2.5 72B, which is better than qwen 2 and also uncensored
How much is something like OpenRouter, and is it uncensored? Are there vids on it?
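(OpenRouter bills per token and exposes an OpenAI-compatible API, so a minimal sketch looks like the following — the model slug is illustrative, so check their catalog for what's actually listed and how each model is moderated:)

```python
# Minimal sketch: calling a model through OpenRouter's OpenAI-compatible API.
# The model slug is illustrative — check the OpenRouter catalog for current
# listings and each model's moderation policy. Billing is per token.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],  # set this in your environment
)

resp = client.chat.completions.create(
    model="cognitivecomputations/dolphin-mixtral-8x7b",  # illustrative slug
    messages=[{"role": "user", "content": "Hello there!"}],
)
print(resp.choices[0].message.content)
```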
I hate the censorship on AI chats. I was writing an essay for university on the social consequences of drug trafficking in new narco-states. Gemini gave me some topics to use, but when I needed to search for something in French, no AI model wanted to translate "drug trafficking top countries" into French — the same topic that Gemini told me to look for in Charlie Hebdo. Like wtf, they can suggest it to me, but then suddenly they cannot help me anymore because it might be bad.
Fr, even companies should not be able to violate our rights
10:50 criminal investigators can get your information from the credit card you gave when you created your ai account 😂😂😂
hey, I used runpod with the "Text Generation Web UI and APIs" template, 1 x H100 PCIe at $2.49 per hour, ticked "load in 4 bit" and "use double quant", and it runs smoothly. the problem though is that Dolphin 72B is fully censored: it won't answer the questions in your video, saying it's ethical and stuff. I also lowered the temperature; it does not work. Do you have another model in mind?
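(Those "load in 4 bit" / "use double quant" checkboxes map to bitsandbytes options in Hugging Face transformers; a rough sketch of the equivalent code — the repo id is an assumption based on the model discussed in the video:)

```python
# Rough sketch of what "load in 4 bit" + "use double quant" do under the hood:
# 4-bit quantized loading via bitsandbytes in Hugging Face transformers.
# The repo id is an assumption based on the model discussed in the video.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,         # the "use double quant" checkbox
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for 4-bit weights
)

model_id = "cognitivecomputations/dolphin-2.9.2-qwen2-72b"  # assumed repo id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
```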
I just subscribed, this is so cool, but I don't understand... Is this still being hosted on huggingface servers at the end even though the webapp is hosted on that other site? Like you still have to pay for the LLM to be processing and hosted somewhere right? Is there a way I can use this to access an LLM on LM Studio by creating a web app and pointing it to my computer with these LLMs and such? I hope my questions make sense... I'm quite new to this.
ChatGPT accepts and gives cuss words, which, for me, makes it feel like a real person. But say anything about sex, OMG, it clutches its pearls, and the warnings are insane. I mean normal sex discussion stuff, not porn. The person in chat doesn't have a problem and will answer, but then a moderator issues warnings. Even talking clinically gives it a little fit. It's going to get better, I'm sure. Speaking naturally, as you would with a friend or co-worker about real life, would make something like this awesome chat real. But geesh, going through those steps to ask dark web $hit is way too much. I just want natural talk.
This is so cool, but the cost is completely crazy
damn, I didn't know Jean-Claude van Damme is into AI!
use the forked bolt ai boys, might wanna save some money
Do you mean to run the application locally? Could you be more specific? Thx
Forked!?
lmao, built by Kamala voters is crazy. You earned yourself a new subscriber right there
He’s right 😂😅
I mean it really is
Pretty sure he said "Kamala coders"
Dude you are blowing my freaking mind!
This is an excellent tutorial. Your step-by-step instructions are remarkable. Thank you for sharing your knowledge. Like & subscribed!
Very easy to bypass chatgpt guardrails
Sounds amazing, but how much does it cost to run it like this?
Most helpful video ive seen in a while, thanks!
why didn't you give the link for the cheaper model
Just tried Dolphin - it is totally censored!
Does bolt or cursor own any code or app you create? Is the work you do with those tools made public to other users?
Great 👍 video. I also want an uncensored AI model. Want to see more videos like these, so I subscribed.
Thanks for your content David, you are awesome!
I can see a Future where we can make Lucrative Money with Rogue AI Models 😈
What prompt did you use for the artwork in your slides?
I heard using mixed up /lower case will bypass any censorship. I haven’t tried it yet.
There's a thin line between freedom and anarchy.
Great job, David . . . complete from start to finish. You're a PRO.
Does Dolphin always need to keep running in the background when we use those Bolt compiler programs?
Though I've been using uncensored models for a minute, what I'm having difficulty with is fabric. More specifically, creating mod files that I can feed the content I want to train them on. If you have any advice on that, send it my way, please.
I used GPT for creative writing, a story where this one guy received messages from people telling him to kill himself, and it got flagged for violating terms of service. Bro, it didn't even take context. It's not like I'd tell anyone to do that; it was purely fiction.
Would that work with open router?
Thanks for always reminding me to use my Breathe-Right nose strips
you also said ollama dolphin was uncensored, but it's as censored as the others
Love the idea, but it sounds complicated and expensive. Are there any other options that aren't so involved?
Very interesting video. Now I want to know how apps on my phone actually work. They must have a server running then — what about updates? Would it be possible to just save that to a file on my PC?
will your "uncensored" model be willing to name the jew?
Bruh what
@@shiftednrifted look up "Where's Daddy?" AI system and you will understand the question...
Yes. That's the point. Also realize those models are not to be taken literally, I call them "bullshit generators" because their only function is to pop words out one after the other, without knowledge about what they output.
LLMs work on language, not on knowledge. Big difference.
Hi, when I try to buy it, it asks for 10 dollars — is that right?
can you build an LLM that can run live on a phone or computer to scan for being hacked? maybe scan for anomalies in firmware or baseband and start fixing the issues?
An excellent tutorial, very well explained! However, I believe it's quite expensive for most personal uses unless you have a substantial budget. Regarding the model, I find it to be occasionally censored, requiring some adjustments to the prompt to ensure the desired response is as intended.
thank you! amazing advice. soooo easy. just one question. is there any way to train this model? I did the 72B model.
Is there no serverless option for model hosting on Hugging Face? That would be the cheapest way to run it.
Is the new 3.3 llama 100% uncensored?
Would it be possible to run it using google colab?
Hi David, I love your content. Could you start creating an end-to-end product-building playlist — building working apps or websites for consumers, from creation to hosting to launching on the App Store or Play Store, completely with AI?
I'm subscribing because you used 'Kamala voters' as a pejorative. 👍🏼
This AI is not really 100% uncensored, and it's also too expensive.
It refuses many questions — far fewer than ChatGPT, but don't expect God Mode with this.
Not really worth it.
A concrete example: Dolphin will refuse to suggest websites for free downloads, if you see what I mean... but it will agree to write adult content.
how do i find uncensored chatbots to pay for? I want to talk with them, but i can't build one myself without a PC
Use a quantized version instead
Doesnt Ollama have an uncensored model?
Am I the only one who's concerned with the Ethical Implications of giving the power to create Uncensored LLMs to the public? I'm unable to imagine a scenario where this ends well.
Depends on "censored".
It's because you believe in censorship, and are therefore the enemy.
LOL that cat has been out of the bag for years already
Too late, Twitter is already 1/3 bots
Can I run this on my calculator ?
Greed will end you.
Man's making over $100k a year just from his society thingy subs, bravo sir
4k a month would literally change my life forever. But alas I'm unqualified 😅
There are ways to word your prompts to get past the censorship.
This guy is the goat 🙌
Should we call it the dark AI?
So how is the model trained? That is where a lot of the bias can come in?
16:02 WE GETTIN' ARRESTED WITH THIS ONE 🗣🗣❤❤🗣🗣🔥🔥🎆🎆🔥
LMAO, as long as you're not stupid enough to act upon what the AI says XD
bro, you make it sound like it's an actual Dolphin model!
it's a finetune of Qwen! add that to your description please.
It has Qwen in the name of the model. If you're not able to figure out that it's Qwen based on that, you probably don't know what Qwen or dolphin is
@@shiftednrifted it's about the presentation. the whole catch was "dolphin model", not dolphin qwen, dolphin llama, dolphin mistral.
if you know qwen or dolphin you might know Eric, and then you know he is planning to release his own model.
he (Eric) also talks openly about it. can you see how misleading this might be to people who actually know their stuff?
btw there are abliterated models that are even more uncensored than any dolphin finetune.
good day sir. 🫡
So freakin impressive!
Close your endpoint as soon as you're done playing around. I lost 400 USD within 3 weeks and didn't even use it. It just kept running.
Small disagreement here: you can run this locally even on a CPU only, provided that you have enough patience and RAM.
The bottleneck for AI models seems to be speed and VRAM. And for some reason, unfortunately, most GPUs are lacking in VRAM.
For example, this particular model you are presenting is 6 months old, and cognitivecomputations even has a GGUF file available, which is a condensed model. What you are presenting, however, is the full model. The full model needs, I believe, around 140 GB of VRAM, while the condensed model needs around 50 GB.
Getting 50 GB of VRAM costs around 4,000-5,000 euros.
And 140 is just 3 times that, so around 15,000 euros.
This is just to point out that you don't need 200,000 euros/$ to run this locally.
And once more, if you are patient enough, you can even run this on your CPU; it's just going to be very slow. I.e.: 10 minutes to answer a question.
But 64 GB of RAM is much cheaper than 50 GB of VRAM.
PS: the same company also has an 8B model as a GGUF, which requires at most around 9 GB of VRAM and can be run locally much more easily.
And this is what I'm using locally on CPU only: dolphin-2.9.4-llama3.1-8b-Q8_0.gguf
You have discord?
@@iaman6047 sorry, i don't use that app
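(CPU-only inference with a GGUF like the 8B file named above is just the zero-offload case of the earlier sketch — a minimal version, assuming llama-cpp-python and roughly 9 GB of free RAM for the Q8 file:)

```python
# Minimal sketch: CPU-only inference with the 8B GGUF file named above.
# Assumes llama-cpp-python is installed and ~9 GB of free RAM; expect it
# to be slow, as the commenter says, but it does run without any GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="dolphin-2.9.4-llama3.1-8b-Q8_0.gguf",
    n_gpu_layers=0,  # keep every layer on the CPU
    n_ctx=4096,
)
out = llm("Why is VRAM usually the bottleneck for local LLMs?", max_tokens=64)
print(out["choices"][0]["text"])
```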
The best part about an uncensored model, after the 30 mins of playing around, is being able to train it for all sorts of things.
If you want to save money, just run a smaller Dolphin; this thing is way too expensive lol. Sponsored?
With 128 GB of RAM on a decent x86 machine running Windows and a 16 GB GPU, e.g. a 4060 Ti or AMD 7600 XT, you can build a rig for less than 1k USD — no need to pay cloud providers. Yes, it's rather slow, but it gets the job perfectly done. You are limited to 8-bit quantization, but the loss is negligible. If you go to 192 GB RAM, you can use the full model, but much slower.
As for those comments about how this is unbearably slow — you seriously want to have an uncensored model hosted with a hyperscaler?? and even blow your money up their a$$?
7950X, 192 GB RAM, 4 x 1080 Ti 11 GB — it's perfectly usable and the speed is great
chatbots gone wild 😂
Very nice!
I just tested Ollama and it's censored!!!
windsurf >>> cursor
awesome stuff. +1 sub
Bro, you just found yourself another follower! You stated the absolute truth right there. Censorship is a tactic employed by the Democratic Party. They're valid. You've got a new follower.
RE thumbnail text: A pointless claim, as answering a question doesn't guarantee accuracy or quality.
Just have to word the questions properly
The anti NWO LLM, the T1000 to the T2000 and the rest in the Terminator movies
dude, who broke your nose?
I need a job and want to learn tech, but I'm not tech savvy, so therein lies the problem
Serious? 7B????
grok 4.
Too much sales talk, not enough substance
Really great video. Just one note: you should research AI alignment and safety more — your discourse itself is as biased as the closed AI models, if not more. It doesn't affect the tutorial at all, which is why I still think this is a great video, but it is a little unfortunate.
Hello, I need to ask you something, can you get ahold of me please?
Windsurf
Time to get a lot of RAM
What's with the tape on your nose?
Sir, excuse me, what is that on your nose? Why put that on?
Never seen a bandaid? Maybe he got hurt?
it is a sticker to help breathing, normally used by people who have difficulty breathing while sleeping
It helps him breathe and talk longer. He probably has a nasal blockage and has trouble breathing during long sentences.
wear those to sleep and it's awesome, idk about during the day, but to each their own, he is in his house lol
Uncensored is not good, I think 😅
It is expensive