Poorman's ChatGPT-4o Works!! 🤣
- Published May 14, 2024
- This video demonstrates a working prototype of a ChatGPT-style UI powered by a GPT-4o-like model, except that it's completely powered by open-source models!
🔗 Links 🔗
Hugging Face Spaces - huggingface.co/spaces/KingNis...
Introducing OpenGPT-4o
KingNish/GPT-4o
Features:
1️⃣ Inputs possible are Text ✏️, Text + Image 📝🖼️, Audio 🎧
and outputs possible are Image 🖼️, Image + Text 🖼️📝, Text 📝, Audio 🎧
2️⃣ Flat 100% FREE 💸 and Super-fast ⚡.
3️⃣ Publicly Available before GPT 4o.
Future Features:
1️⃣ Chat with PDF (Both voice and text)
2️⃣ Video generation.
3️⃣ Sequential Image Generation.
4️⃣ Better UI and customization.
Announcement post - huggingface.co/posts/KingNish...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - / 1littlecoder
Linkedin - / amrrs
Poorman's GPT is still better than ClosedAI Skynet. Altman got lucky with first mover advantage and funding, now he thinks he's gatekeeper of humanity, he wants to lobby against open source and make hardware level identification for SOCs and GPUs.
"gatekeeper of humanity" 😒😒😒
He is trying to save us all … 😂😂
his twitter name says it all - "sama" (japanese for someone of royalty or godhood)
Ilya and others leaving is sign that Altman has too much power
Amen bro 💪
It's nice to see community around AI making their own things and open source helps a lot to bring AI to everyone.
Ah. I was worried before publishing the video. Glad to receive a positive comment ❤️
@@1littlecoder nah, don't worry, open-source alternatives deserve some attention, even if they aren't as good.
There are use cases - companies might want to save some money on subscription prices, some people might want to make something without the restrictions ChatGPT / Gemini have, some people might care about their privacy and prefer something local...
So why not.
Open source does not, and this nonsense has to stop: the vast majority of people who use AI do not know what a Gradio app is.
The compute needed to train these models means that only organisations with huge budgets can afford to do so.
Big Tech is also giving access to their AI in a way the vast majority of people can easily use.
I did the exact same thing in my platform some months ago - far more efficient than having one mega model that needs trillions of parameters to answer “Hello” all the time
Great. What was your stack?
That's a real problem; we should create a new kind of LLM that selects only the required layers and parameters depending on the input.
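The idea above resembles Mixture-of-Experts routing, where a small gate picks which sub-network handles each input so most parameters stay idle. A minimal, hypothetical sketch (all expert names and the keyword gate are made up for illustration; a real router would be a learned layer):

```python
# Toy input-dependent routing: a gate scores the input and only the
# selected "expert" function runs, standing in for activating only a
# subset of an LLM's layers/parameters per request.

def math_expert(text: str) -> str:
    return f"[math expert] handling: {text}"

def code_expert(text: str) -> str:
    return f"[code expert] handling: {text}"

def chat_expert(text: str) -> str:
    return f"[chat expert] handling: {text}"

EXPERTS = {
    "math": math_expert,
    "code": code_expert,
    "chat": chat_expert,
}

def gate(text: str) -> str:
    """Toy gating function: keyword heuristics stand in for a learned router."""
    lowered = text.lower()
    if any(k in lowered for k in ("integral", "equation", "solve")):
        return "math"
    if any(k in lowered for k in ("def ", "function", "compile")):
        return "code"
    return "chat"

def route(text: str) -> str:
    # Only the selected expert is exercised for this input.
    return EXPERTS[gate(text)](text)

print(route("solve this equation"))  # math expert
print(route("hello there"))          # chat expert
```

In a real MoE model the gate and experts are trained jointly and the router typically selects the top-k experts per token, but the dispatch structure is the same.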
@@1littlecoder Llava 1.6, an open-source LLM (Mixtral, Llama 3, etc.), Whisper for transcription, and Bark for voice synthesis
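That stack is a cascade: speech to text (Whisper), text to reply (an open LLM), reply to speech (Bark). A hypothetical sketch of the wiring, with each model call stubbed out (in a real app each stub would invoke the corresponding library):

```python
# Cascaded voice-chat pipeline: ASR -> LLM -> TTS.
# The three stage functions are placeholders standing in for
# Whisper, a Mixtral/Llama 3 chat completion, and Bark respectively.

def transcribe(audio_bytes: bytes) -> str:
    # Stand-in for a Whisper transcription call.
    return "what is the capital of France?"

def generate_reply(prompt: str) -> str:
    # Stand-in for an open-source LLM chat completion.
    return f"Answer to: {prompt}"

def synthesize(text: str) -> bytes:
    # Stand-in for Bark audio generation.
    return text.encode("utf-8")

def voice_chat(audio_bytes: bytes) -> bytes:
    """Run the full cascade; each stage is an independent model."""
    transcript = transcribe(audio_bytes)
    reply = generate_reply(transcript)
    return synthesize(reply)

audio_out = voice_chat(b"...")
print(audio_out.decode("utf-8"))
```

The trade-off discussed later in the thread applies here: a cascade like this is easy to assemble from open models, but it loses prosody and latency compared with a native speech-to-speech model like GPT-4o.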
Thanks for the video. I really like your viewpoints and insights!
As far as pronunciation, regarding "A" and "I" , the A rhymes with "way", "they", "say" and the I rhymes with "eye", "sigh", and "sky".
Thanks for the info! Will try to improve
Love the project
I've tested it, and the voice chat has improved; it now even supports live chat and video generation. It's exciting to see the direction the open-source community is heading.
Hey 1littlecoder, just curious to know more about the free plan of GPT-4o, what do you think the scenario is? Is the free usage based on a limited number of input tokens or prompts? I hit the paywall cap really quickly; it wasn't even 10 prompts for me.
It's cool to look at the source like you did and see that it's actually not that much code.
great video
Nice.
Hello my friend, thank you for the video! Is it possible to run this locally with a RTX 4090 and if so how do we install it? Thank you again!
Thank you for the response 🙏🙌
I guess people, and you, do not understand the gist of GPT-4 Omni. Open-source technologies like vision chat, text-to-speech, or speech-to-text were already out there for a long time. Btw, why poorman's? Omni can also be used by free users
we understand it's multimodal; Gemini was there before as well
@@1littlecoder I think the point is more that it uses speech-to-speech instead of speech-to-text-to-text-to-speech, isn't it? Still enjoyed the video though. Can't wait until we get open-source speech-to-speech models.
Maybe you are not aware, but there were already pretty good open-source vision models before that 😅
Considering this is multimodal: if one wanted to run this entire stack locally, what would be an estimate/guesstimate of the minimum GPU resources required? Are we talking 20 H100s, or 1 RTX 6000? Remember... guesstimate.
How much electricity do you want to invest?
64GB RAM for Yi 1.5-33B + Llava with Ollama and OpenWebUI, plus Whisper and XTTS v2...
@@AltMarc Sounds reasonable. So one NVIDIA Ampere A16 64GB GDDR6 250W GPU (D1P1T-OSTK) at $4,100 USD, plus probably another $2K for a 24-core workstation-class PC with 256GB RAM. Total ~$6K investment. Now the question is what the ROI is at the personal-use level.
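One rough way to frame that ROI question is break-even against a hosted subscription. A hypothetical calculation, using the figures from the comment above and an assumed $20/month plan (all numbers are illustrative, not quotes):

```python
# Break-even estimate: local rig cost vs. a monthly hosted-chat plan.
gpu_cost = 4100            # NVIDIA A16 64GB, per the comment above
workstation_cost = 2000    # 24-core PC with 256GB RAM, per the comment
monthly_subscription = 20  # assumed ChatGPT Plus-style plan

total_cost = gpu_cost + workstation_cost
breakeven_months = total_cost / monthly_subscription

print(f"Total hardware: ${total_cost}")          # $6100
print(f"Break-even: {breakeven_months:.0f} months")  # 305 months (~25 years)
```

On subscription savings alone the rig never pays off; the case for local has to rest on privacy, unrestricted use, or heavy multi-user workloads instead.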
LLaVA-next-video and xtts to rvc.