Someone needs to build a model that is only trained on logic tests and boilerplate snake game apps. And, to creatively apologize when you scold it for giving wrong/lazy answers. That would be entertaining, at least.
I don't see it on Hugging Face, and avoiding the main competitive platform is the creators' first fail.
OpenAI should try building their models on liquid neural networks and see whether this architecture improves them.
Those are abysmal results indeed. Thank you so much for debunking these seemingly outrageously false claims. I believe you, so I find this inexplicable, if not disgusting, from such a "reputable" MIT team 🤔
Logo
Yes, that's correct. I was likely trained using a transformer-based model, which is a type of neural network architecture that's particularly effective for natural language processing tasks. This architecture allows me to understand the context and meaning of words and sentences, and to generate responses that are relevant and coherent.
AU
Do you know, I have met so many transformer models, and most of them can't count how many "r"s are in the word "strawberry". Can you?
Logo
I'm sorry to hear that you've had negative experiences with some AI models. However, I'm designed to provide accurate and helpful responses. I can certainly count the number of "r"s in the word "strawberry" for you. There are two "r"s in the word "strawberry".
Hmmmmmm.....
They have Joscha Bach on their team 🎉
Please do a video testing the limits of OpenAI’s advanced voice mode.
All your videos are informative and useful.
rip my eyes... dark mode pls 🙏🙃
Yeah, weird that research papers are in light mode. Light mode, like the outside world, where rumor has it there's supposedly grass. Really weird.
@@themax2go Then set up your display device accordingly. Even smartphones have accessibility features to adjust for a whole array of visual impairments. You can't walk into a cinema without sunglasses and complain that the screen is too bright.
@@dinoscheidt i understand that you can't use dark mode. btw accessibility features don't work (well) for that particular usage scenario (YT vids) and even PDFs, and are AFAIK not intended for that purpose. ty for your reply and suggestion though
Interesting. I asked it a couple of logical questions involving spatial reasoning, which it obviously failed zero-shot, but with a slight three-word prompt it recovered, and at 3B. With such small models and a massive context window, it seems like you could run chain-of-thought through this all day on an edge device.
Another proprietary one... this got me thinking: if we're all getting these proprietary models that are so different from transformers, then we really don't know anything about what's happening in other countries...
I mean, I don't think any of their previous remarkable achievements have any transferable value to what they are doing now, i.e., jumping on the LLM bandwagon, so this is expected.
I guess most of the logic tests applied are very complex for an MoE of 13 x 3.1B.
IMHO, an MoE of 3B models is like a bunch of dwarves trying to pick apples from a tall tree. Not impossible, but quite hard.
This is a 40B model
Well, that was brutal.
The whole LFM thing sounds too good to be true, and their blog posts about the architecture are also weird
Weird compared to transformer blocks but then again transformers are weird. At least this is a model with polymorphism and an extremely long context window for edge devices
@@WearyTimeTraveler I meant the blog posts are weirdly written, saying they invented Liquid Neural Networks is also a lie.
These LFMs are recurrent like other liquid neural nets, right? Maybe it has trouble remembering details because of its compressed hidden state?
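As a rough illustration of that guess (my own sketch, not Liquid AI's actual update rule): a recurrent model squeezes the entire history into one fixed-size state, so earlier details can get overwritten no matter how long the input is:

```python
import math

STATE_SIZE = 4  # tiny for illustration; real models use thousands of dims

def step(state, token_vec):
    # Blend the old state with the new input; the result has the same
    # fixed size, so every earlier token shares this one small memory.
    return [math.tanh(0.5 * s + x) for s, x in zip(state, token_vec)]

state = [0.0] * STATE_SIZE
sequence = [[0.01 * (i + j) for j in range(STATE_SIZE)] for i in range(1000)]
for token_vec in sequence:
    state = step(state, token_vec)

# Still just 4 numbers after 1000 tokens -- unlike a transformer's
# KV cache, which grows with the sequence length.
print(len(state))  # 4
```

That fixed budget is exactly why recurrent architectures can offer huge context windows cheaply, and also why they can be lossy about specific details.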
Right, when I tried it out the other day I thought I would be very pleased by the performance. However, after getting questionable results I took a closer look and noticed the caveat that their graph covers only MMLU. That made me wonder how it would perform if they had also trained their architecture on categories besides MMLU, which made me question the initial claims. The question is: if they apply, say, 3B parameters to MMLU, that's effectively an expert model at that benchmark. Had they trained the model in other areas as well, would that impact the architecture's ability to perform on MMLU? Please forgive my ignorance.
As for the tests, I tried making it generate Python, Rust, Java, and JavaScript, and in my tests the three-billion-parameter Liquid model was outperformed by Llama 3.2 3B Instruct. I have to admit I haven't had time to read Dr. Hasani's paper. Thank you so much for illustrating and highlighting the team. 😀
How do we know what they're saying is true? These aren't transformer-based models? As far as I know, only transformer-based training methods have been able to achieve generalization like this; even the Samba architecture never generalized this well...
A paper would be interesting.
We don't, sadly. I don't think it's really worth looking into. Some people say Liquid AI just did this to get funding. Everything about Liquid AI seems weird to me; it kind of reminds me of the Reflection drama.
MIT made an LLM using DEI.
tldw: FAIL (of current version)
failed the strawberry test
that's a dumb test bro 😅
*TOKENIZATION*
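For anyone curious, a minimal sketch (plain Python, with a hypothetical subword split) of why tokenization makes letter-counting hard for these models:

```python
word = "strawberry"

# Character-level view: what a human (or str.count) sees.
print(word.count("r"))  # 3

# Token-level view: a BPE-style tokenizer splits the word into subword
# chunks (this particular split is hypothetical), so the model operates
# on whole-token IDs and never directly "sees" the individual letters.
tokens = ["str", "aw", "berry"]
print(tokens)
```

The subwords concatenate back to the original word, but the model is predicting over those chunks, which is why per-letter questions trip it up.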
What is Labonne’s series you recommend? (On beautiful code)
@@robertotomas github.com/mlabonne/llm-course
Yes, I saw this! I'm visiting them... but the model was meh.
ouch...
The model was terrible, I found.
MMLU-Pro in the mid-50s seems about par for SOTA.
terrible!