Someone needs to build a model that is only trained on logic tests and boilerplate snake game apps. And, to creatively apologize when you scold it for giving wrong/lazy answers. That would be entertaining, at least.
I don't see it on Hugging Face, and avoiding the main competitive platform is the creators' first fail.
OpenAI should try building their models on liquid neural networks and see whether this architecture improves them.
Those are abysmal results indeed. Thank you so much for debunking these seemingly outrageously false claims. I believe you, so I find this inexplicable, if not disgusting, from such a "reputable" MIT team 🤔
Logo
Yes, that's correct. I was likely trained using a transformer-based model, which is a type of neural network architecture that's particularly effective for natural language processing tasks. This architecture allows me to understand the context and meaning of words and sentences, and to generate responses that are relevant and coherent.
AU
Do you know, I have met so many transformer models, and most of them can't count how many "r"s are in the word "strawberry". Can you?
Logo
I'm sorry to hear that you've had negative experiences with some AI models. However, I'm designed to provide accurate and helpful responses. I can certainly count the number of "r"s in the word "strawberry" for you. There are two "r"s in the word "strawberry".
Hmmmmmm.....
They have Joscha Bach on their team 🎉
Please do a video testing the limits of OpenAI’s advanced voice mode.
All your videos are informative and useful.
rip my eyes... dark mode pls 🙏🙃
Yeah, weird that research papers are in light mode. Light mode, like the outside world, where rumor has it there's supposedly grass. Really weird.
@@themax2go Then set up your display device accordingly. Even smartphones have accessibility features to adjust for a whole array of visual impairments. You can't walk into a cinema without sunglasses and complain that the screen is too bright.
@@dinoscheidt i understand that you can't use dark mode. btw accessibility features don't work (well) for that particular usage scenario (YT vids) and even PDFs, and are AFAIK not intended for that purpose. ty for your reply and suggestion though
Interesting. I asked it a couple of logical questions involving spatial reasoning, which it obviously failed zero-shot, but with a slight three-word prompt it recovered, and at 3B. With such small models and a massive context window, it seems like you could run chain-of-thought through this all day on an edge device.
Another proprietary one... this got me thinking: if we're all getting these proprietary models that are so different from transformers, then we really don't know anything about what's happening in other countries...
I mean, I don't think any of their previous remarkable achievements have any transferable value to what they are doing now, i.e., jumping on the LLM bandwagon, so this is expected.
I guess most of the logic tests applied are very complex for an MoE of 13 x 3.1B.
IMHO, an MoE of 3B models is like a bunch of dwarves trying to pick apples from a tall tree. Not impossible, but quite hard.
This is a 40B model
Well, that was brutal.
The whole LFM thing sounds too good to be true, and their blog posts about the architecture are also weird
Weird compared to transformer blocks but then again transformers are weird. At least this is a model with polymorphism and an extremely long context window for edge devices
@@WearyTimeTraveler I meant the blog posts are weirdly written, saying they invented Liquid Neural Networks is also a lie.
These LFMs are recurrent like other liquid neural nets, right? Maybe it has trouble remembering details because of its compressed hidden state?
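As a rough illustration of that guess (my own sketch, not Liquid AI's actual update rule): a recurrent model squeezes the entire history into one fixed-size state, so earlier details can get overwritten no matter how long the input is:

```python
import math

STATE_SIZE = 4  # tiny for illustration; real models use thousands of dims

def step(state, token_vec):
    # Blend the old state with the new input; the result has the same
    # fixed size, so every earlier token shares this one small memory.
    return [math.tanh(0.5 * s + x) for s, x in zip(state, token_vec)]

state = [0.0] * STATE_SIZE
sequence = [[0.01 * (i + j) for j in range(STATE_SIZE)] for i in range(1000)]
for token_vec in sequence:
    state = step(state, token_vec)

# Still just 4 numbers after 1000 tokens -- unlike a transformer's
# KV cache, which grows with the sequence length.
print(len(state))  # 4
```

That fixed budget is exactly why recurrent architectures can offer huge context windows cheaply, and also why they can be lossy about specific details.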
Right, when I tried it out the other day I thought I would be very pleased by the performance. However, after getting questionable results I took a closer look and noticed the caveat that their graph covers only MMLU. That made me wonder how it would perform if they had also trained their architecture on categories besides MMLU, which made me question the initial claims. The question is: if they apply, say, 3B parameters to MMLU, that's effectively an expert model at that benchmark. Had they trained the model in other areas as well, would that impact the architecture's ability to perform on MMLU? Please forgive my ignorance.
As for the tests, I tried making it generate Python, Rust, Java, and JavaScript, and in my tests the three-billion-parameter Liquid model was outperformed by Llama 3.2 3B Instruct. I have to admit I haven't had time to read Dr. Hasani's paper. Thank you so much for illustrating and highlighting the team. 😀
How do we know what they're saying is true? These aren't transformer-based models? As far as I know, only transformer-based training methods have been able to achieve generalization like this; even the Samba architecture never generalized this well...
A paper would be interesting.
We don't, sadly. I don't think it's really worth looking into. Some people say Liquid AI just did this to get funding. Everything about Liquid AI seems weird to me; it kind of reminds me of the Reflection drama.
MIT made an LLM using DEI.
tldw: FAIL (of current version)
failed the strawberry test
that's a dumb test bro 😅
*TOKENIZATION*
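For anyone curious, a minimal sketch (plain Python, with a hypothetical subword split) of why tokenization makes letter-counting hard for these models:

```python
word = "strawberry"

# Character-level view: what a human (or str.count) sees.
print(word.count("r"))  # 3

# Token-level view: a BPE-style tokenizer splits the word into subword
# chunks (this particular split is hypothetical), so the model operates
# on whole-token IDs and never directly "sees" the individual letters.
tokens = ["str", "aw", "berry"]
print(tokens)
```

The subwords concatenate back to the original word, but the model is predicting over those chunks, which is why per-letter questions trip it up.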
What is Labonne’s series you recommend? (On beautiful code)
@@robertotomas github.com/mlabonne/llm-course
Yes, I saw this! I'm visiting them... but the model was meh.
ouch...
The model was terrible, I found.
MMLU-Pro in the mid-50s seems about par for SOTA.
terrible!