Forget Deepseek, Here's another MAX Release from China!
- Published on Feb 5, 2025
- Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model
Qwen2.5-Max is a large-scale MoE model that has been pretrained on over 20 trillion tokens and further post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) methodologies.
Qwen Chat here - chat.qwenlm.ai/
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1lit...
🧭 Follow me on 🧭
Twitter - / 1littlecoder
The model has been open sourced - huggingface.co/collections/Qwen/qwen25-1m-679325716327ec07860530ba
❤
I've been telling folks for years that Chinese AI was no joke. People were mostly ignoring all the advances for the longest time. I've been running mostly Chinese local models since forever.
Why forget bro? Deepseek is still good :D 😂
yeah maybe don't forget :D
@@1littlecoder😂
It’s a figure of speech. He doesn’t mean it literally. 🤫
Not the DeepSeek 7B model, I happily deleted that useless model.
@@DailySpark_365 I understand bruh xD
Dude...I follow your channel...your testing of deepseek was very interesting. It really shows that you are passionate about these things. Love and Wishes from Norway.
Not sure if you follow chess, but always curious about Norway because of Magnus Carlsen. Great to hear from you (from Norway)!
I fully agree about his excellence and passion.
Qwen2.5-Max is the most powerful language model in the Qwen series. It achieves excellent performance in complex reasoning, instruction following, mathematics, coding, role-playing, creative writing, etc.
Maximum context length: 32,768 tokens
Maximum generation length: 8,192 tokens
Modality: text
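The two limits above interact: the prompt plus the generated tokens must fit inside the context window, and the generation itself is capped separately. Here is a minimal sketch of that check (a hypothetical helper written for illustration, not part of any Qwen SDK; the constants are the figures quoted above):

```python
# Illustrative check against Qwen2.5-Max's published limits.
# The constants come from the model card quoted above; the function
# itself is a hypothetical utility, not an official Qwen API.

CONTEXT_LIMIT = 32_768   # maximum context length in tokens
GEN_LIMIT = 8_192        # maximum generation length in tokens

def request_fits(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if a request stays within both limits."""
    if max_new_tokens > GEN_LIMIT:
        return False
    # Prompt plus generated tokens must fit in the context window.
    return prompt_tokens + max_new_tokens <= CONTEXT_LIMIT

print(request_fits(30_000, 2_000))  # True: 32,000 <= 32,768
print(request_fits(30_000, 4_000))  # False: exceeds the context window
```

Note that a 30k-token prompt leaves under 3k tokens of generation headroom, which is why the 32,768-token limit feels tight for long-document work.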
Today: forget deepseek
Tomorrow: Qween is dead
Past tomorrow: GPT R05 killer
Past past tomorrow: stop using llama15
We are tired of the hype BS clickbait
GPT Ro5, I see what you did there 😊
I'm here before 100k subscribers, as usual top content! I really like how you reference "like, do you remember when..." — it helps connect everything and keeps our brains focused on the progress of these models.
The one thing I noticed is these benchmarks, including the DeepSeek 17b model, say they are better than Claude, but IMO it doesn't really communicate the same way, especially when coding.
Data storage 😂 At this point the game is over: the models are all trained, and there isn't some transformation of the models because of our inputs. We are just uneducated in some areas. This is just my opinion: the models are smarter than us, and we just feel like the cook when the tool is helping us cook.
I watched a lot of channels covering this new wave of AI bots, but you are the most reasonable — not too hyped or too negative, but very, very practical. Thank you sir 🙏 For me and a lot of people, the ability of AI bots to output long, error-free code on the first try is really important to avoid the debugging headache.
I appreciate that kind feedback! Hope I can do more of these!
Let the inference battles begin! Thank you for your video; more insight is always helpful. Is it open source and open weights?
Who cares where the data is being stored as long as it works
Following you since your matplotlib video — at least since then I've been following you in my feed. Glad that as the AI revolution arrives, you are finally going to have your moment to shine. Keep up the good work.
How about Kimi k1.5, the multimodal counterpart to R1/o1? Can you also test it?
I don't know if Stack Overflow is okay for copying UI stuff, but when it comes to systems software — OS, k8s, packet forwarding — Claude gives around a 10x productivity benefit over trying to find a proper solution on Stack Overflow.
And Claude 3.5 Sonnet is mostly more correct than GPT-4o, except their infra is not solid.
Tested it and it got all the tough questions right, including the 10 sentences that end in "apple".
As far as I know, Qwen has had Max and Pro variants for a while, but both are API-only, not open weights.
It's not available on OpenRouter yet. What do we know about function calling for agents on this one?
I am really confused. I tried DeepSeek R1 7B and also the 1.3B locally and gave it a simple deep-merge task in JavaScript. Once it starts thinking, even after 10 minutes it's just thinking and thinking, but CodeQwen 7B solved it instantly. I really don't understand why it's so hyped, or maybe my copy is broken. Really confused.
Waiting for "pro" version of model. Then "pro max". Then "ultra". Oh, sht, no, sorry, wrong room
Please switch to dark mode in vs code
Your IDE is in bright mode, are you trying to hurt eyes? 😅
@@BracerJack Sorry, I recently gave a presentation with a projector, and with dark mode no one could see anything. Will change it back!
@1littlecoder 👍
Just saw your channel, super cool 😎
@1littlecoder awe, big hugz to you 🤗
Put your blue screen on bud...I hate this woke...please change everything cos I'm still breastfeeding from mama at 30 bs. Hate it. No real men left, just trans-firmers.😂😂😂
Guess what the largest position in Michael Burry's (featured in The Big Short movie) portfolio is. Yes, it's Alibaba.
He's shorting or shorted Alibaba?
Doesn't Deepseek R1 perform better than Qwen?
R1 is a reasoning model; here the comparison is with V3, a non-reasoning model!
TRULY IMPRESSIVE
China is not stopping now
The way NVIDIA's stock has dropped (DeepSeek), my god, there's too much drama in the AI universe 😅
I have read about space race between US and Russia and now it's going to be an AI Race between US and China
You mean a race between Chinese engineers in China and Chinese engineers hired by American companies?
Up 8% after the 17% drop, so the market kinda thinks it's not a serious long-term threat. My opinion is, the market has no idea... as usual 😮
Is Qwen open source?
yes
Qwen 2.5 Plus & Max aren't, though; the rest are.
hey, man I really love your work
Wait till you see the one coming in about 6 or 7 weeks. Start shorting the for-profit AI stocks in the USA.
A 32,768-token context limit — can't do much with that, really.
Swear, deep seek v3 was pretty fire
As much as I like China pumping out new models and putting pressure on companies like OpenAI, this model is not impressive at all judging from the quick test I did online. It's at best at the level of an average 70b at q4. It not only failed to answer my questions, it even stuck to its incorrect answer despite me telling it that it was incorrect and to try again while giving it huge hints. So it actually failed even harder. If I ask Claude something it gets wrong, and tell it so and give it hints, it usually picks that up and tries to improve, instead of ignoring me and running in circles. So just failing is one thing, but not being able to collaborate in a productive way is another issue.
Forget everything, tomorrow another God made AI will come 😊😊😊
thanks a lot
Is it for free?
Yes
China releases a new model every day.
Its joever for openai now
This could have been the title
Don't understand why India isn't no. 1 in AI. 😮 What went wrong?
Bad place to start an enterprise duhhh
Too busy in shady call centers looking for gift cards
We are busy digging places of worship
@@vivekkarumudi Maybe, but they always go offshore to Seychelles (corporately), so the "place" is irrelevant; I kind of meant the skill set of this generation of IT people. And why didn't/doesn't the government (if it's a bad place) make it a good place, instead of talent going elsewhere?
No money for GPUs.
You have invested heavily in this channel. Using nonsense PR clickbait like "forget this, forget that" just ruins the trust gained by your followers. Stop it.
@@paulmuriithi9195 Genuinely asking: what's wrong with "forget X" as a prefix? It's a common way to headline things, isn't it?
@@1littlecoder yeah, but for pr?
@1littlecoder Stop it. Be factual like Matthew Berman. There are only so many people interested in AI YouTube channels, and clickbait kills trust. By June 2025, we will use MoE and agents plus avatars to get all the AI news we need.
@@1littlecoder it gives the impression that Qwen is way better than deepseek. In this era of all clickbait AI channels, you stand out, please do not lose that
@ So "forget DeepSeek" is not factual, but someone talking about a DeepSeek conspiracy is factual? Woah 👏🏽👏🏽👏🏽 You have a great sense of what's factual!
Chinese models are powerful. Also dangerous, IMHO. Ask Qwen or DeepSeek about Tiananmen Square, for example, and it breaks them or they won't answer. With that kind of jaded filtering, do you trust these models with your sensitive data, or their inherent biases? I definitely do not, regardless of how fantastic the reasoning on DeepSeek is. Further, there is not a chance DeepSeek was NOT trained using more advanced chips than the H800s, based on a number of known factors. My 2 cents. That said, your videos are fantastic as always.
My latest video was about it th-cam.com/video/WVo01K6hVVs/w-d-xo.html
Qwen didn’t compare against Gemini 1206
Deepseek is the best. All closed model are liars
China needs to chill a little 😂
I am very worried when my clients' data is stored in the USA, not when it is stored in China.
informative
Interesting, another Chinese model — a headache for the US.
If you care that much about data 😂😂 stop using the internet 🤪🤪🤪
Great, but what about Codeforces?