Because of the DeepSeek model, I had to pay TekBoost a visit to grab one of their monster 1.5TB Z8 G4 dual Xeon 18-core rigs for around 4K. I need that full 128K context and don't care about the slower inference, as long as I get great results from the output.
That is an incredible setup you have there! I'm jealous lol
what if the model quickly gets obsolete and you need much horsepower for the next breakthrough model?
I used it to copy handwritten text, and it was really amazing, mind-blowing. All the other models either refused to work or failed to correctly transcribe the text and gave me unreadable output.
Mind-blowing that a 72B model can code this well - *and* has vision too!
I haven’t looked, but how big is the model file if you were to download it (if you even can)?
72B seems like it could run on something like a 128GB M4 Pro machine.
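For a rough answer to the file-size question: a quick back-of-envelope estimate is parameter count times bytes per weight. This sketch (my own numbers, not from the thread; the ~4.5 bits/weight figure for Q4 quants is an approximation that includes quantization overhead) shows why a 72B model only fits a 128GB machine once quantized:

```python
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weights-only size estimate in GB (ignores KV cache and runtime overhead)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# 72B parameters at common precisions
for label, bits in [("FP16", 16), ("Q8", 8), ("Q4 (approx.)", 4.5)]:
    print(f"{label}: ~{model_size_gb(72, bits):.0f} GB")
```

So full FP16 weights are ~144 GB (too big for 128GB of unified memory), while a 4-bit quant comes in around 40 GB, leaving headroom for context.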
The model companies are all moving slowly on the agent front, but this and the R1 revelation are telling me that quite small local "thinking" models will be plenty capable of taking on the personal assistant role: checking email, managing appointments, taking meeting notes, etc. The hangup is the potential screw-ups if you give the assistant full access to your computer and accounts, but pure capability-wise, we're getting really, really close to small, efficient, and practical assistant agents running locally.
🤯
When I tried to set it up in SageMaker, I was going to test it with 128GB. Digits can't come fast enough for me! Have you tried Operator, by chance? I've heard mixed things about it but haven't had a chance to actually try it.
Qwen2.5 72B Instruct has been my daily driver ever since it launched. Bang for the buck in code quality is unmatched. My favorite part is when ChatGPT throws errors like "Too many concurrent connections" while I'm fully functional offline at home!
12:55 Yeah, I gotta have API access. I'm not a great programmer but I know how to loop things via the API.
Also, I do not use the newer 2.5 VL because I believe they had to prune the 2.5 Instruct model to make room for the vision stuff, which I do not care for.
I’m curious where you saw that? It makes me want to test the VL model side by side with original 2.5 now.
@@GosuCoder I read a lot about it all and the vision part is literally another smaller model merged with the main model. I don't fully understand it all but I believe that layers 1 to x are text generation and layers x to y are vision stuff. So they obviously had to prune stuff from the text model to fit the vision part. Also the logic power of the model has to be lower since 72B in the VL is the combined total. I'd bet 100% that the older 72B instruct has more logic power.
The VL is excellent for handwritten transcription and much better than Llama 3.2 Vision 90B. Just a tad below Google Pro Vision and Claude Sonnet. Handwritten transcription used to be very difficult before 2024.
That’s awesome! I need to test that more.
closed source and no api?
Open weights but having trouble deploying it myself.
So you can actually pay to use Qwen's API as well.
Really? I tried finding that.
I've been trying to connect to it from Cline but was unable to.