What are your device's specs?
2023 M2 MacBook Pro with 64 GB RAM and 12 cores. Shows up at 3:39 in the video.
@@patrickbrady535 Wow, a MacBook Pro can run Ollama QwQ 32B?
@@blackwhite-3607 Thanks to Apple Silicon's unified memory design, the ML Compute framework is able to use system RAM as VRAM.
Imagine if your video card had 64 GB of RAM! Now you might understand how a 20 GB model can run easily on a MacBook Pro.
Bro, you are down the rabbit hole on this stuff; it's so impressive. Some of the best quality AI content on YouTube!
Thanks for the kind words. The rabbit hole is so deep. Meta prompting + o1 + OpenAI 12-day launch content in the works. The things we can do with this tech are mind-boggling.
@@indydevdan I am not a coder at all and am running Ollama/OpenWebUI via WSL2. However, I quickly understood that I needed to come up with a better system for prompting, and I just found your channel. Got a lot to learn!
Thanks to your suggestions in this video, I asked Grok to generate an XML format for specific instructions to edit a particular piece of text and it actually worked on the first try!
Normally, I have to ask the AI 2-3 times before it understands the directions.
Let's just call it Qwik; ironic, but easy to say.
One might say that it's quick to say.
John Qwik
Amazing video. I love how you are pushing the tech to do as much as it can. I'll definitely try this out myself.
Super insightful prompt chaining examples! Thank you, IndyDevDan. 5:42
Thank you very much for this prompt guide!
Wow dude, this is amazing! liked and subscribed.
Thank you for reviewing!
Love the reasoner-extractor pattern. Prompt chaining seems very useful, especially for agents with tool use - you could have the reasoner decide what step to take next, then have an extractor, then a verifier with inspection tools that goes back to the reasoner with new information in case something went wrong with the reasoning…
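For reference, a minimal sketch of that reasoner-extractor chain using the ollama Python client; the model names and prompt wording are assumptions, not what the video uses:

```python
# Minimal reasoner -> extractor prompt chain (sketch).
# Assumes the `ollama` Python package and locally pulled models;
# model names and prompts are illustrative.
import ollama

def reason(task: str) -> str:
    """First pass: let the reasoning model think out loud."""
    response = ollama.chat(
        model="qwq",  # reasoning model; outputs its full thought process
        messages=[{"role": "user", "content": task}],
    )
    return response["message"]["content"]

def extract(reasoning: str) -> str:
    """Second pass: a smaller model distills the final answer."""
    response = ollama.chat(
        model="llama3.2",  # hypothetical extractor model
        messages=[{
            "role": "user",
            "content": (
                "Extract only the final answer from the reasoning below. "
                "Respond with the answer alone, no commentary.\n\n"
                f"{reasoning}"
            ),
        }],
    )
    return response["message"]["content"]

if __name__ == "__main__":
    thoughts = reason("What is the sum of the first 100 positive integers?")
    print(extract(thoughts))
```

A verifier would slot in as a third step in the same loop, feeding its findings back into the next reason() call.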
I would really like to preserve all of that thinking process, merge it all into one giant file, and turn it into a knowledge graph.
I like to talk about philosophy and futurism, so the details are often very important.
I don't care how long it takes. If the AI needs to get back to me tomorrow, that's what I'm used to anyway with people.
Sounds like you should look at graph RAG.
@johang1293 The database part is still Greek to me. Most of the tutorials are like "draw a circle, now draw the rest of the owl". 🤷‍♂️
This is cool, but you can also just use structured outputs in Ollama and force the output to put the chain of thought in one key and the final result in another. Then you don't need the second LLM pass at all.
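For anyone who wants to try it, a rough sketch of that single-pass approach; it assumes a recent Ollama version that accepts a JSON schema in the format parameter, and the key names are my own:

```python
# Single-pass alternative (sketch): force the chain of thought and the
# final answer into separate JSON keys via Ollama's structured outputs.
# Requires a recent Ollama version that accepts a JSON schema in
# `format`; the schema and key names here are illustrative.
import json
import ollama

schema = {
    "type": "object",
    "properties": {
        "chain_of_thought": {"type": "string"},
        "final_answer": {"type": "string"},
    },
    "required": ["chain_of_thought", "final_answer"],
}

response = ollama.chat(
    model="qwq",
    messages=[{"role": "user", "content": "What is 17 * 24? Think step by step."}],
    format=schema,  # constrains the output to the schema
)

result = json.loads(response["message"]["content"])
print(result["final_answer"])  # answer only, no second LLM pass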
Dude! I don't know how I haven't watched one of your videos yet. Assuming that all your videos are like this... I have found my new favorite channel... I sometimes feel like I'm the only one I know who is really nerding out on prompt engineering in a more complicated (and better-performing) manner... To be honest, I feel like I am the only one I know who is really into using generative AI. In any event, you've got a new regular sub here on YT.
PS- Ironically, I already follow you on GitHub somehow. I don’t recall checking out your repos, but I look forward to following your work man. Cheers.
Yep, he's fucking spot on, always. I thought I was the only one on my IT team on the cutting edge (I am), but at least we've got IndyDevDan.
There are millions, dude... Most people don't talk about it. It almost feels like having an intern as an assistant; can't wait until PhD level. Never making a template from scratch again lol
I don't always get everything from your vids as I'm a fkin noob with limited follow-through, but fk me, I love your vids and I'm getting so much value out of them. Thank you for putting it out there.
You can achieve almost the same output quality just by using a follow-up prompt like: "Use the previous answer to formulate the final answer in JSON format." ))
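Concretely, something like this sketch, which keeps everything in one conversation instead of a second model; the model name and wording are illustrative:

```python
# Follow-up-prompt approach (sketch): reuse the same conversation and
# ask the model to reformat its own previous answer as JSON.
# Assumes the `ollama` Python package; model name and prompts are illustrative.
import ollama

messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
first = ollama.chat(model="qwq", messages=messages)
messages.append(first["message"])  # keep the full reasoning in context
messages.append({
    "role": "user",
    "content": 'Use the previous answer to formulate the final answer in JSON format: {"answer": ...}',
})
second = ollama.chat(model="qwq", messages=messages)
print(second["message"]["content"])
```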
I mean, it is slow, but in the time it took me to watch this video it wrote a snake game on my RTX 3060. Technically I only asked it for an outline, but it decided phase three of outlining how to code a snake game was showing me the code, so I guess it failed successfully. Considering I just gave it some generic, poorly worded instructions in a state of sleep deprivation, I'm pretty impressed and excited to see what else it's capable of.
This is a very nice approach. Thank you for sharing. I'm wondering if QwQ will output the final answer in a specified tag, similar to how some other models do. That could help with the extraction for sure.
Yeah, then after you can extract it with a simple regex 😀
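Something like this, as a sketch; the tag name is hypothetical since QwQ doesn't guarantee one, so you'd have to prompt the model to emit it:

```python
# Regex extraction of a tagged final answer (sketch).
# The <final_answer> tag is hypothetical; QwQ does not guarantee any
# particular tag, so the prompt would need to request it explicitly.
import re

def extract_final_answer(output: str) -> str | None:
    match = re.search(r"<final_answer>(.*?)</final_answer>", output, re.DOTALL)
    return match.group(1).strip() if match else None

sample = "Let me think about this step by step...\n<final_answer>42</final_answer>"
print(extract_final_answer(sample))  # -> 42
```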
After I watched your previous video about prompt levels, I wondered how I should implement dynamic variables. Is the way you do it in your 5:47 example the best way, or are there other, better ways of utilizing dynamic variables?
Thanks for sharing this! 🤩
Is that an eGPU for Mac on your desktop? Would love to hear a bit about your setup…
And here I had hoped we mere mortals could get something useful out of this. Not for those who don't program, I guess ;o)
Very interesting. Is this prompt-chaining script possible in an IDE like Windsurf, with each prompt given to an agent like Sonnet 3.5? Thanks for your work.
Crushing it! Can I call you Mr. Hands?
There can be only one!
I disagree, he's bigger than that, mate. His hands are wizard hands. Look at how they handle the keyboard, invoking strokes. Sounds like a keyboard wizard running Vim shortcuts on his OS!
Curious to know if you've done any work with DSPy, Dan? We've just started piloting it in my organisation, so we'll be generating preliminary results soon, but it's an interesting concept. Would be cool to hear your thoughts.
That it speaks Chinese would indicate that it is a well-trained model. Mandarin Chinese has the most native speakers in the world; English only becomes the largest if you count those who speak it as a second language. As a Swede, I often encounter U.S. bias in AI responses, where it uses feet and inches even though the metric system is the most widely used. I have to use a system prompt to make it use metric, but it often leaves the conversion in the answer, which I have to remove later.
Europeans have a hard time creating innovative projects of their own due to draconian censorship laws in the EU. That's the problem!
Would you say prompt chaining like this is as efficient as a framework like LangGraph in a production context?
Using this to run a local aider --architect Qwen + QwQ stack :D
Are you using this stack, or is it just an idea? If you are using it, I'm really curious about the performance! I want to switch from Sonnet 3.5 to something local to reduce my climate impact.
@@MaelSimonApprenTyr Currently running QwQ > output to file > input to aider. As Dan mentions, the pitfall of this reasoning model is that it outputs its whole thought process, so it would take super long for architect mode to run efficiently. Extracting the specific steps and details with prompt chaining is best here, but it still takes quite a bit longer than using something like OpenAI's o1.
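Roughly, the pipeline looks like this sketch; it assumes the ollama Python package and aider's --message-file flag, and the paths and prompts are illustrative:

```python
# QwQ -> file -> aider pipeline (sketch). Assumes the `ollama` Python
# package, a pulled qwq model, and aider's --message-file flag;
# the file path and task prompt are illustrative.
import subprocess
import ollama

task = "Plan the refactor of utils.py into smaller modules."

# 1. Let the reasoning model produce a plan (full thought process included).
plan = ollama.chat(
    model="qwq",
    messages=[{"role": "user", "content": task}],
)["message"]["content"]

# 2. Write the plan to a file.
with open("plan.md", "w") as f:
    f.write(plan)

# 3. Hand the plan to aider to apply against the repo.
subprocess.run(["aider", "--message-file", "plan.md"], check=True)
```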
@@magnusquest I'll have to check this in detail, thanks man! 😊
MCP x Ada/Agent OS. Haven't seen any channel covering the memory server; also check the awesome MCP servers repo. Can't wait.
100% feels big - I'm still digesting MCP. Massive OAI releases incoming.
TL;DR - I am not convinced that Qwen is all that great. Admittedly, I haven't put the newest one through its paces yet due to time constraints, but I intend to do my due diligence when time permits. I will elaborate below on why Qwen, and for that matter a good number of recent models and papers, all seem to be at best not really moving the industry forward. At worst, they seem like an intentional grift or, at a minimum, plagiarism, due to their inability to correctly answer questions about characters lacking certain identifying aspects.

I'm sorry if that sounds too general, but in an effort to keep this short I will simply say: if you use o1 and o1-mini; Claude, Haiku, and Opus; Llama 70B and 405B and many of their variants; Gemini Pro; and to some degree Mistral (though I find that Mistral has fewer, more focused strengths), what you find is that they, at least in English, are already able to generalize and abstract in mind-blowing ways, particularly OpenAI's and Anthropic's models. Despite their flaws, Gemini, Llama, Mistral, and Grok all have an amazing ability to infer from a query what the next words should factually be, given an effective prompt and, often, even with ineffective prompts.

However, Qwen out of the box has not, as of yet, shown me anything but party tricks. Previous versions of Qwen have performed absolutely horribly when I have tested their ability to make certain connections between ideas that every other model, even the 6B-8B models, tends to make without missing a beat. I have several theories as to why, and as a scientist, enthusiast, and member of the human race, it bums me out. I will leave those theories to your imagination.

With all that said, I look forward to putting this new Qwen through its paces and seeing if it finds a place in my stack. Believe it or not, this was by far the TL;DR, because I could go on for some time about this topic. Anyway... great video. I look forward to following your content moving forward. Cheers.
Wait... how do you fit the context length for extraction? I mean, it's so long... and your RAM is only 64 GB while you're using a 32B model. Hmm. I really wonder how it can fit and work so well.
Great 👍👍🥰🥰 Thanks.
Nice. What IDE is that?
Is the model fully open, or is it through an API? Does any info, data, or metadata go back to anyone's servers, or is it 100% local?
Quick question: are you adding the subtitles automatically? If so, what tool are you using?
What are your MacBook's specs?
Can you do a multi-run test to see how many of those chained outputs fail? Are we talking 70% or 99%?
This looks amazing!
But I don't think my Intel Core i7, 16 GB DDR4 RAM, RTX 3050 4 GB (Acer Nitro 5) laptop will be able to handle it.
Nice video.
It is also NOT as smart as o1-preview. o1-preview was the first model ever that was able to solve my puzzle, and QwQ made stupid logic errors.
Totally. It's nowhere near o1-preview or o1 (just released). For local reasoning, though, it's a massive step forward.
What are the specs of your MacBook Pro?
I'm impressed by how quickly you can generate tokens with QwQ.
I have an M4 Pro MBP, but with only 24 GB RAM, so I can't run the model locally.
@@daburritoda2255 At 3:36 he has a description on the screen about the specs of his MacBook. He has an M2 Max with 64 GB RAM.
Call it Quick
Hard to use??? I'm thinking nahhh; I'd immediately disregard it, as there are other models that are easier to use.
DeepSeek R1 is better than QwQ; sadly, they haven't released the model and API yet.
No Patreon? Come on, man. Your content is way too important for you not to be getting memberships.