Can't wait for the competition and price drops coming in the upcoming weeks. What a great time to be alive, I LOVE it.
I like the AI Coding Meta part. Recently I tried to build out an app with quite a lot of files in the frontend and I ran out of tokens. So I had an instance of Claude that had extensive knowledge of my app and its functionality create a series of prompts, each focused on a different area of the app, while keeping the app's context and architecture intact throughout. It came out to about 60 prompts, but it saved me so much time and it was surprisingly accurate.
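A minimal sketch of that kind of prompt-series generation, under invented assumptions: the area names, shared context string, and prompt wording below are all hypothetical stand-ins, and in practice you'd have Claude itself write the per-area prompts rather than a template.

```python
# Sketch: split a large app into focused prompts so each request fits in the
# context window, while every prompt carries the same shared architecture
# context to keep coherence across the series.
SHARED_CONTEXT = (
    "App: a React frontend with a FastAPI backend. "
    "Architecture notes and conventions go here."
)

AREAS = ["auth pages", "dashboard widgets", "settings forms", "API client layer"]

def build_prompt_series(areas, shared_context):
    """Create one focused prompt per app area, each prefixed with the
    shared context so architecture stays intact across prompts."""
    prompts = []
    for i, area in enumerate(areas, start=1):
        prompts.append(
            f"[{i}/{len(areas)}] Context:\n{shared_context}\n\n"
            f"Task: work only on the {area}. "
            "Do not change files outside this area."
        )
    return prompts

prompts = build_prompt_series(AREAS, SHARED_CONTEXT)
print(len(prompts))  # one prompt per area
```

Each prompt is then fed to a fresh model instance in sequence, which is what keeps any single request well under the token limit.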
Mimicked thinking depth and time with Llama 3.1 using Groq; hella fast, hella smart! Love that you put "WE" were right — we all work as a hive mind, finding what works and what doesn't from each other, even following leads from the closed companies.
Love this space, love this time we are in.
Thanks for another great video to watch while working.
Have you used claude 3.5 sonnet? Do you find llama 3.1 better? Is your use case coding?
@@deltagamma1442 I've used Llama 3.1 mainly, as it's free for my research; my preference is definitely Claude 3.5 Sonnet.
Use cases vary, as I have ADHD and love coding new projects.
I've done most of the automation possible online with LLM agents or NN/RL/meta agents.
Hello Dan. I want to say that the coherence, elegance, and clarity with which you present, articulate, and code are profound and unique. We all want to see you succeed beyond your wildest dreams. Amazing content, pioneer🎉
Not going to lie, as a sophomore Computer Science student, this video kind of opened my eyes to the possibilities of LLMs
Job = gone, give it 2 years
@@davidjohnson4063 I think that the "internet of things" will evolve into the "AI of things" until AGI appears. In the meantime, most computer science jobs are not replaceable (except management). Regardless, Chain prompting is revolutionizing LLM use - although I still believe there is a ceiling for LLM applications
@@riley539 lol but junior jobs are replaceable aka yours (in the future)
Yeah, you're cooked; switch to data science and build the AIs
Great video, another example of why you're my new favorite AI dev channel! Thanks!
Hey Dan. I did not see the XML formatted prompt examples in the libraries you listed. Can you possibly guide us to where to find them? Thanks!
Are you a tier-5 OpenAI user? How are you getting API access to these models?
Assuming this is the case, what are the per-1M-token API costs for o1-preview and o1-mini?
OpenRouter provides access, and o1 is very expensive
The new models are actually available through the OpenRouter API.
OpenRouter offers the models at a 5% upcharge:
$3 / $12 for mini
$15 / $60 for preview
Guessing o1 will be $75 / $300 (allegedly will be released EoM)
@@andydataguy Crazy prices! I thought SOTA LLMs were supposed to move towards instant inference, unlimited context windows, and ever-decreasing costs, according to a top-level guy at Anthropic at the AI Engineer World's Fair just a month ago: th-cam.com/video/EuC1GWhQdKE/w-d-xo.html
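Taking the per-1M-token rates quoted above at face value (these are the commenter's figures plus a guess for o1, not official pricing), a rough per-call cost estimate looks like:

```python
# Rough per-call cost from $/1M-token rates quoted in the thread above
# (unofficial figures; the (input, output) rates are per million tokens).
RATES = {
    "o1-mini": (3.00, 12.00),
    "o1-preview": (15.00, 60.00),
}

def call_cost(model, input_tokens, output_tokens):
    """Dollar cost of one API call at the quoted rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. a 20k-token prompt with a 5k-token (reasoning-heavy) response:
print(round(call_cost("o1-preview", 20_000, 5_000), 2))  # 0.6
```

Note that with reasoning models the hidden reasoning tokens are billed as output, so real output counts run higher than the visible response.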
5:45 Suggestion - Have o1-preview create ##Chapters
### Section 1 (00:00-08:44)
#### 00:01
#### 01:35
#### 03:45
#### 05:18
### Section 2 (08:45-12:59)
#### 08:45
Then list the keywords for the sections and allow you to select which keywords to keep/prioritize (GUI with +/-), showing the number of times each keyword appears in a section and the TOTAL number of ####. So if there are 5 ####, it suggests 3-4 ####, or 3 #### headings, and you have it reconfigure just Section 2; perhaps not having AIDER on all 4 of the ####, maybe 3 times maximum.
My thought process here was your small six-word prompt expanding into a full prompt and then an image. This is tweaking the output via basic, efficient HITL review to nudge/guide an iteration by o1-preview, taking its better-than-Sonnet output and perfecting it. Ok, back to the video!
No BS! Raw quality, content dense material. 👍
Just for kicks, here is a test chapter I created with a custom GPT I'm working on.
00:00 Introduction: Why Prompt Chaining is Key
01:05 Understanding the o1 Series Model Update
01:57 YouTube Chapter Generation: o1 vs. Claude 3.5
03:06 Using Simon W's CLI LLM Tool for Chapter Generation
04:29 Comparing Results: o1-preview vs. Claude 3.5
05:58 The Advantage of o1's Instruction Following
07:55 AI Coding Review: o1's Superior Performance
10:24 Simon W's files-to-prompt Library for Code Review
12:01 Running o1-preview for AI Coding Solutions
14:54 Key Learnings: Instruction Following in the o1 Models
16:38 Sentiment Analysis: Testing on Hacker News
19:16 Iterating with Large Token Prompts
21:37 Final Results: Detailed Sentiment Analysis with o1
27:52 What's Next: The Future with Reasoning Models
another amazing video Dan! keep up the great work👍
Amazing video. Lots of great nuggets of info
Great video! Thanks for all the value given!
You build your prompts quite wisely - that's what most people don't do, especially while benchmarking. They miss the whole potential of LLMs, yet still draw their conclusions 🤦♂️
The ability to clean up JSON still remains valuable, as the tokens wasted on useless data here must have cost a lot :P
That's a beautiful thumbnail! How did you prompt that?
Please test o1-mini for content generation as well as coding
14:02 hahaha i liked and subbed
Can you put the resources you refer to in all your videos somewhere? Or just in the description of the video?
The base model should know when it needs to infer or not, and thus tell us if it must infer to reach a better result, and ask us if we are willing to use the extra token cost for it. We want convenience, agency, and the system must be capable and able to do actual work. Verb. Action, doing, producing. The less we must tinker with prompts and models ourselves, the better for the general end user.
User must be synonymous with agent, and thus, users can be ai agents, doing real work, and vice versa.
You are describing precognition
@@internetperson2 A mini model could recognize whether a prompt seems complex enough that an inference model would handle it better. A larger search model should also realize when there is no obvious result matching the specific problem, and recommend an inference model.
@@aresaurelian This is wishful thinking imo; you can't trust a mini model's gut when assessing the level of compute required to arrive at a satisfactory result for a given problem. I'm not saying such a tool is infeasible, but I am of the mind it would suck.
@@internetperson2 It could be optional. When the customer/user/agent is displeased, the model would learn to behave in a manner that suits them.
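The routing idea debated above can be sketched as a toy heuristic. Purely illustrative: the model names, keyword list, and length threshold are all made up, and a real router would likely be a trained classifier rather than string matching.

```python
# Toy router: a cheap heuristic guesses whether a prompt needs an
# expensive reasoning (inference-time compute) model. Keywords and
# thresholds are invented for illustration only.
REASONING_HINTS = ("prove", "step by step", "debug", "optimize", "plan")

def route(prompt: str) -> str:
    """Return which model tier to send the prompt to."""
    looks_hard = len(prompt.split()) > 200 or any(
        hint in prompt.lower() for hint in REASONING_HINTS
    )
    return "reasoning-model" if looks_hard else "mini-model"

print(route("What's the capital of France?"))              # mini-model
print(route("Prove this invariant holds, step by step."))  # reasoning-model
```

The objection above is precisely that a heuristic (or small model) this cheap will often misjudge how much compute a problem really needs.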
It's not so much prompt chaining; the Q*-type RL stuff is key: tuning the model with the right optimized reasoning routes. Prompting is legit, and chaining it certainly works, but in no way is this only prompt chaining. They're even claiming it's one single model (which shocked me too).
Great video, can you share the xml prompts you used in this video?
Which plugin calculates token amount on the bottom right?
Can you share the prompt files used in the video?
Are you tier 5 for API access or is there a workaround?
great stuff! learned a lot from this video.
I don't think code review is possible for a larger code base, where you need to add 20 files and a 2k diff to analyze; that requires some vector DB and running ChatGPT against it somehow
So we are finally going to have software engineers --write-- down their requirements and use cases... because you can feed them to AI agents to implement, test, and review
Finally.
Thanks for sharing!
That's a nice demo, but who's going to wait minutes to get sentiment analysis for a couple of comments? Way too slow.
What do you do when you think? Do you instantly figure things out, or do you ponder and think it through?
Best way? Always do a pre-test run, a dummy run, like a gamer in WoW hitting a training dummy for DPS evaluation. And tell it when the pre-test is over and the real test starts.
You are my favourite 🔥🔥
Same here, let's get a community going indydevdan!
@@JoshDingus For sure! We stan IndyDevDan in my Discord community too
It's better that they just call GPT-4.5 "o1"
Great content as usual 👏
Technically Tree-of-Thought, not Chain-of-Thought
LFG 🔥!!
👑👑👑
This constant noise of you typing on the keyboard is distracting and annoying.
Deal with it
"if you're subscribed to the channel you know what we're about." Yeah, but I'm not, so I don't; so, like, maybe make an introduction about what you're about? You have 19k subs (rounding up) over 2 years; clearly the content isn't selling itself.
It's a pretty good bleeding-edge meta AI channel focused on extracting the most value out of the best tools, depending on your use case