We’re entering a new era of automation, shifting from an imperative approach - where every step had to be explicitly coded - to a declarative one focused on defining goals and outcomes. AI can now produce structured outputs with high accuracy, allowing for more abstract instructions. This marks a fundamental change in automation, where AI agents, equipped with the right tools and capable of understanding, planning, and acting autonomously, can complete tasks with minimal human input.
Well said Sir 👏🏼
Polymorphic applications, as David Shapiro calls them.
In theory, that's what agents were supposed to do from the beginning, but they always ran into problems. We will probably run into other problems now.
Good bot
Well said!
What do you think of Anthropic's new prompt caching feature?
Thanks for continually maintaining your framework! Can't wait to try this new structured output
Install from git for now, I wanna add a few things before the next release
I can finally use AI inside code in a way where the output is actionable by traditional code. It can literally be a part of the program now and it's so awesome I almost shed a tear when I read this.
Introduction and Overview of Structured Outputs by OpenAI - 00:00:00
The State of AI Adoption in 2024 - 00:00:37
Importance of Function Calling for AI Agents - 00:01:16
Challenges of Traditional Automation vs. AI Agents - 00:02:31
Understanding JSON Schemas in AI Agents - 00:03:05
Introduction to Structured Outputs by OpenAI - 00:04:10
How Structured Outputs Ensure 100% Accuracy - 00:05:20
Usage and Limitations of Structured Outputs - 00:07:31
Demonstration: Using Structured Outputs with Function Calling - 00:08:03
Demonstration: Using Structured Outputs with Response Format - 00:09:09
Comparison with Instructor Library - 00:10:46
Building AI Agents Using Structured Outputs in the Agency Framework - 00:11:21
Conclusion and Final Thoughts - 00:13:21
Don’t forget that this is structured output. You can use the structured output in a function however that function needs it, but don’t lose focus on the structured output itself. For example, you can create a real-time user interface that adjusts or is modified based on user input: basically a single-page website that changes based on the LLM’s response. It actually is pretty cool.
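A rough sketch of how that could work, assuming the OpenAI Python SDK's `parse` helper; the `UISpec` schema and field names are made up for illustration, not from the video:

```python
# Sketch: let the model return a UI description that a single-page app re-renders.
# Schema and field names are illustrative assumptions.
from pydantic import BaseModel
from openai import OpenAI

class UIComponent(BaseModel):
    type: str        # e.g. "heading", "button", "text_input"
    label: str

class UISpec(BaseModel):
    title: str
    components: list[UIComponent]

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Build me a signup form"}],
    response_format=UISpec,   # structured output parsed straight into the model
)
ui = completion.choices[0].message.parsed
# The front end can now re-render from `ui.components` on every response.
```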
One step closer to personalised digital products for everyone.
That’s a really cool idea! Can you try this and share on our discord?
That was the missing piece. Now you can get reliable outputs for simple UI components and SQL queries. But it’s slower than hardcoded components.
@@yurijmikhassiak7342 Not for long.
Nice video graphics man! 🙂🙌🏻
Is this only through their official API, or can it be used with LangChain? So far I've been using the output parsers with Pydantic in LangChain with great results.
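For reference, LangChain chat models also expose a `with_structured_output` helper that sits on top of the provider's structured-output / function-calling support; a minimal sketch, assuming the `langchain-openai` package (schema and prompt are illustrative):

```python
# Sketch: the LangChain route to the same idea.
# Assumes langchain-openai is installed; schema and prompt are illustrative.
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class Person(BaseModel):
    name: str
    age: int

llm = ChatOpenAI(model="gpt-4o-2024-08-06")
structured_llm = llm.with_structured_output(Person)  # returns parsed Person objects
result = structured_llm.invoke("Extract: Maria is 34.")
print(result.name, result.age)
```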
What if the prompt is missing information required by the schema, like location, when the prompt is just "What's the weather?" Does it ask for the location to fulfill the schema?
That’s a really good question, I need to check
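One common way to handle this, as a hedged sketch: since strict structured outputs generally require every field, a missing value can be modeled as a nullable field, so the model can signal that the location wasn't given and the app can ask a follow-up. The schema below is illustrative:

```python
# Sketch: make `location` nullable so the model can return null when the user
# didn't provide one, and the calling code asks the follow-up question itself.
from typing import Optional
from pydantic import BaseModel
from openai import OpenAI

class WeatherQuery(BaseModel):
    location: Optional[str]   # null when the prompt doesn't mention a place

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "What's the weather?"}],
    response_format=WeatherQuery,
)
query = completion.choices[0].message.parsed
if query.location is None:
    print("Which city should I check?")   # app-level follow-up, not the model's
```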
I found this in the midst of implementing the structure in one of my workflows. Thank you!
Excellent, thank you. Learning so much on this channel.
Hi bro, I was just thinking about you and then you posted a video lol
I'm wondering if you can do a tutorial on using Vapi with Agency Swarm to completely control a business, including sending emails and follow-up calls, and also scheduling employees for specific jobs at a location and then managing them.
I'm trying to set this up for an HVAC business. I want it to take a call, find out what's needed, send an invoice, follow up with the customer, and then, if they go ahead with the job, coordinate with the employees to go to the job at a scheduled time and day and manage everything.
Do you think that's possible?
Yes, I think that's possible if you deploy your agents as an API. Then you can connect them to Vapi.
Hey @vrsen, do you know if Structured Outputs is available in the Azure OpenAI SDK yet?
I'm not understanding why I should use structured outputs. In the app I use, I have the JSON schema defined in the prompt, and I have not seen the output come back in the wrong structure in 1000+ tests. What am I missing?
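The difference, roughly: with the schema in the prompt the structure is only probabilistic, while strict structured outputs constrain generation so the JSON is guaranteed to match the schema (the content can still be wrong). A minimal sketch of the strict `response_format` shape, with an illustrative schema:

```python
# Sketch: the same schema passed as a strict response_format instead of prompt text.
# With "strict": True the API constrains decoding to this schema, so the shape
# can't drift the way a prompt-only schema can.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Extract: John is 29 and lives in Oslo."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "city": {"type": "string"},
                },
                "required": ["name", "age", "city"],
                "additionalProperties": False,
            },
        },
    },
)
print(response.choices[0].message.content)  # JSON string matching the schema
```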
Is it possible to maintain the context with Assistants and still use structured outputs via the API?
How about the data being complete? Many times it just outputs “samples”, so it’s structured JSON but incomplete. Is there a solution for that?
Yeah, try setting default values
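A quick sketch of that idea using Pydantic defaults as a local fallback when you validate the response yourself (field names are illustrative; note that strict structured outputs generally require all fields, so defaults mainly help in post-validation):

```python
# Sketch: Pydantic defaults so partially-filled output still validates
# into a complete object. Field names are illustrative.
from pydantic import BaseModel

class Report(BaseModel):
    title: str
    summary: str = "No summary provided."
    tags: list[str] = []

# If the model omits `summary` or `tags`, validation fills in the defaults
# instead of failing, so downstream code always sees a complete Report.
partial = {"title": "Q3 results"}
report = Report.model_validate(partial)
print(report.summary)  # -> "No summary provided."
```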
This is very good functionality. I have run into this myself trying to get structured output in production.
Another banger from the man himself
Thanks, will do more of this format
When are you doing a tutorial with your software working on open-source LLMs? What if we don't want to use OpenAI?
It was already posted recently on David Ondrej’s channel: Agency Swarm with Llama 3.1.
Really need a beginner to advanced course. Zero to hero
Got it
Man you are legendary
Thanks for another great video. Just my opinion on something: the problem isn't hallucination, it's getting AI to work with tools EXACTLY like humans do.
That’s what figure is doing. Tools that control a robot body 🤖
@@vrsen Virtual or physical, same point I'm making. We have LLMs speaking our language, and instead of focusing on ONE thing only (accessing and doing everything just like we do, the way we access data and so on), we are still trying to speak the computer's language, building APIs and tools into the AI platforms for it, instead of focusing on that ONE thing: accessing everything online and on the PC like a human. AI is so fast that it doesn't matter if it accesses everything our way. And it's better to do only that, since billions of sites, apps, and platforms on our machines and the internet are already made for us. Just my view...
This is an impressive achievement!
Yes, definitely deserves more attention
I am a big advocate for structured output: it lets content generation be more structured and functionally useful.
Yeah, helps you save on tokens if used right as well
Please make a video talking about the "Agency Chart".
Great video, thanks!
As a first-year CS student, this scares me. What should I do for my future?
This is just a wrapper around the Instructor library, but worse, because you can't write validation code.
No, it’s different. Watch till the end
This was possible before this, though. They just made it easier for developers, which is great.
But then you're stuck with the OpenAI base model... No privacy...
Building your own way for 100% JSON output is the way to go, and it's relatively easy.
And if you have a large nested schema, why not just break it down? It's golden.
I've done millions of calls, with a 100% success rate on JSON output structure.
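A rough sketch of that do-it-yourself approach, as a validate-and-retry loop around any chat model; `call_llm`, the `Invoice` schema, and the retry count are placeholders, not the commenter's actual setup:

```python
# Sketch: DIY "always valid JSON" by validating locally and retrying with the
# validation error fed back to the model. Works with any provider, including
# local models; call_llm is a placeholder for whatever client you use.
import json
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):
    customer: str
    total: float

def call_llm(prompt: str) -> str:
    """Placeholder for your own model call (OpenAI, local Llama, etc.)."""
    raise NotImplementedError

def get_structured(prompt: str, retries: int = 3) -> Invoice:
    feedback = ""
    for _ in range(retries):
        raw = call_llm(prompt + feedback)
        try:
            return Invoice.model_validate(json.loads(raw))
        except (json.JSONDecodeError, ValidationError) as e:
            feedback = f"\nYour last output was invalid ({e}). Return only JSON matching the schema."
    raise RuntimeError("Model never produced valid JSON")
```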
Their blog post has a great example of a tool for creating HTML files, where a nested schema makes sense due to the hierarchy of elements. Attempting to generate content 10 levels deep will certainly lead to hallucinations without this feature. I'll release a video next, where I build a similar React developer agent with a nested tool for creating component trees.
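For anyone curious what such a nested schema might look like, here is a rough sketch of a recursive Pydantic model for a component tree; the model and field names are illustrative, and strict mode adds its own constraints (e.g. every field required):

```python
# Sketch: a self-referencing model for arbitrarily deep component trees,
# the kind of nested schema that benefits from structured outputs.
from pydantic import BaseModel

class Component(BaseModel):
    tag: str                        # e.g. "div", "Header", "Button"
    text: str                       # inner text, empty string if none
    children: list["Component"]     # self-reference gives the tree its depth

class ComponentTree(BaseModel):
    root: Component

# What a parsed response could deserialize into:
tree = ComponentTree(
    root=Component(
        tag="App",
        text="",
        children=[Component(tag="Header", text="Welcome", children=[])],
    )
)
```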
💥 YOU NEED to check your subtitles, because your English is not native, and you say one thing while the subtitles are a little bit different. 🙏👍
Why does it seem to me that you already talk like an AI yourself? :)
Perhaps it seems like I talk like an AI because my speech is often structured, logical, and lacks the emotional coloring we're used to in conversations with people. That's because I try to provide accurate, useful information and avoid ambiguity, which can create an impression of a certain "unnaturalness" in the conversation.
Besides, I'm trained to analyze and generate text based on a huge amount of data, which makes my speech more formal and direct. If my speech seems too machine-like or formal, I can adapt my communication style to make it more natural and lively, if that would be more comfortable for you.
@@vrsen Haha, even the reply sounds like an AI :)
No, I'm fine with it, nothing needs to change if that's your style and it's the most comfortable way for you to get your ideas across to the audience.
By the way, a week ago I didn't understand this topic at all. But I watched your videos and some other people's, and over the weekend I already built a PoC of a technical support bot at my company, with RAG, qdrant, and Claude agents. I wrote it in Ruby, but I'm thinking of making a full project in Elixir/Phoenix.
@@vrsen LOL :)
@vrsen Anyone else who works with AI 12 hours a day starting to think like generative AI? I find myself thinking like I'm generating responses to my own prompts :)
@@AI_Escaped haha, so true.
They are 100% correct in output form. Not 100% in output content.
Literally useless feature if they can't ensure privacy. All businesses need privacy built in.
That’s true. Hopefully it’s coming to Azure soon.
OpenAI has an Enterprise version for this; you would need to contact their sales department for details and access.
@@Aryankingz In that case it's better to use Llama 3.1.