GPT-4o mini - Can it be used for Agentic RAG?

  • Published on 8 Sep 2024

Comments • 24

  • @unclecode
    @unclecode months ago +3

    Fascinating topic! I have a few questions: 1) How do we define "good enough"? What if the output from GPT-4o-mini meets user needs while Claude's model exceeds them? This is rarely discussed. 2) We've seen how prompt tuning can significantly affect outcomes. We should explore the LlamaIndex prompt template; perhaps some examples can help a model generate just what we need.
    To me, when evaluating a model for task A, the priority is how effectively I can get it to produce the desired responses. This approach allows me to assess multiple models and maintain flexibility in decision-making. Sometimes, I start with a smaller model and gradually increase its complexity until I achieve the desired results. Other times, I adopt a reductive method, beginning with a larger model, refining my understanding of task specifics, and then switching to a smaller one until further reduction harms quality.
    This is truly a captivating topic; I always feel energized when working with models in this way.
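
A minimal sketch of the "start small, then scale up" approach described in the comment above. Everything here is an assumption for illustration: the `Model` callable, the `escalate` helper, and the `good_enough` check are hypothetical names, and the stand-in models return canned text so the sketch runs offline. In practice each callable would wrap a real API client, and "good enough" would be whatever check fits the task.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: a model takes a prompt and returns text.
Model = Callable[[str], str]


def escalate(
    task: str,
    ladder: List[Tuple[str, Model]],
    good_enough: Callable[[str], bool],
) -> Tuple[str, str]:
    """Try models from cheapest to most capable; stop at the first acceptable answer."""
    if not ladder:
        raise ValueError("ladder must contain at least one model")
    name, answer = "", ""
    for name, model in ladder:
        answer = model(task)
        if good_enough(answer):
            break
    # If no model passed the check, this is the last (largest) model's answer.
    return name, answer


# Stand-in models (assumptions, not real API calls).
def small_model(prompt: str) -> str:
    return "short"


def large_model(prompt: str) -> str:
    return "a much longer and more detailed answer"


chosen, text = escalate(
    "Summarize the retrieved passages.",
    [("small", small_model), ("large", large_model)],
    good_enough=lambda answer: len(answer) > 20,
)
```

The reductive method from the same comment is the mirror image: reverse the ladder and keep stepping down until the check fails, then use the last model that passed.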

    • @engineerprompt
      @engineerprompt months ago

      I agree, in most cases the output is judged on the user's "vibe". I don't think we have a way to quantify whether the user is going to like the response, except in certain domains such as coding or math.
      My own experience is similar to yours. The smaller models seem to need more hand-holding, in the form of few-shot examples, than the bigger models. Really great insights from your learnings.

  • @DavidePasca
    @DavidePasca months ago +5

    One trick with less powerful models is to spend more time writing the instructions. If the model's output is limited, one can explicitly ask for something more elaborate. I noticed this in the past when I had GPT-4o as front end and an agent using 3.5 doing background research. It took some work to get 3.5 to a decent level, but it was worth it in terms of running costs.

    • @KS-tj6fc
      @KS-tj6fc months ago +2

      I find this too: breaking tasks down as small as possible, and in fact limiting tokens per task, works well:
      From this idea {topic}, create a title (50 tokens)
      From {title} (or output -1) write a hook (200 tokens)
      From {title} and {hook} write an introduction (500 tokens)
      And so on to build a blog post.
      This will create a good post.
      If you generate multiple versions, pass all of them to the larger model and have it review them, selecting and improving on the best one.
      Worked well with 3.5, I'm sure 4o-mini is better. With the cost reduction you could even iterate many times and still come out cheaper.
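
The step-by-step chain in the comment above could be sketched roughly as follows. This is an illustration under assumptions, not the commenter's actual code: the `LLM` callable, the `STEPS` table, and `run_chain` are hypothetical names, and `fake_llm` is a stand-in so the sketch runs without any API key. A real version would pass each budget to the provider's maximum-output-tokens setting.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: (prompt, max_tokens) -> completion text.
LLM = Callable[[str, int], str]

# (output name, prompt template, token budget), mirroring the steps in the comment.
STEPS: List[Tuple[str, str, int]] = [
    ("title", "From this idea: {topic}, create a title.", 50),
    ("hook", "From the title '{title}', write a hook.", 200),
    ("intro", "From the title '{title}' and the hook '{hook}', write an introduction.", 500),
]


def run_chain(llm: LLM, topic: str) -> Dict[str, str]:
    """Run each small step in order, feeding earlier outputs into later prompts."""
    context: Dict[str, str] = {"topic": topic}
    for name, template, budget in STEPS:
        prompt = template.format(**context)  # unused keys in context are ignored
        context[name] = llm(prompt, budget)
    return context


def fake_llm(prompt: str, max_tokens: int) -> str:
    # Stand-in model (assumption): echoes the budget and the start of the prompt.
    return f"<{max_tokens}-token answer to: {prompt[:30]}>"


post = run_chain(fake_llm, "agentic RAG")
```

Keeping the steps in a table makes it cheap to add further sections of the post, or to run the whole chain several times and hand the results to a larger model for review.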

    • @engineerprompt
      @engineerprompt months ago +1

      Yes. I agree, this has been my experience as well. For smaller models, you need to break down the task and provide more detailed instructions.

  • @ahmedgharbi1927
    @ahmedgharbi1927 months ago +1

    In my experience, gpt-4 (or gpt-4o) function calling is not great for agentic RAG. However, I've seen a huge improvement when I used them with a ReAct agent. I found ReAct helps them with reasoning, elaborating the thinking steps and generating much more accurate and detailed queries. I'm curious whether anyone has had a similar experience. Would love to see a comparison video with ReAct. Thank you for the video!

  • @j4cks0n94
    @j4cks0n94 months ago +2

    Nice vid as always. I tested gpt-4o-mini for my own use-case (which is using an agentic workflow), and I agree with you on the notion that it's not really good for agents. In my tests, it sometimes performs even worse than gpt-3.5-turbo-0125. I can't replace gpt-3.5-turbo with this one, let alone gpt-4o or other superior models. Underwhelming is all I can say at this point.

    • @engineerprompt
      @engineerprompt months ago

      Interesting. Yes, I think it's better to test these models on your own application rather than relying on benchmarks.

  • @KS-tj6fc
    @KS-tj6fc months ago +1

    As an agent, is GPT-4o mini fine if it is orchestrated by a superior model? I was wondering about executing an orchestrated prompt given to GPT-4o mini, Deepseek-V2 API and Gemini 1.5 Flash - each running through the steps and then allowing the superior model to review and decide which answer is best and/or pooling the best answers/points together and simply rephrasing that response. Start large/expensive - do many smaller tasks for each - recombine for large/expensive final output. Thoughts??

    • @engineerprompt
      @engineerprompt months ago +1

      Yes, I think that can work. I have created a video on a similar concept where Opus was used for orchestration and Haiku for the smaller tasks. So a similar approach can be taken here. Here is the video: th-cam.com/video/a5OW5UAyC3E/w-d-xo.html
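
The fan-out-and-review idea discussed in this thread could look something like the sketch below. It is only an illustration under assumptions: `fan_out_and_review`, the `Model` callable, and the stub workers and reviewer are hypothetical, and no real GPT-4o mini, DeepSeek, Gemini, Opus, or Haiku API is called.

```python
from typing import Callable, Dict

# Hypothetical signature: a model takes a prompt and returns text.
Model = Callable[[str], str]


def fan_out_and_review(task: str, workers: Dict[str, Model], reviewer: Model) -> str:
    """Send the same task to several cheap models, then let a stronger model judge."""
    candidates = {name: model(task) for name, model in workers.items()}
    listing = "\n".join(f"[{name}] {answer}" for name, answer in candidates.items())
    review_prompt = (
        f"Task: {task}\n"
        f"Candidate answers:\n{listing}\n"
        "Pick the best answer, or merge the strongest points, and rewrite it."
    )
    return reviewer(review_prompt)


# Stub workers and reviewer (assumptions) so the sketch runs offline.
workers: Dict[str, Model] = {
    "mini": lambda task: "short answer",
    "flash": lambda task: "a longer, more detailed answer",
}


def longest_candidate(prompt: str) -> str:
    # Toy reviewer: returns the longest candidate line instead of calling a model.
    lines = [line for line in prompt.splitlines() if line.startswith("[")]
    return max(lines, key=len)


best = fan_out_and_review("Explain agentic RAG.", workers, longest_candidate)
```

The expensive model is called once per task, on the pooled candidates, which is where the cost saving in this design would come from.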

  • @shafai100
    @shafai100 months ago +1

    Can you convert your videos to Hindi/Urdu for easier understanding? Also, please do live stream sessions. And it would be helpful to tie this in with remote job offerings or something like that.

    • @engineerprompt
      @engineerprompt months ago

      Not sure if Google can directly do that. Will look into some automated tools.

  • @KumR
    @KumR months ago +1

    Can you extend this to include a Streamlit UI too, please?

  • @TheReferrer72
    @TheReferrer72 months ago +1

    If AIs got good at agentic workflows, would it not be in the interest of the big tech companies to keep them in-house?

    • @johnclay7422
      @johnclay7422 months ago

      Good question!!!

    • @barackobama4552
      @barackobama4552 months ago

      Maybe they are doing that already? That's literally what they are "selling".

  • @mohsenghafari7652
    @mohsenghafari7652 months ago

    Thanks dear

  • @masterapofis4997
    @masterapofis4997 months ago

    They should have warned us that GPT-4o mini only has 50 questions every 4 hours at this rate.
    Does that force you to use multiple accounts or share one, so that it stops being free? And when does that happen?
    We will go to another AI. We want GPT-3.5 back!

  • @aaagaming2023
    @aaagaming2023 months ago

    What about Llama 3.1 405B on Groq?

    • @engineerprompt
      @engineerprompt months ago +1

      That will be a great option too!

  • @stratos7755
    @stratos7755 months ago

    Can you run it locally? No? Not cost effective enough...

  • @bastabey2652
    @bastabey2652 18 days ago

    GPT-4o mini is much better than Gemini Flash