The Future of Knowledge Assistants: Jerry Liu

  • Published on 23 Aug 2024
  • In this talk, LlamaIndex founder & CEO Jerry Liu covers how we go beyond single-LLM prompt calls. He discusses advanced single-agent flows, Agentic RAG, multi-agent task-solvers & service architectures, and more. Jerry also announces Llama Agents: Agents as microservices that are easily deployed and communicate via a single API (and much more).
    Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at www.ai.enginee... & join us at the AI Engineer World's Fair in 2025! Get your tickets today at ai.engineer/2025
    About Jerry
    Jerry Liu is the CEO & Co-Founder of LlamaIndex

Comments • 27

  • @semrana1986
    @semrana1986 several months ago +14

    Nice to see AI reinventing itself; we used to call these approaches IR and Multi-Agent Systems.

    • @washedtoohot
      @washedtoohot 3 days ago

      I don’t blame him. AI has come to a point where everyone and their mother wants to use it. My point being that it is far removed from academia.

  • @kaihuchen5468
    @kaihuchen5468 17 days ago +2

    > 9:44 Why multi-agents
    In addition to the mentioned benefits of using multi-agents (specialization, parallelization, and reduced cost/latency), there are several other important advantages:
    - Enhanced Reliability: By having multiple diverse agents attempt the same task or decision, we have a better chance of avoiding disastrous or erroneous outcomes.
    - Improved Quality: Constructive competition among agents (if set up to do so), where each agent critiques the work of others, can lead to higher quality results.
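The reliability point above can be illustrated with a minimal sketch, assuming each agent is just a callable that maps a task string to an answer string. The three stand-in agents and the `majority_vote` helper are hypothetical names used for illustration; they are not part of Llama Agents or any other library.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical agent type: any callable from a task to an answer.
Agent = Callable[[str], str]


def majority_vote(agents: Sequence[Agent], task: str) -> tuple[str, float]:
    """Run every agent on the same task; return the most common answer
    and the fraction of agents that agreed with it."""
    if not agents:
        raise ValueError("need at least one agent")
    answers = [agent(task) for agent in agents]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


# Stand-in agents (assumptions); real ones would call different models or prompts.
def agent_a(task: str) -> str:
    return "42"


def agent_b(task: str) -> str:
    return "42"


def agent_c(task: str) -> str:
    return "41"  # one agent gets it wrong


answer, agreement = majority_vote([agent_a, agent_b, agent_c], "What is 6 * 7?")
```

A low agreement score could be used as a signal to escalate the task rather than trust the winning answer.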

  • @SebKrogh
    @SebKrogh several months ago +13

    We went from "Gen AI will make things easier and replace developers" to having to hire more developers and the equivalent of rocket scientists 😅

    • @zacboyles1396
      @zacboyles1396 19 days ago +1

      That’s the thing about automation and why it’s taken so long for companies to properly invest. It takes quite a bit of time to do it right; however, once you do… unless you have another automation problem for those new developers, you might be back to the “replacement” conversation.

  • @brunomattesco
    @brunomattesco 19 days ago +4

    This micro-agent structure is exactly what I was thinking about yesterday, and I want to sell a SaaS built on it.

  • @oddfeeling7956
    @oddfeeling7956 11 days ago

    After 3 hours of playing around with it: this is awesome!!! Can I turn the agents into route endpoints with something like a reverse proxy and query them directly, as I would API endpoints?

  • @Drone256
    @Drone256 several months ago +8

    So this needs a sample application to demonstrate its value. Show me something I can’t currently do with API calls to my favorite LLM and good ole fashioned code.

    • @majesticmewtwo7386
      @majesticmewtwo7386 29 days ago

      there are already many tools you can use to do this. Here is an example, AutoGen is a framework that enables next-gen LLM applications via multi-agent conversation. Look it up!

  • @jianghong6444
    @jianghong6444 several months ago +4

    I would assume that a lot of RAG tech would ultimately be using existing technologies, e.g. search/IR, etc.

  • @Bakmandour
    @Bakmandour 21 days ago +3

    If we see Agents as Microservices, why not reuse existing microservices infrastructure that has proved reliable for years? Truly curious about the reasons.

    • @Bakmandour
      @Bakmandour 21 days ago

      @Jerry Liu

    • @zacboyles1396
      @zacboyles1396 19 days ago

      You absolutely should be; I’m of the opinion that’s where the biggest gains are being made. Micro agents can enhance old exception handling processes, with specialized agents redirecting requests while factoring in live system information or contextual data. In general it allows your old microservices to handle more complex tasks or accept a wider variety of inputs. Think about all the processes with some type of minimum criteria requirement, where failed requests get passed to more expensive, often manual or human-involved workflows. A cheap micro agent can fill in missing details or approve alternative workflows. To say it’s a polish for microservices is an understatement; it’s more like a powered exoskeleton with Jarvis to keep them company. 😂

  • @christopherprobst-ranly6357
    @christopherprobst-ranly6357 28 days ago +4

    Outside of Python AI bubble this is so old and natural that you would never call it an Invention 😂 Well that happens when some data scientists try to host their Jupyter Notebook 😂

  • @1armadyl
    @1armadyl several months ago +4

    look into semantic kernel and kernel memory

    • @majesticmewtwo7386
      @majesticmewtwo7386 29 days ago

      Damnn, that was soo insightful! Thanks man.

  • @oddfeeling7956
    @oddfeeling7956 11 days ago

    Went through the repo and checked the branch list to peep possible feature branches. Who tf is Logan!!?

  • @vikk2524
    @vikk2524 6 days ago

    Popular frameworks usually come from extracting reusable bits from a proven, working production system. I don't think it's productive to try to come up with some all-encompassing framework out of nothing. I recommend that AI engineers just use their existing microservice solution, figure out what's lacking for serving LLM agents, and then derive a solution from there if actually necessary. From this presentation, it's quite unclear what problems Llama Agents solves that are worth the migration effort.

  • @cagdasucar3932
    @cagdasucar3932 several months ago +28

    I really think llama agents is utterly useless. There's no point in making agents into micro services. Just make an async call instead. Much lower overhead in terms of development and performance.

    • @yvestschischka9584
      @yvestschischka9584 several months ago

      Well, that sounds strong, but it's actually not really useful, as you'd need async services like BPM. So... I can see the worth in those agents. And it's not a coincidence Google is going in the same direction.

    • @xiomoen3943
      @xiomoen3943 several months ago

      ​@@yvestschischka9584 Both views are good.
      For simple applications, directly using asynchronous calls can indeed reduce development and operational costs, avoiding the complexity of message queues and proxies.
      However, for more complex applications that need to handle intricate tasks, using message queues and proxies can offer greater flexibility and scalability.
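The queue-based option mentioned above can be sketched with the standard library alone, assuming an in-process `asyncio.Queue` stands in for a real message broker. The `agent_worker` and `run_with_queue` names are hypothetical, and this is not how Llama Agents itself is implemented.

```python
import asyncio


async def agent_worker(name: str, inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Pull tasks from the shared inbox until a None sentinel arrives."""
    while True:
        task = await inbox.get()
        if task is None:  # sentinel: no more work for this worker
            inbox.task_done()
            break
        # A real agent would call an LLM here; this stand-in just tags the task.
        await outbox.put((name, f"handled {task}"))
        inbox.task_done()


async def run_with_queue(tasks: list[str], n_workers: int = 2) -> list[tuple[str, str]]:
    inbox: asyncio.Queue = asyncio.Queue()
    outbox: asyncio.Queue = asyncio.Queue()
    workers = [
        asyncio.create_task(agent_worker(f"worker-{i}", inbox, outbox))
        for i in range(n_workers)
    ]
    for task in tasks:
        inbox.put_nowait(task)
    for _ in workers:  # one sentinel per worker so each one exits
        inbox.put_nowait(None)
    await asyncio.gather(*workers)
    results = []
    while not outbox.empty():
        results.append(outbox.get_nowait())
    return results


results = asyncio.run(run_with_queue(["t1", "t2", "t3"]))
```

The producer never needs to know how many workers exist, which is the flexibility the comment refers to; the cost is the extra queue plumbing that a direct `await` avoids.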

    • @fallinginthed33p
      @fallinginthed33p 28 days ago +5

      The problem is that LLM API calls can't be async and parallelized if subsequent calls depend on results from previous calls. The more agent calls you have, the longer it takes to get a completion reply to the user.
      There's so much needless abstraction when these are just API calls to an LLM service.
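A minimal sketch of the latency point above, assuming a hypothetical `fake_llm_call` coroutine that sleeps instead of calling a real LLM service: independent calls can run concurrently with `asyncio.gather`, while a chain in which each call needs the previous reply has to run step by step.

```python
import asyncio
import time


async def fake_llm_call(prompt: str, delay: float = 0.05) -> str:
    """Stand-in for a network call to an LLM service (assumption)."""
    await asyncio.sleep(delay)
    return f"reply({prompt})"


async def independent_calls(prompts: list[str]) -> list[str]:
    # No call depends on another, so all of them can be in flight at once.
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))


async def dependent_chain(first_prompt: str, steps: int) -> str:
    # Each call consumes the previous reply, so the calls cannot overlap.
    result = first_prompt
    for _ in range(steps):
        result = await fake_llm_call(result)
    return result


def timed(coro):
    """Run a coroutine to completion and return (value, elapsed seconds)."""
    start = time.perf_counter()
    value = asyncio.run(coro)
    return value, time.perf_counter() - start


parallel_replies, parallel_secs = timed(independent_calls(["a", "b", "c"]))
chained_reply, chained_secs = timed(dependent_chain("a", 3))
```

With three calls each, the chained version takes roughly three times as long as the parallel one, since its total latency is the sum of the individual delays.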

    • @shootdaj
      @shootdaj 21 days ago

      That's not scalable. That's the whole point of microservices

    • @zacboyles1396
      @zacboyles1396 19 days ago

      Take the basic internet-dependent search that everyone uses as an agent use case, but, to refute your comment, let’s skip the typical lazy examples.
      You need to take the user’s query and properly qualify it. This could be many micro agent calls if you’re doing it right, dozens really. First, you had better be using multiple search providers per culture/language region. Focusing on the U.S., you’d have the standard Google/Bing, Brave, and one of the intelligent ones like Tavily. That’s 4 services, each with a large number of arguments to help tailor the results to better address the query.
      How are you determining the “freshness” of the results? What if a date range is required? You can get away with one agent determining a start/end date range, but you’ll need another to determine past week/month/year. What about the general search category, like web/news/images/etc.? You should send the query off to a micro agent to determine which result set(s) should be targeted. You can also pass the query off to another micro agent to be rewritten to enhance results if possible.
      What if the question is more technical, or the query would benefit from Reddit or social media profiles? You would need to send it to a Reddit specialist agent who could determine whether it should be included and, if so, what those parameters might be, and similarly for different social media. Stack Overflow, Wikipedia, etc. would each benefit from separate agents targeted at each site’s content, helping to map out the search plan.
      Once each of these micro agents has completed its query evaluation task, all run in parallel of course, you then fire off the searches, again in parallel. What do you do with 5 or 10 sets of results? You need to go through them and begin to collect the useful information from the results, firing off scrapers if/when the user’s query requires further investigation. That’s a ton of micro agents, and all we might have done is accept a query and hopefully communicate some details of each of these micro background processes taking place.
      Llama agents seem to be a step in the right direction for deploying, organizing, and sharing/reusing micro agents.
      A side note, what’s utterly useless is the anti-pydantic LangChain Expression Language LCEL for Python. I think their detour set back the entire AI development industry 6 months, quite possibly 12 considering how it broke everything and made samples and demos worse than useless for about a year.
      Cheers
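The plan-then-search fan-out described in the comment above could be sketched roughly as follows. Everything here is a hypothetical stand-in: the three planner coroutines use simple string checks where the comment imagines LLM-backed micro agents, and `search` returns a formatted string instead of calling a real provider.

```python
import asyncio


# Hypothetical planner "micro agents"; each one decides a single aspect of the plan.
async def classify_category(query: str) -> dict:
    return {"category": "news" if "latest" in query.lower() else "web"}


async def pick_freshness(query: str) -> dict:
    return {"freshness": "past_week" if "latest" in query.lower() else "any"}


async def rewrite_query(query: str) -> dict:
    return {"rewritten": query.strip().rstrip("?")}


async def search(provider: str, plan: dict) -> str:
    # Stand-in for a real search API call (assumption).
    return (
        f"{provider}: results for '{plan['rewritten']}' "
        f"({plan['category']}, {plan['freshness']})"
    )


async def plan_and_search(query: str, providers: list[str]) -> tuple[dict, list[str]]:
    # Stage 1: the planners are independent of each other, so run them concurrently.
    parts = await asyncio.gather(
        classify_category(query), pick_freshness(query), rewrite_query(query)
    )
    plan: dict = {}
    for part in parts:
        plan.update(part)
    # Stage 2: the searches need the finished plan, but not each other.
    results = await asyncio.gather(*(search(p, plan) for p in providers))
    return plan, list(results)


plan, results = asyncio.run(
    plan_and_search("latest LlamaIndex release?", ["provider_a", "provider_b"])
)
```

Only the boundary between the two stages is sequential; within each stage the calls overlap, which is why the number of micro agents does not translate directly into added latency.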

  • @techwiththomas5690
    @techwiththomas5690 16 days ago

    I want to use Llama 3.1 8b and use a Qwiki (Quality management wiki) for RAG. If possible I would like to use a Llamafile. This whole thing should run only locally with no connection to the internet. Is there any way I could get a tutorial on this? Possibly with the advanced RAG features you showed in the presentation, because I really do not just want a "glorified search".