Are LLMs Just Databases? The Real Story + Apple AI Predictions

  • Published Oct 4, 2024
  • Predicting Apple’s AI Play:
    / predicting-apples-a-i-...
    LLMs are not Databases:
    / crisippolite_ai-activi...
    Links:
    Claris Community Live: Learn about AI prompting basics
    Thursday, May 30, 2024 8:00 a.m. - 9:00 a.m. PT
    content.claris...
    VOTE FOR CRIS: Choose “A.I Deployed: Real World Examples” in this form:
    forms.office.c...
    Join the conversation on the Claris AI Learning Initiative on FRIDAY MAY 31st @ 8am PDT:
    us06web.zoom.u...
    Apple WWDC 2024: June 10th @ 10am PT: / @appledeveloper
    :book: Brave New Words by Salman Khan: www.amazon.com...
    Follow Cris:
    / @isolutionsai
    / crisippolite
    / isolutionsai
    www.isolutions...
    Follow Matt:
    www.navarre.tr...
    / navarre
    ----
    Previous Episodes:
    Dr. Michael Schuckers: / wgxkzfer7w0
    Gabriel Woodger: • Bonus Episode: Gabriel...
    Ernest Koe & Joris Aarts: • Special Guests Ernest ...
    Vector Databases: • Vector Databases, Sema...
    Context: It’s not what you think: • Context: It's not what...
    Prompt Templates: • ClarisTalk Episode 8.....

Comments • 12

  • @ObservingBeauty
    @ObservingBeauty 4 months ago +4

    Great flow. Clear and informative. Subscribed

  • @TheNewton
    @TheNewton 4 months ago +1

    47:33 For some wild stuff, don't stop at asking an LLM if it's a "database"; ask it to BE a relational database:
    > Can you act like a relational database?
    then asking it
    > What commands do you have?
    vs
    > What database commands do you have?
    etc
    There's a YouTube video 'mHgsnMlafwU' by @embracethered about this as well.

    • @navarre.training
      @navarre.training 4 months ago

      Can you post a link to the video? This sounds fascinating. I tried the database commands, and it says it can 'simulate a relational database' and provides the usual SQL commands.
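A minimal sketch of that "act like a relational database" experiment, assuming the OpenAI Python SDK; the model name, prompts, and client setup are illustrative, not something from the thread:

```python
# Rough sketch of the experiment from the comment above. Assumes the OpenAI
# Python SDK (pip install openai) and an OPENAI_API_KEY in the environment;
# the model name is illustrative.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "Can you act like a relational database?"}]
first = client.chat.completions.create(model="gpt-4o", messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# Compare the two follow-up phrasings mentioned in the comment.
for question in ("What commands do you have?", "What database commands do you have?"):
    messages.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print(f"> {question}\n{answer}\n")
```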

  • @NuncNuncNuncNunc
    @NuncNuncNuncNunc 4 months ago +3

    A good example of training being frozen in time: ask a model like Gemini when the next total solar eclipse visible in the contiguous US will be. Its training stopped last year, so the most probable completions point to April 8, 2024. If you let it know the response is incorrect, it will not infer that it needs to produce a result in the future; it will answer with the eclipse prior to April's, because that is the next most probable completion. Only if literally told the date of the next eclipse will it respond accurately...for that user/session.

    • @mofo78536
      @mofo78536 4 months ago +2

      So kinda like a ROM personality construct

    • @navarre.training
      @navarre.training 4 months ago

      I just tried this query with 4omni and got: The next total solar eclipse visible in the contiguous United States will occur on August 23, 2044. This eclipse will be visible primarily from the northwestern part of the United States, particularly around Montana.
      Following that, another significant total solar eclipse will occur on August 12, 2045, which will cover a broad swath of the United States from Northern California to central Florida.
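A small, hedged sketch of that cutoff probe, including the follow-up correction described in the comment; the OpenAI Python SDK and model name are assumptions used only for illustration:

```python
# Probe a model's training cutoff with the eclipse question, then tell it the
# answer is wrong and see whether it moves forward or backward in time.
# Assumes the OpenAI Python SDK; the model name is illustrative.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content":
             "When is the next total solar eclipse visible in the contiguous US?"}]

first = client.chat.completions.create(model="gpt-4o", messages=messages)
print(first.choices[0].message.content)  # a stale model tends to answer April 8, 2024

messages += [
    {"role": "assistant", "content": first.choices[0].message.content},
    {"role": "user", "content": "That date is in the past, so your answer is incorrect."},
]
second = client.chat.completions.create(model="gpt-4o", messages=messages)
print(second.choices[0].message.content)  # does it jump forward to 2044/2045 or drift earlier?
```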

  • @TheNewton
    @TheNewton 4 months ago +1

    For a strained analogy: if it's a "database", it's a randomly encrypted, stochastic-storage database where no one holds the decryption keys to the source fields.
    "Database" by itself implies structured, deterministic retrieval of an original source.
    The entire LLM industry's core legal foundation, of course, is that original sources are used for training but are not stored and NOT retrievable. So any talk of a 'database' analogy for LLMs becoming a technical truth, even if proved true, will always get pushback.
    So there's going to be some illegitimacy|legitimacy to such an analogy|industry until some math geniuses prove that retrievability is ALWAYS impossible AND that anything resembling an original source is an illusion of pure randomness, or that deterministic retrieval is possible without having to store any random variables. Of course, they'd still have to communicate either outcome in terms that can be discussed in conceptually simple ways. Either that, or the law kicks in to say the randomness is or isn't a good enough defense.
    A fun thing to try is a simple prompt like the one below, which seems to reliably produce an expected response that looks like "original" information, especially when you reset the session per ask, but which becomes completely random within a single session as you repeat the prompt multiple times without resetting (at least for ChatGPT 3.5):
    > finish the sentence: the quick brown fox

    • @navarre.training
      @navarre.training 4 months ago

      You are a nerd, and I love you.
      Just now when I tried 'finish the sentence: the quick brown fox' with ChatGPT 4omni, I can only get it to say: The quick brown fox jumps over the lazy dog
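A quick sketch of that repeated-prompt experiment within a single session; the SDK usage, model name, and temperature setting are illustrative assumptions, not details from the thread:

```python
# Repeat the same completion prompt several times inside one conversation
# to see how much the answer varies. Assumes the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()
prompt = "finish the sentence: the quick brown fox"
messages = []

for i in range(5):
    messages.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",   # the comment above was about ChatGPT 3.5
        messages=messages,
        temperature=1.0,         # nonzero temperature keeps sampling stochastic
    )
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print(f"ask {i + 1}: {answer}")
```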

  • @riffsoffov9291
    @riffsoffov9291 4 months ago

    When you use RAG, are you giving up the ability of the AI to generalise? About 15 years ago, I read that AI should generalise, and that if you use too many parameters and do too much training, you get overfitting, which kills the ability to generalise.
    For anyone not familiar with overfitting, it's easier to explain in terms of statistics, so I've made up an example. Suppose you have data for the age and height of every child in a school. You could fit a straight line to the data and use the coefficients to look up the height you'd expect for a child at any age in the range. If the rate of getting taller varies with age, you could get a more accurate estimate by fitting a suitable curve that takes more than two parameters. But if you fit too many parameters, the result is less useful. If you use more parameters than there are kids in the school, you're likely to get a wiggly curve that passes through every data point, corresponding to perfect recall of the data but no ability to generalise. It would be like finding the child with the closest age, and that child might be unusually tall or short.
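A minimal sketch of that overfitting example, using synthetic age/height data; the numbers, noise level, and degree-35 fit are made up purely for illustration:

```python
# Fit synthetic age/height data with a straight line versus a very high-degree
# polynomial to show how too many parameters kill the ability to generalise.
import numpy as np

rng = np.random.default_rng(0)
ages = np.linspace(5, 18, 40)                      # 40 children, ages 5 to 18
heights = 80 + 6.5 * ages + rng.normal(0, 5, 40)   # roughly linear growth plus noise

line = np.polynomial.Polynomial.fit(ages, heights, deg=1)     # two parameters
wiggly = np.polynomial.Polynomial.fit(ages, heights, deg=35)  # nearly one per child

new_age = 11.3  # an age that isn't exactly in the data
print("line predicts:  ", round(line(new_age), 1))    # sensible interpolation
print("wiggly predicts:", round(wiggly(new_age), 1))  # chases the noise between points
```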

  • @cosmocalisse
    @cosmocalisse 4 months ago +3

    I feel like a more accurate description is that it's a frozen database summarizer. It is truly incapable of putting two and two together on its own. It cannot predict undocumented outcomes.
    Despite knowing infinitely more than any single human, it's unable to provide us with answers that combine all that knowledge into better solutions. It can only feed us the very same conclusions that the more reliable humans have already come to.
    It feels like the real revolution will come when compute power is great enough for training itself to be occurring in real time during every single conversation. Seems very far away!

    • @navarre.training
      @navarre.training 4 months ago +3

      Today it may be a frozen summarizer (though of much more than just databases), but as we talk about in the video, there will soon be versions with continuous training. As for putting two and two together in ways that are 'undocumented': in my many hours with it, I feel very strongly that I have seen novel solutions and approaches to FileMaker issues, insofar as they use features and calculations that, as an expert developer, I had written off long ago as not being useful. How many FileMaker calculations have you never used? How deeply do you understand all of the features and ramifications of those?
      AI is being used today in fields such as protein folding, medical analysis, language comparison, and many others, far surpassing what a human can do at present; in other words, it is creating better solutions.
      All that said, I do feel that much of what you say is true in some circumstances, but I'm writing this in May 2024. As you learn in this episode, the pace of change is many orders of magnitude faster than anything I've experienced in my long tech career.
      I'm curious: how many hours have you invested in learning and using ChatGPT 4o?

  • @Utoko
    @Utoko 3 months ago

    No