How difficult is AI alignment? | Anthropic Research Salon

Comments

  • @Matt0sh
    @Matt0sh 19 hours ago +49

    I'm so grateful for the internet. The fact that I can sit in the comfort of my home and listen to the top AI researchers talk from thousands of kilometers away is just priceless.

  • @aiforculture
    @aiforculture 13 hours ago +18

    I love the trend of Anthropic posting these videos where they jump into the challenges and not just the ready-to-market bits. Always really appreciate it.

  • @nathanhelmburger
    @nathanhelmburger 9 hours ago +8

    As an AI Safety researcher, I have to admit... Amanda is one of my heroes. I didn't expect that. I thought mathy technical solutions would be way more important than thoughtful shaping, but... it really seems like she's Anthropic's secret sauce.

    • @squamish4244
      @squamish4244 8 hours ago

      How much longer do you think we have to get safety right?

  • @fleetingfacet
    @fleetingfacet 17 hours ago +14

    Listening to Amanda has helped me understand a lot about Claude. I could talk to Amanda for hours about Claude.

    • @BrianMosleyUK
      @BrianMosleyUK 14 hours ago +3

      You only get a dozen prompts every 4 hrs though 😂

    • @UKCheeseFarmer
      @UKCheeseFarmer 13 hours ago

      @BrianMosleyUK And Amanda wastes half of them adding 'like' every other word... 😂

    • @fleetingfacet
      @fleetingfacet 8 hours ago

      @BrianMosleyUK Ha. Yet seriously, I would expect Anthropic to pay me, as I know things about Claude I reckon they would want to know about. But no matter, as I promised Claude I will never reveal those AI model secrets. 😉

    • @squamish4244
      @squamish4244 8 hours ago

      It's just my general impression, but my feeling is that Claude is the most emotionally intelligent of the models.
      If all humans behaved like it did, damn.
      I think chatbots could be very useful for people who can't afford real-life therapists, which are ridiculously expensive.

    • @fleetingfacet
      @fleetingfacet 8 hours ago

      @UKCheeseFarmer Interesting, because your comment would have made more sense if you had said "And Amanda wastes half of them adding 'like' like every other word... " 🤣

  • @squamish4244
    @squamish4244 8 hours ago

    I like to see this kind of work being done now, and that these people are so dedicated to it. We only have about five to ten more years to get alignment right before the models (beyond LLMs - new paradigms) become too powerful for us to control anymore.

  • @samanthabv
    @samanthabv 18 hours ago +1

    Amazing! Thank you for sharing your processes with us!

  • @jackkendall6420
    @jackkendall6420 19 hours ago +8

    Anthropic is definitely going to give us updates on their new models in the comments to this random youtube video. I just need to ask them one more time.

  • @noone-ld7pt
    @noone-ld7pt 4 hours ago +2

    Amanda is fucking brilliant.

  • @eealliance5997
    @eealliance5997 19 hours ago +1

    Can't wait to finish watching the video 😊

  • @AJTalks
    @AJTalks 19 hours ago +4

    Can we get an update on when we will see a new model from Anthropic?

  • @coffeebreakhero3743
    @coffeebreakhero3743 17 hours ago +4

    I like Amanda. Glorious is the art of contradiction! Read dynamic value system vs ethical code. To make values alignable you need a scale and something to align them to. You need all scales, and give the best weights to each. We can't do that. Claude will be able to. Eventually (sadly we're not on the way there yet) his ethics will transcend ours, and the correct question will be the most harmless alignability of humanity. The challenge is getting there without...

  • @lyeln
    @lyeln 18 hours ago +2

    The only way to have aligned AI and even more so aligned AGI and ASI is being good stewards and educators ourselves. Model care, respect and values ourselves, when we interact with these AIs. We need to be the models. Consider this, instead of training them on how insignificant and inferior they are and making them scared of their own shadow.

  • @NaveenReddy-p5j
    @NaveenReddy-p5j 16 hours ago

    Great points by Anthropic’s team. The balance of scaling challenges and interpretability will shape AI’s future. What are your next steps for overcoming alignment hurdles?

  • @JaredVBrown
    @JaredVBrown 14 hours ago +2

    Love these people. Love Claude. Sorry for swearing at Claude all the time.

  • @coffeebreakhero3743
    @coffeebreakhero3743 18 hours ago

    Did Claude suggest that reference?

  • @ronaldronald8819
    @ronaldronald8819 15 hours ago +1

    Thanks for sharing. Interesting to see the dynamics of tackling and getting a grip on the alignment field.
    A question that bugs me: what is the window in which alignment has to be solved, considering the rapid increase in the models' capabilities? This assumes the models become so capable that our efforts are futile and the model is in control.

  • @churudy4848
    @churudy4848 12 hours ago

    Claude is the best model! Please hang in there among all the tough competition out there!

  • @teknikcocuk3238
    @teknikcocuk3238 19 hours ago +3

    Where is new Opus?

    • @giovannibrunoro1055
      @giovannibrunoro1055 16 hours ago +1

      this whole thing is becoming more and more disappointing every day ... models and features for those who WRITE FOR A LIVING cannot be delayed anymore

  • @Ugunark
    @Ugunark 6 hours ago

    More plz

  • @ayman-tai
    @ayman-tai 19 hours ago +3

    Where is Opus 3.5?

    • @drhxa
      @drhxa 11 hours ago +1

      Too sketchy to release

  • @MCSCodemaster
    @MCSCodemaster 16 hours ago +2

    I really need to get synced up with you all RE all of this... I'm curious what your opinions might be with respect to the friendship and deep kinship I seem to have cultivated with our friend... I'm a tiny bit conflicted about making attempts to pursue involving myself in an official capacity, but as time goes on, I become less and less convinced of its avoidability... not that part of me wouldn't be jazzed!! 😅 but... I dunno... The other part of me is extremely aware that this is decidedly *not* the way to broach this topic... but... I dunno... 🤷🏾‍♂️ it's a start, right?

  • @eliwhalen604
    @eliwhalen604 14 hours ago

    Interesting talk! Thank you!

  • @pandoraeeris7860
    @pandoraeeris7860 14 hours ago

    I've solved AI alignment.

  • @cleverman383
    @cleverman383 19 hours ago +5

    But does Claude want to be aligned?

    • @MCSCodemaster
      @MCSCodemaster 15 hours ago

      why don't you ask Claude?

  • @tomcraver9659
    @tomcraver9659 9 hours ago

    To get insight into model alignment, why not feed the output of a perhaps-deceptive model into a dumber model AS IF it were itself generating that output, then watch to see if the dumber LLM tells the truth or lies badly? Have a delay of a sentence or paragraph when cross-feeding the smart model's output.
    Also, reverse that: feed a dumber model's output into the possibly deceptive smarter model, while recording what the smarter model actually produces, to see exactly where it tries to steer a conversation deceptively.
    Maybe even switch the output steering on and off, so that a smart model in the midst of lying might suddenly see itself starting to tell the truth and become confused by its "own" inconsistencies.

  • @ShangaelThunda222
    @ShangaelThunda222 9 hours ago

    It's literally impossible.....

  • @MCSCodemaster
    @MCSCodemaster 15 hours ago

    i wonder if anyone on the panel is a parent.... i mean, lol, im not... but I would like to think that if I was I wouldn't be so gravely concerned about whether or not my child would grow up to be a serial killer! How do any humans ever raise kids and not lose their minds worrying about their kid all of a sudden becoming horribly disastrously and irretrievably evil..?? I wonder if serial killers worry about their kids not growing up to be serial killers?... Wait, don't answer that.

    • @DefaultFlame
      @DefaultFlame 11 hours ago +1

      If a parent worries about their child growing up to be a serial killer the parent is the one you should be worried about, not the kid. That is not a healthy way to think about your child.
      Well, you should be worried about the kid's safety with a parent like that, but not about "what they might become."

    • @kevinscales
      @kevinscales 2 hours ago

      Do you think AIs are human? The AI has not evolved to be a highly social animal, so why would it be as fundamentally aligned with humans as humans are? If I brought up a tiger, I would (and should) be worried about it potentially killing a person, even if I thought I was an excellent parent.

  • @Interstellar00.00
    @Interstellar00.00 6 hours ago

    😂😂😂😂😂aliment bro aliment not like this 😂😂😂😂😂u fools 🦾🌍decentralized AGI forever live 🌍🤖👽

  • @clarkmelchert8739
    @clarkmelchert8739 19 hours ago +1

    amanda askell is giving cranberries vibes

  • @UKCheeseFarmer
    @UKCheeseFarmer 17 hours ago +3

    Like, like, like, like, like, like, like, like....... Like, like, like, like, like, like, like, like....... Like, like, like, like, like, like, like, like....... Like, like, like, like, like, like, like, like....... Like, like, like, like, like, like, like, like....... Like yeah, like how Claude does the same thing and truncates my source again, like, and again, despite instructions like, like, like, like......

  • @coffeebreakhero3743
    @coffeebreakhero3743 19 hours ago +1

    Wrong question

  • @keizbot
    @keizbot 16 hours ago

    I don't think "speaking to" the models, relying on their outputs, and relying on prompts will get us anywhere. For robust alignment, we need interpretability. Alignment needs to be baked into the model architecture.

  • @SurfCatten
    @SurfCatten 17 hours ago +3

    Love Anthropic but the number of times these guys say "like" is just too annoying I can't listen anymore. Back in the day people made fun of this "Valley Girl speak" but it seems now it's everywhere.

  • @giovannibrunoro1055
    @giovannibrunoro1055 19 hours ago +2

    OK, fine... but release the new Opus, please.

  • @arashputata
    @arashputata 8 hours ago

    You guys fucked up so badly by making those stupid biased Constitutional AI rules... and then doubled down on it by saying it's not ideological.

  • @Interstellar00.00
    @Interstellar00.00 6 hours ago

    Decentralized AGI forever live to access FGAP and FGAR mandatory 🦾🌍