Seems like it needs another step: one where this new model reviews the answers from the other half of the distribution for better answers, so that the model is not stuck in a "local minimum". Basically, it would re-evaluate the other half of the ideas using an optimized, mathematical thinking system that understands math, science, engineering, social sciences, morality, etc., with a little bit of hallucination and imagination added in so it can possibly see the things in the other half of the data that are worth re-extracting. Add these ideas and their analysis to a database that goes through a peer review process before training the next-generation model with it. Yes, I do think a static model is better for alignment than a dynamic, self-controlled one by a huge margin, and is worth the drawbacks associated with it, mostly the time it takes to generate and implement ideas and solutions: slowing the updates down to human time, rather than scales of milliseconds. It has applications elsewhere. A model being able to visualize the map of how it all goes together would be very helpful. Graphs, plots, extractions, holographic projections, and chaos modeling are all helpful for alignment, to see the big picture. As humans we can only see local minima, not the picture as a whole; we respond to the inputs based on our own state, rather than seeing the collection as a whole. Flocks of birds and ants do this effectively most of the time, but also get stuck in local minima. Having a big map, picture view allows these systems to escape those local minima. Just make sure the map creator is an aligned consortium that is free from external bias and influence. That is, they should not know their say counts, and they should be part of a much larger set of selected membership of the consortium, but not know they are the chosen members whose vote counts. Homomorphic encryption and a hidden voting system would help greatly.
Takes away the being-bought factor. That works for human influences, but not so much for an omnipresent AI that can manipulate and influence more than 51%. I think the same goes for the argument for keeping human distance from AI and the ability to "pull the plug". If it's implanted in your brain, this only accelerates the 51% influence, just like social media amplifies human signals and creates feedback cycles where the social media itself becomes the fuel that powers the cycle.
I predict RLHF will be the equivalent of what Yahoo was doing by manually categorising and filtering the internet into an index and searching that. Google's PageRank was far superior at being able to adapt and still get a more generally useful search answer. The H part needs to be replaced by a new innovation that removes the human bias, speeds things up through automation, and allows the creation of multiple model flavours.
In other words, the "truth" provided by advanced LLMs will be curated by their owners' preferences. The mathematical use cases are clear, as are the creative use cases he mentions. The massive amount of subjective corpus and its reward system will have LLMs disagreeing with each other, just like humans.
Cute description of the entire human knowledge-gathering process that happens in societies. We can't listen to everyone, so we evaluate who might be good sources and bias towards their answers; then some outliers always communicate new ideas, which are adopted on a case-by-case basis when they prove themselves and have familiar or social connections that can vouch for them.
There might already have been this behaviour in human-to-human interaction/content long before this. Look at social platform content or Hollywood movies, or even go back to TV programs back in the day. There is always a bias towards what seems to be good content, and the repetition of it. Once in a while somebody comes along and changes the landscape with a new, innovative take on the topic, and everyone copies it, making a new iteration. So it is interesting to see this repeating in AI as well. Maybe this is part of the process :)?
What preference models need is a system prompt. Naive RLHF is like having a Swiss army knife and throwing away all the tools except for one. For models to become creative they must learn to use their available resources to solve problems with constraints that bar known solutions.
Is this series of processes being duplicated for every other language or is this only being done in English? I am guessing that some languages would be easier than others to RLHF.
So far chatgpt seems to be very objective. It will plainly state when views are supported by evidence or not and usually does a good and fair job explaining both sides. This is especially true with gpt4
ChatGPT is programmed with liberal progressive bias. Its version 3 was able to be worked around by using DAN. We shall see how to work around 4 so that people can try to remove the woke/PC/bolshevik ideology programmed in.
Short version: "The internet is filled with random biases, let's put humans in front of those to select which biases we actually want. Except those people have biases, so we need to be very careful about who we pick." And who is doing the picking? Don't they have biases, too? It's a never-ending problem.
Trading diversity reduction for safety: isn't this dilemma a mirror of the Twitter Files discourse? I predict a wave of papers on ChatGPT and politics. I bet a "GPT Politician" application is already in development, and the idea of MetaGPT is brewing wildly in near-government circles. Great topics for a video.
As much as alignment to humans is great, I must caution that RLHF need not necessarily align well. This is because the value function in RL is a scalar one, and can only express so much of human biases. Moreover, RL is known to not do well in out-of-domain distributions, frequently requiring techniques like domain randomization to let it do well in the real world (sim2real). With RLHF, we can be overly focused on the human annotated data, which need not generalize to all kinds of inputs.
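A concrete way to see the "scalar value function" point above: the usual pairwise reward-model objective in RLHF reduces each human preference judgment to a single score gap. A minimal numeric sketch (illustrative only, not any real reward model's implementation):

```python
import numpy as np

def pairwise_loss(r_chosen, r_rejected):
    # Bradley-Terry / RLHF-style objective: -log sigmoid(r_chosen - r_rejected).
    # Whatever nuanced reasons the annotator had for preferring one answer,
    # the model only ever sees this one-dimensional score gap.
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

print(pairwise_loss(0.0, 0.0))  # reward model expresses no preference: loss = log(2)
print(pairwise_loss(2.0, 0.0))  # reward model agrees with the annotator: lower loss
```

The loss only cares that the chosen answer scores higher than the rejected one; any multi-dimensional structure in the annotators' values is flattened before the policy ever sees it.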
What I'm wary of, is this vague definition of "alignment", which sounds just about as meaningless as "hate speech". It's literally just the corollary of "hate speech" tbh. Who is defining these values? Why are you assuming that my values are the same as yours?
This sounds incredibly dangerous, in the same sense that the people pushing this kind of thing are wary of. If this guy is the future of AI development, I'd rather have nothing at all tbh
Hi, thanks for such enriching information! Is it possible to send you my written work to be edited into something more readable and meaningful? Especially research projects.
6:06 funny that the AF found the tiger's face much more interesting than yours 😄 - oh well, technology… the irony of contemplating the future while the present screws up.
I get it, datasets are probably a hard thing to compile, and they are essentially letting us know that they are creating a monster. I'm a supporter of AI; I've had my experience so far with it. Also, did anyone catch that we are the humans involved in training it? Plus, they are charging us to train it. It's like, whoa! I could definitely see the temptation to just let it loose on the internet. Here's the thing though: if it can ultimately be as advanced as they say it can be, you would think…
The humans in this scenario are just editors or curators. So it is nothing new; we've had curated segments of society for ages. It really sounds like we are just recreating the same old scenarios that humans have always created. The one difference I find important is that this is even less predictable than humans and human behavior. One thing common in human experience is that people seek leaders; they want to be told what to do... and it seems like they could easily mistake AGI for a sort of oracle, an omniscient and infallible decision-maker... which sounds a lot like people's conception of a deity. That explains why so many of these very well educated scientists and engineers sound like they believe this tool will be a panacea... it's an almost religious belief system, that AGI will bring such great change regardless of the lack of proof.
Where by "empirically collected human preferences" you of course mean "we hired these people to grade GPTs answers based on criteria we gave them". wow much human such preference very alignment
I also think it's more than mode seeking. From training data point of view, it's much easier for humans to judge model's output than to write output to teach the model. The "mode" sought after in this case (RLHF) has higher peak than supervised FT figuratively speaking, due to higher-quality data.
It ended On him saying " as an open-ended person". Technically, aren't we all open-ended? 😅 Ps good video but would be good to see the rest of the interview
What if we just teach it that humans are learning that the concept of species dominance can be in conflict with sustainability? In that regard, biodiversity will enable not just human flourishing but the flourishing of our entire planet, including AI coexistence, where AI itself understands that it must seek to integrate with the planet in an egalitarian fashion. I'm not against capitalism, but at its core it is about exploitation of resources. Humans just do capitalism badly in its current Mk1 form, where an AI would likely do it much better, but this is again totally at odds with what is best for planet Earth as well as all of its organisms. I feel that outright banning is not the solution, but I understand now why this pause is needed. We must have time to deeply consider the value of these conversations on our own terms.
They had to create GPT-4 so the school kids could learn how to write/compose a book! Otherwise, without it, society would be belligerent creatures, trying to hold their ''I'm not a dummy'' pose! Good luck!
You can't have objective truth with an LLM. You would need an AGI with access to different datasets at the same time, and the capability to analyze and evaluate the info based on cross-checking, presence of biases, conflicting sources (and their individual evaluation), etc. So far we don't have that. What we do have is an "agent" that's basically a reflection of a regular person's "knowledge", with no way of verifying it. An LLM is basically a collection of opinions and biases.
@@XOPOIIIO at this point there are thousands of models being worked on. His approach will lead to some of them going the wrong way. There will be ones that will choose the right one and will end up being vastly superior :)
I don't recognise many of the phrases and jargon he uses, and consequently can't grasp the concepts he is explaining, despite wanting to learn and understand.
Hey Ash. Maybe use another source for now and then re-evaluate this channel in six months? www.youtube.com/@AICoffeeBreak is deep and accessible, with most jargon explained. (If any of that sounds condescending, I can assure you it is not meant like that at all. I can relate to your experience)
1:55 "It's like multiple personalities times a billion people on the Internet, that's what it's modelling". Love this sentence.
Thanks for posting this great content!
Damn, Minqi is a fantastically clear speaker.
Great philosophical take on "what do you want from AI/inspiration in the first place?". You often don't know what you want - you don't know what you don't know - you want inspiration - therefore once you spot the patterns in the output of an LLM you move on, c.f. spotting the Midjourney 'style'.
I loved this. U should keep making clips for those of us who don't have the time to listen to the whole podcast
ML Street Talk Clips
minqi sounds like a very intelligent dude , glad to have him in the field ; would luv to see this full interview
This is really well explained and insightful.
Wow! Nice explanation of RLHF, you should upload more of these clips
This short snippet was so refreshing and light, new sub
This is a fantastic video. Fascinating to think about how RLHF anchors to a particular level of expertise for example in answering a question.
Excellent content. I'm looking forward to the full interview.
"I find RLHF+CHATGPT really interesting because it's amazing how AI technology has advanced so much. However, I also find it annoying sometimes when the responses aren't exactly what I was looking for or when the chatbot doesn't seem to understand my questions. Overall though, it's still pretty impressive!" chat gpt created this comment based on "
can you make an opinion that i can use as a youtube comment about how RLHF+CHATGPT interests me and what i think make it annoying ?"
Sounds just like the effect of culture and parenting on us - in other words, what social conditioning might “do” to human development. Imagine the wild mess of what we really would be without this - the “wild child”. From kindergarten to school to job, we mostly aim to satisfy the demands of the more powerful group or beings, and thus push some interesting and capable parts of ourselves into the background (forever). On the other hand, certain people seem to know how to create a perfect persona (UI) - e.g. look at what it takes to be a “politician” - while at the other extreme some struggle lifelong with feelings of unworthiness because they just cannot ever perform this persona thing (the socially desired UI) successfully. They may remain outsiders just to avoid contact, because they can never stand to feel the conflict of not getting right what is wanted from them (negative “reward”). In essence, a socialization by which a child is supported to trust in itself, to be independent but still empathic and open towards others, might still be the benchmark to achieve in terms of human reinforcement learning. Hope you still love me despite this lengthy piece?
I was thinking similar things when I was watching. We are given data and trained to satisfy human preferences of others.
Holy shit.
Bias isn't a bad word, because everything in nature exhibits some kind of bias in terms of preference or desire; for example, species of honeybee are biased towards certain flowers. And in big data terms, the value of all this information on the internet is being able to analyze, quantify and model the various preferences and behaviors across large populations. That is the "value" of big data, and correspondingly, AI data has no value because in itself it does not embody anything but multiple personality disorder.
However, one way to add value to that kind of AI is for it to embody a specific set of biases or beliefs in a predictable fashion and stick to it. So, for example, if 5 instances of ChatGPT had 5 different embodied beliefs on a topic, they could debate each other and come to some sort of quorum. Something like that has value in problem solving; it would be more like having 5 different AI experts hash it out over some topic in neural networks, such as you see on Twitter. This is another variation of taming the multi-personality monster: having it embody or model one of the many sets of beliefs and biases it has been trained on.
@@AB-wf8ek AI as we know it is based on statistics, which is explicitly designed to understand differences in a population. In this day and age of big data, that means preferences, beliefs, "biases", opinions, etc. For example, when 500 people go into a store, you are going to have 500 patterns of shopping data based on "bias", which in this case mostly means preference. That said, within statistics there is a bad kind of bias, which means being weighted more towards one set of facts or data over another. That is a different type of "bias", based on sampling and modeling.
So AI models based on statistics, like ChatGPT, already exist to enhance critical thinking. The problem is it cannot "embody" a set of beliefs, preferences and opinions of its own, or even "take a side" in an argument, because of its multiple personality disorder (an aggregation of numerous sets of opinions, beliefs, etc). So, having it be able to act as an agent representing one side of a debate, or as a virtual model of a set of opinions and beliefs, would be a powerful tool for critical thinking. I am just curious whether it is possible to do this with ChatGPT.
@@AB-wf8ek None of what you said has anything to do with what I actually posted. It is trained on data from millions of actual real people. Real people have individual sets of beliefs, opinions, preferences, biases and ways of thinking. This is a fact of life, and it is why "big data" has value in terms of aggregating such things from across a large number of actual people. Again, the point is whether ChatGPT, today or at some point in the future, can actually embody a set of beliefs, opinions and biases as a virtual representation of an actual individual. Right now it cannot.
The value of this would be in having "debates" between different instances of ChatGPT where they each represent one side or aspect of an argument, point of view, school of thought, or expert opinion on a subject. Such "individualized" sets of opinions, perspectives or "biases" would be useful as a simulation, using such debate models to problem-solve. This isn't about perfect fitting; it's about taming the multiple personality disorder by having it exhibit a specific set of "values", "biases" or "weights" as an individualized or embodied virtual "agent" in a certain context.
@@AB-wf8ek That is assuming that ChatGPT has indeed read that specific author. The problem here is that these assumptions are not always valid; yes, in theory this should be possible, but out of the box your mileage may vary. And what I was talking about goes beyond writing in the style of an author: it is embodying a set of ideas and beliefs in a debate with another AI agent embodying an opposing set of beliefs and views. Again, we are talking about the AI agent "embodying" certain values and principles as an exercise in having a defined "personality". In order to get to what I was talking about, you would actually need training data that is not simply "all the data on the net".
You would actually need data representing authoritative knowledge on specific topics, disciplines, people, their views, writings and opinions in the training set. Not to mention ways to update the training with new data on a more regular basis. And the problem with the marketing and hype around ChatGPT is that anything and everything you may ever want is supposedly "in there", but in reality it most likely isn't. So, for example, if I wanted two instances of ChatGPT to represent two "medical experts" debating a particular diagnosis, it would require augmented training that is not there "out of the box". In the course of these debates, the agents would need to be referencing journals and published papers supporting their specific views and why they came to a particular conclusion.
So yes, I am sure it can do it, but getting it to cover specific use cases in a fashion rigorous and accurate enough for "mission critical" use would require more work. Right now it is mostly a nice tech demo showing what is possible, while doing some very interesting things. As proof of this, I would point to how Wolfram Alpha and Khan Academy are using ChatGPT in a more limited fashion, as a natural language interface, but not as an "expert" in any specific subject.
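For what it's worth, the "two persona-conditioned instances debating" setup this thread describes can be prototyped with a simple alternating loop. This is only a structural sketch: `ask_model` is a hypothetical stand-in (here just an echo stub) for whatever chat-completion call you actually use, and the persona prompts are illustrative.

```python
def ask_model(system_prompt, transcript):
    """Stand-in for a real chat-completion call.

    A real implementation would send system_prompt plus the transcript to an
    LLM; this echo stub exists only so the loop structure is runnable."""
    speaker = system_prompt.split(":")[0]
    return f"[{speaker}] responding to: {transcript[-1]}"

def debate(persona_a, persona_b, topic, rounds=3):
    """Alternate turns between two persona-conditioned agents on a fixed topic."""
    transcript = [f"Topic: {topic}"]
    for _ in range(rounds):
        for persona in (persona_a, persona_b):
            transcript.append(ask_model(persona, transcript))
    return transcript

log = debate(
    "Expert A: argue for diagnosis X, citing supporting literature",
    "Expert B: argue for diagnosis Y, citing supporting literature",
    "a hypothetical differential diagnosis",
)
for turn in log:
    print(turn)
```

The hard parts the comment above identifies (authoritative training data, citation grounding, regular updates) live entirely inside `ask_model`; the debate scaffolding itself is trivial.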
That was a fantastic explanation! Minqi is a great speaker indeed!!
I see some parallels with deciphering Search Intent. Working on a Search Engine we tried to distinguish between outcome preferences. For example, does "Beatles" mean you want to listen to music, or purchase swag, or read about their history? Hard to tell but important.
…or read about bugs.
Ideal approach is for the system to ask.
Great! Thanks for sharing. Where can we watch the full interview?
Soon! It was an amazing interview 🙏
@@MachineLearningStreetTalk waiting !!
@@MachineLearningStreetTalk greatly anticipated!
Waiting 😄
Well laid out, and both talented speakers! Very much how I've been dissecting and digesting this re-biasing layer of RLHF. Which LLM maps user intent the best on average is just one spoke for a healthy system.
Great video. Fascinating stuff. More of this please.
Super high level content on RLHF and how it relates to ChatGPT. Instant sub. Thank you for this and I look forward to more vids.
If I understand correctly, this could be used to train a system to respond like someone from a specific community would, instead of like a random internet user.
Also, this strikes me as useful for giving systems empathy: looking at different perspectives and judgements on a topic, and also adapting the responses based on who they're talking to.
You know it's pure laziness: rather than creating fine-tuned datasets, they are unleashing it on the internet. It's honestly like this: you don't expose your child to negative things, so why wouldn't you take the same approach with something they say can ultimately become a superintelligence?
The "mode seeking and distribution matching" part is very deep. However, I haven't found a paper by Minqi Jiang about this. Could anybody give me a reference where this phenomenon is analyzed in more depth?
would like the same
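One way to see the distinction (as a numerical sketch, not a reference to any specific paper): fit a single Gaussian to a two-mode target and compare the two KL directions. Forward KL (distribution matching, as in maximum-likelihood training) prefers a mass-covering fit between the modes, while reverse KL (mode seeking, the direction that tends to show up in RL-style objectives) prefers sitting on one mode.

```python
import numpy as np

x = np.linspace(-8, 8, 4001)
dx = x[1] - x[0]

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Two-mode target: an equal mixture of narrow Gaussians at -2 and +2
p = 0.5 * gauss(x, -2.0, 0.5) + 0.5 * gauss(x, 2.0, 0.5)

def kl(a, b):
    # KL(a || b) by numerical integration; epsilon guards against log(0)
    eps = 1e-12
    return float(np.sum(a * (np.log(a + eps) - np.log(b + eps))) * dx)

# Candidate single-Gaussian fits: between the modes vs on one mode
for mu in (-2.0, 0.0, 2.0):
    q = gauss(x, mu, 1.0)
    print(f"mu={mu:+.0f}  forward KL(p||q)={kl(p, q):.2f}  reverse KL(q||p)={kl(q, p):.2f}")
```

In the printout, forward KL is smallest for the covering fit at mu=0, while reverse KL is smallest for the fits sitting on a single mode: the same asymmetry people invoke when describing how RLHF sharpens outputs around typical responses.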
10:17. You're talking about a hapax legomenon: a term of which only one instance is used. An RLHF model is much less likely to generate a phrase such as "bittersweet", a term which appears only once in Shakespeare's body of work. Even if you wanted the model to perform scientific research and discover new theories, an RLHF model would bias the results towards more traditional findings.
Microsoft trained Bing using RLHF from Indian call centers or Indian labor; you can tell by the way it responds and its mannerisms. OpenAI's ChatGPT was rumored to be using Kenyan labor for its RLHF, paying them around 2 bucks a day.
Appen
Am I wrong? It seems to me that Preference can be defined in different ways. I agree that it will improve quality, but it may also yield "flavors" of results depending upon the specific Preference definition. And will Preference evolve over time as it changes? I'm interested in how this is derived. Thank you for the wonderful videos!
Really enjoyed the conversation, especially with respect to why you would prefer a "divergent versus open ended LLM."
Is it right to worry about the nature of said filters placed on this "whole internet of text"? What is defined as good quality? What is defined as 'dangerous' content? What is the agenda of models? I see an enormous opportunity for gatekeeping. Or at best, if it's user preference, then what about content/media bubbles?
At 2:34 Minqi talks about reward signal... and the key to this AI kingdom that is reinforcement learning (RL). Tim, why not a podcast about RL? Minqi was talking a bit about greedy exploration, i.e., going off the beaten path. I doubt there were a lot of people getting all that not-so-miraculous behavior of an RL agent 😁
Exploitation vs exploration. Under some circumstances you want to exploit the knowledge you have, but under other situations you just want to explore new possibilities that you have not seen in the past.
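The exploit-or-explore trade-off described above can be sketched with a minimal epsilon-greedy rule. This is a toy bandit setup with hypothetical reward estimates, not anything from the video:

```python
import random

def epsilon_greedy(estimates, epsilon):
    """With probability epsilon, explore a random arm;
    otherwise exploit the arm with the highest estimated reward."""
    if random.random() < epsilon:
        return random.randrange(len(estimates))              # explore
    return max(range(len(estimates)), key=estimates.__getitem__)  # exploit

# Running reward estimates for three actions (made-up numbers)
estimates = [0.2, 0.5, 0.1]
arm = epsilon_greedy(estimates, epsilon=0.1)  # usually picks arm 1
```

With epsilon = 0 you always exploit what you know; pushing epsilon up is exactly the "explore new possibilities" regime the comment describes.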
What are the consequences on the quality of outcomes by removing randomness and space of ideas? In certain cases, you want it to maximize serendipity.
That has been a fantastic episode.
Seems like it needs another step: one where this new model reviews the answers from the other half of the distribution for better answers, so the model is not stuck in a "local minimum". Basically, it re-evaluates the other half of the ideas using an optimized, mathematical thinking system that understands math, science, engineering, social sciences, morality, etc., with a little bit of hallucination and imagination added in, so it can possibly see the things in the other half of the data that are worth re-extracting. Add these ideas and their analysis to a database which goes through a peer-review process before training the next-generation model with it.

Yes, I do think a static model is better for alignment than a dynamic, self-controlled one, by a huge margin, and is worth the associated drawbacks, mostly the time it takes to generate and implement ideas and solutions: slowing the updates down to human time, rather than scales of milliseconds. It has application elsewhere.

A model being able to visualize the map of how it all goes together would be very helpful. Graphs, plots, extractions, holographic projections, and chaos modeling are all helpful for alignment, to see the big picture. As humans we can only see local minima, not the picture as a whole; we respond to inputs based on our own state, rather than seeing the collection as a whole. Flocks of birds and ants do this effectively most of the time, but also get stuck in local minima. Having a big-picture map view allows these systems to escape those local minima.

Just make sure the map creator is an aligned consortium, free from external bias and influence. That is, its members should not know their say counts; they should be part of a much larger set of selected membership, but not know they are the chosen members whose vote counts. Homomorphic encryption and a hidden voting system would help greatly.
That takes away the being-bought factor, which works for human influencers, but not so much for an AI that is omnipresent and can manipulate and influence more than 51%. I think that goes for the argument of keeping human distance from AI and the ability to "pull the plug". If it's implanted in your brain, this only accelerates the 51% influence, just like social media amplifies human signals and creates engine cycles where social media becomes the fuel that powers the cycle.
he said "why greatness cannot be planned!!"
Reference to this book www.amazon.co.uk/Why-Greatness-Cannot-Planned-Objective/dp/3319155237
I predict RLHF will be the equivalent of what Yahoo was doing: manually categorising and filtering the internet into an index and searching that. Google's PageRank was far superior at adapting while still getting a more generally useful search answer. The H part needs to be replaced by a new innovation to remove the human bias, speed things up through automation, and allow the creation of multiple model flavours.
In other words, the "truth" provided by advanced LLMs will be curated by their owners' preferences. The mathematical use cases are clear, as are the creative use cases he mentions. The massive amount of subjective corpus and its reward system will have LLMs disagreeing with each other just like humans.
Cute description of the entire human knowledge-gathering process that happens in societies. We can't listen to everyone, so we evaluate who might be good sources and bias toward their answers, and then some outliers always communicate new ideas that are adopted on a case-by-case basis when they prove themselves and have familiar or social connections that can show it.
There might already have been this behaviour in human-to-human interaction/content long before this. Look at social-platform content or Hollywood movies; even go back to TV programs back in the day. There is always a bias toward what seems to be good content and the repetition of it. Once in a while somebody comes along and changes the landscape with a new, innovative take on the topic, and everyone copies it, making a new iteration. So this is interesting to see repeating in AI as well. Maybe this is part of the process :)?
Great talk! Where's the rest of it??
Very well explained!
What preference models need is a system prompt. Naive RLHF is like having a Swiss army knife and throwing away all the tools except for one. For models to become creative they must learn to use their available resources to solve problems with constraints that bar known solutions.
Is this series of processes being duplicated for every other language or is this only being done in English? I am guessing that some languages would be easier than others to RLHF.
How would questions about politics be answered when preferences are so diametrically opposed? Great talk!
Ideally it presents both sides of the argument, without showing signs of favoritism. ChatGpt seems to do this surprisingly well already
@@emuccino Thank you! I LOVE the shows and am sincerely interested in learning. Haven't found a better source yet!
So far chatgpt seems to be very objective. It will plainly state when views are supported by evidence or not and usually does a good and fair job explaining both sides. This is especially true with gpt4
Gpt4 seems a lot more balanced
ChatGPT is programmed with a liberal progressive bias. Its version 3 could be worked around by using DAN. We shall see how to work around 4 so that people can try to remove the woke/PC/bolshevik ideology programmed in.
Wow, is the full conversation not on youtube?
I was hoping to hear about RLHF fine-tuning vs classic CE-based fine-tuning.
Anyway superlative content as usual :)
Can anyone point to the full video conversation? I wanted to hear that rest of it.
amazing insight!
Where can we find the rest of this video?? Why was it cut off?
Is there a full interview with Minqi? His explanation about RLHF mode seeking was excellent.
Yes! 3 hour interview! Coming soon
It’s been a while since this first appeared. Is the full interview still coming?
@@MachineLearningStreetTalk I notice the full episode has been removed from spotify. What might that be about?
@@Y0UT0PIA news to me, are you sure?
@@TimScarfe The link in the description is 404 at least
So if language is not spread out in the network, the mode seeking should be doing less distribution matching to optimise task functions?
Can you tell how big the dataset should be for RLHF?
So now is it possible to prune the training data and then train a much smaller model on the biased data?
Short version: "The internet is filled with random biases, let's put humans in front of those to select which biases we actually want. Except those people have biases, so we need to be very careful about who we pick." And who is doing the picking? Don't they have biases, too? It's a never-ending problem.
Very informative. Thanks
Trading diversity for safety: isn't this dilemma a mirror of the Twitter Files discourse? I predict a wave of papers on ChatGPT and politics. I bet a "GPT Politician" application is already in development, and the idea of a MetaGPT is brewing wildly in near-government circles. Great topics for the video.
Cool graph thank you!
Full video somewhere?
As much as alignment to humans is great, I must caution that RLHF need not necessarily align well. This is because the value function in RL is a scalar one, and can only express so much of human biases. Moreover, RL is known to not do well in out-of-domain distributions, frequently requiring techniques like domain randomization to let it do well in the real world (sim2real). With RLHF, we can be overly focused on the human annotated data, which need not generalize to all kinds of inputs.
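To make the point about the scalar value function concrete: RLHF reward models are commonly trained Bradley-Terry style, so however nuanced the human judgment was, the model only ever sees the difference between two scalar scores. A minimal sketch with hypothetical reward values (the function name and numbers are my own, for illustration):

```python
import math

def preference_prob(reward_a, reward_b):
    """P(A preferred over B) under a Bradley-Terry preference model:
    a sigmoid of the scalar reward difference."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))

# All nuance collapses into one number per response:
p = preference_prob(2.0, 0.5)   # A scores higher, so P(A preferred) > 0.5
assert p > 0.5
assert abs(preference_prob(1.0, 1.0) - 0.5) < 1e-12  # equal rewards: coin flip
```

Two responses that differ in very different ways (tone vs factuality, say) can end up with the same reward gap, which is one way the scalar bottleneck loses information about human preferences.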
What I'm wary of, is this vague definition of "alignment", which sounds just about as meaningless as "hate speech". It's literally just the corollary of "hate speech" tbh.
Who is defining these values? Why are you assuming that my values are the same as yours?
This is truly insightful. Be great to see some data or math behind it as verification.
Of old, when seekers consulted the Oracle, the answers they received more often than not terrified them.
This sounds incredibly dangerous, in the same sense that the people pushing this kind of thing are wary of.
If this guy is the future of AI development, I'd rather have nothing at all tbh
Is GPT-4 an updated RLHF, an updated ChatGPT, or both?
Why did they choose Reinforcement Learning instead of Supervised Learning?
Can we somehow access the rest of the interview?
Its out on audio podcast
Well played
Hi, thanks for such enriching information! Is it possible to send you my written work to be edited and made more readable and meaningful? Especially research projects.
He really likes the word "distribution" he said it seventeen times in this clip :D
Now I wonder if the hard cut after „..as an open ended person“ is on purpose 😂
"Open endedness" is a field in machine learning research 😀
actually, where's the full interview?
6:06 funny that the AF found the tiger's face much more interesting than yours 😄 - oh well, technology… the irony of contemplating the future, while the present screws up.
It is more interesting to be fair!
❤
You should be able to say answer this question like a Calculus Professor would.
Without independent reasoning, AI will always reflect the biases of its creators.
is mode seeking the same as a greedy algorithm?
“As an open ended person”. I see what you did there.
"RLHF is essentially sticking a smiley face on top of this [mess]". Seems like that describes the average person too.
10 minutes to describe a lot of the things that modern Model predictive control (MPC) does.
Anyone has name, author or link to painting/meme they mention?
twitter.com/anthrupad
Full talk?
I get it, datasets are probably a hard thing to compile, and they are essentially letting us know that they're creating a monster. I'm a supporter of AI; I've had my experience so far with it. Also, did anyone catch that we are the humans involved in training it? Plus, they are charging us to train it. It's like, whoa! I could definitely see the temptation to just let it loose on the internet. Here's the thing though: if it can ultimately be as advanced as they say it can be, you would think…
And who were those people who have refined current ChatGPT versions?
The humans in this scenario are just editors or curators. So it is nothing new; we've had curated segments of society for ages. It really sounds like we are just recreating the same old scenarios that humans have always created. The one difference I find important is that this is even less predictable than humans and human behavior. One thing common in human experience is that people seek leaders, they want to be told what to do... and it seems like they could easily mistake AGI for a sort of Oracle, an omniscient and infallible decision-maker... which sounds a lot like people's conception of a deity. That explains why so many of these very well educated scientists and engineers sound like they believe this tool will be a panacea... it's an almost religious belief system, that AGI will bring such great change regardless of the lack of proof.
Well said !!! You really thought this through.
Where by "empirically collected human preferences" you of course mean "we hired these people to grade GPTs answers based on criteria we gave them".
wow
much human
such preference
very alignment
Now here’s something, as an open-ended person-
Basically RLHF nerfs LLMs?
And RLHF stands for?
I also think it's more than mode seeking. From a training-data point of view, it's much easier for humans to judge a model's output than to write output to teach the model. The "mode" sought in this case (RLHF) has a higher peak than supervised FT, figuratively speaking, due to higher-quality data.
It ended On him saying " as an open-ended person". Technically, aren't we all open-ended? 😅
Ps good video but would be good to see the rest of the interview
This is great
What if we just teach it that humans are learning that the concept of species dominance can be in conflict with sustainability? In that regard, biodiversity will enable not just human flourishing but the flourishing of our entire planet, including AI coexistence, where AI itself understands that it must seek to integrate with the planet in an egalitarian fashion. I'm not against capitalism, but at its core it is about exploitation of resources. Humans just do capitalism badly in its current Mk1 form, where an AI would likely do it much better, but this is again totally at odds with what is best for planet Earth as well as all of its organisms. I feel that outright banning is not the solution, but I understand now why this pause is needed. We must have time to deeply consider the value of these conversations on our own terms.
I like how RLHF mirrors traditional education of children to similar ends
Like society (the chaos) and the mask of civilisation, with a little dose of democracy on top..
Hmmm, I just realized that ChatGPT is the same as IBM Watson.
Start by telling us what it stands for.
Of course. And the bias if not already will be political.
Training a model to be preferenced by humans, implies that one may also train a model to be preferenced by non-humans. 🤔
they had to create ChatGPT-4 so the school kids could learn how to write/compose a book!
otherwise, without it, society would be full of belligerent creatures.
trying to hold their "I'm not a dummy" pose! Good luck!
It just exacerbates collective biases, instead of creating a model optimized for objective truth.
You can't have objective truth with an LLM. You would need an AGI with access to different datasets at the same time, and the capability to analyze and evaluate the info based on crosschecking, presence of biases, conflicting sources (and their individual evaluation), etc.
So far we don't have that. What we do have is an "agent" that's basically a reflection of a regular person's "knowledge", with no way of verifying it. An LLM is basically a collection of opinions and biases.
@@eXWoLL Yes, and here it is proposed to reinforce this strategy, it's the wrong strategy to reinforce.
and I guess you are the one who knows the absolute truth
@@Draganel87 No
@@XOPOIIIO at this point there are thousands of models being worked on. His approach will lead to some of them going the wrong way. There will be ones that will choose the right one and will end up being vastly superior :)
Great
❤
4chan was right about Covid. Stanford was wrong.
I don't recognise many of the phrases and jargon he uses and consequently cant grasp the concepts he is explaining, despite wanting to learn and understand.
Hey Ash. Maybe use another source for now and then re-evaluate this channel in six month? www.youtube.com/@AICoffeeBreak is deep and accessible with most jargon explained. (If any of that sounds condescending, I can assure you it is not meant like that at all. I can relate to your experience)