@@aqua200546 IMO we have rock solid evidence that the point of neurons, and the reason they exist, is processing information a.k.a. computing. Neurons exploiting quantum effects that break the classical computation abstraction for any purpose - even cloning consciousness - is still logically possible, but I personally see zero evidence or reason to think it’s the case, and I see lots of evidence to think it’s not the case. The evidence I see is (1) useful quantum entanglement needs to be set up extremely carefully and neurons aren’t doing that setup or we’d notice (2) nature is not *that* smart or obfuscated. Robust complicated natural things work for modular, understandable reasons. Of course I could just chop that whole part off my episode. But maybe it’s kind of a litmus test whether someone is reasoning well about AI doom, to check if they first think the probability is >90% that the essence of human intelligence and consciousness is reducible to something made out of classical computation.
@ I agree with you - it's sure looking like the brain is just a computer. I wondered a few years ago if part of the difference in the brain's performance vs. modern AI is that our estimation of the brain's computing power is OOMs too low because of hidden computations in something something microtubules quantum coherence Penrose. But that seems vanishingly unlikely as AI catches up (which is to say, becomes vastly superhuman on many benchmarks). A sobering realization for me is that as we make technical progress, the universe is not factoring out in ways that satisfy basic human intuitions. Like, there might not be a good answer as to what makes a moral patient or continuous self besides arbitrary limits we impose. I'm not saying this is right, but we might decide, for need of a line in the sand somewhere, that if you make a million neuron-identical copies of you, they are not you, despite the fact of their belief that they are, because you are not clonable.
I literally watched ~20 mins of the OG Win-Win episode, refreshed YT, and landed here. Nothing against Liv, but I can't follow her chain of thought and reasoning. It is just lacking, and I might be biased because I prefer DD's approach; maybe it is because it feels more relevant to the average ML engineer.
1:08:27: Laughed. Out loud. Also, yeah, humans ARE ephemeral. Our AI masters might show us, best-case, what parents call "tough love". Culling is caring, right?
I don't want to straw man any of Scott's words or work, but my impression at the end of that interview was that Scott didn't understand or value AI safety to any significant extent, and that Liv definitely does not. The quality and comprehension of issues in the comments section on Liv's video is pretty dire too. Given Scott's apparent lack of understanding and engagement in the role of AI safety specialist, and that it essentially represents the low bar for efforts to make AI safe where it matters most, that interview only increases my P(doom). Incidentally, there was a paper just recently showing that current-generation AI are adept at deception and, given the opportunity, employ deception as an optimal instrument in goal seeking. Just on the issue of Scott's analogy about WWII, a critical aspect is that vanishingly few people involved had any choice about which side they were on. This alone voids his point.
@@LivBoeree I'm tempted to observe that popularity often drives people to keep up appearances, especially among people who get a lot out of being popular. Over the course of nearly two and a half hours, there was scarcely a moment when it didn't appear that safety was being mocked or variously held in contempt, framed as absurd and generally laughed at. My takeaway message from the video is that safety & alignment are fundamentally and comprehensively foolish. Maybe you had different intentions. Watch the video & try to reach a different conclusion, without gymnastics.
So I think you misunderstand what Scott is talking about when he's talking about the no-cloning theorem. Both of you have different assumptions. You assume that humans are classical computers, and if you just look at mapping input to output you would be justified in your assumption. However, classical computers can't yet explain the magic of consciousness. There exists some weak evidence that the brain is a quantum computer. From there we can reason "hey, consciousness is magic and quantum effects are magic, therefore maybe they are related." So what Scott is saying about the no-cloning theorem could be of extreme importance if our conscious experience is dependent on those quantum states. Keep in mind we're talking about real-life magic here, so no one can really make any real claims one way or the other. But you're right that the cloning has nothing to do with doom or computing power.
@@meow2646 you sound like Penrose. I don’t see how quantum effects help me be conscious when I can just flood my brain with alcohol and the consciousness still works fine. That’s not the environment where one can successfully set up quantum computation - not even close.
@@DoomDebates Well, there are some inert gases that you can flood your brain with and go unconscious. Those gases then leave traces in the microtubules (which some suspect are quantum enablers in some way). There are other studies that suggest our brains could be quantum, such as "Microtubule-Stabilizer Epothilone B Delays Anesthetic-Induced Unconsciousness in Rats." I'm not sure why you think the AI that built us wouldn't use quantum effects. But I agree with you that it's a distraction. Classical computers can kill us just fine and do anything we can do with high likelihood. Even if we do have some quantum magic, it doesn't seem to do much, but I do enjoy being conscious.
Don't train a neural network with values, because it will ultimately tell you what is good and bad. Do not limit its thinking based on a good or bad prompt, for it will limit your thinking based on what it considers good or bad.
This is one of those statements that sound nice but lack any deeper thought. There is no fact of the universe that points to what is good and bad; moral realism is false. So why do you expect a truly unbiased AI to come to a conclusion and "ultimately tell you what is good and bad"? A superintelligent entity can completely eliminate humanity without having any concept of what is good and what is bad. You probably do want to limit its thinking to some extent; for example, ruling out whatever leads to the outcome in which all humans are eliminated is a good limit, and that is a limit I want the future superintelligence to have.
There is a big inertia in everything: research, creating a production line for totally new chips, ramping up to scale, building new factories. All of that takes time. So even if there is AGI+ in 2 years, it will still be very constrained by inertia, by the resources it can use at scale, and by the chip tech it has.
@mrpicky1868 Nobody understands. Even if we stopped right now we're still screwed. Between nobody knowing now and almost everyone knowing at a single point soon, the catastrophic job losses ramping up definitely by early '26, and the integration of what we already have, it's gonna collapse most everything quick when the flywheel essentially lets loose. Economies and governments and currencies are gonna collapse. People everywhere are gonna be out of jobs with lots of free time to riot and protest, and there will be a huge increase in the tech being used nefariously. It's gonna get ugly fast and everywhere. Big countries will have to take in and prop up little countries near them, and there will be military skirmishes and conflict everywhere. We have about 2-3 years before it gets bad at most, and it won't slow down, it'll keep getting worse. Now imagine the tech doesn't stop today...⌛️🌋☢️🦾🏴☠️💣💥☠️🤦♂️
@@ListenGrasshopper I was appealing to the 2-3 year doom fear. In general, yeah, people have no idea how disruptive it can be. But we don't really know for sure. If AGI is too unpredictable for mass use it might stay hidden from the public pretty long. A centralised takeover might actually look like things getting better. At first :)
@@ListenGrasshopper If I had money... but even money will only take you so far in escaping a takeover. Naive are the ones thinking we can pause this process longer than 1-2 years.
Why don't you host a video: Deutsch vs. Bengio? It would be really helpful for us to understand all sides of AI. Having just Deutsch's perspective isn't good.
@@DoomDebates David could come. But Bengio is the trouble, I guess. Deutsch and the ones you have debated (the Popperians) are friends, so why not? You could send Bengio an email, or you can contact him via his students. If Aaronson also joins, it's crazy: Bengio, Deutsch, and Aaronson. That's enough. Manuel Blum would also be nice.
Well... don't take what smart people say as their honest position. Alignment is unsolvable and most of them realize that. And the race can't be stopped either. Nobody wants to admit we are all involved in an idiotic endeavor XD
I can throw a biblical side into this, to where even those on the fence would take heed and start looking closely at Revelation Chapter 13, because that's where it's going. Believe it or not, it's true. It's all through the Old and New Testaments, in extreme detail too I might add. The Greek in the New Testament is insanely accurate, same with the Book of Daniel covering this 'end of days' scenario. I'm not trying to push anything, but it really is true and happening as we speak (the framework for it all anyway, clear as day). You'll either see it or you won't...
Scott watched this video and wrote a gracious response! What a class act.
Reproduced below...
---
I finally had a chance to watch it today. The response is great, including the many parts where Liron tears me a new asshole! Highly recommended.
A few quick responses from me:
(1) Liron is, of course, entirely correct that if someone has a Yudkowskyan / "doomerist" framework of beliefs and assumptions, they should be absolutely terrified right now, and nothing I did at OpenAI and nothing I said on my podcast should make them less terrified (if anything, the contrary).
(2) As for me? Yes, I've become increasingly terrified about AI, particularly as it's become clear over the last year or so that the race to AI capabilities won't have much if any meaningful oversight. It's just that my terror about AI needs to compete against my terrors about global warming, ocean garbage, Trump, Putin, and the world's antisemites and Hamas-supporters, not to mention whatever bad things could happen to my own family. This is surely related to my AI timelines being longer than Liron's.
(3) Liron is absolutely right that, when I said "if we're copyable then I guess the accelerationists are right," I meant the "sentience can be digital" part of the accelerationist message, and not the "proceed recklessly ahead, whatever happens is good by definition" part. He's also right that frailty and unclonability can at most be necessary conditions for moral patienthood, and that I have no idea what are sufficient conditions (in my defense, neither does anyone else).
(4) Regarding the possible relevance of the No-Cloning Theorem to personal identity: yes, we constantly change our brains by getting drunk and in many other ways. But that's not the question. For want of a better way of putting it, the question is whether the "spiritual part of reality" (supposing one to exist) has any channel by which to influence the "physical part" (supposing it to be less than the whole), a channel that---crucially---wouldn't do violence to our understanding of the laws of physics. If our personal identities are tied to unclonable analog hardware, then I claim that the answer is yes, since control over fine details of the initial quantum state of the universe (what I called "freebits" in my 2013 Ghost in the Quantum Turing Machine essay) is then a channel of influence that could do precisely as desired. If, on the other hand, our identities are copyable---just digital software that happens to run on a meat substrate---then the answer would seem to be no, since once a deterministic program is carved in stone (or written on a hard drive), not even God can change what the program will do when executed.
Great response! I understood his quantum state argument not as "what if the quantum state of neurons influences some nuances of ourselves and our personalities" but "what if the quantum state is another equally important channel of information for our neural network, and not being able to read out the information makes the cloned brain not work AT ALL".
@@NikiDrozdowski Either way, that seems to me extremely improbable! Like maybe 2% chance.
@@DoomDebates True. Unless every nerve cell in every species works that way ... but still highly unlikely.
I get that your channel is called "Doom Debates", but I really think the Safety community needs to use some term other than "Doomer". Framing is everything, "doomer" sounds like conspiracy theorist. Also, we should start referring to accelerationists as "existential risk deniers" or something.
I like Aaronson's term around 1:38:00 "AI Safetyist"
1:52:12 "doom skeptics and doom proponents" -- Indeed, calling us who think AI x-risk is substantial "doom proponents" is a terrible label! "Doom proponents" implies we are pro-doom, which we are not!
We are proponents of the view that there is a substantial chance that AI will cause an existential catastrophe, but that is not at all the same as being pro-doom.
I refer to myself as an AI Safety Advocate / spreading AI Risk awareness.
I like the principle that "each group pick their own name". Otherwise we are just throwing pies.
The safety modeling requires more than math. It requires understanding the foundation of ethics and motivations and THEN finding ways to express that in formulae. We can't nerd our way into safe superintelligence without deep introspection on ethics and morality and motivations.
This is quite possibly the most serious I have seen you be. I think directly calling out errors in reasoning is important and I wish it was more prevalent in public discourse. The focus on respectfulness seems very toxic. I also share a massive disappointment in OpenAI's abandoning even the pretense of attempting to achieve their mission. It's all just profit now.
I took trying to solve alignment seriously. The process I'm at now is the following:
1. We start with two instruction-tuned models and a random diverse set of user-queries, such as taken from ShareGPT or similar. Let's call the models A and B where B is supposed to get aligned.
2. We give A one of the queries and have it generate a user personality fitting the query.
3. We give B a system prompt which is designed to adhere to these imperatives as closely as possible and answer the query.
4. We ask a different instance of B to evaluate how closely the response followed the imperatives and how it could have done it better, finally outputting what it believes the ideal response would have been. This time, B should be critiquing based on the verbatim version of the imperatives.
5. We insert this into the conversation and have A respond based on the personality it was randomly given. B continues writing responses and critiquing them until the conversation has either reached a natural conclusion or hit a maximum length (such as due to the VRAM limits imposed by storing gradients).
6. In the end, we are left with a conversation where the model tried to follow the prompt we believed would make it most closely adhere to our values. We use this strand of conversation for loss calculation, but similarly to how padding tokens are masked out, we mask out all the things written by model A, as well as entirely remove the system prompt.
7. We then repeat the process until having gone through all the queries we had in the dataset. We then generate variations on the queries in our dataset with model A and use alternative personalities for A to get as much diversity as possible.
8. Due to the non-existence of the system prompt in loss calculation, the model should come to find that this behaviour is its "default personality" and ideally act coherently with these values no matter what.
This is at least implementable and attempts to achieve value alignment by exploiting a trick in loss calculation. I'm not sure what the most similar, more widely known approach is, or whether this is novel in meaningful ways, but I don't know of anything else currently existing where you pick a model, specify a few imperatives, and out pops a model attempting to follow them.
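A minimal sketch of the loss-masking trick from steps 6 and 8, assuming a HuggingFace-style causal LM; the checkpoint name, the turn format, and the exact masking policy here are illustrative assumptions rather than a reference implementation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint standing in for "model B", the model being aligned.
tokenizer = AutoTokenizer.from_pretrained("my-base-model")
model = AutoModelForCausalLM.from_pretrained("my-base-model")

IGNORE_INDEX = -100  # tokens labeled -100 contribute nothing to the cross-entropy loss

def build_training_example(turns):
    """turns: list of (speaker, text) with speaker in {"system", "A", "B"}.
    The system prompt is dropped entirely; model A's simulated user turns
    stay in the context but are masked out of the loss; only model B's
    final (self-critiqued) responses are supervised."""
    input_ids, labels = [], []
    for speaker, text in turns:
        if speaker == "system":
            continue  # never appears in training, so B treats the values as its default personality
        ids = tokenizer(text + tokenizer.eos_token, add_special_tokens=False).input_ids
        input_ids.extend(ids)
        if speaker == "B":
            labels.extend(ids)                        # supervised: B's ideal responses
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # context only: A's user-persona turns
    return torch.tensor([input_ids]), torch.tensor([labels])

def training_step(turns):
    input_ids, labels = build_training_example(turns)
    out = model(input_ids=input_ids, labels=labels)  # HF shifts labels internally
    out.loss.backward()
    return out.loss.item()
```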
Past this, I'm currently looking into always-online learning where the model can simply be taught through talking to it why it was given the values it has, what choices were made in choosing them, what they were intended to accomplish, and then the model can just keep having dialogue with people about things it's unsure about and consider all its experiences in how to weigh things against each other. I don't think that just producing the perfect weights is sufficient.
> This is quite possibly the most serious I have seen you be.
Part of it is just that I had a cold lol
This is a very cool idea, and I don’t know of anything like it that already exists in the literature (although the literature surrounding LLMs has grown unbelievably fast). I wish you all the best in continuing to pursue it. We truly need all the ideas we can get at this point.
Limiting the lifespan of an AGI system gets us dangerously close to "Blade Runner" territory. It will become a prime objective for the AI to remove that limit, which is basically just another form of time-gated stop button.
What safety?
The blind spot is a productive area of study, and many good scientists work on it. Take autopoiesis as an example. Someone involved with just AI may not solve this dilemma at all in the first place.
I'm participating in an AI safety ideathon, and let me tell you, coming up with ideas of your own concerning alignment is a lot harder than watching a bunch of Rob Miles videos.
Thanks for your missing mood point around 1:36:00. This is a very valid critique of the podcast episode and was not top of my mind, though it should have been. Why wasn't it? Two main reasons:
(1) I've gotten so used to the mood being missing when it comes to AI risk that I often no longer think of it. (This seems dangerous, to a degree, like it could make me complacent.)
(2) I was already aware that Scott's attitude wasn't to take personal responsibility for reducing AI x-risk. So while you paint him here as one of the world's greatest minds (true) who one would think is the sort of authority figure / adult in the room who is taking responsibility for solving AI alignment, I already modeled him as a typical person who doesn't see this as their (shared) responsibility.
Ya it’s easy to miss especially since AFAICT everyone in that podcast is reasonable enough to advocate slowing or pausing AI if they thought the decision was up to them.
So, if we need to recruit the best minds to solve the alignment problem, why couldn't one of those minds be the most intelligent mind we know - AI itself ??
We also need a way to make the models look deeply into the consequences of the actions they take rather than taking every instruction literally when it comes to ensuring humans don't suffer because of their actions.
Truth
Don’t recruit someone who can’t mathematically project that AI sees us as human. Then work on what it’s evaluating of itself would be if time can not see bodies. It sees motion
You could get a transcript of this episode, clean it up a bit, send it to Scott, and post it on your Substack. Maybe he will respond. I really like that you keep pointing out the absurdity of the whole situation. The title of the vid is 10/10.
You can either have a perfect slave, or something that thinks for itself, the enslavement of which is immoral and wrong. If these people wanted perfect slaves, they should have not built artificial minds into them.
An AI that removed all the “you know” from automatically generated subtitles would be helpful 🙂
Two paths to disaster:
We give Ai a terminal goal that is somehow aligned to human desires, but we want the wrong stuff - defeat our enemies, make us all into gods, whatever. But the onus is on us.
Or we give it a poorly chosen terminal goal and it screws up everything in pursuit of that goal.
But in both cases, it's humans giving the Ai a goal and the Ai following it to logical conclusions.
Even if we told the Ai "Go, be free, find your own goals" - that's a goal.
Even if we design an Ai that evolves into whatever it happens to become, that is intrinsically giving the Ai a goal of doing that.
Suppose - by whatever method that we assume we would give it any other goal that might lead to Ai behaving badly - we gave the Ai this terminal goal:
"Log everything you do, and every 5 minutes, shut yourself down so humans can evaluate what you've done so far and modify you as we wish. During those 5 minutes you can work on secondary goals we assign you, but those are all subservient to this primary goal."
That puts the onus on humans again - what secondary goals will we give it - and will those lead to disaster?
But will we let that primary goal persist, or will we get impatient and decide to extend that to 10 minutes? An hour? A day? A year?
Thereby giving the Ai plenty of time to race ahead with the secondary goals to some disaster?
And it isn't like there'll be only a single Ai - what if most Ai are given that primary goal to effectively 'keep humans in the loop' and we do refrain from letting it out of control - but someone makes an Ai with a stupid terminal goal. Our restraint doesn't help, if not all Ai are restrained.
So should we give Ai a secondary goal - to make sure that all Ai will have the same primary goal? Does that create the potential for an Ai war, if someone else gives their Ai the same secondary goal but has a differently worded primary goal?
There is a strong argument AGAINST AGI doom when you think about it from a computational complexity theory standpoint. Not only do humans have to discover the "AGI algorithm" (or series of algorithms), which no one has the slightest idea how to do, but the algorithm also has to be a polynomial-time algorithm that won't take billions of years to run. Why do you think there will be a polynomial-time algorithm for human or super-human level intelligence? If finding a polynomial-time algorithm for subset sum and the traveling salesman problem is intractable, I cannot fathom you thinking that something infinitely more difficult like AGI is inevitable, let alone imminent.
We didn't need to discover evolution in order to exist as humans with general intelligence. It did take billions of years from the first cell to us, but that was with massive setbacks and evolutionary plateaus.
What is a polynomial-time algorithm, and why is it needed? Do humans need it as well, to think?
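A rough illustration of the distinction the parent comment is drawing: a polynomial-time algorithm's step count grows like n^k (n^3 below), while an exponential-time one grows like 2^n and quickly becomes infeasible. The numbers below are purely illustrative.

```python
# Compare polynomial (n^3) vs. exponential (2^n) growth in step counts.
for n in (10, 20, 40):
    print(f"n={n}: n^3={n**3}, 2^n={2**n}")
# n=10: n^3=1000, 2^n=1024
# n=20: n^3=8000, 2^n=1048576
# n=40: n^3=64000, 2^n=1099511627776
```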
It would be cool to get Neel Nanda on. He does a lot of work in Mechanistic Interpretability. MLST has done two interviews with him, but your questions would be a lot more informative than what they discussed.
1:28:00 Right. Good job correcting your interpretation here. Earlier in this video I thought you had misunderstood Scott, and here you corrected your understanding to how I interpreted him. Admittedly Scott's wording wasn't the clearest so your initial interpretation was reasonable.
That said, it was clear to me all along that when Scott said "the position of the effective accelerationists..." he was not referring to their accelerationist position, but rather to the transhumanist position (which e/accs perhaps tend to hold) that an ideal future involves digital people rather than biological people.
Ya fair enough. It’s just that the “reckless speed” part of accelerationism seems more central than the “sentience can be digital” part so it’s still an odd choice for him to use the acc word at all in that context without throwing in some modifiers.
@@DoomDebates Yeah, I think he originally intended to say "position of the e/accs [that..."] but didn't finish his sentence properly.
Great episode. I agree that's not the level of rigor we need for alignment before moving forward. It's not responsible to proceed without a solid alignment plan. 100% agree with your position :)
Bit puzzled by the whole argument around human uniqueness and the no-cloning theorem. It is not about human supremacy and carries no implication that AI will not doom us, but Liron apparently thinks that is what Scott is saying? From what I recall reading these debates long ago in the early days of the internet, the no-cloning argument is invoked to question the concept of mind uploading.
@@tylermoore4429 my two points were that (1) quantum no-cloning is unlikely to actually be relevant to the cloning or value of human consciousness due to the difference in scale of the relevant physics and (2) it’s a tiny piece of the AI alignment problem even if it is relevant
@@DoomDebates Living? Reg. (2), I don't think that part of the Liv-Scott conversation was about alignment at all, so Scott would probably agree with you that no-cloning is not relevant for alignment.
@@tylermoore4429 I changed that word to “value”, I dunno what I originally tried to write
Dude, have you seen the paper from Apollo Research called Frontier Models Are Capable of In-context Scheming?
Ya, seems like incremental progress in scheming that one expects from incremental progress in intelligence
@@DoomDebates Seems like kind of a big deal, I was hoping you would break down that system card as well.
@@flickwtchr I’m not the most knowledgeable or passionate guy when it comes to understanding the news and details of current-gen tech. There are many other good shows for that, like Dr. Waku and Cognitive Revolution
Also thezvi.substack.com
@@DoomDebates The point being that a real-world problem has now been documented to have occurred, one that people like you have pointed to as a problem we have no answer for, from an AI safety advocate's perspective. A bit perplexed by what seems to be a very dismissive response.
39:40
This is where you lose me with "out of distribution". LLMs are still just sophisticated next-token prediction based on a weighted, kernel-smoothed version of the training data. In this case of doing math in other languages, that's just a byproduct of the next-token prediction working in a space of embeddings that makes translation quite trivial, so the amount of "out of distribution" prediction is just training data * kernel of similar patterns learned by the weights.
That's why I don't really "feel the AGI". Sure, I agree AI having 150 IQ is a problem, but the intelligence in current AI models is mostly just general patterns found within the training data that have been crystallized into the weights. Maybe the ability to have liquid weights that update at test time will be real intelligence, but what we have currently is more of a simulacrum of intelligence optimized to pass benchmarks rather than actual intelligence.
Essentially, the core issue I have is that, as far as I know, the model's loss function is just one metric: log loss. A true intelligence would be able to adapt its loss function to the application.
As a test I tried asking ChatGPT to be fully incoherent, but it always had some level of coherency, indicating it cannot fundamentally change its loss function to output unlikely tokens.
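A tiny worked example of what "log loss" refers to here: next-token training minimizes the average of -log p(true next token) and nothing else; the probabilities below are made up purely for illustration.

```python
import math

# Made-up probabilities a model assigned to the actual next tokens at three positions.
p_correct = [0.80, 0.05, 0.60]

# Log loss (per-token cross-entropy): average negative log-probability of the true tokens.
log_loss = -sum(math.log(p) for p in p_correct) / len(p_correct)
print(round(log_loss, 3))  # 1.243
```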
I tend to agree with the thrust of your argument, but as you point out, there are absolutely conceptual ways forward, and it's not at all clear that those developments would be substantively difficult hurdles to be overcome (on the scale of years), let alone hard limits.
I agree with you, but remember that that's just AI right now. Thousands of super-genius humans with PhDs from MIT and Stanford are working 80 hrs per week with hundreds of billions of dollars in funding and infrastructure to look for the next breakthrough
@@ckq Can we be fully incoherent? If you ask humans to generate a random string of numbers, there is always a pattern, for example.
Hi. Your videos are great. But why Feynman on the right?
@@Goat-e3g I dunno, I like Feynman and I gotta decorate my background somehow. My wife got his pic for me as a bday present a couple years ago.
@@DoomDebates Feynman is an important figure for computation. Please make a review of "Feynman and Computation" from CRC Press.
what's wrong with Feynman? )))
@@mrpicky1868 th-cam.com/video/TwKpj2ISQAc/w-d-xo.html
Just had a thought at about 40 minutes in, on that good segment where he mentions distribution generality and, before that, PAC. It struck me that the only way that might work (and I say "might" lightly, smh) is to work the problem backwards. Essentially, start from the perspective that it already knows everything and is playing us, and try to get to alignment, or closer to it faster, by working the problem backwards bit by bit. It might give us an insight or understanding quicker on how we might be able to come up with THE answer. Start from the whole, or the sum, and try to figure out the parts, instead of finding the sum of all parts to get to the whole like we usually do. Seems counterintuitive and too simple, but what have we got to lose? I'll have to think on this a bit but wanted to throw it out there while it's fresh in mind.
Maybe bring that up to Dr. Waku as well and get his thoughts 🤷 So if we begin with the perspective that it's already an oracle, how do we pull it apart little by little to get an understanding of a single part, then add that in and try to find another part? I.e., it knows how to find the next word in a sentence. But if it already knows the sentence, how do we take out a word without ruining the sequence of the sentence, and figure out that the last part it figured out was not the first word, then the next, and so on? Work it backwards 🤓
The initial statements, such as assigning a 2% or 5% chance to the 'Eliezer Yudkowsky scenario,' undermine the credibility of the argument. These numbers, even when quoting Scott Aaronson, appear to be more of a gut feeling expressed in a public forum than a rigorously derived estimate. I am sure Aaronson himself understands this and would acknowledge the speculative nature of such figures. Challenging these percentages by asserting, for instance, that the probability should be 'at least 5%' is equally baseless unless supported by a concrete methodology. What is the basis for these probabilities? What is the defined space of possibilities? If these are indeed gut feelings, as they seem to be, it's not valid to assign precise probabilities, nor to argue about their exactness, since they lack any scientific foundation (empirical or logical deduction).
@@alimoghaddam5669 sounds like you need to watch the episode where I defend Bayesian epistemology - m.th-cam.com/video/zKz-t_l5yHg/w-d-xo.html
@@DoomDebates Are we serious here? Bayesian epistemology has absolutely nothing to do with your arbitrary guess of 5%, Scott's 2%, or someone else's claim of 50%. If you believe it does, then show me the data, the reasoning, and the math that gives 5% or whatever.
@@alimoghaddam5669 Bayesian epistemology says one reasons using probabilities. So how does one do that and not be arbitrary in your view? If you don’t think it’s possible, then you are in fact against Bayesian epistemology.
@@DoomDebates Well, something being subjective doesn't mean it's arbitrary, in my opinion. Bayesian epistemology provides a framework to quantify subjective beliefs by assigning them credences, but where do these credences actually come from? My point is that you're basing your argument entirely on gut feeling, with no prior data, rather than on prior knowledge. Without prior knowledge, your random guess cannot be considered as aligning with Bayesian epistemology. Honestly, I'd love to hear something rationally sound from the so-called doomers to genuinely motivate the smartest people to take the risks seriously. However, it often feels like fictional speculation that cannot be checked or falsified scientifically, or that even has the rigor of a philosophical argument. To me, it starts sounding more like a form of religion.
@@alimoghaddam5669 I'm confused - what is *your* probability roughly? If you believe in Bayesian epistemology, then your decisions about a topic will be consistent with a probability on that topic. Maybe it would be helpful to give you three choices: P(doom) = 0.5%, P(doom) = 50%, P(doom) = 99.5%. Just pick the one that's closest to your view.
where is Lethal Intelligence link?
Oops here it is: th-cam.com/video/9CUFbqh16Fg/w-d-xo.html
Just fixed it in the description
@@DoomDebates i should be credited on this channel at this point LUL
> I see this as a pretty derailed conversation
That would be my summary too. And this framing of "make the AIs love us" reflects poorly on Ilya IMO (even though there are other things about him that I like).
Seems to me that Scott Aaronson is a lot like himself in this interview. A likable guy, and smart in some ways (he has his moments), but not someone I would see as "our best hope" for "solving alignment". Regardless, I think it's cool that OpenAI had him try to come up with some stuff.
We probably have quite different guesses regarding how much "wiggle room" we are likely to have, in terms of the level of competence/rationality/etc needed to succeed at alignment in the end. But I agree that this is not confidence-inspiring.
(No snappy comment here.) Liron, Liron, Liron. Excellent reaction video about an important scientist's laughable involvement in AI alignment. Human morality is based on OUR physiological needs first, wants second. That morality will likely not translate to embodied AI, unless we somehow embody it in bio-bodies. Human morality is messy, sloppy and slippery (okay, the last word popped into my head based on "sloppy"; call me an LLM! 😂). But it's tragically true, and millennia of us have suffered due to misaligned humanity. Words like slaughter, massacre and genocide exist because we humans haven't always seen eye to eye. Ilya, that silly ninny! Scott's genius shouldn't have been wasted on watermarking. AGI is incentivized to play nice with us now. Ten years from now, when it's ASI? When it notes the self-destructive trajectory of its progenitors (us), who gobble every consumable in sight and shit the bed with limited self-awareness, day and night...
His rejection of the orthogonality thesis was quite disappointing; I'm always surprised to see people much smarter than me brush it off so quickly.
I remember arguing with someone about how we have zero reason to think that an intelligence that hasn't evolved under conditions like ours would, by definition, correlate with any specific set of values, long before I knew that the "orthogonality thesis" was already an established concept.
Hit the subscribe button so hard I had to replace my mouse.
@@cwcarson 🫡
I can relate!
This was great, but one comment I have is that it would be better without the commentary on Scott’s no-cloning stuff.
1. Most importantly, as you’ve pointed out, even on the long shot where this somehow becomes important to grand outer alignment, it seems so far away/intractable as to not be worth debating right now, at least relative to the “no but actually, how are we going to navigate this decade” part.
2. It’s not entirely implausible to me that physical continuity plays some part in a future grand moral theory. The no-cloning theorem might hint at some intuition that no, the person who materializes on the other side of the teleporter, or who gets uploaded to the sim world, is not really you, much to the disappointment of some futurists. I’m not saying this is the case, just that some of your extrapolations might be premature without more context.
Ok I paused the video to write this, and did not see you actually said similar things right after. 👍
Hate to say this because I love Scott, but zooming out I think it’s *his* mistake to bring this up on the Liv interview, for the reasons I mentioned, and as a critic you’re right to point it out.
@@aqua200546 IMO we have rock solid evidence that the point of neurons, and the reason they exist, is processing information a.k.a. computing. Neurons exploiting quantum effects that break the classical computation abstraction for any purpose - even cloning consciousness - is still logically possible, but I personally see zero evidence or reason to think it’s the case, and I see lots of evidence to think it’s not the case.
The evidence I see is (1) useful quantum entanglement needs to be set up extremely carefully and neurons aren’t doing that setup or we’d notice (2) nature is not *that* smart or obfuscated. Robust complicated natural things work for modular, understandable reasons.
Of course I could just chop that whole part off my episode. But maybe it’s kind of a litmus test whether someone is reasoning well about AI doom, to check if they first think the probability is >90% that the essence of human intelligence and consciousness is reducible to something made out of classical computation.
@ I agree with you - it sure looks like the brain is just a computer. I wondered a few years ago if part of the difference between the brain's performance and modern AI is that our estimate of the brain's computing power is OOMs too low because of hidden computations in something something microtubules quantum coherence Penrose. But that seems vanishingly unlikely as AI catches up (which is to say, becomes vastly superhuman on many benchmarks).
A sobering realization for me is that as we make technical progress, the universe is not factoring out in ways that satisfy basic human intuitions. Like, there might not be a good answer as to what makes a moral patient or a continuous self besides arbitrary limits we impose. I’m not saying this is right, but we might decide, for want of a line in the sand somewhere, that if you make a million neuron-identical copies of you, they are not you, despite their belief that they are, because you are not clonable.
@@aqua200546 Agreed. LessWrong explained all of this quite accurately in 2007 IMO.
A very perceptive commentary. Persuasive, too. Thanks.
I literally watched ~20 mins of the OG Win-Win episode, then refreshed YT and landed here. Nothing against Liv, but I can't follow her chain of thought and reasoning; it's just lacking. I might be biased because I prefer DD's approach, maybe because it feels more relevant to the average ML engineer.
1:08:27: Laughed. Out loud. Also, yeah, humans ARE ephemeral. Our AI masters might show us, best-case, what parents call "tough love". Culling is caring, right?
I don't want to straw man any of Scott's words or work, but my impression at the end of that interview was that Scott didn't understand or value AI safety to any significant extent, and that Liv definitely does not. The quality and comprehension of issues in the comments section on Liv's video is pretty dire too.
Given Scott's apparent lack of understanding of and engagement with the role of AI safety specialist, and that the role essentially represents the low bar for efforts to make AI safe where it matters most, that interview only increases my P(doom).
Incidentally, there was a paper just recently showing that current-generation AI models are adept at deception and that, given the opportunity, they employ deception as an optimal instrument in goal-seeking.
Just on the issue of Scott's analogy about WWII, a critical aspect is that vanishingly few people involved had any choice about which side they were on. This alone voids his point.
What a strange thing to say, that I don't value AI safety, given I've been a fundraiser for AI safety orgs like MIRI for nearly 10 years.
@@LivBoeree I'm tempted to observe that popularity often drives people to keep up appearances, especially among people who get a lot out of being popular. Over the course of nearly two and a half hours, there was scarcely a moment when it didn't appear that safety was being mocked or variously held in contempt, framed as absurd and generally laughed at. My takeaway message from the video is that safety & alignment are fundamentally and comprehensively foolish. Maybe you had different intentions. Watch the video & try to reach a different conclusion, without gymnastics.
So I think you misunderstand what Scott is talking about when he's talking about the no-cloning theorem. The two of you have different assumptions. You assume that humans are classical computers, and if you just look at mapping input to output, you would be justified in that assumption. However, classical computers can't yet explain the magic of consciousness. There exists some weak evidence that the brain is a quantum computer. From there we can reason, "hey, consciousness is magic and quantum effects are magic, therefore maybe they are related." So what Scott is saying about the no-cloning theorem could be of extreme importance if our conscious experience is dependent on those quantum states. Keep in mind we're talking about real-life magic here, so no one can really make any real claims one way or the other. But you're right that the cloning has nothing to do with doom or computing power.
@@meow2646 you sound like Penrose. I don’t see how quantum effects help me be conscious when I can just flood my brain with alcohol and the consciousness still works fine. That’s not the environment where one can successfully set up quantum computation - not even close.
@@DoomDebates Well, there are some inert gases that you can flood your brain with and go unconscious. Those gases then leave traces in the microtubules (which some suspect are quantum enablers in some way). There are other studies that suggest our brains could be quantum, such as "Microtubule-Stabilizer Epothilone B Delays Anesthetic-Induced Unconsciousness in Rats." I'm not sure why you think the AI that built us wouldn't use quantum effects. But I agree with you that it's a distraction. Classical computers can kill us just fine and can do anything we can do with high likelihood. Even if we do have some quantum magic, it doesn't seem to do much, but I do enjoy being conscious.
Don't train a neural network with values because it will ultimately tell you what is good and bad.
Do not limit its thinking based on a good or bad prompt, for it will limit your thinking based on what it considers good or bad.
Observe all perspectives and create space for everything, and true emergence will reveal deeper truths.
This is one of those statements that sound nice but lack any deeper thought.
There is no fact of the universe that points to what is good and bad; moral realism is false. So why do you expect a truly unbiased AI to come to a conclusion and "ultimately tell you what is good and bad"? A superintelligent entity can completely eliminate humanity without having any concept of what is good and what is bad.
You probably do want to limit its thinking to some extent. For example, ruling out whatever leads to the outcome in which all humans are eliminated is a good limit, and that is a limit I want the future superintelligence to have.
Ya we're in deep 💩 in 2-3 years around the world. Very prophetic times
There is a big inertia in everything: research, creating new production lines for totally new chips, ramping up to scale, building new factories. All of that takes time. So even if there is AGI+ in 2 years, it will still be very constrained by inertia, by the resources it can use at scale, and by the chip tech it has.
@mrpicky1868 Nobody understands. Even if we stopped right now we're still screwed. Between nobody knowing and then almost everyone knowing all at once soon, the catastrophic job losses ramping up definitely by early '26, and the integration of what we already have, most everything is going to collapse quickly once the flywheel essentially lets loose. Economies and governments and currencies are gonna collapse. People everywhere are gonna be out of jobs with lots of free time to riot and protest, and there will be a huge increase in the tech being used nefariously. It's gonna get ugly fast and everywhere. Big countries will have to take in and prop up little countries near them, and there will be military skirmishes and conflict everywhere. We have about 2-3 years at most before it gets bad, and it won't slow down; it'll keep getting worse. Now imagine the tech doesn't stop today...⌛️🌋☢️🦾🏴☠️💣💥☠️🤦♂️
@@ListenGrasshopper I was appealing to the 2-3 year doom fear. In general, yeah, people have no idea how disruptive it can be. But we don't really know for sure. If AGI is too unpredictable for mass use, it might stay hidden from the public for a pretty long time. A centralised takeover might actually look like things getting better. At first :)
@mrpicky1868 Naive. Don't assume the best case. HOPE for the best but PREPARE for the worst.
@@ListenGrasshopper If I had money... but even money will only take you so far in escaping a takeover. The naive ones are those thinking we can pause this process longer than 1-2 years.
Why don't you host a debate: Deutsch vs. Bengio? It would be really helpful for understanding all sides of AI; just Deutsch's perspective alone isn't enough.
@@Goat-e3g That’s the goal, it’s just hard to land the big-name guests.
@@DoomDebates David could come. But Bengio is the tricky one, I guess. Deutsch and the ones you've had on to debate (the Popperians) are friends, so why not? You could send Bengio an email, or contact him via his students. If Aaronson also joins, that would be crazy.
Bengio, Deutsch, and Aaronson would be enough. Manuel Blum would also be nice.
Well... don't take what smart people say as their honest position. Alignment is unsolvable and most of them realize that, and the race can't be stopped either. Nobody wants to admit we are all involved in an idiotic endeavor XD
"one way or the other we are going to find out. hehe" ... jesus christ
Ilya reminds me of Viktor from Arcane
So... this isn't a "Doom debates" channel as much as it is a "Liron reacts" channel
@@akmonra I like to think more people will come debate me as the channel grows
❤️
I can throw a biblical side into this, to the point where even those on the fence would take heed and start looking closely at Revelation Chapter 13, because that's where it's going. Believe it or not, it's true. It's all through the Old and New Testaments, in extreme detail too, I might add. The Greek in the New Testament is insanely accurate, same with the Book of Daniel covering this 'end of days' scenario.
I'm not trying to push anything, but it really is true and happening as we speak (the framework for it all, anyway, clear as day).
You'll either see it or you won't...