The brief rejoinder is that although GPT-4 was able to pass programming contest questions with something like 95% accuracy, that turned out to only be true for programming contest questions that came out before the date of the block of training data that GPT-4 was tested with.
When people tested it on rewordings, or brand new contest questions that haven't had a chance to be integrated into training data yet, GPT-4's accuracy and ability to solve them dropped to . . . ~0.05%. Similar to previous versions of GPT; this is a recurring test people run and you can find different results, methodologies, and discussions about them if you want to delve into specifics.
A lot of the "good" results we're seeing from ChatGPT right now are, in many ways, laser-focused data-poisoning, where complex answers have been fed directly into the training data for specific association with manually chosen questions from known question-sets.
This can often be checked somewhat directly by examining the "anti" space of some prompts (looking for the least-matching result for a prompt, rather than the most-matching), where a more "organic" result will have an anti space that is fairly "continuous", and a more poisoned result will have an incredibly sharp change after you wander "far" enough away. There's a lot of air-quotes there, because none of those terms are rigorously defined except "anti", but if you try it out for yourself, you'll get a feel for all those concepts soon enough.
You can also often google GPT's code samples and find them verbatim on Stack Exchange, or as submissions to code contests.
While I find the level of grammar that enables answer-recognition to work as well as it does to be a little impressive*, it doesn't seem as though GPT is capable of patching together any ideas of its own unless someone has already patched those specific ideas together in its training data somewhere. So much so that rewording a prompt for which it gave a perfectly good code sample, in such a way that the underlying problem hasn't changed, can often result in incorrect answers, or in no code sample being returned at all, where even a child could identify that it's the same problem. Try using names for variables like "delta" or "epsilon", for example, the mere use of which usually indicates a specific context in mathematics (Cauchy sequences).
. . . and remember, GPT doesn't do any thinking. If anyone's ever given a wrong answer to something anywhere within its training data, GPT will treat that as being as correct as any right answer in its training data, up to the number of times each answer is stated in its training set. For common questions or basic questions, answer popularity is often a good indicator of correctness. But for anything that's asked rarely enough that it's difficult to google, that's a lot more of a problem. GPT is strictly Garbage In, Garbage Out, and Unknown Prompt, Random Answer.
I encourage you to find weak points in whatever implementation of GPT or other LLMs you have access to, and play around with them and understand the weaknesses of the model better. You'll notice the same types of weaknesses peeking through even in spaces where GPT is "stronger" afterwards!
*and testing GPT as a language-learning machine, I'm not terribly impressed with it over previous versions of itself. While it seems to get natural language right a lot of the time *in its output*, it struggles a lot with interpreting problematic sentences in the input, which indicates that the output results only look as good as they do because of the scale of the training set and the internal sentence-size, outputting entire sentences from its training set rather than having learned how to correctly construct sentences. Some well-known examples of problematic sentences give good results, but their anti spaces are incredibly sharp . . .
@@MurderWho I'm going off claims about Gemini, not GPT-4. GPT-4 isn't the best at programming, but apparently Google's new AI is. There's more credit due as well: it's not like GPT is the best it can be "raw"; there exist tools that basically allow it to talk to itself and plan out higher-order thinking so it gets better results. (This is also on top of the fact that GPT-4 is kinda stupid compared to the behind-closed-doors version of it that isn't bogged down with useless "safety" features, which make it generally less capable of intelligence.) Some of Yannic's videos go into detail about these things. [Also, you mention Cauchy sequences, but I swear I've given some zero-context real analysis questions to GPT and it knew what I meant lol]
This video makes me suspect you've been talking to the free version of ChatGPT, which is extremely restricted/stupid compared to GPT-4. You didn't even seem to address the fact that there are multiple models. It's more or less like discussing Wolfram Alpha but then spending the entire time just getting results from phone keyboard autocomplete.
I did say in the video that I wasn’t a paying consumer. And the free version GPT-3.5 (and even older worse versions) were what the majority of people were using and what the media was mostly discussing, so I don’t see what’s wrong with me analyzing it
@@ComboClass You're gonna be lulled into a massive false sense of security if you think the free version derping around is even close to the actual tool people pay to use. I tried the free one too, deemed it glorified text autocomplete, and breathed a sigh of relief: humans still weren't obsolete. But just in case, I decided to try the paid version. Surely it wouldn't be that much better, I thought. It is vastly better. GPT-3.5 is dumb enough that you can't really compare it to a person. I'm concerned that a genuine case could be made that GPT-4 is smarter than your average human. Not smarter than a super smart human, but smarter than average. And, with GPT models, there's the issue that their AI assistant mode is taking a significant toll on their function: essentially they're a more general intelligence which is kinda just operating based on written instructions about how an AI assistant should behave when receiving your math problems. Using the APIs and a bit of tinkering, even GPT-3.5 turns into a really scary thing if it drops some of its operational handicap, so that would be the actual publicly available state of the art: these models, but with fewer rules, or more math-focused rules for answers.
@@ComboClass But, for what it's worth, 3.5 imo is pointless. All free tier models atm are GPT 3.5 level so they are basically terrible and useless, unless they've been fine-tuned for some particular use case.
At least it uses complete sentences 😂 although speaking live, it's easy to lose track of what you've already said and create run-on sentences or fragments.
It's scary how confident ChatGPT is at lying and saying general nonsense. And even scarier that some coworkers use it daily and take its output as gospel.
If there were only 'thousands' of graphs to color, why couldn't you assign each graph to a person (as there are billions of people, but only thousands of graphs)? If everyone could come up with a valid solution for their graph, then the problem is solved. If not, it may be possible to prove that a fifth color was required for some of the graphs (notably, the ones people got stuck on), in which case, the problem would also be solved. This would alleviate any controversy in the proof
Some graphs are big/complicated but that is theoretically possible. The difficulty would be getting that many humans to collaborate on the same mathematical project but that type of thing is happening more over time
GPT is a language processing model. We humans hold language so dear we equate it with general intelligence. But this isn't true for artificial intelligence. A Casio calculator is 100% accurate on arithmetic yet downright incapable of processing language. GPT is capable, for example, of giving accurate answers when it comes to history, literature, or biology, as it can understand the sources (which are written in natural language). In the future, there may be a mathematical proof AI that thinks of proofs by itself and expresses them in logic notation.
ChatGPT is already able to produce mathematical proofs. This video for some bonkers reason used the crippled free version, which is significantly less intelligent than GPT-4, so you got significantly crippled results. But yeah, the tech you speak of has been widely available for a long time already.
Anyone else unreasonably amused by his desperate attempts to increase the chaotic insanity? Like, "Shit! Shit! I set this on fire and it didn't even fall over!". The madman groans as he gingerly grasps the unburnt side of the conflagration. Flinging it into the air, he refuses to look at it. As if this would somehow betray his meddling to the camera.
you can make a free account and feed it the keys to your database and give it the source code to your website and get sued for using their intellectual property you made for them. classic.
Sometimes as an AI researcher I worry about the way the world is going with AI. It makes me happy to see how small minded it is to assume the digital world is more important than it is. It's good to see that there are people like you who are great at maths and much prefer to be out in nature.
7:56 yes, numbers that are both prime and perfect squares *are* pretty hard to find. i wonder why that might be
This channel is totally underrated.
Sharing this with my dad, who is a teacher's aide and had a student who was interested in the 4-color map theorem!
Note that it only applies if all the regions are like simple connected blobs. If you want to colour a country's exclaves with the same colour, then you can make maps that require arbitrarily many colours.
Water is also a region that isn't simple if you want all lakes to be the same colour.
In the episode I clarified “neighboring regions” so exclaves would be counted as separate things, and yeah lakes would also have to be counted as the same type of region (like if you are coloring “cities”, each disconnected exclave or lake would be considered its own city) which is why I think of it as “neighboring regions” and not “regions with the same name”
I don't know if it's the first paper published with use of a computer but this is one I like:
COUNTEREXAMPLE TO EULER'S CONJECTURE
ON SUMS OF LIKE POWERS
BY L. J. LANDER AND T. R. PARKIN
Communicated by J. D. Swift, June 27, 1966
A direct search on the CDC 6600 yielded
27^5 + 84^5 + 110^5 + 133^5 = 144^5
as the smallest instance in which four fifth powers sum to a fifth
power. This is a counterexample to a conjecture by Euler [1] that at
least n nth powers are required to sum to an nth power, n>2.
(Yes, this is the whole paper.)
Yeah I mentioned that paper in a previous episode, I love it haha.
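For anyone who wants to check the identity quoted from the Lander and Parkin paper without a CDC 6600, here is a minimal Python sketch. The variable names are only illustrative, and it verifies the single equation above rather than reproducing the authors' direct search.

```python
# Check the quoted identity with exact (arbitrary-precision) integer arithmetic.
terms = [27, 84, 110, 133]         # the four bases on the left-hand side
lhs = sum(t ** 5 for t in terms)   # 27^5 + 84^5 + 110^5 + 133^5
rhs = 144 ** 5

print(lhs, rhs, lhs == rhs)
```

Python integers do not overflow, so the comparison is exact rather than a floating-point approximation.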
The ChatGPT snippets remind me of the fact core from Portal 2. They become significantly more amusing if you read them in its voice.
Oh my god, I've never noticed before but you're totally right
Another excellent video Domotro. I really enjoyed the throwing of objects at you.. not just because it's fun to see random everyday objects as projectiles and your physical reaction to them being thrown at you, but because it feels like another layer of symbolic meaning to add to the pile, both in terms of the subject of the video as well as the whole channel, and just existence really.
Life throws stuff at us every day. Sometimes it feels incessant, other times there are gaps and so you don't see it coming. Sometimes it's only one thing.. easy-peasy. Other times there are simultaneous instances.. harder to identify and track two or more things at once.
Sometimes it's new things and we don't know how to react.
The explosion of computer learning models and eventually full AI onto the world scene is bound to create a situation where the rate of novel things being thrown at us simultaneously increases vastly.
We will look back and miss the days when it was just bagels. ❤
8:31 here is the misconception (i think AI companies use misleading terms deliberately). Prompts are neither questions nor instructions for the LLM. That's how we might formulate them, but the prompt is simply the starting point for a token-guessing game. The AI extrapolates language patterns (which often results in question/instruction followed by answer style text) but has no actual concept of questions or instructions and is therefore incapable of "following" instructions.
Very important topic - fundamental to education. Thanks. Been in machine learning and control systems, since the 80's ... and view these machine learning bits and digital computing - as tools - to augment what goes on in the mind - to help solve problems and further ideas from the mind. All the "old school" machine learning types I know ... electrical engineers (like myself) or mathematicians - all have strong mathematics skills and a true love of mathematics (and simulation). So all that "AI" being created is based on the work of mathematicians that enjoyed using their mind. In any case, great channel with lots of great topics. All the best to you and Cheers ...
Wow man, the fact that you compose your own music for the videos is super impressive. It’s really good and fits these videos perfectly.
6:10 Terrence Howard 🤣😂
yo! you have 38k subs now?! congrats! your videos are good as always!
7:14 made me audibly laugh, well done
The reason GPT is bad at math is that the integer tokenization you get when using BPE is incredibly bad. The model is basically doing the entire computation and then reversing the result to show you, while also seeing the numbers like
|546|78|+|234|5=|
where each token boundary is the pipe symbol in that expression. This is making it virtually impossible to learn math. If you take a model like llama with much better tokenization of numbers you can get a massively improved result on math benchmarks. There has been research in this space such as WizardMath
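To make the boundary problem described above concrete, here is a small Python sketch. It does not call any real tokenizer: the helper `split_into_chunks` and the chunk sizes are hypothetical, chosen by hand to mirror the |546|78|+|234|5=| example, and the per-digit split is shown only for contrast.

```python
def split_into_chunks(text, sizes):
    """Cut `text` into consecutive pieces of the given lengths (a stand-in for token boundaries)."""
    pieces, start = [], 0
    for size in sizes:
        pieces.append(text[start:start + size])
        start += size
    if start < len(text):
        pieces.append(text[start:])  # keep any leftover characters as a final piece
    return pieces

expression = "54678+2345="

# Hand-picked uneven split (an assumption), mirroring the example in the comment.
uneven = split_into_chunks(expression, [3, 2, 1, 3, 2])
# One character per piece, so each digit sits in its own piece.
per_digit = list(expression)

print("|" + "|".join(uneven) + "|")
print("|" + "|".join(per_digit) + "|")
```

In the uneven split, the pieces "546" and "234" cover different place values of their numbers, which is the misalignment the comment is pointing at; with one digit per piece, the place values can at least be lined up position by position.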
Most underrated YouTube channel!!
I don't think the ChatGPT responses are a result of minimizing computation time. ChatGPT is built to provide correct language, and doesn't build much in the way of a mathematical model for itself. For what ChatGPT is designed to do, everything it produced was (grammatically) correct. Any mathematical processing that resulted from this was just ChatGPT parroting its training data, and perhaps recognizing some "grammatical" relations between numbers.
Just like humans tend to do, you mean...? 🤔
While it's decent at grammar, it's very bad at logic. There were times (I don't think I showed these in the episode but they are saved on livestreams on my bonus Domotro channel) when it would say stuff along the lines of "...therefore, [so-and-so] is true" and then later in the same response say "...therefore, [that exact same so-and-so] is not true".
@@ComboClass the underlying mathematics of generative transformers is largely probabilistic, so it is simply computing what is the most likely character combination to follow after the previous one + some additional contexts + part-of-speech tagging, lemmatization, word stemming, removing stop words, etc.
The trouble with chat gpt is that it's not even good at grammar. It's literally just spouting nonsense that would probabilistically follow from the question asked, using its training data, which is basically the entire Internet, used without permission.
So, it's not using logic at all. It doesn't have grammar programmed in either. It doesn't actually even have real words. It has tokens, which are word parts; a few letters at a time usually, sometimes parts of two words.
@@ashheilborn Not sure what ChatGPT you use, but it is at least very decent at grammar.
Concerning it being a statistical model that spouts tokens: if I posit that that is also what your brain does, how would you argue against that?
I prefer the newer thumbnail pic for the video, though it's not the reason I clicked (I saw it earlier and waited until later when I wasn't busy). Great video as always!
Thanks! And yeah I made the original thumbnail picture quickly so that I could release the episode since it had been a while since the last one, but then today I edited it closer to how I wanted it (although the thumbnails are still not my expertise haha)
notice how the squirrel did not attempt to take another half of a walnut
Glad you're feeling better!
You asked it for interesting facts about the number one, and it didn't even mention that it's the loneliest number you could ever do?
Two can be as bad as one; it's the loneliest number since the number one.
23:10 Wait, did you eat that pastry off the floor?
7:48
I think for the statement about 1 it's possible it's mixed up because people need to keep reiterating that 1 isn't prime, while there's a lot less effort put into explaining that 4,6,8, and 9 aren't prime.
It's just taken the logic the wrong way around, 1 not being prime doesn't mean 1 is the only non-prime. That's a pretty major logical error though.
1 is a unit. That is different from 4, 6,.. that are composite (or from 2, 3,.. being prime).
The fact that 1 doesn't belong to the primes doesn't imply it belongs with the composites.
1 is an entirely different little animal. So, no surprise it is treated differently.
@landsgevaer
I get that, but the AI doesn't. It's trained on large volumes of text (most of which is written by teenagers, given how most internet demographics look), most of which is aimed at non-knowledgeable people.
So it doesn't see the reasoning very often, and it plugs in some reasonable-sounding nonsense based on what it does hear very often (which, in most places explaining primes, is the very explicit assertion that 1 isn't prime, listed alongside the definition).
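The unit/prime/composite distinction in this sub-thread can be written down in a few lines. This is only a sketch using trial division; the function name `classify` is made up for illustration.

```python
def classify(n):
    """Label a positive integer as 'unit', 'prime', or 'composite'."""
    if n < 1:
        raise ValueError("only positive integers are classified here")
    if n == 1:
        return "unit"            # 1 is neither prime nor composite
    d = 2
    while d * d <= n:            # trial division up to sqrt(n)
        if n % d == 0:
            return "composite"
        d += 1
    return "prime"

for n in range(1, 11):
    print(n, classify(n))
```

Run over 1 to 10, this puts 1 in a category of its own, while 4, 6, 8, 9, and 10 are all reported as composite, so "1 is not prime" never implies "1 is the only non-prime".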
7:58 I'm sorry, both prime and composite?
i love you man
So what I take away is that just because some computer, whether meat-based or silicon-based, can be right about some things, one shouldn't assume it's always right, no matter how confident it sounds.
Great topic! I love these lectures
ChatGPT getting info from Terrence Howard 😂
very good video!
i have a math video idea. how can complex base systems be used, if it’s possible to represent all numbers in the real and complex number fields without multiplying by a system of another base
The answer to the question for me, about computational proofs such as the map/node ones, is that it is no longer a mathematical proof, as it is falsifiable, but scientific evidence, as it must be falsifiable. But we are so, so sure of certain scientific evidence that it doesn't falter in our daily lives, so we take it as absolute proof.
Is it falsifiable?
As in: you might find an error/bug in it if you went through it?
How is that different from a proof by humans...?
Once the processes for generating the graphs and evaluating their colorability are both freely available, people can write their own versions and see whether they can reproduce the result. That's how most physical science works anyway, but it was a bit of a departure for mathematical proofs.
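As a rough idea of what "evaluating colorability" could look like for one small graph, here is a minimal backtracking sketch in Python. It is not the procedure used in the actual four-colour proof; the function `find_coloring` and the example wheel graph are assumptions made purely for illustration.

```python
def find_coloring(adjacency, num_colors):
    """Return a dict vertex -> color index using at most num_colors colors, or None if impossible."""
    vertices = list(adjacency)
    coloring = {}

    def backtrack(i):
        if i == len(vertices):
            return True                      # every vertex has a color
        v = vertices[i]
        for color in range(num_colors):
            # A color is allowed only if no already-colored neighbour uses it.
            if all(coloring.get(u) != color for u in adjacency[v]):
                coloring[v] = color
                if backtrack(i + 1):
                    return True
                del coloring[v]              # undo and try the next color
        return False

    return dict(coloring) if backtrack(0) else None

# Hypothetical example: a hub region touching a ring of five regions.
wheel = {
    "hub": ["a", "b", "c", "d", "e"],
    "a": ["hub", "b", "e"],
    "b": ["hub", "a", "c"],
    "c": ["hub", "b", "d"],
    "d": ["hub", "c", "e"],
    "e": ["hub", "d", "a"],
}

print(find_coloring(wheel, 3))   # no valid assignment with three colors
print(find_coloring(wheel, 4))   # some valid assignment with four colors
```

Anyone can rerun a check like this independently, which is the reproducibility point made above. The hard part of the real proof is the generation of the full set of configurations, which this sketch does not attempt.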
Is this Vsauce on a budget? JK this is an excellent analysis of language AIs.
Bing’s AI copilot seems to be actually really good at math. I haven’t tested it extensively, but from what I did, it was able to do basic arithmetic, algebra, and at least up to what I know in pre-calculus 😂
please do not set your house on fire
i have no interest in mathematics but i can't stop watching you, such unique energy
Don’t worry, there will be plenty of non-math episodes too. Although I love numbers, I also love other things (music, nature, games, philosophy, comedy, etc.) so there will be a variety of topics over time :)
@@ComboClass I've noticed! The random squirrel feeding segments are adorable, love seeing them little guys and the relationship you have with them! And as much as maths makes my brain hurt, it is fascinating to hear you talk about. I especially loved the connect 4 video and the one you made about how our clocks don't make any sense 😂 You've got a unique perspective on the world and it's great ❤️
Thanks! Glad you appreciate :)
The Ace Ventura of backyard mathematics 😀.
Dunning-Kruger effect in chatGPT
Dumbing-'Puter effect? 😛
It couldn't tell me a correct character count in a paragraph.
Dear ChatGPT, I am a freakin PhD Mathematician, so you better give me some serious time on my questions or else I will badmouth you to all my colleagues.
I'm not sure why, but 7:54 is very funny to me.
So chatgpt is the new wikipedia?
Nah Wikipedia is way more accurate, it actually cites its sources (plus just has less nonsense gibberish mixed in)
@@ComboClass I agree with you for now, but just like in Wiki's early days fighting the misinformation people would post, I have a feeling the AI will grow into more of a sensible version of itself, but still (always) needing the direct oversight by people to maintain accuracy
Actually, this is already a thing, if in its infancy. I have a gig where I'm paid to check prompts/responses to help improve accuracy, safety, and other things. Also any specific info requires citing
Wouldn't be a combo class video if something didn't catch on fire😂
Wow that AI must've talked to Terrence Howard.
GPT3 was pretty bad at maths, GPT4o can solve all your problems, it can solve the questions from my university textbook straight from picture, it also solved the maths needed for my masters thesis.
Only 40 views? What is the algorithm doing?
Maybe YouTube's A.I. wasn't happy with the video.
155 now!
@@Somebodyherefornow159
It will probably grow over time (the algorithm doesn’t promote my videos instantly yet and they take time to spread). If anyone wants to help, remember that extra comments and watch time will help the algorithm like this channel more. In any case, I’m happy that all of you are watching/appreciating :)
@@ComboClass Commenting for support! I’d love an episode on fractional calculus!
gpt3.5 isn't known for math. The highest order things consumers can get is some of those programs that rebound GPT-4 to force it to double and triple check itself, which you can get some pretty good results with. Apparently gemini can beat like 95% of competitive programmers on novel problems when given a massive amount of compute power, which easily requires very very strong mathematical abilities. Personally I'm freaked out, it was like 5 years ago that the highest end AIs had the IQ of a literal rodent, now there's stuff that exists (albeit lots of it behind closed doors) that's smarter than people at most things. Really makes me think the 2nd Coming is close
The second coming?
Bro...
The brief rejoinder is that although GPT-4 was able to pass programming contest questions with something like 95% accuracy, that turned out to only be true for programming contest questions that came out before the date of the block of training data that GPT-4 was tested with.
When people tested it on rewordings, or brand new contest questions that haven't had a chance to be integrated into training data yet, GPT-4's accuracy and ability to solve them dropped to . . . ~0.05%. Similar to previous versions of GPT; this is a recurring test people run and you can find different results, methodologies, and discussions about them if you want to delve into specifics.
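To make that kind of before/after comparison concrete, here's a minimal sketch of how you could split contest results at a training cutoff date and compare accuracy on each side. The names (`Result`, `accuracy_by_cutoff`) and the cutoff date are my own assumptions for illustration, not taken from any of the actual studies, and the records in the usage lines are made-up placeholders, not real measurements.

```python
from datetime import date
from typing import Dict, Iterable, NamedTuple, Optional


class Result(NamedTuple):
    """One graded attempt (hypothetical record layout)."""
    released: date  # when the contest problem was published
    solved: bool    # whether the model's answer passed


def accuracy_by_cutoff(
    results: Iterable[Result], cutoff: date
) -> Dict[str, Optional[float]]:
    """Split results at the training cutoff and return accuracy per side.

    A side with no problems gets None instead of a number.
    """
    buckets: Dict[str, list] = {"before_cutoff": [], "after_cutoff": []}
    for r in results:
        key = "before_cutoff" if r.released <= cutoff else "after_cutoff"
        buckets[key].append(r.solved)
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}


# Made-up placeholder records, only to show the call shape.
toy = [
    Result(date(2020, 1, 1), True),
    Result(date(2020, 6, 1), True),
    Result(date(2022, 1, 1), False),
    Result(date(2022, 6, 1), True),
]
print(accuracy_by_cutoff(toy, cutoff=date(2021, 1, 1)))
```

A big gap between the two numbers on real data is the signal people look for, though it doesn't by itself prove memorization, since newer problems could also just be harder.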
A lot of the "good" results we're seeing in many ways from chatGPT right now are laser-focused data-poisoning, where complex answers have been fed directly into the training data for specific association with manually chosen questions from known question-sets. This can often be examined somewhat directly by examining the "anti" space of some prompts, (looking for the least-matching result for a prompt, rather than the most-matching), where a more "organic" result will have an anti space that is fairly "continuous", and a more poisoned result will have an incredibly sharp change after you wander "far" enough away. There's a lot of air-quotes there, because none of those terms are rigorously defined except "anti", but if you try it out for yourself, you'll realize all those concepts for yourself soon enough.
You can also often google GPT's code samples and find them verbatim on stackexchange, or as submissions to code contests.
While I find the level of grammar that enables answer-recognition to work as well as it does to be a little impressive*, it doesn't seem as though GPT is capable of patching together any ideas of its own unless someone already has patched those specific ideas together in its training data somewhere. To the point where rewording a prompt for which it gave a perfectly good code sample, in such a way that the underlying problem hasn't been changed, can often result in incorrect answers, or no longer returning a code sample at all, where even a child could identify that it's the same problem. Try using names for variables like "delta" or "epsilon", for example, the mere use of which usually indicates a specific context in mathematics (Cauchy sequences).
. . . and remember, GPT doesn't do any thinking. If anyone's ever said a wrong answer to something anywhere within its training data, GPT will think that's as correct as any right answer in its training data, up to the number of times each answer is stated in its training set. For common questions or basic questions, answer popularity is often a good indicator of correctness. But for anything that's rarely enough asked that it's difficult to google, that's a lot more of a problem.
GPT is strictly Garbage In, Garbage Out, and Unknown Prompt, Random Answer.
I encourage you to find weak points in whatever implementation of GPT or other LLMs you have access to, and play around with them and understand the weaknesses of the model better. You'll notice the same types of weaknesses peeking through even in spaces where GPT is "stronger" afterwards!
*and testing GPT as a language-learning machine, I'm not terribly impressed with it over previous versions of itself. While it seems to get natural language right a lot of the time *in its output*, it struggles a lot with interpreting problematic sentences in the input, which indicates that the output results only look as good as they do because of the scale of the training set and the internal sentence-size, outputting entire sentences from its training set rather than having learned how to correctly construct sentences. Some well-known examples of problematic sentences give good results, but their anti spaces are incredibly sharp . . .
@@MurderWho you nailed it. GPT is mostly useful as a glorified search engine for previously solved problems
@@MurderWho I'm going off claims about gemini not GPT-4, GPT-4 isn't the best at programming but apparently google's new ai is. there's more credit due as well, it's not like GPT is the best it can be "raw", there exists tools that basically allow it to talk to itself and plan out higher order thinking so it gets better results. (this is also on top of the fact GPT-4 is kinda stupid compared to the behind-closed-doors version of it that isn't bogged down with useless "safety" features which makes it generally less capable of intelligence)
Some of Yannic's videos go into detail abt these things.
[also you mention Cauchy sequences but I swear I've given some 0 context real analysis questions to GPT and it knew what I meant lol]
This video makes me suspect you've been talking to the free version of ChatGPT, which is extremely restricted/stupid compared to GPT-4. You didn't seem to at least address the fact that there are multiple models. It's more or less like discussing Wolfram Alpha but then spending the entire time just getting results from phone keyboard autocomplete.
I did say in the video that I wasn’t a paying consumer. And the free version GPT-3.5 (and even older worse versions) were what the majority of people were using and what the media was mostly discussing, so I don’t see what’s wrong with me analyzing it
@@ComboClass You're gonna be lulled to a massive false sense of security if you think the free version derping around is even close to the actual tool people pay to use. I tried the free one too, deemed it glorified text autocomplete, and had a sigh of relief, humans still weren't obsolete, but just in case I decided to try the paid version. Surely it wouldn't be that much better, I thought.
It is vastly better. GPT 3.5 is dumb enough that you can't compare it to a person really. I'm concerned GPT-4 has a genuine case that could be made that it's smarter than your average human. Not smarter than a super smart human, but, smarter than average. And, with GPT models, there's the issue that their AI assistant mode is taking a significant toll on their function, essentially they're a more general intelligence which is kinda just operating based on written instructions about how an AI assistant should behave when receiving your math problems. Using the APIs and a bit of tinkering, even GPT 3.5 turns into a really scary thing if it drops some of its operational handicap, so that would be the actual publicly available state of the art, these models but with fewer rules, or more math-focused rules for answers.
@@ComboClass But, for what it's worth, 3.5 imo is pointless. All free tier models atm are GPT 3.5 level so they are basically terrible and useless, unless they've been fine-tuned for some particular use case.
Yes i too got basic math answers wrong and questioned it and it admitted its mistake
I see a cat. :D
Omg fam😂 I had the exact same problem!!!
But low key, one thing that ChatGPT was right about (by accident, and for completely the wrong reasons) is that 1×1=2 is actually a true statement...🤯
-1/2 is negative half
Seems AI gives a word salad response like politicians.
At least it uses complete sentences 😂 although speaking live, it's easy to lose track of what you've already said and create run-on sentences or fragments.
I have a certain level of prejudice against using my brain myself (is hard) so I guess I'll have to side with the AI on this one.
A Meal + half a meal = a Meal 😊
Me when I'm Domotro from Combo Class
It's scary how confident ChatGPT is at lying and saying general nonsense. And even scarier that some coworkers use it daily and take its output as gospel.
If there were only 'thousands' of graphs to color, why couldn't you assign each graph to a person (as there are billions of people, but only thousands of graphs)? If everyone could come up with a valid solution for their graph, then the problem is solved. If not, it may be possible to prove that a fifth color was required for some of the graphs (notably, the ones people got stuck on), in which case, the problem would also be solved. This would alleviate any controversy in the proof
Some graphs are big/complicated but that is theoretically possible. The difficulty would be getting that many humans to collaborate on the same mathematical project but that type of thing is happening more over time
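For anyone who wants to try the "one graph per person" idea on a small case, here's a minimal sketch in Python: a checker that confirms a proposed coloring is valid, plus a simple backtracking search that looks for one. This is an assumption-laden toy (the function names, the adjacency-dict format, and the wheel example are mine), not the procedure used in the actual computer-assisted proof, which works with special configurations rather than just coloring a list of graphs.

```python
from typing import Dict, List, Optional

# Adjacency list: vertex -> list of neighbouring vertices (assumed symmetric).
Graph = Dict[int, List[int]]


def is_valid_coloring(graph: Graph, coloring: Dict[int, int]) -> bool:
    """True if every vertex has a color and no edge joins two equal colors."""
    return all(
        v in coloring and all(coloring[v] != coloring.get(u) for u in graph[v])
        for v in graph
    )


def find_coloring(graph: Graph, num_colors: int = 4) -> Optional[Dict[int, int]]:
    """Backtracking search; returns a coloring, or None if none exists."""
    vertices = list(graph)
    coloring: Dict[int, int] = {}

    def assign(i: int) -> bool:
        if i == len(vertices):
            return True  # every vertex has been colored
        v = vertices[i]
        for color in range(num_colors):
            # Only try colors not already used by a colored neighbour.
            if all(coloring.get(u) != color for u in graph[v]):
                coloring[v] = color
                if assign(i + 1):
                    return True
                del coloring[v]  # undo and try the next color
        return False

    return coloring if assign(0) else None


# Example: a wheel (hub 0 joined to a 5-cycle 1..5), a planar graph
# that cannot be colored with 3 colors but can with 4.
wheel: Graph = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2, 5],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [0, 3, 5],
    5: [0, 4, 1],
}
four = find_coloring(wheel, 4)
print(four is not None and is_valid_coloring(wheel, four))
print(find_coloring(wheel, 3))
```

Checking someone else's coloring with `is_valid_coloring` is quick, which is what makes splitting the work across many people (or machines) plausible. Finding a coloring by backtracking can take exponential time in the worst case, so this only suits small graphs.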
GPT is a language processing model. We humans hold language so dear we equate it with general intelligence. But this isn't true for artificial intelligence.
A Casio calculator is 100% accurate on arithmetic yet downright incapable of processing language.
GPT is capable, for example, of giving accurate answers when it comes to history, literature, or biology as it can understand the sources (which are written in natural language).
In the future, there may be a mathematical proof AI that thinks of proofs by itself and expresses them in logic notation.
ChatGPT is already able to produce mathematical proofs. This video for some bonkers reason used the crippled free version which is significantly less intelligent than GPT-4, so you got significantly crippled results
But yeah the tech you speak of, it's been widely available for a long time already.
@@gJonii thanks for the info
Anyone else unreasonably amused by his desperate attempts to increase the chaotic insanity? Like, "Shit! Shit! I set this on fire and it didn't even fall over!". The madman groans as he gingerly grasps the unburnt side of the conflagration. Flinging it into the air, he refuses to look at it. As if this would somehow betray his meddling to the camera.
you can make a free account and feed it the keys to your database and give it the source code to your website and get sued for using their intellectual property you made for them. classic.
AI are not Robots!
AI = artificial intelligence = mind
Robot = machine servant = mind + body
Aged like milk
first!
Congrats 🎉
your life is meaningless in the grand scheme of things.