Fun fact about the imprecision of floating points: if you're in an infinitely generating video game, going out further and further will eventually lead to your player "teleporting" around instead of walking, since the subtlety of your individual steps can't be represented once your player coords are that big. That's why some infinitely generated games move the world around the player instead of the player around the world; that way everything is processed close to the coordinates 0,0,0 for the greatest precision
The Outer Wilds isn't infinitely generated but it uses the latter system. If you move away from the solar system for long enough, and bring up the map, the map will get flickery and glitchy
@@samagraarohan2513 Obligatory that's mostly on Bedrock. Java has distance effects too, but Bedrock's are far more apparent (see: falling through the world lol).
assuming it's using floats yeah. there are other ways to represent numbers, but they have other problems - usually being slow to work with and taking a lot of ram as they get longer (on either side of the radix point)
Hmm, I'm not sure I ever heard anyone say it aloud so I usually read it to myself as /rædɪks/. Both Merriam-Webster and Cambridge only offer /reɪdɪks/, which is a bit distressing.
@@ferociousfeind8538 I don't think I've ever seen someone ask about pronunciation in the comments section on YouTube and then someone else give an answer where both commenters know and use IPA. It stands to reason that it would be on this channel though.
i've often found the "pretend that you're inventing it for the first time" method of teaching to be really effective, and i feel like this video is just such an excellent case study. math is not my strong suit but i still found it easy to follow because of that framing and the wonderful visualizations. thank you!
I know right. I find this works really well for me for math and science concepts like quantum physics, relativity, and number theory that I've been looking at recently. It's the whole math-explainer style of Numberphile, 3Blue1Brown and Matt Parker!
Also "Physics Explained", foing him recently which is what put me on this theoretical physics streak, PHENOMENAL channel. Criminally not known enough. Though only praise and praise in his comment sections
4:43 The moment I realized this incredible fluid and clean visualization was actually RECORDING EXCEL when he typed in the function blew my mind. I can’t imagine how these videos are made.
The reason 0 represents positive and 1 represents negative has to do with the fact that signed/unsigned standards with overlapping range will be the same if you do this for integers and the standard was carried over to floating point. So for 8 bit integers the range from 0->127 can be represented in both signed and unsigned standards. Because it would be convenient to represent them the same way, someone made the decision to do that. The result is that positive numbers in signed integers have a leading 0 and negatives have a leading 1.
@@Henrix1998 true, but with how two's complement works the first bit is zero for positive integers and one for negative. It still leads to "eh just copy how it works with ints for ease of use"
I'm assuming it was done that way in integers (and then floating point) because programmers often make the default state of a Boolean negative, and the modified state of a Boolean positive. For instance, if something went wrong with the assignment of the Boolean, it would have the default and/or more common value, and would be more likely to be correct and fail silently.
@@corylong5808 No, it's entirely unrelated to booleans. Booleans don't have anything to do with positive or negative numbers, they can only store true and false. Usually that's represented canonically as 0=false and 1=true (so only the least significant bit changes, not the sign bit which is the most significant bit). And generally any non-zero value is interpreted as true.
@@Henrix1998 They weren't saying that negative and positive numbers look the same. What the original comment was saying is that a number like 2 would be written the same way in a signed and unsigned byte, both being 00000010 Edit: whereas if you made 1 represent a positive number the signed byte 2 would look like 10000010
I've always thought I'd love to have jan Misali as a teacher, but after watching them explain this completely foreign and complex topic to me so well, they'd be so overqualified for any teacher's pay or compensation. Like, the w series was in depth, but this just has so many moving parts and you communicated them so well!
Let me tell you why you are objectively wrong - no disrespect to jan Misali, to be clear. These videos have a pretty large gap between each in-depth one, sometimes a few months, while teachers have to prepare lessons across different days of the week; if you're at least thirteen, that probably means at least five lessons every two weeks. A teacher gets a short window to produce a one-hour lesson on a topic, and they deliver it live to around 30 people at once, which means plenty of hold-ups. jan Misali, meanwhile, has months for one lesson: time for making and perfecting scripts, slides, and graphics. It goes out to more than 30 people, but not all at once, so he can keep going without interruptions and put all his attention on you (technically). That's why there might seem to be a big difference. You thinking this doesn't show how great jan Misali is; it shows how underappreciated teachers are.
@@alternateaccount9510 So, the TL;DR of this is "Misali has more time for each lesson, and there's a slim-to-none chance that he could cover a unique subject for 100+ days, multiple times per day." Am I correct?
That is correct. In fact, if he were a teacher he would be about as good as any other teacher, simply because the challenge of arranging a lesson for 30 people daily, such that they all equally understand and don't resort to rote memorization, is so steep that he would fall to the teaching quality of any other teacher at best.
He does at least know what he is talking about and his interests are clear, but teaching something to random people directly on a nonstop basis at the level he does is impossible.
Floating-point is a very carefully-thought-out format. Here are my favorite float tricks:
* You can generate a uniform random float between 1.0 and 2.0 by putting random bits into the mantissa, and using a certain constant bit pattern for the exponent/sign bits. Then you can subtract 1 from it to get a uniform random value between 0 and 1. This is extremely fast *and* has a much more uniform distribution compared to the naive ways of generating random floats, like "rand(100000) / 100000.0f"
* GPUs can store texture RGBA pixels in a variety of formats, and some of them are floating-point. But engineers try as hard as possible to cram lots of data into a very small space, which leads to some esoteric formats. For example, the "R11G11B10" format stores each of the 3 components as an *unsigned* float, with 11 bits each for the Red and Green floats and 10 for the Blue float, to fit into a 32-bit pixel. Even weirder is "RGB9_E5". In this format, each of the three color channels uses a 14-bit unsigned float, 5 bits of exponent and 9 of mantissa. How does this fit into 32 bits? Because they share the same exponent! The pixel has 5 bits for the exponent, and then 9 bits for each mantissa.
So if the colors in RGB9_E5 share the same exponent, then wouldn't each color have roughly the same brightness, each within a factor of two of the others? So the colors would always look pretty unsaturated.
@@pekkanen_sr no, because you would scale the other components' mantissa. e.g. if you want very bright red with very small green value, you set high exponent, high red mantissa and low green mantissa ofc this way you lose precision with very bright colors, but that's the point
@@kkard2 Because if it's not like that then my point still stands, the highest the mantissa can be is 1.11111111 in binary or about 1.998 in decimal, and the lowest it can be is 1, which is exactly what i was saying
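A minimal Python sketch of the mantissa-fill trick from the first comment in this thread. The constant 0x3FF is the biased exponent that puts a 64-bit double in [1.0, 2.0); this is just an illustration of the idea, not how any particular library implements its random():

```python
import random
import struct

def random_unit_float():
    # Fill the 52 mantissa bits with random bits and force the sign/exponent
    # bits to the pattern for [1.0, 2.0), then shift down to [0.0, 1.0).
    mantissa = random.getrandbits(52)
    bits = (0x3FF << 52) | mantissa                          # sign 0, biased exponent 1023
    value = struct.unpack("<d", struct.pack("<Q", bits))[0]  # reinterpret the bits as a double
    return value - 1.0

print(random_unit_float())
```

Because every value in [1.0, 2.0) has the same exponent, the 2^52 possible results are evenly spaced, which is where the uniformity comes from.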
Fun fact: The Chrome javascript engine uses the NaN-space to hold other datatypes such as integers, booleans and pointers. That way everything is a double.
Another reason for 0 being positive and 1 being negative: (-1)^0 is positive, and (-1)^1 is negative. More technically, it's because the multiplication of signs acts as addition mod 2, with positive or 0 as the identity, respectively. (So when multiplying floating-point numbers, the sign bit is just the XOR of the sign bits of the multiplicands.)
@@LARAUJO_0 If you want to get really particular about it, they're not synonyms. A factor is something that divides something else, usually in some 'even' way, but a multiplicand is just an operand to a multiplication operator. You wouldn't say that 1.5 is a factor of 6, but 1.5 × 4 = 6. But the word "operand" or "argument" is usually clear enough in context that "multiplicand" is uncommonly used. Even "term" often does the trick just fine.
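The XOR observation above can be checked directly on the bit patterns; sign_bit here is a hypothetical helper for the sketch, not anything defined by the standard:

```python
import struct

def sign_bit(x):
    # Top bit of the 64-bit representation of a double.
    return struct.unpack("<Q", struct.pack("<d", x))[0] >> 63

for a, b in [(2.0, 3.0), (-2.0, 3.0), (-2.0, -3.0)]:
    # The product's sign bit equals the XOR of the operands' sign bits.
    print(sign_bit(a * b) == (sign_bit(a) ^ sign_bit(b)))  # True each time
```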
IEEE 754 is amazing in how often it just works. "MATLAB's creator Dr. Cleve Moler used to advise foreign visitors not to miss the country's two most awesome spectacles: the Grand Canyon, and meetings of IEEE p754" -William Kahan
Wow the information density here is high! This 18 minute video is essentially the first lecture of the semester in the Numerical Analysis course I took in my senior year as a math major, except that lecture was an hour long! The professor used floating point as a motivation to talk about different kind of errors in lecture 2 (i.e., round-off vs. truncation) which honestly was a pretty effective framing.
actually i think it makes sense for 0 to represent a positive number in the system. that way, a floating point number that is initialized to all zeroes would represent positive zero instead of negative zero.
@@chromosozeit also makes sense with how we represent negative numbers usually. If I write 5, you assume I mean +5; I have to indicate -5, which is done by adding that 1 in front of it.
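You can see the "all zero bits means +0.0" point directly by reinterpreting raw bytes (a small Python sketch):

```python
import struct

# 32 zero bits read as a float give +0.0, the "default" zero.
print(struct.unpack("<f", b"\x00\x00\x00\x00")[0])   # 0.0
# Setting only the sign bit (the top bit, in the last byte here because of
# little-endian order) gives -0.0 instead.
print(struct.unpack("<f", b"\x00\x00\x00\x80")[0])   # -0.0
```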
Shout out to my favorite underused format, fixed-point. An old game creation library I used had them. Great for 2D games when you wanted subpixel movement but not a lot of range ("I want it to move 16 pixels over 30 frames"). I found the lack of precision perfect, so that you don't get funny rounding errors and have things not line up after moving them small distances.
They're also good for avoiding bugs caused by the changing precision in floating-point formats. Sadly they have worse optimization potential in modern CPUs.
I also feel like minecraft should be using a fixed-point position format. Half of all the number space is completely wasted at the world's origin, and then the precision is stretched to its limit at quintillions of blocks out. Does minecraft need more than 4096 possible positions within a block (in the X, Y, and Z directions each, of course)? I don't really think so. That leaves 19 bits for macro-block positions (and 1 for the sign of course), which is 524,288 blocks with 32 bits. Or, with 64 bits, as I'm pretty sure it already uses, you can go out to 2,251,799,813,685,248 (or 2.251×10^15, or 2.2 quadrillion) blocks in any direction before the game either craps out or loops your world. Which I think is a fine amount of space, and even 500,000 blocks was fine - you'd run out of computer memory doing ordinary minecraft things or run out of interest in the world before you actually naturally explored that far. But with 2 quadrillion blocks in any direction, there is no way you'll ever get out there without teleporting there to see what it's like
Yeah, games with limited scale don't need the sliding scale of the floating point number system. Minecraft does not need to specify 1/65536th of a block; it probably doesn't functionally need anything more than, like, 1/4096th of a block (or 1/256th of a single pixel), which would nominally squash the range the floating point coordinate system afforded you, but expand the usable space significantly and give you a hard cutoff where the fixed point numbers top out (or bottom out negatively).
In fact, using 64 bits for positions (floats are SO last year) and a fixed-point implementation down to 1/4096th of a block, you get a range up to 2.2517 * 10^15 blocks in any direction from spawn, all of which behave as well as current vanilla minecraft does between 2048 and 4096 blocks away from spawn (where the minimum change is already 1/4096).
And, of course, I couldn't just NOT mention the other half of the floating-point numbers that Minecraft straight up cannot use - half of the available numbers are positions between 0 and 1! Unparalleled precision used for nothing!
Microprocessor/IC enthusiast here. 5:33 The reason that a 1 in the sign bit of a signed integer is negative and a 0 is positive is that it saves a step when doing 2's Complement, which is how computers do subtraction (basically, you can turn any subtraction problem into an addition problem by flipping the bits of one of the numbers and adding 1, since computers can't natively subtract).
11:23 "The caveat only being that they get increasingly less precise the closer you get to zero." That isn't the only caveat. Subnormals often have a separate trap handler in the processor which can slow down processing quite a bit if a lot of subnormals appear.
Thank you, you just answered my question of why my monte carlo integrator has wildly different run times for different values of the parameters even though it should perform the exact same operations.
Note that this is in many cases an implementation artifact. There _are_ implementation techniques for performant subnormal handling, but since they are supposed to happen only sparingly actual implementations have less incentive to optimize them. That's why subnormal numbers hit much harder in newer machines than in older machines. (IEEE 754 does have an additional FP exception type for subnormals, but traps are frequently disabled for the same reason that signaling NaN is unpopular.)
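For reference, the values this thread is talking about, in Python (which exposes IEEE 754 doubles directly):

```python
import sys

print(sys.float_info.min)  # 2.2250738585072014e-308, the smallest *normal* double
print(5e-324)              # the smallest positive subnormal
print(5e-324 / 2)          # 0.0: below the subnormals, everything rounds to zero
```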
If I had to guess, 0 being positive and 1 being negative is a holdover from 2's complement binary representation. For the uninitiated, 2's complement (2C) is a way to represent positive and negative whole numbers in binary that also uses the leading bit as the sign bit. To showcase why this format exists, here's an example of writing -3 in 2C using 8 bits.
Step 1: Write |-3| in binary: |-3| = 3 = 0000 0011
Step 2: Invert all of the bits: Inv(0000 0011) = 1111 1100
Step 3: Add 1: 1111 1100 + 0000 0001 = 1111 1101, so -3 = (1111 1101)2C
Converting it back is the reverse process (taking -1 = 1111 1111 as the example):
Step 1: Subtract 1: 1111 1111 - 0000 0001 = 1111 1110
Step 2: Invert all of the bits: Inv(1111 1110) = 0000 0001
Step 3: Convert to the base of your choice (10 here) and multiply by -1: 0000 0001 = 1, and 1 * -1 = -1, so (1111 1111)2C = -1
The advantage of this form is that addition works the same regardless of the signs of the numbers or their order, by discarding the overflow carry when the result should be positive.
Example: -3 + 5 is 1111 1101 + 0000 0101 = 0000 0010 (carry discarded), and (0000 0010)2C = 2
Example: -3 + 2 is 1111 1101 + 0000 0010 = 1111 1111, and (1111 1111)2C = -1
It's ingenious how effortlessly this integrates with existing computer operations and doesn't have glaring issues, such as One's Complement having a duplicate zero or requiring operational baggage like more naïve negative systems. To go back to the original statement, this system only works if the leading digit of a negative number is a one, because if it were inverted, 0 would be (1000 0000)2C. This is not only unsatisfying to look at, but dangerous when you consider most bits in a computer are initialized to zero, which would be read in this hypothetical flipped system as -128.
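A runnable version of the walkthrough above; the helper names are made up for illustration:

```python
def to_twos_complement(value, bits=8):
    # Masking to the word width gives the two's-complement bit pattern
    # (for -3 this is the same result as the invert-and-add-1 steps above).
    return value & ((1 << bits) - 1)

def from_twos_complement(pattern, bits=8):
    # If the sign bit is set, the pattern represents pattern - 2**bits.
    if pattern & (1 << (bits - 1)):
        return pattern - (1 << bits)
    return pattern

print(format(to_twos_complement(-3), "08b"))                   # 11111101
print(from_twos_complement(0b11111101))                        # -3
# Addition "just works"; masking with 0xFF discards the overflow carry.
print(from_twos_complement((0b11111101 + 0b00000101) & 0xFF))  # -3 + 5 = 2
```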
Using a single bit to represent the sign is convenient for thinking about and implementing the numbers, but ultimately binary integers are _modular_ integers, which don't actually have any concept of sign. Ascribing a sign based on the first bit gives odd behavior specifically for 0 = -0 being "positive" and 128 = -128 being "negative" when really both are neither.
this is a good video!! I like the way you explain how this came to exist. it is a human thing made by humans, and as such it is messy and flawed but it _works_. and I love that. it was created for a purpose, and it serves that purpose well. you didn't do this, but I've seen people say that "computers can't store arbitrary-precision numbers", which frustrates me, because computers _can_ do that, they just need a different format. these tools are freedoms, not restrictions. if you want to perform a different task, then find different tools. and yeah you probably won't ever need this much precision, but things like finance exist, where making a $0.01 error in a billion-dollar field is cause for concern, at the least. arbitrary-precision numbers do require arbitrary memory, but they're definitely possible. I think 0 is positive and 1 is negative because 0 is "default" and 1 is "special", like how main() returns 0 on success in C. also it could be because of how signed integers are stored, where 11111101 is -3 b/c of two's complement (I think?), which is mathematically justified by 2-adic numbers. anyway good video, as always.
In finance, you can instead use a (int64, currency) pair with the integer representing micros. (1000, USD) is interpreted as one tenth of a cent (one thousand micro dollars).
You would definitely not use a floating point number for finance, the inaccuracies will cause trouble. You'd use a couple integers (like 200 dollars and 53 cents).
Using int64 for finance is questionable; there have been instances of hyperinflation where the 64-bit limit (~18 quintillion) could have been exceeded in some cases. In particular, the first Zimbabwe dollar (ZWD) had a total of 25 zeros knocked off during successive redenominations (i.e. 10^25 ZWD = 1 ZWL), but 18 quintillion is less than 10^20, so you would be unable to represent 1 ZWL as an int64 of ZWD. Making matters worse, 1 ZWL was still way too small to be useful. If your financial database was never converted from ZWD to ZWL, you would be overflowing all the time. The safest way to represent money is as an arbitrary-precision integer, not as an int64.
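A quick sketch of both suggestions in this thread, in Python (whose plain ints are already arbitrary precision, so the int64 overflow concern goes away):

```python
from decimal import Decimal

# Why raw binary floats are avoided for money:
print(0.10 + 0.20)                         # 0.30000000000000004
print(Decimal("0.10") + Decimal("0.20"))   # 0.30 exactly

# The "integer count of smallest units" approach, here micro-dollars:
balance_micros = 1_000              # $0.001
balance_micros += 25 * 1_000_000    # add $25.00, still exact integer math
print(balance_micros / 1_000_000)   # 25.001, converting only for display
```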
"I’ve been jan Misali, and much like infinity minus infinity, I too am not a number." guess we need to reconsider our definition of a "misalian"... and write an apology to our math teacher.
This is without question the best explanation of floating point numbers I've ever seen. I wish this video was around when I was taking my freshman CS classes and we had to memorize the structure of floats, since actually walking through the whole process of making compromises and design decisions behind the format really gives you a deep understanding of the reasoning behind the format.
I've had a vague sense of what floats are from running into problems with them in computational design software, and the feeling of getting a clear overview on something you've only been vaguely familiar with is so good. great video.
what a cool video!!! i always love that i can understand your content even though i have no background in it. thinking of numbers as approximations like this really is so fascinating and unique
I went from knowing nothing about floating point numbers to literally all of it (with the exception of how it's applied) in 17 minutes. I am very impressed! Great video
Here i am, once again
When i got out of all my maths courses i swore to never come back, but am i gonna sit through however much time jan Misali needs to talk about some cool random math thing? Yes, yes i am
This is the best explanation of IEEE 754 I've ever seen. Much easier to follow than the textbook I originally learned this stuff from 20 years ago. The only things I can remember that you didn't mention are (a) that the FPU has a certain number of extra hidden bits (3?) to minimize the rounding errors applicable to the results of intermediate steps of sequences of calculations, (b) we had a whole 6 week lecture course that I didn't manage to understand (I was a CompSci student but the module was run alongside math majors) about what happens when the system breaks down (perhaps when subnormals aren't enough?) and how to perform operations in the correct order to minimize errors because formulae that should be mathematically equivalent aren't always (c) I can't remember the difference between a 'Signalling NaN' and a non-signalling one, (d) the whole issue with rounding errors, equality checking and 'epsilon', and (e) there was some hype at the time around a then new-ish IEEE *decimal* standard that was supposed to replace the standard double precision binary format and fix all the problems but I don't know if it ever gained much traction at all (obviously the binary formats are still ridiculously popular).
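Point (d) from the comment above in one snippet; a generic illustration, not anything from that lecture course:

```python
import math
import sys

a = 0.1 + 0.2
print(a == 0.3)                            # False: both sides were rounded differently
print(math.isclose(a, 0.3, rel_tol=1e-9))  # True: the "epsilon" style of comparison
print(sys.float_info.epsilon)              # 2.220446049250313e-16, the gap between 1.0 and the next double
```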
Great video as always! And I'm always happy to see Tom7 getting a plug as well :) You both exist in the same realm of "obscure but incredibly entertaining content about niche subjects"
File this under: things that made me cry in a university course, yet this YouTube channel has me understand it for once. This has happened like, three separate times!!! Thank you!
Daaamn Jan! You really blew up since last I saw you. Nice job! This video was a lot of fun. I like to think you also did the entire thing in one take. I know you said at the end you're not a number but as far as I'm concerned, you're number one!
Thanks, this is a very straightforward explanation and I'm probably going to link it to any neophyte programmer asking me a question about floating points :) Or anyone who shouts that JavaScript is a bad language when the flaws they're complaining about are really just flaws with the IEEE floating point standard that JS doesn't encapsulate.
I mean, "all numbers are double precision floats, deal with it" is a pretty awkward design decision. On the other hand the thing was thrown together in like a week and a half originally and when you take that into consideration it's pretty good.
@@notnullnotvoid Plenty of languages with a high level of abstraction have only one number type. 64bit floating point numbers are pretty good at representing 32 bit integers accurately, so when dealing with human-scale numbers it's rarely a problem. But JS could have done a better job handling NaN, Infinity and -0 for sure.
"it's kind of weird to think of a number system as having a philosophy," says the person who has spent years trying to sell me on the philosophy of seximal
As someone also weirdly into floating point numbers, I appreciate this video.
0 for the sign bit being positive is carried over from previous sign/value representations of numbers. It has the benefit that "all zero bits" is the same as positive zero, the expected "default" float value. Also, you can think of it as multiplying the rest of the number by (-1)**(sign bit).
And I tend to think of floats themselves as precise, but operators need to round to the closest precise value. This makes rounding modes make sense. But your interpretation is great for stuff like 1/0==Infty.
But I hate hearing the phrase "Infinity is a concept": You say "Infinity is not a real number", but that sounds like the non-maths meaning of "real" (i.e., it exists). Of course, it can be a "real" number in certain domains. In IEEE-754, it isn't a concept, but an entity that "exists".
Like "engineering notation" 1.72e-3 == 1.72 * 10**-3 == 0.00172, there is a "precision notation" 1.001p4 == 1.001_2 * 2**4 = 10010_2.
And about NaNs: All of those values are not used in real life. You might rarely see the difference between signalling and quiet NaNs, but the entire payload isn't used. In JavaScript, there is only 1 NaN value, for example. Some modified floating point formats (like the ARM 16-bit float) reuse the NaN encodings as just larger numbers.
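Python exposes essentially that "p" notation through float.hex()/float.fromhex(), just with hexadecimal rather than binary mantissa digits; a small demonstration:

```python
# 0.1 is not stored exactly; the hex form shows the actual mantissa and exponent.
print((0.1).hex())               # 0x1.999999999999ap-4
# 0x1.8 means 1 + 8/16 = 1.5, and p1 means *2**1.
print(float.fromhex("0x1.8p1"))  # 3.0
```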
I think the reason that the first bit being a 0 represents positive numbers and 1 for negative is so that it's consistent with signed integer formats. When you add two signed ints (or byte, long, etc) you just go from right to left, adding bitwise and carrying the 1 to the next digit if you need to. The leading bit of a signed int is 0 if it's positive so that, to add two ints, you can just treat the leading bit like another digit in the number. For example, in signed bytes, 1+2=3 is done as 00000001+00000010=00000011, but if a leading 1 represented a positive number then if you tried to apply the same bitwise addition step to each digit, you would get 10000001+10000010=00000011. Since you're adding two leading 1's together, they add to 0, which means the number becomes negative. When using a 0 for the leading digit of a positive number, you don't have this problem when adding together negative numbers, since you'll have a 1 carried to the leading digit (so for example, (-1)+(-2)=(-3) is done as 11111111+11111110=11111101) unless the numbers are so negative that you get an integer underflow, which is an unavoidable problem. Because this convention makes logical sense for signed ints, it makes sense that it would be used for floats for consistency.
One useful result of the sign bit being 0 meaning positive is that this way the "ordinary zero", i.e. "positive zero" value consists entirely of zero-bits. So if some piece of memory is zero-initialized, and then interpreted as floating point numbers, those become zero-initialized, too.
In regards to signed zero, I recall an explanation in a paper a long, *long* time ago that showed a graph of a 2D function that was correct with signed zero, and incorrect without it. It kills me that I can't find the damn thing, but the point is that signed zero is *required* for some functions to get sensible results with floating point. I think it comes down to correctly representing sign at inflection points. Oh, also: another small, practical advantage of the way the sign bit is: it means "all zero bits" means 0.0. This is handy for contexts where data is initialised to zero: you end up with your floats all being 0.0 by default, rather than -0.0.
One interesting way that having negative zero can come in handy is with functions that have "branch cuts", which is something that's normally associated with complex numbers, but there's an analogous thing with the inverse tangent function that doesn't require complex numbers. In C (and most other programming languages), taking the inverse tangent of +infinity gives +π/2, and similarly the inverse tangent of -infinity is -π/2. So if you have a function that computes arctan(1/x) and x ends up being a negative number that's too small to be represented, the fact that it underflows to -0 instead of +0 can save you from being off by π.
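A direct sketch of that in Python; note that Python raises ZeroDivisionError for 1/x even when x is ±0.0, so the infinities are built explicitly here, but the underflow-to-negative-zero part is the situation described above:

```python
import math

print(math.atan(math.inf))       #  1.5707963267948966 (+pi/2)
print(math.atan(-math.inf))      # -1.5707963267948966 (-pi/2)
print(-1e-320 / 1e10)            # -0.0: underflow keeps the sign
print(math.copysign(1.0, -0.0))  # -1.0: -0.0 remembers which side it came from
```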
I always see the sign as `isNegative`. Also, I recently did a thing in Desmos where I needed to know the quadrant a point was in, so I had Q(p) = N(p.x) + 2N(p.y); N(x) = {x < 0: 1, 0} (which generates a non-standard quadrant index, but it's maybe better.), where N(x) is the same mapping. Also, with this sign encoding, we have that the actual sign is (-1)^s ((-1)^0 = 1, (-1)^1 = -1)
thank you for making this video! my code teacher told me basically everything i needed to know but then also moved on to a more important subject, while your video on youtube is a more cozy place where i could learn more
Oh hell yeah I’ve heard floating points explained a few times and still don’t really get it so I’m really happy you did a video on it. Your brain seems to work similar to mine, or at least you’re very good at explaining, so your videos work very well with my brain. The current second place video for explaining this topic to me is the video about quake’s fast square root hack
i love all your videos because i get so into the topic and your style of delivering comedy that i totally dont realize that i dont know what you're talking about until like 5 minutes after you lost me lmao
Once again, Jan Misali has taken something I don't give two hoots in Hell about, and convinced me to sit through a 20 (almost) minute long video, and enjoy every second of it (and learn some stuff that, even though I don't care about it, I will happily carry with me forever) Impressive
I always found it silly that IEEE-754 gives us 1/0=INF, but not INF*0=0, this is the first time I've had a plausible explanation for why it might have been designed that way. Thanks!
I hope you do a video about the balanced ternary number system sometime, it's very cool and has lots of unique properties! Having the digits 1, 0, and -1, it can naturally represent negative numbers. Truncation is equivalent to rounding, so repeated rounding will not result in loss of precision. Converting a number to negative simply involves swapping the 1 digit with -1 and vice versa. Similar to binary, having only digits with a magnitude of 1 simplifies multiplication, allowing you to use a modified version of the shift-and-add method (flip to negative if -1, shift, and add). Some early computers used balanced ternary, such as the Setun computer at Moscow State University, and a calculating machine built by Thomas Fowler. Fowler said in a letter to another mathematician that "I often reflect that had the Ternary instead of the denary Notation been adopted in the Infancy of Society, machines something like the present would long ere this have been common, as the transition from mental to mechanical calculation would have been so very obvious and simple."
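A small converter sketch, using T for the -1 digit (a common convention); just an illustration of the negation-by-swapping property mentioned above:

```python
def to_balanced_ternary(n):
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        r = n % 3
        if r == 0:
            digits.append("0")
        elif r == 1:
            digits.append("1")
            n -= 1
        else:                 # a remainder of 2 is really a -1 digit plus a carry
            digits.append("T")
            n += 1
        n //= 3
    return "".join(reversed(digits))

print(to_balanced_ternary(8))    # 10T  (9 - 1)
print(to_balanced_ternary(-8))   # T01  (negation just swaps 1 and T)
```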
jan Misali: These five bits for where the point should go allow us to do something very clever. Me, listening to this in the background for the third time, processing everything kinda on autopilot, and also having seen the Lidepla video: Uh oh that can't be good
For those who need an explanation: In binary, the fraction one-tenth is the recurring "decimal" 0.00011001100110011001100..., so has to be rounded off in order to be stored in floating point. To be ultra-precise, it's stored as 0.100000001490116119384765625 (13421773/2^27) in 32-bit "single precision", or 0.1000000000000000055511151231257827021181583404541015625 (3602879701896397/2^55) in 64-bit "double-precision". Base-ten arithmetic has a similar issue where (1/3)*3 = 0.333333333333333 * 3 = 0.999999999999999 instead of an exact 1.
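You can ask Python for those exact stored values directly, which is a nice way to double-check the numbers above:

```python
from decimal import Decimal
from fractions import Fraction

# The exact value a 64-bit double stores for "0.1":
print(Decimal(0.1))   # 0.1000000000000000055511151231257827021181583404541015625
print(Fraction(0.1))  # 3602879701896397/36028797018963968 (denominator is 2**55)
```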
This is better than the explanation given in my first year of my computer science degree, with one important omission: if two approximations are very good, subtracting one from the other might STILL yield a massively disproportionate error. For example, a quintillion-and-one minus a quintillion is one, but double precision floating point would return 0, because the "-and-one" part gets rounded away. So that's wrong by, well, a factor of NaN. If you don't want a factor of NaN in your multiplications/divisions, this is a (real world!) problem.
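The cancellation effect described above, reproduced with 64-bit doubles (using quintillions, since doubles represent every integer up to 2^53 exactly):

```python
big = 1e18                    # a quintillion, exactly representable as a double
print((big + 1) - big)        # 0.0, not 1.0: the "+ 1" was rounded away

print(2**53 + 1 == 2**53)                 # False with exact Python ints...
print(float(2**53 + 1) == float(2**53))   # ...True once both become doubles
```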
Hey jan Misali, take a look at "Posit numbers" if you haven't already. I feel like they'll be aligned with your interests. They feel like floating point but without all the ad-hoc design-by-committee kludges. No negative zero, only one exception value (NotAReal) instead of trillions, more precision near zero but larger dynamic range by trading bits between the fraction part and the exponent part using the superexponential technique you actually joked about in this video. They're super cool.
I just sat up last night thinking about and looking at the wiki page for floating point numbers in the middle of the night and so now it's blowing my mind that you released this video at basically the same time that was happening. Crazy coincidence and great video as always.
doubly recommend the tom7 video. It's interesting how the different NaN values tell you what sort of NaN it is, but NaN never equals NaN if you compare them. Even if they're the same sort of NaN they don't equal each other. Even if they're literally the same bits at the same location in memory it doesn't equal itself.
you are my favorite youtuber because your next video is pretty much always about a niche phenomenon in a random very specific field and somehow it's always something that i've also noticed and that i want to learn more about. or sometimes wario ware which is also fantastic
Just the other day I found myself thinking "I wonder how floating point works. I should like, do some research or something." So this was a really nice video to have pop up in my feed. Thank you!
If the sign bit worked the other way around then that would set the value to -0.0. In most cases that difference would probably be harmless though sometimes it could cause problems.
That was a very good explanation. Personally, I always considered finding the best compromise to be one of the cornerstones of engineering, and this system is a really good example of it. Also, I find the whole "every number is a range" thing much easier to digest by thinking about it as scientific notation with limited significant figures. Then it does make sense that 1.00000 * 10^15 + 1 still rounds to 1.00000 * 10^15 if you are hypothetically limited to six significant figures.
I think perhaps 0 as + and 1 as - could be a case of default (off) versus the only possible changed state (on): underspecification (+ by default) vs the specified case (-)
It also works for fixed point or integer addition with 2's complement. If you add a negative number to a positive number and the negative number's absolute value is larger, the result won't have an overflow carry, so the sign bit will still be 1 (still negative). If the positive number is larger than the negative number's absolute value, you will have an overflow carry into the sign bit, which will then become 0, since 1+0+1 = 0 with a carry out (which is usually ignored). As a result, the writers of the IEEE floating point standard probably went with the same convention for the sign bit as prior signed binary representations.
This video was really interesting and helpful, but the best thing I can contribute to the comments is that without a specified radix (base) the point is called a “radix point”.
This is a very nice video. I like that you explain not just what floating point is, but how someone would arrive at it as a way of representing numbers. Anyone interested in how the extraneous bits in NaN values are used in practice could look into something called "NaN packing." It's a technique (mostly used by programming language runtimes) used to cram a lot of different types of values into 64-bit floating point NaNs. In 64-bit (double precision) floating point, there are 12 bits used for the sign and exponent bits, and then 52 bits left over. Four bits can be used as a "tag" allowing for 16 distinct packed data types, and the remaining 48 bits are enough to store a pointer (on 64-bit intel platforms, a pointer is only 48 bits wide).
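A sketch of stuffing a payload into a quiet NaN, which is the raw ingredient NaN packing builds on. Whether a payload survives every operation is platform-dependent, but a plain bytes round trip like this normally preserves it:

```python
import math
import struct

payload = 0xABCDEF
# Exponent bits all ones plus a nonzero mantissa is a NaN; bit 51 marks it "quiet".
bits = (0x7FF << 52) | (1 << 51) | payload
value = struct.unpack("<d", struct.pack("<Q", bits))[0]
print(math.isnan(value))                   # True

round_trip = struct.unpack("<Q", struct.pack("<d", value))[0]
print(hex(round_trip & ((1 << 51) - 1)))   # 0xabcdef: the payload is still there
```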
I now understand subnormal numbers, I thought I never was going to get them. Thanks! By the way, have you heard of posits? It's kinda like a "floating floating point system" like you mentioned at 5:10, and it's really fascinating (albeit harder to understand than the standard IEEE 754 format).
numbers as regions is a thing that shows up in locale theory (aka "pointless topology"), where you don't worry about there being "points" and only work with hunks of space that can contain each other (aka some sort of algebraic lattice, i forget the precise formalism). Imo it's a lot more physically realistic too -- all physical measurements have degrees of uncertainty, nothing is 100% certain. It's just that classical math is very uncomfortable with any kind of uncertainty. (Everything has to be a total function! Anything that doesn't fully determine its output is Not Allowed!) The blog "graphical linear algebra" does really some interesting graphical-algebraic development of basic arithmetic and convinced me that it's often natural to deal with one-to-many relations in basic math -- that is, have operations that produce ranges instead of points -- like "nan" (when it's used in the sense of "could be anything" rather than "outside context problem"). They do stuff like turning addition "backwards" -- thinking of "reverse addition" as something that consumes an input and produces a *constraint* that its two outputs sum to the input. There's a nice visual formalism where you literally turn a little circuit diagram backwards to indicate this, it's cute.
There is a very clever trick with floating point numbers which allows you to make mathematically true calculations possible with just floating point numbers alone. The idea is that floating point lets you round in different directions, giving you a result which you can choose to be either lower or higher than the result in the real numbers. Using this idea you can represent a real number as an interval, and calculations with intervals give you other intervals with the property that the mathematically true result is guaranteed to be in the interval. This gives something similar to the structure you talk about. A number in that setting isn't just a point, but an entire range of values which can be operated on like numbers. Additionally this also gives you a measure of how accurately you are calculating. If the resulting interval is wide the uncertainty is high; if it is small, your errors in the calculation are low.
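A toy sketch of that idea. Real interval libraries switch the hardware rounding mode; here math.nextafter (Python 3.9+) just nudges each bound outward by one representable value, which is enough to keep the guarantee:

```python
import math

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        # Widen both bounds by one ulp so the exact real sum stays inside.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __repr__(self):
        return f"[{self.lo!r}, {self.hi!r}]"

tenth = Interval(0.1, 0.1)    # the stored 0.1 is itself already an approximation
print(tenth + tenth + tenth)  # a tiny interval guaranteed to contain the exact sum
```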
1) I really liked your choice for the last second of this video! 2) I wish I had this 6 months ago, before I took the Computer Science course that taught me all this. 3) I feel like you did Two's Complement a small disservice with the info on screen at 5:27 (the first bullet). I get *why* you did it that way - this is a video about floating point, after all. But for a one-sentence summary, it just feels _unworkable_ - especially when compared to the other two accompanying it. I don't know what a better one would be, but it feels like for what you have, such a key (and cool!) concept for the corresponding integer standard needs to at least be *named*. But regardless of that, this is a GREAT video. It took my Professor three classes and two weeks to get to the end of what you covered completely in 17 minutes. You should be proud!
Yeah, I first heard about floating point numbers in reference to the fact that if you go far enough in Minecraft Bedrock, you can fall through the world (Java doesn't have the problem in the allowed +30 million to -30 million playspace because it uses doubles), and I next heard them in reference to Super Mario 64's "parallel universes" bug and how, if you go far enough, you can't enter certain parallel universes because they're not near enough to a floating point number.
I love this format, and that you use it so much. Like explaining the history of a letter. Knowing WHY something is the way it is is one of the most effective ways I know to remember something long term
When you type 1 vigintillion and you get a number like 1 vgnt 57 quaddec 857 tredec 959 duodec 942 undec 726 dec 969 non 827 oct 393 hep 378 hex 689 quin 175 quad 40 tril 438 bil 172 mil 647 tsnd 424
I've been a full-time researcher learning to program for about a year now. I have had people tell me I've run into floating point problems so many times, but this is the first time I've actually understood the issue.
NaN is an absolute scourge. You do some operation that results in a NaN. It doesn't error, you just get a NaN as a result. Any operation on a NaN is also a NaN. By the time this causes an issue in your program, it might be somewhere completely different and now you need to figure out where the NaN originated, before it infected all the other floats. Honestly, I'd prefer it if things just errored out when encountering a NaN. It would make debugging so much easier and in the vast majority of cases, things are going wrong once a NaN shows up anyway.
@@Chloe-ju7jp Sure, you can do that. But then you need to do a NaN check everywhere and you might forget. Or you make a function for each operation, which will make longer expressions look absolutely daft and hard to read. NaN is an exceptional state. It should throw an exception. (Or whatever error mechanism the language you are using happens to have)
@@Yotanido On x86, hardware supports signaling on FP errors. It is controlled by MXCSR register. Your compiler might have some function or setting to control it.
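One way to get the "just error out" behaviour, at least for NumPy arithmetic (assuming NumPy is in the picture at all, which the comments above don't say): ask for an exception instead of a silent NaN.

```python
import numpy as np

np.seterr(invalid="raise", divide="raise")  # turn silent NaN/inf into exceptions
try:
    np.array([0.0]) / np.array([0.0])       # 0/0 is an "invalid" operation
except FloatingPointError as exc:
    print("caught at the source:", exc)
```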
I love the variety of topics jan Misali covers, and I love that it's all the same topics I'm interested in (and in the same WAY I'm interested in them)
Fun fact: When parsing a text (aka a string) to a number, some systems will output NaN if the input cannot be parsed to a number. For example in JavaScript, Number.parseFloat('dQw4w9WgXcQ') returns NaN.
The weirdest thing you'll face when dealing with floating point is when you try to make games and one of the tips is "As you get farther from the (0, 0, 0) coordinate, physics gets less precise". It makes a lot of sense: if you need to spend more bits on the integer part of the number, the fractional part becomes less precise. It's just so impractical for this purpose.
That makes total sense but I’ve never worked in games so it never occurred to me. How interesting. What kind of data type are the coordinates? I’d assume it’s at least a double these days but like is double precise enough for large open world games or do you need more bits?
@@huckthatdish in games most often float is used, due to higher performance on GPU (transforming vertices by matrix, etc.) open world games usually use floating world origin, which shifts entire world closer to (0, 0, 0) (implementations vary)
@@huckthatdish Generally, games with large enough areas to make this a problem handle it by setting the origin coordinate at the player's location and moving the rest of the world around them, and not simulating the parts of the world that the player isn't currently in. That, or loading zones; any given loaded area can have its own fixed origin.
On the other hand, Outer Wilds has a whole miniature solar system and needs to run physics everywhere even when you're not around. I'm not sure how that handles it, but I am very impressed that they did.
@@kkard2 interesting. Still surprised with how far draw distances are these days everything loaded at once can be simulated with just float, but I know the tricks and hacks to make it all work are myriad. Very interesting
By the way, fixed-point arithmetic is used somewhat frequently in computer systems. It basically works the same way as integer arithmetic, but interpreted as some decimal or binary fraction. For instance, your banking app might internally use signed fixed-point binary-coded decimal numbers with 11 digits, 2 of which are after the decimal point. That way, it can be sure all of its calculations are exact, and it can apply a rigidly-defined set of rounding rules. Your account can never have half a cent in it, after all. However, you can of course be much more efficient with a binary floating-point format like IEEE 754.
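A toy scaled-integer sketch of that idea (plain binary integers here, not the binary-coded-decimal format the comment describes, but the rescaling step is the same):

```python
SCALE = 100  # two digits after the point, i.e. whole cents

def fixed_mul(a, b):
    # The raw product carries twice the scale, so divide once by SCALE,
    # with an explicit round-half-up rule.
    return (a * b + SCALE // 2) // SCALE

price = 1999                            # 19.99, stored as cents
with_surcharge = fixed_mul(price, 107)  # multiply by 1.07, also stored scaled as 107
print(with_surcharge / 100)             # 21.39, computed entirely in integers
```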
I know I'm too dumb for this but I learned something. "Zero can actually have all sorts of value." As a zero out of ten this is very nice to hear. Also even though I don't understand most of it, it's a really entertaining video. Thank you jan Misali.
I think another important part of the philosophy behind NaN and Infinity being standard things is for detecting where errors happen. Having 1/0 be “the largest number in the system” may cause you to run your program, and get back a meaningful, but wrong number. Getting back NaN or Infinity signals to the programmer they made a mistake.
But what if you wanted to calculate a thing that was effectively infinity, and infinity was the correct answer? What number system would you use then, since the floating-point system basically says infinity is a number too big to calculate? Or am I asking a bad question? Like, for example, the evaluation of limits?
@@iantaakalla8180 Some floating point procedures accept infinity as a valid value. For example, in many math libraries, arctangent of +infinity = pi/2.
Posits (type III unums) use a kind of "floating floating point" by having a variable-precision exponent and mantissa, allowing them to reduce precision for very large and small values in exchange for increasing precision for numbers near 1 and increasing range.
Fun fact about the imprecision of floating points: if you’re in a infinitely generating video game, going out further and further will eventually lead to your player “teleporting” around instead of walking, since the subtlety of your individual steps cannot be represented when your player coords are so big, which is why some infinitely generated games move the world around the player instead of the player around the world, that way everything it processed close to the coordinates 0,0,0 for the greatest precision
That makes so much sense in the context of minecraft! As you go farther away from 0,0,0 the game starts bugging more. Neat!
The Outer Wilds isn't infinitely generated but it uses the latter system. If you move away from the solar system for long enough, and bring up the map, the map will get flickery and glitchy
@@samagraarohan2513 Obligatory that's mostly on Bedrock. Java has distance effects too, but Bedrock's are far more apparent (see: falling through the world lol).
Real life works this way
assuming it's using floats yeah. there are other ways to represent numbers, but they have other problems - usually being slow to work with and taking a lot of ram as they get longer (on either side of the radix point)
For anyone curious: The generalization of a decimal point is known as a radix point.
Is that pronounced [ɹadɪks] or [ɹeidɪks]?
@@ferociousfeind8538 It's [ɹeidɪks] in American English, at least. I couldn't say what the appropriate Latin pronunciation would be.
Rad!
Hmm, I'm not sure I ever heard anyone say it aloud so I usually read it to myself as /rædɪks/. Both Merriam-Webster and Cambridge only offer /reɪdɪks/, which is a bit distressing.
@@ferociousfeind8538 I don't think I've ever seen someone ask about pronunciation in the comments section on TH-cam and then someone else give an answer where both commenters both know and use IPA. It stands to reason that it would be on this channel though.
ive often found the "pretend that youre inventing it for the first time" method of teaching to be really effective, and i feel like this video is just such an excellent case study. math is not my strong suit but i still found it easy to follow because of that framing and the wonderful visualizations. thank you!
I know right. I find this works really well for me for math and science concepts like quantum physics, relativity, and number theory that I've been looking at recently. It's like the whole style of math explained like numberphile, 3Blue1Brown and Matt Parker!
Also "Physics Explained", foing him recently which is what put me on this theoretical physics streak, PHENOMENAL channel. Criminally not known enough. Though only praise and praise in his comment sections
4:43 The moment I realized this incredible fluid and clean visualization was actually RECORDING EXCEL when he typed in the function blew my mind. I can’t imagine how these videos are made.
excel is pretty neat, to put it lightly
459 likes and 1 reply
The reason 0 represents positive and 1 represents negative has to do with the fact that signed/unsigned standards with overlapping range will be the same if you do this for integers and the standard was carried over to floating point.
So for 8 bit integers the range from 0->127 can be represented in both signed and unsigned standards. Because it would be convenient to represent them the same way, someone made the decision to do that. The result is that positive numbers in signed integers have a leading 0 and negatives have a leading 1.
However that's not how CPUs actually do it, they use 2's compliment where for example 95 and -95 have (almost) no binary digits in common at all
@@Henrix1998 true, but with how two's complement works the first bit is zero for positive integers and one for negative. It still leads to "eh just copy how it works with ints for ease of use"
I'm assuming it was done that way in integer (and then floating point) because programmers often make the default state of a Boolean negative, and the modified state of a Boolean positive. For instance, if something went wrong with the assignment of the Boolean, it would have the default and/or more common value and more likely to be correct and fail silently.
@@corylong5808 No, it's entirely unrelated to booleans. Booleans don't have anything to do with positive or negative numbers, they can only store true and false. Usually that's represented canonically as 0=false and 1=true (so only the least significant bit changes, not the sign bit which is the most significant bit). And generally any non-zero value is interpreted as true.
@@Henrix1998 They weren't saying that negative and positive numbers look the same. What the original comment was saying is that a number like 2 would be written the same way in a signed and unsigned byte, both being 00000010
Edit: whereas if you made 1 represent a positive number the signed byte 2 would look like 10000010
I've always thought I'd love to have jan Misali as a teacher but after watching them explain this competely foreign and complex topic to me so well, they'd be so over qualified for any teacher's pay or compensation. Like, the w series was in depth, but this just has so many moving parts and you communicated them so well!
Let me tell you why you are objectively wrong. This is in no disrespect to jan Misali, firstly. But anyway, these videos contain a pretty large gap between each in-depth one. Going as far as a few months. While teachers have to prepare a lesson within different days in a week, at least once every two weeks but it depends what age you are. I expect you are at least thirteen which probably means at least five lessons every two weeks. While jan Misali has months for one lesson. And within preparing these lessons, the teacher has a short gap to produce a one hour lesson about a topic. And this is also to different people, around 30 people usually that they have to teach it to. This means there will be many hold ups probably. While with jan Misali, he has months to produce a lesson. This means making and perfecting scripts, perfecting slides and graphics. And it is to more than 30 people but not at one time. He is able to keep going without interruptions and put all attention on you (technically). This is why there might seem to be a big difference. And you thinking this does not show how great jan Misali is, it shows how underappreciated teachers are.
@@alternateaccount9510 So, the TL;DR of this is "Misali has more time for each lesson, and there's a slim-to-none chance that he could cover a unique subject for 100+ days, multiple times per day."
Am I correct?
That is correct. In fact, if he were a teacher he would be as good as any other teacher simply because the challenges of arranging a lesson for 30 people daily such that they all equally understand and do not resort to rote memorization is so slim that he would fall to the teaching quality of any other teacher at best.
He does at least know what he is talking about and his interests are clear, but to teach something to random people directly in a nonstop basis to the level he does is impossible.
@@iantaakalla8180 unfortunately it seems so 😔. Naive was I for thinking that we can teach everyone like this I guess
Floating-point is a very carefully-thought-out format. Here's my favorite float tricks:
* You can generate a uniform random float between 1.0 and 2.0 by putting random bits into the mantissa, and using a certain constant bit pattern for the exponent/sign bits. Then you can subtract 1 from it to get a uniform random value between 0 and 1. This is extremely fast *and* has a much more uniform distribution compared to the naive ways of generating random floats, like "rand(100000) / 100000.0f"
* GPU's can store texture RGBA pixels in a variety of formats, and some of them are floating-point. But engineers try as hard as possible to cram lots of data into a very small space, which leads to some esoteric formats. For example, the "R10G11B11" format stores each of the 3 components as an *unsigned* float, with 10 bits for the Red float, and 11 for each of the Green and Blue floats, to fit into a 32-bit pixel. Even weirder is "RGB9_E5". In this format, each of the three color channels uses a 14-bit unsigned float, 5 bits of exponent and 9 of mantissa. How does this fit into 32 bits? Because they share the same exponent! The pixel has 5 bits for the exponent, and then 9 bits for each mantissa.
So if the colors in RGB9_5 share the same exponent, then wouldn't each color have roughly the same brightness, each less than twice as much as each other? So the colors would always look pretty unsaturated.
@@pekkanen_sr no, because you would scale the other components' mantissa.
e.g. if you want very bright red with very small green value, you set high exponent, high red mantissa and low green mantissa
ofc this way you lose precision with very bright colors, but that's the point
@@kkard2 So are you saying the first digit of the mantissa can be a 0 whatever the exponent is, unlike with usual floating points?
@@pekkanen_sr hmm, tbh i just think it's that way, fast google searches failed me and left with unconfirmed information...
@@kkard2 Because if it's not like that then my point still stands, the highest the mantissa can be is 1.11111111 in binary or about 1.998 in decimal, and the lowest it can be is 1, which is exactly what i was saying
Fun fact: The Chrome javascript engine uses the NaN-space to hold other datatypes such as integers, booleans and pointers. That way everything is a double.
i can't find confirmation for this, but if true this really is some cursed knowledge
That is both ingenious and incredibly evil.
So does Firefox's. It's a common technique, called NaN-boxing.
Ew
@@dzaima NaN-boxing, my favorite sport. LuaJIT uses it too.
Another reason for 0 being positive and 1 being negative: (-1)^0 is positive, and (-1)^1 is negative. More technically, it's because the multiplication of signs acts as addition mod 2, with positive or 0 as the identity, respectively. (So when multiplying floating-point numbers, the sign bit is just the XOR of the sign bits of the multiplicands.)
Wow I never thought of it like that, neat! Also don't think I've ever heard anyone use the word "multiplicand" before. Also I know you
First time I've seen someone use "multiplicand" instead of "factor"
@@LARAUJO_0 If you want to get really particular about it, they're not synonyms. A factor is something that divides something else, usually in some 'even' way, but a multiplicand is just an operand to a multiplication operator. You wouldn't say that 1.5 is a factor of 6, but 1.5 × 4 = 6.
But the word "operand" or "argument" is usually clear enough in context that "multiplicand" is uncommonly used. Even "term" often does the trick just fine.
@@lapatatadelplato6520 yeah, since it's commutative there's no significant difference. It's not like divisor/dividend
In Matrix multiplication it is not commutative. So left multiply and right multiply give different result. So the distinction is needed.
This takes "Hey Siri, what's 0÷0?" to a whole new level
a n d y o u a r e s a d b e c a u s e y o u h a v e n o f r i e n d s
IEEE 754 is amazing in how often it just works. "MATLAB's creator Dr. Cleve Moler used to advise foreign visitors not to miss the country's two most awesome spectacles: the Grand Canyon, and meetings of IEEE p754" -William Kahan
Pentium's dev teams wishes they'd listened.
@@robinhammond4446 Lol
That barely audible mouse click to stop recording right after a nonsensical parting one-liner is just *chef's kiss*
Wow the information density here is high! This 18 minute video is essentially the first lecture of the semester in the Numerical Analysis course I took in my senior year as a math major, except that lecture was an hour long! The professor used floating point as a motivation to talk about different kind of errors in lecture 2 (i.e., round-off vs. truncation) which honestly was a pretty effective framing.
"And that's a really good question."
…
"The other problem with-"
😂
This wasn't just informative and really well-explained, but funny to boot!
I lost it at this part!
actually i think it makes sense for 0 to represent a positive number in the system. that way, a floating point number that is initialized to all zeroes would represent positive zero instead of negative zero.
@@chromosozeit it also makes sense with how we usually write negative numbers. If I write 5, you assume I mean +5; I have to explicitly mark -5, which in this system is done by putting that 1 in front of it.
I already know how floating point works but I’m watching this anyway because I have PRINCIPLES and they include watching every Jan misali video
I mean, it's at least as important as the origins of Caramelldansen.
Shout out to my favorite underused format the fixed-point. An old game creation library I used had them. Great for 2D games when you wanted subpixel movement but not a lot of range ("I want it to move 16 pixels over 30 frames"). I found the lack of precision perfect so that you don't get funny rounding errors and have things not line up after moving them small spaces.
They're also good for avoiding bugs caused by the changing precision in floating-point formats. Sadly they have worse optimization potential in modern CPUs.
I also feel like minecraft should be using a fixed-point position format. Half of all the number space is completely wasted near the world's origin, and then the precision is stretched to its limit at quintillions of blocks out. Does minecraft need more than 4096 possible positions within a block (in the X, Y, and Z directions each, of course)? I don't really think so. That leaves 19 bits for macro-block positions (and 1 for the sign, of course), which is 524,288 blocks with 32 bits. Or, with 64 bits, as I'm pretty sure it already uses, you can go out to 2,251,799,813,685,248 (or 2.251×10^15, or 2.2 quadrillion) blocks in any direction before the game either craps out or loops your world. Which I think is a fine amount of space, and even 500,000 blocks was fine- you'd run out of computer memory doing ordinary minecraft things or run out of interest in the world before you actually naturally explored that far. But with 2 quadrillion blocks in any direction, there is no way you'll ever get out there without teleporting there to see what it's like
Yeah, games with limited scale don't need the sliding scale of the floating point number system. Minecraft does not need to specify 1/65536th of a block, it probably doesn't functionally need anything more than, like, 1/4096th of a block (or 1/256th of a single pixel), which would nominally squash the range the floating point coordinate system afforded you, but expand the usable space significantly and give you a hard cutoff where the fixed point numbers top out (or bottom out negatively)
In fact, using 64 bits for positions (floats are SO last year) and a fixed-point implementation down to 1/4096th of a block, you get a range up to 2.2517 * 10^15 blocks in any direction from spawn, all of which behaves as well as current vanilla minecraft does between 2048 and 4096 blocks away from spawn (where the minimum change is already 1/4096)
And, of course, I couldn't just NOT mention the other half of the floating-point numbers that Minecraft straight up cannot use- roughly half of the available numbers are positions between -1 and 1! Unparalleled precision used for nothing!
Lmfao I am constantly harping on this I guess
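A minimal sketch of the kind of 64-bit fixed-point coordinate being described here (units of 1/4096 of a block, so precision is uniform everywhere instead of degrading far from the origin); the names are made up for illustration:

#include <stdint.h>
#include <math.h>
#include <stdio.h>

typedef int64_t coord;              /* 1 unit = 1/4096 of a block */
#define COORD_PER_BLOCK 4096

static coord blocks_to_coord(double b) { return (coord)llround(b * COORD_PER_BLOCK); }
static double coord_to_blocks(coord c) { return (double)c / COORD_PER_BLOCK; }

int main(void) {
    coord x = blocks_to_coord(500000.0);   /* 500,000 blocks from the origin */
    x += 1;                                /* exactly 1/4096 of a block, no precision loss */
    printf("%.6f blocks\n", coord_to_blocks(x));   /* 500000.000244 */
}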
Microprocessor/IC enthusiast here.
5:33
The reason that a 1 in the sign bit of a signed integer means negative and a 0 means positive is that it saves a step when doing two's complement, which is how computers do subtraction (basically, you can turn any subtraction problem into an addition problem by flipping the bits of the number being subtracted and adding 1, so the same adder circuit handles both).
11:23 "The caveat only being that they get increasingly less precise the closer you get to zero."
That isn't the only caveat. Subnormals often have a separate trap handler in the processor which can slow down processing quite a bit if a lot of subnormals appear.
Thank you, you just answered my question of why my monte carlo integrator has wildly different run times for different values of the parameters even though it should perform the exact same operations.
good tidbit/nit! thank you!
@@Kaepsele337 Wow, I didn't expect my comment to be this useful but I'm glad that I could help you.
Note that this is in many cases an implementation artifact. There _are_ implementation techniques for performant subnormal handling, but since they are supposed to happen only sparingly actual implementations have less incentive to optimize them. That's why subnormal numbers hit much harder in newer machines than in older machines. (IEEE 754 does have an additional FP exception type for subnormals, but traps are frequently disabled for the same reason that signaling NaN is unpopular.)
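A rough way to see this for yourself (assuming an x86-style CPU where subnormal operands take a microcode assist; results vary a lot by hardware and by compiler settings such as flush-to-zero):

#include <stdio.h>
#include <time.h>

/* Time the same multiply loop with a normal vs a subnormal operand. */
static double time_loop(double start) {
    volatile double in = start, out = 0.0;
    clock_t t0 = clock();
    for (int i = 0; i < 20000000; i++)
        out = in * 1.5;        /* when 'in' is subnormal, so is the result */
    (void)out;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void) {
    printf("normal:    %.3f s\n", time_loop(1.0));
    printf("subnormal: %.3f s\n", time_loop(1e-310));   /* below DBL_MIN, so subnormal */
}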
If I had to guess, 0 being positive and 1 being negative is a holdover from 2's complement binary representation.
For those uninitiated, 2's complement binary representation (2C) is a way to represent positive and negative whole numbers in binary that also uses the leading bit as the sign bit. To showcase why this format exists, here's an example of writing -3 in 2C using 8 bits.
Step 1: Write |-3| in binary
|-3| = 3 = (0000 0011)
Step 2: Invert all of the bits
Inv(0000 0011) = 1111 1100
Step 3: Add 1
1111 1100
+ 0000 0001
1111 1101
-3 = (1111 1101)2C
Converting it back is the reverse process
Step 1: Subtract 1
 1111 1101
- 0000 0001
 1111 1100
Step 2: Invert all of the bits
Inv(1111 1100) = 0000 0011
Step 3: Convert to base of choice, 10 in this example, and multiply by -1
0000 0011 = 3
3 * -1 = -3
(1111 1101)2C = -3
The advantage of this form is that addition works the same regardless of the sign of either number or their order; any carry out of the top bit is simply discarded.
Example: -3 + 5
 1111 1101
+ 0000 0101
 0000 0010 (carry out of the top bit discarded)
2 = (0000 0010)2C
Example: -3 + 2
 1111 1101
+ 0000 0010
 1111 1111
-1 = (1111 1111)2C
It's ingenious how effortlessly this integrates with existing computer operations without glaring issues, such as one's complement's duplicate zero or the extra operational baggage of more naïve negative-number systems.
To go back to the original statement, this system only works out so neatly if the leading digit of a negative number is a one, because if it were inverted, 0 would be (1000 0000)2C. This is not only unsatisfying to look at, but dangerous when you consider most bits in a computer are initialized to zero, which in this hypothetical system would no longer be read as zero.
This was in the back of my mind, too.
Using a single bit to represent the sign is convenient for thinking about and implementing the numbers, but ultimately binary integers are _modular_ integers, which don't actually have any concept of sign. Ascribing a sign based on the first bit gives odd behavior specifically for 0 = -0 being "positive" and 128 = -128 being "negative" when really both are neither.
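A quick sketch to confirm the worked example above on a real (two's-complement) machine:

#include <stdio.h>

int main(void) {
    signed char a = -3, b = 5;
    printf("%02X\n", (unsigned char)a);     /* FD, i.e. 1111 1101 */
    printf("%d\n", (signed char)(a + b));   /* 2: the same adder handles the sign for free */
}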
this is a good video!! I like the way you explain how this came to exist. it is a human thing made by humans, and as such it is messy and flawed but it _works_ . and I love that. it was created for a purpose, and it serves that purpose well.
you didn't do this, but I've seen people say that "computers can't store arbitrary-precision numbers", which frustrates me, because computers _can_ do that, they just need a different format. these tools are freedoms, not restrictions. if you want to perform a different task, then find different tools. and yeah you probably won't ever need this much precision, but fields like finance exist, where making a $0.01 error in a billion-dollar field is cause for concern, at the least. arbitrary-precision numbers do require arbitrarily much memory, but they're definitely possible.
I think 0 is positive and 1 is negative because 0 is "default" and 1 is "special", like how main() returns 0 on success in C. also it could be because of how signed integers are stored, where 11111101 is -3 b/c of two's complement (I think?), which is mathematically justified by 2-adic numbers.
anyway good video, as always.
In finance, you can instead use a (int64, currency) pair with the integer representing micros. (1000, USD) is interpreted as one tenth of a cent (one thousand micro dollars).
@@KaneYork that is a fixed-point system.
You would definitely not use a floating point number for finance, the inaccuracies will cause trouble. You'd use a couple integers (like 200 dollars and 53 cents).
@@strangejune you'd store the value in cents (or even millicents). Divide by 100 to get the dollar amount, the rest is the cents
Using int64 for finance is questionable; there have been instances of hyperinflation where the 64-bit limit (~18 quintillion) could have been exceeded in some cases. In particular, the first Zimbabwe dollar (ZWD) had a total of 25 zeros knocked off during successive redenominations (i.e. 10^25 ZWD = 1 ZWL), but 18 quintillion is less than 10^20, so you would be unable to represent 1 ZWL as an int64 of ZWD. Making matters worse, 1 ZWL was still way too small to be useful. If your financial database was never converted from ZWD to ZWL, you would be overflowing all the time. The safest way to represent money is as an arbitrary-precision integer, not as an int64.
5:43 - 5:51 is like a masterclass thesis in the realm of comedic timing and I just needed you to know that
anyway this video is awesome
"I’ve been jan Misali, and much like infinity minus infinity, I too am not a number."
guess we need to reconsider our definition of a "misalian"... and write an apology to our math teacher.
This is without question the best explanation of floating point numbers I've ever seen. I wish this video was around when I was taking my freshman CS classes and we had to memorize the structure of floats, since actually walking through the whole process of making compromises and design decisions behind the format really gives you a deep understanding of the reasoning behind the format.
5:41 “and that’s a good question”
“And?”
“Why would there be an and? I got all I need from this sentence.”
I've had a vague sense of what floats are from running into problems with them in computational design software, and the feeling of getting a clear overview on something you've only been vaguely familiar with is so good. great video.
this guy is a teacher but entertaining
teaching me useless stuff that entertains me and does not bore me like every other teacher in my school.
what a cool video!!! i always love that i can understand your content even though i have no background in it. thinking of numbers as approximations like this really is so fascinating and unique
I like how your voice gets increasingly manic throughout the video, as if you're trying to keep all this in your head without forgetting any of it.
I went from knowing nothing about floating point numbers to literally all of it (with the exception of how it's applied) in 17 minutes. I am very impressed! Great video
You broke my understanding of math and programming in just 15 minutes.
Here i am, once again
When i got out of all my maths courses i swore to never come back, but am i gonna sit through however much time jan Misali needs to talk about some cool random math thing?
Yes, yes i am
Jan misali is the reason we need the ability to subscribe to playlists.
"unless you're doing something *really* silly like [unix time_t]" cracked me up :3
this channel has instantly turned from my favourite linguistic and conlang channel to my favourite computer science channel
This is the best explanation of IEEE 754 I've ever seen. Much easier to follow than the textbook I originally learned this stuff from 20 years ago. The only things I can remember that you didn't mention are (a) that the FPU has a certain number of extra hidden bits (3?) to minimize the rounding errors applicable to the results of intermediate steps of sequences of calculations, (b) we had a whole 6 week lecture course that I didn't manage to understand (I was a CompSci student but the module was run alongside math majors) about what happens when the system breaks down (perhaps when subnormals aren't enough?) and how to perform operations in the correct order to minimize errors because formulae that should be mathematically equivalent aren't always (c) I can't remember the difference between a 'Signalling NaN' and a non-signalling one, (d) the whole issue with rounding errors, equality checking and 'epsilon', and (e) there was some hype at the time around a then new-ish IEEE *decimal* standard that was supposed to replace the standard double precision binary format and fix all the problems but I don't know if it ever gained much traction at all (obviously the binary formats are still ridiculously popular).
I love the awkward pause after saying that asking why zero is positive and one is negative is a good question
Great video as always! And I'm always happy to see Tom7 getting a plug as well :) You both exist in the same realm of "obscure but incredibly entertaining content about niche subjects"
File this under: things that made me cry in university course yet this TH-cam channel has me understand it for once. This has happened like, three seperate times!!! Thank you!
Daaamn Jan! You really blew up since last I saw you. Nice job! This video was a lot of fun. I like to think you also did the entire thing in one take. I know you said at the end you're not a number but as far as I'm concerned, you're number one!
My Yes Man has GREAT taste in videos
Thanks, this is a very straightforward explanation and I'm probably going to link it to any neophyte programmer asking me a question about floating points :)
Or anyone who shouts that JavaScript is a bad language when the flaws they're complaining about are really just flaws of the IEEE floating point standard that JS doesn't try to paper over.
I mean, "all numbers are double precision floats, deal with it" is a pretty awkward design decision. On the other hand the thing was thrown together in like a week and a half originally and when you take that into consideration it's pretty good.
Having no integer data type is FAR from the only reason why javascript is a terrible programming language.
@@notnullnotvoid that doesn’t preclude many of the complaints made from being mistaken though
@@notnullnotvoid Plenty of languages with a high level of abstraction have only one number type. 64-bit floats can represent every 32-bit integer exactly, so when dealing with human-scale numbers it's rarely a problem. But JS could have done a better job handling NaN, Infinity and -0 for sure.
@@notnullnotvoid eg. The Date system
"it's kind of weird to think of a number system as having a philosophy," says the person who has spent years trying to sell me on the philosophy of seximal
As someone also weirdly into floating point numbers, I appreciate this video.
0 for the sign bit being positive is carried over from previous sign/value representations of numbers. It has the benefit that "all zero bits" is the same as positive zero, the expected "default" float value. Also, you can think of it as multiplying the rest of the number by (-1)**(sign bit).
And I tend to think of floats themselves as precise, but operations need to round to the closest representable value. This makes rounding modes make sense. But your interpretation is great for stuff like 1/0==Infty.
But I hate hearing the phrase "Infinity is a concept": You say "Infinity is not a real number", but that sounds like the non-maths meaning of "real" (i.e., it exists). Of course, it can be a "real" number in certain domains. In IEEE-754, it isn't a concept, but an entity that "exists".
Like "engineering notation" 1.72e-3 == 1.72 * 10**-3 == 0.00172, there is a "precision notation" 1.001p4 == 1.001_2 * 2**4 = 10010_2
And about NaNs: most of those values are never used in real life. You might rarely see the difference between signalling and quiet NaNs, but the payload basically isn't used. In JavaScript, there is only 1 NaN value for example. Some modified floating point formats (like ARM's alternative 16 bit float) reuse the NaN encodings as just larger numbers.
This is the best IEEE 754 explanation ever
I think the reason that the first bit being a 0 represents positive numbers and 1 for negative is so that it's consistent with signed integer formats. When you add two signed ints (or byte, long, etc) you just go from right to left, adding bitwise and carrying the 1 to the next digit if you need to. The leading bit of a signed int is 0 if it's positive so that, to add two ints, you can just treat the leading bit like another digit in the number. For example, in signed bytes, 1+2=3 is done as 00000001+00000010=00000011, but if a leading 1 represented a positive number then if you tried to apply the same bitwise addition step to each digit, you would get 10000001+10000010=00000011. Since you're adding two leading 1's together, they add to 0, which means the number becomes negative. When using a 0 for the leading digit of a positive number, you don't have this problem when adding together negative numbers, since you'll have a 1 carried to the leading digit (so for example, (-1)+(-2)=(-3) is done as 11111111+11111110=11111101) unless the numbers are so negative that you get an integer underflow, which is an unavoidable problem. Because this convention makes logical sense for signed ints, it makes sense that it would be used for floats for consistency.
One useful result of the sign bit being 0 meaning positive is that this way the "ordinary zero", i.e. "positive zero" value consists entirely of zero-bits. So if some piece of memory is zero-initialized, and then interpreted as floating point numbers, those become zero-initialized, too.
i love how much variety this channel has
In regards to signed zero, I recall an explanation in a paper a long, *long* time ago that showed a graph of a 2D function that was correct with signed zero, and incorrect without it. It kills me that I can't find the damn thing, but the point is that signed zero is *required* for some functions to get sensible results with floating point. I think it comes down to correctly representing sign at inflection points.
Oh, also: another small, practical advantage of the way the sign bit is: it means "all zero bits" means 0.0. This is handy for contexts where data is initialised to zero: you end up with your floats all being 0.0 by default, rather than -0.0.
One interesting way that having negative zero can come in handy is with functions that have "branch cuts", which is something that's normally associated with complex numbers, but there's an analogous thing with the inverse tangent function that doesn't require complex numbers. In C (and most other programming languages), taking the inverse tangent of +infinity gives +π/2, and similarly the inverse tangent of -infinity is -π/2. So if you have a function that computes arctan(1/x) and x ends up being a negative number that's too small to be represented, the fact that it underflows to -0 instead of +0 can save you from being off by π.
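A tiny demo of that branch-cut point (sketch; it relies on the usual IEEE 754 behavior of underflow and division by zero):

#include <stdio.h>
#include <math.h>

int main(void) {
    double x = -1e-200 * 1e-200;        /* too small to represent: underflows to -0.0 */
    printf("%f\n", atan(1.0 / x));      /* -1.570796, i.e. -pi/2 */
    printf("%f\n", atan(1.0 / -x));     /* +1.570796: losing the sign would put you off by pi */
}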
I love that my interest in computer science and other esoteric internet things like homestuck and undertale somehow converge on this channel lol
I always see the sign as `isNegative`. Also, I recently did a thing in Desmos where I needed to know the quadrant a point was in, so I used that same mapping: Q(p) = N(p.x) + 2N(p.y); N(x) = {x < 0: 1, 0} (which generates a non-standard quadrant index, but it's maybe better).
Also, with this sign encoding, we have that the actual sign is (-1)^s ((-1)^0 = 1, (-1)^1 = -1)
The sign function also exists in desmos btw. Might not work as well though since sign(0)=0
thank you for making this video! my code teacher told me basically everything i needed to know but then moved on to a more important subject, while your video on youtube is a more cozy place where i could learn more
Oh hell yeah I’ve heard floating points explained a few times and still don’t really get it so I’m really happy you did a video on it. Your brain seems to work similar to mine, or at least you’re very good at explaining, so your videos work very well with my brain.
The current second place video for explaining this topic to me is the video about quake’s fast square root hack
i love all your videos because i get so into the topic and your style of delivering comedy that i totally dont realize that i dont know what you're talking about until like 5 minutes after you lost me lmao
Hey Vsauce, Michael here! Technically, every floating point number - except integers, infinity, and NaN - ends with the digit 5 when written out in decimal.
every non-integer float, to be precise (3.0 vs. 3.5)
"and this a really good question..." before moving on is the most computer science way to say theres little reason ive ever heard
Once again, Jan Misali has taken something I dont give two hoots in Hell about, and convinced me to sit through a 20 (almost) minute long video, and enjoy every second of it (and learn some stuff, that even though I dont care about, I will happily carry with me forever)
Impressive
babe wake up new jan misali video
I always found it silly that IEEE-754 gives us 1/0=INF, but not INF*0=0, this is the first time I've had a plausible explanation for why it might have been designed that way. Thanks!
0 × ∞ is either a discontinuity or an exception in normal math too though. Anything × ∞ is ±∞, but 0 × anything is 0.
INF*0 is one of the standard "indeterminate forms" in calculus, which is why it evaluates to NaN.
en.wikipedia.org/wiki/Indeterminate_form
@@danielbishop1863 Exactly. 0 can be an infinitesimal in normal math too.
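In C this looks like the following (assuming IEEE 754 semantics):

#include <stdio.h>

int main(void) {
    volatile double zero = 0.0;     /* volatile just keeps the compiler from folding/warning */
    double inf = 1.0 / zero;
    printf("%f\n", inf);            /* inf */
    printf("%f\n", inf * zero);     /* nan: an indeterminate form, so NaN rather than 0 */
}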
After I had to convert numbers into floating point format and back by hand in an exam this year, I feel really superior watching this.
I hope you do a video about the balanced ternary number system sometime, it's very cool and has lots of unique properties! Having the digits 1, 0, and -1, it can naturally represent negative numbers. Truncation is equivalent to rounding, so repeated rounding will not result in loss of precision. Converting a number to negative simply involves swapping the 1 digit with -1 and vice versa. Similar to binary, having only digits with a magnitude of 1 simplifies multiplication, allowing you to use a modified version of the shift-and-add method (flip to negative if -1, shift, and add). Some early computers used balanced ternary, such as the Setun computer at Moscow State University, and a calculating machine built by Thomas Fowler. Fowler said in a letter to another mathematician that "I often reflect that had the Ternary instead of the denary Notation been adopted in the Infancy of Society, machines something like the present would long ere this have been common, as the transition from mental to mechanical calculation would have been so very obvious and simple."
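For the curious, a small sketch of converting a nonnegative integer to balanced ternary (writing the digit -1 as 'T'); negation would indeed just swap '1' and 'T':

#include <stdio.h>

static void print_balanced_ternary(long n) {
    char digits[64];
    int len = 0;
    if (n == 0) { printf("0\n"); return; }
    while (n > 0) {
        int r = (int)(n % 3);
        if (r == 2) { digits[len++] = 'T'; n = n / 3 + 1; }   /* digit -1, carry 1 upward */
        else        { digits[len++] = (char)('0' + r); n /= 3; }
    }
    while (len > 0) putchar(digits[--len]);
    putchar('\n');
}

int main(void) { print_balanced_ternary(7); }   /* prints 1T1 = 9 - 3 + 1 */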
jan Misali: These five bits for where the point should go allow us to do something very clever.
Me, listening to this in the background for the third time, processing everything kinda on autopilot, and also having seen the Lidepla video: Uh oh that can't be good
There are also decimal floating point formats. They give "sensible" results, like 0.1+0.2=0.3, but they're a bit slower to process.
For those who need an explanation: In binary, the fraction one-tenth is the recurring "decimal" 0.00011001100110011001100..., so has to be rounded off in order to be stored in floating point. To be ultra-precise, it's stored as 0.100000001490116119384765625 (13421773/2^27) in 32-bit "single precision", or 0.1000000000000000055511151231257827021181583404541015625 (3602879701896397/2^55) in 64-bit "double-precision".
Base-ten arithmetic has a similar issue where (1/3)*3 = 0.333333333333333 * 3 = 0.999999999999999 instead of an exact 1.
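You can see the binary-float version of this directly in C (assuming IEEE doubles):

#include <stdio.h>

int main(void) {
    printf("%.17g\n", 0.1 + 0.2);       /* 0.30000000000000004 */
    printf("%d\n", 0.1 + 0.2 == 0.3);   /* 0: the rounding errors don't cancel out */
    printf("%.55f\n", 0.1);             /* prints the full stored double value quoted above */
}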
This is better than the explanation given in my first year of my computer science degree, with one important omission:
If two approximations are each very good, subtracting one from the other might STILL yield a massively disproportionate error. For example, (a quadrillion and one) minus a quadrillion is one, but in single precision both round to the same value, so the subtraction returns 0 instead of 1.
Divide something by that difference and you get infinity or NaN rather than a sensible answer, which makes this "catastrophic cancellation" a (real world!) problem.
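A quick demonstration of that cancellation in single precision (sketch):

#include <stdio.h>

int main(void) {
    float big = 1e15f;            /* a quadrillion */
    float bigger = big + 1.0f;    /* the +1 is far below the spacing here, so it rounds away */
    printf("%f\n", bigger - big); /* 0.000000, even though the true answer is 1 */
}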
Hey jan Misali, take a look at "Posit numbers" if you haven't already. I feel like they'll be aligned with your interests. They feel like floating point but without all the ad-hoc design-by-committee kludges.
No negative zero, only one exception value (NaR, "Not a Real") instead of trillions, more precision near 1 but a larger dynamic range, by trading bits between the fraction part and the exponent part using the superexponential technique you actually joked about in this video. They're super cool.
I just sat up last night thinking about and looking at the wiki page for floating point numbers in the middle of the night and so now it's blowing my mind that you released this video at basically the same time that was happening. Crazy coincidence and great video as always.
doubly recommend the tom7 video. It's interesting how the different NaN values tell you what sort of NaN it is, but NaN never equals NaN if you compare them. Even if they're the same sort of NaN they don't equal each other. Even if they're literally the same bits at the same location in memory it doesn't equal itself.
tom7 in general is seriously incredible
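The self-inequality in action (C sketch):

#include <stdio.h>
#include <math.h>
#include <string.h>

int main(void) {
    double a = NAN, b = a;                          /* literally the same bits */
    printf("%d\n", memcmp(&a, &b, sizeof a) == 0);  /* 1: identical bit patterns */
    printf("%d\n", a == b);                         /* 0: still not equal */
    printf("%d\n", a != a);                         /* 1: the classic "is this NaN?" test */
}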
you are my favorite youtuber because your next video is pretty much always about a niche phenomenon in a random very specific field and somehow it's always something that i've also noticed and that i want to learn more about. or sometimes wario ware which is also fantastic
Floating-point numbers are interesting. I would recommend looking into posits and unums and other such alternatives for numeric computations.
Just the other day I found myself thinking "I wonder how floating point works. I should like, do some research or something."
So this was a really nice video to have pop up in my feed.
Thank you!
0 meaning positive means that it's possible to zero out the memory (such as with `memset` or `calloc`) to get floating point value +0.
If the sign bit worked the other way around then that would set the value to -0.0. In most cases that difference would probably be harmless though sometimes it could cause problems.
Oh, that's a good point. Especially since the same holds for integers. So zeroing memory gives you the same value of +0 in both data types.
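For example (the C standard doesn't strictly promise that all-bits-zero is +0.0 for floating types, but it is on every IEEE 754 platform):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void) {
    double *p = calloc(4, sizeof *p);         /* all bits zero */
    if (!p) return 1;
    printf("%f %d\n", p[0], signbit(p[0]));   /* 0.000000 0 -> positive zero */
    free(p);
}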
That was a very good explanation. Personally, I always considered finding the best compromise to be one of the cornerstones of engineering, and this system is a really good example of it.
Also, I find the whole "every number is a range" thing much easier to digest by thinking about it as scientific notation with limited significant figures. Then it does make sense that 1.00000 * 10^15 + 1 still rounds to 1.00000 * 10^15 if you are hypothetically limited to 5 significant figures.
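The same effect, checked in double precision (assuming IEEE doubles):

#include <stdio.h>

int main(void) {
    printf("%d\n", 1e16 + 1.0 == 1e16);   /* 1: the +1 falls below the spacing at 10^16 */
    printf("%d\n", 1e15 + 1.0 == 1e15);   /* 0: at 10^15 there is still room for it */
}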
I think perhaps 0 as + and 1 as - could be a case of default (off) versus the only possible changed state (on), i.e. underspecified (+ by default) versus explicitly specified (-)
It also works out for fixed-point or integer addition with 2's complement. If you add a negative number to a positive number and the negative number's absolute value is larger, the result won't have an overflow carry, so the sign bit will still be 1 (so still negative). If the positive number is larger than the negative number's absolute value, you will have a carry into the sign bit, which then becomes 0, since 1+0+1 = 0 with a carry out (which is usually ignored). As a result, the writers of the IEEE floating point standard probably went with the same sign-bit convention as prior signed binary representations.
Just finished my computer architecture course. This explanation is way more concise and easier to understand than the one my professor gave.
This video was really interesting and helpful, but the best thing I can contribute to the comments is that without a specified radix (base) the point is called a “radix point”.
This is a very nice video. I like that you explain not just what floating point is, but how someone would arrive at it as a way of representing numbers. Anyone interested in how the extraneous bits in NaN values are used in practice could look into something called "NaN packing." It's a technique (mostly used by programming language runtimes) used to cram a lot of different types of values into 64-bit floating point NaNs. In 64-bit (double precision) floating point, there are 12 bits used for the sign and exponent bits, and then 52 bits left over. Four bits can be used as a "tag" allowing for 16 distinct packed data types, and the remaining 48 bits are enough to store a pointer (on 64-bit intel platforms, a pointer is only 48 bits wide).
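Here's a toy version of the idea in C, purely illustrative and not any real engine's encoding (the tag names and layout are made up):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Every value is a 64-bit word. Real doubles pass through unchanged.
 * Boxed values live in the quiet-NaN space: sign bit + all-ones exponent
 * + quiet bit, leaving room for a small tag and a 48-bit payload
 * (enough for a pointer on current x86-64). */
typedef uint64_t value;

#define BOX_PREFIX 0xFFF8000000000000ULL   /* sign=1, exponent=0x7FF, quiet bit=1 */
#define TAG_SHIFT  48

enum { TAG_INT = 1, TAG_BOOL = 2, TAG_PTR = 3 };

static value box_double(double d)   { value v; memcpy(&v, &d, sizeof v); return v; }
static double unbox_double(value v) { double d; memcpy(&d, &v, sizeof d); return d; }

static int is_boxed(value v) {
    /* The default NaN produced by arithmetic has either a clear sign bit or
     * an empty payload, so it fails one of these tests and stays a number. */
    return (v & BOX_PREFIX) == BOX_PREFIX && (v & ~BOX_PREFIX) != 0;
}

static value box_int(int32_t i) {
    return BOX_PREFIX | ((value)TAG_INT << TAG_SHIFT) | (uint32_t)i;
}
static int32_t unbox_int(value v) { return (int32_t)(v & 0xFFFFFFFFULL); }

int main(void) {
    value a = box_double(3.14), b = box_int(-7);
    printf("%f %d\n", unbox_double(a), is_boxed(a));   /* 3.140000 0 */
    printf("%d %d\n", unbox_int(b), is_boxed(b));      /* -7 1 */
}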
I now understand subnormal numbers, I thought I never was going to get them. Thanks!
By the way, have you heard of posits? It's kinda like a "floating floating point system" like you mentioned at 5:10, and it's really fascinating (albeit harder to understand than the standard IEEE 754 format).
Thank you for mentioning posits! I remembered reading about them, but I could not remember what they were called. They're a really clever format.
This man really can talk passionately about anything
numbers as regions is a thing that shows up in locale theory (aka "pointless topology"), where you don't worry about there being "points" and only work with hunks of space that can contain each other (aka some sort of algebraic lattice, i forget the precise formalism). Imo it's a lot more physically realistic too -- all physical measurements have degrees of uncertainty, nothing is 100% certain. It's just that classical math is very uncomfortable with any kind of uncertainty. (Everything has to be a total function! Anything that doesn't fully determine its output is Not Allowed!)
The blog "graphical linear algebra" does really some interesting graphical-algebraic development of basic arithmetic and convinced me that it's often natural to deal with one-to-many relations in basic math -- that is, have operations that produce ranges instead of points -- like "nan" (when it's used in the sense of "could be anything" rather than "outside context problem"). They do stuff like turning addition "backwards" -- thinking of "reverse addition" as something that consumes an input and produces a *constraint* that its two outputs sum to the input. There's a nice visual formalism where you literally turn a little circuit diagram backwards to indicate this, it's cute.
There is a very clever trick with floating point numbers which lets you get mathematically guaranteed results using just floating point numbers alone.
The idea is that floating point lets you round in different directions, so you can choose to get a result which is either lower or higher than the exact result in the real numbers.
Using this idea you can represent a real number as an interval, and calculations on intervals give you other intervals with the property that the mathematically true result is guaranteed to lie inside.
This gives something similar to the structure you talk about. A number in that setting isn't just a point, but an entire range of values which can be operated on like numbers. Additionally, this also gives you a measure of how accurately you are calculating: if the resulting interval is wide the uncertainty is high, if it is narrow your errors in the calculation are low.
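A bare-bones sketch of that trick using C's directed rounding modes (it may need something like GCC's -frounding-math so the optimizer respects the mode changes):

#include <fenv.h>
#include <stdio.h>
#pragma STDC FENV_ACCESS ON

typedef struct { double lo, hi; } interval;

static interval iv_add(interval a, interval b) {
    interval r;
    fesetround(FE_DOWNWARD); r.lo = a.lo + b.lo;   /* rounded toward -inf: a true lower bound */
    fesetround(FE_UPWARD);   r.hi = a.hi + b.hi;   /* rounded toward +inf: a true upper bound */
    fesetround(FE_TONEAREST);
    return r;
}

int main(void) {
    volatile double one = 1.0, ten = 10.0;         /* volatile blocks constant folding */
    interval tenth, sum = { 0.0, 0.0 };
    fesetround(FE_DOWNWARD); tenth.lo = one / ten; /* [tenth.lo, tenth.hi] encloses the real 1/10 */
    fesetround(FE_UPWARD);   tenth.hi = one / ten;
    fesetround(FE_TONEAREST);
    for (int i = 0; i < 10; i++) sum = iv_add(sum, tenth);
    printf("[%.17g, %.17g]\n", sum.lo, sum.hi);    /* the exact value 1 is guaranteed to be inside */
}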
1) I really liked your choice for the last second of this video!
2) I wish I had this 6 months ago, before I took the Computer Science course that taught me all this.
3) I feel like you did Two's Complement a small disservice with the info on screen at 5:27 (the first bullet). I get *why* you did it that way - this is a video about floating point, after all. But for a one-sentence summary, it just feels _unworkable_ - especially when compared to the other two accompanying it. I don't know what a better one would be, but it feels like such a key (and cool!) concept for the corresponding integer standard needs to at least be *named*.
But regardless of that, this is a GREAT video. It took my Professor three classes and two weeks to get to the end of what you covered completely in 17 minutes. You should be proud!
2:46
The universal term for the divider between the whole and fractional components, the role a decimal point plays in base ten, is the radix point.
quad dio null radix hex non
dude I have never watched your content before and the format of your content is top tier
With love from 1 dev to another
0:10 Wow, I didn't realize that non-programmers really only hear about floats in the context of precision errors
Yeah, I first heard about floating point numbers in reference to the fact that if you go far enough in Minecraft Bedrock, you can fall through the world (Java doesn't have the problem in the allowed +30 million to -30 million playspace because it uses doubles), and I next heard them in reference to Super Mario 64's "parallel universes" bug and how, if you go far enough, you can't enter certain parallel universes because they're not near enough to a floating point number.
I love this format, and that you use it so much. Like explaining the history of a letter. Knowing WHY something is the way it is is one of the most effective ways I know to remember something long term
When you type 1 vigintillion and you get a number like 1 vgnt 57 quaddec 857 tredec 959 duodec 942 undec 726 dec 969 non 827 oct 393 hep 378 hex 689 quin 175 quad 40 tril 438 bil 172 mil 647 tsnd 424
I've been a full-time researcher learning to program for about a year now. I have had people tell me I've run into floating point problems so many times, but this is the first time I've actually understood the issue.
When you keep running into a problem in programming, you're missing something subtle and sometimes sinister.
If you watched this and enjoyed it and want to see a funky thing that floating points let you do - go look up the Fast Inverse Square Root algorithm.
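For reference, the widely circulated Quake III routine looks roughly like this (lightly modernized with memcpy instead of pointer casts to avoid undefined behavior):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

static float q_rsqrt(float number) {
    float x2 = number * 0.5f, y = number;
    uint32_t i;
    memcpy(&i, &y, sizeof i);       /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);      /* exploit the exponent/mantissa layout for a first guess */
    memcpy(&y, &i, sizeof y);
    return y * (1.5f - x2 * y * y); /* one Newton-Raphson step to sharpen the guess */
}

int main(void) { printf("%f\n", q_rsqrt(4.0f)); }   /* ~0.499, close to 1/sqrt(4) */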
Thank you for shouting out Tom VII over at Suckerpinch, the man is a comp sci legend and makes great videos
NaN is an absolute scourge. You do some operation that results in a NaN. It doesn't error, you just get a NaN as a result.
Any operation on a NaN is also a NaN. By the time this causes an issue in your program, it might be somewhere completely different and now you need to figure out where the NaN originated, before it infected all the other floats.
Honestly, I'd prefer it if things just errored out when encountering a NaN. It would make debugging so much easier and in the vast majority of cases, things are going wrong once a NaN shows up anyway.
just have checks for if things are NaN after doing something if it's a problem
@@Chloe-ju7jp Sure, you can do that. But then you need to do a NaN check everywhere and you might forget.
Or you make a function for each operation, which will make longer expressions look absolutely daft and hard to read.
NaN is an exceptional state. It should throw an exception. (Or whatever error mechanism the language you are using happens to have)
@@Yotanido On x86, the hardware supports signaling on FP errors. It is controlled by the MXCSR register. Your compiler might have some function or setting to control it.
@@Yotanido It sounds like what you need is a float monad.
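If you do want the "just crash at the source" behavior, glibc has a non-standard switch for it (sketch; not portable C):

#define _GNU_SOURCE
#include <fenv.h>
#include <stdio.h>

int main(void) {
    feenableexcept(FE_INVALID);   /* glibc extension: trap (SIGFPE) on NaN-producing operations */
    volatile double zero = 0.0;
    double x = zero / zero;       /* would quietly be NaN; now it kills the program right here */
    printf("%f\n", x);            /* never reached */
    return 0;
}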
I love the variety of topics jan Misali covers, and I love that it's all the same topics I'm interested in (and in the same WAY I'm interested in them)
Fun fact: When parsing a text (aka a string) to a number, some systems will output NaN if the input cannot be parsed to a number.
For example in JavaScript, Number.parseFloat('dQw4w9WgXcQ') returns NaN.
you did NOT just hide the rick roll link as a string, I'm glad that the meme is still around :D
Do you have that committed to memory or did you copy it
@@abridge2 I have seen the link so many times that I shudder when a youtube link begins with "dQw"
Thank you for your explanatory videos. I always appreciate how clear, effective, and entertaining your videos are.
The weirdest thing you'll face when dealing with floating point is when you try to make games and one of the tips is "As you get farther from the (0, 0, 0) coordinate, physics gets less precise".
It makes a lot of sense. If you need to spend more bits on the integer part of the number, the fractional part will be less precise. It's just so impractical for this purpose.
That makes total sense but I’ve never worked in games so it never occurred to me. How interesting. What kind of data type are the coordinates? I’d assume it’s at least a double these days but like is double precise enough for large open world games or do you need more bits?
@@huckthatdish in games most often float is used, due to higher performance on GPU (transforming vertices by matrix, etc.)
open world games usually use a floating world origin, which shifts the entire world closer to (0, 0, 0) (implementations vary)
@@huckthatdish Generally, games with large enough areas to make this a problem handle it by setting the origin coordinate at the player's location and moving the rest of the world around them, and not simulating the parts of the world that the player isn't currently in. That, or loading zones; any given loaded area can have its own fixed origin.
On the other hand, Outer Wilds has a whole miniature solar system and needs to run physics everywhere even when you're not around. I'm not sure how that handles it, but I am very impressed that they did.
@@kkard2 interesting. Still surprised with how far draw distances are these days everything loaded at once can be simulated with just float, but I know the tricks and hacks to make it all work are myriad. Very interesting
By the way, fixed-point arithmetic is used somewhat frequently in computer systems. It basically works the same way as integer arithmetic, but interpreted as some decimal or binary fraction. For instance, your banking app might internally use signed fixed-point binary-coded decimal numbers with 11 digits, 2 of which are after the decimal point. That way, it can be sure all of its calculations are exact, and it can apply a rigidly-defined set of rounding rules. Your account can never have half a cent in it, after all. However, you can of course be much more efficient with a binary floating-point format like IEEE 754.
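A minimal sketch of that idea in C, counting whole cents in a 64-bit integer (illustrative only; real systems also pin down rounding rules, currencies, and sign handling):

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

typedef int64_t cents;    /* exact count of cents, no binary-fraction rounding */

static cents money(long long dollars, long long c) { return (cents)dollars * 100 + c; }

static void print_money(cents m) {
    /* note: tiny negative amounts (between -99 and -1) would need extra sign handling */
    printf("$%lld.%02lld\n", (long long)(m / 100), (long long)llabs(m % 100));
}

int main(void) {
    cents price = money(200, 53);    /* $200.53, stored exactly */
    print_money(price * 3);          /* $601.59: integer math, no drift */
}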
9:29 legit made me laugh out loud
I know I'm too dumb for this but I learned something.
"Zero can actually have all sorts of value."
As a zero out of ten this is very nice to hear.
Also even though I don't understand most of it, it's a really entertaining video. Thank you jan Misali.
I think another important part of the philosophy behind NaN and Infinity being standard things is for detecting where errors happen. Having 1/0 be “the largest number in the system” may cause you to run your program, and get back a meaningful, but wrong number. Getting back NaN or Infinity signals to the programmer they made a mistake.
But what if you wanted to calculate a thing that was effectively infinity, and infinity was the correct answer? What number system would you use then, since the floating-point system basically says infinity is a number too big to calculate? Or am I asking a bad question?
Like, for example, the evaluation of limits?
@@iantaakalla8180 Some floating point procedures accept infinity as a valid value. For example, in many math libraries, arctangent of +infinity = pi/2.
@@iantaakalla8180 Then getting infinity as a result would not be an error condition in that use case
Babe wake up, a new video on things I will understand but will fail to talk about in an interesting way to my friends just dropped!
If I remember correctly, I heard in a YouTube comment that using floats for banking can result in loss of one's job.
this video is brilliant because not only i learned something new, it is also insanely funny. the comedic timing is perfect! thank you for this :]
Posits (type III unums) use a kind of "floating floating point" by having a variable-precision exponent and mantissa, allowing them to reduce precision for very large and small values in exchange for increasing precision for numbers near 1 and increasing range.
I've never wondered about how floating point worked before, but I couldn't have asked for anything better.