Here are a few video ideas:
* Hardware bugs and their workarounds (e.g. F00F, FDIV, TAS on Amiga/Genesis, DPCM glitch on NES, etc)
* How to program in assembly without losing motivation (me irl 😭)
* How emojis are represented in binary
* Unicode errors in general (e.g. "résumé" getting mangled into "rÃ©sumÃ©")
* Why it's so easy for a human to recognize spambots in chatrooms but computers can't do it well
* How an EXE file works
You don't even need a switch statement for enums in Java; you can make Frankensteinian monsters known as abstract enums. They're usually used for adding extra data to the enum members (with a custom constructor), but you can also add abstract methods to the enum, which you then have to implement for each member. This leaves you with a simple method call on your enum variable, without switching over possible values. 😂😂😂
Yes, I was disappointed that he glossed over stuff, referring to things as "magic numbers" and "a bunch of garbage". The whole *point* of the video was to explain the details behind the switch statement.

The valid user inputs are the letters "c", "d", "e", "n", "q" (which he defined earlier in the video). Note that the first letter "c" is ASCII 0x63 and the last letter "q" is ASCII 0x71, so "q" is 0x0e (i.e. 14) characters past "c".

The code at 7:51 shows a transformation and a check:

Line 123b: sub eax, 0x63 ; subtracts letter "c" from the entered letter, so the range of inputs is now indexed from 0.
Line 123e: cmp eax, 0xe ; checks whether the input letter is past letter "q", since "q" is 0x0e (14) letters from "c".

At this point, the character range from "c" to "q" has been transformed to an index range of 0 to 14. If the entered character is *OUTSIDE* this range, the code bypasses the jump table stuff and goes to the "invalid" code not shown in the video:

Line 1241: ja 12aa ; exit the jump table code block, as the entered character is outside the range of the defined command characters.

If the entered character is *INSIDE* this range, it proceeds to the jump table code:

Line 1245: lea rdx,[rax*4+0x0] ; Load Effective Address. The index value is multiplied by 4, since each jump table entry is a 4-byte offset added to the table's base location.

The code at 7:58 is really just the jump table values. (Unfortunately the disassembler produces nonsense code there, but the offsets are valid.)
Rearranged, the offset values are given below and matched against the entered letter:

00: ff ff f2 42 ; "c" jump to Line 1296 Continue
01: ff ff f2 2a ; "d" jump to Line 127e Delete
02: ff ff f2 1e ; "e" jump to Line 1272 Edit
03: ff ff f2 4e ; "f" jump to invalid code block
04: ff ff f2 4e ; "g" invalid
05: ff ff f2 4e ; "h" invalid
06: ff ff f2 4e ; "i" invalid
07: ff ff f2 4e ; "j" invalid
08: ff ff f2 4e ; "k" invalid
09: ff ff f2 4e ; "l" invalid
10: ff ff f2 4e ; "m" invalid
11: ff ff f2 12 ; "n" jump to Line 1266 New
12: ff ff f2 4e ; "o" invalid
13: ff ff f2 4e ; "p" invalid
14: ff ff f2 ?? ; "q" jump to Line 1289 Quit

There is no particular reason that the offsets are negative; it's just that the compiler chose to put the jump table _after_ the code fragments. In fact, referencing the offsets to the JUMP table *is* a bit odd. Usually the base address for the jump table is the FIRST ROUTINE (in this example the "New" routine at address 1266 is the first routine, as shown at 8:52). And because it's first, it would have an offset of simply 0, with the rest of the routines referenced from there (the FIRST routine). In the given code each entry point is just 12 bytes from the previous routine:

JumpRoutineBaseAddr is 1266 (in the given example at 8:52)
+0 for "New"
+12 for "Edit"
+24 for "Delete"
+36 for "Stop"
+48 for "Continue"

With those simpler and smaller numbers, the example would have been easier to follow, but we're at the mercy of the compiler (and crappy disassembler).

As was pointed out, if the cases are "sparse" then the jump table will have a lot of those "invalid" entries. It still executes very quickly, but it takes up some memory. Input letters *OUTSIDE* the range are culled out and terminate quickly. Letters *INSIDE* the range all result in the jump code being executed, even if it just lands on an "invalid" stub.
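In C terms, the range check plus table lookup described above is roughly the following sketch (the handler names are invented here, not the video's actual routines):

```c
/* Hypothetical handlers standing in for the routines in the video. */
static const char *h_continue(void) { return "continue"; }
static const char *h_delete(void)   { return "delete"; }
static const char *h_edit(void)     { return "edit"; }
static const char *h_new(void)      { return "new"; }
static const char *h_quit(void)     { return "quit"; }
static const char *h_invalid(void)  { return "invalid"; }

static const char *dispatch(char c) {
    /* sub eax, 0x63: rebase so 'c' becomes index 0 */
    unsigned idx = (unsigned char)c - 'c';
    /* cmp eax, 0xe / ja: one unsigned compare culls everything
       outside 'c'..'q' (indices 0..14) with a single branch */
    if (idx > (unsigned)('q' - 'c'))
        return h_invalid();
    /* lea rdx,[rax*4]: index into a table of code addresses */
    static const char *(*const table[15])(void) = {
        h_continue, h_delete,  h_edit,     /* c d e */
        h_invalid,  h_invalid, h_invalid,  /* f g h */
        h_invalid,  h_invalid, h_invalid,  /* i j k */
        h_invalid,  h_invalid, h_new,      /* l m n */
        h_invalid,  h_invalid, h_quit      /* o p q */
    };
    return table[idx]();
}
```

Note there are no per-case comparisons: one subtract, one unsigned compare, one indexed jump, no matter how many cases there are.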
@@joeystenbeck6697 This is something I would like to see @LowLevelLearning mention in the video. I haven't benchmarked it, but my gut feeling is that the difference in runtime complexity is just pushed down to compile time, i.e. compile time increases in order to get better performance at runtime. In some situations this is undesirable (or less desirable) and the program would be better off running a bit slower.
@@PrivacyRevocation 99% of the time you'd rather have longer compile times than longer run times. And I'd assume that if statements are harder to optimize?
Before everyone decides to switch to switch statements 😊 I think it's good to point out that a switch statement CAN be faster, but it really depends on the use case. Like you mentioned, when you only have a few conditions, if/else might actually be faster. Also, in your example the gaps between the enum values are small, but when very large gaps are used, the compiler might opt for a different strategy, which can again be slower than a few if/else statements. And finally, switch statements might be faster, but they have their own caveats, which can lead to slow development. It's so easy to forget a break, which leads to an unexpected fall-through; those can be really nasty bugs to trace. It's easy to forget to handle the default case. Also, in C/C++ switch statements cannot be used for strings, which is annoying to say the least. And the switch cases are not scoped unless you add curly braces, which again can lead to unexpected behavior if used improperly.
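A minimal sketch of the fall-through trap mentioned above (the classify function is made up purely for illustration):

```c
/* A deliberately buggy classifier: the missing break after case 1
   makes input 1 fall through and also run the code for case 2. */
int classify(int x) {
    int result = 0;
    switch (x) {
    case 1:
        result += 1;
        /* break;   <-- forgetting this line... */
    case 2:
        result += 2;    /* ...means case 1 also executes this */
        break;
    default:
        result = -1;
        break;
    }
    return result;
}
```

So classify(1) returns 3 instead of the intended 1, and nothing about the code looks obviously wrong at a glance.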
It really depends on the use case, squeezing more than 1 line under each case can be hard to read and ugly. And so often the conditions are not as simple as a case. So I think its use case can be quite narrow.
If you use enums, the gaps should be quite small. Generally I'm happy to let the compiler figure out if it thinks the switch is faster or not, but I prefer switches because I think it looks prettier and I sometimes like to know if I'm missing a value by looking at the warnings when I remove the default statement. (I tend to write java or rust more than C, so the precise warnings may be different.)
Embedded dev here. The compiler impressed me a couple years ago. I had a static constant std::map of enums to functors, and I changed the place where it was used to a switch calling the functions directly, and it compiled into an identical object file.
Using the switch can also improve code safety by enabling the compiler warning/error for switch statement not handling all values in your enum. So in future, when you add another value to your enum, the compiler will tell you about any switch statements you forgot to update
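For instance, with GCC or Clang and -Wall (which enables -Wswitch), a sketch like this hypothetical one gets flagged as soon as a new enumerator is added without a matching case, but only if no default label is present to swallow the new value:

```c
/* Hypothetical command enum. If CMD_CONTINUE is added later,
   -Wall/-Wswitch warns that the switch below doesn't handle it,
   precisely because there is no default label. */
enum command { CMD_NEW, CMD_EDIT, CMD_DELETE, CMD_QUIT };

const char *command_name(enum command c) {
    switch (c) {
    case CMD_NEW:    return "new";
    case CMD_EDIT:   return "edit";
    case CMD_DELETE: return "delete";
    case CMD_QUIT:   return "quit";
    }
    return "?";  /* unreachable for valid enum values */
}
```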
The trade-off point between if and switch will depend on your target processor. A processor cannot look ahead through a jump table, meaning it cannot keep the pipeline full. A processor may have branch prediction to try to keep the pipeline full. Look up branch prediction and jumps for your target processor. Only benchmarking will tell you the trade-off point. Good luck.
I would have imagined that modern compilers with optimizations turned on would be smart enough to choose to translate the if/switch to the appropriate assembly if/switch depending on the specific code.
Most of them are. Not sure what compiler he used with what settings. In some cases they can convert if-elses to a jump table, like they do for a switch. In other cases they might decide not to generate a jump table for a switch, because multiple compares would be faster or much more memory efficient (in terms of produced code).
Yep, to many compilers it doesn't matter at all whether you write a switch or an if statement. I guess the switch just happens to be easier to optimize, so it's more likely to be faster if we choose any random compiler. In addition to O(1) jump tables and O(n) if-else chains, some compilers actually use a hardcoded binary search to find the correct case by repeatedly cutting the range in half, which is O(log n) on average. I guess that can be a good middle ground between those solutions. Most compilers change the implementation for if and switch statements depending on the number of items to be compared and the type of data presented (consecutive numbers, or something that translates to them like enums, versus random-looking numbers without any apparent pattern, etc.).
You forget the human here. Say I have a long if-else chain that I carefully created with only fixed comparisons to a single variable, which the compiler can optimize into a jump table. The next developer working in this file adds an if-else that breaks this pattern, and boom, performance drops for no obvious reason (unless you know of this compiler optimization). By using a switch I tell the next developer my intention, and she/he needs to make a conscious decision to change this concept.
This is the correct answer. People who write modern optimising compilers are generally smarter than the average programmer. It's _relatively_ easy to look at an if-else cascade and analyse if it's evaluating equality on the same variable and is, therefore, functionally a switch statement. One only need look at Python: Guido explicitly excludes a switch statement as he _knew_ this was possible.
To be fair, if the if-statement can easily be transformed into the switch statement as you did in the video, any reasonable compiler should output the same asm as the switch statement.
@@piisfun What? Modern compilers are perfectly capable of generating a jump table from cascaded if statements, as well as converting a switch statement into a series of comparisons, depending on whichever is faster in a given situation, and they can do it while compiling a much higher-level language like C++.
Yeah, with the example code, yes. The problem comes when it's organized a little differently and the if/else chain isn't always comparing the same variable to the same value (i.e. wouldn't be compatible with a switch). That's an easy thing for someone to do accidentally when coming into old complex code and adding another case, and suddenly it performs a lot worse. So with modern compilers it's more of a protection against myself rather than intentional optimization, imo. Though there are use cases where you need to be intentional about this, like some embedded stuff.
In my first design review at Novell, my team lead said I should use a switch statement for true/false tests rather than if/else. So I coded both and looked at the generated code - they were identical.
Compilers since 20+ years ago have been smart enough to ignore your preferences as a source code writer and just do the right thing at the machine code level. I used to enjoy trying to outsmart compilers, to make 3D renders run faster by studying the machine code output, but I gave that up long ago. We have created tools smarter than ourselves!
If the if-else chain tests something that can be trivially changed to a switch, it's better to do so, simply because it communicates intent better, and gives more well-structured code. Or, put another way, a long, simplistic if-else chain is just a plain eyesore, IMHO. It doesn't necessarily matter if the compiler can magically figure your intent out in this specific case; in other cases, bad code (which this smells like) can actively hinder it from figuring things out and making optimisations. TL;DR: Don't make it any harder for other people to read your code than absolutely necessary, and don't expect your compiler to perform magic on weird code, even if it's _often_ able to.
@@mnxs It's most useful when you have a situation with multiple checks that are simple tests and the actions to take are mutually exclusive. Presumably in this situation those code blocks can be treated like an anonymous function and reduced to something akin to BASIC's IF ... THEN ... of the form:
IF condition1 is true THEN GOSUB 100
IF condition2 is true THEN GOSUB 200
IF condition3 is true THEN GOSUB 300
P.S. That said, you can "optimize" your if-else constructs by ordering the checks in terms of the most common conditions you expect to be true. Switch statements are to be preferred when that ordering/ratio is completely unpredictable, or, as he puts it in the video, when they are all equally likely.
1. Which compiler? 2. What compiler options? (If you compared with no optimization flags, then good luck; that's not even production code.) 3. What language version? This whole if/switch comparison makes little sense; compilers these days will chop your code up in such a way that you won't make heads or tails of it. I assume adding -O3 will make both comparable, but then again, you won't be able to read through the ASM.
That misses the point. The point is that the main reason switch statements exist is due to this optimization. A compiler turning an if-else tree into a jump table is explicitly a compiler optimization. A compiler turning a switch statement into a jump table is just what the compiler logically should do.
@@vibaj16 I did find cases where switch cases won't get turned into jump tables, so it's not as automatic as it seems. For example, if you have non-equally-spaced case statements with very large gaps in between (random numbers like 1, 260, 1070, 24080), the jump table would be gigantic, and gcc and clang (even with -O0) seem to refuse to use jump tables in those cases. It makes sense, but it makes me doubt that this optimization is the main reason for the existence of switch statements, since the compiler obviously still applies its own decisions even at -O0.
PSA: It does not matter. Modern compilers have aggressive optimizers that transform if to switch and switch to bit-tests whenever it is valid && profitable speed/size-wise to do so.
I believe it's a method to prevent nesting. It gets unusable at the depth of 3, so it's motivating to extract and write separate functions/methods/classes.
@@moorscode That is actually the first positive thing I've heard about such a big amount of spacing, and it totally makes sense. Maybe I should try to get my coworkers' machines to look that way xD
tabstops at 8 spaces is.... a very long-standing de-facto standard. I don't care what you set your indent level to, but on a modern computer, there's very little reason to make your editor necessarily use hard tabs of a different width, and plenty of reason (IMHO) to maintain the idea that tabs are 8 spaces (or rather, align you to a column number divisible by 8), for compatibility with existing things.
scanf("%4s", optionbuf), from man scanf: An optional decimal integer which specifies the maximum field width. Reading of characters stops either when this maximum is reached or when a nonmatching character is found, whichever happens first. Most conversions discard initial white space characters (the exceptions are noted below), and these discarded characters don't count toward the maximum field width. ***String input conversions store a terminating null byte ('\0') to mark the end of the input; the maximum field width does not include this terminator.*** Also, your while loop condition is UB on the first iteration.
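In other words, "%4s" can store up to five bytes (4 characters plus the terminator), so the buffer has to be sized accordingly. A sketch of a correctly sized buffer, using sscanf so it's self-contained (the helper name is invented):

```c
#include <stdio.h>

/* "%4s" stores at most 4 non-whitespace characters AND then writes a
   terminating '\0', so the destination must be at least 5 bytes;
   a char[4] would overflow by one byte. */
static int read_option(const char *input, char out[5]) {
    out[0] = '\0';                    /* defined contents on failure */
    return sscanf(input, "%4s", out); /* reads at most 4 characters */
}
```

So read_option("delete now", buf) stores "dele" plus the terminator, touching all 5 bytes of the buffer.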
This is correct. Common subexpression elimination is one of the first optimizations I learned back in the 1970s while in college as it can be applied early while doing lifetime and flow analysis before code generation
This optimization can only be applied to if statements whose conditionals strictly operate on integral types and never check said integral types against a range of other integers (e.g. with > or <).
It is a neat mechanic, but disappointingly rarely relevant in practice. 1. You need a decent number of branches to actually notice a difference; even with 5 options, it's still only a minute difference. 2. If you have a huge number of choices, you will often implement it as some kind of hash map or state machine anyway, rather than if/else or switch. 3. The speed of the branching only matters if the code executed in the branches is also fast; if you're doing a printf at the end, then the improvement from using a switch makes up only a fraction of a percent of the time between hitting the control structure and actually seeing the text on screen. That said, the niche cases where the speed gain is relevant of course still exist, just not in many places. I guess I'm kinda traumatised about this topic after the whole YandereDev incident, where everyone and their mother got hung up on the point that the code was bad because it used if/else instead of switch, even though that was the least of the problems with that code. Ultimately it turned out that the performance issues came from calling an expensive UI function multiple times per frame and the use of heavily unoptimised graphic assets.
I just assumed it was like: table address = 1200, q = 71, d = 55. So it just reads the value (q), adds it to 1200, and jumps to address 1271 and executes the code there. Actually he said this part contains a negative number(?) and uses that in conjunction with the table address to jmp to the code. Close enough? I think he went over this part too quickly or something.
@@user-zu1ix3yq2w he kinda explains how they jump to the exact switch statement that satisfies the condition. But there was a magic address that points to the statement in his explanation. How it calculates the address to that statement was not explained and was mentioned as "magic". I clicked on the video title to know exactly about that, but it was mentioned as "magic math" here.
@@aneeshprasobhan Oh well. I think he should make another video explaining it. Obviously we could just look up jump tables and learn about them, but I'd prefer he just explained it in more depth. I've always found magic numbers fascinating. I think a jump table is a natural solution, one we would invent ourselves if we had been approaching these problems in assembly and using the goto statement, instead of using high(er) level languages. I guess what I can say is, this jump table solution for switch statements (goto table[table_address+offset]) is better for (numerically) consecutive cases, and less so for cases that are "sparser," especially for memory management.
I think it boils down to a feature in the language. Switch statements have to have constants for the case statements; you can't use a variable in a case statement. So the values to be checked are all known at compile time, and the 'hashing' can be pre-calculated. 'If' statements, however, could be comparing two variables whose values are not known at compile time. So even though in this example the value just happens to be a constant, the compiler doesn't have an easy way to go from assembling each 'comparison' to 'hashing', just because one of the items being compared happens to be a constant. Also, the 'if'/'else if' clauses could each be using completely different variables and values, whereas in 'switch' statements the tested parameter is the same throughout all the 'case' statements.
As per Jason Turner, I would also almost always avoid using the default label when switching on an enum since you could forget to adjust the switch statement when a new value is added to that enum. This should always be configured to cause a warning in the compiler, but the warning won't work when a default label is present. And yes, you'll probably have to extract the switch statement to a new function so you can handle the default case outside of the switch if it falls through, but without that label. Also another vote from me to use Compiler Explorer like Turner does when demonstrating assembly. It's just the best.
MISRA, for example, requires an explicit default. The alternative is that, unless intentional, the default should assert()... or just use a better static analysis tool/compiler, which will tell you when you have an unchecked enum value. If this kind of error makes it into the wild, you have more serious process issues.
Perhaps in embedded land where unrecoverable errors are Really Bad™, but for most programs I would argue you should always have a default case that throws or asserts. Of course, I would never recommend a default case that just silently ignores the problem. Even better is when your compiler can catch the problem for you (a la Typescript).
@@clonkex It really depends on the context and situation. When writing code for yourself, it's perfectly fine to have an empty default or just print out a message. Where it becomes a problem is when the valid cases are strictly defined and receiving anything else is Very Bad News. If you write a compiler or assembler, you really want to know if it encounters a non-existent instruction, a math/arithmetic error (division by zero), or some other issue that could cause the program to unexpectedly crash the computer it's running on, or at least hang it up badly. Otherwise it might quietly produce an executable that will suddenly crash without an obvious cause, or waste a lot of computational resources on recovering from avoidable errors. And God help you if it should take garbage and produce otherwise valid but pointless instructions.
@@jnharton I agree. This is why I say you should have a default case that throws. You want to know _for certain_ that you've covered every case, and if not, crash rather than do something unexpected.
You should use Compiler Explorer if you want to show what's going on in assembly; it has a much nicer view and will even highlight what gets translated into which assembly instructions.
Actually, an if-else tree might get optimized into a jump table by the compiler if there are many branches, so using an if-else tree or a switch really doesn't matter with an optimizing compiler. Though it's good to know when to use what, as it can make your code much more readable.
Well, perhaps, but people who care about such performance improvements are working with embedded devices like Arduino etc. Everything is going to be quite basic.
This depends on the nature of the if-else tree. The compiler can only perform the jump table optimization for if-else branches if 1) the conditionals only operate on basic integral types, and 2) the conditionals only test for equality or inequality, and don't use range comparisons like >= or <=.
@user-li2yv5je5e I think you missed the point. ISA on an arduino is vastly different than x86-64 ISA.. or even x86-32 ISA. The performance profile is going to be vastly different. This is low level stuff that I doubt 99% of the people here understand, unless you work on compiler development. Joso997 apparently gets it. I've seen plenty of people come from 'modern' development teams, into 'tiny MCU' world and not understand this.
Fun fact:
* Write some code in a function, doesn't matter what
* Start a switch statement at the top of the function
* Put a "}" at the end, to finish the switch
* Sprinkle "case" statements in between the lines of your code wherever you want, even inside loops, if statements, other switches, etc.
This is legal C/C++ and it even works, although it makes no sense at all.
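The classic (ab)use of this is Duff's device, where the case labels land inside a do/while so the switch jumps into the middle of an unrolled copy loop. A sketch:

```c
#include <stddef.h>

/* Duff's device: the case labels sit inside a do/while inside the
   switch, so the switch jumps into the middle of the unrolled body
   to handle the n % 8 leftover bytes, then loops for the rest. */
static void copy_bytes(char *to, const char *from, size_t n) {
    if (n == 0) return;
    size_t iters = (n + 7) / 8;
    switch (n % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--iters > 0);
    }
}
```

It compiles and copies correctly, and it's exactly the "case labels inside a loop inside a switch" trick described above.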
This will very likely be optimized out (for the if statement checks) if you tell the compiler to do any sort of optimization. Also, it's relevant to note that the hashing of the switch statement was so simple because you used a series of character comparisons of which each character was different. If you had multiple command inputs that matched even the first letter of the command, the switch statement would still branch. But you showed a neat topic few know about, kudos to that
Please correct me if I'm wrong, but isn't this only true with optimizations turned off? I did a quick bit of playing around on Compiler Explorer before commenting, and there was indeed a substantial difference between a switch and an if chain with optimizations turned off, but with -O3 set, both the switch and the if result in the exact same assembly.
This. Decisions between if-else chains and switch statements should be based on what is more readable or what expresses the intent of your code better. Every major compiler is perfectly capable of seeing through switches and if statements and can transform one into the other if it thinks that's going to be faster. And generally the compiler will do a way better job at making this decision than the average programmer. The advice given out in this video is simply misleading and will make new programmers rewrite perfectly fine code because they now think it will be faster (which it won't).
@@williamdrum9899 I did a bit more playing around, this time with 9 cases instead of 3. The asm is almost identical. The compiler did switch to a different approach than before: this time it declared basically an array of length 26, one entry per letter, containing the start address of each case, and then used the switch argument as the index into this array to find the correct case code. E.g. 'C' - 'A' = 2, and index 2 in the array is the address for case 'C', which it then jumps to. This is the same for both the if and switch versions. There is one small difference: in the if version, there is an extra comparison at the beginning which checks the first if condition; if it matches, everything I just mentioned is skipped. The switch version doesn't have this comparison, but everything else is the same.
Nice! A very common place to optimize with switch is when parsing command line arguments. The problem is that when you need to call strcmp or some function like that for every argument, you can't do that in a switch statement. A nice trick is to separate what can be solved with ifs vs. switch and use the switch first, sometimes managing to avoid the strcmp calls entirely.
I wonder if you could just run a switch statement over the first character and only then do the `strcmp` (and only compare with values that actually start with that character), but that would make the code look a little ugly. :(
@@ThePC007 That's actually a very good trick, for example gperf uses that trick to parse keywords to generate hash maps. I actually use it all the time when parsing command line arguments, I check if the first character is a dash '-', and only then I call strcmp from the second character, as most command line options will start with dashes.
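A sketch of that dispatch pattern (the option names and return codes here are invented for illustration):

```c
#include <string.h>

/* The switch on the character after the dash prunes candidates so
   at most one strcmp runs per argument. Returns a hypothetical
   option id, or -1 for anything unrecognized. */
static int parse_option(const char *arg) {
    if (arg[0] != '-') return -1;
    switch (arg[1]) {
    case 'h': return strcmp(arg, "-help")    == 0 ? 0 : -1;
    case 'v': return strcmp(arg, "-verbose") == 0 ? 1 : -1;
    case 'o': return strcmp(arg, "-output")  == 0 ? 2 : -1;
    default:  return -1;
    }
}
```

The switch becomes a jump table over the first distinguishing character, and strcmp only confirms the one remaining candidate.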
To parse argument options you can hash argument values into a fixed size (per value) hash board buffer (long enough to hold your argument options) where you hash the string value and then memcmp() the two memory blocks to make sure the value is the same as you wanted. Then you get an almost constant evaluation time.
@@informagico6331 That sounds extremely efficient and fast, and probably the way to go for big production programs, but maybe overkill if you're making a simple CLI tool. Depending on the number of available command line options, the simpler way could even be faster.
I ran the if-statement version with -O3 optimizations enabled using the g++ compiler, and it got converted to the switch version!! Compiler optimization is pretty dope.
Oh hey that's actually really cool! I've never seen someone explain how switches work at a byte level, and when you do it makes it really obvious why switches are so fast.
Any optimising compiler worth your time will produce identical code for either statement anyway. Usually it is able to transform the code into either lookup tables, trees of if & if/else statements, and often even branchless code. LLVM has been able to do all of this for a while now, as has GCC, and even MSVC has some capabilities here. This guy is really being irresponsible by presenting his performance advice as true when it very rarely is, and not even measuring the difference.
Yo, this is actually so sick. I saw a comment saying what it basically is... but man, am I glad I stayed around for the actual explanation. This was great.
I like that you supported your thesis with a reference to the assembly code. I would be even happier if you ran some test code to compare the time it takes to complete 100K iterations of the switch vs. if versions of the code.
Nice tutorial! I know it's just a test program, but you should consider clearing optionbuf to 0 before you use it, since it may contain arbitrary values which could potentially trigger your stop condition.
He also has a 1-byte buffer overflow, since the length specifier in scanf("%4s", optionbuf) doesn't include the null byte, and he completely failed to consider compiler optimizations. This video can basically be deleted.
MIND = BLOWN! What a great optimization. Having seen that, I can see bunches of ways to employ it to accelerate code flow in scenarios with complex nested comparisons. I would definitely want to wrap it with some sugar so it's readable, though.
Reminds me of the time I used a function pointer rather than a switch statement. The switch (20 years ago) did not do this, but instead searched down the list like the if statements. This was for a state machine with lots of states, and I knew the switch was inefficient. This switch-case example looks kind of like a function pointer. When I have a long-ish list of compares like this, I tend to check the assembly output for efficiency, like was done here.
I began programming in machine code, then assembly. Working down to the bare metal is the best way to understand how to write efficient code. I learned all these tricks from the start.
I learned something new and also realized some illustrations would be very helpful, especially to explain how an input is combined with a table of negative numbers, subtracting to find which option to choose without ever comparing two values 🤨
🤓 This is my comp sci degree speaking out, but there is a case where an if/else block might have a marginal performance benefit. If your input data is distributed in a way where one or two values are the overwhelming majority, with tertiary and onward values being exceedingly rare, it might prove just barely optimal to use an if/else tree rather than a switch.
The important point is that you *MUST* profile your code. Modern processors are just so complex that you just can't make any assumptions about performance anymore.
Thanks for the video. I think it would help a lot of people if you dived deeper into the ASM to explain the differences between using ifs and switches. There are some things that really do need to be talked about in terms of what the computer and processor are ACTUALLY doing, and switches and ifs are a perfect example. PS. I had no idea people still had tabs as 8 spaces instead of 4 or even smaller. That is wild.
enum class in C++ is nicer because it doesn't pollute your namespace. std::array in C++ is nicer than char arr[4] because it has built-in bounds checking without overhead. Better still would be std::string, because that's the correct datatype for a string. I could go on. I see no reason to use C over C++ when it's mostly backwards compatible anyway, and you can just use the nice, safe and fast features of C++ over the unsafe trash that you will inexplicably write when writing in pure C. That said, mixing C and C++ will yield terrible code, so might as well just write C++ and dump all the C junk. And yes, C++ works just fine for embedded.
Yes, this code, which reads from the same memory location repeatedly with no other data memory accesses in between, is going to have cache misses. Do they not teach you CS guys anything about locality? Do you just see memory accesses and assume "cache could miss there"? Yes, there are some edge cases, but in general that access should hit, and in many of those edge cases (interrupts, multicore, etc.) the cause will create a much larger slowdown than the single cache miss it induces in that code.
I was afraid of for loops and used nothing but switch statements in my code. Then I finally learned for loops and have begun rewriting a lot of my code to make it a whole lot better.
I remember learning about another benefit in my computer architecture class. Modern processors use tricks like out of order processing and branch prediction to improve performance. More branches = more wrong predictions and more operations performed that need to be discarded
glad to hear THAT, because the program I've been writing has a switch/case block with (so far) 171 cases in it, and it'll probably end up with a lot more than that before I'm done
Predictable, fixed execution time is another important difference. In this case: switch is O(1), i.e. fixed time; if-else is O(n), i.e. it depends on the input. If the input is truly random it's O(n). If 99% of the input matches the first if, then it's fast and close to O(1); if 99% matches the last if... you get the point. In any case, execution time for the if-else is harder to predict.
I still remember reading the AMD optimization manual like 20-ish years ago and figuring out how to use that to your advantage in so many ways. Even when you have simple string constants you can still take advantage of it by pre-hashing the strings, and there was another trick in there involving multiple if branches that could be compressed into a single switch by using the compiler's own optimizations to convert a test into a simple bit and then bitwise or them all together.
@@sent4444 I can't directly link it, but just Duck the AMD optimization manual and you should find it. The Intel optimization manual is also a good resource. They both provide some really good tips if you're thinking of getting into writing assembly or a compiler or just improving your code from a higher level.
It's almost like the binary is a hash table and the switch is turned by the compiler into a very simple hash function to index(jump) to the right spot. Cool.
Pretty much. The beauty of this is that no comparisons are actually made. You basically have a big table where most options are "do nothing" so that no matter what the user types, all cases are handled
First of all, use "gcc -O2 -S", and then analyze the resulting assembly code. The assembly code you have shown (the first one) most probably resulted from unoptimized compilation. my gcc 13.2.1 generates different code. Second, use non valued enums (not assigning values to the enumeration identifiers), and the result will even be better. And finally: switch was always supposed to be implemented by jump tables, but in some cases (like having only a few choices), the code generator could select a more appropriate (=faster) implementation.
I think the best place to illustrate the power of switch statements is Windows applications. Any logical window (like a button or a static control) has a callback function that processes events in a large switch statement, and events can be very common things like "paint the window" or "mouse has moved". As much as I dislike Microsoft, the event-driven programming that Windows uses is superior to the object-oriented programming that Gnome uses. Also, the SDL backend has an event system for input, which is typically processed by a large switch statement.
It looks more like a jump table than a hash table as others have opined. Also, this only works in certain cases, yes? if the range of the switch is too large (say, outside the range of a few alphabetical letters) then I think the switch statement resolves to a series of "if" statements. Great video, I'm loving all this geeky low-level stuff. I'm subscribed and continued success!
I have recently started using switch() statements in an interesting way: have two compares on both sides of &&? Bit-pack the compare bools into an int as two bits and switch on the int (4 cases). On three bools (8 cases) this is massively simpler and faster.
Because in many cases they translate directly to a single or very few assembly statements. However, given a properly designed compiler and language designed to be optimized, often IF statements can in select cases be translated to be as efficient as switch statements.
Do note, this behavior is specific to languages whose switch statements only accept integral types and only compare against single integral constants, like C. If you use C#, the behavior is different, according to Microsoft's documentation: "A switch statement evaluates case patterns in text order from top to bottom." Those patterns can be either a constant evaluation (in which case, if it's an integral type, it can use jump table optimizations) *or* a bunch of comparison evaluations (e.g. value >= 10), which compile down to IL no different from an if-else tree. So understand the behavior of your chosen language and use cases before throwing switch statements everywhere, expecting a performance boost that never manifests!
Hi, computer engineering student (hardware lol) here. Switch statements are basically state machines at the hardware level (think of it like a recursive register that updates itself based on some conditions of the current state and whatever inputs/outputs), and they are quite fast as opposed to a bunch of if statements/nested if statements, as that's just a ton of multiplexers in a row. As great as a multiplexer is, they can have quite a long bit depth (the longest path a bit of information can take). It's just more efficient to use a state machine, which may only use a couple multiplexers per clock cycle, and register delay is much less than that of a mux.
Let me tell you a shocking truth: You are wrong! Both “if” and “switch” compiles to the **EXACT SAME** binary file, on x86-64 GCC 13.2 with any optimization level higher than 0. Please see my follow-up comment for a link to my Godbolt Compiler Explorer link. Did you forget to turn optimization on? Or are you using an old compiler?
Might be different for out of order execution, but when your cpu is piping in instructions, it will usually be processing multiple instructions at a time. But that means for an instruction like jne, your cpu may end up processing instructions right after the jump when in reality it needed to be processing instructions at the location specified by the jump instruction. When this happens, the cpu is forced to flush out the instructions and then continue from the correct location. Branch predictors can reduce this chance but that is an additional cost of if statements.
so basically the compiler converts the switch statement into a sort of hash table? that way it runs a hash on the condition and immediately jumps to the corresponding branch's address, correct?
Nice. Now let's see 3 ifs and switch/cases vs 8 ifs and switch/cases when optimized. Essentially, this is just a highlight of branch conditional logic vs jump tables and how, for large conditionals, a jump table (or lookup table) can be faster, usually because arithmetic of certain sizes is faster than conditionals of certain sizes. (IIRC for GCC the threshold is 4.) The secret here is that large if/elseif blocks are decayed to jump tables, and small switch case statements are decayed to if/else branch blocks. You can do this manually as well by building an array of the conditional logic you want to do (say, if the functions are the same, you can build an array of function pointers) and just indexing into that. It's a very cool trick, especially for printing a string from an enum containing its value, for debugging for instance. Turns the entire switch case into a one-liner.
This is so interesting. I used to avoid switch statements in favor of hash maps (external or built my own) or array-offset-based options because I really thought the compiler was stupid enough to do a sequential search, but honestly this video got me thinking...
That's so sick. It does make sense, since in a switch statement, you're always just doing an equals comparison with one operand being fixed, effectively turning it into a unary operation. Whereas in an if-else chain the condition you're branching on is completely arbitrary.
One for one, a predicted if is faster than a switch. If you have more important values (such as a success vs error return) where the success performance is far more necessary, put an if for the important value(s) first, then do whatever with the rest; that allows for branch prediction. Also, if your table isn't dense it will be converted to an if-else tree anyway. Predicted branches are mostly just the instruction slot expense, so a single cycle: almost free.
coding sites for a while, and this just made me want to dig deeper into assembly, maybe lol. Thanks for the video, great information! *the more you know star*
Switch statements can also do a dichotomy (binary search) when values are dispersed. We can also take advantage of both worlds by doing if (most probable value) and switching on the other values.
Ah yes the good old jump table. I remember playing ShenZhen IO (it's a game (if you can call it that) where you build digital circuits and code with a very simple RISC assembly) back in 2018. One of the levels was a sandwich making machine or something, and I had the idea to use a hard-coded jump table (using a ROM) to store the recipes and thereby cut down the number of instructions. This is basically just that, but slightly more involved. Slightly.
@@cherubin7th break breaks everything, which is why I prefer if. Also in game making, whenever I have a switch I'll need to break it because exceptions happen; with an if it's just inserting or changing the test. Though nowadays I just pass a function or an object by reference, much cleaner.
IIRC, the Pythonic way to do a switch statement (Python doesn't have switch statements) is basically what the assembly does; using a variable (usually user input) as a key to lookup in a dictionary or array of functions
To me personally, to use switch statements the value has to be a single discriminator AND there have to be at least 3 or 4 options. If is just more flexible, and switch requires more code to work. But if you have many options depending on your value, I think it's a powerful tool, and when you write big programs, or it gets called a lot, it's worth implementing.
I also consider size of the table since I usually program on 16 bit hardware. These kind of things work best when the choices are consecutive. So instead of using letters it's better to ask the user to type 0, 1, 2, or 3 since you don't need a massive table, you can just use bit twiddling to clamp everything down to four options
The issue is mostly the difference in branch prediction misses. Any time a branch goes differently than predicted, every instruction in the pipeline after the branch gets squashed. For instance, on Zen 2 that's 18 cycles, with potentially 4 instructions decoded per cycle, and for all modern PC processors it seems to be in that ballpark. The nested if statements can do that multiple times, while the calculated jump only does it once. Damn, I needed this, since I'd completely forgotten to use switch statements. Calling function pointers from an array is quite similar, but should be somewhat faster.
It also depends on the language. Some like VB compile a Select statement into a bunch of If/Else statements anyway, so you can lose the expected efficiency.
The IL might end up containing multiple If statements, but the compiler can still optimise that to a jump table in the case of an enumeration, which has known values.
The way I understood it from working in assembly is this: With a long chain of if-else you compare your input to each possibility. With a switch statement you're comparing nothing at all!
You may also have a look at CPU architecture, the scheduler load the instructions in a cache and do speculative out of order executions, which basically render this optimization useless. 😂
@@timmygilbert4102 Right. Would be interesting to test under which circumstances the benefits of branch prediction of if/else outweighs the gains of a switch statement. 🤔
@@sciencelab4225 there is a blog that tests it: 1024 random ifs with no slowdown. Memory access is slower than compute, so reordering the instruction cache to feed the registers and discarding false execution is faster than just querying the right branch. So unless you have a long switch statement or a deep if tree, at which point you probably need another way to structure the code, it's unlikely to even make a dent or show up in a benchmark. It's only a style choice.
C is just such a pleasant language to use tbh. I'm mentally scarred from having to use Classic ASP at work ATM, can't wait to get back to the C codebase
For a small number of options, a jump table can have a negative impact because the branch predictor can run into difficulties predicting the jump through the table. However, compilers often transform these special edge cases into if-else series anyway. If one option appears more often than the others, it is probably better to make an if statement with the likely attribute for this option and then use switch-case for the other options.
I had to refactor some code written in GO, and when I saw the looooong tree of if-else statements, I was baffled. Somebody wrote that, without even once thinking "Damn, this is getting ridiculously big, maybe there's a different way to do that". Now code looks compact, much more readable, and I don't even have proper experience in GoLang.
I did not catch the perf difference. What is the perf difference?
I cant wait for neural link so i can learn everything you know lol
@@TheTrackoShow right?
Rearranged, the offset values are given below and matched against the entered letter:
𝟶𝟶: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝟸 ; "𝚌" 𝚓𝚞𝚖𝚙 𝚝𝚘 𝙻𝚒𝚗𝚎 𝟷𝟸𝟿𝟼 𝙲𝚘𝚗𝚝𝚒𝚗𝚞𝚎
𝟶𝟷: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟸𝚊 ; "𝚍" 𝚓𝚞𝚖𝚙 𝚝𝚘 𝙻𝚒𝚗𝚎 𝟷𝟸𝟽𝚎 𝙳𝚎𝚕𝚎𝚝𝚎
𝟶𝟸: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟷𝚎 ; "𝚎" 𝚓𝚞𝚖𝚙 𝚝𝚘 𝙻𝚒𝚗𝚎 𝟷𝟸𝟽𝟸 𝙴𝚍𝚒𝚝
𝟶𝟹: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚏" 𝚓𝚞𝚖𝚙 𝚝𝚘 𝚒𝚗𝚟𝚊𝚕𝚒𝚍 𝚌𝚘𝚍𝚎 𝚋𝚕𝚘𝚌𝚔
𝟶𝟺: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚐" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟶𝟻: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚑" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟶𝟼: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚒" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟶𝟽: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚓" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟶𝟾: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚔" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟶𝟿: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚕" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟷𝟶: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚖" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟷𝟷: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟷𝟸 ; "𝚗" 𝚓𝚞𝚖𝚙 𝚝𝚘 𝙻𝚒𝚗𝚎 𝟷𝟸𝟼𝟼 𝙽𝚎𝚠
𝟷𝟸: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚘" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟷𝟹: 𝚏𝚏 𝚏𝚏 𝚏𝟸 𝟺𝚎 ; "𝚙" 𝚒𝚗𝚟𝚊𝚕𝚒𝚍
𝟷𝟺: 𝚏𝚏 𝚏𝚏 𝚏𝟸 ?? ; "𝚚" 𝚓𝚞𝚖𝚙 𝚝𝚘 𝙻𝚒𝚗𝚎 𝟷𝟸𝟾𝟿 𝚀𝚞𝚒𝚝
There is no particular reason that the offsets are negative, it's just that the compiler chose to put the jump table _after_ the code fragments.
In fact, referencing the offsets to the JUMP table *is* a bit odd.
Usually the base address for the jump table is the FIRST ROUTINE (in this example the "New" routine at address 1266 is the first routine... as shown at 8:52).
And because it's first, it would have an offset of simply 0.
The rest of the routines are referenced from there (the FIRST routine).
In the given code each entry point is just 12 bytes from the previous routine.
𝙹𝚞𝚖𝚙𝚁𝚘𝚞𝚝𝚒𝚗𝚎𝙱𝚊𝚜𝚎𝙰𝚍𝚍𝚛 𝚒𝚜 𝟷𝟸𝟼𝟼 (𝚒𝚗 𝚝𝚑𝚎 𝚐𝚒𝚟𝚎𝚗 𝚎𝚡𝚊𝚖𝚙𝚕𝚎 𝚊𝚝 𝟾:𝟻𝟸)
+𝟶 𝚏𝚘𝚛 "𝙽𝚎𝚠"
+𝟷𝟸 𝚏𝚘𝚛 "𝙴𝚍𝚒𝚝"
+𝟸𝟺 𝚏𝚘𝚛 "𝙳𝚎𝚕𝚎𝚝𝚎"
+𝟹𝟼 𝚏𝚘𝚛 "𝚂𝚝𝚘𝚙"
+𝟺𝟾 𝚏𝚘𝚛 "𝙲𝚘𝚗𝚝𝚒𝚗𝚞𝚎"
With those simpler and smaller numbers, the example would have been easier to follow, but we're at the mercy of the compiler (and crappy disassembler).
As was pointed out, if the cases are "sparse" then the jump table will have a lot of those "invalid" entries. It still executes very quickly, but just takes up some memory.
Input letters *OUTSIDE* the range are culled out and terminate quickly. Letters *INSIDE* the range all result in the jump code being executed, even if it just results in an "invalid" stub.
This makes it all even more interesting. Thank you for the addition.
How did you do that monospace-ish font in a youtube comment?
`mono`
how did you do that bro
this is what i came here for, thanks
basically a hashmap lookup vs a linear array search, but at a low level
To add on: it’s like the hash function is computed at compile time! (iiuc)
Edit: Y’all just use Rust check and you don’t needa compile
@@joeystenbeck6697 this is something I would like to see @lowLevelLearning mention in the video. I haven't benchmarked it, but my gut feeling tells me the difference in runtime complexity is just pushed down to compile time, i.e. increasing time needed to compile the program in order to get a better performance in runtime. In some situations this is undesirable (or less desirable) and the program would be better off running a bit slower.
Beatrice can walk 14_88
@@PrivacyRevocation in 99% of the time you rather want longer compile times than longer run times. And I'd assume that if statements are more difficult to optimize?
@@PrivacyRevocation- Like .. when? When would it be better to have the program run slower so the compile would be faster?
Before everyone decides to switch to switch statements 😊 I think it's good to point out that switch statements CAN be faster, but it really depends on the use case. Like you mentioned, when you only have a few conditions, if/else might actually be faster. Also, in your example the gaps between the enum values are small, but when very large gaps are used, the compiler might opt for a different strategy, which can again be slower than a few if/else statements.
And finally, switch statements might be faster, but they have their own caveats, which can lead to slow development. It's so easy to forget a break, which leads to an unexpected fall-through; those can be real nasty bugs to trace. It's easy to forget to handle the default case. Also, in C/C++ switch statements cannot be used for strings, which is annoying to say the least. And the switch cases are not scoped unless you add curly braces, which again can lead to unexpected behavior if used improperly.
Usually the compiler issues a warning about fall-through; at least gcc does.
Also, the video explains why strings cannot be used in switch cases: you need a number for switch.
It really depends on the use case, squeezing more than 1 line under each case can be hard to read and ugly. And so often the conditions are not as simple as a case. So I think its use case can be quite narrow.
Seriously I don't understand why break statements are not a default on switch in all languages I have used. It blows my mind.
If you use enums, the gaps should be quite small. Generally I'm happy to let the compiler figure out if it thinks the switch is faster or not, but I prefer switches because I think it looks prettier and I sometimes like to know if I'm missing a value by looking at the warnings when I remove the default statement. (I tend to write java or rust more than C, so the precise warnings may be different.)
Embedded dev here. The compiler impressed me a couple years ago. I had a static constant std::map of enums to functors, and I changed where it was used to a switch calling the functions directly, and it compiled into an identical object file.
That’s nuts
Using the switch can also improve code safety by enabling the compiler warning/error for switch statement not handling all values in your enum.
So in future, when you add another value to your enum, the compiler will tell you about any switch statements you forgot to update
C99?
@@desertranger82, std::map implies C++.
Doubtful. That means the compiler would have to elide the map entirely. Maybe in a very simple case, but highly doubtful.
I'm convinced, I'm switching to switch.
If only i could
You should in any case
@@PingSharp 😆
If I use it, I won't break. That's if I don't default.
Do not get convinced so easy, test and measure times, that's the only way to know which strategy is best
The trade-off point between if and switch will depend on your target processor. A processor cannot look ahead through a jump table, meaning it cannot keep the pipeline full; it may have branch prediction to try to keep the pipeline full anyway. Look up branch prediction and indirect jump behavior for your target processor. Only benchmarking will tell you the trade-off point. Good luck.
I would have imagined that modern compilers with optimizations turned on would be smart enough to choose to translate the if/switch to the appropriate assembly if/switch depending on the specific code.
Most of them are. Not sure what compiler he used with what settings. In some cases they can convert if-elses to jump table, like they do in case of switch. In some other cases they might decide to not generate jump table for switch, because multiple compares would be faster or much more memory efficient (in terms of produced code).
Yep, to many compilers it doesn't matter at all whether you write a switch or an if statement. I guess the switch just happens to be easier to optimize, so it's more likely to be faster under any random compiler. In addition to O(1) jump tables and O(n) if-else chains, some compilers actually use a hardcoded binary search to find the correct case by repeatedly cutting the range in half, which is O(log n) on average; that can be a good middle ground between those solutions. Most compilers change the implementation for if and switch statements depending on the number of items to be compared and the type of data presented (consecutive numbers, or something that translates to them like enums, vs random-looking numbers without any apparent pattern, etc.).
You forget the human here.
If I have a long if-else chain that I carefully created with only fixed comparisons to a single variable
The compiler can optimize that to a jump table.
The next developer working in this file adds an if-else that breaks this pattern, and boom, performance drops for no obvious reason (unless you know of this compiler optimization).
By using a switch I tell the next developer my intention, and she/he needs to make a conscious decision to change this concept.
They are, the if code is not optimized in any way
This is the correct answer.
People who write modern optimising compilers are generally smarter than the average programmer.
It's _relatively_ easy to look at an if-else cascade and analyse if it's evaluating equality on the same variable and is, therefore, functionally a switch statement.
One only need look at Python: Guido explicitly excludes a switch statement as he _knew_ this was possible.
To be fair, if the if-statement can easily be transformed into the switch statement as you did in the video, any reasonable compiler should output the same asm as the switch statement.
@@piisfun What? Modern compilers are perfectly capable of generating a jump table from cascaded if statements, as well as converting a switch statement into a series of comparisons, depending on whichever is faster in a given situation, and they can do it while compiling a much higher level language like C++.
Yeah, just put -O3
Yeah, with the example code, yes. The problem comes when it's organized a little differently and the if/else chain isn't always comparing the same variable to the same value (i.e. wouldn't be compatible with a switch). That's an easy thing for someone to do accidentally, coming into old complex code and adding another case, and suddenly it performs a lot worse. So with modern compilers it's more of a protection against myself than intentional optimization, imo. Though there are use cases where you need to be intentional about this, like some embedded stuff.
In my first design review at Novell, my team lead said I should use a switch statement for true/false tests rather than if/else. So I coded both and looked at the generated code - they were identical.
Compilers since 20+ years ago have been smart enough to ignore your preferences as a source code writer and just do the right thing at the machine code level. I used to enjoy trying to outsmart compilers, to make 3D renders run faster by studying the machine code output, but gave that up long ago. We have created tools smarter than ourselves!
If the if-else chain tests something that can be trivially changed to a switch, it's better to do so, simply because it communicates intent better, and gives more well-structured code. Or, put another way, a long, simplistic if-else chain is just a plain eyesore, IMHO.
It doesn't necessarily matter if the compiler can magically figure your intent out in this specific case; in other cases, bad code (which this smells like) can actively hinder it from figuring things out and making optimisations.
TL;DR: Don't make it any harder for other people to read your code than absolutely necessary, and don't expect your compiler to perform magic on weird code, even if it's _often_ able to.
@@mnxs It's most useful when you have a situation with multiple checks that are simple tests and the actions to take are mutually exclusive.
Presumably in this situation those code blocks can be treated like an anonymous function and reduced to something akin to BASIC's IF ... THEN ... of the form
IF condition1 is true THEN gosub 100
IF condition2 is true THEN gosub 200
IF condition3 is true THEN gosub 300
P.S.
That said you can "optimize" your If-Else constructs by ordering the checks in terms of the most common conditions you expect to be true.
Switch statements are to be preferred when that ordering/ratio is completely unpredictable. Or, as he puts it in the video, they are all equally likely.
@@DrunkenUFOPilot That takes all the fun out of it. :P --- Still it does save your butt if you do something stupid in the code.
1. Which compiler?
2. What compiler options? (If you compared with no optimization flags, then good luck; that's not even production code.)
3. What language version?
This whole if/switch comparison makes little sense. Compilers these days will chop your code in such a way that you won't make heads or tails of it, so I assume adding -O3 will make both comparable, but then again, you won't be able to read through the ASM.
That misses the point. The point is that the main reason switch statements exist is due to this optimization. A compiler turning an if-else tree into a jump table is explicitly a compiler optimization. A compiler turning a switch statement into a jump table is just what the compiler logically should do.
@@vibaj16 I did find cases where switch statements won't get turned into jump tables, so it's not as logical as it seems. For example, if you have non-equally spaced case values with very large gaps in between (random numbers like 1, 260, 1070, 24080), the jump table would be gigantic, and even with no optimization gcc and clang (with -O0) seem to refuse to use jump tables in those cases.
It makes sense, but it makes me doubt that this optimization is the main reason for the existence of switch statements, since the compiler obviously still integrates its own decisions even at -O0.
PSA: It does not matter. Modern compilers have aggressive optimizers that transform if to switch, and switch to bit-tests, whenever it is valid && profitable speed/size-wise to do so.
Yeah I was going to ask about that in the comments, optimising compilers were my first thought
A good compiler will let you set flags to indicate what you want it to do.
I don't really care about tabs vs spaces, but 8 spaces tab width hurts my soul 😭
the Linux kernel codebase actually uses 8-wide tabs (iirc)
I believe it's a method to prevent nesting. It gets unusable at the depth of 3, so it's motivating to extract and write separate functions/methods/classes.
@@moorscode that is actually the first positive thing I heard about such a big amount of spacing and it totally makes sense. Maybe I should try to get my coworkers machines to look that way xD
tabstops at 8 spaces is.... a very long-standing de-facto standard. I don't care what you set your indent level to, but on a modern computer, there's very little reason to make your editor necessarily use hard tabs of a different width, and plenty of reason (IMHO) to maintain the idea that tabs are 8 spaces (or rather, align you to a column number divisible by 8), for compatibility with existing things.
I know right.. i only use 2 spaces.
scanf("%4s", optionbuf), from man scanf:
An optional decimal integer which specifies the maximum field width. Reading of characters stops either when this maximum is reached or when a nonmatching character is found, whichever happens first. Most conversions discard initial white space characters (the exceptions are noted below), and these discarded characters don't count toward the maximum field width. ***String input conversions store a terminating null byte ('\0') to mark the end of the input; the maximum field width does not include this terminator.***
Also, your while loop condition is UB on the first iteration.
This looks like a compiler's job to me
It really isn't; it's all about how you want your code to run. That's why there are so many ways to do the same thing.
I'm pretty sure same optimization applies to a sequence of IF statements on modern compilers. What optimization level is used here?
Seeing the redundant move of the value in eax, it's obviously compiled in debug mode; no compiler would output such a thing outside -O0.
This is what I was also thinking
When I was accelerating code by eliminating branches, gcc was already eliminating the lowest level if by using an optimal assembler instruction.
This is correct. Common subexpression elimination is one of the first optimizations I learned back in the 1970s while in college as it can be applied early while doing lifetime and flow analysis before code generation
This optimization can only be applied to if statements whose conditionals strictly operate on integral types and never check said integral types against a range of other integers (e.g. > or <).
It is a neat mechanic, but disappointingly rarely actually relevant.
1. You need to have a decent number of branches to actually notice a difference. Even with 5 options, it's still only a minute difference.
2. If you have a huge number of choices, you will often implement it as some kind of hash map or state machine anyway, rather than if/else or switch.
3. The speed of the branching is only relevant if the code that is executed in the branches is also fast. If you're doing a printf at the end, then your improvement by using a switch will only make up a fraction of a percent between hitting the control structure and actually seeing the text on screen.
That said, these niche cases where the speed gain is relevant of course still exist, just not in many places.
I guess I'm kinda traumatised about this topic after the whole YandereDev incident, where everyone and their dog got hung up on the point that the code was bad because it used if/else instead of switch, even though that was the least of the problems with that code.
Ultimately it turned out that their performance issues came from calling an expensive UI function multiple times per frame and the use of heavily unoptimised graphic assets.
No measurements? The first step in optimization is measurement.
The second step is to turn on your compiler's optimization.
Who else is interested in how the "magical value math" works ? The whole point of me watching this video was to understand this lol.
Jump/Hash tables are things I'm not sure I quite get, so me.
I just assumed it was like:
Table address = 1200
q = 71
d = 55
So it just reads the value (q) and adds it to 1200 and jumps to address 1271 and executes the code there. Actually he said this part contains a negative number(?) and uses that in conjunction with the table address to jmp to code. Close enough?
I think he went over this part too quickly or something.
@@user-zu1ix3yq2w he kinda explains how they jump to the exact switch statement that satisfies the condition. But there was a magic address that points to the statement in his explanation.
How it calculates the address to that statement was not explained and was mentioned as "magic". I clicked on the video title to know exactly about that, but it was mentioned as "magic math" here.
@@aneeshprasobhan Oh well. I think he should make another video explaining it. Obviously we could just look up jump tables and learn about them, but I'd prefer he just explained it in more depth. I've always found magic numbers fascinating. I think a jump table is a natural solution, one we would invent ourselves if we had been approaching these problems in assembly and using the goto statement, instead of using high(er) level languages.
I guess what I can say is, this jump table solution for switch statements (goto table[table_address+offset]) is better for (numerically) consecutive cases, and less so for cases that are "sparser," especially for memory management.
Seems to be like branchless programming. There is an Aussie guy on here that is really in to it.
I think it boils down to a feature in the language. Switch statements have to have constants for the case statements. You can't use a variable in a case statement (i.e. CASE: ). So the values to be checked are all known at compile time, and the 'hashing' can be pre-calculated.
'If' statements however, could be comparing two variables whose values are not known at compile time. So even though in this example the value just happens to be a constant, the compiler doesn't have an easy way to go from assembling in each 'comparison' to trying to 'hash' if it just happens to be that one of the items being compared is a constant. Also, all the 'if' 'else if' clauses could be using completely different variables and values, whereas 'switch' statements the test parameter is always the same throughout all the 'case' statements.
As per Jason Turner, I would also almost always avoid using the default label when switching on an enum since you could forget to adjust the switch statement when a new value is added to that enum. This should always be configured to cause a warning in the compiler, but the warning won't work when a default label is present. And yes, you'll probably have to extract the switch statement to a new function so you can handle the default case outside of the switch if it falls through, but without that label.
Also another vote from me to use Compiler Explorer like Turner does when demonstrating assembly. It's just the best.
MISRA, for example, requires an explicit default. The alternative is that, unless intentional, the default should assert()... or just use a better static analysis tool/compiler, which will tell you when you have an unchecked enum value.
If this kind of error makes it into the wild, you have more serious process issues.
Perhaps in embedded land where unrecoverable errors are Really Bad™, but for most programs I would argue you should always have a default case that throws or asserts. Of course, I would never recommend a default case that just silently ignores the problem. Even better is when your compiler can catch the problem for you (a la Typescript).
@@clonkex It really depends on the context and situation. When writing code for yourself, it's perfectly fine to have an empty default or just print out a message.
Where it becomes a problem is when the valid cases are strictly defined and receiving anything else is Very Bad News.
If you write a compiler or assembler, you really want to know if it encounters a non-existent instruction, a math/arithmetic error (division by zero), or some other issue that could cause the program to unexpectedly crash the computer it's running on or at least hang it up badly.
Otherwise it might quietly produce an executable that will suddenly crash without an obvious cause or waste a lot of computational resources on recovering from avoidable errors.
And God help you if it should take garbage and produce otherwise valid but pointless instructions.
@@jnharton I agree. This is why I say you should have a default case that throws. You want to know _for certain_ that you've covered every case, and if not, crash rather than do something unexpected.
you should use compiler explorer if you want to show whats going on in assembly, it has a much nicer view and will even highlight what gets translated into what assembly instructions.
actually an if else tree might get optimized into a jump table by the compiler if there are many branches, so using an if else tree or a switch really doesnt matter with an optimizing compiler. Though its good to know when to use what as it can make your code much more readable.
Well perhaps, but people who care about such performance improvements are working with embedded devices like Arduino etc. Everything is going to be quite basic.
keyword “might”.
@@Skyaidrules that's also the case with switches though
This depends on the nature of the if-else tree. It can only perform the jump table optimization for if-else branches if 1) the conditionals only operate on basic integral types, and 2) the conditionals only test for equality or inequality, and don't use >= or <=.
@user-li2yv5je5e I think you missed the point. ISA on an arduino is vastly different than x86-64 ISA.. or even x86-32 ISA. The performance profile is going to be vastly different. This is low level stuff that I doubt 99% of the people here understand, unless you work on compiler development. Joso997 apparently gets it. I've seen plenty of people come from 'modern' development teams, into 'tiny MCU' world and not understand this.
Fun fact:
* Write some code in a function, doesn't matter what
* start a switch statement at the top of the function
* put a "}" at the end, to finish the switch
* sprinkle "case" statements in between the lines of your code wherever you want. Even inside loops, if statements, other switches etc.
This is legal C/C++ and it even works, although it makes no sense at all.
I think the like button glowed when you said "like" at 10:57 o-o
Great video as always. I understand the difference, but only at an abstract level, so it's pretty cool to see how exactly it works in assembly.
Very good. We learned today that switch statements in C are highly optimized. Thanks!
This will very likely be optimized out (for the if statement checks) if you tell the compiler to do any sort of optimization. Also, it's relevant to note that the hashing of the switch statement was so simple because you used a series of character comparisons of which each character was different. If you had multiple command inputs that matched even the first letter of the command, the switch statement would still branch.
But you showed a neat topic few know about, kudos to that
Love your videos man, I know some of this stuff but it's a great refresher or intro for others keep them coming
Please correct me if I'm wrong, but isn't this only true with optimizations turned off? I did a quick bit of playing around on compiler explorer before commenting and it was true there was a substantial difference between a switch and a if chain with optimizations turned off but if -O3 set both switch and if result in the exact same assembly.
It depends on how many cases you have
This. Decisions between if-else chains and switch statement should be based on what is more readable or what expresses the intend of your code better. Every major compiler is perfectly capable to see through switches and if statements and can transform one into the other if it thinks that's going to be faster. And generally the compiler will do a way better job at making this decision than the average programmer. The advice given out in this video is simply misleading and will make new programmers rewrite perfectly fine code because they now think that is will be faster (which it won't).
@@williamdrum9899 I did a bit more playing around, this time with 9 cases instead of 3. The asm is almost identical. The compiler did switch to a different approach than before: this time it declared basically an array of length 26, one entry per letter, containing the start address of each case, and then used the switch argument as the index into this array to find the correct case code, e.g. 'C' - 'A' = 2, and index 2 in the array is the address for case 'C', which it then jumps to. This is the same for both if and switch. There is one small difference, which is that with the if version there is an extra comparison at the beginning which checks the first if condition; if it matches, then it skips everything I just mentioned. The switch version doesn't have this comparison, but everything else is the same.
Not if your if statements greatly differ from one another
Nice! A very common place to optimize with switch is when parsing command line arguments. The problem is when you need to call strcmp or some function like that for every argument, you can't do that in a switch statement. A nice trick is to separate what can be solved with ifs vs switch and use the switch first, managing to even prevent the strcmp function calls.
I wonder if you could just run a switch statement over the first character and only then do the `strcmp` (and only compare with values that actually start with that character), but that would make the code look a little ugly. :(
@@ThePC007 That's actually a very good trick, for example gperf uses that trick to parse keywords to generate hash maps. I actually use it all the time when parsing command line arguments, I check if the first character is a dash '-', and only then I call strcmp from the second character, as most command line options will start with dashes.
To parse argument options you can hash argument values into a fixed size (per value) hash board buffer (long enough to hold your argument options) where you hash the string value and then memcmp() the two memory blocks to make sure the value is the same as you wanted.
Then you get an almost constant evaluation time.
@@informagico6331 That sound extremely efficient and fast, and probably the way to go for big production programs, but maybe an overkill if you're making a simple cli tool. Depending on the amount of available command line options, the simpler way could even be faster.
Yacc + lex. It's a long time I used them last, but if I remember correct Lex can make exactly such tables.
I ran the if-statement version with -O3 optimizations enabled with the g++ compiler, and it got converted to the switch version !!
Compiler optimization is pretty dope
Oh hey that's actually really cool! I've never seen someone explain how switches work at a byte level, and when you do it makes it really obvious why switches are so fast.
Some benchmarks would be nice to check if the performance really changes that much (probably not).
Any optimising compiler worth your time will produce identical code for either statement anyway. Usually it is able to transform the code into either lookup tables, trees of if & if/else statements, and often even branchless code. LLVM has been able to do all of this for a while now, as has GCC, and even MSVC has some capabilities here.
This guy is really being irresponsible by presenting his performance advice as true when it very rarely is, and not even measuring the difference.
@@codaaaaaaaaa and even if the compiler doesnt optimize in the vast majority of cases you dont even care as processors are extremely fast
Yo this is actually so sick. I saw a comment saying what it basically is... But man am I glad I stayed around for the actual explanation, this was great.
I like that you supported your thesis with a reference to the assembly code. I would be even happier if you ran some test code to compare the time it takes to complete 100K of switch vs. if versions of code.
Nice tutorial! I know it's just a test program, but you should consider clearing optionbuf with 0 before you use it, since it may contain arbitrary values that could trigger your stop condition.
He also has a 1-byte buffer overflow, since the length specifier in scanf("%4s", optionbuf) doesn't include the null byte, and he completely failed to consider compiler optimizations. This video can basically be deleted.
MIND = BLOWN! What a great optimization, having seen that I can see bunches of ways to employ that to accelerate code flow in scenarios with complex nested comparisons. I would definitely want to wrap it with some sugar so its readable though.
Reminds me of the time (20 years ago) I used a function pointer rather than a switch statement. It did not do this, but instead searched down the list like the if statement. This was for a state machine with lots of states. I knew it was inefficient.
This switch-case example looks kind of like a function pointer. When I have a long-ish list of compares like this, I tend to check the assembly output for efficiency like was done here.
I began programming in machine code, then assembly. Working down to the bare metal is the best way to understand how to write efficient code. I learned all these tricks from the start.
I learned something new and also realized some illustrations would be very helpful, especially to explain how an input is combined with a table of negative numbers (subtracting it) to find which option to choose without comparing two values 🤨?
“Brilliant,a *free* and easy way, …, 20% off subscription” sure
🤓 This is my comp sci degree speaking out, but there is a case where an if/else block might have a marginal performance benefit. If your input data is distributed in a way where one or two values are the overwhelming majority, with tertiary and onward values being exceedingly rare, it might prove just barely optimal to use an if/else tree rather than a switch.
The important point is that you *MUST* profile your code. Modern processors are just so complex that you just can't make any assumptions about performance anymore.
Thanks for the video, I think it would help a lot of people if you dived deeper into ASM to explain the differences between using ifs and switches. There are some things that really do need to be talked about in terms of what the computer and processor is ACTUALLY doing and switches and ifs are a perfect example.
PS. I had no idea people still had tabs as 8 spacing instead of 4 or even smaller. That is wild.
enum class in c++ is nicer because it doesn't pollute your namespace
std::array in c++ is nicer than char arr[4] because it has built in bounds checking without overhead. better still would be std::string, because that's the correct datatype for a string.
i could go on. I see no reason to use c over c++ when it's mostly backwards compatible anyway and you can just use the nice, safe and fast features of c++ over the unsafe trash that you will inexplicably write when writing in pure C. That said, mixing C and C++ will yield terrible code, so might as well just write c++ and dump all the c junk.
And yes, c++ works just fine for embedded.
Yes, this code, which reads from the same memory location repeatedly with no other data memory accesses in between, is going to have cache misses. Do they not teach you CS guys anything about locality? Do you just see memory accesses and assume "cache could miss there"? Yes, there are some edge cases, but in general that access should hit, and in many of those edge cases (interrupts, multicore, etc.) the culprit will cause a much larger slowdown than the single cache miss it causes in that code.
I've only ever worked with 16-bit MSDOS (no cache) and even I knew this.
Just that modern compilers already optimize all this stuff for you, so it really doesn't matter which one you use.
I always knew switch was the faster option after I learned Java but had no idea why until now. Thanks for this!
I was afraid of for loops and used nothing but switch statements in my code. Then I finally learned for loops and have begun rewriting a lot of my code to make it a whole lot better.
NGL I was not expecting much. But this is amazing. Not a single comparison.
There are 3 ways to do this and option 3 is the fastest and cleanest. That’s having an array with function pointers for each of the enum values.
I remember learning about another benefit in my computer architecture class. Modern processors use tricks like out of order processing and branch prediction to improve performance. More branches = more wrong predictions and more operations performed that need to be discarded
glad to hear THAT, because the program I've been writing has a switch/case block with (so far) 171 cases in it, and it'll probably end up with a lot more than that before I'm done
Predictable, fixed execution time is another important difference.
In this case
Switch: O(1) == fixed time
If-else: O(n) == depends on the input...
If the input is true random it's O(n)
If 99% of the input matches the first if, then it's fast and close to O(1); if 99% matches the last if... you get the point.
In any case, execution time for the if-else is harder to predict.
I still remember reading the AMD optimization manual like 20-ish years ago and figuring out how to use that to your advantage in so many ways. Even when you have simple string constants you can still take advantage of it by pre-hashing the strings, and there was another trick in there involving multiple if branches that could be compressed into a single switch by using the compiler's own optimizations to convert a test into a simple bit and then bitwise or them all together.
can you give me that resource to read?
@@sent4444 I can't directly link it, but just Duck the AMD optimization manual and you should find it. The Intel optimization manual is also a good resource. They both provide some really good tips if you're thinking of getting into writing assembly or a compiler or just improving your code from a higher level.
It's almost like the binary is a hash table and the switch is turned by the compiler into a very simple hash function to index(jump) to the right spot. Cool.
Pretty much. The beauty of this is that no comparisons are actually made. You basically have a big table where most options are "do nothing" so that no matter what the user types, all cases are handled
First of all, use "gcc -O2 -S", and then analyze the resulting assembly code. The assembly code you have shown (the first one) most probably resulted from unoptimized compilation. my gcc 13.2.1 generates different code.
Second, use non valued enums (not assigning values to the enumeration identifiers), and the result will even be better.
And finally: switch was always supposed to be implemented by jump tables, but in some cases (like having only a few choices), the code generator could select a more appropriate (=faster) implementation.
I think the best place to illustrate the power of switch statements is Windows applications. Any logical window (like a button or a static control) has a callback function that processes events in a large switch statement. And events can be very common things like "paint the window" or "the mouse has moved". As much as I dislike Microsoft, the event-driven programming that Windows uses is superior to the object-oriented programming that Gnome uses.
Also the SDL backend has an event system for input that is typically processed by a large switch statement.
This was super cool. I got out of the habit of using switch statements but I'll start doing it more now.
It looks more like a jump table than a hash table as others have opined. Also, this only works in certain cases, yes? if the range of the switch is too large (say, outside the range of a few alphabetical letters) then I think the switch statement resolves to a series of "if" statements. Great video, I'm loving all this geeky low-level stuff. I'm subscribed and continued success!
I have recently started using switch() statements in interesting way:
Have two compares on both sides of &&?
Bit-pack compare bools into int as two bits and switch case on the int. (4x cases)
On three bools (8x cases) this is massively more simple and faster.
Because in many cases they translate directly to a single or very few assembly statements. However, given a properly designed compiler and language designed to be optimized, often IF statements can in select cases be translated to be as efficient as switch statements.
Do note, this behavior is specific to languages that only implement switch statements that only accept integral types, and only compare against single integral constants, like C.
If you use C#, the behavior is different, according to Microsoft's documentation: "A switch statement evaluates case patterns in text order from top to bottom.", text patterns which can be either a constant list evaluation (in which case, if its an integral type, it can use jump table optimizations) *or* a bunch of comparison evaluations (e.g. value >= 10), which compiles down to IL that is no different than an if-else tree, so understand the behavior of your chosen language and use cases before throwing switch statements everywhere, expecting a performance boost that never manifests!
Hi, computer engineering student (hardware lol) here. Switch statements are basically state machines at the hardware level (think of it like a recursive register that updates itself based on conditions of the current state and whatever inputs/outputs), and they are quite fast as opposed to a bunch of if statements/nested if statements, as that's just a ton of multiplexers in a row. As great as a multiplexer is, they can have quite a long bit depth (the longest path a bit of information can take). It's just more efficient to use a state machine, which may only use a couple of multiplexers per clock cycle, and register delay is much less than that of a mux.
Let me tell you a shocking truth: You are wrong!
Both “if” and “switch” compile to the **EXACT SAME** binary, on x86-64 GCC 13.2 with any optimization level higher than 0.
Please see my follow-up comment for a link to my Godbolt Compiler Explorer link.
Did you forget to turn optimization on? Or are you using an old compiler?
The Godbolt Compiler Explorer: /z/e6Kc4cWxx
By the way, I fixed 3 bugs of your code: two undefined behavior, one buffer overflow.
Awesome video, thanks! :3
I wonder whether the same holds true for rust if statements vs switch statements
It does -- play around with rust's match with compiler explorer, it's very smart
Assembly is beautiful! My favorite language, would not use it for product development but it is beautiful.
This was actually very confusing when I was younger, what a nostalgia trip.
Might be different for out-of-order execution, but when your CPU is pipelining instructions, it will usually be processing multiple instructions at a time. That means for an instruction like jne, your CPU may end up processing the instructions right after the jump when in reality it needed to be processing instructions at the location specified by the jump instruction. When this happens, the CPU is forced to flush out those instructions and then continue from the correct location. Branch predictors can reduce this chance, but that is an additional cost of if statements.
Exactly my thoughts 😊
I would imagine that a switch statement makes it easier for a compiler to optimize for branch prediction.
so basically the compiler converts the switch statement into a sort of hash table? that way it runs a hash on the condition and immediately jumps to the corresponding branch's address, correct?
yes, correct.
If nothing else, the case syntax is super clean and readable.
nice. Now let's see 3 ifs and switch/cases vs 8 ifs and switch/cases when optimized.
Essentially, this is just a highlight of branch conditional logic vs jump tables and how for large conditionals, a jump table (or lookup table) can be faster, usually because arithmetic of certain sizes are faster than conditionals of certain sizes. (Iirc GCC the size is 4)
The secret here is that large if/elseif blocks are decayed to jump tables, and small switch case statements are decayed to if/else branch blocks.
You can do this manually as well by building an array of the conditional logic you want to do (say if the functions are the same, you can build an array of function pointers), and just indexing into that using conditionals. It's a very cool trick, especially for printing a string from an enum containing its value, for debugging for instance.Turns the entire switch case into a one-liner.
This is so interesting. I used to avoid switch statements in favor of hash maps (external or built my own) or array-offset-based options, because I really thought the compiler was stupid enough to do a sequential search, but honestly this video got me thinking...
Give the compiler more credit. They are REALLY good these days.
That's so sick. It does make sense, since in a switch statement, you're always just doing an equals comparison with one operand being fixed, effectively turning it into a unary operation. Whereas in an if-else chain the condition you're branching on is completely arbitrary.
The switch table at 8:19 pretty much lets the CPU not need to compare anything at all.
@@williamdrum9899 Well, yes, I know? I watched the video. I might be missing the point of your comment 😅
One for one, a predicted if is faster than a switch. If you have more important values (such as a success vs error return) where the success performance is far more necessary, use an if for the important value(s) first, then do whatever with the rest; it allows for branch prediction then. Also, if your table isn't dense it will be converted to an if-else tree anyway. Predicted branches are mostly just the instruction slot expense, so single cycle - almost free.
I've been on coding sites for a while and this just made me want to dig deeper into assembly, maybe lol. Thanks for the video, great information! *the more you know star*
Switch statements can also do a dichotomy (binary search) when values are dispersed.
We can also take advantage of both worlds, by doing if (most probable value) and switch on the other values.
The workings of "switch" under the hood are ingenious. Can't believe how simple yet insanely smart that is.
I just noticed, when he said "if you LIKE this video", the like div glowed- That's awesome!
Ah yes the good old jump table. I remember playing ShenZhen IO (it's a game (if you can call it that) where you build digital circuits and code with a very simple RISC assembly) back in 2018. One of the levels was a sandwich making machine or something, and I had the idea to use a hard-coded jump table (using a ROM) to store the recipes and thereby cut down the number of instructions.
This is basically just that, but slightly more involved. Slightly.
Switch statements are also much more clean when you read code
It's more verbose
@@timmygilbert4102 well, I suppose we can't have everything. If I have to write more than two else statements, I prefer go for a switch
@@peyop5262In principle I agree it is cleaner, but I don't like the need for break.
@@cherubin7th break breaks everything, which is why I prefer if. Also in game making, whenever I have a switch I'll need to break it because exception happen, with a if it's just inserting or changing the test. Though nowadays I just pass a fonction or an object by reference, much cleaner.
Amazing. I've always wondered how a switch...case worked at low level.
IIRC, the Pythonic way to do a switch statement (Python doesn't have switch statements) is basically what the assembly does; using a variable (usually user input) as a key to lookup in a dictionary or array of functions
Pythonic way 😂😂😂😂
Python does have a match - case statement
@@juhotuho10 True, but only since 3.10
To me personally, to use switch statements the value has to be a single denominator AND there have to be at least 3 or 4 options. If is just more flexible, and switch requires more code to work. But if you have many options depending on your value, I think it's a powerful tool, and when you write big programs or it gets called a lot, it's worth implementing.
I also consider size of the table since I usually program on 16 bit hardware. These kind of things work best when the choices are consecutive. So instead of using letters it's better to ask the user to type 0, 1, 2, or 3 since you don't need a massive table, you can just use bit twiddling to clamp everything down to four options
The issue is mostly about difference in branch prediction misses. Anytime one branch goes differently than predicted every instruction in the pipeline after branch gets squashed. For instance ZEN2 that's 18 cycles with potentially 4 instructions decoded per cycle, and for all modern PC processors it seems to be on that ballpark.
The nested if statements can do that multiple times while calculated only does it once.
Damn. I just needed this since I've completely forgotten to use the switch statements.
Calling function pointers from array is quite similar, but should be somewhat faster.
Now I'm super interested to what does a match statement in rust compiles to
I must say the assembly language explanation is truly valuable and helps reduce loop cycles, meaning CPU time... Cool, good work.
Shouldn't the compiler be able to optimize the if tree to a switch-like statement? (with -O3 or -Ofast)
I agree, but I'm not sure you can always rely on the compiler to do that on all platforms though
Trust, but check
So much this. At the end of the day if you didn't look at the compiler's assembly output you have no idea if your program is optimized
Funnily enough, just like an if tree, you iterated through all previously stated information every time you provided new information in this video.
Fantastic video.
Basically proof that switch statements are indeed magical since they teleport you based on the input lmao.
Going to -O1 or -O2 could enable jump optimization and/or inlining and most of the code disappears in this example.
I like your rectangular bracket styler theme, it helped me with understanding switch syntax a lot.
It also depends on the language. Some like VB compile a Select statement into a bunch of If/Else statements anyway, so you can lose the expected efficiency.
The IL might end up containing multiple If statements, but the compiler can still optimise that to a jump table in the case of an enumeration, which has known values.
That's actually awesome. Time to go restructure some of my Rust code.
Compilers will likely be able to optimize ifs to the same code as the switch in such cases.
Absolutely fascinating! 🤯 Thank you very much! This is the first explanation of why switch is preferred over if else that made sense to me.
The way I understood it from working in assembly is this: With a long chain of if-else you compare your input to each possibility. With a switch statement you're comparing nothing at all!
You may also want to look at the CPU architecture: the scheduler loads instructions into a cache and does speculative out-of-order execution, which can basically render this optimization useless. 😂
@@timmygilbert4102 Right. It would be interesting to test under which circumstances the branch-prediction benefits of if/else outweigh the gains of a switch statement. 🤔
@@sciencelab4225 there's a blog post that tested it: 1024 random ifs with no slowdown. Memory access is slower than compute, so reordering the instruction cache to feed the registers and discarding false executions is faster than just querying the right branch. So unless you have a long switch statement or a deep if tree, at which point you probably need another way to structure the code anyway, it's unlikely to even make a dent or show up in a benchmark. It's only a style choice.
It was the fastest in the era before caching
I appreciate your explanation and comparison between if/else and switch statement.
C is just such a pleasant language to use tbh. I'm mentally scarred from having to use Classic ASP at work ATM, can't wait to get back to the C codebase
imagine writing javascript and python for a living
c++ >>
For a small number of options, a jump table can have a negative impact, because the branch predictor can run into difficulties predicting the indirect jump through the table. However, compilers often transform these edge cases into if-else chains anyway.
If one option appears more often than the others, it is probably better to write an if statement with a likely attribute for that option and then use a switch-case for the remaining options
What do you mean by prediction? Isn't it just a number that gets added to the program counter to jump to that specific branch location?
I had to refactor some code written in Go, and when I saw the looooong tree of if-else statements, I was baffled. Somebody wrote that without even once thinking, "Damn, this is getting ridiculously big, maybe there's a different way to do this".
Now the code looks compact and much more readable, and I don't even have proper experience in Go.