I don't care what other viewers say. Keep using paper! Sometimes you have to go the extra mile to make a point. I like your teaching style. Thanks for the videos. Very good info here!
+JP Aldama I agree. There is something I love in making notes on printed text. Plus, you can explain something so much quicker on paper because drawing and organising information is quick and intuitive, whereas doing such on a computer takes time to plan out.
They don't teach us that because the last forty years of computing history have been all about NOT reinventing the wheel. People got tired of having to start over every time a new computer came around, so we standardized our hardware, and operating systems (most notably Unix) became portable between CPU architectures. Developers (the vast majority of them, at least) stopped caring about the low-level stuff because they didn't need to anymore, and the computer science world progressed towards higher-level things. They don't teach us how to actually do it because going from nothing to even just a bare-bones, functional shell environment by yourself would take years and years of development. So they just teach us the theory behind how it works and leave it up to you to do that stuff, if you want to. I feel where you're coming from, though. I used to feel the same way and I tried to learn things from the bottom up, but trust me: you'll be a lot better off if you start with the higher-level systems and work your way down. It gives you kind of a bigger picture, so you can see where the little things fit in.
It's not a problem, though. Nobody teaches 8-bit assembly because nobody uses 8-bit assembly anymore except hobbyists, and hobbyists already have many resources available to them to learn from. Not to mention that most people interested in 8-bit assembly grew up with computers that ran it, and thus already know it! In fact, we have access to all the resources they did and more with the help of the internet. We can't expect the world to cater to our extremely niche interests. That's why we're all so grateful to Ben for sharing his knowledge and guiding us through the process.
Not just hobbyists. Assembly language can also be useful for hacking. I'd imagine it'd be really useful for reverse engineering, finding certain exploits, and malware development.
The instruction at 0x10000f63 is moving the result of the printf function (the number of characters written) to a location in memory (even though it isn't used)
I never figured out what printf() was supposed to be; it is implemented in 16-bit code that has to keep two registers pointed at the same address; it runs much, much slower than makes sense to me; a data block of 1024 bytes or whatever should be allocated at init. Like the while (int) above, I found much established CS to be Horror Code of the Damned, written by relatives of the Munsters to prevent the use of sanity checks like if/do/while, which work much, much better due to zero-based indexing.
When I first started programming in C (mid 80s) I wanted to make sure the compiler was doing a good job and would always check the assembly for timing critical code. After doing this for a while I realized I could write the C code in such a way to influence the compiler to output very efficient assembly. Nowadays, the few times I do this, I'm amazed at how good modern compilers have gotten at optimizing for speed.
@@random-user-s There's not much you can do nowadays lol. Also not really worth it imho because of how good compilers have gotten. But there are some reserved keywords in C and C++ that can tell the compiler certain things. All I really know about is marking functions inline, which can sometimes boost performance. Again, it's not really worth doing that because the compiler should do all of that for you when necessary (if you compile with -O3). It's pretty easy to look up and you'll eventually get the hang of it when you code more.
@@random-user-s Why would you? What's wrong with your compiler that you want to upgrade it already as a beginner? Maybe you should just try another one? Msvc, mingw or clang.
@@squizex7463 Maybe just out of curiosity? Or because someone wants to understand things better, or is just keen to do hard things? By your logic, one doesn't have to do anything, because all the good things you could build your software with are already written. So all you're left to do is use them, which is boring af.
Ah, I thought that the memory addresses were just chosen "randomly" by the compiler. But this makes me wonder though... how does the computer know how much space a variable takes up? Nothing in the machine code in the video shows that. What if the variable took up more than 4 bytes?
@@aurelia8028 In many languages you declare the datatype, right? In "int x = 2;" an "int" is, for example, typically 4 bytes, and "double y = 5.4;" would make it 8 bytes, etc. *edit:* the size also depends on your platform... as mentioned by another commenter below, an int may be 2 bytes as well
@@deschia_ We did testing on it back in college, comparing hand-coded assembly, C, Fortran 77, PL/1, and last (and least) Cobol. C and Fortran compilers did a reasonable job of producing something pretty close to what we did in assembly. PL/1 threw in some extra overhead which I think was related to memory management. And Cobol created a scary pile of machine code that we decided not to look into too deeply. I think it was summoning something from Cthulhu.
One of my college profs was in the Navy and needed to write assembly for the Navy to optimize COBOL code. He wrote it in FORTRAN and turned in the assembly. They had strict goals on lines of assembly to be written and debugged per day. He always met his goals. His reasoning was that FORTRAN was a pretty efficient language, and so he probably couldn't do much better. The Navy never knew they were converting their COBOL to FORTRAN.
You’ve reminded me of a talk I gave this year showing how some fortran code appeared in assembly. Fortran is still widely used in my field (supercomputing) and understanding the impact of such things like compiler optimisation is very helpful.
I like how the compiler optimised the while(1) into an unconditional jump instead of actually evaluating the expression "1". I know compilers have been doing that for decades, also it's a very basic optimisation, but I enjoyed seeing it on paper :D
@@_yakumo420 I think it is pretty interesting that even without any optimization, it became an unconditional jump, rather than a test of whether the int 1 evaluated to nonzero (I'm pretty sure that's how while(..) works in C). I guess it's common enough that the GCC developers just hard-coded that optimization into the compiler?
@@splashhhhhhhhhh Yes indeed but that wasn’t the point. The compiler detected that it’s a tautology and optimised it even without the optimisation flags set.
Even without optimizations on there are some optimizations that will always take place, such as not using hardware multiply/divide/modulus on powers of 2 etc
Regarding moving eax onto the stack: eax contains the return value of the printf call. It's not actually needed by this example. It's probably saved to help a C debugger display what was returned, and is likely a nuance of the compiler.
So basically, it's almost like the compiler turned "printf ("%d ", x);" into "int oX14 /* I chose the name as a mock of the memory location shown in the above assembly */ = printf ("%d ", x);"?
This makes sense, but I was wondering why this instruction only occurs after the prior 7 lines instead of right after the call instruction? I'm guessing this might be because the cmpl instruction will actually overwrite the value of eax to store the comparison result. Does this have to do with the compiler not being able to look ahead to see if the value will be referenced and just postponing storing the value for future reference until it absolutely has to? Also, this would mean the instruction wouldn't be there if the routine wouldn't reuse eax and just returned instead, correct? What code could have followed and still use this value at this point, without explicitly assigning it to a variable right away? Can you give an example?
Thanks for the explanation, but I'm still unclear on part of it. I understand that eax/rax contains the return value of the printf function and by the time "movl %eax, -0x14(%rbp)" gets executed, that's still the value of eax. From what you're saying, I get that trying to access -0x14 from assembler code would be a mistake, and I get that, but I don't see why the value needs to be kept around at all - it's clearly not referenced anywhere in the source code? What use is the return value of the printf function at that point? And why does it only get moved to that address at that point in time, instead of sooner?
Yes, I suppose so, in that I agree with you: it's really a question about the compiler and not so much about the program either in C or assembler. I'm a software engineer myself, and having written compilers, as well as tinkered with command interpreters in the age of DOS on an 8086, I can strongly relate to what you're saying. My curiosity was raised by the question raised in the video, about the meaning of that particular instruction - which was answered above by +Dameon Smith: it's the return value of the printf function that's being saved for whatever reason, independently of the program under consideration. I suppose I could look into the inner workings of the GCC compiler to find out, I more or less hoped someone might have an intuitive (and therefore short) reason off the top of their heads. But I agree with you, that's likely not the case - and certainly not the topic of the video, as the author rightfully stepped over the problem and seems to have taken some care to write their C code in such a way that the assembler would be as clean as possible for demonstration purposes.
The eax register will contain the return value of the printf function. Evidently it is being stored on the stack in the expectation that it will be needed later. Presumably you had the optimiser turned off when you compiled it.
I'm genuinely surprised C makes so much use of the hardware stack, since if you look at the C2 compiler in Java, for example, it absolutely hates using the stack and almost always does everything in registers unless it has no other choice.
@@theshermantanker7043 If you compile on any level of optimization, it usually doesn't make as much use of the stack. By default, GCC compiles with absolutely no optimizations on, though. I find it's easier to make sense of the compiler's assembly on -O1 (the lowest level for GCC), because it puts things in registers a lot more, like a human would.
@@theshermantanker7043 Originally that is what the register keyword was for. It told the compiler you wanted it to store variables in registers if possible, but it was just a request and not a guarantee.
I see, thanks for pointing that out. It's interesting that the compiler still considers that printf would need to return to where it came from, even when it sees that the loop is infinite.
I always thought assembly was useless and just a waste of time and money to take that class in uni, but after I finished the class I realized how important it is. This might seem like an exaggeration, but assembly made me finally understand how computers actually work, and it's definitely one of the most important classes in CS. Also, it's really useful for reverse engineering; a TA at my uni showed me how to crack a program just by understanding assembly.
@Adam Richard lol so true, I tried making more elaborate programs and instantly gave up. The fact that it might be very different for each processor one might have makes it very discouraging. Or just raging, don't even need the "disco"
My 14-year-old self back in 2003 would be extremely excited and thankful if someone had explained machine language in such a clear way. Thank you and well done!
Just a little remark for people wondering why the code generated by the compiler contains strange and useless constructs. It is simply because the code was generated with the -O0 parameter, which means no optimization whatsoever. This means that the compiler basically does a nearly 1-to-1 translation of the C code to assembly, without considering whether the operations are redundant, unused or stupid. It is only when optimization is enabled that the compiler will generate better code. In this example, for instance, it is stupid to read & write x, y, z continuously from memory. An optimizing compiler will assign registers in the inner loop and will never write their values to memory. The spilling of the printf return value, 'movl %eax, -0x14(%rbp)', will of course not be emitted.
Interesting that clang -O2 results in the output values (1, 1, 2, ... 144, 233) being hardcoded into the binary. The clang compiler is evaluating the result of the loop at compile time.
@@zoomosis Hahaha, that's very interesting. I've always thought compilers do such complicated stuff that I'm not even going to try to understand it. So I always assume they can do pretty much anything. I wish to write my own compiler one day, a very simple one though.
Can you define what you mean by 'spilling'? I mean, yeah, the return value of printf is loaded into this memory location, but it is never checked for success anyways, so why isn't it redundant?
@@lukasseifriedsberger3208 That 'redundant' store of the printf() result _IS_ the 'spilling'. The prototype of the printf() function shows that it returns an int so, by default, the compiler will SAVE that value somewhere (even though the value is never used!). If the source code is compiled with some degree of optimisation (eg: -O1, -O2 etc), then it will remove this redundant store of the printf() result, since it's never USED! For further reading: what does the return value of printf() actually mean? (Not many people have ever USED this printf() return value, so they don't know what it actually signifies - it's probably more relevant for sprintf() or fprintf())
Just fantastic to see how efficient the code produced by the C compiler is. I spent years writing assembler as a kid and used to have competitions with others on how fast and small we could make our code.
Spilling every value (including even the unused printf return value) to the stack isn't exactly the most efficient thing to do; however, that's exactly what to expect when compiling with optimization disabled.
I guess it's tightly related to how memory and cpu work internally. And it is very limiting due to the binary nature as well as a frequency ceiling of the transistors. Dead end if you will in my opinion. Invention of multiple cpu cores bought us some time I suppose but the future is somewhere else.
Bvic3 Notice that something is being already saved to the position 0x04 at the top. And the number is basically an offset to the base pointer (%rbp) so 0x00 would be the base(?) of the stack frame. I don't know, maybe something is stored there
I love how you write on the disassembled code like that. Makes it so much easier to retain and understand. I've wanted to learn to read disassembled code, I'll be doing this to help.
AKA "dry running", in the days when computer time was horribly expensive. It's still the best way to understand what's going on in code, and uncovering places for code optimisation, if performance is a problem. Don't optimise code before you've considered the algorithm, though.
@@thewhitedragon4184 they give you a block of code with a lot of unusual stuff and you have to answer what it outputs, or what are some elements of an array or something similar
Man, well done to you, you perfectly explained in 10 minutes what a professor in University had 6 months to demonstrate and still wasn't able to. Really interesting.
The compiler emits movb $0,%al because printf() takes a variable number of arguments. The System V AMD64 ABI specifies that when calling such variadic functions, %al must contain (an upper bound on) the number of vector registers used to pass floating-point arguments. There are no floating-point arguments passed to printf() in your example, so %al is set to zero.
Which ABI are you referencing? I tried to look for an appropriate OS X ABI that would cover the cdecl calling convention, but nothing I found mentioned this approach to counting floating point arguments.
Too late in this discussion, but the zero inside "movb $0,%al" is just information that the printed value should go into the stdout stream (in normal circumstances it means it will be printed on the screen). Anyway, this video and discussion have brought back a lot of memories... And last but not least, if anybody would like to, the source code for printf() is available, but be warned: this function is a really complicated one, because of the possibility of using a variable list of arguments with all kinds of types, formats and architectures.
frozen_dude - Yeah, I was hoping to have "otool" installed, but I didn't. I looked around and found this: stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc There are lots and lots of ways to get gcc to output the intermediate stages of compilation. I love gcc! If people have never walked through the stages of compilation, I highly recommend doing it.
I thought I'd throw an example of the complete compilation stages out there... I guess because I find it interesting and informative. When you compile a C source file, the process goes through 4 stages: Preprocessing, Compiling, Assembling, and Linking.
1. Preprocessing: 'gcc -E example.c -o example.i' - The example.c file is preprocessed: the include files and other directives (#ifdef, #include, #define) are expanded.
2. Compiling: 'gcc -S example.i -o example.s' - The preprocessed source is compiled into assembly.
3. Assembling: 'gcc -c example.s -o example.o' - The assembly file is converted into an object file, a machine code file.
4. Linking: 'gcc example.o -o example' - The object file is linked together with other machine code objects and/or object libraries into an executable binary file.
The *.i and *.s files can be examined in your favorite text editor. The *.o file and the final binary are both binaries, so you'll need a hex editor to view their contents.
This is a good example of why learning coding without understanding how computer technologies layer on each other seems so daunting. Just learning a coding language is not really that difficult. But coding is complexity built on complexity, and each layer down it becomes exponentially more complex. From an outside perspective, like when I first started learning to code, it feels like you don't just need to know the top layer of knowledge, be it Python or C++, but you need to understand what makes that work, and how something else makes that work. At the end of the day I'd have the impression I was going to have to learn how electricity works to understand the chipsets, or RAM to understand the next layer, and so on, all the way up to my code. The great thing is that these languages were made so we don't have to do that. OOP and modern tech have made almost everything so independent and modular that you can learn the end result without knowing fuck all about how it works. You don't even need to know how to code to write games anymore.
You are right, but there is one thing... I don't think learning OOP or a coding language is easy. They are also difficult, because if you want to learn them really well they steal a lot of time from you :(
I was thinking exactly this today! I was wondering how much I need to know about this stuff and how it may help me. Although I know I don't need to know all of it, this stuff is so interesting to me, and I think it can give me a better understanding of computer science as a whole, so I'm planning on at least doing some research. It's only been 8 months since I started learning web development, but I am fascinated with everything related to computer science.
When I was a newbie programmer back there in 2000, playing around with assembler, memory and registers really helped me to get a grasp of what pointers and references are.
Hey, I'm a newbie. I just had a quick peek at pointers and references and I don't see why they have this reputation. &variable just gives you the address of the variable's memory; *pointer returns the value stored at that memory. Also, instead of typing the address of the memory manually, & is how you access the address of non-local variables. I just don't see why they sound hard. Like it's explained for the line Z = X: the assembler grabs the value of X, puts it in a temporary location, and then puts this value in Z. It's all about memory.
Is it just me, or do you also feel excited when you see machine language? I was learning Python and working with it almost every day for 8 months. I started learning C and now it just feels like a really fun language to work with! I've even given Python a break for the time being. Watching assembly feels interesting as well.
@@unknownguywholovespizza To me it is eye-opening to see the true atoms of computation. It bridges the understanding of high-level programming and the understanding of how hardware fundamentally operates on the values stored in memory. I am a beginning game developer. I have heard stories of how developers have written their games in C or even directly in assembly to maximize performance while keeping the size of the games very low. While most of my projects use existing engines and much higher-level languages for the ease they provide, I wish to pursue skill in C and assembly so that I may be able to write games that perform as well as humanly possible.
I taught myself BASIC then Pascal then C++. Learning was actually fun with some of the books they had in the '80s. I got a C64 for my 8th birthday, and I got the C64 Programmer's Reference Guide. It's just amazing the things that were in that book. It went from teaching you BASIC to showing you the memory maps, the pinouts for all the chips, and how to do graphics and sound. But it also had a 100-page chapter teaching assembly! It confused me because it made cryptic references to an assembler called 64MON which I had no idea how to get, but that made it more intriguing. The assembler class I took in college was also one of the only interesting classes I ever took. But I'm pretty weird. I was such a nerdy kid that in middle school I wrote letters to Brian Fargo and John Carmack asking for career advice.
@@captaincaption Brian Fargo actually wrote me back! That would've been about 1990 or 1991. I don't know what happened to the letter. I really loved Bard's Tale III and Wasteland. And today, 21 years later, I'm doing 2nd round interviews for L5 (senior dev) at Google ... but I just wanted to see if they offered anything interesting.
This is so cool, and I think this would be a way more fun/efficient way to learn Assembly than what's taught in colleges. It's way easier to see where these commands come from and what they mean if they're being directly compared to an actual C program. Much harder if a bunch of Assembly terms you've never heard are tossed at you and all of a sudden you're expected to code a program like this.
Try coding in machine language, now that was a chore. Assembly is just a higher level language that is converted/compiled into machine code. I originally started out studying electronics, so we had a course in machine code and had to write a program using it.
Minor correction, because I used to program in 8080 and Z-80 Assembly: Those instructions from the disassembly are more properly referred to as assembly code instructions. Machine code would be represented by nice hex numbers for the opcodes and operands.
Actually Z80 machine language is relatively easy to program by hand: for each opcode there are a few bits of prefix and then register addressing etc. Then you convert all the bits into a hex number and you're done.
Early textbooks used to make a distinction between assembly mnemonics and machine code. Looks like those days are long gone and the terms are used interchangeably.
Z80... My computer life started programming a TK-82C at 1982... Good times... 15 Minutes to load a 15 KB program from a cassette tape (after many attempts)...
Since we're being pedantic here about the difference between assembly code and machine code, it doesn't HAVE to use 'nice hex numbers'. Some CPU architectures were more suited to OCTAL representations, and technically, binary would be equally valid! Footnote: Check out the ModR/M byte in x86 code and you'll see how well-suited it is to octal in this specific case! Having said that, I willingly admit that I'm predominantly a binary and hex man... LOL
The mnemonics directly represent those hex numbers. If he did print out the instructions in hex, you may as well then complain that it's not really machine code because it's not stored electrically in a computer, but printed with ink. It doesn't matter how you represent something, it's the same thing.
this brings me back to my assembly class at university, in 2002. i liked that class a lot, but i've never used it again since i didn't go into a career in embedded
@@tamny9963 - So you think that because someone can't remember something from 20-years ago that they're automatically lying? Or, are you just looking for attention?
I found your teaching so understandable for non-programmers and beginners. I don't know how to program and wanted to learn. I have an interest in C programming and now assembly. Thank you for the video, you are a great teacher.
Remembering my first programming. You looked up the op codes and entered them on a keypad in Hexadecimal. This literally was writing the cpu instructions directly. I miss the 6502.
Way back when I was in school, we had a lab course working with the M6800 (6800, not 68000). I used to write my programs in C then hand-compile them into M6800 assembler. And of course, hand convert that into machine code, which then had to get toggled into the machine.
Hey me too man! Learned Basic on my Apple II and when I wanted to include some heavy-duty math subroutines, I'd POKE the hex code into a memory location then call it when needed. Even on that old 8-bit processor it ran blazingly fast!
The 6502 instruction set was very nice and clear, as was the Z80's to some extent. The intel instruction set was ugly in comparison. ARM assembly language is even worse, it's not meant for humans. Every instruction can do something and can also do something completely different, depending on some weird prefixes. I hope no human being was ever forced to write ARM assembly code.
I appreciate your effort in making this teaching video: sharing what you know and honestly saying "I don't know" about the things you don't. Well done. I'm not sure either what the point of moving the contents of the eax register onto the stack is.
so it can be formatted loaded and printed ... it has to strip the format out of the print ... the the data pointer then the data then print it ... and a stack is the best place to do that from as you can shift left and grab the format ... and then shift left and set format up then load the next chunk and shift left ... read data pointer ...and shift left ... load data .. shift left and finally print ...
Just commenting to increase engagement because this video is so well crafted, a truly great presentation. Most people don't understand how difficult it can be to explain technical concepts well, and this is a classic example of how to do it right.
This actually makes sense. As mainly a C# dev, C isn't actually hard, first off. Pointers and such can get a bit complex, but they make sense. This code is certainly simple. The assembly makes sense too. It is beautiful how simple it is, and how it uses such simple functionality to create more complex end results. This helped my understanding of assembly and it might be one of the things that help me finally make a PS2 game one day.
It seems simple, until you have to implement data structures in C; then you find yourself crying for days on end, because you can't seem to resolve the clobbered-memory errors that keep popping up on you!
@@IM-qy7mf AddressSanitizer makes this significantly easier to debug, though. It's like a plugin for compilers that instruments code using the compilers' own semantic information. You should also get in the habit of writing asserts for potentially incorrect or dangerous code.
Almost a year late... On x86-based computers, eax is usually used for return values. Don't forget that printf is not void; it returns a length. The compiler stores it on the stack anyway. What you can do is ignore the stack and use only the registers ebx, ecx & edx to store x, y & z, so in theory it should execute faster. If I remember correctly, if you only want 16 bits you can use bx, cx & dx, or for 8 bits, bl, cl & dl.
I miss programming in assembly. The first code I ever wrote was 6502 Assembly on an Atari 600xl. I also programmed in the following assembly languages over the years: 8088, 80286, IBM 360, R10000 and MIPS. After 20+ other languages over the years, assembly is still the one I liked best. It just felt natural. When I first learned C and was using the Turbo C compiler, I often wrote the function headers and variable declarations in C, and just inlined the guts in assembly. Those were the days...
I don't. At all. I wrote Railsounds II in Assembly because the processor (Microchip 17C42) had 2k code space and 160 bytes of ram. It ran at 4MIPS and at the time (93) was the fastest micro on the market. I couldn't wait until I could rewrite in C. Which we did. The hardest part was convincing Neil Young, my client, that we needed to do that. The rest is history. Over a million units sold.
Agreed. Very creative, very obedient. CPU does exactly what you tell it; nothing more, nothing less. If errors exist nobody to blame but yourself; and maybe the standard libraries which for assembly are minimal and usually just the startup code. I also wrote assembly for Honeywell DPS 8 mainframe; now THAT was programming!
@@thomasmaughan4798 Not so much on the obedient part. I remember seeing in a presentation that Intel's 486 was the last x86 processor to simply run the instructions in their order. After that came the out-of-order execution optimisations, and things like processing both outcomes of a branch in the time the required value is fetched from memory, and then simply using the correct outcome. So nowadays you don't really know what and how things are actually executing inside a processor. Sometimes less optimized code can be better optimized by the CPU itself.
rax is a 64-bit register
eax is a 32-bit register which refers to the lower 32 bits of rax
ax is a 16-bit register which refers to the lower 16 bits of eax
ah is an 8-bit register which refers to the upper 8 bits of ax
al is an 8-bit register which refers to the lower 8 bits of ax
gcc -S -masm=intel program.c
AT&T syntax is ok, but I prefer Intel personally... you're welcome and thanks for the good video!
@@wh7988 Pick a processor and read the documentation; the documentation will tell you what instructions there are and what they do. You can look up YouTube videos or books on the processor and how to program in assembly for it. The class I am taking right now has us using CodeWarrior (an IDE) for programming the HCS12 (microcontroller). I am assuming going with an ARM processor would be a better idea though, since they are more popular.
School! A good (but expensive) Assembly book is "Assembly Language" by Kip Irvine. You can use Visual Studio, admittedly a "long" process to set up, to write, run, and debug MASM. Give it a go.
I'm in my last year of college, and I now finally understand this! Thank you! What I realised is that the teaching (at my college) was good, but they didn't put enough effort into linking this to the higher-level programming we were practising daily. But now it makes sense. Thank you again!
8:15 I believe that line puts the x value into eax, where it can set a flag. The next line sets the flag, and the next line uses it to determine whether to jump or not.
Really enjoy your videos. Started my programming journey, if you will, about 5 years ago with the idea of wanting to make video games. I later found assembly programming and electronics engineering FAR more interesting than game design. I have been learning 8086 ASM on DOSBox lately, hoping I can get enough experience to understand how computers work entirely. I am currently in the process of learning how different ICs work on a breadboard and hope to build my own 8-bit computer soon. Thanks for getting me started on such a fun hobby I hope to make my job someday, keep up the excellent videos! Hope to see your channel continue to grow :)
Maybe he figured out that using machine language in software makes your product unportable. There are many reasons *not* to write in assembler. And there are distinct instruction sets for different CPU architectures, so you can learn one ISA (instruction set architecture) or you can learn all of them; compilers *do* have their advantages. All digital computers work the same way (registers, storage, interrupts, etc.) but the devil's in the detail level you can't avoid in assembler. Everybody should *know* what compilers do and appreciate that today's compilers (I've been doing this for 40 years) are very, very good. You should also understand the overhead of interpreted languages like Java & Python (and the list goes on) before you make an implementation/design decision. Knowing the heart of how most of your customers' machines work (x86_64 for {lap,desk}tops, ARM ISAs for phones/tablets) is a valuable datum, and should motivate us all to write code that's as efficient as possible. I still check my assembler output most of the time, but I'm about ready to retire ... probably an "old skool" type. But today's typical bloatware sucks. *Fight it.* Take pride in your work, know what you're delivering :-) _and good luck on your autodidactic journey!_
I don't know why, but this video is very satisfying to watch as a programmer. It's very logical and makes sense. Like if you suddenly had a partial look into a woman's brain and actually started understanding something.
Why, I think everyone learns backwards. If they would start at the low level, which is cold hard logic and memory movement, and work up the chain, I believe they would learn how to program much faster. Languages like BASIC trigger bad habits that become hard to break, such as never clearing your memory or initializing variables, and things like C++ have turned into a cluster fuck due to the total overuse of OOP everyone seems hell-bent on these days. I would suggest, if someone wants to learn to code, go back to DOS and get Turbo C. It was a great language with great documentation to help you, telling you what every single command did, etc.
If he tried to learn Java before ASM, he's going to be crying like everyone else on this video about how hard ASM is to understand, when it's WAY easier to understand than any language I have ever used, including BASIC. I think the failure comes for most people because they don't comment their code and lose track of what's what, but it's simple top-down programming that can be traced with ease.
I know I will catch a mess load of flak for saying it, because I still get a lot of flak for using it from time to time, but I honestly believe DarkBASIC is one of the better things for a programmer to start in. Hear me out before y'all hate on me. Starting off, a programmer wants results, ASAP. With DarkBASIC it's as simple as:

Sync On
Make Object Cube(1,10)
Position Object(1,0,0,0)
Position Camera(0,100,0)
Point Camera(0,0,0)
do
control camera using arrow keys 0,1,1
loop
wait key

That code above will draw a cube on the screen, point the camera at it, and allow you to look around with the arrow keys, which is a great starting point for most hobby programmers since they will feel the excitement right away with a 3D object they can manipulate. This same code in, say, C++ would literally take hundreds or thousands of lines of boilerplate just to set up the engine to draw the cube and accept the input. Look into DarkBASIC. It's old, but it's effective and it's fun as all hell to toy with.
I started on a TRS 80 Model 1 with 2k ram and a cassette tape player. Basic. Then a Commodore 64. Commodore Basic. Then C on my BSD systems at home, took online local community college courses for Visual basic .net and C - grew tiresome. Right about then it became evident that code monkeys had to compete with $3/hr dev teams in India. Writing on the wall was that the money would be in Java. I stuck with sys admin needs; Perl and C. FEAR of Java, FEAR of having to think about this stuff, FEAR of actually applying what I've learned in school... NEVER learned these basics. (been TAUGHT it many times!) Never formed this solid foundation. In other words; I can't code to save my life...but I have worked for years making money doing it. Flying by the seat of your pants every day...making it work, doing the seemingly impossible. There is reward in that, at least. It feels good to actually DO this stuff in the real world for real world paying client needs. I can't even last in a programming conversation for two minutes. My point? - Just *do* *it*.
I'm studying IT, and taking a few subjects that cover C, C++, assembler and Pentium processor architecture. And this is one of the best and most interesting videos that I've seen. Great work!
Since it's always true, checking it is a waste of time. Even with optimizations "off", some optimizations are always done, such as bit shifting instead of MUL/DIV by powers of 2.
great video. When I learnt about programming languages, I always wanted to somewhat understand how computers treat the information we feed them, but looking at assembly on your own is just like *question marks*. Comparing it side by side with C is really insightful!
I've always regarded C as a sort of macro generator. You can almost see the result in asm when you write C. Although with any level above -O1, things get totally too much for a human to read, unless you wrote the compiler.
When I was a C developer I always used the compiler's option to output the assembler code it was creating to check it was creating good code. There are lots of ways of coding and hints that you can give the compiler to help it understand what you want and to help it create good code.
Yeah, you're correct; machine code is literally just binary. otool seems to be a disassembler; it tries to format the machine code into something a little easier for a person to read. Trying to read an executable written for an operating system through a hex editor or something would leave all the header information and such in the output, making it a little more difficult to see what's going on.
I could be wrong, but the actual machine code would be 1s and 0s of the low level language the CPU uses. The code shown in the video is that code translated into a kind of assembly.
Machine code is binary. The mnemonics we use (LDA, etc.) are assembly language, and listings are shown in hex because a long run of ones and zeros takes up a ton of space on a line while ffff doesn't. To go from assembly to binary you run an ASSEMBLER, and to convert a language like C++ you compile it into assembly language and then assemble that into machine code, because handling ffff is easier than handling a long string of ones and zeros.
Very interesting to watch! The main takeaway I got from this, as someone studying DSA for coding interviews: it really doesn't matter what language you use; once you see how to read it, the fundamentals are really the same. That went from looking horribly obscure to something I could probably get a handle on in a couple of days. Great concept for a video. Thanks!
5:56 "Not sure what this other thing is." It writes 0 to the lower byte of the eax register (rax on 64-bit, but you seem to have a 32-bit machine). The other line is just setting the value of eax onto the stack. eax will hold the return value of the last printf call.
"It writes 0 to the lower byte of the eax register" — so what... you didn't push the envelope. It specifies "0 floating point arguments in registers passed in to a variadic function".
Simple and interesting explanation. I have experience with assembler, and C++ is my main language, but I tried to watch this like I'm a beginner, and in my opinion it was very easy to understand. Big respect!) Sry for my bad eng)))0
Thanks for the video! Glad to find others who think this is super cool. I just finished my assembly course and I'm sad it's over. I'm pretty sure I'm the only student who actually did my assignments and didn't just find code to poach on Stack Exchange. I'm even more sure I was the only one who really enjoyed the class and preferred it over C++, and way more than Visual Basic. My C++ teacher has been giving me a hard time. Assembly is "neat", he says, but VB can make "real world programs". Humph. I figure if I love something that most people dislike, even if I don't do it directly, there's a market for doing that kind of thinking....???????
Tell your C++ teacher he is an idiot (you can quote me). VB is the worst for making real world programs. Create a Hello World program in VB and compile it. You get a program that is >10K. Do it in assembly and it is 128 bytes..... He must have stock in storage manufacturers.... I'm a CIO that used to teach machine code/assembly when the first PCs came out. Wrote games on C64s until the C compiler couldn't compile them anymore and switched to (macro) assembler. You don't know programming until you have done that at least once for a larger project.
If you have access to the original source code you can use: clang -S -masm=intel prog_name.c which will generate prog_name.s with Intel assembly syntax.
you can tell the presenter really has a finite grasp of the information when he says 'I'm not sure what this other thing is', but hey, I'm really glad you enjoy this hobby of yours
I remember spending hours upon hours typing almost endless lines of hexadecimal code into the computer's RAM and then compiling it overnight and recording it onto DAT cassettes so I could play computer games. Intel 4004 processor, 4k of RAM, with a 12" amber CRT... Good times... Good times...
Earliest versions of Pokemon series games were completely programmed in assembly language. Just think for a moment how much time and focus it would have taken for those programmers.😉😉
I see your reference there. But I got to say, most professional coders don't do stuff this hard for work. Not that I think journalists could learn low level or high level languages to proficiency.
I see people online saying "Recursion is easier to read, faster". Whilst the last one may be true (I don't know nearly enough, lol), recursive functions have always been pretty much impossible for me to read.
@@psun256 Recursion definitely shouldn't be faster. As a general rule, all the repeated function calls that have to be allocated on the stack make the recursive version of a function either slower or at least more resource-intensive. The only case I've ever seen recursion recommended for is when it makes code easier to read (and the only example of this I've personally experienced was with binary trees).
@@jake3736 Not necessarily. Some languages (Scala comes to my mind straight away) have tail recursion optimisation, so effectively the compiler translates recursive code into iterative code. Of course, the problem of stack allocation (and eventually stack overflow) is another reason to stay away from recursion if the trade-offs are not very well understood (and usually young university students don't understand those at all).
When you are making a call to printf you could go into detail about the x86 calling convention. When it's doing lea 0x56(%rip), %rdi you're actually moving the "%d " string from the .data section of your program into %rdi (the first parameter in the x86 calling convention); when you call movl -0x8(%rsp) you're setting the second parameter to the value in x, and movb 0x0, %al is clearing the return value register %rax.
Your last point about %al is not true. The reason %al is set to 0 before the call to printf is that the function reads from %al the number of vector registers used. This is the number of floating point arguments. The printf here doesn't use any, hence it's set to 0.
In addition to NickS' comment, the first instruction does not "actually move" the string. It Loads the Effective Address of the string into RDI, in other words it calculates the string starting address and sends it off to printf().
I think you've made a mistake when you talked about the stack frame. Actually it was already set up one line higher, and "movl $0x0, -0x4(%rbp)" just sets up one of your variables (=
Probably start by reading about the Fibonacci series. You'll find interesting videos explaining how it appears in nature. Then read some basics of how the C programming language can be used to perform certain operations, like printing something on the standard output; in this case we are printing the Fibonacci series.
Which processor is the machine language for? Another point: if this program had been written directly in assembly, only 1 byte would have been needed for each variable, and the "compare with 255" would be a simple "jump on carry flag set", as the carry flag is automatically updated on each calculation.
The length of the variable is not a consequence of the language. He specifically declared the variables as ints (4 bytes long, for this particular system). And while he wouldn't have been able to read the carry flag directly in C, he could have declared the variables as unsigned chars (1 byte) and detected the wrap-around by comparing z to y. But I think the point of the video was to be as easy to understand as possible, not to do pointless optimisations that would only confuse beginners.
That's a very interesting strength reduction, although you could use an 8-bit unsigned data type in C as well for that. A properly implemented compiler would most probably do this strength reduction during the peephole optimization step, if not earlier.
You could have done that in C too; the 4 bytes is a consequence of choosing the int variable type. However, the comparison with 255 (in C) would then become a problem. But note that this would not have been any faster on a 32- or 64-bit machine.
More than awesome video bro! :D ... and I have a guess for movl %eax, -0x14(%rbp):

EAX = 4 bytes
| AX = lower 2 bytes of EAX
| AH = upper byte of AX
| AL = lower byte of AX

Since the printf block played around with al, and we have our stuff (x and y) at -0x8(%rbp) and -0xc(%rbp) respectively, it seems really suspicious that this line plays with -0x14(%rbp), which is 12 bytes away in memory from our -0x8(%rbp). If I remember correctly, the bus actually aligns the data before sending it to the CPU from memory to improve performance, and this means including some bytes that might be used soon, like -0xc(%rbp) (cache y :D), or even sending garbage bytes so we don't have to create circuitry to get the exact byte from memory. What this means is that even though our data to be printed is at -0x8(%rbp), -0xc(%rbp), -0x10(%rbp) and -0x14(%rbp) will also be sent to the CPU. Therefore, I am going to guess this is actually the flush-of-buffer call for printing, and this is the exact time when printf is actually displaying the values of x on the screen. I guess more information could be had if you compiled with -g -O0... however, this video is an awesome explanation. A+!
+Desnes Augusto Nunes do Rosário Right, it seems specific to the author's platform. I compiled the same program on Ubuntu 14.04 and don't see the same spurious instruction when using any of the -O options, but I do see changes in the assembly to optimize z = x + y, so yeah, a good debugger run would help figure out who's responsible for that out-of-place instruction.
It's actually the compiler he is using, the version of the language, and the system he is on. The eax is his usable side of the C language's stdio.h, and it is used to allow formatting: his printf statement wants to print a %d data item and then do a carriage return, with the data pointed to by the value x. Because he sent a format command, the language has to strip the format out of the print command, find the data pointer, and then load the data. printf("%d\n", x): printf is in stdio.h, so the first thing is to push it onto a stack to pull the format info out, then advance and find the data pointer, then advance and place the data into the formatted array, then send it off to the default display device. Just like when you step from 0000 to 0001 and have to fetch the first code line, strip it apart, find what it means and do it; you're doing the exact same thing here, just with software.
@@0623kaboom Dude, no. Stop. That line is a spill to cache the value of eax on the stack because it will be clobbered by the return value of the next printf call. The only purpose of eax within this stack frame is to hold the return value of printf. Literally nothing more. With even the smallest level of optimization turned on you see the line disappear, as it isn't even remotely needed.
I made a hello world program in C, then edited the output in the binary using the VS Code Hex Editor at (line?) 00002000. I compiled the program on Linux x86_64 with gcc 12.2.0. edit: I edited some empty bytes and nothing changed; does this mean I can encode stuff in executables lol
If you fully know the file structure and the address values, and you can adjust them if your data size increases, then yes. It won't work with every byte in the structure, but with many of them it will.
can't believe that if you change the executable it will change what it does, that's so unexpected! there's a joke: "for someone who knows assembly very well, every program is open source"
In the UK assembler is taught as part of A-level electronics. The kids love it
Not just hobbyists. Assembly language can also be useful for hacking. I'd imagine it'd be really useful for reverse engineering, finding certain exploits, and malware development.
The instruction at 0x10000f63 is moving the result of the printf function (the number of characters written) to a location in memory (even though it isn't used)
Thank you! This comment should be pinned.
I never figured out what the printf() was supposed to be;
It is implemented in 16-bit code that has to keep two registers pointed at the same address; it runs much, much slower than makes sense to me. A data block like 1024 bytes or whatever should be allocated at init. Like the while above, I found much established C source to be Horror Code of the Damned, written by relatives of the Munsters to prevent use of sanity checks like if/do/while, which works much, much better due to zero-based indexing
@@opus_X And I get paid well for it 🤣
@@craig1231 how much time did you spend learning machine code? I want to learn too!! It's cool
So you're saying the code was suboptimal in execution time?
When I first started programming in C (mid 80s) I wanted to make sure the compiler was doing a good job and would always check the assembly for timing critical code. After doing this for a while I realized I could write the C code in such a way to influence the compiler to output very efficient assembly. Nowadays, the few times I do this, I'm amazed at how good modern compilers have gotten at optimizing for speed.
this guy is the real deal
I would like to learn to make the compiler more efficient. But I just started with C and C++.
@@random-user-s There's not much you can do nowadays, lol. Also not really worth it imho because of how good compilers have gotten. But there are some reserved keywords in C and C++ that can tell the compiler certain things. All I really know about is marking functions inline, which can sometimes boost performance. Again, it's not really worth doing because the compiler should do all of that for you when necessary (if you compile with -O3). It's pretty easy to look up and you'll eventually get the hang of it as you code more.
@@random-user-s Why would you? What's wrong with your compiler that you want to upgrade it already as a beginner? Maybe you should just try another one? Msvc, mingw or clang.
@@squizex7463 Maybe just out of curiosity? Or because one wants to understand things better, or is just keen to do hard things?
By your logic, one doesn't have to do anything, because all the good things with which you can build your software are already written. So all you're left to do is use them, which is boring af
7:05 you can actually notice how each variable takes 4 bytes of memory from the way they are always located 0x4 apart from each other
same thought!
Ah, I thought the memory addresses were just chosen "randomly" by the compiler. But this makes me wonder: how does the computer know how much space a variable takes up? Nothing in the machine code in the video shows that. What if the variable took up more than 4 bytes?
@@aurelia8028 in many languages you determine the datatype, right? In "int x = 2;" an "int" is for example always 4 bytes, and "double y = 5.4;" would make it 8 bytes, etc. *edit:* the size also depends on your platform... as mentioned by another commenter below, an int may be 2 bytes as well
@@SreenikethanI Sort of, it depends on hardware and/or compiler. `int` can be 2 bytes as well.
@@TeoTN oh right yeah
Idk why but there's something so satisfying about seeing terminal output on paper. Especially C code and disassembled code. Mmmmmm.....
Too bad it's AT&T syntax though. Eww.
yea lol intel 4ever
random offspring Ikr
random offspring , you deserve a stack of tractor-feed paper with alternating green & white lines :)
IKR can't explain it either, but it just looks so satisfying and perfectly organized. something like asmr
As a web dev, watching this makes me feel like I just swallowed the red pill and saw the real world for the first time.
Yeah, I know this feeling too. It just kicks in like "Oh, we evolved all the way to here, jeez"
As an electrical engineer this makes me say "here we go again".
@@hattrickster33 you could say that c is one of the closest to the metal in the high language class.
@@hattrickster33 well compared to other languages, C is probably the closest thing to machine code, but C itself is still a high level language
@@deschia_ We did testing on it back in college, comparing hand-coded assembly, C, Fortran 77, PL/1, and last (and least) Cobol. C and Fortran compilers did a reasonable job of producing something pretty close to what we did in assembly. PL/1 threw in some extra overhead which I think was related to memory management. And Cobol created a scary pile of machine code that we decided not to look into too deeply. I think it was summoning something from Cthulhu.
One of my college profs was in the Navy and needed to write assembly for the Navy to optimize COBOL code. He wrote it in FORTRAN and turned in the assembly. They had strict goals on lines of assembly to be written and debugged per day. He always met his goals. His reasoning was that FORTRAN was a pretty efficient language, and so he probably couldn't do much better. The Navy never knew they were converting their COBOL to FORTRAN.
You’ve reminded me of a talk I gave this year showing how some fortran code appeared in assembly. Fortran is still widely used in my field (supercomputing) and understanding the impact of such things like compiler optimisation is very helpful.
I think we had the same college professor
@@mohamedrh4093 of what college?
@@18890426 aui ?
@@18890426 Al Akhawayn
I like how the compiler optimised the while(1) into an unconditional jump instead of actually evaluating the expression "1".
I know compilers have been doing that for decades, also it's a very basic optimisation, but I enjoyed seeing it on paper :D
Except it didn't optimise anything. This was without any optimisations
@@_yakumo420 I think it is pretty interesting that even without any optimization, it became an unconditional jump, rather than test whether the int 1 evaluated to 1 (I'm pretty sure that's how while(..) works in C). I guess it's common enough that the GCC developers just hard coded that optimization construct into the compiler?
C doesn’t have booleans… so 1 == True
@@splashhhhhhhhhh Yes indeed but that wasn’t the point. The compiler detected that it’s a tautology and optimised it even without the optimisation flags set.
Even without optimizations on there are some optimizations that will always take place, such as not using hardware multiply/divide/modulus on powers of 2 etc
Regarding, moving eax onto the stack. eax contains the return value of the printf call. It's not actually needed by this example. It's probably saved to help a C debugger display what was returned and is likely a nuance of the compiler.
So basically, it's almost like the compiler turned printf("%d\n", x); into int oX14 /* I chose the name as a mock of the memory location shown in the above assembly */ = printf("%d\n", x);?
Dameon Smith this was going to be my guess
This makes sense, but I was wondering why this instruction only occurs after the prior 7 lines instead of right after the call instruction? I'm guessing this might be because the cmpl instruction will actually overwrite the value of eax to store the comparison result. Does this have to do with the compiler not being able to look ahead to see if the value will be referenced and just postponing storing the value for future reference until it absolutely has to? Also, this would mean the instruction wouldn't be there if the routine wouldn't reuse eax and just returned instead, correct?
What code could have followed and still use this value at this point, without explicitly assigning it to a variable right away? Can you give an example?
Thanks for the explanation, but I'm still unclear on part of it. I understand that eax/rax contains the return value of the printf function and by the time "movl %eax, -0x14(%rbp)" gets executed, that's still the value of eax. From what you're saying, I get that trying to access -0x14 from assembler code would be a mistake, and I get that, but I don't see why the value needs to be kept around at all - it's clearly not referenced anywhere in the source code? What use is the return value of the printf function at that point? And why does it only get moved to that address at that point in time, instead of sooner?
Yes, I suppose so, in that I agree with you: it's really a question about the compiler and not so much about the program either in C or assembler. I'm a software engineer myself, and having written compilers, as well as tinkered with command interpreters in the age of DOS on an 8086, I can strongly relate to what you're saying.
My curiosity was raised by the question raised in the video, about the meaning of that particular instruction - which was answered above by +Dameon Smith: it's the return value of the printf function that's being saved for whatever reason, independently of the program under consideration. I suppose I could look into the inner workings of the GCC compiler to find out, I more or less hoped someone might have an intuitive (and therefore short) reason off the top of their heads. But I agree with you, that's likely not the case - and certainly not the topic of the video, as the author rightfully stepped over the problem and seems to have taken some care to write their C code in such a way that the assembler would be as clean as possible for demonstration purposes.
The eax register will contain the return value of the printf function. Evidently it is being stored on the stack in the expectation that it will be needed later. Presumably you had the optimiser turned off when you compiled it.
I'm genuinely surprised C makes so much use of the hardware stack, since if you looked at the C2 compiler in Java for example it absolutely hates using stacks and almost always does everything in registers unless it has no other choice
@@theshermantanker7043 If you compile on any level of optimization, it usually doesn't make as much use of the stack. By default, GCC compiles with absolutely no optimizations on, though. I find it's easier to make sense of the compiler's assembly on -O1 (the lowest level for GCC), because it puts things in registers a lot more, like a human would.
@@theshermantanker7043 Originally that is what the register keyword was for. It told the compiler you wanted it to store variables in registers if possible, but it was just a request and not a given.
@@theshermantanker7043 THIS.
THIS is a comment I like.
I wish I had a save button like Reddit here...
I'm replying instead. Thanks!
I see, thanks for pointing that out. It's interesting that the compiler still arranges for printf to return to where it came from, even when it can see that the loop is infinite.
In only 10 minutes, you made me want to learn assembly language. It looks so simple when it's explained so well. You did a great job, Ben Eater.
Hahaha......
Go for it. It's sure a fun language; you start seeing everything the compiler or interpreter does in the background for your happiness.
I always thought assembly was useless and just a waste of time and money to take that class in uni, but after I finished the class I realized how important it is. This might seem like an exaggeration, but assembly made me finally understand how computers actually work, and it's definitely one of the most important classes in CS.
Also, it's really useful for reverse engineering; a TA at my uni showed me how to crack a program just by understanding assembly.
@Adam Richard lol so true, I tried making more elaborate programs and instantly gave up. The fact that it might be very different for each processor one might have makes it very discouraging. Or just raging, don't even need the "disco"
The real question is which flavor? Arm? Intel? 68000? PIC?
My 14-year-old self back in 2003 would be extremely excited and thankful if someone would explain machine language in such a clear way. Thank you and well done!
Just a little remark for people wondering why the code generated by the compiler contains strange and useless constructs. It is simply because the code was generated with the -O0 parameter, which means no optimization whatsoever. This means that the compiler basically does a nearly 1-to-1 translation of the C code to assembly, without considering whether the operations are redundant, unused, or stupid.
It is only when optimization is enabled that the compiler will generate better code.
In this example, for instance, it is stupid to read & write x, y, z continuously from memory. An optimizing compiler will assign registers in the inner loop and will never write their values to memory. The spilling of the printf return value, movl %eax, -0x14(%rbp), will of course not be emitted.
Interesting that clang -O2 results in the output values (1, 1, 2, ... 144, 233) being hardcoded into the binary. The clang compiler is evaluating the result of the loop at compile time.
@@zoomosis Hahaha, that's very interesting.
I've always thought the compiler does such complicated stuff that I'm not even gonna try to understand it. So I always assume it can do pretty much anything. I wish to write my own compiler one day, a very simple one though.
Can you define what you mean by 'spilling'? I mean, yeah, the return value of printf is loaded into this memory location, but it is never checked for success anyways, so why isn't it redundant?
@@lukasseifriedsberger3208 That 'redundant' store of the printf() result _IS_ the 'spilling'.
The 'prototype' of the printf() function shows that it returns an int so, by default, the compiler will SAVE that value somewhere (even though the value is never used!)
If the source code is compiled with some degree of optimisation (eg: -O1, -O2 etc), then it will remove this redundant store of the printf() result since it's never USED!
For further reading, what does the returned value of the printf() function actually mean!!! (Not many people have ever USED this printf() return value, so they don't know what it actually signifies - It's probably more relevant for sprintf() or fprintf())
Thanks for that info... This excellent video's inspired lots of useful comments!
0:20 how did you get that infinitely long paper?
it's a vector h ah ahaha
iam bad , iam going to commit a suicide,bye world, sorry people that were actually hurt by this joke
Coz while(1) is an infinite loop
I would rather call it "indefinitely long" :P
it's still being printed out, he just cut out a part of it
I was wondering the same thing. Wizardry?
At Uni I made a Snake game in Assembly IA-32 for a course. Never again, thanks.
Github? :P
I wrote the A* pathfinding algorithm in x86-64. Just for fun...
I feel terrible for you. I tried messing with assembly once but I couldn't get anything working.
Play SHENZHEN I/O
@@xaiano794 I don't need to buy SHENZHEN I/O to experience the pain of assembly
I understood in theory how C went up to other languages. Now I understand how C goes down to bits. Awesome work.
Just fantastic to see how efficient the code produced by the C compiler is. I spent years writing assembler as a kid and used to have competitions with other on how fast and small we could make our code..
Spilling every value (including even the unused printf return value) on the stack isn't exactly the most efficient thing to do-however, that's exactly the thing to expect when compiling with optimization disabled.
The type of video that makes you ask "How did people come up with this?"
The type of video that makes you ask "about the type of people that came up with this?!"
@@hiotis75 Ελληνάρα
The crash course yt channel has a series on computer science. Clears a lot of things up.
Of course aliens taught these people lol
I guess it's tightly related to how memory and cpu work internally.
And it is very limiting due to the binary nature as well as a frequency ceiling of the transistors. Dead end if you will in my opinion. Invention of multiple cpu cores bought us some time I suppose but the future is somewhere else.
You didn't mention why y is allocated at 0xC. That is because an int has a size of 4 bytes, so 0x8 + 4 = 0xC.
Technically a long. :P
that's only in C definitions
Same for z: 0x0C + 4 bytes => 0x10
Zupprezed And why does it starts at 8 instead of 0 ?
Bvic3 Notice that something is being already saved to the position 0x04 at the top. And the number is basically an offset to the base pointer (%rbp) so 0x00 would be the base(?) of the stack frame. I don't know, maybe something is stored there
I love how you write on the disassembled code like that. Makes it so much easier to retain and understand. I've wanted to learn to read disassembled code, I'll be doing this to help.
To anyone doing C++ exams on paper: do a table with all variables and update their values as the code says. This way you keep track of everything.
AKA "dry running", in the days when computer time was horribly expensive. It's still the best way to understand what's going on in code, and uncovering places for code optimisation, if performance is a problem. Don't optimise code before you've considered the algorithm, though.
I had to do both courses of programming (Pascal and C) on paper, and there's no time to do that (if you want the highest grade)
@@dowrow6898 When writing code or answering what a block of code gives as an answer?
@@thewhitedragon4184 they give you a block of code with a lot of unusual stuff and you have to answer what it outputs, or what are some elements of an array or something similar
@@dowrow6898 I have the feeling we attended the same college because it's the same garbage here 😂
Man, well done to you, you perfectly explained in 10 minutes what a professor in University had 6 months to demonstrate and still wasn't able to. Really interesting.
The compiler emits movb $0,%al because printf() takes a variable number of arguments. The ABI specifies that when calling such functions, %al must contain the number of floating point arguments. There are no floating point arguments passed to printf() in your example, so %al is set to zero.
Which ABI are you referencing? I tried to look for an appropriate OS X ABI that would cover the cdecl calling convention, but nothing I found mentioned this approach to counting floating point arguments.
@@Hamled Personally, I just assumed that the string has to be null terminated. But I have no idea what that %al stands for.
@@Hamled The System V ABI, I'm pretty sure
Too late in this discussion, but the zero inside "movb $0,%al" is just an information, that the printed value should go into stdout stream (in normal circumstances it means that it will be printed on the screen).
Anyway, this video and discussion have returned back a lot of memories...
And last but not least, if anybody would like to, the source code for printf() is available, but be warned: this function is a really complicated one, because of the possibility of using a variable list of arguments with all kinds of types, formats and architectures.
Compilers were invented in 1952. People in 1951:
pretty much, yeah
Old video, but I still want to remark that you can add the "-S" switch to make GCC output assembly directly into the output file.
Nice tip, thanks!
frozen_dude - Yeah, I was hoping to have "otool" installed, but I didn't. I looked around and found this: stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc
There are lots and lots of ways to get gcc to output the intermediate stages of compilation. I love gcc! If people have never walked through the stages of compilation, I highly recommend doing it.
or
> otool -tv main > main.s
I thought I'd throw an example of the complete compilation stages out there... I guess because I find it interesting and informative.
So when you compile a C source file, the process goes through 4 stages: Preprocessing, Compiling, Assembling, and Linking.
1. Preprocessing: 'gcc -E example.c -o example.i' < The example.c file is preprocessed: the include files and other directives (#ifdef, #include, #define) are handled.
2. Compiling: 'gcc -S example.i -o example.s' < The source file is compiled into assembly.
3. Assembling: 'gcc -c example.s -o example.o' < The assembly file is converted into an object file, a machine code file.
4. Linking: 'gcc example.o -o example' < The machine code file is linked together with other machine code objects and/or object libraries into an executable binary file.
The *.i and *.s files can be examined in your favorite text editor. The *.o file and the final binary file are both binaries, so you'll need a hex editor to view their contents.
otool is an OS X program. On Linux, use objdump -d.
I feel calm when people use paper to explain :) very educational and relaxing
Scrolling through your videos i can see the depth of your knowledge , its brilliant and inspiring. I just subscribed.
I want to be knowledgeable like him about computers one day 😍
This is a good example of why learning coding without understanding how computer technologies layer on each other seems so daunting. Just learning a coding language is not really that difficult. But coding is complexity built on complexity, and each layer down it becomes exponentially more complex. From an outside perspective, like when I first started learning to code, it feels like you don't just need to know the top layer of knowledge, be it Python or C++, but you need to understand what makes that work, and how something else makes that work. At the end of the day I'd have the impression I was going to have to learn how electricity works to understand the chipsets, or RAM to understand the next layer, to understand the next layer, all the way up to my code.
The great thing is that these languages were made so we don't have to do that. OOP and modern tech has almost made everything so independent and modular that you can learn the end result without knowing fuck all about how it works.
You don't even need to know to code to write games anymore.
if you want to know what hardware is doing, learn Computer architecture
Like the Techmen from Foundation. They knew how to work on nuclear power plants but had no idea how that shite worked
wrong, no such thing as daunting
You are right, but there is one thing: I don't think learning OOP or a coding language is easy. They are also difficult, because if you want to learn them really well they steal a lot of time from you :(
I was thinking exactly this today! I was wondering how much I need to know about this stuff and how it may help me. Although I know I don't need to know all of it, this stuff is so interesting to me, and I think it can give me a better understanding of computer science as a whole, so I'm planning on at least doing some research. It's only been 8 months since I started learning web development, but I am fascinated with everything related to computer science.
When I was a newbie programmer back there in 2000, playing around with assembler, memory and registers really helped me to get a grasp of what pointers and references are.
Hey, I'm a newbie. I just had a quick peek at pointers and references and I don't see why they have this reputation.
&variable just accesses the memory address; *variable returns the value stored in that memory.
Also, instead of typing the address of the memory manually, & is how you access the address of non-local variables.
I just don't see why they sound hard.
It's like what's explained at the line z = x:
the assembler grabs the value of x, puts it in a temporary register, and then puts this value in z.
It's all about memory.
Is it just me, or are you feeling excited as well when you see machine language?
I was learning Python and working on stuff with it like every day for 8 months.
I started learning C and now it just feels a lot of fun language to work with!
I even gave a break to python for the time being.
Watching assembly feels interesting as well.
Yeah, same. I'm having more fun learning assembly than the high-level languages. Maybe that's because I'm a computer nerd lol
@@unknownguywholovespizza To me it is eye-opening to see the true atoms of computation. It bridges the understanding of high-level programming and the understanding of how hardware fundamentally operates on the values stored in memory.
I am a beginning game developer. I have heard stories of how developers have written their games in C or even directly in assembly to maximize performance while keeping the size of the games very low. While most of my projects use existing engines and much higher-level languages for the ease they provide, I wish to pursue skill in C and assembly so that I may be able to write games that perform as well as humanly possible.
I taught myself BASIC then Pascal then C++. Learning was actually fun with some of the books they had in the '80s. I got a C64 for my 8th birthday, and I got the C64 Programmer's Reference Guide. It's just amazing the things that were in that book. It went from teaching you BASIC to showing you the memory maps, the pinouts for all the chips, and how to do graphics and sound. But it also had a 100-page chapter teaching assembly! It confused me because it made cryptic references to an assembler called 64MON which I had no idea how to get, but that made it more intriguing. The assembler class I took in college was also one of the only interesting classes I ever took. But I'm pretty weird. I was such a nerdy kid that in middle school I wrote letters to Brian Fargo and John Carmack asking for career advice.
That is seriously awesome.
@@captaincaption Brian Fargo actually wrote me back! That would've been about 1990 or 1991. I don't know what happened to the letter. I really loved Bard's Tale III and Wasteland. And today, 21 years later, I'm doing 2nd round interviews for L5 (senior dev) at Google ... but I just wanted to see if they offered anything interesting.
@@RaquelFoster great stories! It sounds like you’re doing well in your career and interest. That’s always good to read!
Your video is still helpful in 2020 and I'm sure other people would also understand concepts from it in coming years. Subscribed!
This is so cool, and I think this would be a way more fun/efficient way to learn Assembly than what's taught in colleges. It's way easier to see where these commands come from and what they mean if they're being directly compared to an actual C program. Much harder if a bunch of Assembly terms you've never heard are tossed at you and all of a sudden you're expected to code a program like this.
Try coding in machine language, now that was a chore. Assembly is just a higher level language that is converted/compiled into machine code. I originally started out studying electronics, so we had a course in machine code and had to write a program using it.
@@johnshaw6702 Assembly is machine code put in a readable way for a human.
Fr 💀
@@johnshaw6702 It might be interesting to know how those 0s and 1s run your processor, right?
@@johnshaw6702 where to start
Minor correction, because I used to program in 8080 and Z-80 Assembly: Those instructions from the disassembly are more properly referred to as assembly code instructions. Machine code would be represented by nice hex numbers for the opcodes and operands.
Actually, Z80 machine language is relatively easy to program by hand: for each opcode there are a few bits of prefix and then register addressing etc. Then you convert all the bits into a hex number and you're done.
Early textbooks used to make a distinction between assembly mnemonics and machine code. Looks like those days are long gone and the terms are used interchangeably.
Z80... My computer life started programming a TK-82C at 1982... Good times... 15 Minutes to load a 15 KB program from a cassette tape (after many attempts)...
Since we're being pedantic here about the difference between assembly code and machine code, it doesn't HAVE to use 'nice hex numbers'. Some CPU architectures were more suited to OCTAL representations, and technically, binary would be equally valid!
Footnote: Check out the MODR/M byte in x86 code and you'll see how well-suited it is to use octal in this specific case!
Having said that, I willingly admit that I'm predominantly a binary and hex man... LOL
The mnemonics directly represent those hex numbers. If he did print out the instructions in hex, you may as well then complain that it's not really machine code because it's not stored electrically in a computer, but printed with ink.
It doesn't matter how you represent something, it's the same thing.
this brings me back to my assembly class at university, in 2002. i liked that class a lot, but i've never used it again since i didn't go into a career in embedded
Can you tell me what the 0000000100000f2e under _main: means?
@@tamny9963 - So you think that because someone can't remember something from 20-years ago that they're automatically lying? Or, are you just looking for attention?
@@deepkarmakar5346 virtual address (image base + VA = full address) of the instruction ?
@@radon-sp thinku
@Jonathan Dahan you okay?
I found your teaching so understandable for non-programmers and beginners. I don't know how to program and wanted to learn. I have an interest in C programming and now assembly. Thank you for the video, you are a great teacher.
Remembering my first programming. You looked up the op codes and entered them on a keypad in Hexadecimal. This literally was writing the cpu instructions directly. I miss the 6502.
Way back when I was in school, we had a lab course working with the M6800 (6800, not 68000). I used to write my programs in C then hand-compile them into M6800 assembler. And of course, hand convert that into machine code, which then had to get toggled into the machine.
Hey me too man! Learned Basic on my Apple II and when I wanted to include some heavy-duty math subroutines, I'd POKE the hex code into a memory location then call it when needed. Even on that old 8-bit processor it ran blazingly fast!
The 6502 instruction set was very nice and clear, as was the Z80's to some extent. The intel instruction set was ugly in comparison.
ARM assembly language is even worse, it's not meant for humans. Every instruction can do something and can also do something completely different, depending on some weird prefixes. I hope no human being was ever forced to write ARM assembly code.
I appreciate your effort to make this teaching video, to share what you know and honestly say "I don't know" about the things you don't. Well done. I'm not sure either what the point of moving the contents of the eax register onto the stack is.
So it can be formatted, loaded and printed... it has to strip the format out of the print, then the data pointer, then the data, then print it... and a stack is the best place to do that from, as you can shift left and grab the format, then shift left and set the format up, then load the next chunk and shift left... read the data pointer... and shift left... load the data... shift left and finally print...
This the piece of the puzzle I was looking for years, thank you.
Just commenting to increase engagement because this video is so well crafted, a truly great presentation. Most people don't understand how difficult it can be to explain technical concepts well, and this is a classic example of how to do it right.
Ohhhhh this helped me for my malware and reverse engineering final. THANK YOU!
This actually makes sense. As mainly a C# dev, C isn't actually hard, first off. Pointers and such can get a bit complex, but they make sense. This code is certainly simple. The assembly makes sense too. It is beautiful how simple it is and how it uses such simple functionality to create more complex end results. This helped my understanding of assembly, and it might be one of the things that help me finally make a PS2 game one day.
Not to be the party stopper but ps2 is a dead thing of the past
Sure it is all simple. But it takes a genius to appreciate the simplicity. Shamelessly paraphrased.
It seems simple, until you have to implement data structures in C; then you find yourself crying for days on end, because you can't seem to resolve the clobbered-memory errors that keep popping up on you!
@@IM-qy7mf structs are very trivial.......
if you have massive experience
@@IM-qy7mf AddressSanitizer makes this significantly easier to debug, though. It's like a plugin for compilers that instruments code using the compilers' own semantic information. You should also get in the habit of writing asserts for potentially incorrect or dangerous code.
Almost a year late... On x86-based computers, eax is usually for return values. Don't forget that printf is not void; it returns a length. The compiler is a macro-assembler so it stores it on the stack anyway. What you can do is ignore the stack & use only the registers ebx, ecx & edx to store x, y & z, so in theory it should execute faster. If I remember correctly, if you only want 8 bits, you can use even bx, cx & dx, or even b, c, d
ax, bx, cx, dx are 16-bit; the lower and upper half registers al, ah, bl, bh, cl, ... are actually 8-bit. Obscure knowledge FTW!
Came in the comments to find out what this line did. Thank you sir.
you can use the register keyword in C then it will compile like that
For a person very interested in how computers manage to function through tasks, you really pleased me.
I miss programming in assembly. The first code I ever wrote was 6502 Assembly on an Atari 600xl. I also programmed in the following assembly languages over the years: 8088, 80286, IBM 360, R10000 and MIPS. After 20+ other languages over the years, assembly is still the one I liked best. It just felt natural. When I first learned C and was using the Turbo C compiler, I often wrote the function headers and variable declarations in C, and just inlined the guts in assembly. Those were the days...
I don't. At all. I wrote Railsounds II in Assembly because the processor (Microchip 17C42) had 2k code space and 160 bytes of ram. It ran at 4MIPS and at the time (93) was the fastest micro on the market. I couldn't wait until I could rewrite in C. Which we did. The hardest part was convincing Neil Young, my client, that we needed to do that. The rest is history. Over a million units sold.
Agreed. Very creative, very obedient. CPU does exactly what you tell it; nothing more, nothing less. If errors exist nobody to blame but yourself; and maybe the standard libraries which for assembly are minimal and usually just the startup code. I also wrote assembly for Honeywell DPS 8 mainframe; now THAT was programming!
@@thomasmaughan4798 Not so much on the obedient part. I remember seeing in a presentation that Intel's 486 was the last x86 processor to simply run the instructions in their order. After that came the out-of-order execution optimisations, and things like processing both outcomes of a check in the time the required value is fetched from memory, then simply using the correct outcome. So, nowadays, you don't really know what and how things are actually executing inside a processor. Sometimes less optimized code can be better optimized by the CPU.
Using paper. I've gotta give you a thumbs up.
@LoveLiveKillBillLife Paper is a technology bruh
rax is a 64-bit register
eax is a 32-bit register which refers to the lower 32-bits of rax
ax is a 16-bit registers which refers to the lower 16-bits of eax
ah is an 8-bit register which refers to the upper 8-bits of ax
al is an 8-bit register which refers to the lower 8-bits of ax
gcc -S -masm=intel program.c
ATT syntax is ok, but I prefer Intel personally... you’re welcome and thanks for the good video!
where do u learn all of this? any good books or websites as I want to understand how the machine runs c programs better
AH, ok
@@wh7988 Pick a processor and read the documentation; it will tell you what instructions there are and what they do. You can look up YouTube videos or books for the processor and how to program in assembly for it. The class I am taking right now has us using CodeWarrior (IDE) for programming the HCS12 (microcontroller). I am assuming going with an ARM processor would be a better idea though; they are more popular.
School!
A good (but expensive) Assembly book is "Assembly Language" by Kip Irvine.
You can use Visual Studio, admittedly a "long" process to set up, to write, run, and debug MASM. Give it a go.
T-rex is a dinousaur-bit
In my last year of college. I now finally understood this! Thank You!
What I realised is that the teaching was good, but they didn't put enough effort into linking it to the higher-level programming we were practising daily. But now it makes sense. Thank you again!
8:15 I believe that line puts the x value into eax, where it can set a flag. The next line sets the flag, and the line after uses it to determine whether to jump or not.
0x0f's were given on that day
@strontiumXnitrate It was a joke referring to "0 fucks given"
@strontiumXnitrate ok booomer
@@افاداتواستفادات why you gotta do em like that
inxane有害な wooooshhhh
@@افاداتواستفادات
Where the hell did that come from?
I don't know C nor assembly but I watched this from start to finish with my mouth hanging open. So interesting.
Very impressed by the way you explained it, with the amount of knowledge you have of that assembly code.
Really enjoy your videos, started my programming journey, if you will, about 5 years ago with the idea of wanting to make video games. i later found assembly programming and electronics engineering FAR more interesting than game design. I have been learning 8086 ASM on DosBox lately hoping i can get enough experience to understand how computers work entirely, i am currently in the process of learning how different IC's work on a breadboard and hope to build my own 8bit computer soon. Thanks for getting me started on such a fun hobby i hope to make my job someday, keep up the
excellent videos! Hope to see your channel continue to grow :)
Redxone Gaming How is your progress if you don't mind asking?
Yes I am interested to know too. I would like to build 8 bit computer too.
Please answer us bro!
Maybe he figured out that using machine language in software makes your product un-portable. There are many reasons *not* to write in assembler. And there are distinct instruction sets for different CPU architectures, so you can learn one ISA (inst set architecture) or you can learn all of them; compilers *do* have their advantages. All digital computers work the same way (registers, storage, interrupts, etc) but the devil's in the detail level you can't avoid in assembler. Everybody should *know* what compilers do and appreciate that today's compilers (I've been doing this for 40 years) are very, very good. You should also understand the overhead of interpreted languages like Java & Python (and the list goes on) before you make an implementation/design decision. Knowing the heart of how most of your customers' machines work (x86_64 for {lap,desk}tops, ARM ISAs for phones/tablets) is a valuable datum, should motivate us all to write code that's as efficient as possible. I still check my assembler output most of the time, but I'm about ready to retire ... probably an "old skool" type. But today's typical bloatware sucks. *Fight it.* Take pride in your work, know what you're delivering :-) _and good luck on your autodidactic journey!_
wtf am I doing here, I can't even code
I don't know why, but this video is very satisfying to watch as a programmer. It's very logical and makes sense. Like if you'd suddenly have a partial look into a woman's brain and actually start understanding something.
Why, I think everyone learns backwards. If they would start at the low level, which is cold hard logic and memory movement, and work up the chain, I believe they would learn how to program much faster. Languages like BASIC trigger bad habits that become hard to break, such as never clearing your memory or initializing variables, and things like C++ have turned into a clusterfuck due to the total overuse of OOP everyone seems hell-bent on these days. I would suggest, if someone wants to learn to code, go back to DOS and get Turbo C. It was a great language with great documentation telling you what every single command did etc.
If he tried to learn Java before ASM, he's going to be crying like everyone else on this video about how hard ASM is to understand, when it's WAYYYYYYY easier to understand than any lang I have ever used, including BASIC. I think the fail comes for most people because they don't comment their code and lose track of what's what, but it's simple top-down programming that can be traced with ease.
I know I will catch a mess load of flak for saying it, because I still get a lot of flak for using it from time to time, but I honestly believe DarkBASIC is one of the better things for a programmer to start in... Hear me out before y'all hate on me. Starting off, a programmer wants results, ASAP. With DarkBASIC it's as simple as
Sync On
Make Object Cube(1,10)
Position object (1,0,0,0)
Position Camera (0,100,0)
Point Camera (0,0,0)
do
control camera using arrow keys 0,1,1
loop
wait key
That code above will draw a cube on the screen and point the camera at it, as well as allow you to look around with the arrow keys, which is a great starting point for most hobby programmers, since they will feel the excitement right away with a 3D object they can manipulate. This same code in, say, C++ would literally take hundreds or thousands of lines of boilerplate code just to set up the engine to draw the cube and accept the input. Look into DarkBASIC. It's old but it's effective, and it's fun as all hell to toy with.
I started on a TRS 80 Model 1 with 2k ram and a cassette tape player. Basic. Then a Commodore 64. Commodore Basic. Then C on my BSD systems at home, took online local community college courses for Visual basic .net and C - grew tiresome. Right about then it became evident that code monkeys had to compete with $3/hr dev teams in India. Writing on the wall was that the money would be in Java. I stuck with sys admin needs; Perl and C.
FEAR of Java, FEAR of having to think about this stuff, FEAR of actually applying what I've learned in school...
NEVER learned these basics. (been TAUGHT it many times!) Never formed this solid foundation. In other words; I can't code to save my life...but I have worked for years making money doing it. Flying by the seat of your pants every day...making it work, doing the seemingly impossible. There is reward in that, at least. It feels good to actually DO this stuff in the real world for real world paying client needs. I can't even last in a programming conversation for two minutes. My point? - Just *do* *it*.
Your teaching is great, informative and esthetic. I loved watching it. Thanks!
This is by far the best series explaning how computers works inside. Amazing work!!
I'm studying IT, and coursing a few subjects that include C, C++, Assembler and Pentium processors architecture. And this is one of the best, and more interesting video that I've seen. Great work!
Nice video bro!
Chris!
why are you not verified
I like it too!!!!!!
Interesting how the "clever" compiler converts an infinite while(1) loop into an absolute jump
it was compiled with the no-optimization parameter in this specific case
Computers are stupid. We just give them instructions.
Since it's always true, checking it is a waste of time. Even with optimizations "off" some optimizations are always done. Such as bit shifting instead of MUL/DIV by powers of 2.
great video. When I learnt about programming languages, I always wanted to somewhat understand how computers treat the information we feed them, but looking at assembly on your own is just like *question marks*
comparing it side to side to C is really insightful!
THATS SO COOL, i always programmed in C and was thinking about how it worked inside of the processor
Wonderful, keep it up bro.
For "%eax, -0x14(%rbp)", it's keeping a backup of eax in a memory location, as the cmp instruction changes the ax register and sets flags.
Cmp changes only the flags register and nothing else; that has always been the case since the 8-bit processors.
As far as I know, the "move" instructions are not "mov1" but "movl" - Move Long - where long means 4 bytes.
Ain't that exactly what he had on paper?!?
@@motsgar On paper, "l" and "1" look very similar. The first time I watched the video I understood "move one".
@@fnunnari Actually, me too, but then I started to question that and concluded it must be an l
This code is calculating Fibonacci numbers under 255; it's just amazing.
Thanks for this great video.
I've always regarded C as a sort of macro generator. You can almost see the result in asm when you write C. Although at any level above -O1, things get totally too much for a human to read, unless you wrote the compiler.
"Back in my day we had to compile code by hand"
When I was a C developer I always used the compiler's option to output the assembler code it was creating to check it was creating good code.
There are lots of ways of coding and hints that you can give the compiler to help it understand what you want and to help it create good code.
Could you elaborate on those ways? Or name something I could look up to read more about this?
@@Noah-nj5ct add `-S` to gcc, like `gcc -S hello_world.c` ; it will write the asm to `hello_world.s`
Programming C and microcontrollers gives an in-depth appreciation of the beauty of computing.
Sorry for the noob question, but isn't this actually assembly? I thought machine language was basically just ones and zeros?
Yeah, you're correct; machine code is literally just binary. otool seems to be a disassembler; it tries to format the machine code into something a little easier for a person to read
Trying to read an executable written for an operating system through a hex editor or something would leave all the header information and such in the output, making it a little more difficult to see what's going on
I could be wrong, but the actual machine code would be the 1s and 0s of the low-level language the CPU uses. The code shown in the video is that code translated into a kind of assembly.
Assembler code is human readable, the assembler program turns it into machine code.
Machine code is binary. The mnemonics we use (LDA, etc.) are assembly language, and we write the values in hex because sixteen ones and zeros take up a ton of space on a line, while ffff doesn't.
To convert assembly into binary you run an ASSEMBLER, and to convert a language like C++ you compile it into assembly language, then assemble that into machine code, because handling ffff is easier than a long string of ones and zeros.
It is x86 (-64) assembly. Machine code is literally just bytes.
Wonder if he tried deleting that "eax" line or replacing it with a no-op or something to see if it mattered, or if it was erroneous compiler overhead.
I like how he can explain this so well and is barely able to write :)
No need to write when you can type :)
Very interesting to watch! The main takeaway I got from this, while studying data structures and algorithms for coding interviews: it really doesn't matter what language you use; once you see how to read it, the fundamentals are really the same. That went from looking horribly obscure to something I could probably get a handle on in a couple of days.
Great concept for a video- Thanks!
5:56 "Not sure what this other thing is." It writes 0 to the lower byte of the eax register (rax on 64-bit, but you seem to have a 32-bit machine). The other line is just setting the value of eax into the stack. Eax will hold the return of the last printf call.
"It writes 0 to the lower byte of the eax register" - so what... you didn't push the envelope. It specifies "0 floating-point arguments in registers passed in to a variadic function".
Simple and interesting explanation. I have experience with assembler, and C++ is my main language, but I tried to watch this like I'm a beginner.
And in my opinion, it was very easy to understand. Big respect!)
Sry for my bad eng)))0
Thanks for the video! Glad to find others who think this is super cool. I just finished my assembly course and I'm sad it's over. I'm pretty sure I'm the only student who actually did my assignments and didn't just find code to poach on Stack Exchange. I'm even more sure I was the only one who really enjoyed the class and preferred it over C++, and way more than Visual Basic. My C++ teacher has been giving me a hard time. Assembly is "neat," he says, but VB can make "real world programs." Humph. I figure if I love something that most people dislike, even if I don't do it directly, there's a market for doing that kind of thinking....???????
Visual Basic, ewww!! :)) Yes, there is a big market for assembler and C programmers - think hardware controllers and other fancy things.
Tell your C++ teacher he is an idiot (you can quote me). VB is the worst for making real world programs. Create a Hello World program in VB and compile it. You get a program that is >10K. Do it in assembly and it is 128 bytes..... He must have stock in storage manufacturers.... I'm a CIO that used to teach machine code/assembly when the first PCs came out. Wrote games on C64s until the C compiler couldn't compile them anymore, and switched to (macro) assembler.
You don't know programming until you have done that at least once for a larger project.
Visual Basic is dead. It hasn't had a real application in literally decades
Gives a true appreciation for compilers! So many layers built upon layers to translate everything into machine code.
"subq $0x20, %rsp" will reserve 32 bytes of space on the stack for the function to use for x, y and z.
and other variables used by the compiler, as well
Linux and Mac use AT&T assembly syntax, which is so difficult to read.
I prefer Intel notation.
If you have access to the original source code you can use:
clang -S -masm=intel prog_name.c
which will generate prog_name.s with Intel assembly syntax.
You don't think source should come before destination?
Me, I think I'd say a = b to mean b goes into a, hence mov rax, rbx
We used ARM assembly in school and it was pretty much identical to this.
Great video, thanks for sharing
0:22 😱 how long is that paper!?
You can tell the presenter has only a finite grasp on the information when he says "I'm not sure what this other thing is", but hey, I'm really glad you enjoy this hobby of yours
This guy is beyond words. Cannot describe how grateful I am; I found the video channel I have been looking for
holy cow he literally printed his print outputs 0:11
Bro it aint that hard. 0 1 1 2 3 5 8 13 21 34 55 89 144 so on so forth
i dont even remember commenting this, also i probably meant that it was pointless to print it but whatever
I remember spending hours upon hours typing almost endless lines of hexadecimal code into the computer's RAM and then compiling it overnight and recording it onto DAT cassettes so I could play computer games. Intel 4004 processor, 4k of RAM, with a 12" amber CRT... Good times... Good times...
How old are you?
@@CamaradaArdi 150 y.o at least
@@MrKidori How old do you think digital computers are?
Having done web development for so long now, this is a refreshing reminder of my college C class. Nostalgic, to say the least!
Your channel is awesome.
Your channel is great.
Thanks
Must be a professor or something. Your expression is great Ben. Thanks for the good work.
The earliest versions of the Pokémon games were programmed entirely in assembly language.
Just think for a moment how much time and focus that would have taken those programmers.😉😉
Android pinball
I did assembly the first 5 years or so of my 45 years in IT and SW dev. Great foundation. Your explanation is pretty good.
Some "journalists" need this video, for sure :)
I see your reference there. But I got to say, most professional coders don't do stuff this hard for work.
Not that I think journalists could learn low level or high level languages to proficiency.
^ It depends on whom you call "professional coders", buddy
@@chillappreciator885 professional coders= people who won't jump out of the window if their code doesn't work
Refreshing to see Fibonacci being implemented with a loop, instead of the usual (and very terrible) recursion solution.
I see people online saying "Recursion is easier to read, faster". Whilst the latter may be true (I don't know nearly enough lol), recursive functions have always been pretty much impossible for me to read.
@@psun256 Faster, I don't know. As fast, if written properly. I too find recursive functions hard to read.
@@psun256 Recursion definitely shouldn't be faster. As a general rule, all the repeated function calls that have to be allocated on the stack make the recursive version of a function either slower or at least more resource intensive. The only case I've ever seen recursion recommended for is when it makes code easier to read (and the only example of this I've personally experienced was with binary trees).
@@DavideAnastasia Recursion, if I'm not wrong, takes up far more memory, so I don't see how it could be faster.
@@jake3736 Not necessarily. Some languages (Scala comes to my mind straight away) have tail recursion optimisation, so effectively the compiler is translating recursive code into iterative code. Of course the problem of stack allocation (and eventually stack overflow) is another reason to stay away from recursion if the trade-offs are not very well understood (and usually young university students don't understand those at all).
I didn't know otool existed so I tabbed over to a shell on my Mac and typed 'man otool' ... this quickly prompted me to alias man to 'peter' 😏
And was your reply in fractured French?
After ignoring 5 videos, I subscribed to this guy. He's worth sharing and learning from. Thanks bro
When you are making a call to printf you could go into detail about the x86 calling convention. When it's doing lea 0x56(%rip), %rdi you're actually moving the "%d\n" string from the .data section of your program into %rdi (the first parameter in the x86-64 calling convention); when you call movl -0x8(%rsp) you're setting the second parameter to the value in x, and movb $0x0, %al is clearing the return value register %rax
Your last point about %al is not true. The reason %al is set to 0 before the call to printf is that the function reads from %al the number of vector registers used. This is the number of floating point arguments. The printf here doesn't use any, hence it's set to 0.
In addition to NickS' comment, the first instruction does not "actually move" the string. It Loads the Effective Address of the string into RDI, in other words it calculates the string starting address and sends it off to printf().
I think you've made a mistake when you talked about the stack frame. Actually it was already set up one line higher, and "movl $0x0, -0x4($rbp)" just sets up one of your variables (=
I think it's a result of stack alignment to 16 bytes, and gcc is zero-initializing the unused data
Compiling to machine code happens at 2:30
This example stays mostly true with current compilers (only that GCC likes to compare x
I have no freaking idea what the hell this man said, but I'm still satisfied
@blvckmetxl It was just an expression, and I have the right to express myself.
And frankly, I was expecting someone to explain this to me.
Probably start by reading about the Fibonacci series. You'll find interesting videos explaining how it appears in nature. Then read some basics of how the C programming language can be used to perform certain operations, like printing something to the standard output; in this case we are printing the Fibonacci series
Which processor is the machine language for?
Another point: if this program had been written directly in assembly, only 1 byte would have been needed for each variable, and the "compare with 255" would be a simple "jump on carry flag set", as the carry flag is automatically updated on each calculation.
The length of the variable is not a consequence of the language. He specifically declared the variables as ints (4 bytes long, for this particular system). And while he wouldn't have been able to read the carry flag directly in C, he could have declared the variables as unsigned chars (1 byte) and figured out the wrap around by comparing z to y.
But I think the point of the video was to be as easy to understand as possible, not to do pointless optimisations that would only confuse beginners.
That's a very interesting strength reduction, although you could use an 8-bit unsigned data type in C as well for that. A properly implemented compiler would most probably do this strength reduction during the peephole optimization step, if not earlier.
You could have done that in C too, the 4 bytes is a consequence of choosing an int variable type. However, the comparison with 255 (in C) would then become a problem. But note that this would not have been any faster on a 32 or 64 bit machine.
More than awesome video bro! :D ... and I have a guess for movl %eax, -0x14(%rbp):
CPU register layout:
--------------------------------------
EAX = 4 bytes
--------------------------------------
          | AX = 2 bytes
          | AH | AL = 1 byte each
--------------------------------------
Since the printf block played around with al, and we have stuff (x and y) at -0x8(%rbp) and -0xc(%rbp) respectively, it seems really suspicious that a line is playing with -0x14(%rbp), which is 12 bytes away in memory from our -0x8(%rbp).
If I remember correctly, the bus actually aligns the data before sending it from memory to the CPU to improve performance, and this means including some bytes that might be used soon, like -0xc(%rbp) (cache y :D), or even sending garbage bytes so we don't have to create circuitry to fetch the exact byte from memory. What this means is that even though our data to be printed is at -0x8(%rbp), the bytes at -0xc(%rbp), -0x10(%rbp) and -0x14(%rbp) will also be sent to the CPU.
Therefore, I am going to guess this is actually the flush-of-buffer call for printing... and this is the exact moment when printf is actually displaying the values of x on the screen...
I guess more information could be given if you compiled with -g -O0 ... however, this video is an awesome explanation. A+!
Yeah man I agree.
+Desnes Augusto Nunes do Rosário Right, it seems specific to the author's platform. I compiled the same program with Ubuntu 14.04 and don't see the same spurious instruction when using any of the -O options, but I do see changes in the assembler to optimize z = x + y, so yeah, a good debugger run would help interpret who's responsible for that out-of-place instruction.
It's the compiler he is using, actually, and the version of the language and the system he is on ... the eax is his usable side of the C language stdio.h ... and it is used to allow formatting ... as his printf statement wants to print a %d data item then do a carriage return ... with the data pointed to by the value x ....
.
eax is a formatting stack, ALU and program controller in itself ... because he sent a format command, the language has to strip the format out of the print command ... and the data pointer, and then load the data ...
.
printf("%d\n", x);
printf is in stdio.h ... so the first thing is to push it onto a stack to pull the format info out ... then advance and find the data pointer ... then advance and place the data into the formatted array and advance ... then send it off to the default display device .... just like when you step from 0000 to 0001 and have to fetch the first code line and strip it apart, then find what it means and do it ... you're doing the exact same thing here, just with software
@@0623kaboom Dude, no. Stop. That line is a spill to cache the value of eax on the stack because it will be clobbered by the return value of the next printf call. The only purpose of eax within this stack frame is to hold the return value of printf. Literally nothing more. With even the smallest level of optimization turned on you see the line disappear, as it isn't even remotely needed.
I made a hello world program in C, then edited the output in the binary using the VSCode Hex Editor at (line?) 00002000. I compiled the program on Linux x86_64 with gcc 12.2.0.
edit: edited some empty lines and nothing changed; does this mean I can encode stuff in executables lol
If you fully know the file structure and the address values, and can update them if your data grows in size, then yes. It won't work with every byte in the structure, but with many of them it will.
cant believe if you change the executable it will change what it does, that's so unexpected!
there's a joke: "for someone who knows assembly very well, every program is open source"