Okay, now I can add "assembly expert" to my CV
Hell yeah
It's crazy how well you can demystify some things with the simple examples. Makes this stuff feel so much more accessible!
@@MadMathMike now you are ready to program Roller Coaster Tycoon
The fact that you are showing documentation for every single step is unprecedented! Thank you for the content once more!
this should be the golden standard for howto's
we need a standard above golden honestly, the style of howto nir does goes above and beyond my expectations every time, we need a diamond standard
thought you were going for the inline assembly
I have no idea why, but I love that this is possible.
At university 20 years ago, I had a project involving MD5 hash calculation written in Delphi/Pascal. For some reason, the Pascal implementation was incredibly slow, even with bitwise optimization. I injected ASM into my project to speed it up, and even the simple, unoptimized ASM version was twice as fast as the optimized Pascal version. With some additional tweaks, I improved performance by up to three times. Nowadays, compilers are smart enough 😅
It's great to see some updates from this channel :)
Awesome video! More assembly/C. Either together or separate. Great job!
I knew this, but I was curious to see how well you would cover this in the advertised 4 minutes. The answer is "very well". Very well indeed.
Hi there. I love your videos. Such beautiful simplicity.
These videos are so cool and easy-to-understand :) Well done!
Ah, brings back memories from early 90's. Actually back then I inlined asm in C too not just external files.
Thank you very much, I have always wondered how to call assembler code, your explanation is great!!! Thank you so much!! It
this is one of the more obvious tricks id say, but it's still a really cool thing you can do. and as always, you make it so easily digestible
Great video, super underrated channel!
Nice video. That calling convention was new for me. Please also make a video on how to create functions in assembly with more complex data types.
You make hard things easy 🙌🙏
The biggest hint for assembly calls is to use -S on gcc to generate a prototype for your function in assembly. That is, if you want an assembly function that can be called as int addnumbers(int a, int b), first write it in C as int addnumbers(int a, int b) { return a+b; } and then run gcc -S on that file. Then you just edit the output .s file to add your assembly code. gcc fills in all the details for you.
By default this generates AT&T syntax, not Intel syntax, but I think you are better off learning the AT&T format in any case. It's not that different, and you can use gcc to generate example code for you.
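A minimal sketch of the workflow described above, assuming the stub lives in a file called addnumbers.c (a hypothetical name). Running gcc -S addnumbers.c should write addnumbers.s next to it, with the label, directives and return sequence already in place for you to edit by hand.

```c
/* addnumbers.c (hypothetical file name): a C stub whose only job is to let
 * gcc emit an assembly skeleton with the right symbol name and
 * calling-convention boilerplate. */
int addnumbers(int a, int b)
{
    return a + b;
}
```

If you would rather read Intel syntax, gcc on x86 also accepts -masm=intel alongside -S.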
Great content as usual ❤
I'm coming in for another, "Neat!"
Neat!
Hi assembly experts, can you explain why he passes and returns values in R registers, which are 64 bits, but in C he specified int, which is 32 bits. Why is there no conflict?
There is a conflict, but conveniently the E registers are just the lower 32 bits of the R registers. As a consequence, the upper halves of %rax and %rdi get modified needlessly, but that's fine.
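To see why the mismatch is harmless, here is a small C model (not the code from the video) that treats the registers as 64-bit integers. The helper names add64 and low32 are made up for illustration. The point is that the low 32 bits of a 64-bit sum depend only on the low 32 bits of the inputs, so whatever sits in the upper halves does not change the int result.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit `add`: the full 64-bit sum of two
 * register values, whatever happens to be in their upper halves. */
static uint64_t add64(uint64_t rdi, uint64_t rsi)
{
    return rdi + rsi;
}

/* Hypothetical model of reading EAX: keep only the low 32 bits of RAX. */
static uint32_t low32(uint64_t rax)
{
    return (uint32_t)rax;
}
```

For example, low32(add64(x, y)) stays the same if you put arbitrary garbage in the upper 32 bits of x and y.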
This is actually brilliant, you could use this method to create a new language via assembly linked into C. Makes me wonder what the .o file format looks like, so that a compiler for the new language could generate those .o files directly. Once more you have dropped a golden nugget 👍👍
The .o files, or object files, contain the machine code of the compiled source files without concrete address values for external things such as calls to functions that are not defined in the same source file. The linker takes these .o files and resolves the addresses to the concrete ones used in the final executable file.
I'm almost certain that's how Zig does it
thank you so much for your content and videos, it seems like there are not many people who teach or share about actually putting C to practical use from what i've found but your videos fill so many gaps and are so helpful, thank you again and if you are open to take any suggestions for new videos in C, maybe how to parse JSON / XML or making HTTP requests for beginners in C?
does assembly use the following syntax? instruction dest, src
also are registers special variables? common for all assembly? like you used rax?
kinda, yeah.
with assembly, things differ in that each symbol is working almost with bare metal components, so it’s best if you get a good understanding of how computers work at their bare lowest functions. most good youtube assembly tutorials should cover this kinda thing at the start before you even get to assembly.
If you mean x86 assembly then yes and yes.
Syntax will be different between different architectures (x86, arm, risc-v...)
Intel syntax uses dest, src
AT&T does it the other way around
thanks guys!
I usually declare the functions with the extern keyword if added into a C file like this.
That way it's extra obvious they are from a different file.
Or I create a header file.
I was wondering about this, but apparently `extern` is the default linkage for a C function declaration.
@@RayBellis yes, it is. Still better for readability though
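A short sketch of the point made in this thread: for function declarations, writing extern or leaving it out means the same thing, so both forms can even appear together. The definition below is only a stand-in so the snippet links on its own. In the video's setup the definition would live in a separate assembly file.

```c
/* Two declarations of the same function. For functions, `extern` is the
 * default linkage, so both lines mean exactly the same thing. */
int addNumbers(int a, int b);
extern int addNumbers(int a, int b);

/* Stand-in definition (assumption for this sketch only); normally this
 * symbol would be provided by the assembled .o file. */
int addNumbers(int a, int b)
{
    return a + b;
}
```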
Should have told me this 20 years ago. I had to fetch arguments from stack then add to SP(for speed) the size of the args plus return address. Were naked functions supported back then?
I remember GCC supporting inline asm with the .intel_syntax directive though, as this was my fallback until optimizing compilers came into the picture.
There is the fastcall convention that can pass two 32-bit arguments in ECX and EDX (vectorcall also allows floats and vector types in SSE/AVX registers). For lack of general registers on IA-32, you usually can't pass more. Also only Visual C++ and GCC 4.0 onwards supported it.
Oh hey, it's the shower thought i immediately forgot about!
why would execstack be relevant here? I thought that only mattered with nested functions
Thanks for the video, but what if I want to build a secure executable ? I mean, I don't want to have the '-z noexecstack' option during my build ? it seems like calling assembly from C is not secure.
The -z noexecstack option does not make the executable insecure; it just asks for the stack to be marked non-executable. Otherwise there would be a warning when combining the assembly and the C code with GCC. There is also a way to specify that the stack is non-executable inside the assembly file itself, which can be an alternative to the flag.
@@nirlichtman Thanks for the clarification
Can you make a video about making coroutines/virtual threads in C and assembly?
Hi Nir! I've been following your videos for a good while, you're one of my favorite content creators ever. So much golden information on here. I've fallen in love with low-level coding (C, os-dev, embedded, etc) in the past 2 years and I'd love to pursue it as a career path.
I tried getting a formal education in programming before but I just can't bear working with Java and web stuff, i find it incredibly boring and soulless.
Do you have any tips on how to get into the market in this area (low-level programming)? is it even a viable path or is it just a pipe dream in 2024? whenever i find jobs for C developers or anything in that nature, they're only looking for senior devs with years and years of experience and a bachelor's.
Thanks for the amazing video as always!
man! I had no idea of those directives! made my day! ty. btw, is it a signed addition?
It is both a signed and an unsigned addition. In terms of the actual bits, signed and unsigned addition are the same operation, and the ADD instruction takes care of setting the flags (sign, carry, overflow) after the addition. More info in the official Intel instruction reference :) www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
Tysm for the feedback and the link. 😊 rax indeed truncates to eax (from the objdump). so are rdi and rsi before calling. thus adding __INT_MAX__ and -__INT_MAX__-1 gives -1 although 64bit registers are used. and addNumbers(5 , -1) gives 4.
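The two results quoted above can be reproduced in portable C. This is a sketch with a made-up helper name, add_bits: it adds the 32-bit patterns as unsigned values (where wraparound is well defined) and then views the same bits as a signed integer, which is roughly what the hardware ADD does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add two 32-bit values as raw bit patterns, then
 * reinterpret the resulting bits as a signed integer. */
static int32_t add_bits(int32_t a, int32_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;  /* wraps modulo 2^32 */
    int32_t result;
    memcpy(&result, &sum, sizeof result);      /* same bits, signed view */
    return result;
}
```

With this, add_bits(INT32_MAX, INT32_MIN) gives -1 and add_bits(5, -1) gives 4, matching the observation in the comment.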
Thank you for the explanation
fun fact: compilers often use load effective address instead of the add instruction:
addNumbers:
lea eax, [rdi+rsi]
ret
I’m just missing how the arguments got passed to the registers RDI and RSI.
It's defined by the function calling convention. Linux and pretty much every UNIX-like OS on the x86-64 platform use the System V AMD64 ABI, where registers RDI, RSI, RDX, RCX, R8 and R9 are used to pass the first 6 integer arguments to a function.
It's the calling convention that gcc follows. Before the actual subroutine call, gcc will add "mov" instructions to take the C function arguments and put them in those registers
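As an illustration of the register order listed above, here is a hypothetical function, sum8, with eight integer arguments. The comments show where each argument would travel under the System V AMD64 ABI. On other platforms (Windows x64, for instance) the assignment is different, so treat the annotations as specific to that ABI.

```c
/* Hypothetical function with eight integer arguments. Under the System V
 * AMD64 ABI the first six travel in registers and the rest on the stack. */
long sum8(long a,  /* RDI */
          long b,  /* RSI */
          long c,  /* RDX */
          long d,  /* RCX */
          long e,  /* R8  */
          long f,  /* R9  */
          long g,  /* stack */
          long h)  /* stack */
{
    return a + b + c + d + e + f + g + h;  /* result is returned in RAX */
}
```

Compiling a call to this with gcc -S is a quick way to see the mov instructions that load the registers before the call.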
@@typedef_ Thank-you!
@@dorquemadagaming3938 Thank-you!
Wow this was really cool I know about CDECL and STDCALL conventions from school but that’s for x86_32 nasm. Why don’t you have to push the arguments onto the stack and why do you just use registers? Is it faster that way? Can’t there be something useful in RDI that you’ve overwritten?
Unlike most popular 32-bit x86 calling conventions, which pass all the arguments on the stack, the x64 calling conventions pass the first few arguments in registers and the rest on the stack.
@@nirlichtman okay and you don’t need to create a new stack frame, save and load the previous states of registers when using this convention? Something along the lines of pushad popad, so that the value 5 as from your example doesn’t change in the rdi register when calling the function? Or is it because C handles that on its own?
@@nezby3945 in this case since these registers are used for passing args (and rax is used for the ret value), no preservation is needed
@@nezby3945 In leaf functions like these that don't use the stack, you don't have to set up a stack frame. Even if you do, the SysV ABI defines a "red zone" of 128 bytes under the stack pointer that you can use without interference from signal handlers. Also since RDI is one of the registers used to pass parameters, it's also part of the scratch registers that the caller has to save if needed before the function call, while the called function can use it freely.
@@D0Samp Oh wow I had no idea that It worked like that our uni teacher never told us something of such existed. Thank you very much for clarifying and taking the time out of your day to do so. If only there were more people like you :D!!!!!!!
your videos are awesome!!
How did you get the linux terminal on powershell?
it's WSL2
3:32 Pretty sure newer versions of GCC automatically disable executable stack except when there's nested functions
Hi @NirLichtman this is just library linking.... It's not calling assembly from C. Even if you had written the same code in C, compiled it and linked the binary, it would have worked. But anyways good video.
i thought this would be a video about inline assembly in c, i'd like to see how it works
just asking, which one is better? Intel or AT&T syntax?
In the end, it depends on which one you feel more comfortable with and on the project you are working on. For example, the assembly parts of the Linux kernel are written in AT&T syntax, but the general standard syntax for x86 assembly is Intel syntax (it is what the official Intel manual uses). I would recommend being at least familiar with both at a basic level.
The only purpose of AT&T syntax is uniformity among instruction sets for compiler output. Its (source, destination) order comes from the DEC assemblers for the PDP-11 and VAX (and Alpha), which were the birthplace of Unix. Basically all other modern instruction sets like MIPS, ARM and RISC-V use (destination, source), and do so without prefixes.
Tried going the other way around (calling c from assembly) back in the day for a school project.. could never figure out how to do it properly.
That's a good idea for a future vid, added to my list
Is it possible to call assembly code from higher level languages, like C# or Java?
Yes, but in most cases not directly like we can with lower-level languages such as C. For example, in Python you need to create a separate DLL or shared object (depending on whether you are writing for Windows or Linux) that implements a specific interface Python can work with, so that Python can later import it and use it.
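One common route on Linux, sketched under assumptions: put the function in a shared object and load it from the higher-level language. The file name addlib.c, the library name libadd.so and the function add_numbers are all made up here. Something like gcc -shared -fPIC -o libadd.so addlib.c would build it, and Python's ctypes.CDLL could then load the result. The function body could just as well be written in assembly, as long as it follows the C calling convention.

```c
/* addlib.c (hypothetical file name): a plain C function with external
 * linkage, so with gcc's default settings it is visible to whoever
 * loads the shared object. */
int add_numbers(int a, int b)
{
    return a + b;
}
```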
Stupid question but where do I even start with all this computer stuff if I don't even understand what I'm looking at?
I started programming by learning some HTML in w3schools and building websites since I really wanted my own website, from there on went and learned more advanced stuff first CSS and JavaScript and eventually also C and Assembly, I would recommend starting with the basics first and only after you start feeling comfortable moving on to more advanced stuff, good luck!
@@nirlichtman thanks bro
@@nirlichtman I got another question. What resources do you recommend for learning command line and general computer knowledge stuff like how to work operating systems and shit?
Depends, are you interested in learning Windows or Linux? If you mean Linux, in general I would highly recommend learning to work with man pages and using them to learn about the various basic commands, you can start by running "man man", and then start by reading the man pages of the basic commands for instance ls, cat, grep ... For getting started videos about this kind of stuff I would recommend Network Chuck and Engineer Man
@@nirlichtman More so on Windows but Linux is interesting too. I'm a total beginner so I wanted to learn Windows first. Ty for the youtuber recommendations though.
thanks man :)
just realised the linker finds similar functions names in the given file name ( as parameters )
*Symbols*. The linker finds one and only one matching symbol definition for each reference, in static linking literally copying the definition code into place and in dynamic linking inserting an executable address resolution.
The distinction is important when you have function overloads, name mangling (functions in classes, in C++, for example), and non-function objects. Calling convention also sometimes impacts the fully qualified symbol name, and therefore resolution.
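Name mangling matters as soon as the caller is C++ rather than C. A widely used idiom, shown here as a sketch, is to wrap the declaration in an extern "C" guard so that a C++ compiler looks for the plain symbol addNumbers instead of a mangled one. The definition at the bottom is only a stand-in so the snippet links on its own.

```c
/* Header-style declaration (hypothetical). The guard asks a C++ compiler
 * to use the unmangled C symbol name `addNumbers`, which is the name an
 * assembly file would typically define. A C compiler skips the guard. */
#ifdef __cplusplus
extern "C" {
#endif

int addNumbers(int a, int b);

#ifdef __cplusplus
}
#endif

/* Stand-in definition (assumption for this sketch only). */
int addNumbers(int a, int b)
{
    return a + b;
}
```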
Can you please make a video about __asm embedding right into C/C++ source code?
Yah thats a good idea, I will add to my list
@@nirlichtman Thank you!
Is this actually "calling assembly code from C"? The compiler will generate main.s and then the assembler will assemble it to main.o. Since you have ext.o, the linker will know that there is an UND symbol in main.o and it will statically link it from ext.o. This is not calling Assembly from C, so I think writing Assembly in a C file or including an assembly file from a C file is the actual thing you need to cover. BTW, good videos, I really appreciate your work.
From the linker's perspective there are indeed no programming languages at all. It resolves symbols and takes startup code (from the C standard library, if you use the C compiler frontend) to produce an executable. The biggest difference here is that only the foreign function follows the conventions of the C compiler. Usually it's either a higher-level language like C++ or Rust calling into a C library, or two such languages using C conventions as a lingua franca to communicate. I think it's still valid to call this "calling assembly" if that assembly procedure just follows the common denominator in calling conventions.
You didn't say the outro line! But great vid as always
Right in the part where you said that C needed to know that the addNumbers function existed, shouldn't you have used "extern int addNumbers(...);"? Great video by the way, cheers!
Thanks! Regular non-inline function declarations are extern by default, so no need
All functions in C are implicitly extern
C is already a "portable assembly language" so calling assembly language from C is a bit redundant.
Hi @NirLitchman you promised that you would make a series on Linux from boot to shutdown... i.e. an in-depth Linux kernel walkthrough.... Eagerly waiting for that.
Stay tuned for next video :)
I didn't expect to see Windows on a C and Assembly video
This is called efficient code
Thanks.
Next video calling C functions from Java
👍Thanks man!
nice
(5+5 is probably not the best test to check this code 😉)
This is very hardcore!
well, that dead short and simple one could have been implemented with inline asm, and asm/C interactions deserve a slightly longer video
shut up, he did what the title said.
@@JeersNX no u
tHiS cOulD hAvE beEn 🤡
Your code is beautiful, your desktop cries for some ricing, my brother in Christ.
hi
hola
Low Level Programming, fuck yea.
Why would anyone do this anyway? Isnt it dangerous as the c++ code may also be messing with registers and us doing the same may mess up state?
The ABI guarantees that the specified input registers do not need to be preserved. If you touch registers that the ABI marks as callee-saved, you have to save and restore their state.
@@RayBellis Oh I see, tiny context switch
hi
hi