The full tutorial, with self-study examples and some more links, is on Patreon here: www.patreon.com/posts/why-is-pe-entry-61343353
Thanks for the video, but I have a question. Why is all of this needed when you can simply go to the EntryPoint through main, which is listed in the Symbols tab of X64DBG (Ollydbg)?
Can't wait for the video on calling conventions
As FreeDomSy pointed out I may need that video more than anyone haha!
Added to my programming playlist
Semi-related: you may have assumed that the code in main is the first code to run in the process. Obviously this video shows that's not true, but it's not the entire story either: there's also code that runs before the entry point. If you set x64dbg to break on the entry point and look at the stack, you'll see a return to KERNEL32.DLL on the top of the stack, because even before running the entry point, the Windows loader has to set up the Import Address Table, the TLS Index, etc. so that those things can be read from memory by the executable's code. So KERNEL32.DLL code runs before anything in the main executable file, a bit counterintuitively, and then it calls the entry point like a function pointer.
This fact is sometimes abused by IAT rebuilders in order to avoid calling LoadLibrary and GetProcAddress. The idea is that the code reads the return address into KERNEL32 from the stack and, knowing it points somewhere inside KERNEL32, keeps scanning backwards from that address until it finds the start of KERNEL32's image (by looking for the "MZ" signature at the start of its headers). Then it can manually traverse KERNEL32's export table to find the addresses of its exports. At this stage, you might use Toolhelp Snapshots to do the same for other DLLs. The return address can also be jumped to, as a way to exit the process without using any imports.
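To make the backwards scan concrete, here is a minimal sketch in Python. It does not touch real process memory: it simulates an address space as a byte buffer, and `make_fake_image`, `find_image_base`, and the addresses used are all hypothetical names and values made up for illustration. It assumes the image starts on a 64 KiB boundary (the Windows allocation granularity), and it checks the "PE\0\0" signature via `e_lfanew` so that a stray "MZ" is not mistaken for a header.

```python
import struct
from typing import Optional

ALIGN = 0x10000  # assumption: images are mapped on 64 KiB boundaries


def find_image_base(memory: bytes, addr: int, align: int = ALIGN) -> Optional[int]:
    """Step backwards from addr, one aligned block at a time, until a PE image start is found."""
    base = addr - (addr % align)
    while base >= 0:
        if memory[base:base + 2] == b"MZ" and base + 0x40 <= len(memory):
            # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature
            (e_lfanew,) = struct.unpack_from("<I", memory, base + 0x3C)
            pe = base + e_lfanew
            if memory[pe:pe + 4] == b"PE\0\0":
                return base
        base -= align
    return None


def make_fake_image(size: int, e_lfanew: int = 0x80) -> bytearray:
    """Build a zero-filled buffer with just enough header bytes to look like a PE image."""
    img = bytearray(size)
    img[0:2] = b"MZ"
    struct.pack_into("<I", img, 0x3C, e_lfanew)
    img[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    return img


# Hypothetical flat address space with one fake image mapped at 0x10000.
memory = bytearray(ALIGN * 4)
memory[ALIGN:ALIGN * 3] = make_fake_image(ALIGN * 2)

# Pretend this value was read from the top of the stack at the entry point.
return_address = ALIGN * 2 + 0x1234

image_base = find_image_base(bytes(memory), return_address)
```

Real code doing this would read the process's own memory instead of a buffer, and would go on to parse the export directory from the headers it found.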
Ah, you are describing the loader. The first part of the PE's own code that runs is either the TLS callbacks or the entry point (DllMain for DLLs). You should join our discord, I think you would enjoy the topics there discord.gg/oalabs
Thank you for the breakdown!
Nice vid! It's a Visual Studio stub. If you take the same code that was written in VS and compile it with something else (not MSVC), the stub obviously won't be there, or you will get a different one. But it is good to know this so you can identify the compiler used and know what you should analyze and what is just compiler code.
I like your tutorials already, but could you pleeeease make the font bigger at the beginning of every video?) It will save my eyes)
Excellent as always!
This has always confused me especially when the binary is virtualized. Great tutorial, like always
Exactly! Our next tutorial on Patreon is looking precisely at this with VMP 😉
10:20 I think the moves are RCX, RDX, R8, R9 for the first 4 args on x64 and not R10
OMG 🤦♂️ I can't believe I didn't catch that hahah! My bad, yes RCX, RDX, R8, and R9. I don't do a lot with 64-bit so it's rusty... maybe I need to watch my own calling conventions tutorial 😆
Nice and useful video! (as always) Although I have a dumb question. Does the caller allocate "shadow space" only on x64, or does it also do it on x86? I looked at it myself and my hypothesis is it's only a thing in x64 applications.
x86 doesn't use shadow space, it's an x64-only feature (also it wouldn't make too much sense except for fastcall calling conventions)
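For anyone who wants the x64 layout spelled out, here is a small sketch in Python of where integer or pointer arguments live at the moment the callee starts running under the Windows x64 convention. The function name `arg_locations` is made up for illustration, and floating-point arguments (which use XMM registers) are deliberately ignored. The first four go in RCX, RDX, R8 and R9, the caller still reserves 32 bytes of shadow space for them, and later arguments sit on the stack above that.

```python
REGS = ["RCX", "RDX", "R8", "R9"]
RETURN_ADDRESS = 8   # pushed by the call instruction
SHADOW_SPACE = 32    # 4 slots of 8 bytes reserved by the caller


def arg_locations(count: int) -> list:
    """Location of each integer/pointer argument as seen at callee entry."""
    locations = []
    for i in range(count):
        if i < len(REGS):
            locations.append(REGS[i])
        else:
            # Stack args start right above the return address and the shadow space.
            offset = RETURN_ADDRESS + SHADOW_SPACE + 8 * (i - len(REGS))
            locations.append(f"[RSP+0x{offset:X}]")
    return locations


six_args = arg_locations(6)
```

Here the fifth argument lands at `[RSP+0x28]`, which is why you see that offset so often when reading x64 disassembly.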
Nice
Do you have any remote positions open for malware reverse engineers? I’ve been learning some stuff over the years from you and I think I’m ready 😤
Best place to find remote RE jobs is ninjajobs.org/. It's free, and I highly recommend it!
What do you think about switching the video thumbnails to dark mode?
lol I don't think about it 😂
Can you please go over how to analyze x86 ELF malware? I have not been able to analyze one. I use readelf, strace, ltrace, gdb for debugging and IDA for static code analysis. The binaries that I am looking at are stripped of symbols and I can’t tell what’s what. I figured I can try to catch every syscall using gdb, or look for int 0x80 instructions, find the interesting syscalls I saw in strace, and start the analysis from there. Is that how you do it? I would like to know your methodology please.
I really don't look at ELF binaries very often so I don't have a specific Linux setup. The last ELF malware I had to analyze was some ransomware that was encrypting ESXi hosts. In that case I did the full analysis statically. Even with a stripped binary you can use Lumina or other custom solutions to label library code. If you are going to be analyzing a lot of ELF malware (is there a lot?) I would suggest investing some time in building out a good FLIRT db. There are a few open source attempts at this that might help get you started (github.com/push0ebp/sig-database).
I'm guessing Windows needs to do some fixing before it can call main, like setting up the heap etc...
What's stopping a malware dev from executing code before main as a form of "obfuscation"?
If you compile with MSVC the options are limited. You can check out TLS callbacks though, those used to be a pretty common malware trick (isc.sans.edu/diary/How+Malware+Defends+Itself+Using+TLS+Callback+Functions/6655). Obviously the binary could be modified post-compilation, or built with something custom, maybe using mingw or another compiler, but that's far out of the scope of the tutorial.
There's nothing stopping you from patching the compiled binary after the fact to hide something in CRT code, at least if you know what you're doing and avoid using the features that code sets up (such as floats, TLS, etc.) In fact, it'd be a pretty stealthy place to hide something since everyone just steps over that stuff (especially in a DLL.) It's a little high effort though.