The video that never stops giving Made brainfuck compiler with only movs? Really cool Then it continued with "also made a C compiler for 32 bit arithmetic, divisons, floating point operations, matrices, sine and cosine, ray tracing..." "So I looked into what other instructions I could use" *proceeds to make conversions scripts for the entire library of x86 functions* What's next? A compiler with no instructions? "MoVfuscator 3.0" fuck
It's interesting how the "bottles of beer" program runs much faster towards the end (45:45) than at the start (28:00), probably because all the [ - ] snippets in brainfuck take such a huge amount of time to execute, since they'll have to run over the whole program for a decrement by one.
OMGWTFLOL?!? This dude just FLEW through his presentation in every way! The non-stop (yet extremely coherent) words coming out of his mouth, effortless navigation through his presentation slides and flawless execution of demos. All put together made for probably the most impressive presentation I've ever seen at a conference ever!! That shit set the bar for case study presentations or lecture type lessons/education or...idk what I'm trying to say except that I think everything about this was awesome! Even tho at the same time he makes me wanna just kind of give up at idk... life? I mean, it's like I really wanted to join the NBA but just watched Kobe Bryant score 81pts and realize I can't compete with that shit so I scratch NBA Player off my list lol! 😂
That approach could eliminate a whole sort of timing attacks in cryptographic systems. If all branches are executed anyway, every system-state would take the same amount of execution time.
You would still leak a lot of information through cache timing. You'll be running all the code each time, but if it operates on null data it might have different timing from cashe hits/misses.
Holy crap this guy is awesome! He made something that could be dry and boring into something fascinating and entertaining. Also I wish I had half his intelligence.
+James Edge I had him for my systems I class, and he was awesome there too. Took C programming and x86 and made it interesting, if not always the most fun!
Crux161, Cse2421 - Systems I: Introduction to Low-Lewel Programming and Computer Organization at Ohio State University, apparently. But it looks like the course may have a different instructor now.
This is both impressive and terrifying. Impressive that it was figured out. Terrifying at the prospect that someone might actually use it to obfuscate something important.
There are processors designed specifically around just the move instruction, such as Maxim Integrated's MAXQ microcontroller and the ABLE processor used in the Synclavier.
This isn't a new discovery. Amiga commercial coders were using this type of obfuscation to protect their software from cracking, way back in the 1980s.
Does REMath / movfuscator exist outside of github? Github has since been taken over by microsoft and closed down and restricted...did anyone clone REMath / rehost it anywhere? "and nothing else" well.. and a 'mov eip start' / 'jmp start' at the end almost spit my coffee out of my nose when it hit array inc[] @17:07 not sure if I'm understanding this one -- does this conditionally stop a) only when it's time to stop b) when the execution is on the executing , rather than the scratch (ie "not executing") branch?
No this does not work, see stackoverflow.com/questions/8333413/why-cant-you-set-the-instruction-pointer-directly . This would improve the speed of the movuscated binaries by some magnitude I assume.
While this may make the code more difficult to reverse, it still can be debugged, and I'd say it's likely there could be a de-movfuscator, that is able to automatically turn instructions like this back into equivalent assembly for a pretty high amount of instructions, and if that tool became necessary, someone would build it
Would it make sense to make a basic interpreter, and store the actual program as data? By consolidating all the branches to a small fixed set, you can save a lot of looping time as the program grows. You might also consider optimizing idioms like [-] to mov wherever, 0. Maybe less good for obfuscation (although you can probably infer a lot of structure in the original), but it’s still just mov.
you should watch his other talk 'reductio ad absurdum', or something to that effect - it kinda builds on that 'program as data' part of your first sentence :)
Having one mnemonic (in this case MOV) for dozens of different modes is very CISC, not RISC. I think LDI is from a different RISC though (e.g. AVR); RISC-V uses a constant 0 register (r0/zero), so small immediates are set using ADDI, while upper bits are set using LUI or AUIP. There's an LI pseudoinstruction that can generate an ADDI+LUI pair. Anyhow, since load immediate only alters state it cannot be affected by (writes general purpose registers, reads code memory), it is not Turing complete. ARM has a similarly complex address generation system and direct access to PC, so might be an easier target. I suspect you might still find it challenging to both read and write memory.
Because x86-64 can only use RIP as a base register in addressing, not as a destination register. wiki.osdev.org/X86-64_Instruction_Encoding It's a different story on e.g. ARM, which can access PC as R15, and MSP430, where it is R0. It might still be readonly on some architectures, such as when used both as a noop destination and for relative operations.
I could image using this compiler for a demo of a program. Since who in their right mind would try to reverse engineer such a Programm. You could also advertise better performance in the real version
eliluong you first write a 0 to mem position 3 because x=3, then you write a 1 to mem position 3 because y=3. Then you read mem position 3 again because x=3 and if you read 1 it is because y = x.
21:22 - In text mode you can write to 0xb0000 to write text directly to the video-cards buffer (0xb8000 for color mode) 24:35 - You can branch by modifying the instruction-pointer. You can also perform comparisons by modifying the flags register. 27:37 - It's bigger _and_ slower? RISC can be faster than CISC if you optimize it. This is super-RISC, so it needs lots of tweaking. 33:54 - Now you _really_ need to comment your code. 46:00 - Is that _CSI: Miami_ ? 🤔 (Sunglasses always are now. ¬_¬)
Ikr. I was confused why he insisted on over-complicating the process when he could achieve pretty much everything (comparisons, branching, etc.) by manipulating the registers. I mean I can understand the desire to push yourself with extra restrictions for the challenge, but it seemed like he didn't even try it the easy way first. 😕
@@user-vn7ce5ig1z it's literally not possible to write to the EIP register. You really think after this talk he'd be that dumb to not branch using the pc register if it was possible?
The video that never stops giving
Made brainfuck compiler with only movs? Really cool
Then it continued with "also made a C compiler for 32 bit arithmetic, divisons, floating point operations, matrices, sine and cosine, ray tracing..."
"So I looked into what other instructions I could use" *proceeds to make conversions scripts for the entire library of x86 functions*
What's next? A compiler with no instructions?
"MoVfuscator 3.0"
fuck
I was impressed by the M/o/Vfuscator 1.0
Then he whipped out the M/o/Vfuscator 2.0
I see movOS coming soon :D
or maybe the nopfuscator
It's interesting how the "bottles of beer" program runs much faster towards the end (45:45) than at the start (28:00), probably because all the [ - ] snippets in brainfuck take such a huge amount of time to execute, since they'll have to run over the whole program for a decrement by one.
This is ridiculous, unbelievable, and incredible. Looking forward to that zero instruction compiler.
OMGWTFLOL?!? This dude just FLEW through his presentation in every way! The non-stop (yet extremely coherent) words coming out of his mouth, effortless navigation through his presentation slides and flawless execution of demos. All put together made for probably the most impressive presentation I've ever seen at a conference ever!! That shit set the bar for case study presentations or lecture type lessons/education or...idk what I'm trying to say except that I think everything about this was awesome! Even tho at the same time he makes me wanna just kind of give up at idk... life? I mean, it's like I really wanted to join the NBA but just watched Kobe Bryant score 81pts and realize I can't compete with that shit so I scratch NBA Player off my list lol! 😂
you got me. exactly the same feeling
... and the branch prediction unit could not stop from crying.
That approach could eliminate a whole sort of timing attacks in cryptographic systems. If all branches are executed anyway, every system-state would take the same amount of execution time.
You would still leak a lot of information through cache timing. You'll be running all the code each time, but if it operates on null data it might have different timing from cashe hits/misses.
Great talk, surprised how fast it was
+Edwin Forsberg right?! the snake game honestly blew my face away. i figured maybe a frame every few seconds but dang!
This... is gold
Holy crap this guy is awesome! He made something that could be dry and boring into something fascinating and entertaining. Also I wish I had half his intelligence.
+James Edge I had him for my systems I class, and he was awesome there too. Took C programming and x86 and made it interesting, if not always the most fun!
this guy taught a class? Where?
Crux161, Cse2421 - Systems I: Introduction to Low-Lewel Programming and Computer Organization at Ohio State University, apparently. But it looks like the course may have a different instructor now.
I think you're the creepy one pallie.
One of the most entertaining talks i'd seen for a while :)
0:39 Start
This is both impressive and terrifying. Impressive that it was figured out. Terrifying at the prospect that someone might actually use it to obfuscate something important.
Like using a cross-assembler to translate conventional programs into all mov instructions 🙃
JMP is just a different mnemonic for Mov PC,X
change my mind
Imagine a processor designed to only use a single instruction like this.
CreeperOnYourHouse i recall an older hard drive using one with memory mapped io and alu
Easy to design. You can follow Ben Eater's series on here to build your own in a week or two. But it is horrible from a performance perspective.
That's literally a turing machine.
@@lotrbuilders5041 can the hard drive theoretically run Linux by itself?
Chimney Chap not with 1k of ram and 2k of ROM it had. With a lot of RAM it might be possible
This is my kind of trolling
awesome talk!
I can't believe I didn't see this earlier. This talk blew my mind!
Still faster than java.
Wow.. that was pointless, but incredibly entertaining. I was hooked the whole time!
Bananas. I love it!! Great presentation and mind-blowing concepts.
Of course! Because it's all just electrons MOVing from here to there. Oh my god I'm so stupid.
There are processors designed specifically around just the move instruction, such as Maxim Integrated's MAXQ microcontroller and the ABLE processor used in the Synclavier.
Doomguy called. He wants his compiler back.
See, that's the definition of a mad wizard. Bil Herd would be proud...
This isn't a new discovery. Amiga commercial coders were using this type of obfuscation to protect their software from cracking, way back in the 1980s.
Does REMath / movfuscator exist outside of github? Github has since been taken over by microsoft and closed down and restricted...did anyone clone REMath / rehost it anywhere?
"and nothing else"
well.. and a 'mov eip start' / 'jmp start' at the end
almost spit my coffee out of my nose when it hit array inc[]
@17:07 not sure if I'm understanding this one -- does this conditionally stop
a) only when it's time to stop
b) when the execution is on the executing , rather than the scratch (ie "not executing") branch?
Now I want to see a program written with a shitload of movs replacing the nops in the psychological warfare control flow.
38:02 I'm disapointed he didn't do DOOM
since the C source code is on github
Lol, but he did. Runs at 1 frame every 7mins but hey it works...
This is insane! But so awesome.
MOV [Mind], 'Blown'
Couldn't you just move a value into the program counter?
No this does not work, see stackoverflow.com/questions/8333413/why-cant-you-set-the-instruction-pointer-directly . This would improve the speed of the movuscated binaries by some magnitude I assume.
I knew there would be some fundamental problem. Thanks for the link!
are you still trying to make your own game though?
I wonder if this is something games could use for their copy protection.
andrewh111 No, because the framerate would make it unplayable
Yes. It would make the games unplayable, but it would be an even more annoying form of DRM.
There's a movfuscated version of DOOM, running at one frame every 7 hours :)
@@ITR You wouldn't run the whole game under it, just a little bit of stuff that's super crucial to making everything work.
While this may make the code more difficult to reverse, it still can be debugged, and I'd say it's likely there could be a de-movfuscator, that is able to automatically turn instructions like this back into equivalent assembly for a pretty high amount of instructions, and if that tool became necessary, someone would build it
You can't just "mov [EQ+eax], 0", because the eax can be any arbitrary number, which can even point to the address of the next assembly instruction!
awesime!
5:37 that's basically what Itanium was thinking
Would it make sense to make a basic interpreter, and store the actual program as data? By consolidating all the branches to a small fixed set, you can save a lot of looping time as the program grows. You might also consider optimizing idioms like [-] to mov wherever, 0. Maybe less good for obfuscation (although you can probably infer a lot of structure in the original), but it’s still just mov.
you should watch his other talk 'reductio ad absurdum', or something to that effect - it kinda builds on that 'program as data' part of your first sentence :)
I wonder is &MOV R, [x] implemented in Arm CPU? is that why apple switched over?
Question: is ldi Turing complete in RISC-V?
Having one mnemonic (in this case MOV) for dozens of different modes is very CISC, not RISC. I think LDI is from a different RISC though (e.g. AVR); RISC-V uses a constant 0 register (r0/zero), so small immediates are set using ADDI, while upper bits are set using LUI or AUIP. There's an LI pseudoinstruction that can generate an ADDI+LUI pair. Anyhow, since load immediate only alters state it cannot be affected by (writes general purpose registers, reads code memory), it is not Turing complete.
ARM has a similarly complex address generation system and direct access to PC, so might be an easier target. I suspect you might still find it challenging to both read and write memory.
Quick question, why can't I just move something into the program counter for jumps?
Because x86-64 can only use RIP as a base register in addressing, not as a destination register.
wiki.osdev.org/X86-64_Instruction_Encoding
It's a different story on e.g. ARM, which can access PC as R15, and MSP430, where it is R0. It might still be readonly on some architectures, such as when used both as a noop destination and for relative operations.
Reminds me of functional / church!
This somehow reminded me of how DNA works.
I could image using this compiler for a demo of a program. Since who in their right mind would try to reverse engineer such a Programm. You could also advertise better performance in the real version
could it be implemented on ARM and without the dummy data issue?
ARM has separate instructions for reading from memory, writing to memory and moving values between registers, so this approach probably wouldn't work.
Its not a cross product blin, its a cartesian product!
I thing I will write a better Excel with 'mov'... damn... even better,,, with Z80's 'LD'...
but first... coffeeeeeeeee.... ;-)
But can it run Crysis
around 7:00 still don't get how you know it's equal or less than just by reading memory values
eliluong you first write a 0 to mem position 3 because x=3, then you write a 1 to mem position 3 because y=3. Then you read mem position 3 again because x=3 and if you read 1 it is because y = x.
Thanks for that short and simple explanation.
Why do all the videos of Chris Domas have such horrendous audio quality? I would love to listen to them but i get a cluster headache 5 minutes in.
Looks like someone just rediscovered FPGAs
It’s kinda like lambda calculus
21:22 - In text mode you can write to 0xb0000 to write text directly to the video-cards buffer (0xb8000 for color mode)
24:35 - You can branch by modifying the instruction-pointer. You can also perform comparisons by modifying the flags register.
27:37 - It's bigger _and_ slower? RISC can be faster than CISC if you optimize it. This is super-RISC, so it needs lots of tweaking.
33:54 - Now you _really_ need to comment your code. 46:00 - Is that _CSI: Miami_ ? 🤔 (Sunglasses always are now. ¬_¬)
You can't modify the instruction pointer in x86. EIP can't be a destination of a mov instruction.
mov ip, [eax]? Wuahahahaha
Ikr. I was confused why he insisted on over-complicating the process when he could achieve pretty much everything (comparisons, branching, etc.) by manipulating the registers. I mean I can understand the desire to push yourself with extra restrictions for the challenge, but it seemed like he didn't even try it the easy way first. 😕
Or maybe, he didn't use mov eip, x because there's no such instruction?
@@user-vn7ce5ig1z it's literally not possible to write to the EIP register. You really think after this talk he'd be that dumb to not branch using the pc register if it was possible?
Your if is more like goto than if.
that's how ifs work
you can jump by moving a value to the ip register.
33:40 but there is a c to bf compiler...
And If We Create A CPU With A Move Instruction Only?
I Called "SIMA" For ::::::::::: Single Instruction Move Architecture
Wikipedia article: One-instruction set computer. They exist, and can function quite well.
Isn't this how quantum computing works? Compute everything at once and only look at specific values (the real values)?
i_kneel.jpg