Aleph 0 is the cardinality of the set of whole numbers. Aleph 1 is the cardinality of the smallest uncountable set. It is presumed (though not provable in standard set theory) that no set is larger than Aleph 0 but smaller than the real numbers; that's the continuum hypothesis. The reason we aren't sure is that Gödel and Cohen showed the hypothesis is independent of ZFC: the existence of a set with cardinality strictly between Aleph 0 and the reals wouldn't break set theory, and neither would its nonexistence.
Didn't this already happen? It wasn't that long ago, maybe 1.5-2 years, that there was a vulnerability in the architecture that could only be patched in software
ℵ₀ is the cardinality of the set of natural numbers, the largest countable set. ℵ₁ is the next cardinality, the cardinality of the smallest uncountable set. 𝔠 is the cardinality of the real numbers. The Continuum Hypothesis (CH) states that ℵ₁ == 𝔠. This cannot be proven in ZFC because it has been shown to be independent of ZFC, that is, both it and its negation are compatible with ZFC.
I think ThePrimeagen explained it wrong. I think "probe test_ptr" means it simply tries to read test_ptr, or maybe its memory address, and times the read, expecting it to be faster if it's cached.
yea he really has absolutely no clue whatsoever on cpu advancements he literally stated the opposite of the truth, it is insane to me. he knows nothing about this.
I'm not that experienced with computer hardware or any lower level programming. Can someone explain why increasing cache size isn't a solution? I know cache is very small, and I want to know why.
It has to be physically very close to the CPU, and the lookup itself has to be extremely fast. Cache is so fast that the speed of electricity through the wires matters. The biggest recent innovation has been AMD's 3D V-Cache, which puts the cache literally on top of the CPU. But even then, they can only put so much in there. To greatly simplify: the constraints are physical.
While i was trying to predict the future of this video it got really easy after he said the thing about his haircut. Has my productive algorithm been hacked?
"I smashed some stacks, but only unintentionally". Is that dev speak for "I smoked, but didn't inhale"? I did smoke some chips, unintentionally... Or was it unsmoked, because they let the magic smoke out, thus stopped working... Lack of magic issues...
If this affects M series chips, are they not all defective and should not all mac users demand a return of excessive funds paid for "secure" computing?
Hackers who do it for fun and not to cause problems, and actually try to learn--those are cool. Even if they aren't cool in other ways, they are a cool hacker. The ones who do it seriously and do it to stop exploits are also cool hackers.
I'm going to disagree with him that the fundamentals are "C and assembly" -assuming the context of "low level"- I would advocate Forth over C any day of the week, for the simple reason that Forth defines 'words' [what other languages call subprograms or functions] as _"a sequence of words or else a chunk of machine-code to execute"_ *this* actually lets you go low-level, or abstract out [fairly] high-level, _very_ quickly. But, in general, "the fundamentals" are not actually anything like a programming language or a set of assembly instructions; rather, "the fundamentals" are: (1) the ability to define your problem, (2) the ability to break your problem down into smaller parts, and (3) the ability to read/research.
As the old joke goes: There are two hard things in computing: Naming things, cache invalidation, and off by one errors.
Related to this: only 10 types of peope, those who know binary and those who don't.
Bro being cheeky with the off by one pun
there are only 10 types of people, those who know binary, those who don't and those who didn't expect a ternary joke
I have this on a t-shirt, I am in 2 minds about how it should be numbered (1,2,3 as current; or 0,1,2 which feels more appropriate, sort-of)
Works also as There are three hard things in computing: Naming things, cache invalidation, and off by one errors. Shit I fu up the joke.
As a cpp user, I like putting constexpr everywhere so that 100% of my code execute at compile time and I achieve infinite fps
If you constexpr everything, then the moment you run the game it immediately ends so you can move on to another one
Efficiency over usability! I like it.
By C++48 everything in C++ will be constexpr and the compiler will simply have the equivalent of the Java JVM built in
Constexpr doesn't guarantee compile-time execution; it only works on simple stuff anyway
Now that's a compiler that can *actually* see into the future (unlike the fake see-into-the-future that the speculative execution is).
Speculative execution has been around since the '80s, when RISC made pipelining mainstream. The reason is that pipelining gives a speedup of m, where m is the length of the pipeline, as long as the pipeline is full. The cost is a startup latency in the so-called "wind-up" phase, the phase where you fill the pipeline. You pay this price every time you branch, since you have to refill the pipeline. Without branch prediction and speculative execution you would have to refill the pipeline at the final branch of every for loop, fully defeating the advantages of pipelining and actually making things worse than not pipelining at all. I have an exam on programming techniques for supercomputers in a few days; life is hard
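Branchless rewrites are the classic way to dodge the misprediction cost when a branch is unpredictable. A minimal sketch (function names made up) of the same loop written both ways:

```c
#include <stddef.h>

/* Branchy version: the CPU must predict the `if` on every element.
   On random data it mispredicts about half the time, and each miss
   forces the pipeline to be flushed and refilled. */
long sum_positive_branchy(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0)
            s += a[i];
    }
    return s;
}

/* Branchless version: the condition becomes an arithmetic mask, so
   there is nothing to predict except the loop's own backward branch,
   which is almost always taken and thus predicted well. */
long sum_positive_branchless(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        int mask = -(a[i] > 0);  /* all-ones if positive, else zero */
        s += a[i] & mask;
    }
    return s;
}
```

Both compute the same sum; on sorted input the branchy one is fine (the predictor learns the pattern), on shuffled input the branchless one typically wins.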
Btw the exam went great! In case anybody on this planet was wondering
@@indiano-99 You're built different my dude.
@@indiano-99 I had just finished reading the first post, and indeed, I was curious how it went.
Well done, you! I hope you've treated yourself to an ice cream.
if i had a dollar for every low level bug/cpu security feature discovered this year i'd have like 20
Good money
Yeah, I thought this already happened last year and this year; now twice this year for the ARM.
But still not enough to buy a carton of eggs 😔
If I had 2 to the power of the number of security bugs found this year, I would have around a million.
She speculates on my execution 'til I crash.
Is that a buffer overflow in your pants or are you just happy to see me?
*[EXTREMELY LOUD STREAM ALERT]*
5:25 "I am not a hacker, I just ship bugs to production" said the hacker's best friend =)))
That joke just flew past me initally
I saw speculative execution in 1970 on an IBM 360/75. A program I was testing had a S0C4 exception, basically an illegal address. When I reviewed the crash dump, I observed that maybe 5-6 of the following register-to-register instructions had been performed. No virtual memory, no page tables, just raw memory. So the machine was running ahead of the slowish instruction that accessed raw memory.
Sir, were you working with COBOL?
I started with COBOL but a few months later was transferred to the "systems" department and did nothing but assembler work for the next several years. A couple weeks playing with PL/1. Worked on auto answers to outstanding WTORs, patched the MVT open routines so the operators' log would write to a tape, improved performance of long-running programs... normal stuff at that time.
Watching Prime spend 5 minutes trying to dig out Memory Management Units (MMUs) and Translation Lookaside Buffers (TLBs) from his school days memory is wild.
Props to you man. Streaming live content is tough!
Just need to remember that without TLB cache the MMU sh*t wouldn’t execute fast enough on its own to be worth a d*mn
@@TheSulross Could you say it's not worth a DIMM? Sorry i'll see myself out
@@TheArrowedKnee Well, we all know how touchy YT can be with those explicit words - that was very brave of you to go there
Me trying to sleep: zzzz
That one mosquito: 5:36
Underrated
I'm dying
You should interview Christopher Domas.
The TLB is a cache of recent virtual-to-physical address translations. A related trick: since a pointer never needs all of its bits for an actual address, the unused upper bits can be used to store other pieces of information, effectively packing two or more items into a single register. That's exactly what pointer tagging (and ARM's pointer authentication) exploits.
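A minimal sketch of that bit-reuse idea in software (the shift and mask values are assumptions for illustration, not any particular ABI; hardware schemes like ARM's Top-Byte-Ignore formalize the same trick):

```c
#include <stdint.h>

/* Store an 8-bit tag in the top byte of a 64-bit pointer-sized value.
   User-space addresses don't use those bits, so the tag rides along
   for free; it must be stripped before dereferencing. */
#define TAG_SHIFT 56
#define TAG_MASK  (0xFFull << TAG_SHIFT)

uint64_t tag_ptr(uint64_t addr, uint8_t tag) {
    return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

uint8_t get_tag(uint64_t tagged) {
    return (uint8_t)(tagged >> TAG_SHIFT);
}

uint64_t strip_tag(uint64_t tagged) {
    return tagged & ~TAG_MASK;
}
```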
26:42
One way to mitigate "the weather man problem" is to set up your chair and your camera slightly to the left of your monitor, that way when you look at the screen you are always looking to your right.
3:40
Aleph_0 is the size of the natural numbers. Aleph_1, well, that gets complicated fast. The full answer requires somewhat deep infinitary set theory. It also requires that you get somewhat in the weeds with which set-theoretical axioms imply what. Here's an attempt to summarize:
Assuming merely the Zermelo-Fraenkel axioms Aleph_1 is the second smallest infinity which can be well-ordered*.
The axiom of choice** is equivalent to saying all sets can be well-ordered, so if you assume that, this simplifies to 'aleph_1 is the second smallest infinity'.
The idea that Aleph_1 is the size of the real numbers is known as the continuum hypothesis. It was originally discussed by Cantor who invented infinitary set theory.
Hilbert then made proving or disproving the continuum hypothesis one of the problems of the next century in the year 1900.
Later Gödel proved the continuum hypothesis was consistent with the Zermelo-Fraenkel axioms and the axiom of choice (ZFC for short)***
Even later Cohen proved the negation of the continuum hypothesis was also consistent with ZFC.***
This means that the continuum hypothesis is independent of ZFC. Simply put, we can't prove or disprove it, without making assumptions we don't usually make.
This means that beyond saying that aleph_1 is the second smallest infinity (that can be well-ordered) we don't exactly know how big it is.
It might be the size of the real numbers, but it might be the size of some subset of the real numbers that's neither the size of the natural numbers, or the real numbers.
It's perhaps worth noting that without the axiom of choice the real numbers might not be well-orderable at all, and hence would not be any aleph.
* A well-order is a set together with a 'smaller than' relation under which every non-empty subset has a least element.
Short version: Aleph-1 is the cardinality of the set of countable ordinal numbers. The ordinal numbers represent the order of something. For example, 1st, 2nd, 3rd. You may think the ordinals and the positive integers line up exactly, but past the finite ones they don't. For example, you can define an ordinal called omega, the first ordinal after all the natural numbers. Getting omegath place in a race means that an infinite number of people finished the race, and then you finished.
That's as far as I can get without getting into the weeds of it lol
it's a big sign of respect when Prime highlights your whole sentence
@18:06 "We still get like 15% per year"
Prime: "Yeah but 15% is nothing like it was in 2000s... like it 4x-ed over 10 years"
BUT PRIME 15% increase per year for 10 years is almost exactly a 4x !
1.15^10 is a 4.04x increase...
Prime just casually insulting top-notch cyber security researchers as basement dwellers. 😂
They won, but at what cost.
Win some, lose some.
Top-notch? When has he covered any?
take my muney for the "felt cute, might execute later" shirt
Starting with the Pentium 4 (!) branch prediction got really aggressive. In that chip a large fraction of the silicon was there just to detect whether the prediction was wrong 🎉🎉
It's called "Aleph null" (pointer exception)
Speculative execution is vital for processor performance, but its repercussions seem to be too hard for humans to really fully comprehend. We're truly borked nowadays.
basedment++
“All hackers are cool…ummmmm!!” Had me lol and smashing a like😂👍🏼
The difference between a programmer and a hacker. One creates exploits and the other exploits them.
@@Telhias that's more like the difference between a developer and a hacker. both are programmers.
It's like translation: translating is hard - well, not anymore, but it used to be - while finding mistakes or arguing about translations is easy, aka finding bugs or exploits. One side has to produce the code/language; the other only proof-reads it.
Actually, from 2000 to 2004 the common clock went from 800 MHz to 3.4 GHz. Computing power had been doubling every 2 years for decades. Now we're far from that: since the Core family was introduced in 2006, the number of cores and transistors has still increased a lot, but the computing power doesn't scale that much.
“… to 2004” - did you mean 2024?
@@meowsqueak he did mean 2004 as he’s talking about the time of the Pentium 4
@@meowsqueak no. I think around 2006 there were 4 GHz Pentium 4s (I was playing BF2 on one :-) ). The Core architecture brought lower clocks for years; only recently have they gone back to high frequencies, because they can't gain in architecture anymore
Ok, I didn’t realise clocks hit 3 GHz+ by 2004, but it was a long time ago…
@@meowsqueak maybe you were barely born :-). Yes, young people probably don't realize how little tech has progressed since then. Only core count, cycles per instruction, and vector instructions account for most of the computing power increase, plus parallel processing used by software whenever it makes sense (not that often, sadly), and a bit from lower-latency caches and memory bandwidth (frequency and bus width in bits). I have a laptop from 2008 with a quad-core Core i7.
About pointer authentication: you can build a pretty simple chip to encrypt/decrypt pointers with very little overhead. Dedicated chips that do just one thing, like encrypting, can do it in a "single" clock tick.
PS: just realized that ASIC miners are exactly that, dedicated crypto chips that perform those operations orders of magnitude faster than any CPU.
Then how you can do offsets?
@@hanifarroisimukhlis5989 Honestly, I don't know. I never thought about implementing one; I just remembered my FPGA class where we had to code some hardware for specific problems. But I'm sure we could come up with a solution like we have now, something with vtables.
How do you follow that chat stream it’s like a gaggle of chickens clucking
@26:06 ultimate bumper sticker
"Im not a hacker, I just ship bugs to production"
That got me
8:15 it's a VPT, Prime, a virtual page table. The 'translation' is called virtual address translation, accelerated via translation look-aside buffers.
"Gate all around - the future of transistors."
Good youtube video.
"likely" is mostly just about ICACHE - code that's likely branch is there to be loaded and "unlikely" somewhere else behind branch. It makes you take branch on unlikely code and have next likely code in ICACHE.
That moment when all the researchers have Korean names: Oh no....
Async socket code in C is messy. The Windows implementation of I/O completion ports takes in structs that can be filled differently depending on what mode you want them to run in. E.g. you can get notifications as a Windows event-loop message, a WaitForSingleObject blocking call, or a GetQueuedCompletionStatus blocking call. By a blocking call I mean something similar to how pselect works: it blocks until one of the multiplexed sockets has an event.
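For contrast, here's a minimal sketch of the POSIX readiness style the comment compares against, using `poll` rather than `pselect` (the helper name is made up). Readiness models tell you "this fd is readable now"; completion ports instead notify you after the I/O has already finished:

```c
#include <poll.h>
#include <unistd.h>

/* Block until fd is readable (or timeout_ms elapses), then read one
   byte. Returns the byte, or -1 on error/timeout. */
int read_one_byte_when_ready(int fd, int timeout_ms) {
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int rc = poll(&p, 1, timeout_ms);
    if (rc <= 0 || !(p.revents & POLLIN))
        return -1;
    unsigned char b;
    if (read(fd, &b, 1) != 1)
        return -1;
    return b;
}
```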
from 5:00 to 5:30 is just hilarious
7:45 All a memory mapper does is translate a virtual address (limited, from the programmer's point of view, only by disk storage capacity) into a real address by means of a very fast hardware table. If the virtual address is not in memory at the moment, the page is brought in from disk storage (or wherever it lives) into the least-recently-used spot in real memory, and the mapper's table entry for that virtual address is updated to point at the newly paged-in code or data. It's just a cool means of caching your disk space as virtual memory for program code or data.
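A toy sketch of that lookup. Real MMUs use multi-level tables plus a TLB cache, and all names and sizes here are made up for illustration, but the core operation really is just this:

```c
#include <stdint.h>

/* 4 KiB pages, with a flat array standing in for the page table. */
#define PAGE_SHIFT 12                   /* 4096-byte pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16

static uint32_t page_table[NUM_PAGES];  /* virtual page -> physical frame */

uint32_t translate(uint32_t vaddr) {
    uint32_t vpn    = vaddr >> PAGE_SHIFT;      /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);  /* offset within page  */
    uint32_t frame  = page_table[vpn];          /* the table lookup    */
    return (frame << PAGE_SHIFT) | offset;      /* physical address    */
}
```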
1:58 They are in
nested mother's basement😂
Runahead optimizations were already a thing in the early 2000s, at least in research; they also enhance branch prediction greatly. But cache misses and memory latency were already big issues back in the 90s; there just weren't enough resources in the CPU to do anything about it.
This whole thing flew over my head.
If a bug is not fixable, it’s not a bug, it’s a feature.
Yeah 😅🫠😬
double basement of knowledge is my new quality metric for these types of topics
Very interesting that stack smashing is seen as a security issue - well - it _is_ a security issue - no doubt about that, but some systems/languages actually use this type of manipulation as a feature! In the Forth programming language, the return stack is often manipulated to execute code! What an interesting domain we work in!
I learned about this number theory in discrete mathematics. Aleph null is the set of all sets. I loved it because it introduced the concept of infinite infinities and counting infinity, and how some infinities are larger than others... I love this stuff.
No, with ZFC there isn't any "set of all sets". Aleph null is cardinality of a countably infinite set.
How do you count an infinite set? By its definition it will take an infinite time, unless the counting per step time is literally zero (not even "effectively" zero).
"a little exciting, but mostly dangerous"
4:50 Self modifying code. It's as cool as it is scary.
7:45 no, memory mapping didn't come from 32-bit. It probably came from the era when computers ran one program at a time and people wanted to introduce multitasking.
4:02 wild "yeh"
V8 is ARMv8, the successor to ARMv7
8:07 It was 2G or 3G with Large Address Aware.
I love these videos
One game on like the PlayStation or something used buffer overflows to do the first OTA updates. It had a pre-screen that showed game news. They realised they could send the game news with a bunch of extra data that would patch the game live...
5:42 it’s called furry swag
When you try to do something crazy to improve performance, you often have to take a risk in it being easy to mess up and break or make a security hole.
Top quote here: russian-doll basement.
Aleph 1 = Cardinality of the smallest uncountable set
isent it aleph null?
@@forest6008 No that's the cardinality of the smallest infinite set (countable infinity). Smallest uncountable set is the second smallest infinite set
@@TianYuanEX ohh thanks
@@TianYuanEX they're just sets, set can be defined as a number, thus you can type the type.
you can have a set of sets, but category theory is abstract non-sense.
if you can write it in mathematical notation, it is not infinite, isn't it ? it just discrete representation of a possibly infinitude, but it still not the biggest infinite thing.
here's a new concept. y = Aleph(Aleph(x))
Because Godels Aleph number wasn't infinite enough.
@@monad_tcp I'm gonna be honest with you, nothing you wrote makes any sense
I think there was something called PAE instructions that let you access more than 4GB of memory on 32 bit intel processors.
That wasn’t a bug. It was an attempt by Intel to extend the life of 32-bit software, thinking that 64-bit wasn’t ready yet.
IIRC, it allowed the 32-bit memory space to be mapped into a 36-bit physical space. No one program would have more than 4GB*, but many programs could be run with their own 4GB space. This reduced paging pressure on the system, something that was becoming a serious issue for operating system performance.
* In theory a program could have supported PAE directly and use more than 4GB of RAM. Similar to extended memory in the DOS days. In practice, I don’t think that ever happened.
@@thewiirocks I didn't say that it's a bug. Prime mentioned that 32-bit programs cannot access more than 4GB of memory, which is technically true but not the whole story.
@@thewiirocks Also useful to allow a 32-bit program to access 4GB of RAM on a 64-bit system, handy if you're modding games so they need more RAM than they did originally.
It basically gave the system access to multiple 4GB address spaces by widening physical addresses with extra bits, 4 to be specific. You could therefore have 2^4 * 4GB = 64GB of addressable physical memory, while each process still saw its own 4GB virtual space.
I first read about Aleph in a book called "Mathematics and the Imagination" by Edward Kasner and James Newman, 1940. It also introduced the infamous 'googolplex'.
The concept of infinity was very popular at the time, though nobody really understood it, if anybody actually does now. The book illustrates this by leveraging finite numbers so big that they are more or less impossible to grasp outside of the imagination, whereas even (regular!) infinity dwarfs them into nothingness (equating size with distance is a misnomer; it can't actually apply).
It talks about a number of other fun and paradoxical mathematical subjects. It's one of the best books I have read and is still available for free online.
lmao the Russian doll knowledge/basement had me rolling
Did I just hear mention of the cardinality of different infinities? This is why I pay the man money.
9:47 it’s not costly in performance because the HW handles it in parallel
Let's say you are brute forcing a tag guess. What happens when there's a mismatch? Crash.
But if the mismatch only happens speculatively, it won't crash.
This allows brute forcing tag guesses.
Run a long speculative tag guesser and you'll have all the valid tags available, you're back to good ol 90s & u can finally stack smash in peace.
Guys being guys, always trynna find a way to smash smt
Also,
If I am "that" guy who stack smashes in the future, it won't be hard to figure out whether RAM was actually accessed. So many ways 🤔
Speculative execution came way before the CPU speed wall. It was already there in Pentium Pro processors in the 90's. It's nothing new.
2:01 😂 Double basement
i wonder if just submitting both execution paths to different cores would be faster than branch prediction
14:20 learning moment for me. Aware
Aleph 0 is the cardinality of the set of whole numbers. Aleph 1 is the next infinite cardinal after it. The continuum hypothesis presumes (though it cannot be proven) that the real numbers have cardinality Aleph 1, i.e. that no set is larger than the whole numbers but smaller than the reals.
The reason we aren't sure is that the continuum hypothesis is independent of standard set theory: Gödel showed it can't be disproven from the ZFC axioms, and Cohen later showed it can't be proven from them either. A set of intermediate cardinality existing wouldn't break set theory, and neither would its nonexistence.
Didn't this already happen? It wasn't that long ago, maybe 1.5-2 years, that there was a vulnerability in the architecture that could only be patched in software
Yikes! I have been trying to learn ARM for the last few months now (fun by the way). Yeah. Gotta watch!
GaN-based processors could go to hundreds of GHz. P-type GaN transistors are just really difficult to make.
I would have thought we'd have transitioned from silicon to synthetic diamond for the substrate already
ℵ₀ is the cardinality of the set of natural numbers, the largest countable set.
ℵ₁ is the next cardinality, the cardinality of the smallest uncountable set.
𝔠 is the cardinality of the real numbers.
The Continuum Hypothesis (CH) states that ℵ₁ == 𝔠. This cannot be proven in ZFC because it has been shown to be independent of ZFC, that is, both it and its negation are compatible with ZFC.
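For anyone who wants the compact version of what the thread above is saying, in standard notation:

```latex
\aleph_0 = |\mathbb{N}|, \qquad
\mathfrak{c} = |\mathbb{R}| = 2^{\aleph_0} > \aleph_0 \ \ \text{(Cantor)}, \qquad
\text{CH:}\ \ \aleph_1 = 2^{\aleph_0}
```

Both CH and its negation are consistent with ZFC (Gödel 1940, Cohen 1963), which is exactly why it's "independent" rather than open in the usual sense.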
No wonder i seem drawn to cybersecurity stuff
You should definitely learn assembly... Sure way to go insane
I think ThePrimeagen explained it wrong. I think "probe test_ptr" means it simply tries to read test_ptr, or maybe reads its memory address, and times the read, expecting it to be faster if it's cached.
Prime's alien!
yea he really has absolutely no clue whatsoever about CPU advancements; he literally stated the opposite of the truth. It's insane to me. He knows nothing about this.
@@JohnSmith-pn2vl He is a web dev, what did you expect lol
You can't do that sort of research in a basement; you need like an abandoned factory, maybe an entire underground complex.
5:57 in chat "Yea its the rainbow hats" has got to be the funniest tech joke I've seen in a minute.
so was the joke at 4:57 funnier
5:38, I legit thought he was doing the Captain Crunch whistle phone hack tone.
Ever heard about c3 lang?
All these newer bugs around speculative execution make me scared about all the times I implemented it. Some of these are just too hard to catch.
"we don't wanna touch each other's memory"
/proc/*/mem:
Basements all the way down.
You never tried? I still have some memories of a school project writing a stack overflow vulnerable kernel driver ...
I'm not that experienced with computer hardware or any lower level programming. Can someone explain why increasing cache size isn't a solution? I know cache is very small, and I want to know why.
It has to be physically very close to the CPU, and access to it has to be very fast. Cache is so fast that the speed of electricity through the wires matters, and the bigger you make a cache, the longer the wires and the lookup get, so a bigger cache is inherently a slower cache.
The biggest innovation has been AMD's 3D cache which puts the cache literally on top of the CPU. But, even then, they can only put so much in there.
To greatly simplify: the restraints are physical.
While i was trying to predict the future of this video it got really easy after he said the thing about his haircut. Has my productive algorithm been hacked?
basement inside a basement :D
"I smashed some stacks, but only unintentionally". Is that dev speak for "I smoked, but didn't inhale"? I did smoke some chips, unintentionally... Or was it unsmoked, because they let the magic smoke out, thus stopped working... Lack of magic issues...
The golden era of CPU acceleration wasn't in the 2000s at all, more from the 70s to the 90s.
you mean when they went from wardrobe to printer size?
@@Kiyuja yes, when they went from decahertz to kilohertz to megahertz, that was amazing
obviously not, because it hadn't achieved gigahertz yet, which is where the gap is big
@@retropaganda8442There were already computers running at 1MHz before 1970...
1965-2013 was about exponential acceleration.
5:40 I don’t like that group, but I gotta admit they have a really fun name.
thinking 10 seconds ahead is what you do on LSD lol!
unrelated to the topic but i wanted to ask which news sources you consider reliable
If this affects M series chips, are they not all defective and should not all mac users demand a return of excessive funds paid for "secure" computing?
I think Primeagen never took an OS course in college, because there they teach you how virtual memory actually works.
Accuuually!
Low Level Learning pre followed
If I had a dime for every bug in "ALL ARM PROCESSORS" which is actually a f**k-up by Apple, I would have the market capitalisation of ARM.
"A russian doll basement situation to have that level of knowledge" Wait, isn't that just Mr. Zozin, aka Tsoding? 😆
infinity exists !
I think the guys joke at 22:00 was digging at himself, hope it gets cleared up before the video ends :')
Hackers who do it for fun and not to cause problems, and actually try to learn--those are cool. Even if they aren't cool in other ways, they are a cool hacker.
The ones who do it seriously and do it to stop exploits are also cool hackers.
I'm going to disagree with him that the fundamentals are "C and assembly" -assuming the context of "low level"- I would advocate Forth over C any day of the week for the simple reason that Forth defines 'words' [what other languages call subprograms or functions] as _"a sequence of words or else a chunk of machine-code to execute"_ *this* actually allows you to _quickly_ go low-level or abstract out [fairly] high-level, and *very* quickly.
But, in general, "the fundamentals" are not actually anything like a programming-language or set of assembly-instructions; rather, "the fundamentals" are: (1) the ability to define your problem, (2) the ability to break your problem down into small[er] parts, and (3) the ability to read/research.
"Just saying.. it seems weird I'm seeing my own thing." It's ok dude, you get used to it over time. I'm seeing my own thing at least twice a day. 😏
Every CPU is vulnerable ... No matter what.