I was expecting you to explain how Tutankhamen would debug my binary tree.
Multiplying by shifting the bits in the opposite direction is also explored in Programming the Z80 3rd ed. on page 128.
13:31 This slide shows another drawback of the “regressive” way: besides the higher iteration count, every iteration makes one extra invocation of “operation”. Great talk! I’m sure a generic approach that chooses regressive vs. ‘forward’ via some heuristic or caller choice could be devised.
Not trying to throw shade, but what is old is new again: 8-bit coders and game devs have been using techniques like this since the early ’80s (or perhaps earlier), e.g. multiplication on the 6502, which only supports addition and shifting by 1 bit. So much 8-bit coding knowledge seems to have been lost to time.
Interesting extension to generic monoids, though; it was very nice to see this in a more formal light, nicely presented. Thank you.
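For anyone who never did 8-bit work, a minimal sketch of that shift-and-add technique (written in C++ for readability; on a 6502 it would be a short loop of shift/rotate/add-with-carry instructions):

```cpp
#include <cstdint>

// Hedged sketch of the classic 8-bit shift-and-add multiply: only
// addition and 1-bit shifts, which is all a 6502 gives you.
std::uint16_t mul8(std::uint8_t a, std::uint8_t b) {
    std::uint16_t product = 0;
    std::uint16_t addend  = a;
    while (b != 0) {
        if (b & 1)
            product += addend; // low bit of multiplier set: add it in
        addend <<= 1;          // shift the multiplicand left
        b >>= 1;               // shift the multiplier right
    }
    return product;
}
```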
At university we are taught 4 multiplication algorithms:
1. Multiplication from the least significant bits of the multiplier and shifting the multiplicand to the left
2. Multiplication from the least significant bits of the multiplier and shifting the sum of partial products to the right
3. Multiplication from the most significant bits of the multiplier and shifting the multiplicand to the right
4. Multiplication from the most significant bits of the multiplier and shifting the sum of partial products to the left
It would be nice if the remaining 2 could also be used for abstract operations.
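Variants 2 and 3 look harder to generalize, since shifting the multiplicand or the partial sum to the right amounts to halving, which a general monoid doesn't provide. Variant 4 (MSB-first, shifting the partial result left) does generalize directly, though; a hedged sketch with invented names:

```cpp
#include <cstdint>

// Hedged sketch: variant 4 generalized to an arbitrary monoid.
// "Shifting the partial result left" becomes op(acc, acc);
// "adding in the multiplicand" becomes op(acc, base).
template <typename T, typename Op>
T msb_first_power(T base, std::uint64_t n, T identity, Op op) {
    T acc = identity;
    for (int i = 63; i >= 0; --i) { // walk the bits of n, MSB to LSB
        acc = op(acc, acc);         // "shift the partial result left"
        if ((n >> i) & 1)
            acc = op(acc, base);    // "add in the multiplicand"
    }
    return acc; // op = +, identity = 0 multiplies; op = *, identity = 1
                // exponentiates (Horner's scheme on the bits of n)
}
```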
15:06 Naming is hard. I think I would rename the IterationCount type to ‘StepOperand’ and ‘CountHalver’ to ‘StepIncrement’. I would then allow the forSquaring parameter to have a different type (maybe SquareStepOperand, defaulting to StepOperand), so that the Operator can be overloaded for 2 types of operands, similar to the iterator-vs.-sentinel approach used for ranges.
Great talk!
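A purely illustrative sketch of that split (every name here is invented, not taken from the talk's slides):

```cpp
#include <cstdint>
#include <functional>

// When the two types differ, Op must be callable for both
// (StepOperand, SquareStepOperand) and
// (SquareStepOperand, SquareStepOperand), echoing the
// iterator/sentinel split in ranges.
template <typename StepOperand,
          typename SquareStepOperand = StepOperand,
          typename Op>
StepOperand generic_power(StepOperand base, std::uint64_t n, Op op) {
    SquareStepOperand sq{base}; // the squaring chain uses its own type
    StepOperand acc = base;
    --n;                        // assumes n >= 1, for brevity
    while (n > 0) {
        if (n & 1)
            acc = op(acc, sq);  // overload #1: (StepOperand, SquareStepOperand)
        sq = op(sq, sq);        // overload #2: (SquareStepOperand, same)
        n >>= 1;
    }
    return acc;
}
// With both types equal this degenerates to the usual form, e.g.
// generic_power(3.0, 5, std::multiplies<>{}) == 243.0.
```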
34:31 Well, division isn’t associative. There is the ‘long division’ algorithm, which perhaps could be implemented using the suggested approach, but if so it might teach us more about the domain to which this technique applies.
16:15 I think this slide has a typo: it seems to imply that the function is recursive. I assume the name of the function on the second line of code should actually be something like template_regressive_egyptian_exponentiate.
At 34:20, they talk about division and subtraction, but those aren't associative, right? a - (b - c) = (a - b) + c != (a - b) - c
Nice talk!
The "forward" algorithm is usually called "exponentiation by squaring", AFAIK, not sure if that can help you naming your invention.
My favorite toy application for this is the computation of the nth Fibonacci number using matrix exponentiation (actually, that can be generalized to any linear recurrence relation).
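A self-contained sketch of that toy application, assuming nothing beyond the standard library (mul, fib, and Mat2 are invented names):

```cpp
#include <array>
#include <cstdint>

// [[1,1],[1,0]]^n contains F(n+1), F(n), F(n), F(n-1), so exponentiation
// by squaring over 2x2 matrix multiplication (which is associative)
// yields F(n) in O(log n) steps.
using Mat2 = std::array<std::array<std::uint64_t, 2>, 2>;

Mat2 mul(const Mat2& a, const Mat2& b) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

std::uint64_t fib(std::uint64_t n) {
    Mat2 result{{{1, 0}, {0, 1}}}; // identity matrix = monoid identity
    Mat2 base{{{1, 1}, {1, 0}}};   // one step of the recurrence
    while (n > 0) {                // ordinary exponentiation by squaring
        if (n & 1) result = mul(result, base);
        base = mul(base, base);
        n >>= 1;
    }
    return result[0][1];           // F(n), modulo 2^64 for large n
}
```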
20:32 Note the use of ‘horizontalEq(sum, c)’ and not simply ‘sum == c’. There was a lot of discussion in the C++ committee about the semantics of operator== for std::simd - so it’s smart of Jamie to steer away from it.
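For context, this is roughly how such a horizontal comparison is spelled with the Parallelism TS v2 std::experimental::simd; the talk's horizontalEq may well be defined differently:

```cpp
#include <experimental/simd>
namespace stdx = std::experimental;

// operator== on simd values is element-wise and returns a mask, so
// "are all lanes equal" has to be an explicit horizontal reduction.
template <typename T, typename Abi>
bool horizontal_eq(const stdx::simd<T, Abi>& a,
                   const stdx::simd<T, Abi>& b) {
    return stdx::all_of(a == b); // reduce the per-lane mask to one bool
}
```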
Please post these videos at 720p@30fps so I can read the text on the slides without frame drops.
Starting with the most significant bit is Horner's scheme.
Population count only needs log₂(word size) steps, not a linear scan of all n bit positions.
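A sketch of that logarithmic scheme, the classic SWAR reduction (in practice you'd just call std::popcount from C++20's <bit>):

```cpp
#include <cstdint>

// Each line adds adjacent bit-fields pairwise (1-, 2-, 4-, 8-, 16-,
// 32-bit fields), so 64 bits take 6 steps rather than 64.
unsigned popcount64(std::uint64_t x) {
    x = (x & 0x5555555555555555u) + ((x >> 1)  & 0x5555555555555555u);
    x = (x & 0x3333333333333333u) + ((x >> 2)  & 0x3333333333333333u);
    x = (x & 0x0F0F0F0F0F0F0F0Fu) + ((x >> 4)  & 0x0F0F0F0F0F0F0F0Fu);
    x = (x & 0x00FF00FF00FF00FFu) + ((x >> 8)  & 0x00FF00FF00FF00FFu);
    x = (x & 0x0000FFFF0000FFFFu) + ((x >> 16) & 0x0000FFFF0000FFFFu);
    x = (x & 0x00000000FFFFFFFFu) + ((x >> 32) & 0x00000000FFFFFFFFu);
    return static_cast<unsigned>(x);
}
```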
Another example of a monoid not on your platform might be building a *string* of x copies of an original, faster than naively looping x times.
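A minimal sketch of that string version, assuming only std::string (concatenation is the monoid operation, the empty string its identity):

```cpp
#include <string>

// The same halving scheme builds x copies of s in O(log x) doubling
// rounds instead of x separate appends.
std::string repeat(std::string s, unsigned x) {
    std::string result;           // identity element: the empty string
    while (x > 0) {
        if (x & 1) result += s;   // "add in" the current power-of-two copy
        x >>= 1;
        if (x) s += s;            // double s; skip the final useless doubling
    }
    return result;
}
```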
This is so interesting! Great presentation!
Thank you, this is insightful, well done.
@Sarah1Smith-v4r thanks!
Such a cool perspective! Thanks for the presentation
Thanks dude!! Glad you enjoyed!
hey that guy looks familiar
Thank you!
Thank you!
How does this relate to fastpow?
Super interesting:)
Cheers!
Jamie is so cool!
Damn right.
This is amazing 👏
The book you're talking about is very good -- I got it when it came out and also spent some time working through it.
The "regressive" algorithm you describe isn't always an inefficient round-about method -- I first learned about it (and really understood it when I figured it out for myself) in the '80's, where 8-bit computers *did not have* multiplication and division as hardware primitives. More recently, even higher levels of SSE didn't have integer division!
To really speed it up, you have to get rid of the `if` test on the least bit. When called with arbitrary numbers, that bit is essentially random, and performance suffers due to impossible branch prediction. You need to always calculate the update in a separate variable, and then (only) conditionally replace the accumulated value -- the compiler should generate a CMOV for that on X86/X64.
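A hedged sketch of that branchless rewrite for the plain-integer case; whether the compiler actually emits CMOV depends on the target and optimization level:

```cpp
#include <cstdint>

// Always compute the candidate update, then select with a conditional
// expression instead of branching on the least bit; on x86-64,
// compilers typically lower the ternary to CMOV.
std::uint64_t multiply_branchless(std::uint64_t a, std::uint64_t b) {
    std::uint64_t acc = 0;
    while (b > 0) {
        std::uint64_t updated = acc + a; // compute unconditionally
        acc = (b & 1) ? updated : acc;   // select; no branch required
        a <<= 1;                         // double the addend
        b >>= 1;                         // halve the multiplier
    }
    return acc;
}
```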