Square & Multiply Algorithm - Computerphile

  • Published on 13 Apr 2022
  • How do you compute a massive number raised to the power of another huge number, modulo something else? Dr Mike Pound explains the super-quick square & multiply algorithm.
    Numberphile's Witness Numbers video which inspired Mike: • Witness Numbers (and t...
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 309

  • @pleasedontwatchthese9593
    @pleasedontwatchthese9593 2 years ago +473

    People who have worked with assembly programming will be really used to this. In the past, CPUs did not have a multiply instruction, and you often had a table of the fastest ways to multiply by a given number. Which, you guessed it, used shifts (which are like a square) and additions.

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +30

      even modern CPUs have shortcuts for squaring and for multiplying by small numbers, so this algo is still beneficial

    • @klaxoncow
      @klaxoncow 2 years ago +26

      Johnny Ball covered this on Numberphile before - search "Russian multiplication".
      Let's call the two numbers A and B.
      With number A, we shift all the bits to the right once.
      The rightmost bit "falls out" of the number and typically gets shifted into a flag on most CPUs - let's say our CPU shifts the bit out of the right into the carry flag.
      So, the carry flag now contains the rightmost (least significant) bit of number A. If it's a one, then we add number B to our running total. If it's a zero, carry on (so you could code that as a "branch if carry not set" over a "add B to total" instruction).
      Then we shift B left one, which doubles it.
      Then we do it again. Shift A to the right. The rightmost bit falls out into the carry flag. If it's a one, then we add B (which we've just doubled, remember) to the running total.
      Shift B left one, doubling it again.
      Shift A to the right. If it's a one, then add B (now four times bigger than it originally was, as we're shifting it left every round) to the running total.
      You keep doing this until you've shifted every original bit of A out of the right side. Stop. You've multiplied the two numbers together and your answer is in the running total register.
      And all we did was shifting bits left and right, and simple addition. And there's only as many "rounds" of this as there are bits in A.
      Another bit of useful binary maths is that the result cannot be more than twice the number of bits in B. That is, if A and B are 8-bit numbers, the result register only needs to be 16-bits - as it's just not possible for two 8-bit numbers to multiply to more than a 16-bit number.
      Indeed, if you're implementing this algorithm, then stick B in a register with twice as many bits and have your "running total" register be twice as many bits. Then you can run the algorithm blindly.
      Which is, as you may have guessed, what the CPU's actually doing in the circuitry with a hardware multiply.
      It's just multiplying by 2 - which, in binary, is nothing more than shifting all the bits to the left once - and addition.
      So, yeah, it's basically the same algorithm as this video, but working in a higher order. So multiplication -> exponential. And multiplying by 2 -> squaring. And addition -> multiplication.
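A rough Python sketch of the shift-and-add routine described above. It stands in for the assembly version: the carry-flag test becomes a check of the lowest bit before shifting, and the function name is made up for illustration.

```python
def shift_and_add_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers using only shifts and additions."""
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative integers")
    total = 0
    while a:
        if a & 1:      # the bit that would fall out into the carry flag
            total += b
        a >>= 1        # shift A right by one
        b <<= 1        # shift B left by one, doubling it
    return total


print(shift_and_add_multiply(13, 11))
```

The loop runs once per bit of A, which matches the "as many rounds as there are bits in A" remark.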

    • @RegrinderAlert
      @RegrinderAlert 2 years ago +3

      @@NoNameAtAll2 Are those tables actually part of the CPU (making use of microcode) or done by a compiler?

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +1

      @@RegrinderAlert multipliers are simple enough to be just logic gates
      what I was talking about is "check if top 48 bits are 0, so we don't need to wait/use most of the circuit"
      that is too, simple enough to be just logic
      about compilers... idk how common the explicit "square" command is in processors

    • @FrankHarwald
      @FrankHarwald 2 years ago +9

      yes, except that's a shift & add algorithm, & shifts aren't like squaring but like multiplying by a power of 2.

  • @GeorgeBratley
    @GeorgeBratley 2 years ago +384

    I think the last bit of the video is fascinating - that you could perform an attack to work out a key based on the CPU time to calculate a square vs a square & multiply. A great example of the theoretical mathematics being ideal vs. the real world implementation being fundamentally vulnerable.

    • @nebuleon
      @nebuleon 2 years ago +54

      Yes! And the technical term for it is a "timing attack".
      Timing attacks can be so insidious that you need to resort to assembly language just to get everything out of the way.
      Dr Pound's example has us doing an "always multiply", multiplying by one (the multiplicative identity) after squaring for an unset bit in the exponent, so that we execute the algorithm in constant time. However, using a statement like [if (bit is zero) { multiplicand = one; } else { multiplicand = base; }] to do this "always multiplication" can run into the *branch predictor.* If there are lots of zeroes in your key, it's going to take the "if" path more of the time; conversely, if there are lots of ones in your key, it's going to take the "else" path more of the time. Either way, the branch predictor will execute it faster than if the key were evenly-distributed ones and zeroes.
      To execute *that* in constant time, the CPU has to have a branchless test instruction to set the new multiplicand. Either you validate that the compiler uses the branchless instruction in your C code (say), or you write at least that part of the algorithm in assembly language.
      Edit: Or use the Montgomery form of the numbers, per David Gillies's comment, which makes it easier to have constant-time algorithms
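A sketch of the "always multiply" idea with a branch-free choice of multiplicand, written in Python purely to show the data flow. Python integers give no timing guarantees, so this is not a constant-time implementation; the function name and the mask trick are illustrative assumptions, not taken from the video.

```python
def modexp_always_multiply(base: int, exponent: int, modulus: int) -> int:
    """Left-to-right square and multiply that multiplies on every bit."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % modulus
    base %= modulus
    for i in reversed(range(exponent.bit_length())):
        bit = (exponent >> i) & 1
        result = (result * result) % modulus
        # Select base (bit == 1) or 1 (bit == 0) without an if/else:
        # mask is 0 for bit 0 and -1 (all ones) for bit 1.
        mask = -bit
        multiplicand = (base & mask) | (1 & ~mask)
        result = (result * multiplicand) % modulus
    return result


assert modexp_always_multiply(3, 45, 7) == pow(3, 45, 7)
```

In C or assembly the same selection would typically be checked against the compiler output, as the comment above suggests.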

    • @2Cerealbox
      @2Cerealbox 2 years ago +25

      In data centers that have servers that the government uses, the government requires that their servers are plugged into an air-gapped power supply, unconnected to the power that every other server uses, so that a spy couldn't surreptitiously measure changes in their power usage. These are surprisingly effective attacks.

    • @mbican
      @mbican 2 years ago +18

      That's why cryptographic implementations need to run in constant time, with no optimization allowed for multiplication by zero.

    • @domogdeilig
      @domogdeilig 2 years ago +1

      @@nebuleon Multiply by base^binary. Thus if there is a 0, it will be multiplied by 1, and for 1 it's the ordinary multiply. As both x^n where n is 0 or 1 is easily calculated that should be same time (?).

    • @justinreusnow
      @justinreusnow 2 years ago +2

      Why not just simply add a random sleep at the end of the algorithm? The size of the sleep would be a question, but if it’s a random amount each time that is sufficiently large to mask any work being done (or lack thereof), it would remove this timing attack issue. It’s certainly wasteful to simply sleep, but it’s also wasteful to do unnecessary calculations to remain in constant time, no?

  • @thuokagiri5550
    @thuokagiri5550 2 years ago +206

    Dr Pound's breadth and depth of knowledge in computer science never ceases to amaze me!!
    "Man from the future"

    • @Ins4n1ty_
      @Ins4n1ty_ 2 years ago +7

      Absolutely, but this specific piece of knowledge is pretty much common knowledge for anyone in CS. I studied this in college about 7 years ago, it was NOT a fun time since we had a pretty bad professor...

    • @quincy2142
      @quincy2142 2 years ago +3

      Not necessarily the breadth, but connecting the theoretical with the practical. Bit on timing attacks was pretty nice.

    • @thuokagiri5550
      @thuokagiri5550 2 years ago +2

      @@quincy2142 he has a very impeccable pedagogy

    • @jaydeep-p
      @jaydeep-p 1 year ago

      Even if he doesn't have the knowledge I still like his teaching style, it's engaging and fun.

  • @davidgillies620
    @davidgillies620 2 years ago +71

    Note that for RSA and similar, the modular multiplication operation itself can be quite expensive, so modern implementations typically convert the numbers involved to an intermediate representation, called a Montgomery form, after Peter Montgomery. The binary exponentiation method can use Montgomery forms throughout, so only at the end is the result converted back to a conventional representation. Montgomery multiplication is also resistant to the side channel attacks mentioned at the end of the video.

    • @andrewharrison8436
      @andrewharrison8436 2 years ago +14

      Now I have to look up Montgomery forms - or wait for the Computerphile video. I do like internet rabbit holes.

    • @locusf2
      @locusf2 2 years ago +5

      @@andrewharrison8436 also look up Montgomery Ladder which is the similar algorithm for elliptic curves

    • @Czeckie
      @Czeckie 1 year ago

      fascinating, I had no idea this exists

  • @SRISWA007
    @SRISWA007 2 years ago +21

    This is also known as "Fast Binary Exponentiation", which calculates pow(a, b, mod) in logarithmic time.
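For reference, a minimal sketch of the left-to-right version in Python, checked against the built-in three-argument pow. The function name is made up here; the loop does one squaring per exponent bit plus one multiplication per set bit, which is where the logarithmic cost comes from.

```python
def square_and_multiply(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent % modulus, reading the exponent's bits left to right."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % modulus
    for bit in bin(exponent)[2:]:          # binary digits, most significant first
        result = (result * result) % modulus   # always square
        if bit == "1":
            result = (result * base) % modulus  # multiply only on a 1 bit
    return result


assert square_and_multiply(3, 45, 7) == pow(3, 45, 7)
```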

  • @todayonthebench
    @todayonthebench 2 years ago +63

    Interesting algorithm.
    At first I thought it was just going to be a simple, "first we build our list of binary equivalents and then just multiply them all together in the end."
    As an example, calculate 3^1, 3^2, 3^4, 3^8, 3^16, etc. And then choose the values our exponent actually contains.
    Then the sleight of hand of mathematicians came in at 9:40 and made things far, far simpler and much easier to execute in practice.

    • @pikasnoop6552
      @pikasnoop6552 2 years ago +13

      You might have noticed that Mike said that this was the left to right method. Yours (with taking the modulus) is the right to left variant and is just as quick.
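A small sketch of that right-to-left variant in Python, assuming a non-negative exponent. The running variable `power` holds base^1, base^2, base^4, ... in turn, and only the ones matching set bits get multiplied into the result. Names are illustrative.

```python
def modexp_right_to_left(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent % modulus, consuming exponent bits from the low end."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % modulus
    power = base % modulus            # base^(2^k) mod modulus for the current bit k
    while exponent > 0:
        if exponent & 1:              # this power of two is part of the exponent
            result = (result * power) % modulus
        power = (power * power) % modulus
        exponent >>= 1
    return result


assert modexp_right_to_left(3, 45, 7) == pow(3, 45, 7)
```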

  • @Alex_Deam
    @Alex_Deam 2 years ago +150

    9:34 It's actually not the minimum number of operations. For example, to make 31 by this method takes 8 operations (SMSMSMSM), whereas the minimum is only 7 operations (N^2, N*(N^2), (N^3)^2, (N^6)^2, (N^12)^2, (N^6)*(N^24), N*(N^30)). However, in general finding the minimum number for a given exponent is NP-complete, so in practice square and multiply is presumably what you'd do. Otherwise, great video!
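The two counts for 31 can be checked with a short Python sketch. The helper names are made up; the chain is the one listed above, written as exponents (1, 2, 3, 6, 12, 24, 30, 31).

```python
def square_and_multiply_ops(exponent: int) -> int:
    """Operations used by left-to-right square and multiply, starting from N."""
    if exponent < 1:
        raise ValueError("exponent must be at least 1")
    squarings = exponent.bit_length() - 1           # one per bit after the first
    multiplications = bin(exponent).count("1") - 1  # one per set bit after the first
    return squarings + multiplications


def is_addition_chain(chain: list) -> bool:
    """True if every element after the first is a sum of two earlier elements."""
    if not chain or chain[0] != 1:
        return False
    for i in range(1, len(chain)):
        earlier = chain[:i]
        if not any((chain[i] - x) in earlier for x in earlier):
            return False
    return True


chain_31 = [1, 2, 3, 6, 12, 24, 30, 31]
print(square_and_multiply_ops(31), len(chain_31) - 1)
```

Each step of the chain is one multiplication, so the chain length minus one is the operation count to compare against.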

    • @pikasnoop6552
      @pikasnoop6552 2 years ago +40

      The NP-completeness is a common misconception: this is only proven for sets of numbers, not single numbers. In practice I believe a window method is used, for which one precomputes some values so one can "combine" some multiplications.

    • @Alex_Deam
      @Alex_Deam 2 years ago +15

      @@pikasnoop6552 Thanks for the correction

    • @Wecoc1
      @Wecoc1 2 years ago +16

      Efficient exponentiation is a very interesting topic. The minimum number of multiplications required for N is an open problem in mathematics, you can read more about that on OEIS A003313, "Length of shortest addition chain for n".

    • @Skyb0rg
      @Skyb0rg 2 years ago +12

      That example seems to use more space (you need to remember N^6 until after you finish (N^12)^2). May be the minimum operations in fixed space, where the space is exactly the size of the input string.
      Also important for cryptographic libraries which shouldn’t be allocating memory dynamically.

    • @realKlabauterklaus
      @realKlabauterklaus 2 years ago +3

      If you introduce division as an additional operation, the example of 31 can be reduced to 6 operations: SSSSSD

  • @LeDabe
    @LeDabe 2 years ago +30

    Also called russian peasant multiplication. It works for any power operation tbh, not only scalar multiplication. The power operator on matrix can for instance be used to compute large fibonacci numbers very quickly using the matrix 2x2 [1, 1, 1, 0]

  • @ezg5221
    @ezg5221 2 years ago +22

    I read binary numbers left to right by starting at 1, doubling for each bit, and adding 1 if the bit was a 1. Very cool to see this pattern coming up in exponents

  • @longlostwraith5106
    @longlostwraith5106 2 years ago +10

    I always liked calculating that recursively. For example, 2^6 is (2^3)*(2^3), 2^3 is (2^2)*(2^1) and 2^2 is (2^1)*(2^1).
    It's extremely simple to code it too. Here's the algorithm that performs the calculation:
    1) If exponent is zero, return 1
    2) Divide exponent by two, and save both the quotient and the remainder
    3) Call algorithm recursively with (exponent = quotient) and save the result
    4) If remainder is zero, return result*result
    5) If remainder is one, return result*result*base
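The five steps above could be sketched in Python like this. The optional `counter` argument is an addition for illustration only; it records how many calls are made, assuming the recursive result is saved and reused as in step 3.

```python
def power_recursive(base: int, exponent: int, counter=None) -> int:
    """Recursive exponentiation by squaring; counter[0] tallies the calls made."""
    if counter is not None:
        counter[0] += 1
    if exponent == 0:                               # step 1
        return 1
    quotient, remainder = divmod(exponent, 2)       # step 2
    half = power_recursive(base, quotient, counter) # step 3: one call, result saved
    if remainder == 0:                              # step 4
        return half * half
    return half * half * base                       # step 5


calls = [0]
assert power_recursive(2, 1000, calls) == 2 ** 1000
print(calls[0])
```

Because the saved result is reused rather than recomputed, there is one call per halving of the exponent.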

    • @canaDavid1
      @canaDavid1 1 year ago +1

      Unless you cache the results, this is no faster than multiplying one-by-one (probably slower because of recursion overhead)

    • @longlostwraith5106
      @longlostwraith5106 1 year ago

      @@canaDavid1 I don't think you appreciate the difference between O(N) and O(logN).

    • @schwingedeshaehers
      @schwingedeshaehers 1 year ago

      @@longlostwraith5106 you have O(2^log(N)) so O(N)

    • @longlostwraith5106
      @longlostwraith5106 1 year ago

      @@schwingedeshaehers How, exactly? Are you taking the division into account?

    • @schwingedeshaehers
      @schwingedeshaehers 1 year ago

      @@longlostwraith5106 you have log n layers, but these layer get more and more calculations each level. And they get an exponential growth until the log n barrier from the amount of layers

  • @hazemessawi2954
    @hazemessawi2954 2 years ago +8

    I love how entertaining the video is given that I already know what the answer is and have used this quite a lot

  • @MrGooglevideoviewer
    @MrGooglevideoviewer 2 years ago

    I love the step-by-step simplistic explanations you give and the focus on the core concepts. Thanks Mike! bloody champion! Peace and Love from Perth Australia!✌❤✌

  • @japedr
    @japedr 2 years ago +10

    This is also called "exponentiation by squaring" and it's super useful in many cases.
    One quick example is in computing the nth Fibonacci number using the 2x2 matrix formula, where one raises a matrix to the nth power. But using this method, the number of multiplications is greatly reduced. There is also a closed form expression using a power of the golden ratio but that requires a lot of numerical precision for large n.
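A sketch of the Fibonacci example in Python, using the standard identity that the n-th power of [[1, 1], [1, 0]] has F(n) in its off-diagonal entries. The helper names are illustrative, and plain tuples stand in for a matrix library.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def mat_pow(m, n: int):
    """Raise a 2x2 matrix to the n-th power by repeated squaring."""
    result = ((1, 0), (0, 1))   # identity matrix
    while n > 0:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        n >>= 1
    return result


def fibonacci(n: int) -> int:
    """Return F(n) with F(0) = 0 and F(1) = 1."""
    return mat_pow(((1, 1), (1, 0)), n)[0][1]


print(fibonacci(10))
```

Python's arbitrary-precision integers sidestep the numerical precision issue mentioned for the golden ratio formula.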

  • @jkye_314
    @jkye_314 2 years ago +20

    I am currently purchasing a master's degree in cybersecurity and this guy summarizes 2h of lectures in literally 17min ;)

    • @QuantumHistorian
      @QuantumHistorian 2 years ago +3

      Why are you purchasing a degree? Maybe do it somewhere with better teaching then?

    • @ait-gacemnabil9181
      @ait-gacemnabil9181 2 years ago +8

      @@QuantumHistorian he probably meant pursuing

    • @johningham1880
      @johningham1880 2 years ago

      I’m afraid that is the model for university education these days

    • @jkye_314
      @jkye_314 2 years ago

      @@ait-gacemnabil9181 Yeah, you're right. But in some sense it actually is a business activity for universities.

    • @saiprasad8078
      @saiprasad8078 2 years ago

      In a way, he is right. Nowadays everything needs to be purchased -- even knowledge.

  • @levyroth
    @levyroth 2 years ago

    This is the coolest maths/CS video I've seen in a long time. Wow!

  • @tsjbb
    @tsjbb 1 year ago +3

    This was fascinating, so simple and intuitive once explained but so powerful

  • @LuciolaSama
    @LuciolaSama 2 years ago +1

    Dude, you’re such a fun guy to listen to. Keep it up, cheers!

  • @Muzer0
    @Muzer0 2 years ago +1

    Always wondered how the key reading timing/power attacks worked, that makes a lot of sense, cheers!

  • @conradludgate
    @conradludgate 2 years ago +17

    I did the 3^45 mod 7 in my head fairly simply. 3 and 7 are coprime, so you know that 3 will cycle through all 7 numbers. Then we can do 3^42 * 3^3 mod 7, which is just 1*3^3 or 27 mod 7 which is 6. Still a very useful algorithm though

    • @thenewnew1997
      @thenewnew1997 2 years ago +8

      Well, the approach you used is efficient for a human, but a computer doesn't see the pattern instinctively the way we do, and it only covers one particular case. Square and multiply applies to every case, with a complexity of O(2*floor(log2(n)) + 1) (worst case, hence the big O), where n is the exponent, so it is very efficient in terms of complexity. Anyway, the method you used is very useful too, just for humans rather than PCs.

    • @SimonBuchanNz
      @SimonBuchanNz 2 years ago +2

      In encryption, properties like this are what make the selection of values so important. In this case, the modulus is generally thousands of bits, while the public exponent is either 3 or 65537 (and the base is the message, which must be less than the modulus)

  • @sean_vikoren
    @sean_vikoren 2 years ago +1

    1) You rock, thank you for making world better.
    2) Focus fail hurts eyes.

  • @johnchessant3012
    @johnchessant3012 1 year ago +1

    binary to decimal: go left to right, start from 0 and double if it's a 0 and double and add one if it's a 1. e.g. for 101010 you do 0 -> 1 -> 2 -> 5 -> 10 -> 21 -> 42. so 101010 = 42.
    decimal to binary: halve your number rounding down until you reach 1, e.g. 42 -> 21 -> 10 -> 5 -> 2 -> 1. now go backwards through this sequence and put a 1 if it's odd and put a 0 if it's even. so 42 = 101010.

  • @PopeLando
    @PopeLando 2 years ago

    Fantastic! I watched the same Numberphile video and did the high power mod p calculation on my calculator. And during the process I realised that to get the right number of squares, you turn the power into its binary number and then square the same number of times as the power of two. (And mod every time the answer is bigger than 747). I even checked it by finding the nearest actual primes, which are 743 and 751. Perfect 1s for both!

  • @onlyeyeno
    @onlyeyeno 2 years ago +2

    I LOVE this type of content !!!! Thanks a million for making and sharing :)

  • @meispi9457
    @meispi9457 2 years ago +7

    I remember using this algorithm for a competitive programming question on one of the codechef's monthly contests, didn't know it had a name.

  • @NotAnAviator
    @NotAnAviator 2 years ago +1

    This video was a lovely reminder of my time spent with number theorists in college, cryptography is so damn fascinating

  • @luminous2585
    @luminous2585 2 years ago

    Thank you for this video. One of the most interesting things I've ever done in school, and I'd almost forgotten about it.

  • @eggsquishit
    @eggsquishit 2 years ago +16

    You can do multiplication this way, too (by doubling & adding). Very useful on CPUs that can only do addition.

    • @trejkaz
      @trejkaz 2 years ago +2

      This is also how I've seen multiplication done on mechanical calculators.

    • @PvblivsAelivs
      @PvblivsAelivs 2 years ago

      You can. But it's faster to subtract squares. You have to build the table first. But you don't need multiplies to do it.

    • @canaDavid1
      @canaDavid1 1 year ago +1

      @@PvblivsAelivs this depends on the speed of memory accesses, and how much memory space is available. But yes, table lookups are usually faster.

  • @chieeyeoh6204
    @chieeyeoh6204 1 year ago

    This is just mind-blowing! Awesome video!

  • @user-vn9ld2ce1s
    @user-vn9ld2ce1s 2 years ago +6

    You could explain this much more easily and without binary like this:
    You take the exponents and apply two rules until you get to 1:
    - if it's odd, subtract one
    - if it's even, divide by two
    Then you just do the squares/multiplies in reverse order of these operations.
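Those two rules can be turned into a short Python sketch. The function names and the "S"/"M" labels (square, multiply) are illustrative; the exponent is assumed to be at least 1.

```python
def plan_operations(exponent: int) -> str:
    """Apply the odd/even rules down to 1, then return the reversed S/M sequence."""
    if exponent < 1:
        raise ValueError("exponent must be at least 1")
    ops = []
    while exponent > 1:
        if exponent % 2 == 1:
            exponent -= 1       # odd: subtract one, undone later by a multiply
            ops.append("M")
        else:
            exponent //= 2      # even: halve, undone later by a square
            ops.append("S")
    ops.reverse()
    return "".join(ops)


def power_from_plan(base: int, exponent: int, modulus: int) -> int:
    """Start from the base and replay the planned squares and multiplies."""
    result = base % modulus
    for op in plan_operations(exponent):
        if op == "S":
            result = (result * result) % modulus
        else:
            result = (result * base) % modulus
    return result


print(plan_operations(45))
assert power_from_plan(3, 45, 7) == pow(3, 45, 7)
```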

    • @Loldemord
      @Loldemord 2 years ago

      This is basically how you create the Binary Number out of the 10-base ^^ So its the same

    • @deanjohnson8233
      @deanjohnson8233 2 years ago +1

      That might explain it, but that is not how it would be programmed efficiently

    • @user-vn9ld2ce1s
      @user-vn9ld2ce1s 2 years ago

      @@Loldemord True

    • @user-vn9ld2ce1s
      @user-vn9ld2ce1s 2 years ago

      @@deanjohnson8233 That's probably true, if we're talking about something like assembly (those bit shifts are single opcodes, aren't they?), but if I were doing this in Python, it probably wouldn't matter...

    • @deanjohnson8233
      @deanjohnson8233 2 years ago

      @@user-vn9ld2ce1s you would implement it like this in assembly, c, c++, go, rust, c#, Java and many more. Bit shifting is not a rare and unusual thing.
      In python it would be strange because python does not have fixed numeric sizes. Using bitwise operations on something like that can easily lead to the wrong result if you don’t carefully study what Python does in various cases.
      Also, this video was about how to efficiently do this math. If you are concerned with the efficiency of math operations, you probably aren’t going to be using python.

  • @touficjammoul4482
    @touficjammoul4482 1 year ago

    You Sir saved my life before the exam, I can't thank you enough.

  • @diagorasofmel0s
    @diagorasofmel0s 2 years ago

    what a coincidence, i started studying RSA and y'all put out this banger, thanks Mike and Sean

  • @matthewisrail
    @matthewisrail 2 years ago

    You guys and numberphile my 2 favorite channels

  • @b2bb
    @b2bb 2 years ago

    I know it was touched on toward the end of the video, but I think a part II to this video, where Dr. Pound could go into a specific application example where this is used, would be great. Can always use more videos with him!

  • @4akat
    @4akat 2 years ago

    i love the channel. the slowness of the math demonstrations made me itchy!

  • @Richardincancale
    @Richardincancale 2 years ago

    The last minute was spot on - avoiding side attacks!

  • @robertbrummayer4908
    @robertbrummayer4908 2 years ago

    Interesting algorithm and great video as usual

  • @gustavofring4788
    @gustavofring4788 2 years ago +2

    Truly interesting lesson, just studied this at school!

  • @timsmith2525
    @timsmith2525 11 months ago

    I love the idea of solving a complicated problem by solving a lot of simpler problems. Genius!

  • @applePrincess
    @applePrincess 2 years ago +3

    I love this (semi-)collaboration. You are the Computerphile version of Matt Parker in every way.

    • @benwisey
      @benwisey 2 years ago +2

      Matt Parker and Mike Pound. MP=MP.

  • @Joe_Payne
    @Joe_Payne 2 years ago

    I literally submitted my RSA cryptography coursework two weeks ago. This is all fresh in my mind. I'd love to see this go to the next step.

  • @franziscoschmidt
    @franziscoschmidt 2 years ago +5

    Saw an implementation of this in a programming tutorial video but they just rushed over the details. Computerphile does a wonderful job at filling this gap (as always I might add!)

  • @wolfoftheair
    @wolfoftheair 1 year ago +1

    So, it turns out Square and Multiply on its own is not the greatest scheme for cryptography, because it lends itself to side channel attacks (timing and power usage).
    The way this is addressed is through a Montgomery Ladder, where every square operation is performed, and every multiply is performed, but the bit that determines whether it's a simple square or a multiplication actually determines where the output is placed. If it's intended to be used, it goes in the "correct" output location and mixed usefully in with the result. If it's not, it goes into an incorrect output location, and mixed in with all the other side-effect garbage from the function. This results in the power draw and time being constant, which defeats those side-channel attacks.
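One common way to write a Montgomery ladder for modular exponentiation is sketched below in Python. It keeps two running values and does exactly one multiplication and one squaring per exponent bit, with the bit only deciding which variable receives which product. This sketch still uses an if/else and Python integers, so it illustrates the structure only and makes no constant-time claim; the names are made up.

```python
def montgomery_ladder_pow(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent % modulus with one multiply and one square per bit."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    r0 = 1 % modulus        # invariant: r1 == r0 * base (mod modulus)
    r1 = base % modulus
    for i in reversed(range(exponent.bit_length())):
        if (exponent >> i) & 1:
            r0 = (r0 * r1) % modulus   # multiply lands in r0
            r1 = (r1 * r1) % modulus   # square lands in r1
        else:
            r1 = (r0 * r1) % modulus   # multiply lands in r1
            r0 = (r0 * r0) % modulus   # square lands in r0
    return r0


assert montgomery_ladder_pow(3, 45, 7) == pow(3, 45, 7)
```

Hardened implementations usually replace the branch with a constant-time conditional swap of the two values.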

  • @JivanPal
    @JivanPal 2 years ago

    Excellent video! Alternative summarised explanation: exponentiation in a sense "converts" addition to multiplication (see 3Blue1Brown's excellent intro to group theory and e^(iπ) = -1 for an exploration of this). The algorithm for converting a bitstring to a number (or equivalently, the binary representation of a number to its decimal representation) is to start with zero and read the number from left to right, doubling when you see a new digit, and then adding the value of that digit (i.e. add nothing if it's "0", or add 1 / increment if it's "1"). For example, the binary number 101110101 is equal to decimal 373, as follows, reading the digits of the binary representation from left to right:
    • Start with 0.
    • Read a digit, "1": double, then increment, giving 1.
    • Read "0": double, giving 2.
    • Read "1": double, then increment, giving 5.
    • Read "1": double, then increment, giving 11.
    • Read "1": double, then increment, giving 23.
    • Read "0": double, giving 46.
    • Read "1": double, then increment, giving 93.
    • Read "0": double, giving 186.
    • Read "1": double, then increment, giving 373.
    The square and multiply algorithm just starts off with 1 rather than 0 (equivalently, with the base itself, i.e. 23 as in the video, once the leading "1" has been read), and replaces each doubling operation with a squaring, and each increment operation with a multiplication by the base. That is, exponentiation with base 23 has converted addition of 1 into multiplication by the base, 23. Likewise, doubling a number (which is the same as adding a number to itself) has been converted into squaring a number (which is the same as multiplying a number by itself).
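The parallel can be made concrete with two near-identical Python loops, one reading a bitstring as a number and one raising a base to that number. The function names are illustrative; the second loop starts from 1, the multiplicative counterpart of the 0 used in the first.

```python
def read_binary(bits: str) -> int:
    """Read a bitstring left to right: double, then add 1 on a '1' digit."""
    value = 0
    for digit in bits:
        value = value * 2
        if digit == "1":
            value = value + 1
    return value


def power_by_bits(base: int, bits: str) -> int:
    """Same loop shape: square instead of double, multiply by base instead of add 1."""
    value = 1
    for digit in bits:
        value = value * value
        if digit == "1":
            value = value * base
    return value


assert read_binary("101110101") == 373
assert power_by_bits(23, "101110101") == 23 ** 373
```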

  • @tdchayes
    @tdchayes 2 years ago +7

    It's true that using this algorithm on the private key exponent is more expensive than the specially chosen public exponent. (2048 bit exponent -> 4096 multiplies). However since for the RSA algorithm, the private key holder knows the factors used for the key, an algorithm based on the Chinese Remainder Theorem can reduce the cost of the private key operations.
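A toy Python sketch of the CRT shortcut, using the small textbook-sized key p = 61, q = 53, e = 17, d = 2753 chosen here purely for illustration (real keys are thousands of bits). It assumes Python 3.8 or later for the modular inverse via pow(q, -1, p).

```python
p, q = 61, 53            # toy primes, assumed for illustration only
n = p * q                # 3233
e = 17
d = 2753                 # e * d == 1 (mod (p - 1) * (q - 1))


def decrypt_plain(c: int) -> int:
    """One exponentiation with the full private exponent modulo n."""
    return pow(c, d, n)


def decrypt_crt(c: int) -> int:
    """Two smaller exponentiations modulo p and q, recombined with the CRT."""
    dp = d % (p - 1)
    dq = d % (q - 1)
    q_inv = pow(q, -1, p)          # inverse of q modulo p (Python 3.8+)
    m1 = pow(c, dp, p)
    m2 = pow(c, dq, q)
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q


message = 65
ciphertext = pow(message, e, n)
assert decrypt_plain(ciphertext) == decrypt_crt(ciphertext) == message
```

The saving comes from working with exponents and moduli of roughly half the bit length.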

    • @666Tomato666
      @666Tomato666 2 years ago

      yes, but CRT reduces the cost by a factor of about 3, so the private key operations are still slower than the public key operations, which only need to raise to the power of a 17-bit number

  • @anonymousvevo8697
    @anonymousvevo8697 3 months ago

    Amazing each time I watch your videos

  • @samharkness8861
    @samharkness8861 2 years ago

    Great video, thanks! He belongs in Numberphile videos too

  • @richardyao9012
    @richardyao9012 1 year ago

    I always did square and multiply from the least significant bit first. In C, this is:
    double pow(double x, int exp) {
        /* Negate as unsigned so that exp == INT_MIN does not overflow. */
        unsigned int e = (exp >= 0) ? (unsigned int)exp : -(unsigned int)exp;
        double result = 1.0;
        while (e) {
            if (e & 1) {
                result *= x;    /* this bit is set: fold in the current power of x */
            }
            x *= x;             /* x now holds the next squared power */
            e >>= 1;
        }
        if (exp < 0)
            return (1.0 / result);
        return (result);
    }
    When I do it on paper by hand, I just calculate all squares first. Then I multiply every result corresponding to a 1 bit, starting from the least significant bit. Of course, the order in which I multiply does not matter, but it is how I always did it.

  • @demonblood8841
    @demonblood8841 2 years ago +8

    This guy should have his own channel lol great stuff tho love it

  • @thatcreole9913
    @thatcreole9913 2 years ago

    This was fantastic!

  • @johnsenchak1428
    @johnsenchak1428 2 years ago

    MIND BLOWING !

  • @piiumlkj6497
    @piiumlkj6497 2 years ago +1

    This man is a legend

  • @timholloway7413
    @timholloway7413 1 year ago +1

    The one involving modulo 7 can be done relatively easily- as it’s prime we know 3^6 is congruent to 1 mod 7 ( Fermat’s little theorem ), then do 45 mod 6 and hence get to (3^6)^7*(3^3) mod 7 which is (1)^7*(3^3) mod 7 which is of course 6 mod 7.
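The same shortcut as a few lines of Python, assuming the modulus is prime and does not divide the base (the conditions for Fermat's little theorem). The function name is made up.

```python
def modexp_fermat(base: int, exponent: int, prime: int) -> int:
    """Reduce the exponent modulo prime - 1 before exponentiating."""
    if base % prime == 0:
        raise ValueError("base must not be a multiple of the prime")
    reduced = exponent % (prime - 1)   # 45 % 6 == 3 in the example above
    return pow(base, reduced, prime)


assert modexp_fermat(3, 45, 7) == pow(3, 45, 7) == 6
```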

  • @sembutininverse
    @sembutininverse 2 years ago

    thank you guys🙏🏻🙏🏻🙏🏻🙏🏻🙏🏻, it was really insightful.
    ♥️

  • @estapeluo
    @estapeluo 2 years ago

    Waiting for those follow-up videos!

  • @gloverelaxis
    @gloverelaxis 2 years ago

    god Dr Pound is so good at explaining things

  • @lightyagmi4925
    @lightyagmi4925 2 years ago +1

    We call it binary exponent algorithm
    For example 3^10 =?
    we write its power in binary 10 = 1010
    The bits which are set will be included in the final ans as we can calculate all exponent which are powers of two very quickly.
    3^10= 3^8 * 3^2

  • @dougfoo
    @dougfoo 2 years ago

    cool trick, didn't realize that relation
    i love this series

  • @abdallahegniia1672
    @abdallahegniia1672 6 months ago

    A comment that i've liked
    "This man forgot things about computers more than what i will ever learn"

  • @irwainnornossa4605
    @irwainnornossa4605 2 years ago

    Amazing video, I almost want to incorporate it into my program.

  • @nodroGnotlrahC
    @nodroGnotlrahC 2 years ago +4

    Basically Russian Multiplication (covered by Johnny Ball on Numberphile), but square and multiply instead of double and add. Surprised that wasn't mentioned.

    • @JivanPal
      @JivanPal 2 years ago

      Indeed! That is the basis of a common efficient algorithm for converting string representations of integers expressed in base _n_ into actual integer datatypes, too, e.g. for decimal in C:
      const char* input_string = "285657";
      int result = 0;
      for (const char* c = input_string; *c != '\0'; c++) {
          result *= 10;          /* shift the digits read so far one place left */
          result += *c - '0';    /* then add the new digit */
      }
      Or for capitalised hexadecimal (isdigit needs <ctype.h>):
      const char* input_string = "45BD9";
      int result = 0;
      for (const char* c = input_string; *c != '\0'; c++) {
          result *= 16;
          result += isdigit((unsigned char)*c) ? *c - '0' : 10 + *c - 'A';
      }

  • @michaelhunte743
    @michaelhunte743 1 year ago

    Nice use of symmetry and multiplication.

  • @KaneYork
    @KaneYork 2 years ago

    Are you going to make a followup talking about addition chains?

  • @ricardoabh3242
    @ricardoabh3242 2 years ago

    Crazy impressive

  • @demon_hunter7905
    @demon_hunter7905 7 months ago

    fooking genius mate

  • @wktodd
    @wktodd 2 years ago +1

    Mike Pound - Always good value 8-)

  • @cameronsteel6147
    @cameronsteel6147 2 years ago +3

    Such a cool method! The very fact that you can calculate 3^45 mod 7 on paper in a few minutes is awesome considering 3^45 has 22 digits!

    • @trejkaz
      @trejkaz 2 years ago +1

      You can do it faster. For instance, observe that 3^6 mod 7 is 1. So 3^45 mod 7 is going to be the same as 3^3 mod 7, which is 6.

    • @JivanPal
      @JivanPal 2 years ago

      @@trejkaz That depends on you knowing the totient of the modulus / order of the multiplicative group, which is hard if you don't know the prime factorisation of the modulus.

    • @trejkaz
      @trejkaz 2 years ago

      @@JivanPal I don't know much about groups at all and didn't really use any group theory to do that solution, just normal modular arithmetic. Although, in the video he did say that the modulus is usually prime for these examples, so I don't think I'd have too much trouble determining the factors either.

    • @JivanPal
      @JivanPal 2 years ago +1

      @@trejkaz It depends. The hardness of that factorisation problem is what gives RSA its security. The totient function, denoted φ(x), counts how many numbers less than x are coprime to x. It is such that φ(p) = p-1 where _p_ is any prime, and φ(ab) = φ(a)·φ(b) where _a_ and _b_ are any two coprime integers. The key owner's secret (that is, the decryptor's/signer's) is a pair of large primes, _p_ and _q,_ that serve as the private key, and the public knowledge that serves as the public key is their product, _n_ = _pq._
      Thus, the decryptor/signer is always dealing with _p_ and _q,_ whereas the encryptor/verifier (or an attacker) is always dealing with _n,_ whose prime factorisation he doesn't know. Without that knowledge, computing φ(n) is hard; with that knowledge, it is trivial: φ(n) = (p-1)(q-1). If an attacker could figure out the prime factorisation, the encryption scheme is broken, precisely because he then knows φ(n) and can thus quickly compute these modular exponentials we're interested in: g^x mod n = g^[x mod φ(n)] mod n (for g coprime to n).
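A toy check of that last identity in Python, with small primes p = 61 and q = 53 assumed purely for illustration. The identity is only applied when g is coprime to n, which the sketch verifies with math.gcd.

```python
from math import gcd

p, q = 61, 53                 # toy primes, assumed for illustration only
n = p * q
phi = (p - 1) * (q - 1)       # trivial to compute once p and q are known


def modexp_with_totient(g: int, x: int) -> int:
    """Compute g**x % n by first reducing the exponent modulo phi(n)."""
    if gcd(g, n) != 1:
        raise ValueError("g must be coprime to n for this reduction")
    return pow(g, x % phi, n)


assert modexp_with_totient(2, 10**6 + 3) == pow(2, 10**6 + 3, n)
```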

    • @thenewnew1997
      @thenewnew1997 2 years ago

      @@trejkaz Can you generalise that shortcut so a computer can use it in every case? Square and multiply works in every case with at most 2*floor(log2(n)) + 1 operations, i.e. O(log n), which is already very efficient (n being the exponent). While I'm at it, I'll also remind you that computers don't have instinct or intelligence like us.

  • @KX36
    @KX36 2 years ago

    nice how you built in the eyes-glazing-over effect into the video so my eyes didn't have to this time like they have in some other videos (because things are complex, not because they're boring) :D

  • @appropinquo3236
    @appropinquo3236 2 years ago

    This is really cool! I'm glad that I learned about binary numbers, because I wouldn't have been able to understand any of this otherwise.

  • @nickdunstone
    @nickdunstone 8 months ago

    Yet again I see 65537, which coincidentally is my favourite number too! It's the 17-bit big brother of 257.

  • @SillyMakesVids
    @SillyMakesVids 2 years ago

    That's a wicked smart algorithm.

  • @jeremyahagan
    @jeremyahagan 2 years ago

    Does this represent the smallest number of steps for any given exponent?

  • @drskelebone
    @drskelebone 2 years ago

    Is there a video about modulo math commuting for both addition and multiplication? I don't remember one, and it seems to be explicitly required here.

    • @JivanPal
      @JivanPal 2 years ago

      What do you mean by "commuting" here?

  • @zyghom
    @zyghom 2 years ago

    math is amazing, especially if used in the correct way ;-) you guys are also AMAZING! ;)

  • @hemangchauhan2864
    @hemangchauhan2864 2 years ago

    This is really clever

  • @edwealleans
    @edwealleans 2 years ago

    A cool video and topic, but I would like to nitpick the camera placement when Dr Mike writes on his paper. Surely the camera could have been on his left side?

  • @deekshantmalvi4612
    @deekshantmalvi4612 2 years ago

    Thanks man. ❤️❤️

  • @martixbg
    @martixbg 2 years ago

    This was a fascinating video even if the algorithm was rather obvious.

  • @SlimThrull
    @SlimThrull 1 year ago

    Huh. I was using a similar but substantially slower method. Good to know it can be improved upon.

  • @devonbraner4353
    @devonbraner4353 2 years ago

    cool video! cool algorithm!

  • @vikingthedude
    @vikingthedude 2 years ago

    So is modulo distributive over multiplication? Is that why we can keep the numbers small?

  • @KlaasDeSmedt
    @KlaasDeSmedt 2 years ago

    12:55 you can work backwards: if it's odd, subtract 1; if it's even, divide by 2 ;)
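    The "work backwards" rule above can be sketched in Python roughly like this. The function names are made up for illustration, and the code assumes a positive exponent: it peels the exponent down to 1 (odd: subtract 1, even: halve), then reverses the recorded steps to get the order in which to square and multiply.

```python
def steps_for_exponent(exponent):
    """Return the square/multiply sequence that builds x^exponent from x."""
    if exponent < 1:
        raise ValueError("exponent must be a positive integer")
    steps = []
    e = exponent
    while e > 1:
        if e % 2 == 1:
            steps.append("multiply")  # undo a multiply by the base
            e -= 1
        else:
            steps.append("square")    # undo a squaring
            e //= 2
    steps.reverse()                   # recorded backwards, so flip the order
    return steps


def apply_steps(base, steps, modulus):
    """Run the step list forwards, reducing mod `modulus` after every step."""
    result = base % modulus
    for step in steps:
        if step == "square":
            result = result * result % modulus
        else:
            result = result * base % modulus
    return result


steps_45 = steps_for_exponent(45)
answer = apply_steps(3, steps_45, 7)
```

    For 45 this walks 45, 44, 22, 11, 10, 5, 4, 2, 1, which reversed gives the same sequence you would read off the binary form 101101.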

  • @jotrockenmitlocken
    @jotrockenmitlocken 10 months ago

    Very helpful.

  • @MM-by6qq
    @MM-by6qq 1 year ago

    thank you!!

  • @harveychallinor367
    @harveychallinor367 2 years ago

    Regarding the final point about reading a private key from power usage, wouldn't it already need to be on the hacker's computer anyway for this to work? In which case, they already have your private key.

    • @JivanPal
      @JivanPal 2 years ago +4

      No; see "side-channel attack". For example, the encryption may be running in a trusted execution environment (like ARM TrustZone or Apple Secure Enclave) and the adversary is running untrusted code outside of that environment that can determine power consumption / voltage levels / timings.
      Another example: the adversary is sitting outside your house with a supersensitive probe in the electricity lines that run into your bedroom, where your desktop computer is plugged in, and is able to determine changes in voltage on the lines that way, which correspond to your computer's power consumption.
      In both examples, the adversary does _not_ have any access to the trusted environment, but is able to acquire information about the behaviour of that environment which can be reverse-engineered to determine what was actually happening in that environment.

    • @harveychallinor367
      @harveychallinor367 2 years ago

      @@JivanPal thanks for the reply, that clears things up

  • @j7ndominica051
    @j7ndominica051 2 years ago

    When making a really big number, if you can't do the intermediary mod trick, how would the computer handle overflow to another word?

    • @nebuleon
      @nebuleon 2 years ago +1

      If you can't do "mod" at any step of the way, then at every multiplication you have to allocate enough memory for a multi-word result whose number of bits is at least the sum of the numbers of bits in the two multiplicands.
      For example, if you're at a point where base^64 is 415 bits, you need at least 830 bits for a square (base^64 x base^64), since both multiplicands are 415 bits and 415 + 415 = 830. Then the multiplication proceeds as usual: on a 64-bit computer, bits 63 to 0 of each multiplicand contribute to partial sums at bits 127 to 0 of the result, and so on, until bits 447 to 384 contribute to partial sums at bits 895 to 768 of the result.
      Alternatively, you could pre-allocate enough bits for the entire number in advance. To do that, you would first run a modified square and multiply that just calculates how many bits you're likely to need to hold the final result given all the multiplications involved, allocate that (say it's 16190 bits), and then execute the proper square and multiply in multi-word arithmetic on 16190 bits.
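      A rough Python sketch of the sizing rule above. Python integers are already arbitrary-precision, so this only illustrates the bookkeeping; the helper names and the 64-bit word size are assumptions for the example.

```python
def product_bits_bound(a_bits, b_bits):
    """Upper bound on the bit length of a product of a_bits- and b_bits-bit numbers."""
    return a_bits + b_bits


def words_needed(bits, word_size=64):
    """Number of machine words needed to hold `bits` bits (ceiling division)."""
    return -(-bits // word_size)


def result_bits_bound(base, exponent):
    """Crude pre-allocation bound for base**exponent with no modulus applied."""
    return base.bit_length() * exponent


a = (1 << 415) - 1                 # the largest 415-bit number
square = a * a
bound = product_bits_bound(a.bit_length(), a.bit_length())

operand_words = words_needed(415)  # words per 415-bit multiplicand
result_words = words_needed(bound) # words for the 830-bit square
```

      The bound is tight for this worst case, and result_bits_bound never underestimates because each multiplication adds at most the sum of the operand bit lengths.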

  • @anon_y_mousse
    @anon_y_mousse 2 years ago

    Being half asleep when watching this, for a moment when he was going over all the steps I thought I was nodding off, but nope, it was just the camera. Perhaps get the camera some coffee in the future.

  • @ujjawalsinha8968
    @ujjawalsinha8968 2 years ago

    Interesting, so a different base (instead of 2) can give different performance. For base = 3, will it be called the cube, square and multiply algorithm?

    • @JivanPal
      @JivanPal 2 years ago

      It is true that in base _n,_ you will determine which of _n_ operations to do (e.g. for _n_ = 3, these are multiply, square, or cube). However, cubing is just multiplying a number by itself twice, so _n_ = 3 actually gives *_worse_* performance. In fact, _n_ ≥ 4 also gives worse performance, since what would be squaring twice in the _n_ = 2 case would become multiplying a number by itself four times. Thus, _n_ = 2 gives the best average performance.
      Even more generally, square and multiply does not give the best possible performance / shortest possible algorithm for any given exponent, but it is the best we can do without solving a harder problem for the exponent first, which on average would give us much worse performance. There is also a shortcut of a different kind: if we know the totient of the modulus _m,_ denoted φ(m) (this is the order of the multiplicative group of integers mod _m,_ and the order of the subgroup { g^x mod m | x ∈ 𝐙 } generated by _g_ divides it, where _g_ is the base of the modular exponential we're trying to compute, which was 23 in the video), and _g_ is coprime to _m,_ then we have g^[φ(m)] mod m = 1 (by Euler's theorem, an extension of Fermat's Little Theorem), and so
      g^x mod m
      = g^[q φ(m) + r] mod m
      = ( g^[φ(m)] ^ q ) g^r mod m
      = g^r mod m,
      where _q_ and _r_ are the quotient and remainder of _x_ divided by φ(m), respectively. However, computing φ(m) is hard unless the prime factorisation of _m_ is known, so in practice this is not used often. The hardness of this problem is what underpins the security of RSA.

  • @kdawg3484
    @kdawg3484 2 years ago

    Mike Pound for Numberphile video, please.

  • @Tristoo
    @Tristoo 2 years ago

    The timing/power thing at the end is called a side-channel attack.

  • @ksc91u
    @ksc91u 2 years ago

    Could you make a video about opaque?

  • @heaslyben
    @heaslyben 2 years ago

    I didn't learn this one at school. It's gorgeous! Thank you.

  • @bwill325
    @bwill325 2 years ago

    I always wondered how we dealt with such enormous numbers

  • @greatestever2914
    @greatestever2914 2 years ago

    Yeah! In my assembly class, when we had to program, there was no multiplication operator, so you'd have to shift the bits of the binary value in the registers to actually multiply two numbers. Don't even get me started on bringing values into registers, and these values can't be stored in just any register... and then pulling the value out... storing it elsewhere... god bless...

  • @Alecu100
    @Alecu100 2 years ago

    A similar algorithm can be applied for divisions.

  • @tomyao7884
    @tomyao7884 2 years ago

    I wonder if square and multiply would lose to a naive multiply-only method for a small modulus m and a large exponent x. The naive method would multiply every time and reduce mod m, then cache the result, quickly generating a repeating pattern of length at most m, and then x mod patternLength would give the place in the pattern which is the answer. So the number of operations is at most m, compared to square and multiply, whose cost keeps growing (with the number of bits of x) for very large x.
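    A minimal Python sketch of the multiply-and-cache idea above, with a hypothetical function name. One detail the description glosses over: when the base shares a factor with the modulus, the residues can have a non-repeating "tail" before the cycle starts, so the sketch indexes relative to where the cycle begins rather than taking x mod patternLength directly.

```python
def pow_by_cycle(base, exponent, modulus):
    """Compute base^exponent mod modulus by finding the repeating residue pattern."""
    if exponent < 0 or modulus < 1:
        raise ValueError("need exponent >= 0 and modulus >= 1")
    first_seen = {}  # residue -> exponent at which it first appeared
    residues = []    # residues[k] == base^k mod modulus
    value = 1 % modulus
    k = 0
    while value not in first_seen:
        if k == exponent:
            return value  # reached the exponent before any repeat
        first_seen[value] = k
        residues.append(value)
        value = value * base % modulus
        k += 1
    start = first_seen[value]  # where the cycle begins
    length = k - start         # cycle length, at most modulus
    return residues[start + (exponent - start) % length]


small = pow_by_cycle(3, 45, 7)
with_tail = pow_by_cycle(2, 10, 8)  # residues 1, 2, 4, 0, 0, ... have a tail
```

    This does at most m multiplications regardless of x, so it only has a chance of winning when m is tiny compared to the bit length of x.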

  • @christopherg2347
    @christopherg2347 1 year ago

    Squaring is a bit like multiplication, while multiplication is a bit like addition.
    Just in the amount it will change the overall result.

  • @misterkite
    @misterkite 1 year ago

    There are a couple of Project Euler questions that this will help solve.

  • @andrewjknott
    @andrewjknott 2 years ago +2

    5:24 - cleaner explanation to convert 45 -> 101101 -> 32 + 8 + 4 + 1.

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +1

      that's backward of this algorithm
      you did calculation of powers of 2 and multiply them
      in the video 101101 -> (((((1)*2+0)*2+1)*2+1)*2+0)*2+1
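      The nested form above maps directly onto code: every "*2" on the exponent is a squaring of the running result, and every "+1" is a multiply by the base. A minimal Python sketch, assuming a non-negative exponent (the function name is just for illustration; Python's built-in pow(base, exponent, modulus) already does this job):

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right square and multiply, reducing mod `modulus` at every step."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % modulus
    for bit in bin(exponent)[2:]:              # e.g. 45 -> "101101"
        result = result * result % modulus     # exponent * 2
        if bit == "1":
            result = result * base % modulus   # exponent + 1
    return result


answer = square_and_multiply(3, 45, 7)
```

      The 32 + 8 + 4 + 1 view reads the same bits right to left instead, multiplying together precomputed x^1, x^4, x^8 and x^32.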

    • @JivanPal
      @JivanPal 2 years ago

      @@NoNameAtAll2 Or in postfix notation to avoid all those parentheses: 1 2× 0+ 2× 1+ 2× 1+ 2× 0+ 2× 1+.

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago

      @@JivanPal why not prefix then?
      + * + * + * + * + * 1 2 0 2 1 2 1 2 0 2 1
      :)

  • @snack711
    @snack711 1 year ago

    math rules, i love these kind of tricks

  • @QuantumHistorian
    @QuantumHistorian 2 years ago +3

    Is there a proof that that's the fastest decomposition to exponentiate a number? The algorithm here clearly works in all cases in, at worst, 2 log_2(m) steps for exponentiating by m, but it's not at all obvious to me that this is the fewest possible number of steps for all m. I vaguely recall hearing a few years ago that in general this was actually still an open problem.

    • @romajimamulo
      @romajimamulo 2 years ago +3

      That's because it's not. For instance, if you knew your exponent was x^y and x was odd, it can be faster to raise to the power of x repeatedly.
      However, this is useful with computers because they use binary, so finding which powers of 2 make up the exponent is trivial

    • @pikasnoop6552
      @pikasnoop6552 2 years ago +6

      It indeed is still an open problem. The fact that this is not the fastest way can be seen even for n=15. In that case one can compute (x^3)^5 and save a step.
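      The n = 15 case can be checked by counting multiplications. A small Python sketch, with made-up helper names: the plain binary method costs one squaring per bit after the leading one plus one multiply per additional 1 bit, while (x^3)^5 costs the binary cost of 3 plus the binary cost of 5.

```python
def binary_method_cost(exponent):
    """Multiplications (squarings included) used by plain square and multiply."""
    bits = bin(exponent)[2:]
    squarings = len(bits) - 1
    multiplies = bits.count("1") - 1
    return squarings + multiplies


def factored_cost(a, b):
    """Cost of computing (x^a)^b, using the binary method for each factor."""
    return binary_method_cost(a) + binary_method_cost(b)


direct = binary_method_cost(15)  # 1111 in binary: 3 squarings + 3 multiplies
via_3_and_5 = factored_cost(3, 5)

x = 7
same_value = (x ** 3) ** 5 == x ** 15
```

      Here the factored route needs 5 multiplications against 6 for the binary method, which is the one step saved.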