Prime Factorization Algorithm!

แชร์
ฝัง
  • เผยแพร่เมื่อ 28 ก.ย. 2024

ความคิดเห็น • 87

  • @MsAnnaBs
    @MsAnnaBs 5 ปีที่แล้ว +12

    This is the best explanation that I've ever seen. Exactly how all topics should be explained. Before this video I couldn't get, why we need to divide number when we are looking for all Prime Factors of a number. Thank a lot dude!!

    • @NERDfirst
      @NERDfirst  5 ปีที่แล้ว +1

      Hello and thank you very much for your comment! Very happy to be of help =)

  • @mamazu1995
    @mamazu1995 6 ปีที่แล้ว +2

    One other thing to note is that the more factors a number has the easier it is to factor it.
    For example the number 2000000000000 which is 2^13 * 5^12 which the computer can figure out instantly. However, if I flip one of the middle digits to a five and make the number to get 2000005000000 then it takes 1.5 seconds to figure out 2^6*5^7*7*57143. Because now it had to generate all the prime up to 57143.

    • @NERDfirst
      @NERDfirst  6 ปีที่แล้ว

      Yes! This is so true!
      When I was trying to find numbers that would take a while to factorize (for the demonstration at the end), I observed this as well. The length of the number isn't the determining factor at all, and as you've mentioned, there are extremely large numbers that can be factored very easily.
      I guess this is a good demonstration to show us that generating a "difficult" number goes beyond picking a large one. Large prime factors have to be used, so it isn't a straightforward process either.

  • @marklondon9004
    @marklondon9004 ปีที่แล้ว

    Great tutorial! Thanks

    • @NERDfirst
      @NERDfirst  ปีที่แล้ว +1

      You're welcome! Glad to be of help =)

  • @mamazu1995
    @mamazu1995 6 ปีที่แล้ว

    Great video and nice explanations. I also implemented prime finders like that.
    And I have looked at your code which you provide with Google Docs link. If you did that with github, I would have made a pull request. Your code contains a lot of redundancy up to the point where only the prime generators differ. What you could easily do is put the whole execution code into one function that takes in a parameter (the prime generator and runs the program) that way you could easily run the program twice with both generators and compare the output.

    • @NERDfirst
      @NERDfirst  6 ปีที่แล้ว +1

      Hello and thank you very much for your comment! Yes I avoided using github for simplicity on my part. I also avoided directly comparing the two since the slower function would significantly bottleneck the other for most large inputs, so it wouldn't be particularly useful.
      Of course, if you've made modifications to the code that you'd like to share, feel free to do so here! I'm sure it'll help others as well =)

  • @briancannard7335
    @briancannard7335 4 ปีที่แล้ว

    Thanks for teaching! A great example! :-)

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว +1

      Hello and thank you very much for your comment! Glad you liked the video =)

    • @briancannard7335
      @briancannard7335 4 ปีที่แล้ว

      @@NERDfirst thanks for checking in! I actually now think that I found a way to totally blow RSA. Have you heard about en.wikipedia.org/wiki/RSA_numbers#RSA-896 factoring challenge?

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว

      No, I haven't!

  • @pigizoid9924
    @pigizoid9924 ปีที่แล้ว +1

    27:10
    23909 isnt a factor of 64,563,455,677,457,222
    what happened?

    • @NERDfirst
      @NERDfirst  ปีที่แล้ว +1

      Hello and thank you very much for your comment! Yeah, I don't really know what happened there! Unfortunately I don't have the time to investigate at the moment due to work commitments but I'll put up a note so users are aware.

    • @NERDfirst
      @NERDfirst  ปีที่แล้ว +2

      Hello again! Figured out the problem - While Python is supposedly able to handle integers of arbitrary length, we eventually divide it by a prime number. This operation converts the number into a floating point number, which, if large enough, will suffer from floating point precision loss. Hence the value changes, causing the results to become incorrect. Using floor division (//=) instead of regular divison (/=) fixes the problem. I'm working on correcting the code on the Bitbucket page now.

    • @pigizoid9924
      @pigizoid9924 ปีที่แล้ว

      @@NERDfirst use
      from decimal import *
      then use
      num = Decimal( input ("num :"))
      and it should work
      but you might need to convert other calculations that you do on num into decimal as well
      e.g.
      to square root use num**Decimal(0.5) or it will raise an error

    • @pigizoid9924
      @pigizoid9924 ปีที่แล้ว

      basically Decimal will automatically store the number with place value precision instead of storing the whole number, so you do need to convert any calculations you do to decimal as well, i found it has no impact on the run time but allows you to store as big or precise decimal place values as you want

    • @NERDfirst
      @NERDfirst  ปีที่แล้ว

      Thank you for sharing! Good to know there's an analog to Big Integers in Python.

  • @rmaxwell86
    @rmaxwell86 4 ปีที่แล้ว +2

    I had to write a similar program for a course.
    I basically recursively used the difference of squares (fermat’s factorization), then modulo division from 2 to sqrt(x) for each factor.
    That dramatically reduced the running time.
    Basically it changes the problem from factoring a very large integers into factoring two smaller integers..
    I thought I had messed up at first, as my running times were 0 seconds for the largest integer the professor provided (about 18-19 digits).. which took them 16 seconds. And even larger integers took only 7-8 seconds to calculate.

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว

      Hello and thank you for your comment! Interesting approach, I'll have to look deeper into that.

    • @agnelodearaujo7413
      @agnelodearaujo7413 ปีที่แล้ว

      Could you explain the "modulo division from 2 to sqrt(x) for each factor" part ? How did you iterate through those values ?

  • @Pa_Nic
    @Pa_Nic 3 ปีที่แล้ว +1

    For the people who want to learn faster algorithms for these problems, here's a list of the algorithms that are used in practice.
    prime checking:
    Miller-Rabin algorithm, if this algorithm is run once, it can output "prime with probability 75%" or "definitely composite". If you run this algorithm 100 times for a 100 digit number, there's an incredibly high probability that your answer is correct (if there is a single "definitely composite", it's definitely composite) and this runs in under a second.
    Baillie-PSW algorithm, mathematicians haven't been able to prove that this algorithm is ever wrong, but they also haven't been able to prove that it's always correct. It's faster than Miller-Rabin.
    Atkin-Morain Elliptic Curve Primality Test (ECPP), slower than Miller-Rabin, but it can output "definitely prime" or "maybe composite" with the maybe composite answer having diminishing probability the larger the numbers you try. Just like the Miller-Rabin test, you can run this one multiple times until you get the "definitely prime" output. You also get a certificate if it's a prime so you can verify that it's a prime (if you want to prove it to your friend, for example). The verification is faster. This algorithm is slow, but verifying the certificate is really fast. It also involves A LOT of Math.
    Factorisation:
    Pollard-rho algorithm, this algorithm runs in about the same speed as trial division, but for numbers with twice the length. Coupled with Miller-Rabin, this can be a really efficient algorithm for numbers below 80 digits.
    Quadratic sieve, this algorithm is the best known for integers above 80 digits and under 120 digits. It runs in a few thousand years total (the algorithm was parallelised on many computers). This algorithm requires heavy duty math.
    General Number Field Sieve, this algorithm is the best known for integers above 120 digits.
    Other algorithms are either inefficient, only works for special numbers, is only fast for small numbers, or unknown :)

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว +1

      Thank you for sharing!

  • @stego4LP
    @stego4LP 3 ปีที่แล้ว

    If the number to be factorized contains the same prime 2 times wont the isPrime function return false on the second call?

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว

      Hello and thank you for your comment! Could you give me a timestamp of the code you are referring to so I can better understand you? I'm not really sure where I should go on the video to begin with this.

    • @stego4LP
      @stego4LP 3 ปีที่แล้ว

      @@NERDfirst Yes of course sorry. 24:37. I was talking about the first optimization. E.g. our number to be factorized has 2 times 2 as prime factor. Then the second call of isPrime(divisor=2) will return False because 2 is already in primes. It is very likely that i am wrong or missed something but i cant get my head around it.

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว

      Hello again! I see the confusion now, that optimization is a little bit... hard to read. AND there _does_ seem to be a bug in there, but it won't cause the problem you mention.
      Luckily it won't return false. Since 2 is already in the list primes, the if-statement (Line 10 at 25:20) will cause the loop to break immediately, resulting in a return value of True outside the list.
      Where I think I messed up is doing primes.append() again. So the list of primes would have two 2s in it. This could pose a problem in the long run as we have to deal with a ton of duplicated primes.
      EDIT: I have confirmed the above bug happens. Thanks for bringing it to my attention! I'll edit the code.

  • @tejeshwarshivanandan4966
    @tejeshwarshivanandan4966 5 ปีที่แล้ว

    If i would've used print instead of return i would've got the output as false n times....why?...whats the difference between return and print... I'm a beginner😅

    • @NERDfirst
      @NERDfirst  5 ปีที่แล้ว +2

      Hello and thank you for your comment! Return *leaves* a function and passes the value back to the caller. If you use print in place of return, the function will not stop running, so the flow of events will be completely different.

  • @orcashamudeluxeu567
    @orcashamudeluxeu567 5 ปีที่แล้ว

    Cool

    • @NERDfirst
      @NERDfirst  5 ปีที่แล้ว +1

      Thank you! Glad you liked the video =)

    • @orcashamudeluxeu567
      @orcashamudeluxeu567 5 ปีที่แล้ว

      @@NERDfirst thx
      /// error

  • @Inspirator_AG112
    @Inspirator_AG112 ปีที่แล้ว +1

    O(√n) still doesn't defeat astronomically large values...
    That is best suited for O(log(n)) or O(log(log(n))).

    • @NERDfirst
      @NERDfirst  ปีที่แล้ว +1

      Hello and thank you for your comment! Yes indeed - It's only an improvement over linear time.

  • @NERDfirst
    @NERDfirst  3 ปีที่แล้ว +1

    *ERRATA AND THANKS (Expand to Read)*
    Thank you to @pigizoid9924 for pointing out that the partial factorization shown at 27:10 is incorrect - 23909 isn't a factor of the input at all. This error crops up with factorizing very large numbers - Divisions cause the number to be converted to floating point, which leads to loss of precision. This "corrupts" the number, causing the rest of the results to be incorrect. Using a floor division instead (which never converts the number to floating point) fixes the problem. The code in the repository has been updated as of 27-Jun-2023.
    Thanks to Dat_Kowalski for revealing an inefficiency with the isPrime() function in factorizer_extended.py (25:20). If you perform multiple repeated runs of isPrime() on the same prime number, that number gets added to the primes list multiple times. The attached code has been updated to address this - If a prime number is already known, return it immediately. This prevents the above problem from happening, AND prevents recomputation for already known primes.
    Thanks to denifednu for discussing the slowness of the print statement in primefactors_extended.py. In the attached code, an additional counter has been added so that the print statement only occurs once every _n_ steps (set to 5000 in the code, but you can tweak it using a constant at the top of the code).

  • @subway39964
    @subway39964 4 ปีที่แล้ว +2

    I didnt follow your solution, but listening you inspired me to think of my solution....liked how you teach to divide the problem in pieces ..helped me understand..thank you sir..:)

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว

      You're welcome! Very happy to be of help =)

  • @TC-rv6sz
    @TC-rv6sz 4 ปีที่แล้ว +2

    I was stuck on problem 5 from project euler and this helped me wrap my head around the issue I was having with my hopelessly-brute-force factorization function lol. thanks !!

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว

      You're welcome! Glad to be of help =)

  • @otetumooluwaseun3948
    @otetumooluwaseun3948 2 วันที่ผ่านมา

    Wow!
    I went through the comments before watching the video and I thought maybe it was all hype. But watching after watching the video, I wonder why it doesn't have millions like yet. That shows new software engineers are minimal. We have mostly front-end and back-end developers.

    • @NERDfirst
      @NERDfirst  15 ชั่วโมงที่ผ่านมา

      Hello and thank you very much for your comment! I'm not sure if this reflects better on my software engineering ability or rather how I structure and present information (Of course I would prefer it to be the latter, since my career is in education). Regardless, I'm glad this video has been a help for you!

  • @den1fednu
    @den1fednu 6 ปีที่แล้ว +1

    primefactors_extended.py is slowed down dramatically by the 'Please wait, now factorizing with' print statement, updating the code to print that status message once per second gives a substantial increase in speed.
    Although beyond the scope (and point) of this video, for anyone looking for one of the fastest python implementations available:
    pip install sympy
    >>> from sympy.ntheory import factorint
    >>> factorint(your_number)

    • @NERDfirst
      @NERDfirst  6 ปีที่แล้ว +1

      Hello and thank you very much for your comment! Great point on the print statement, got too caught up in flashy visuals! Thanks for sharing about sympy as well!

  • @MayankSharma-wx4mf
    @MayankSharma-wx4mf 6 ปีที่แล้ว +2

    Beautifully explained

    • @NERDfirst
      @NERDfirst  6 ปีที่แล้ว

      Hello and thank you very much for your comment! Glad you liked the video =)

  • @illiasdeekens9840
    @illiasdeekens9840 3 ปีที่แล้ว

    If we use the code at 15:34, would factors that are used multiple times also appear multiple times in the result?

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว

      Hello and thank you for your comment! Yes. This is a known bug and I believe it persists all the way to 25:20. See pinned comment for more.

  • @KemonoFren
    @KemonoFren 3 ปีที่แล้ว

    This is the Sieve of Eratosthenes, correct?

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว

      Hello and thank you for your comment! This does not explicitly apply the principles from that algorithm, but the effects are certainly very similar.

  • @universalponcho
    @universalponcho 3 ปีที่แล้ว

    You are amazing

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว

      Hello and thank you for your comment! Glad you liked the video :)

  • @sensi59
    @sensi59 5 ปีที่แล้ว +1

    Best video on prime number ever ! Thank you.

    • @NERDfirst
      @NERDfirst  5 ปีที่แล้ว

      You're welcome! Very happy to be of help =)

  • @siamsama2581
    @siamsama2581 2 ปีที่แล้ว

    Good video

    • @NERDfirst
      @NERDfirst  2 ปีที่แล้ว

      Hello and thank you for your comment! Glad you liked the video =)

  • @Dtomper
    @Dtomper 3 ปีที่แล้ว

    24:30 At first the primes list is empty, so if we give isPrime any number it should return True, am I wrong?
    Edit: I understand my mistake now, the other function that's calling isPrime() actually starts from 2 so all the previous primes are saved. Btw thanks for the video I really liked the explanation.

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว

      Hello and thank you for your comment! Good catch actually. If you're explicitly calling isPrime() on its own then you need to watch out for this.

  • @ph6560
    @ph6560 ปีที่แล้ว

    I was quite struck that this channel doesn't have more subscribers. Especially as _this gentleman is impressively instructive with an exceptional talent for pedagogics_ - which is shown in the great way he structures and explains various complex topics!

    • @NERDfirst
      @NERDfirst  ปีที่แล้ว

      Hello and thank you very much for your comment! Really happy to know you've found the video useful! The low view/sub count is probably a direct result of me not doing any marketing at all, and just relying on the TH-cam algorithm!

    • @ph6560
      @ph6560 ปีที่แล้ว

      @@NERDfirst i kinda suspected that actually but still thought I'd give iit a try and sincerely praise your talent, send some good vibes and encourage you to make more vids. As I just found your channel recently I obviously havent had a chance to watch that many vids,, but I only need a handful to realize your skills in conveying complex topics in instructive ways. Again, my aim was to suggest you create more vids if the motivation is the of course. More specifically, as my personal preference is more of maths, discrete maths, AI, etc. and not so much different hardwares, I'd make the suggestion that you perhaps delve into such topics of your likings more frequently, i.e. if I got to choose. :) I e.g. briefly watched the vid about primes, which was magnificent btw, and such topics are likely "broader" as it relates to volume of potential, interested viewers. If you understand what I mean and of course also have the interest and motivation to make such vids - I'd be first in line to watch them and give it its deserved "like" click. Lastly I'll wish you further success with your youtube-endevours, and keep up the good work!

    • @NERDfirst
      @NERDfirst  ปีที่แล้ว

      Thank you very much! The good vibes have definitely been well received and I'm super grateful you took the time out to write =)
      Unfortunately my upload rate has tapered off over the years - The hobby projects always take a back seat to the "real work", but I'm constantly on the lookout for new ideas. I've got so many ideas kicking around, just never the time and energy to execute them, but I'm always keeping an eye open for a even a little window of time.
      Luckily I definitely get to choose the topics (though that means I tend to stick to those that I'm more comfortable with, which is why I tend to shy away from math-sy stuff a lot of the time), but yes, I'm always on a lookout for what might be in demand for video ideas as well.
      Thank you again! It's comments like this that make my day =)

  • @Taqu3
    @Taqu3 4 ปีที่แล้ว

    This is a real original content. I can't believe crap posted on minutephysics gets over 300k

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว

      Hello and thank you very much for your comment! Glad to be of help =)

  • @johannes469
    @johannes469 4 ปีที่แล้ว

    Great explanation! Thanks!

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว

      You're welcome! Glad you liked the video =)

  • @manuelocana8074
    @manuelocana8074 4 ปีที่แล้ว

    i don't understand why you changed the range from (2, x) to primes, if for example when you first run the program primes it's going to be empty, so no element is going to be looped over, and the function will return true no matter what number is. Can someone explain?

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว +1

      Hello and thank you for your comment! The reason why this works is because if a number is found to be prime, it is appended to the primes list (that's line 17 at 25:13). So assuming that isPrime() is called on every number in ascending order, the list of primes grows as more calls are made to it, allowing it to work as expected, checking all the known prime numbers only.

    • @manuelocana8074
      @manuelocana8074 4 ปีที่แล้ว

      @@NERDfirst Oh yeah thanks! now i get it. It works because you are giving numbers in ascending order, so i was trying to know if for example 4 is prime right away by giving the number to the function isPrime(), but that doesn't seem to work.

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว +1

      Hello and thank you for your comment! Yes indeed. That's bad programming on my part to be honest, functions with side effects like this isn't great! If the prime checker is used standalone then the assumption there is violated.

  • @peschebichsu
    @peschebichsu 3 ปีที่แล้ว

    Amazing video!

    • @NERDfirst
      @NERDfirst  3 ปีที่แล้ว

      Hello and thank you very much for your comment! Glad you liked the video =)

  • @NikKappa
    @NikKappa 4 ปีที่แล้ว

    You can optimise it by doing in range 2 to sqrt(x).

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว +3

      Hello and thank you for your comment! Yes, this is discussed from 19:27 onwards, in addition to the optimization where only other primes are being tested.

  • @nimeshpoudel8277
    @nimeshpoudel8277 4 ปีที่แล้ว

    I am just a beginner so I don't have skills to put into power format but my code is significantly faster than his and shorter too
    import math
    import time
    ti=time.time()
    def fact(n):
    i=2
    fact_list=[1]
    while i

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว +1

      Hello and thank you for your comment! Not sure which bit of code you're comparing yours to, but I see you've missed the optimization in which you compare divisors against just the known prime numbers (that's all you need to compare as non-primes are "captured" within the primes).
      Also, we can't quite compare run times directly due to hardware differences, but if you consider runtime complexity, the code in the video is more efficient!

    • @nimeshpoudel8277
      @nimeshpoudel8277 4 ปีที่แล้ว

      @@NERDfirst sorry I sent wrong code which was not ready
      Here is the real one
      import math
      from collections import Counter
      def fact(n):
      i=2
      fact_list=[]
      while i

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว +1

      Hello again! Looks like you removed the code that stops the process at sqrt(n), plus you haven't implemented the optimization in which the divisors are compared only against the known prime numbers (Hint: Use what you have in fact_list to do this!).
      Good job with the optimizations so far! Since you've said that you're a beginner, here are some resources you can sue to explore further:
      Firstly, note that in the real world, runtime complexity tends to be a bigger determining factor of code efficiency rather than runtime, you can find out more here: th-cam.com/play/PLJse9iV6Reqh5B_w9koGyT7nlYm92iITk.html
      Code length is also not a huge factor outside of competitions. Rather, it is good software engineering practice to use meaningful variable names and have readable code. More on Software Engineering here: th-cam.com/video/sB2iQSvrcG0/w-d-xo.html
      Good job so far and all the best with your programming =)

    • @nimeshpoudel8277
      @nimeshpoudel8277 4 ปีที่แล้ว

      @@NERDfirst Thank you 😀

  • @gitlit5489
    @gitlit5489 5 ปีที่แล้ว

    What is the time complexity of the original "factorize(x)" you created ?

    • @NERDfirst
      @NERDfirst  5 ปีที่แล้ว +1

      Hello and thank you for your comment! I'm gonna just analyze this intuitively:
      factorize(x) is not so easy to analyze because the way in which x changes over time is rather unpredictable. I'd say that in the worst case, "divisor" runs from 2 up until x (this happens when x is prime), making it O(x) time.
      We know that primeTest(n) is O(n), I think that one is pretty clear. And since factorize() calls primeTest() every iteration, the time complexity is O(nx). Since the largest n = x, assuming that we consider the worst case, we can say the time complexity is O(n²)

    • @gitlit5489
      @gitlit5489 4 ปีที่แล้ว

      @@NERDfirst Thanks, that helped a lot

  • @archiegillis
    @archiegillis 4 ปีที่แล้ว

    Great!

    • @NERDfirst
      @NERDfirst  4 ปีที่แล้ว

      Hello and thank you for your comment! Glad you liked the video =)

  • @profab7038
    @profab7038 5 ปีที่แล้ว

    Fantastic👍

    • @NERDfirst
      @NERDfirst  5 ปีที่แล้ว

      Thank you very much! Glad you liked the video =)

  • @leesweets4110
    @leesweets4110 2 ปีที่แล้ว

    Factor 32903082870580893241 using your algorithm. I dare you.
    I was honestly hoping for an algorithm a little less.... naïve... but whatever; you gotta cater to your base.

    • @NERDfirst
      @NERDfirst  2 ปีที่แล้ว +1

      Hello and thank you very much for your comment! Yes indeed, we're keeping things basic here with the intuitive approach plus a few optimizations.
      Got any more advanced algorithms you'd like me to look at? Would love to revisit this subject sometime!