How on Earth does ^.?$|^(..+?)\1+$ produce primes?

แชร์
ฝัง
  • เผยแพร่เมื่อ 21 ธ.ค. 2024

ความคิดเห็น • 2.2K

  • @standupmaths
    @standupmaths  หลายเดือนก่อน +467

    Take your personal data back with Incogni! Use code standupmaths at the link below and get 60% off an annual plan: incogni.com/standupmaths
    Don’t use code ^.?$|^(..+?)\1+$.

  • @Cmanorange
    @Cmanorange หลายเดือนก่อน +3731

    wow, what a neat trick! i'm adding this to my production code immediately, good luck new hires!

    • @omnijack
      @omnijack หลายเดือนก่อน +264

      You monster

    • @OverkillDM
      @OverkillDM หลายเดือนก่อน +85

      @@omnijack ‘Tis the season

    • @MichaelOfRohan
      @MichaelOfRohan หลายเดือนก่อน +21

      YOURE why my neck hurts!! ='(

    • @robertdascoli949
      @robertdascoli949 หลายเดือนก่อน +236

      make sure to comment
      "// not sure what this does but the system breaks if you remove it"

    • @ngwoo
      @ngwoo หลายเดือนก่อน +8

      Halloween post

  • @Graphene_314
    @Graphene_314 หลายเดือนก่อน +3983

    The plural of regex is regrets.

    • @hgiusdfajgfds
      @hgiusdfajgfds หลายเดือนก่อน +153

      strangely enough, also the singular

    • @Bennici
      @Bennici หลายเดือนก่อน +183

      "I have a problem. Oh, I know, I'll use regex to solve that problem." [...] "Now I have two problems."

    • @JPBelanger
      @JPBelanger หลายเดือนก่อน +26

      To many IT/Prog guys here. 😁
      That said, pretty solid explanation by Matt, but yeah, pretty easy to get lost in RE

    • @lox7182
      @lox7182 หลายเดือนก่อน +8

      @@JPBelanger is that really a surprise?

    • @JPBelanger
      @JPBelanger หลายเดือนก่อน +5

      @lox7182 not really😁
      Guessing not many strangers to twenty sided die, board games, and LotR either 😁

  • @Yogarine
    @Yogarine หลายเดือนก่อน +3379

    Regular expressions are already scary by nature. Making one that produces primes is truly terrifying.

    • @GustvandeWal
      @GustvandeWal หลายเดือนก่อน +158

      Checks for*
      It does not produce them

    • @minamagdy4126
      @minamagdy4126 หลายเดือนก่อน +77

      Also, this isn't a true regex. The type of self referentialism in action here is imopssible in simple regex's.

    • @Kosmokraton
      @Kosmokraton หลายเดือนก่อน +54

      ​@GustvandeWal True, but just hook it up to an incrementing int and a while loop and you've got yourself a generator.
      (Or abuse a for loop, lol.)

    • @Varksterable
      @Varksterable หลายเดือนก่อน +23

      ​@Kosmokraton I did think Matt was going to add this to his code.
      But on reflection, it's probably a good thing he didn't.
      Code like that is truely terrifying.
      Perl has a reputation of being a "write only language."
      Then regexes must be the "write only, and get it right only occasionally by fluke or trial and error API of coding."

    • @Varksterable
      @Varksterable หลายเดือนก่อน +38

      ​@@minamagdy4126So only "simple" regexes are "true" regexes?
      How does that work?

  • @fgvcosmic6752
    @fgvcosmic6752 หลายเดือนก่อน +609

    It really works! I put it on the extra notes section of my Amazon delivery, and it became Prime!

    • @colmx8441
      @colmx8441 หลายเดือนก่อน +53

      I forgot the not and my parcel came chopped up into different parts.

    • @FirstLast-gw5mg
      @FirstLast-gw5mg หลายเดือนก่อน

      @@colmx8441 Should've used ^(?!.?$|(..+?)\1+$) instead.

    • @Frexuz
      @Frexuz หลายเดือนก่อน +7

      😂

    • @Celebration-p3u
      @Celebration-p3u หลายเดือนก่อน +2

      Haha lemme try it-

    • @leap123_
      @leap123_ หลายเดือนก่อน

      Tried it on my AliExpress delivery and it became Prime the drink. It tastes awful.
      My doctor also said my body have lead in them.

  • @RMDragon3
    @RMDragon3 หลายเดือนก่อน +183

    To simplify, this takes the number and expresses it as a unary numeral system, where the number of marks is the value of the number you put in (e.g. 1 is 1, 2 is 11, 3 is 111, ...). The regex then looks to divide the unary representation into exactly N groups of size M, where both N and M are natural numbers with value 2 or larger. For example, you can divide 4 (represented as 1111) into 2 groups of "11". You can divide 6 (represented 111111) into 2 groups of "111" or 3 groups of "11". However, you can't divide 3 (represented as 111) into groups of the same size (technically you can do 3 groups of "1", but then M isn't 2 or larger). Basically, you are checking if the number has any factors bigger than 1, which obviously is only false for prime numbers.

    • @dudicrous
      @dudicrous หลายเดือนก่อน +1

      I think you mean N and M are 2 or bigger (not "bigger than 2").

    • @dudicrous
      @dudicrous หลายเดือนก่อน

      Still, I get why M should be 2 or bigger ( because the first dot in [..+?] is obligatory and the + means at least 1 extra. I don't get why N should be at least 2. The + after the group means 1 group or more doesn't it? 1 group of three dots shouldn't render True however.

    • @RMDragon3
      @RMDragon3 หลายเดือนก่อน +5

      @@dudicrous oops, yes I do, fixed it now.
      N must be 2 or larger because the part in the parenthesis needs to match something (the first group), and the \1+ means that the same thing must appear at least one more time. The parenthesis part also "consumes" whatever it matches. So basically 4 (aka 1111) would match 11 with the (..+?) and the remaining 11 would match the \1+. If you have 2 (aka 11) you can match the parenthesis, but then you have nothing left to match the \1+ part.

    • @dudicrous
      @dudicrous หลายเดือนก่อน +4

      @@RMDragon3 Ah yes, so ".+" is different from "(.)\1+". Looking back, Matt mentions it correctly, the video shows an erroneous first "or" though.

    • @acasualviewer5861
      @acasualviewer5861 หลายเดือนก่อน +3

      I'm trying to figure out if the algorithm is quadratic or exponential
      I'd guess it's at least pseudo-exponential because the length of the number is n=LogN.. But this algorithm is N*N.
      Since LogN^b = N. Then the algorithm is O(n^b*N) = O(n^2b). Which is very big.
      Also for huge primes the O(N) memory requirement could be prohibitive.

  • @busyfreeguy
    @busyfreeguy หลายเดือนก่อน +805

    15+ years of programming experience including using regular expressions and I still watched and enjoyed this video. It's a testament to Matt's enthusiasm and his skill as a speaker/teacher.

    • @paulwomack5866
      @paulwomack5866 หลายเดือนก่อน +14

      Retired life long professional Unix user - Perl and Vi(m) mean regex is my native tongue!

    • @skylark.kraken
      @skylark.kraken หลายเดือนก่อน +9

      Same, I figure out how it worked by just reading it, but he’s a joy to watch. Although, granted tapping on the video I was confused how it would be able to parse “17” as a prime and took a re-read of the regex to understand that it’s a length of the same character

    • @SuperHansburger93
      @SuperHansburger93 หลายเดือนก่อน +1

      Same here. That's a really neat way to use regex. Not as efficient as the sieve of Eratosthenes, for the reasons mentioned at 14:30, but still really cool :)

    • @amits4744
      @amits4744 หลายเดือนก่อน

      Which Python editor/IDE is this

    • @skylark.kraken
      @skylark.kraken หลายเดือนก่อน

      @@amits4744 Matt is using VSCode

  • @tempest_dawn
    @tempest_dawn หลายเดือนก่อน +1512

    love that the halloween horror episode on a math channel is . . . code

    • @maskettaman1488
      @maskettaman1488 หลายเดือนก่อน +71

      Great idea for the next horror video. Simply presenting samples of hacked together python code from past videos

    • @minamagdy4126
      @minamagdy4126 หลายเดือนก่อน +50

      No, the real horror would be ultra-optimized C++ code. The badly-written python code is a staple

    • @Appalachian7922
      @Appalachian7922 หลายเดือนก่อน +41

      Nothing scarier than regular expressions.

    • @EarendilStar
      @EarendilStar หลายเดือนก่อน +10

      But it's code that software engineers hate :)

    • @AdamMansbridge
      @AdamMansbridge หลายเดือนก่อน +1

      And a prime-al forest and a computer joke

  • @scottieapplseed
    @scottieapplseed หลายเดือนก่อน +173

    I loved the home joke at the end; you're such a good localhost.

    • @MrSonny6155
      @MrSonny6155 หลายเดือนก่อน +11

      I tried the second incantation myself, but I just got sent to the party just down the road. Maybe I should have just used "~"...

    • @MykTaylor
      @MykTaylor หลายเดือนก่อน +8

      ​@@MrSonny6155 it's cool that your neighborhood has a designated local host

  • @sillybear25
    @sillybear25 หลายเดือนก่อน +50

    A minor bit of pedantry: ^.?$|^(..+?)\1+$ is a "regex" (i.e. a pattern specification understood by most common implementations of regular-expression-based pattern-matching engines), but it is not a regular expression in the strict sense of the term. The minimal definition of a strict regular expression includes only literal characters, a symbol standing for an empty string (typically ε), parentheses to clarify the order of operations, the alternation operator (vertical bar), and the Kleene star (asterisk). Most of the other symbols in our prime-matching regex can be derived from those in the minimal definition: The wildcard . is shorthand for (a|b|c|d|...), and a+ is shorthand for aa*. The symbols ^, $, and ? are irrelevant in a strict regular expression, since it either matches the entire string or it doesn't match at all. The real problem is the \1: There's no way to construct an equivalent out of the minimal definition.

    • @TheRyulord
      @TheRyulord หลายเดือนก่อน +15

      Yup. Most modern "regex" engines are actually context-free grammar engines but they use syntax similar to true regex engines from decades past so people keep calling them that.

    • @flyinhigh7681
      @flyinhigh7681 หลายเดือนก่อน +9

      Thank you for this comment! As someone who spends a good bit of time in theoretical computer science land it always bugs me when "regex" arent actually regular at all. Saved me the trouble of writing this comment myself

    • @yash1152
      @yash1152 17 วันที่ผ่านมา +1

      thank you @TheRyulord for mentioning that the relation to CFG. i think it's something even more sinister... 'cz whereas a CFG/PDA uses stack, i.e. single time contiguous use, the backref uses arbitrary memory. & i dont remember which Grammar represents automaton with arbitrary append-only (say immutable read/write) memory.

    • @charlesstaats9902
      @charlesstaats9902 3 วันที่ผ่านมา

      Google’s regex library, called RE2, does *not* allow backreferences like this since they can cause a regex search to take exponential time.

  • @tostadojen
    @tostadojen หลายเดือนก่อน +20

    *Matt:* "What the factor?"
    *Primes:* "Your factors can't help you here."

  • @aeroeng15
    @aeroeng15 หลายเดือนก่อน +471

    To clarify: a '?' on it's own matches 0 or 1 of the preceding character. A '?' following '+' modifies the greedy behavior of '+' to be less greedy

    • @DCSWCCkingpin1
      @DCSWCCkingpin1 หลายเดือนก่อน +39

      Knows the different flavors of ? but doesn't know the difference between its and it's

    • @thatphatbaby
      @thatphatbaby หลายเดือนก่อน +174

      @@DCSWCCkingpin1we’re programmers. Not languagers.

    • @DeronMeranda
      @DeronMeranda หลายเดือนก่อน +174

      @@DCSWCCkingpin1 it'?s

    • @Lord_zeel
      @Lord_zeel หลายเดือนก่อน +5

      @@DeronMeranda I see what you did there.

    • @lockaltube
      @lockaltube หลายเดือนก่อน +23

      Just a note, what you call "less greedy" is known as lazy.

  • @arthurmesh7884
    @arthurmesh7884 หลายเดือนก่อน +496

    Someone has to say it. Regular expressions,in their truest sense, ie ones that are equivalent to regular languages are not capable of primality test. Only extended ones that have “memory” can do it. I.e \1 part of the example above is what’s not possible in regular languages

    • @ModusTrollens91
      @ModusTrollens91 หลายเดือนก่อน +116

      This brought me flashbacks having to prove that on an exam with the pumping lemma

    • @infernape716
      @infernape716 หลายเดือนก่อน +12

      @@ModusTrollens91 *shudders*

    • @UJ-nt5oo
      @UJ-nt5oo หลายเดือนก่อน +34

      @@ModusTrollens91 "prove that on an exam". username checks out

    • @glowingfatedie
      @glowingfatedie หลายเดือนก่อน +13

      I'm not sure what you mean by "truest sense." The sense you gave is wrong - regular expressions aren't /equivalent to/ regular languages, they /describe/ regular languages. Backreferences add the ability to also describe non-regular languages, but that doesn't make such a regular expression implementation somehow be not a regular expression implementation.

    • @UJ-nt5oo
      @UJ-nt5oo หลายเดือนก่อน +38

      ​@@glowingfatedienot sure thats right. regular expressions / regular languages etc can't do prime testing because they by definitin be matched with/reduced to a finite state machine which can't do counting. ie unbounded memory ie not finite i think we are just being too pedantic about the word regular tbh.

  • @mckseal
    @mckseal หลายเดือนก่อน +1073

    Regex is already halloween worthy.

    • @nburakovsky
      @nburakovsky หลายเดือนก่อน

      use regex all the time now to validate llm output formats, they arent scary they are lovely

    • @aikumaDK
      @aikumaDK หลายเดือนก่อน +2

      It's the blackest of black magic

    • @Swordfish42
      @Swordfish42 หลายเดือนก่อน +1

      ​@@aikumaDKOh yeah? Have you tried PERL?

    • @Talderas
      @Talderas หลายเดือนก่อน +2

      ​​@@Swordfish42 RegEx inside Perl. The darkest of black magic.

    • @Mr_Yod
      @Mr_Yod หลายเดือนก่อน

      @@Talderas RegEx inside Malbolge: try that, if you are brave enough! =)

  • @tonicblue
    @tonicblue หลายเดือนก่อน +15

    I love regular expressions. The xkcd is true, you really do feel like you've developed a super power when you master them

    • @nyankers
      @nyankers หลายเดือนก่อน +2

      It's like actual wizardry, and no less arcane. Reading someone else's regex really is like deciphering a spell comprising ancient runes you individually understand...

  • @_ax1ss
    @_ax1ss หลายเดือนก่อน +19

    I've been writing code professionally for 20 years and this is the best explanation of regex I've seen. Great job!

    • @RobloxPrompt
      @RobloxPrompt หลายเดือนก่อน +1

      At first I thought the ? meant unknown quantity before watching the video lol.

  • @HugoBDesigner
    @HugoBDesigner หลายเดือนก่อน +227

    As a Regex connoisseur, I was really confused as to how the expression could possibly do that. But it makes perfect sense now. So simple, and yet so clever!

    • @WoolyCow
      @WoolyCow หลายเดือนก่อน +28

      nobody is a regex connoisseur. nobody. it is where happiness goes to die...

    • @HugoBDesigner
      @HugoBDesigner หลายเดือนก่อน +39

      @@WoolyCow Did you know that there exists a Regex Crossword puzzle game? It's surprisingly fun!

    • @I.____.....__...__
      @I.____.....__...__ หลายเดือนก่อน +8

      @@WoolyCow Maybe for you, but I married RegEx and had two kids with it.

    • @VoidEternal6644
      @VoidEternal6644 หลายเดือนก่อน +2

      @@HugoBDesigner WHERE?? IT DOES SOUND FUN I WANNA TRY IT /silly

    • @IceMetalPunk
      @IceMetalPunk หลายเดือนก่อน +9

      @@WoolyCow Incorrect. I find RegEx quite fun and interesting. Figuring out the RegEx you need for a specific task is like solving a puzzle.

  • @estivalbloom
    @estivalbloom หลายเดือนก่อน +297

    13:45 Hey! Regex capture groups *are* zero-indexed; the brackets start at one but capture group zero is the entire match. Don't you slander regular expressions by calling them one-indexed

    • @mouykaing6483
      @mouykaing6483 หลายเดือนก่อน +49

      They're Parker-one-indexed

    • @mina86
      @mina86 หลายเดือนก่อน +12

      No. The whole match is not a capture group. You can observe it by doing ‘re.search('bar', 'foobarbaz').groups()` which returns an empty tuple because there are no capture groups. \0 is the syntax to refer to the whole match.
      where `n` is non-zero digit is a syntax to refer to 1-indexed capture group.

    • @giacomostevanato2203
      @giacomostevanato2203 หลายเดือนก่อน +20

      ​@@mina86 No
      is syntax for the nth backreference. It just so happen that the 0th backreference is the whole match and the (n+1)th backreference is the nth group. Groups by themselves are not 1-indexes, that only seem to happen when referring to them using backreferences because backreferences are not just groups.

    • @androlsaibot
      @androlsaibot หลายเดือนก่อน +3

      @@giacomostevanato2203 So you're saying groups are not indexed at all because you reference the backreference and backreference \1 is the "first" (in a normal language sense) or "0th" (in a fictitious 0-indexed sense) group.

    • @giacomostevanato2203
      @giacomostevanato2203 หลายเดือนก่อน +5

      @@androlsaibot I mean, in the 0-indexed list of backreferences, the one at index 1 is the first that is also a group (and is followed by the other groups).

  • @Rohanology27
    @Rohanology27 หลายเดือนก่อน +695

    What in the ^.?$|^(..?)\1+$ did I just watch?

    • @lockaltube
      @lockaltube หลายเดือนก่อน +43

      Now it won't work! without +, ..? captures one or two ones, so magic stops after 1111. You gotta be lazy!

    • @alexandermcclure6185
      @alexandermcclure6185 หลายเดือนก่อน +3

      Oh, it was 179, AKA p[41].

    • @RainBoxRed
      @RainBoxRed หลายเดือนก่อน +12

      It's pronounced "factor".

    • @Luni_2000
      @Luni_2000 หลายเดือนก่อน +1

      ​@@lockaltubeactually it will now only say true for multiples of 1 and 2, which is all the numbers

    • @AlmondAxis987
      @AlmondAxis987 หลายเดือนก่อน +2

      This just tests for odd numbers greater than 1.

  • @rlarsen000
    @rlarsen000 หลายเดือนก่อน +1

    After spending 52 years of my life coding, this is the first time any bit of code made me laugh. Good thing I had put my coffee down first. Great video!

  • @ThePoxun
    @ThePoxun หลายเดือนก่อน +5

    you know its going to be a good day when Matt releases a video that intersects with my most used number sets and my most used programming feature.

  • @CheaterCodes
    @CheaterCodes หลายเดือนก่อน +1139

    just to be that guy: 127.0.0.1 isn't "home", it's loopback, so it should just send you back to where you're at right now.

    • @thepython10110
      @thepython10110 หลายเดือนก่อน +243

      If you really want home, use ~

    • @mystifoxtech
      @mystifoxtech หลายเดือนก่อน +91

      thanks for being that guy

    • @GrouchierThanThou
      @GrouchierThanThou หลายเดือนก่อน +102

      ​@@thepython10110 That just translates (or expands as shell nerds like to say) to home. If you actually want to go home you should use cd.

    • @theeternalsw0rd
      @theeternalsw0rd หลายเดือนก่อน +109

      just to be that more annoying guy. Um actually, it's technically the default loopback. You can add more manually and some operating systems support more out of the box. Any ip address in the subnet 127/8 is available for use as loopback.

    • @mineton1293
      @mineton1293 หลายเดือนก่อน +52

      I love all the other "that guy"s replying to this

  • @KnightTobyas
    @KnightTobyas หลายเดือนก่อน +388

    A bit pedantic, but 127.0.0.1 is a loopback address. It's like stepping outside to ring the doorbell to make sure it is working. The only time it sends you home is if you are already home.

    • @JdeBP
      @JdeBP หลายเดือนก่อน

      The correct incantations would indeed be either "Set-Location -LiteralPath $env:USERPROFILE" or "cd" . (-:

    • @RobinSyl
      @RobinSyl หลายเดือนก่อน +35

      The doorbell thing is how I'm gonna explain loopback from now on

    • @vl4dl3n
      @vl4dl3n หลายเดือนก่อน +21

      "stepping outside", its more like trying to reach doorbell while standing inside :D

    • @ShadowManceri
      @ShadowManceri หลายเดือนก่อน +3

      Or like if you would have portal gun then it makes the blue and orange portals the same.

    • @MonzennCarloMallari
      @MonzennCarloMallari หลายเดือนก่อน +17

      "The only time it sends you home is if you are already home"
      Well that was oddly poetic

  • @nickalasmontano1496
    @nickalasmontano1496 หลายเดือนก่อน +96

    The cinematography of some of the more recent videos has been astounding. Nice work!

    • @kapriolenpfeifer
      @kapriolenpfeifer หลายเดือนก่อน

      Indeed! The only minor point is that the editor should use a higher brightness or a screen with wider contrast.

  • @dalewheat
    @dalewheat หลายเดือนก่อน +7

    Thank you, Matt, for reminding me how I truly feel about regular expressions. On an unrelated note, beware of “clever” code. I read that in a Fortran book once [citation needed].

    • @swankeepers
      @swankeepers หลายเดือนก่อน

      You want "clever" code? Check out the International Obfuscated C Code contest (hosted for many years by a fellow that worked for me, who (non-ironically) is a famous prime-hunter.)

  • @stigcc
    @stigcc หลายเดือนก่อน +2

    In essence, this regex is a creative way to test the indivisibility property of prime numbers using string patterns. It’s a fascinating example of how regular expressions can be applied beyond their usual text-processing tasks!

  • @stoopidgenious
    @stoopidgenious หลายเดือนก่อน +114

    i love matts videos
    maths videos are the best

    • @nanamacapagal8342
      @nanamacapagal8342 หลายเดือนก่อน +16

      import re
      comment = input()
      print(re.match(r'm...s',comment))
      >>> True

    • @Celebration-p3u
      @Celebration-p3u หลายเดือนก่อน +2

      import re
      comment = input()
      print(re.match(r'm...s',comment))
      >>>True

    • @ssdd28561
      @ssdd28561 หลายเดือนก่อน +2

      Oh, it's nice to see a totally normal person saying a totally normal sentence in the wild!

  • @stephenj9470
    @stephenj9470 หลายเดือนก่อน +140

    3:31 The ability of Matt to accurately point to symbols that he'll put in later...

    • @mattparker-2
      @mattparker-2 หลายเดือนก่อน +8

      it was seriously impressive. i wonder if it was first try

    • @HenryLoenwind
      @HenryLoenwind หลายเดือนก่อน +11

      Not so impressive when you realise he's looking at a monitor showing a life feed from the camera. Easy to mark the right positions on that...

    • @stephenj9470
      @stephenj9470 หลายเดือนก่อน +22

      @@HenryLoenwind Still tough to nail the pointing.

    • @WJS774
      @WJS774 หลายเดือนก่อน +28

      @@HenryLoenwind Have you ever tried doing that? It is _not_ easy. Our brains did not evolve to process third-person feedback like that, so unless you have quite a bit of practice, you will need to do a lot of adjusting after your initial guess.

    • @j0code
      @j0code หลายเดือนก่อน +6

      @@WJS774 Unless the camera feed is mirrored. Then it's pretty easy.

  • @masterandexpert288
    @masterandexpert288 หลายเดือนก่อน +82

    "What the factor" got me hahaha

  • @musicdudejoe263
    @musicdudejoe263 หลายเดือนก่อน +1

    This was your silliest video yet and I loved it, haha. Was cracking up at the thought of someone chasing you round a wood throwing numbers at you while filming it.

  • @austinwright2591
    @austinwright2591 หลายเดือนก่อน +6

    13:48 This is not technically a "regular" expression because backreferences are not regular; as what the backreference matches changes based on previous input, and regular expressions are context-free (you can concatenate two regular expressions and it has the expected effect, but this does not hold true if the expression contains a backreference). "Regular" is an important property because it guarantees performance within a fixed amount of memory regardless of the input size, and compute time proportional to the input size, but the backreference obliterates that characteristic.

  • @CoolAsFreya
    @CoolAsFreya หลายเดือนก่อน +113

    There's no place like 127.0.0.1 ... except the rest of the /8 block of 16,777,216 addresses also reserved for loopback

    • @martijnvds
      @martijnvds หลายเดือนก่อน +18

      And ::1

    • @runed0s86
      @runed0s86 หลายเดือนก่อน

      ​@@martijnvdsthat's just 127.0.0.1 in ipv6

  • @CurtisDyer
    @CurtisDyer หลายเดือนก่อน +58

    This looks like an alternate version of Abigail's prime regex that was posted on a Perl newsgroup in the '90s. Legendary Perl hacker.

    • @MishaTheElder
      @MishaTheElder หลายเดือนก่อน +10

      Right. There should be an entire episode on Abigail's tricks

    • @ville_syrjala
      @ville_syrjala หลายเดือนก่อน +1

      @@MishaTheElder Sounds like an episode of the "The Scary Door".

  • @LuxFerre4242
    @LuxFerre4242 หลายเดือนก่อน +574

    If have a problem, and you try to use regex to solve it, you now have two problems.

    • @zeikjt
      @zeikjt หลายเดือนก่อน +72

      (problems)+

    • @dimesio
      @dimesio หลายเดือนก่อน

      @@zeikjt (problems)*

    • @roxas8999
      @roxas8999 หลายเดือนก่อน +14

      I was sure somebody would have commented this, so I scrolled through a lot until I found this

    • @ken830
      @ken830 หลายเดือนก่อน

      @@roxas8999 Me too... first thing I did was search for "problems" and found this comment. Oldie... but goodie...

    • @alvaroludolf
      @alvaroludolf หลายเดือนก่อน

      Pff... coward.

  • @DirkieDurky
    @DirkieDurky หลายเดือนก่อน

    As someone who understands regex I still didn't understand how it could produce primes. Was super interesting to learn how, and what an amazing bit at the end! Thanks for all the great content you provide 🙏❤

  • @michellevanallen3286
    @michellevanallen3286 หลายเดือนก่อน +1

    My 8 year old loves your videos…I predict he will giggle uncontrollably anytime a new number is summoned but also give me a genuine look of confusion when trying to understand regex.
    And I wouldn’t be completely mad if he picks up on “what the factor” because that’s freaking hilarious but I still wouldn’t want to have to explain it to his teachers. 😂

  • @kingbeauregard
    @kingbeauregard หลายเดือนก่อน +79

    I was baffled until you pointed out that what we're really regex-ing against is a sequence of 1's. Regex is just trying to see if it can break that string of 1's into a number of equal-sized chunks (that are of length 2 or greater); for example, it could break "1111" into two chunks of length two, but it could not do anything similar with "11111".

    • @backwashjoe7864
      @backwashjoe7864 หลายเดือนก่อน +10

      Thank you! He left out that kind of nice example and left me confused.

    • @manlyadvice1789
      @manlyadvice1789 หลายเดือนก่อน +10

      Same here. I was baffled before, aghast after. It's as if someone was trying to come up with the least efficient, most memory-intensive prime-seeking method possible.

    • @skuizhopatt5318
      @skuizhopatt5318 หลายเดือนก่อน

      Yeah, it's not JS hyere, so '1' != 1 ^^
      (Killing my own joke, but JS typing system is really bloated. Not to that point, though)

    • @theairaccumulator7144
      @theairaccumulator7144 หลายเดือนก่อน

      @@manlyadvice1789 tbh it only takes pcre2 860 microseconds on my computer to test 1001 in 1468 steps. 3089 is tested to be prime in 3.52ms and 25310 steps. I've had regexes used in production take more steps.

    • @FirstLast-gw5mg
      @FirstLast-gw5mg หลายเดือนก่อน

      Yeah, I was immediately proving it false by counter-example until he explained that 1) it's checking the length of the string, not the contents and 2) it's the opposite, it's checking for composite numbers.
      If you want prime lengths, use this instead:
      ^(?!.?$|(..+?)\1+$)

  • @saFubar
    @saFubar หลายเดือนก่อน +108

    For anyone else who like me was confused at 14:20, when Matt says "we can have multiples of them" he's referring just to the \1+ portion of the expression. This is in addition to the preceding capture group that it references, which in effect means that we *must* have multiples of them when taken across the whole string -- thus ensuring that prime numbers don't themselves get "factored" as p*1 and matched by the regex. This is more obvious when he says "you can have two twos, or three twos, or four twos, all the way up" even though the graphic makes it look like one two is an option.

    • @fradinetienne712
      @fradinetienne712 หลายเดือนก่อน +3

      Thanks, I was so confused by how prime were not flagged as well!

    • @theaxer3751
      @theaxer3751 หลายเดือนก่อน +1

      Thanks. I've worked with regex, but never \1 inside the regex. This actually makes sense to me now, whereas I was confused by Matt's phrasing.
      It's literally like 'insert whatever group 1 matched here'

    • @linazso
      @linazso หลายเดือนก่อน

      That was confusing to me as well! Not sure I now fully understood how it does that but at least I got that it's somehow only referring to two or more groups.

    • @HenryLoenwind
      @HenryLoenwind หลายเดือนก่อน +2

      Or, to say it in other words: The second part of regex grabs any number that's 2 or higher that is repeated 2 or more times. So if it can be expressed as "[2,3,4,...] x [2,3,4,...]", it matches.

    • @rmsgrey
      @rmsgrey หลายเดือนก่อน +3

      Yeah, despite the first column of "or"s, the regex is "find something that matches what's in the brackets, and then find that exact string again one or more times" so if the bit in the brackets found "11", then the regex would be looking for "11" followed by at least one "11", taking up the entire string.

  • @IsYitzach
    @IsYitzach หลายเดือนก่อน +67

    Its not quite the Sieve of Eratosthenes as it checks if a the target number is a factor of all numbers smaller than it. The Sieve, as written by Eratosthenes, will only use prime numbers up the to the square root of the number as it will discard all composites before they are used.

    • @arirahikkala
      @arirahikkala หลายเดือนก่อน +8

      I believe that's still trial division, not the actual sieve. See Melissa E. O’Neill's The Genuine Sieve of Eratosthenes, which goes into detail on this and derives asymptotics for both: The sieve does a lot less work to find all primes up to n than trial division, even trial division only checking prime numbers up to sqrt(n).

    • @theaxer3751
      @theaxer3751 หลายเดือนก่อน +1

      I still haven't figured out how to do that one in Human Resource Machine optimally

    • @framegrace1
      @framegrace1 หลายเดือนก่อน +3

      Original algorithm, as writen by Eratostenes, doesn't stop at sqrt n. It doesn't change the complexity, so it doesn't really matter, in fact.
      For him would have been a strange "optimization", as very few integers have rational square roots, an calculating the rest would be almost as tedious as the siege itself.

    • @kyay10
      @kyay10 หลายเดือนก่อน +1

      ​@@framegrace1I don't think he would've been that confused by the idea of taking the ceiling of the square root. You can easily describe what that means in the language of geometry, which he definitely understood

    • @huellenoperator
      @huellenoperator หลายเดือนก่อน +5

      @@framegrace1 You don't need to know roots, just stop when the square of the current number exceeds your chosen max.

  • @AthenaAGT
    @AthenaAGT หลายเดือนก่อน

    I’ve never been happier to be even slightly fluent in regex. It was torture to learn but this made it worth it.

  • @Thewinner312
    @Thewinner312 หลายเดือนก่อน +7

    A string is a "string of characters". It makes sense to say because it applies to all strings, not just those that also happen to be grammatically correct sentences. For example "^.?$|^(..+?)\1+$" is a string, but certainly not a sentence.

    • @FirstLast-gw5mg
      @FirstLast-gw5mg หลายเดือนก่อน +3

      "20 to life" is a string, a sentence, and a noun phrase.

  • @Richardincancale
    @Richardincancale หลายเดือนก่อน +13

    I liked that it checked the primality of a series of ‘1’ characters. It shows that prime numbers are not dependent on the base of the numbers, just the number of ‘things’ being tested

  • @krissp8712
    @krissp8712 หลายเดือนก่อน +29

    This is approaching the level of LLM powered IsEven

    • @DanielQRT
      @DanielQRT หลายเดือนก่อน

      it it orders of magnitude more efficient, this regex can check all numbers up to ~10_000 in ~3 seconds on my machine (using the fancy_regex crate in rust)

  • @johnchessant3012
    @johnchessant3012 หลายเดือนก่อน +131

    "for completeness, I will show you that this definitely does work in practice"
    Can confirm. I said the incantation aloud and a wild 163 appeared, what about you guys?

    • @cheeseburgermonkey7104
      @cheeseburgermonkey7104 หลายเดือนก่อน +10

      The demonic power might be accessing your memories while the incantation progresses because I got 41,024,320 random numbers that I don't know what to do with now.

    • @skandragon586
      @skandragon586 หลายเดือนก่อน +10

      i must have said it wrong... i got an 8

    • @colinmcconnell827
      @colinmcconnell827 หลายเดือนก่อน +8

      An Amazon Prime delivery appeared.

    • @alexandermcclure6185
      @alexandermcclure6185 หลายเดือนก่อน

      @@colinmcconnell827 can confirm, i was the package

    • @alexandermcclure6185
      @alexandermcclure6185 หลายเดือนก่อน +5

      I tried it and I discovered a new programming language called C** what should i do!?

  • @TheMyx231
    @TheMyx231 หลายเดือนก่อน

    I applaud your commitment to making regular expressions make sense.

  • @Slowphoton
    @Slowphoton หลายเดือนก่อน +5

    This regular expression has been around for decades. Attributed to Abigail, a Dutch JAPH (just another Perl hacker). I hope YT will not eat my comment.

  • @MattSeremet
    @MattSeremet หลายเดือนก่อน +7

    I'm proud I was able to learn regular expressions well enough to update a popular stack exchange answer that had sat for a decade with a tiny error in the explanation. Subtle and important.

    • @jonasbarka
      @jonasbarka หลายเดือนก่อน +2

      I would be proud if I could post a question to Stack Overflow without it immediately being closed for one reason or another...

  • @althaz
    @althaz หลายเดือนก่อน +38

    Dang, prime numbers *AND* regex? It must be my birthday!

    • @LawrenceBeyer790
      @LawrenceBeyer790 หลายเดือนก่อน +1

      Happy birthday!c

    • @alexandermcclure6185
      @alexandermcclure6185 หลายเดือนก่อน

      hap birf

    • @flickthenick
      @flickthenick หลายเดือนก่อน

      Surely your birthday, as in anyone's birthday, is a regex?

  • @chemann2944
    @chemann2944 หลายเดือนก่อน +15

    "What the factor!" , nice one Matt

  • @DAveShillito
    @DAveShillito หลายเดือนก่อน +1

    One of my favourite quotes "If you have a problem that can be solved by regular expressions, you now have two problems" 😁

  • @BrandonMeyer1641
    @BrandonMeyer1641 หลายเดือนก่อน

    I stumbled upon your rgb Christmas tree and thought that was cool. So I looked into your Channel more. As someone who’s been learning r and wrangling data I was instantly intrigued. I was like, “is that a regular expression? Generate primes? It can’t be”.
    Subscribed.

  • @tygrataps
    @tygrataps หลายเดือนก่อน +9

    The camera man never dies! I am so sharing this with my work group tomorrow.

    • @tygrataps
      @tygrataps หลายเดือนก่อน

      They loved it btw!

  • @ryanthunder3247
    @ryanthunder3247 หลายเดือนก่อน +12

    I am always so happy to see a stand-up maths video. The ending spoky bit was quite a fun touch 😁

  • @ErzengelDesLichtes
    @ErzengelDesLichtes หลายเดือนก่อน +18

    The important part in cracking this is the text you’re running it on and what on earth ‘1’*n means in python. Once I figured out it means “repeat the character n times” and so you’re matching against a string of repeating characters, what the regex is doing became clear.

  • @HaydenSmedley-y7k
    @HaydenSmedley-y7k หลายเดือนก่อน

    Quality is never an accident; it is always the result of intelligent effort.

  • @nathanrogers7397
    @nathanrogers7397 หลายเดือนก่อน

    This video was relevant to my interests both, as an amateur prime number hunter and as a person who uses regular expressions to manipulate text files as part of my job. Thanks for the fun video!

  • @lordterk
    @lordterk หลายเดือนก่อน +145

    Screaming in regular expressions 😂

    • @Milamberinx
      @Milamberinx หลายเดือนก่อน +1

      Careful, with good luck and bad luck it'll match, not necessarily the same or right things.

    • @lordterk
      @lordterk หลายเดือนก่อน

      @Milamberinx hahahaha absolutely true! 🤣

    • @Duraludon884
      @Duraludon884 หลายเดือนก่อน +7

      "A+"

  • @RussellBeattie
    @RussellBeattie หลายเดือนก่อน +5

    It's a fundamental law of computer science that any sufficiently complex regular expression is, by definition, immutable code: It can only be written or re-written, never edited (or understood).

    • @robadams1645
      @robadams1645 หลายเดือนก่อน +1

      But if you put a regex in your code without a comment explaining what it does, I WILL fail your pull request.

  • @youdontknowme5969
    @youdontknowme5969 หลายเดือนก่อน +21

    Aaahh yes, RegEx's
    Fun to concoct
    Absolute nightmare to debug

    • @asandax6
      @asandax6 หลายเดือนก่อน +2

      As someone who's written Regex for 2 years I still have no idea what I'm doing and I have a feeling I never will.

    • @ailaG
      @ailaG หลายเดือนก่อน

      ​@@asandax6 that's okay, I've been doing them for 20 and I feel the same.
      At least we can now ask ChatGPT...
      One time ages ago someone asked in a local forum about a regex and I thought he was parodying them because on top of the whole cat-walked-on-keyboard mess it had smileys in it! (this is before emojis, but the image equivalent of a 😊 😢 etc)
      Until I realised the colons and brackets were part of the original regex, transformed by the forums software
      And while that doesn't demonstrate how annoying regex is, it sure shows it

  • @vacronda
    @vacronda หลายเดือนก่อน

    This video's production value is like finding a new prime - astronomically impressive!

  • @benjaminsprung315
    @benjaminsprung315 หลายเดือนก่อน

    Thank you Matt for this prodigious incantation unveiling the deep mystery of regular expressions!

  • @jamesmnguyen
    @jamesmnguyen หลายเดือนก่อน +146

    Haven't watched the full video yet but I immediately said, "Now plug the largest Mersenne Prime into that regex"

    • @Z_Inspector
      @Z_Inspector หลายเดือนก่อน +21

      I'm not quite sure python can handle a number of that size, but if it could, it would take an eternity to run anyway!

    • @MrKahrum
      @MrKahrum หลายเดือนก่อน +23

      calculate it for me and i will ;)
      tl;dw: after two infinities, my computer spits out its answer, confirming that which we definitonally knew to be true. we look at each other, and say "nice".

    • @FScott-m1n
      @FScott-m1n หลายเดือนก่อน +3

      Ka -- blooie!

    • @marcoottina654
      @marcoottina654 หลายเดือนก่อน +5

      ​@@MrKahrum "Top long ; didn't ..... Write?"

    • @jamesmnguyen
      @jamesmnguyen หลายเดือนก่อน +8

      @@marcoottina654 Too Long; Didn't Watch

  • @arc8dia
    @arc8dia หลายเดือนก่อน +19

    The production value of this video is amazing ... for a math video

  • @ChalkyWhiteChalkyWhite
    @ChalkyWhiteChalkyWhite หลายเดือนก่อน +9

    'i love matts videos'

  • @WispYart
    @WispYart หลายเดือนก่อน

    As a student who at the time give a lecture to the class on regex, I love this video, thank you Matt!

  • @melthomas8048
    @melthomas8048 หลายเดือนก่อน

    Not gonna lie, Business Matt was so good with the transition that I wasnt even skipping that Ad

  • @benharri
    @benharri หลายเดือนก่อน +28

    regex in TH-cam titles. coming back to this later 📌

    • @benharri
      @benharri หลายเดือนก่อน

      no place like 127.0.0.1

    • @I.____.....__...__
      @I.____.....__...__ หลายเดือนก่อน

      _Tom Scott has unentered retirement…_

    • @alexandermcclure6185
      @alexandermcclure6185 หลายเดือนก่อน +1

      @@benharri actually, there's no place like ~

  • @decare696
    @decare696 หลายเดือนก่อน +33

    '^' is called a caret. An up-arrow would be ↑...

    • @JdeBP
      @JdeBP หลายเดือนก่อน +6

      In the 7-bit world of the 1960s, they were the same thing. Have a look at the early days of ASCII. Terrible Python has yet to fully escape that world.

    • @tavern.keeper
      @tavern.keeper หลายเดือนก่อน +1

      It's called a circumflex

    • @menachemsalomon
      @menachemsalomon หลายเดือนก่อน +1

      Yes. The '^' is the caret, and the '|' is the stick.
      You know, "Do this, or else ..." That's a stick. And that's what '|' is used for, though usually doubled to accomplish that. `Do() || die("Hey, I've died twice.")` works in C, though it's a Perl idiom.

    • @FirstLast-gw5mg
      @FirstLast-gw5mg หลายเดือนก่อน

      I prefer to think of it as a naked circumflex.

    • @埊
      @埊 หลายเดือนก่อน

      @@menachemsalomon 丨 and 人 and 个 is used where?

  • @maertsnosmirc
    @maertsnosmirc หลายเดือนก่อน +27

    I can think of nothing scarier than regular expressions

    • @FirstLast-gw5mg
      @FirstLast-gw5mg หลายเดือนก่อน +1

      Well then you shouldn't be afraid of ^(?!.?$|(..+?)\1+$) since it's looking for the beginning of something that _isn't_ matching the regular expression.

  • @newcantinacrispychickentac7754
    @newcantinacrispychickentac7754 หลายเดือนก่อน

    As someone with no computer science knowledge, this video was very easy to follow. Great job!

  • @ericbrine9503
    @ericbrine9503 หลายเดือนก่อน

    A good way to define a regular expression is that it defines a set of strings. We normally use them to check if a string belongs to the set, but we could also use them to generate strings that belong in the set. In other words, we could use the regex pattern to generate primes instead of just checking if a number is prime.

  • @MadSpacePig
    @MadSpacePig หลายเดือนก่อน +11

    I remember reading Humble Pi and near the end thinking 'Matt is really more of a Computer Scientist at this point rather than just a mathematician'
    I think this is the nail in the coffin.

    • @TheGreatAtario
      @TheGreatAtario หลายเดือนก่อน +7

      No computer scientist would consider "sentence" to be the proper meaning of "string"

    • @swankeepers
      @swankeepers หลายเดือนก่อน +1

      @@TheGreatAtario was going to post the same, but thanks for doing it first.

  • @sevcoyote4730
    @sevcoyote4730 หลายเดือนก่อน +43

    I don't think I've ever heard somebody pronounce the 'g' in 'regex' like the 'g' in 'regular', even though it obviously makes sense when you think about it.

    • @thepython10110
      @thepython10110 หลายเดือนก่อน +27

      I've always pronounced it that way in my head, and was surprised when I heard it the other way. It really is exactly the same as the GIF thing.

    • @JavedAlam-ce4mu
      @JavedAlam-ce4mu หลายเดือนก่อน +11

      I have only ever heard it pronounced rejex. The sound from the reg in register. I think it's pronounced this way because it is more phonetically satisfying.

    • @thepython10110
      @thepython10110 หลายเดือนก่อน +7

      @@JavedAlam-ce4mu That is the normal pronunciation, and makes sense in English (usually G before I or E is a soft G, like in other languages), but a hard G also makes sense because it comes from "reGular."

    • @grelca
      @grelca หลายเดือนก่อน +2

      @@thepython10110​​⁠​⁠the folks who get mad when people say gif with a soft g would like a word with you 😂

    • @auralluring
      @auralluring หลายเดือนก่อน +3

      i’ve heard it so many times both ways, at this point i’m not even sure which way i usually say it

  • @GWConsultant
    @GWConsultant หลายเดือนก่อน +10

    IMHO @15:21 explanation of (..+)\1 in the picture is incorrect! For instance, (11)\1 shall NOT match 11 but shall match 1111 or 111111 because at least ONE REPETITION of the given group given in parentheses MUST be present! If there is no repetition of the given group, then there is no match!

    • @fatemonkey
      @fatemonkey หลายเดือนก่อน

      There doesn't need to be a repitition. + matches 1 or more times, so it's checking if 11 appears 1 or more times, thus 11 would match...

    • @AbiGail-ok7fc
      @AbiGail-ok7fc หลายเดือนก่อน +3

      @@fatemonkey 11 will not match. + matches indeed 1 or more times, but \1+ means that you have to match 1 or more times whatever was matched by the capture group *following* the match of the capture group.

  • @linguica
    @linguica หลายเดือนก่อน

    I love regular expressions. I taught myself to use them in college and then as an intern doing data entry I was able to complete a full day's "work" in like 30 minutes.

  • @ace90210ace
    @ace90210ace หลายเดือนก่อน

    as a multi decade long software dev i didnt expect to learn this much about regex from a Stand up maths video lol. recognised it instantly as regex and was very curious how it worked

  • @Niveum
    @Niveum หลายเดือนก่อน +8

    If we took the formal language theory definition of regular expressions (from the Chomsky hierarchy), evaluating primes should not be possible.
    Regex implementations (like in Python) uses backtracking and recursive matching.

  • @CaelVK
    @CaelVK หลายเดือนก่อน +7

    regex is definitely one of the scariest things you can show to any programmer

  • @ianmcdougall2898
    @ianmcdougall2898 หลายเดือนก่อน +11

    There's no place like ::1...

    • @thepython10110
      @thepython10110 หลายเดือนก่อน +12

      There's no place like ~

    • @skuizhopatt5318
      @skuizhopatt5318 หลายเดือนก่อน

      Hey ! I'm your new postman. You can trustfully bring me all your mail, including valuables.
      (yes, this is an ipv6 joke too :p )

  • @brentlidstone1982
    @brentlidstone1982 หลายเดือนก่อน

    As someone who has recently got into Python it is really funny to me that Matt has also recently got into Python and now this channel has become a de-facto Python tutorial goldmine.

  • @Utoxin
    @Utoxin หลายเดือนก่อน +2

    I've used regex for 25 years, and still didn't spot the trick until you started explaining the backreference magic. GAH.

  • @kingbeauregard
    @kingbeauregard หลายเดือนก่อน +7

    Regular expressions are otherworldly. Do learn them if you do any programming, though; they are one of the most useful things you can master.

    • @mjkmetso2935
      @mjkmetso2935 หลายเดือนก่อน +3

      They are also the easiest way to introduce massive bugs into production *Looking at crowdstrike*

    • @_Agent_86
      @_Agent_86 หลายเดือนก่อน +1

      The most useful thing to master in coding is how to make the most readable code possible. In that light, regex is the least useful things you could master. Never be clever!

    • @kingbeauregard
      @kingbeauregard หลายเดือนก่อน +1

      @@_Agent_86 Yes and no. Yes on, readability is key to good code; code that is easy to maintain tends towards fewest bugs. No on your disdain for regular expressions: even simple regular expressions are useful as hell. For example, if you want to test the format of a positive integer with no leading zeros, it's /^[1-9]\d*$/ . That involves fewer gyrations than trying to evaluate the expression numerically and stripping off leading zeroes / a decimal / etc. Or trying to extract fields of data from a string, regular expressions can make difficult tasks very simple.
      You're right that regular expressions can be abused, to which I say: don't abuse them. Problem solved.

    • @kingbeauregard
      @kingbeauregard หลายเดือนก่อน +1

      @@mjkmetso2935 Even that isn't the fault of regex. The incident report identifies this as the problem: "The content validator evaluated the new template instances under the assumption that the IPC template type would provide all 21 inputs. Due to the mismatch, the validator failed, leading to the incident." It wasn't a regex blunder or a bug in regex or an ambiguity in regex; from what they're describing, the coder operated under incorrect assumptions about the data. The same error would have occurred with a non-regex validator.

    • @_Agent_86
      @_Agent_86 หลายเดือนก่อน +2

      @@kingbeauregardI do agree, and I do use them, but they are inherently low readability. And when I review code with one, I assume it was copy pasted from somewhere. Then I have to divert and research carefully. My comment was mostly to quell the enthusiasm of the op a bit! Appreciate your POV though.

  • @two_number_nines
    @two_number_nines หลายเดือนก่อน +16

    13:57 no, the capture groups ARE zero-indexed. The \0 capture group is the whole regex pattern itself. You can't use the \0 capture group within the regex itself, but you can use it in re.sub(...) and re.findall(...).group(0).

    • @mina86
      @mina86 หลายเดือนก่อน +5

      No. The whole match is not a capture group. You can observe it by doing ‘re.search('bar', 'foobarbaz').groups()` which returns an empty tuple because there are no capture groups. \0 is the syntax to refer to the whole match.
      where `n` is non-zero digit is a syntax to refer to 1-indexed capture group.

  • @OlivierGalizzi
    @OlivierGalizzi หลายเดือนก่อน +3

    I have been using regular expressions in my work very regularly for years and yet I am amazed by this one and especially by its simplicity compared to the task accomplished ! Kudos to Illya for coming up with that piece of cryptic string !

    • @vibaj16
      @vibaj16 หลายเดือนก่อน

      I don't understand what the "\1+?" is doing. If "\1" is the first index of all the possible matches from the expression in the parentheses, then wouldn't it just be the match "11"? So wouldn't this regex only match on multiples of 2?

    • @OlivierGalizzi
      @OlivierGalizzi หลายเดือนก่อน

      @vibaj16 when this regexp matches, (...+?) captures the smallest prime divisor. For example 5 consecutives "1" if you test 385=5*7*11=77 groups of 5 consecutives "1". \1 will then also represent 5 consecutives "1" and \1+ will match the 76 remaining groups of 5 consecutives "1".

    • @snygg1993
      @snygg1993 หลายเดือนก่อน

      @@vibaj16 It is "\1+" without "?"
      "\1" is a reference to whatever the "first set of parentheses" matches. In this case, there is only one set of parenthesis, so there is no \2 and so on.
      When "(..+?)" matches "11", then "\1" is matching "11".
      The "+" repeats "\1" and therefore matches "11" as often as possible.
      ^(..+?)\1+$
      The regex takes the smallest [i.e. "?"] sequence of repeating characters [i.e. "..+"] at the beginning [i.e. "^"] of the string [together "^(..+?)"] and looks if the found beginning [in this case "11"] is repeated [i.e. "\1"] over and over again [i.e. "+"] until exactly the end of the string (i.e. "$").
      If it is not, then, "11" is not the smallest sequence that can be matched by "the parenthesis" and also fulfilling the entire regex, but maybe "111" is the smallest sequence. If not, then maybe "1111" is and so on.

    • @vibaj16
      @vibaj16 หลายเดือนก่อน

      ​@@snygg1993 thanks. I think I understand now. The exact meaning of "\1" wasn't explained very well in the video, nor by the first reply to my question.

  • @nustada
    @nustada หลายเดือนก่อน +1

    The reason we use strings rather than sentences, is because it can be any combination of characters. A string could be a book, or might just be random letters. Sentences infer grammar, punctuation and scope.

  • @electricketchup
    @electricketchup หลายเดือนก่อน

    This is by far your best video yet. By the way, the second incantation can be simplified to the string "::1"

  • @jbrecken
    @jbrecken หลายเดือนก่อน +9

    That scene in another dimension was right out of "Planet of the Prime Matts."

    • @denelson83
      @denelson83 หลายเดือนก่อน +3

      Uh, 314159, as in asteroid 314159 Mattparker, is prime.

  • @Fabio292
    @Fabio292 หลายเดือนก่อน +7

    As my high school programming teacher said: "If you have a problem and you want to solve it using regex, you now have two problems"

    • @FirstLast-gw5mg
      @FirstLast-gw5mg หลายเดือนก่อน +1

      In this case, the second problem of getting the opposite of what you wanted could also be solved using regex.
      ^(?!.?$|(..+?)\1+$)

  • @fishHater
    @fishHater หลายเดือนก่อน +16

    Also did he just refer to our beloved pipe as vertical line

    • @danielhoolihan
      @danielhoolihan หลายเดือนก่อน +2

      and the caret as "up arrow" :(

    • @swankeepers
      @swankeepers หลายเดือนก่อน

      and "string" as "sentence"!

  • @EDoyl
    @EDoyl หลายเดือนก่อน +1

    My favorite type of thing in the entire world is anything which is all three of:
    1) funny
    2) extremely clever
    3) extremely stupid

  • @DanWills
    @DanWills หลายเดือนก่อน

    Totally fascinating! I love regex and this is such a cool example! Thanks so much for documenting and explaining your expedition to the prime realm Matt!

  • @pg_pete
    @pg_pete หลายเดือนก่อน +5

    Wildcards are OutOfMemoryExceptions waiting to happen

    • @nathanjohnson9715
      @nathanjohnson9715 หลายเดือนก่อน +2

      True story! I'm a pentester, and one of the things we always look for is dangerous regex or "redos" vulnerabilities in code we're testing.

  • @kea2878
    @kea2878 หลายเดือนก่อน +4

    Spooky that the QR code at 9:19 has a ghost in the middle.

  • @arithex
    @arithex หลายเดือนก่อน +6

    wow just 1 line of code! that must be really fast, right ... right?

    • @LexLissauer
      @LexLissauer หลายเดือนก่อน

      😂
      It will be fast, as it is only applicable to relatively small numbers.

  • @thetaphi
    @thetaphi หลายเดือนก่อน

    Someone had a looot of fun literally throwing numbers at you. Excellent film making.

  • @peelaza
    @peelaza หลายเดือนก่อน

    what an amazing cinematic treat at the end! thanks for the vid!

  • @privacyvalued4134
    @privacyvalued4134 หลายเดือนก่อน +7

    17:15 Who wants to play the Matt Parker platformer video game now?

    • @Saleca
      @Saleca หลายเดือนก่อน

      Prime runner

  • @thomasdalton1508
    @thomasdalton1508 หลายเดือนก่อน +35

    It isn't anything to do with the Sieve of Eratosthenes. It's just brute force factorisation. A sieve is an algorithm to produce all primes. This is an algorithm to check if a number is a prime. Totally different things.

    • @wyattsheffield6130
      @wyattsheffield6130 หลายเดือนก่อน +5

      You can also use a regex to generate any/all valid strings in the language defined by that regex! I'd say this counts (pun intended) 😉

    • @jotch_7627
      @jotch_7627 หลายเดือนก่อน +1

      ​@@wyattsheffield6130this is true in this instance (not always with modern "regex"), but this grammar actually describes strings with non-prime lengths, so its about as useful at finding primes as the multiplication button on your calculator :)

    • @thomasdalton1508
      @thomasdalton1508 หลายเดือนก่อน +8

      @wyattsheffield6130 But it wouldn't be the Sieve of Eratosthenes, since it would be checking each number separately. The Sieve of Eratosthenes uses the previous steps to make it more efficient. You don't check for divisibility by 4 because you already removed 4 when checking divisibility by 2. And by checking every number for divisibility by each number at once, you don't have to do any division and can just do a single addition for each number. It's a specific algorithm that is intelligently designed to be efficient. It's very different to just brute force checking each number.

    • @koalanefelibato4365
      @koalanefelibato4365 หลายเดือนก่อน +3

      "it's just brute force factorisation" uhm, yeah, that's what the sieve does...

    • @eritain
      @eritain หลายเดือนก่อน +6

      Saying it's "nothing to do with" the sieve is going too far though. If you used this (non-regular) expression in the generative direction, to generate all the composite numbers, it would generate all the multiples of 2, then all the multiples of 3, then yes admittedly all the multiples of 4, then 5, then 6, etc. It's not the sieve proper but anyone can see the resemblance.

  • @JerryFlowersIII
    @JerryFlowersIII หลายเดือนก่อน +7

    00:30 I like how you can see Matt's Matte.

  • @sheepphic
    @sheepphic หลายเดือนก่อน +2

    This isn't actually the Sieve of Eratosthenes, it's just trial division. The thing that makes the SoE powerful is you don't ecer check crossed-out numbers, meaning you're only checking if *primes* are a factor of n, but this does no such thing, and so is much slower.