Python Regular Expressions - Computerphile

  • Published on 11 Jan 2024
  • Continuing the exploration of Regular Expressions and Automata with Professor Thorsten Altenkirch.
    The professor's code: bit.ly/C_PythonRegEx
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharanblog.com
    Thank you to Jane Street for their support of this channel. Learn more: www.janestreet.com

Comments • 103

  • @mokopa
    @mokopa 5 months ago +69

    The way in which Prof Altenkirch talks during this video is almost indistinguishable from the conversation a lone programmer would have with himself out loud during an intense programming session. If anyone ever wondered what programmers sound like when there's no-one to hear them code, this is pretty much it.

  • @RagHelen
    @RagHelen 5 months ago +156

    It never came to me that the highest level of Pythonic is to lean back 40 degrees with your upper body while writing Python.

    • @DrGreenGiant
      @DrGreenGiant 5 months ago +29

      I do this too. Well, specifically I alternate between ultimate recline and ultimate shrimp lol

    • @innocentsmith6091
      @innocentsmith6091 5 months ago +7

      I find being curled up in a fetal position to be most Pythonic

    • @kylek29
      @kylek29 5 months ago +7

      Don't overlook the flower shirt, which is part of the proposed PEP9 spec.

    • @RagHelen
      @RagHelen 5 months ago

      @@kylek29 That trumps proper whitespaces.

    • @MethodOverRide
      @MethodOverRide 5 months ago +1

      @@DrGreenGiant I feel so seen right now 😂

  • @jens256
    @jens256 5 months ago +51

    So this is not an explanation of Regular Expressions as a tool that we might use on a daily basis. This is the theoretical basis for how REs are implemented. It's stuff covered in undergrad CS, and one of the next lessons is making your own parser for your own programming language.

  • @jackerylel
    @jackerylel 5 months ago +38

    The number of times he wrote return as retrun makes this video super relatable and gives me hope

    • @liamchisari2191
      @liamchisari2191 2 months ago

      You've got this bro 💪

  • @BigJonYT
    @BigJonYT 5 months ago +34

    This feels like they edited a 2 hour video, which explained everything, down to 20 minutes :-)

  • @frederico-kluser
    @frederico-kluser 5 months ago +7

    I love watching this guy, he looks like some kind of ancient sage like a programming Gandalf

  • @bentationfunkiloglio
    @bentationfunkiloglio 5 months ago +2

    Prof Altenkirch vids are my favorite!

  • @Erik_The_Viking
    @Erik_The_Viking 5 months ago +1

    Another video with Thorsten? Awesome! He has a great sense of humor.

  • @pouet4608
    @pouet4608 3 days ago

    thank you so much for this series linking automata to regexes!

  • @YuTv1408
    @YuTv1408 5 months ago +4

    I love Thorsten. ❤❤ the guy is a true genius of our time.

  • @yedidiapery
    @yedidiapery 5 months ago +2

    great series! i really liked the implementation style

  • @arthurdent8086
    @arthurdent8086 5 months ago

    I've always focused on just getting the re to serve some purpose. That's why I got an engineering degree, not a mathematics degree. But this helps me appreciate just how much math is under the hood (bonnet?)! It's fun listening to Prof. Altenkirch, even though the machine code episode was more my speed

  • @MangoNutella
    @MangoNutella 5 months ago +22

    Just watching because of Prof. Altenkirch

    • @jalsiddharth
      @jalsiddharth 5 months ago +1

      Thorsten is amazing, but beyond all of that, a super kind person. Can't believe I've had the chance to interact with him in real life multiple times.

    • @Ryan_Hokanson
      @Ryan_Hokanson 5 months ago

      Sure seems like a nice, verysmart guy.
      I too watch purely for the fun of (almost~certainly) knowing that if you understand the arcane _jargon_ he is speaking then the _concepts_ are surely simply simple.
      LSS: Dude... you had me at "Non-Deterministic Automata"

  • @aram5642
    @aram5642 5 months ago +8

    Is he talking about Regularexprechen?

  • @ivarkrabol
    @ivarkrabol 5 months ago +5

    Right back to it, I see! There's an error in the example with the alternating a's and b's at 0:59. It should not have a "+" (meaning "or") in between "(ab)*" and "(a+ε)"

    • @phizc
      @phizc 5 months ago +2

      The green text on the screen for the challenge in the end (20:41) is also incorrect. The one in the comment on the monitor is the correct one.
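To make the 0:59 correction in this thread concrete, here is a minimal sketch using Python's built-in `re` module (not the professor's own code). It assumes the textbook "+" is written as `|` and that "(a+ε)" can be written as `a?`; the variable names are made up for illustration.

```python
import re

# Textbook notation uses "+" for "or"; Python's re module writes it as "|".
# "(a + ε)" (an "a" or the empty word) is expressed here as "a?".
with_extra_plus = re.compile(r"(ab)*|a?")  # (ab)* + (a + ε), the version with the stray "+"
concatenated = re.compile(r"(ab)*a?")      # (ab)*(a + ε), the corrected form

for word in ["", "a", "ab", "aba", "ababa", "abb"]:
    print(repr(word),
          with_extra_plus.fullmatch(word) is not None,
          concatenated.fullmatch(word) is not None)
```

With the stray "+", a word such as "aba" is rejected, because it is neither a repetition of "ab" nor a single optional "a"; the concatenated form accepts it.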

  • @velho6298
    @velho6298 5 months ago +9

    I think Sean has gotten a free university degree from making these videos. I hope he gets his diploma at some point

  • @syjwg
    @syjwg 5 months ago +7

    Unit testing should also know the expected result. There should be some instant feedback instead of someone checking and saying, "Yep, that test case must return false, so it's okay".

    • @DavidLindes
      @DavidLindes 5 months ago +3

      Yeah... this definitely falls short of proper unit testing. Perhaps Prof. Altenkirch needs to learn to behave... or rather, to learn _behave_ -- the Python implementation of the Cucumber conception of unit testing. :)
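The point about tests carrying their own expected results could look roughly like the sketch below. It is only an illustration: the matcher is Python's built-in `re` module standing in for the code from the video, and `CASES`, `accepts`, and `failing_cases` are hypothetical names.

```python
import re

# Stand-in matcher (assumption): alternating a's and b's, written for Python's re module.
ALTERNATING = re.compile(r"(ab)*a?")

def accepts(word):
    """Return True if the whole word matches the pattern."""
    return ALTERNATING.fullmatch(word) is not None

# Each test case states the input together with the expected answer.
CASES = [
    ("", True), ("a", True), ("ab", True), ("aba", True),
    ("aa", False), ("ba", False), ("abb", False),
]

def failing_cases(matcher):
    """Return every (word, expected) pair on which the matcher disagrees."""
    return [(word, expected) for word, expected in CASES
            if matcher(word) != expected]

print(failing_cases(accepts))  # an empty list means every case passed
```

The same table of cases could be fed to a framework such as `unittest` or `pytest`; the key idea is that nobody has to eyeball the output to decide whether a result is right.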

  • @phizc
    @phizc 5 months ago +8

    It would have saved me a lot of time if the mystery regex had been shown correctly on the screen and the paper (from 20:38). The correct regex (with sensible symbols) is /(0|11|10(00|1)*01)*/.
    After I had wasted much time on the incorrect one, I drew the AST from the Python code in a notebook since I didn't trust the comment above it. The comment was correct as it turned out.
    Here are the first matches:
    0 : 0
    11 : 3 (11)
    110 : 6 (11 0)
    1001 : 9 (10 01)
    1100 : 12 (11 0 0)
    1111 : 15 (11 11)
    10010 : 18 (10 01 0)
    10101 : 21 (10 1 01)
    11000 : 24 (11 0 0 0)
    11011 : 27 (11 0 11)
    11110 : 30 (11 11 0)
    I think I can see the pattern.
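The list above can be reproduced with a few lines of Python. This is a sketch using the built-in `re` module rather than the NFA code from the video; `DIV3` and `first_matches` are illustrative names.

```python
import re

# The regex quoted in the comment above, in Python's re syntax.
DIV3 = re.compile(r"(0|11|10(00|1)*01)*")

def first_matches(limit):
    """Return (binary string, value) pairs for n in 0..limit-1 whose binary form matches."""
    return [(format(n, "b"), n)
            for n in range(limit)
            if DIV3.fullmatch(format(n, "b")) is not None]

for bits, value in first_matches(31):
    print(f"{bits} : {value}")
```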

    • @tunafllsh
      @tunafllsh 5 months ago +1

      Ah that's the classical division by 3 test.

    • @babyeatingpsychopath
      @babyeatingpsychopath 5 months ago +1

      The extra credit question on my DFA exam in college was "write a DFA that can determine if a binary number is divisible by 3. Hint, it has 3 states." I recognized the regex instantly.

    • @jursamaj
      @jursamaj 5 months ago

      @@babyeatingpsychopath Technically, shouldn't there be 4 states? The empty string is not "a binary number divisible by 3", but it *is* your start state. Then you have 3 states for the 3 remainders modulo 3.

    • @babyeatingpsychopath
      @babyeatingpsychopath 5 months ago

      @@jursamaj Technically, I believe you're correct; I suspect the actual instructions specified a nonzero-length binary string. It's been a couple of decades since that exam.
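The 3-state machine discussed in this thread can be sketched as a small transition table. The idea is that reading a bit b turns the value n read so far into 2n + b, so only the remainder modulo 3 needs to be remembered. The names `TRANSITIONS` and `divisible_by_3` are illustrative, and the `allow_empty` flag is an assumption added here to model the "fourth state" point about the empty string.

```python
# States are the remainders 0, 1, 2 of the number read so far, modulo 3.
# Appending bit b changes the value n to 2*n + b, so remainder r becomes (2*r + b) % 3.
TRANSITIONS = {(r, b): (2 * r + int(b)) % 3 for r in range(3) for b in "01"}

def divisible_by_3(bits, allow_empty=True):
    """Run the DFA over a string of '0'/'1' characters.

    With allow_empty=False the empty string is rejected, which corresponds
    to adding a separate, non-accepting start state (four states in total).
    """
    if not bits and not allow_empty:
        return False
    state = 0  # remainder 0 is both the start state and the accepting state
    for b in bits:
        state = TRANSITIONS[(state, b)]
    return state == 0

print(divisible_by_3("1001"))   # 9
print(divisible_by_3("1010"))   # 10
```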

  • @zamf
    @zamf 5 months ago +3

    The RE at the end seems to detect binary numbers that are multiples of 3. However, I'm not sure how it checks this.

    • @ThorstenAltenkirch
      @ThorstenAltenkirch 5 months ago +4

      Yes, this is correct. 🎉 Maybe I should do another video on how to construct it.

    • @zamf
      @zamf 5 months ago +1

      @@ThorstenAltenkirch Definitely! And to explain what property of binary numbers you're using in this case.

  • @Sharaton
    @Sharaton 5 months ago

    That final regular expression is just a reformulation of (1(01*0)*1+0)*, obviously.
    Now, which is better and why is the real question.
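One way to gain confidence in the claimed equivalence is a brute-force comparison over all short binary strings. The sketch below uses Python's `re` module, writing the textbook "+" as `|`; the second pattern is the /(0|11|10(00|1)*01)*/ form quoted elsewhere in these comments, and `disagreements` is an illustrative name. Agreement up to a length bound is evidence, not a proof.

```python
import re
from itertools import product

A = re.compile(r"(1(01*0)*1|0)*")        # the form given in this comment
B = re.compile(r"(0|11|10(00|1)*01)*")   # the form quoted elsewhere in the comments

def disagreements(max_len):
    """List every binary string up to max_len on which the two patterns differ."""
    found = []
    for length in range(max_len + 1):
        for bits in product("01", repeat=length):
            word = "".join(bits)
            if (A.fullmatch(word) is None) != (B.fullmatch(word) is None):
                found.append(word)
    return found

print(disagreements(10))  # an empty list means no difference was found
```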

  • @sargismartirosyan9946
    @sargismartirosyan9946 5 months ago +1

    Very interesting and recommended gold mine Channel 🎉😊

  • @pylang3803
    @pylang3803 5 months ago +1

    What makes this project particular to "Python" Regular Expressions? Cause re exists and you could roll your own regex in most languages.
    Aren't you just making regexes (in some language, that happens to be python)?

  • @mountp1391
    @mountp1391 5 months ago

    Thank you

  • @wpherigo1
    @wpherigo1 5 months ago

    I have no idea how he did that. I wonder how many times I have to watch it to figure out what he did?

  • @noclafcz
    @noclafcz 5 months ago +1

    I had a problem, so I used regular expressions. Now I have two problems.

  • @IIARROWS
    @IIARROWS 5 months ago +20

    Halfway through and I don't understand what's happening...
    What's the point? Why is Python important for this?

    • @piyh3962
      @piyh3962 5 months ago +7

      Yeah, I feel like there's a bunch of missing context

    • @redjr242
      @redjr242 5 months ago +12

      He's implementing regex in terms of an NFA network, in code. He wrote Python code for running any NFA network in a previous video. That code will also work on the regex-specific NFA networks he's constructing in this video. He chose Python because it had to be in some programming language, and Python is easier to write than C, for example.
      Edit: but yeah as others have said, the title of this video suggests it's about python's re regex library, or just regex in general, which could confuse viewers :/

    • @DrGreenGiant
      @DrGreenGiant 5 months ago +5

      This is part of a series which is in a playlist. Hopefully that helps you understand the context if you've not seen the previous videos!

    • @vwtype411
      @vwtype411 5 months ago

      YouTube prompting this episode out of the blue. I also bailed halfway through.

    • @Loki-
      @Loki- 6 days ago +1

      This is the university experience. Just throw in you doing a homework assignment for this now, then studying for the exam on it. Somehow by the end you'll look back and realize you absorbed some amount of it because you know more than before you began.
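For readers missing the context described in this thread, the "regex as an NFA" idea can be sketched in a few dozen lines. This is not the professor's code (that is linked in the video description); it is a generic Thompson-style construction, and the names `NFA`, `sym`, `eps`, `seq`, `alt`, and `star` are all made up for this illustration.

```python
from itertools import count

_fresh = count()  # source of unique state numbers


class NFA:
    """An NFA fragment with one start state and one accepting state."""

    def __init__(self, start, accept, edges):
        self.start = start
        self.accept = accept
        self.edges = edges  # list of (source, symbol, target); symbol None is an ε-move

    def _closure(self, states):
        """Add every state reachable through ε-moves alone."""
        seen, todo = set(states), list(states)
        while todo:
            state = todo.pop()
            for src, symbol, dst in self.edges:
                if src == state and symbol is None and dst not in seen:
                    seen.add(dst)
                    todo.append(dst)
        return seen

    def accepts(self, word):
        """Track the set of possible states while reading the word."""
        current = self._closure({self.start})
        for ch in word:
            step = {dst for src, symbol, dst in self.edges
                    if src in current and symbol == ch}
            current = self._closure(step)
        return self.accept in current


def sym(c):                      # a single character
    s, a = next(_fresh), next(_fresh)
    return NFA(s, a, [(s, c, a)])

def eps():                       # the empty word ε
    s, a = next(_fresh), next(_fresh)
    return NFA(s, a, [(s, None, a)])

def seq(x, y):                   # x followed by y
    return NFA(x.start, y.accept, x.edges + y.edges + [(x.accept, None, y.start)])

def alt(x, y):                   # x or y (the textbook "+")
    s, a = next(_fresh), next(_fresh)
    return NFA(s, a, x.edges + y.edges + [
        (s, None, x.start), (s, None, y.start),
        (x.accept, None, a), (y.accept, None, a)])

def star(x):                     # zero or more repetitions of x
    s, a = next(_fresh), next(_fresh)
    return NFA(s, a, x.edges + [
        (s, None, x.start), (s, None, a),
        (x.accept, None, x.start), (x.accept, None, a)])


# (ab)*(a + ε): build each sub-fragment fresh, since fragments must not share states.
alternating = seq(star(seq(sym("a"), sym("b"))), alt(sym("a"), eps()))
print(alternating.accepts("aba"), alternating.accepts("abb"))
```

Each regex operator becomes a function that wires smaller NFAs together with ε-moves, and matching is just running the combined NFA on the input.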

  • @Masheeable
    @Masheeable 5 months ago

    Prof Altenkirch is definitely the Alien played by Jemaine Clement in Men in Black. It's so obvious he's a Supermax-escapee from the other side of the moon bent on taking over the world.

  • @aounhaider8335
    @aounhaider8335 5 months ago

    Also upload videos on compiler design.

  • @delmonti
    @delmonti 5 months ago +1

    ....I have no idea what I've just watched.

  • @djhoese
    @djhoese 5 months ago +14

    Please change the title of this video. Python is not the important part. People are going to search for help with using regular expressions in Python (import re) and find this video which is not going to help.

    • @BaronFirespawn
      @BaronFirespawn 5 months ago +2

      If you're watching Computerphile videos and expecting tutorials, you're already in the wrong place.

    • @djhoese
      @djhoese 5 months ago

      @@BaronFirespawn Someone who doesn't know computerphile would only see a search result for a popular channel. Regardless, the "Python" in this video had nothing to do with the concept being shown.

  • @HarishNarayanan
    @HarishNarayanan 3 months ago

    Real life Erlich Bachman.

  • @unclerojelio6320
    @unclerojelio6320 5 months ago +2

    Which code editor is he using?

    • @DrGreenGiant
      @DrGreenGiant 5 months ago +3

      One without a code formatter lol

    • @Imperial_Squid
      @Imperial_Squid 5 months ago +1

      That "In [n]"/"Out [n]" in the console reminds me of Spyder, but I could be wrong

    • @DavidLindes
      @DavidLindes 5 months ago +1

      @@Imperial_Squid well, the title of that window (e.g. at 3:25) says "IPython Console"... you'll also see that sort of output in Jupyter Notebooks... I think a lot of these things are interconnected in some manner, though I don't know the details.

    • @KushagraJuneja
      @KushagraJuneja 5 months ago

      probably Jupyter

  • @4984christian
    @4984christian 5 months ago

    What about the Fall of Rome where the European people invaded Rome because they were pushed from the east?

  • @mytech6779
    @mytech6779 5 months ago +2

    Nice try, but you're not going to unconfuse me that easily!

  • @nathanaelsmith3553
    @nathanaelsmith3553 5 months ago +8

    Love regex - hate Python (syntax) - useful libraries though.

    • @XenoTravis
      @XenoTravis 5 months ago

      If Python had better syntax and kept the ease of use it would be so nice!
      If I can remember the weird quirks in Python it is so nice to use

    • @nathanaelsmith3553
      @nathanaelsmith3553 5 months ago +3

      @@XenoTravis it's like a cross between BBC Basic from the 1980s and JavaScript, without any curly braces. And 'elif' - seriously? That just looks like a typo. Deep and shallow copies? Grrrr! Probably all straightforward if you start out learning to code in Python but really annoying if you are already familiar with other languages. But the libraries are useful.
      Oh and why is there a while but not a do ... while ?

    • @RedHair651
      @RedHair651 5 months ago

      @@XenoTravis What would you improve about the syntax?

    • @XenoTravis
      @XenoTravis 5 months ago

      @@RedHair651 haven't thought about what I would do to improve it tbh. Sometimes it seems a little bit bare and I have to assume a lot of things.
      I would change the for loop or at least add in the classic C syntax along with their style. But that is just me being so used to the 'normal' way

    • @halfsourlizard9319
      @halfsourlizard9319 5 months ago

      The syntax is what you hate? There are way better reasons to hate Python: Mutability. Lack of a type system. Incorrectly / inconsistently-implemented scoping.

  • @xxtradamxx
    @xxtradamxx 5 months ago

    If you have 2 years of a comp sci BSc, you will understand this, but then it's not useful for you anymore, well...

  • @aquaast4571
    @aquaast4571 5 months ago

    why does he look like an older version of brad pitt as benjamin button

  • @Anvilshock
    @Anvilshock 5 months ago +7

    Python? Oh, you mean Peißn! Yes, hörd of it.

    • @bogdanstamenic2836
      @bogdanstamenic2836 5 months ago +1

      "My English is not ze yellow from ze egg"

  • @DavidvanDeijk
    @DavidvanDeijk 5 months ago

    e1 = /b?(ab)*a?/

  • @THERODRIGOoriginal
    @THERODRIGOoriginal 5 months ago

    0?00

  • @madplayer5
    @madplayer5 4 months ago +1

    Like if you did not understand a sh*t

  • @Tyler_0_
    @Tyler_0_ 5 months ago

    Too disjointed to understand, unfortunately; gave up at 15 min.

  • @buttermilk_pie
    @buttermilk_pie 5 months ago

    No clue wth is going on here. Thought I was about to listen to a real smart man talk about using Regular Expressions…

  • @gllizzzy
    @gllizzzy 5 months ago +7

    import re already

    • @DrGreenGiant
      @DrGreenGiant 5 months ago +6

      That really doesn't explain how RE works, which is the whole point of this video

    • @DavidLindes
      @DavidLindes 5 months ago +2

      @@DrGreenGiant very much agreed. I wish more was done to draw attention to that distinction, though -- not to mention the fact that he's using a completely different syntax for his regular expressions than anything that's common in the UNIX [and similar] landscape... It's useful stuff if you're thinking about the abstract ideas of it all, but quite disconnected from practical everyday usage of existing implementations. 😕

    • @halfsourlizard9319
      @halfsourlizard9319 5 months ago

      @@DavidLindes Syntax isn't interesting. It's just arbitrary convention. The fundamental ideas are what matter ... the rest is just implementation details.

    • @DavidLindes
      @DavidLindes 5 months ago

      @@halfsourlizard9319 Syntax may not be "interesting" to the theory, but it's critically important to actually doing anything useful with actual computers. Also, it can be _very_ interesting -- see the IOCCC.

    • @halfsourlizard9319
      @halfsourlizard9319 5 months ago

      @@DavidLindes Knowing the theory behind regexps is the interesting bit; there are references (or LLMs) for the derpy idiosyncrasies of the various flavours.

  • @lorenzobolis5166
    @lorenzobolis5166 5 months ago

    Please write better unit tests than he does

  • @misterhat5823
    @misterhat5823 5 months ago +1

    Sorry, I can't learn from this dude.

  • @davt8355
    @davt8355 5 months ago

    Only Rust

    • @RedHair651
      @RedHair651 5 months ago +2

      I don't understand why Rust has so much success

    • @davt8355
      @davt8355 5 months ago

      @@RedHair651 A better version of C++, what else do you want? Speed and security. Top-notch for today's needs.

    • @BruceGrembowski
      @BruceGrembowski 5 months ago

      @@RedHair651 because rust never sleeps.

  • @ianlawson94
    @ianlawson94 5 months ago

    First