Same Story, Different Notation - Computerphile

แชร์
ฝัง
  • เผยแพร่เมื่อ 10 ก.พ. 2025
  • Finite State Automata meets Recursion. Professor Brailsford continues the story of computers without memory.
    State Machines versus Chomsky Type 3
    Chomsky's Hierarchy: • Chomsky Hierarchy - Co...
    Finite State Automata: • Computers Without Memo...
    3D Rock Art Scanner: • 3D Rock Art Scanner - ...
    AI Safety: • AI Safety - Computerphile
    The Professor's Notes: bit.ly/computer...
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscom...
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

ความคิดเห็น • 130

  • @MorRobots
    @MorRobots 9 ปีที่แล้ว +149

    Professor Brailsford is by far my favorite Computerphile presenter.

    • @sourcecode6467
      @sourcecode6467 9 ปีที่แล้ว +4

      +MorRobots Yes, I totally agree. I always look forward to each one of his videos.

    • @SUCACU
      @SUCACU 9 ปีที่แล้ว +2

      +MorRobots What about Rob Miles, he's on my top list to be honest :D

    • @sourcecode6467
      @sourcecode6467 9 ปีที่แล้ว +2

      Yes, he's a bright young guy. Personally not my favourite, but great nonetheless. I like them all.

    • @Zaurthur
      @Zaurthur 9 ปีที่แล้ว

      definitely in my top ten

    • @MorRobots
      @MorRobots 9 ปีที่แล้ว +2

      Rob Miles is like the "Im not saying its Ai, but AI!" of computerphile lol love his work on AI

  • @Luketg8
    @Luketg8 8 ปีที่แล้ว +5

    Only recently had to cancel my place at the University of Nottingham for MSc Computer Science by Study as I had gotten a very attractive job offer, after discovering these videos I have realised that it will be a huge shame that I won't get to learn from great professors like Brailsford, I absolutely love these videos!

  • @braydentidwell
    @braydentidwell 9 ปีที่แล้ว +28

    Brings me back. Automaton, Grammars, and Languages was one of my favorite college courses. It really gives you a different perspective on the world of computer science.

    • @AntonAdelson
      @AntonAdelson 9 ปีที่แล้ว

      +Brayden Tidwell My uni forgot them completely :(

    • @fuelofmil0
      @fuelofmil0 9 ปีที่แล้ว +2

      +Brayden Tidwell Going to start my first automata course in some weeks. It's gonna be so much fun!

    • @Triantalex
      @Triantalex 2 หลายเดือนก่อน

      ok?

  • @mheermance
    @mheermance 9 ปีที่แล้ว +18

    I wish this professor taught my Automata class as he's engaging and very clear in his explanations.

    • @dariokartal9453
      @dariokartal9453 4 ปีที่แล้ว

      Yeah. Mine was overbearing (every time in class, he'd pick a student and dress him down in front of the hundred-strong audience - happened to me once) and was notorious for, as the rumor had it, having invented the Internet. I flunked his class twice. Don't matter, still probably the best class I ever took. The third time was the charm.

    • @Triantalex
      @Triantalex 2 หลายเดือนก่อน

      ok?

  • @WfrArcPol
    @WfrArcPol 6 ปีที่แล้ว +39

    Me in class learning about FSA: I don't know WHAT the hell I'm doing, this is so boring
    Me at 2AM watching video about said FSA: I don't need sleep, it is Time To Learn™

    • @Triantalex
      @Triantalex 2 หลายเดือนก่อน

      ok?

  • @chrissteward5435
    @chrissteward5435 9 ปีที่แล้ว +120

    I don't believe it, A TH-cam vid just gave me homework....

    • @AmeshaSpentaArmaiti
      @AmeshaSpentaArmaiti 9 ปีที่แล้ว +3

      +Chris Steward don't forget to do your chores, and TAKE OUT THE TRASH!!

    • @AntonAdelson
      @AntonAdelson 9 ปีที่แล้ว +1

      +Chris Steward Haha, yeah! I'm confused, a little bit excited, and unsure. Like going out on a first date ever.

    • @KilgoreOnDrugs
      @KilgoreOnDrugs 9 ปีที่แล้ว +1

      +Chris Steward ahh ... Compilers course :)

    • @KainusGulch
      @KainusGulch 9 ปีที่แล้ว

      +Chris Steward Fortunately, we don't have to turn it in if we don't want to. I want to make a TH-cam channel some time, but not about computer thingies specifically. Unless it's gaming, maybe.

    • @Triantalex
      @Triantalex 2 หลายเดือนก่อน

      ok?

  • @2Cerealbox
    @2Cerealbox 9 ปีที่แล้ว

    I absolutely love this series! I hope you continue with this for a long time.

  • @datboi_gee
    @datboi_gee 5 ปีที่แล้ว +8

    "In order to get to sentence, we first must take an 'L.'" Everybody has to take an L sometimes, professor. I feel you.

  • @realraven2000
    @realraven2000 9 ปีที่แล้ว +1

    and obviously we need to accept commas as end markers so we can declare a list of identifiers. In Javascript this is also a performance optimisation.

  • @umchoyka
    @umchoyka 9 ปีที่แล้ว +1

    Quite a switch-up on the film style for this one. I think I like this way better than the constant out-of-focus thing, but the lighting was a bit off. Interesting though, keep experimenting! :D

  • @oafkad
    @oafkad 9 ปีที่แล้ว +4

    I'm genuinely annoyed I didn't grow up in the UK just for the off chance that I'd get to take a class from Professor Brailsford.

  • @MrSlowestD16
    @MrSlowestD16 9 ปีที่แล้ว

    0:45
    I think it's important to note that recursion is ONE answer when the length isn't determinant. I simply bring this up is because recursion is a good way to solve some problems while it's a pain in the ass to solve others using it.

  • @DrRChandra
    @DrRChandra 9 ปีที่แล้ว +6

    In C, identifiers (EDIT: or at least their declarations) can be terminated by "[" (array), "," (list of identifiers of the same type), whitespace, or ";", maybe others I can't think of at the moment. But as Prof. points out, most commonly, it's a ";", for readability/understandability.

    • @ShanaTsunTsunLove
      @ShanaTsunTsunLove 9 ปีที่แล้ว +2

      +rchandraonline Well, putting thing like left square bracket as "[" is not complete because it wouldn't be C if you didn't have many decades of legacy cruft to ruin everything, see en.wikipedia.org/wiki/Digraphs_and_trigraphs#C

    • @kalleguld
      @kalleguld 9 ปีที่แล้ว +2

      +rchandraonline It is terminated by any character that's not a letter or number (or underscore).
      "int a=b+c;" contains 3 identifiers, terminated by '=', '+' and ';'

    • @DrRChandra
      @DrRChandra 9 ปีที่แล้ว +1

      Kasper Guldmann, oh, hmmm...I was thinking about declarations, not their references, but that's valid too.

    • @kalleguld
      @kalleguld 9 ปีที่แล้ว +2

      Oops, I misspoke. There are 4 identifiers, "int" is an identifier too.

    • @DrRChandra
      @DrRChandra 9 ปีที่แล้ว

      +ShanaTsunTsunLove, "[" can terminate the C identifier's name, but additionally that does not completely describe the _type_ of the identifier. In other words, at "[" you know its name and that it's an array, but not how big, which in order for the compiler to accept it, you need the size (which can be null to specify an unknown size) plus "]". It begs the question, what is an identifier? I was thinking of just a name, which is separate from its characteristics such as its type and therefore its C legality.

  • @paperclips1306
    @paperclips1306 ปีที่แล้ว

    I can hear this guy for eternity

  • @pascalfragnoud2846
    @pascalfragnoud2846 4 ปีที่แล้ว +1

    Wait, am I missing something ?
    Semicolons don't end identifiers in C (and any language that I know of, for that matter), things that are not valid identifiers characters do (including the semicolon): spaces, operator symbols...

  • @konstantinrebrov675
    @konstantinrebrov675 6 ปีที่แล้ว +1

    int _; is also a valid variable name in many programming languages, including C.

  • @TrimutiusToo
    @TrimutiusToo 9 ปีที่แล้ว +1

    End marker can be semicolon, comma or whitespace (space, tab or newline)... in C that is...
    Lets not include arrays, as it should be a bit different parsing...

  • @vcothur7
    @vcothur7 9 ปีที่แล้ว +49

    Chomsky is everywhere

    • @EtrielDevyt
      @EtrielDevyt 9 ปีที่แล้ว +16

      Chomsky is Love, Chomsky is Life.

    • @joshuawade1495
      @joshuawade1495 9 ปีที่แล้ว +3

      +EtrielDevyt Praise Chomsky.

    • @fuppetti
      @fuppetti 9 ปีที่แล้ว +3

      +Vikram Cothur
      Is it the same Chomsky who comments on politics and stuff?

    • @wierdalien1
      @wierdalien1 9 ปีที่แล้ว +4

      +Deltaexio yes

    • @SecretRaginMan
      @SecretRaginMan 9 ปีที่แล้ว +5

      +Vikram Cothur He's the Walpole of computer science.

  • @harrisonharris6988
    @harrisonharris6988 9 ปีที่แล้ว +1

    I thought a key characteristic with state machines is that they never know how they got into that state, but in the identifier example it must know how it got into that state as it must be able to work it out from the string it stored as it iterated through each character (i.e. to save the identifier name is must have concatenated a digit or a letter onto some memory string so it knows the name, however to do that it would need to have a memory to store it and so would know what caused it to reach that state, you could play it perfectly in reverse).

    • @Theraot
      @Theraot 9 ปีที่แล้ว +2

      +Harrison Harris
      The actual constraint is that the ammount of memory used to represent the state doesn't grow - that implies that the number of possible states is finite - hence the name "finite state machine". And that disallows things as concatenating, or keeping track of the path to get to the current state.
      Yet there is no need to concatenate. Not with this state machine.
      ---
      The state machine is used by passing all the characters of the string to it, until:
      A) you reach the end state, meaning that the string is valid.
      B) you reach the end of the string, in which case the string is invalid.
      So, no it is not concatenating the string.
      ---
      But how do you take some source code and get what is what?
      Notice that this grammar is useless to implement a programming language - since it only defines an indentifier. While you may try to implement a language with a set regular expressions[1] and some inference rules[2]... In practice, what is actually done is to create a single (context free) grammar for the whole language - this is easier to maintain.
      [1]: One for each part of the language.
      [2]: A backtracking algorithm to choose state machines that represents parts of the language. The backtracking algorithms "knows" how it got into the current state - it is adding it to a stack. This means that it is not a finite state machine, and it also means that regular expressions alone are not sufficient to parse a programming language.
      You may have been warned to not use regular expressions to parse C, XML, Java, C# or anything like that, now you have an insight of why.
      Note: all regular expressions are context free grammar, the inverse doesn't hold.
      ---
      Once we go beyond regular expressions, - I hope - we will see that you can mark some parts of the grammar to emit tokens. That is done by cutting the string by the current position and emitting an object "token" with that string and tagged with the part of the grammar that produced it.
      In that case, you only need to keep track of you current position in the string, which can arguably be considered part of the state. The tag is what state it was when the token is emitted.
      That way you can have a grammar that takes "foo + bar" and emits {"identifier" token "foo", "operator" token "+", "identifier" token "bar"} - further processing is needed to interpret the token sequence, this is often done by building a tree, in this case it will have "foo" and "bar" as childs of the addition. Then you can walk the tree, doing the operations as needed to execute the code [or to emit the code in another language, such as machine language].
      It is worth noting that if you use only postfix notation (ie, "foo bar +") you don't need to build the tree (you would use the stack instead), meaning that is easier to implement the interpreter / compiler.
      ---
      Also note that modern compilers - even if they are parsing a context free language - may add rules that are not from the grammar itself. These rules are used to emmit warnings and other code analysis - such as cheking if you are not using a variable. Describing such rules can't be done using context free grammars only - if warnings were to be emitted as tokens, the grammar would be context sensitive (see Ogden's lemma).

  • @ToeCutter454
    @ToeCutter454 9 ปีที่แล้ว +3

    was I the only one impressed by the use of old dot matrix printer paper as scrap paper?

  • @12tone
    @12tone 9 ปีที่แล้ว +3

    I'm a little confused. Is this still considered a finite state automaton? The tail piece can in theory be arbitrarily long, so it has an infinite number of states it could potentially be in. Is this just a weird artifact of etymology?

    • @DS127
      @DS127 9 ปีที่แล้ว

      +12tone The *output* can be arbitrarily long, but here "state" is refering to what's going on "inside" the machine. The state machine only has one state at a time.

    • @12tone
      @12tone 9 ปีที่แล้ว +1

      Makes sense. Thanks!

    • @SimGunther
      @SimGunther 4 ปีที่แล้ว

      Spotted a wild popular youtuber! :D

  • @AlejandroMartinez4
    @AlejandroMartinez4 9 ปีที่แล้ว

    I'm liking this series ^^

  • @vicplichota
    @vicplichota 9 ปีที่แล้ว

    Looking forward to the next vid in this topic... :-)

  • @AmeerFazal
    @AmeerFazal 9 ปีที่แล้ว +3

    You are my Bro!
    :)

  • @TheWyrdSmythe
    @TheWyrdSmythe 9 ปีที่แล้ว

    I always loved State Engines! So handy for any kind of serial input.
    Are you taking this series all the way to recursive descent parsing? :)

  • @ropersonline
    @ropersonline 3 ปีที่แล้ว

    It's a little annoying that the computerphile video titles, such as appear at 0:50 here, are generally different from the actual TH-cam video titles.

  • @ariebrons7976
    @ariebrons7976 9 ปีที่แล้ว

    if i understand this correctly:
    1. detect a variable
    2. get into a loop to definte the length of the variable
    3. if you don't detect any more signals (letters) then get out of the loop.

  • @SixB0w
    @SixB0w 2 ปีที่แล้ว

    Really interesting! Was always wondering why the semicolon was necessary. Why is it not needed in Python?

  • @squirrelbrains2197
    @squirrelbrains2197 8 ปีที่แล้ว

    i wish all my professors, were that good at explaining things.

  • @dmacmcmanus95
    @dmacmcmanus95 9 ปีที่แล้ว +3

    Are there any other TH-cam channels similar to this one?

    • @QuantumFluxable
      @QuantumFluxable 9 ปีที่แล้ว

      +Dillon McManus Sixty Symbols and Periodic Videos for physics and chemistry, respectively. Steve Mould also does some interesting science videos from time to time.
      Applied Science is kinda similar, but for engineering and general mad sciencing.

    • @josevillegas5243
      @josevillegas5243 9 ปีที่แล้ว +1

      +Dillon McManus I'll add to the list
      Vsauce
      3Blue1Brown
      Physics Girl
      Mathologer
      Mindyourdecisions
      standupmaths

    • @josevillegas5243
      @josevillegas5243 9 ปีที่แล้ว

      +Jose Villegas
      Frame of Essence
      Looking Glass Universe

    • @josevillegas5243
      @josevillegas5243 9 ปีที่แล้ว

      tggt00 are there any channels that you reccomend that are not on the lists?
      Also, just found another one: PhDComics

  • @PixelOutlaw
    @PixelOutlaw 9 ปีที่แล้ว

    People wanting to break away from the norm C++ family and try recursion should do it in a tail call safe language like Scheme. You can learn most of it in an afternoon. :) It is in the Lisp family but Common Lisp (what people currently mean when they say Lisp) prefers looping and iteration or even better, applying functions to entire collections.

  • @firstnamelastname-oy7es
    @firstnamelastname-oy7es 9 ปีที่แล้ว

    Interesting video!

  • @bernardoherrera5217
    @bernardoherrera5217 หลายเดือนก่อน

    Really great video but I cannot access the professor notes, can you get it back?

  • @73h73373r357
    @73h73373r357 9 ปีที่แล้ว

    This raises the question, how does the compiler/interpreter know that there isn't an escape character?
    For example, "int i = 0;" will generate an integer called "i" with value "0" without errors. However, "int i = 0" throws a compiler error.
    How does the compiler know to throw the error?

    • @tehlolzfactor
      @tehlolzfactor 9 ปีที่แล้ว +1

      I assume the only way the compiler knows that is because when you write int i = 0, the compiler sees that there is no termination character and assuming you wrote more code, it sees that a new statement has been made without a comma or other valid notation for having two statements. therefore the compiler resolves that you haven't placed a termination character. if there isn't more code after your initial statement. it's possible the compiler checks recursively through spaces and newlines in your code after that statement and at a certain point decides you haven't put a termination character

    • @menachemsalomon
      @menachemsalomon 9 ปีที่แล้ว +1

      The C standard comes with a grammar for the language, in a format similar to, but somewhat more compact than, the grammar given by Professor B. In your scenario, with the missing semicolon, the code does not become a "sentence" (to use Chomsky's terminology).

    • @73h73373r357
      @73h73373r357 9 ปีที่แล้ว

      +Menachem Salomon I understand that. But that's too abstract. I'm looking for an indepth-view of how a compiler works. (yes i've read a bit online about that) but it's generally difficult to understand and an explanation would be quite nice.

  • @kurtslagle797
    @kurtslagle797 9 ปีที่แล้ว

    I wish my CS professors explained things this clearly.

  • @RpattoYT
    @RpattoYT 9 ปีที่แล้ว +19

    I'm sorry to bring this up but I can't not mention this. Did anyone else notice that the Professor just said something to the effect of, "cumming back into to one's self" whilst poking at a phallic shape sketch he'd just drawn on some paper. Maybe it's just immature of me to pick up on such things but I couldn't help but nearly choke on my coffee.

    • @CatnamedMittens
      @CatnamedMittens 9 ปีที่แล้ว +7

      Perverted minds.

    • @RpattoYT
      @RpattoYT 9 ปีที่แล้ว

      CatnamedMittens :P

    • @TheWeepingCorpse
      @TheWeepingCorpse 9 ปีที่แล้ว

      +rpatto92 I'm proud to consider myself a raging pervert. If the authorities knew what I get up to I'd be behind bars years ago. In fact I watch most of these videos naked, but that never even entered my dirty sordid mind. P.s masturbating is a form of recursion and like the prof said it can't go on for ever.

    • @U014B
      @U014B 9 ปีที่แล้ว +1

      Which begs the question: how does he expect that to be done? 😕

    • @Triantalex
      @Triantalex 2 หลายเดือนก่อน

      ok?

  • @linkVIII
    @linkVIII 9 ปีที่แล้ว +2

    This was posted during my lecture about this today. Wasn't in the mood to watch right after lecture.

  • @Zaurthur
    @Zaurthur 9 ปีที่แล้ว

    I have a test on this sort of thing in my theory of computation class on Friday.

  • @ArnimSommer
    @ArnimSommer 9 ปีที่แล้ว

    So, with Chomsky notation I can only combine two elements?
    E.g. S -> LTE would not be valid?

  • @Arganoid
    @Arganoid 9 ปีที่แล้ว

    This refers to a previous 'car park video', but where is that video?

  • @moatl6945
    @moatl6945 9 ปีที่แล้ว

    Am I wrong, or doesn't »T->DT« violate the first rule that a »sentence« must _start_ with a letter (L)? On the other hand you won't get the semicolon on the last position, otherwise.
    Or is this just a »notation-problem«?

    • @domagojpolancec6552
      @domagojpolancec6552 9 ปีที่แล้ว

      +Martin Steindl The S -> LT production covers the "identifier must start with a letter" rule because L will always be resolved into a letter and it precedes T. The T part can be resolved either in a letter followed by another T part, a digit followed by another T part or an E part which in this case resolves into a semi colon which marks the end of the sentence. What's important to note is that S is the starting point of a sentence, i.e., we can not begin constructing a sentence from T, L, D or E.

    • @kalleguld
      @kalleguld 9 ปีที่แล้ว +2

      +Martin Steindl You must start the sentence at S. And since the only rule with S on the left side is S => LT, there's no problem.

    • @moatl6945
      @moatl6945 9 ปีที่แล้ว

      +Martin Steindl OK, I was wrong: I just overlooked, that T isn't the start of the »sentence«.

  • @philomath6190
    @philomath6190 5 ปีที่แล้ว

    "You can't keep recurring forever. You will crash. You will run out of memory."
    Wait a second. State machines don't need memory.

  • @mmxbass
    @mmxbass 9 ปีที่แล้ว

    "Any string of symbols A and B where there are equal numbers of A and B." It's funny how simple of a problem requires so much power. (Cannot be solved in the general case without an infinite stack machine.)

  • @D12golden
    @D12golden 9 ปีที่แล้ว

    I am graduated in Computer Science and I am still getting handouts from a professor.

  • @karlkastor
    @karlkastor 9 ปีที่แล้ว +1

    This reminds me of the Haskell programming language.

  • @midnightrizer
    @midnightrizer 7 ปีที่แล้ว

    Syntax is everything.

  • @peppybocan
    @peppybocan 9 ปีที่แล้ว +2

    1:03 '"... and those of you who are from electronics engineering department" ... and those from Mathematics dept. and from Computer Science dept. have no power here. :D

  • @TechXSoftware
    @TechXSoftware 9 ปีที่แล้ว +2

    When UNI taught me how to draw state machine diagrams, that wasn't it, lol

    • @MrCrayfishMinecraft
      @MrCrayfishMinecraft 9 ปีที่แล้ว +2

      +TechXSoftware What Uni did you go to? Mine taught it exactly like in the video.

    • @TechXSoftware
      @TechXSoftware 9 ปีที่แล้ว

      MrCrayfish UWS, There is no start and end circles, arrows.

    • @juggernaut93
      @juggernaut93 9 ปีที่แล้ว

      +TechXSoftware It's only about notation. I was taught to draw the accept state as in this video, but the start state as a circle and an entering arrow.

  • @Oldiesyoungies
    @Oldiesyoungies 9 ปีที่แล้ว

    start up the entomology channel again :(

  • @joga_bonito_aro
    @joga_bonito_aro 9 ปีที่แล้ว

    Funny. Just one week ago i wrote an exam about this stuff in my university

  • @DirtyRobot
    @DirtyRobot 9 ปีที่แล้ว +1

    Haveaniceday;

  • @sabriath
    @sabriath 9 ปีที่แล้ว

    EBNF is so much better though.

  • @samarthchaturvedi3926
    @samarthchaturvedi3926 9 ปีที่แล้ว

    Why is he saying semicolon so much?

  • @commanderlake7997
    @commanderlake7997 8 ปีที่แล้ว

    His thumb is creepy!

  • @GegoXaren
    @GegoXaren 9 ปีที่แล้ว

    HAI 1.3
    IM IN YR loop UPPIN YR var TIL BOTH SAEM var AN 10
    VISIBLE SMOOSH "HAI! " AN var AN " AN SUCH!" MKAY
    IM OUTTA YR loop
    KTHXBYE

    • @RussellTeapot
      @RussellTeapot 9 ปีที่แล้ว

      +Gego/XAREN ゲゴザレン DAFUQ

    • @GegoXaren
      @GegoXaren 9 ปีที่แล้ว

      Russell Teapot
      That is LOLCODE.
      It prints
      HAI! 0 AN SUCH!
      HAI! 1 AN SUCH!
      HAI! 2 AN SUCH!
      ....
      HAI! 9 AN SUCH!

  • @joshuajurgensmeier4534
    @joshuajurgensmeier4534 9 ปีที่แล้ว

    7th!

    • @GtaRockt
      @GtaRockt 9 ปีที่แล้ว

      +Joshua Jurgensmeier Congratulations
      you won 1, I repeat, *1* Internet