Introduction to Tokenization | Writing a Custom Language Parser in Golang
- Published Sep 24, 2024
- This is the first video in my series covering modern parsing techniques for language & compiler development. In this video we begin writing the lexer, which we'll use for the remainder of the series.
Lexing is the process of transforming raw source code into meaningful tokens that the parser can later consume. We'll use regular expressions for pattern matching, together with helper handler functions, which keeps the lexer simpler than more manual, character-by-character tokenization.
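The regex-plus-handler approach described above can be sketched roughly like this. This is a minimal illustration, not the repo's actual code: the names (`regexPattern`, `defaultHandler`, the token kinds) are assumptions for the sake of the example.

```go
package main

import (
	"fmt"
	"regexp"
)

type TokenKind int

const (
	NUMBER TokenKind = iota
	IDENTIFIER
	PLUS
)

type Token struct {
	Kind  TokenKind
	Value string
}

// regexPattern pairs a pattern with a handler that turns the match into a token.
type regexPattern struct {
	regex   *regexp.Regexp
	handler func(lex *lexer, value string)
}

type lexer struct {
	source   string
	pos      int
	Tokens   []Token
	patterns []regexPattern
}

func (lex *lexer) remainder() string { return lex.source[lex.pos:] }

// defaultHandler advances past the match and emits a token of a fixed kind.
func defaultHandler(kind TokenKind) func(lex *lexer, value string) {
	return func(lex *lexer, value string) {
		lex.pos += len(value)
		lex.Tokens = append(lex.Tokens, Token{Kind: kind, Value: value})
	}
}

func Tokenize(source string) []Token {
	lex := &lexer{
		source: source,
		patterns: []regexPattern{
			// Anchored with ^ so each pattern only matches at the current position.
			{regexp.MustCompile(`^\s+`), func(lex *lexer, v string) { lex.pos += len(v) }}, // skip whitespace
			{regexp.MustCompile(`^[0-9]+`), defaultHandler(NUMBER)},
			{regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*`), defaultHandler(IDENTIFIER)},
			{regexp.MustCompile(`^\+`), defaultHandler(PLUS)},
		},
	}

	for lex.pos < len(lex.source) {
		matched := false
		for _, p := range lex.patterns {
			if loc := p.regex.FindStringIndex(lex.remainder()); loc != nil {
				p.handler(lex, lex.remainder()[loc[0]:loc[1]])
				matched = true
				break
			}
		}
		if !matched {
			panic(fmt.Sprintf("unrecognized character at position %d", lex.pos))
		}
	}
	return lex.Tokens
}

func main() {
	// "foo + 42" lexes to IDENTIFIER("foo"), PLUS("+"), NUMBER("42").
	for _, t := range Tokenize("foo + 42") {
		fmt.Printf("%d %q\n", t.Kind, t.Value)
	}
}
```

The handler indirection is what makes this scale: kinds that need extra work (stripping quotes from strings, skipping comments) get their own handler, while simple kinds all share `defaultHandler`.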
⭐⭐ GitHub Repo ⭐⭐
- github.com/tla...
🔔 JOIN THE COMMUNITY 🔔
/ @tylerlaceby
----------------------------------------
-- Social Links --
Discord - / discord
GitHub - github.com/tla...
Extremely clean and thorough explanation of a lexer. Thanks a lot for making this video
Really hyped! I first watched your "How to make an interpreted language" about 3 weeks ago, and after rewriting my codebase in C#, I published it on GitHub! It was originally supposed to be a functional language, so I didn't add variables. I'm working on that as well as static typing! Now you can type anything, the only thing I have left to do before I do this GIGANTIC new commit is to add static typing to function return types!
...What am I yapping about? Great video, man! Can't wait for episode 3!
Loved the video! And yes, I'd love to see new videos about interpreting and compilation!
sameeeee
Same here! But mostly on an interpreter
Super excited to continue learning with this series! I just got finished with exams and wanted to learn more about how LSPs work. Thank you for this!
Nice! I'm looking forward to the parser videos 😊
I have really loved your series and am eagerly anticipating the rest of it
Your videos are always great. They are super easy to follow and learn from. 😃
Brothaaa, I was waiting for this but I didn't get any notification. Let me enjoy my novel!
Really great. Liked.
Yooo, I literally discovered you a few days ago because I was researching around for my toy language project!
Loved your ts series, I'll definitely be sticking around!
Happy to have been of help. New videos coming soon for this series.
I'm writing my own sql parser, thanks for your video!
Awesome. Hope this helps 😄
You could make `type TokenKind string` and then define each kind's value directly in the const declaration block; then you don't need a function to determine the value :D Good luck. Very cool video and well explained.
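The commenter's suggestion would look roughly like this (a sketch with illustrative kind names, not the repo's actual code):

```go
package main

import "fmt"

// With a string-backed TokenKind, each kind carries its own printable name,
// so no separate kind-to-string lookup function is needed.
type TokenKind string

const (
	NUMBER     TokenKind = "number"
	IDENTIFIER TokenKind = "identifier"
	PLUS       TokenKind = "plus"
)

func main() {
	// Printing a kind yields its name directly.
	fmt.Println(PLUS) // plus
}
```

The trade-off versus an int-backed kind with `iota` is slightly larger tokens and string comparisons, in exchange for free, readable debug output.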
I'm learning to write dsls right now and this is immensely helpful. I started building an interpreter for a yaml based dsl so that I could learn that portion. But now I'm focusing on the lexing and parsing portion which scared the bejesus out of me. Unemployed at the moment so I'm gonna take as much time as I need to learn to do this. Thanks for the videos!
Hopefully this series will treat you well. Thanks for the kind words. Yeah, luckily parsing a YAML-like syntax is much simpler than the one we cover in the series.
Best of luck and if you need any assistance feel free to reach out on my discord.
@@tylerlaceby I appreciate it! I'm gonna suffer for a while with it but I'm sure this will be a great help.
why bro read my mind
i was just looking into making a language in golang the other day
Hi, do you have plans to integrate it with a compiler backend like LLVM? Your language could really take advantage of what LLVM already offers: not only performance but also a bunch of cross-language libraries.
compiling to machine code would be awesome
what extensions do u use? code highlighting, auto import, etc.?
Just the Go extension with the Go language server; Go's default formatting settings are applied on save, and Ayu Mirage is the theme I like to use.
@@tylerlaceby thanks!
not just compiling/interpreting; how about adding simple std libs too?
hey, do I want to add a separate token for types like i32, u32, ...?
Maybe for your primitive types. But you should be able to group the rest as symbols. I personally like making them symbols rather than having separate tokens, but some languages do it the other way and have a token for each primitive.
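The dedicated-token option from this reply can be sketched as a reserved-word lookup that the lexer consults after matching an identifier. All names here (`classify`, `primitiveKinds`, the kinds) are illustrative assumptions, not the series' exact code:

```go
package main

import "fmt"

type TokenKind string

const (
	SYMBOL   TokenKind = "symbol"
	TYPE_I32 TokenKind = "type_i32"
	TYPE_U32 TokenKind = "type_u32"
)

// primitiveKinds maps known primitive type names to dedicated token kinds.
var primitiveKinds = map[string]TokenKind{
	"i32": TYPE_I32,
	"u32": TYPE_U32,
}

// classify returns a dedicated kind for known primitives and falls back
// to a generic SYMBOL token for everything else (user-defined types, names).
func classify(word string) TokenKind {
	if kind, ok := primitiveKinds[word]; ok {
		return kind
	}
	return SYMBOL
}

func main() {
	fmt.Println(classify("i32"))    // type_i32
	fmt.Println(classify("MyType")) // symbol
}
```

The symbols-only option is even simpler: drop the map entirely, emit SYMBOL for every identifier, and let the parser or type checker decide which names denote primitive types.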