4 years old and still relevant, super nice!
This is a really, really cool idea and approach. This is going to be **phenomenally** useful and will improve how *all* code across the world is authored by everyone. I really hope this can be integrated into every editor and web-based code display. Thank you so much for the essential labour you've expended on this. I'm actually shocked at how badly most "code editors" (actually glorified text editors) understand the code they're editing. This addresses a lot of the problems of treating code as text instead of ordered trees (which is what programs really are on a conceptual *and* literal/mechanical level). I hope one day we can move into a much more productive world where most code authors can use good tree-based tools and stop wasting their time dealing with confusing parentheses, forgetting semicolons, and naming things without using spaces.
"Expand selection" was my favourite feature of Sublime that I've missed since migrating to VSCode. I love that you remember which child nodes you "expanded selection" from, so you can contract them back again without mistakenly choosing (e.g.) the first child! Also, that idea of using "expand selection" with multiple nodes is SO powerful and useful.
Just blown away by this, bravo and thank you again!
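For anyone curious how "contract back to where you came from" can work, here is a minimal sketch. It is not the editor's actual implementation: the `Node` and `Selection` classes are hypothetical stand-ins for a syntax tree and a cursor, and the only idea it shows is keeping a history stack of the nodes you expanded from.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical syntax-tree node (not a tree-sitter type)."""
    kind: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


class Selection:
    """Tracks the selected node plus the path taken while expanding."""

    def __init__(self, node: Node) -> None:
        self.node = node
        self._history: List[Node] = []  # nodes we expanded away from

    def expand(self) -> None:
        # Remember where we came from before moving to the parent.
        if self.node.parent is not None:
            self._history.append(self.node)
            self.node = self.node.parent

    def contract(self) -> None:
        # Return to the remembered child instead of guessing the first one.
        if self._history:
            self.node = self._history.pop()


# Usage: select the second argument, expand twice, contract twice.
root = Node("call")
args = root.add(Node("arguments"))
first = args.add(Node("identifier"))
second = args.add(Node("identifier"))

selection = Selection(second)
selection.expand()
selection.expand()
selection.contract()
selection.contract()
assert selection.node is second  # back on the second argument, not the first
```

In this sketch, contracting with an empty history does nothing; a real editor might choose some default child instead.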
It's not "it will"; it already does. It is already an important tool in my workflow.
What a fantastic presentation. A very useful project and a great explanation of how it works!
We've been working on a remarkably similar concept. Hot damn. Actually glad I'm not the only one.
The "GLR" approach is a good generalisation of what we've been doing since the 1970s: simply treating most things as expressions, and only later hoisting out a contained L-value reference if it's required by context. So "L-value expected" becomes a semantic error rather than a syntax error.
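A rough sketch of that idea, for illustration only. This is a tiny hand-written recursive-descent parser, not GLR and not tree-sitter; the grammar (names, numbers, `+`, parentheses, one optional `=`) and every name in it are made up. The parser accepts any expression on the left of `=`, and a separate pass afterwards decides whether that expression is a valid assignment target.

```python
import re


class LValueExpected(Exception):
    """Semantic error: the assignment target is not assignable."""


TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(src):
    tokens = []
    for num, name, op in TOKEN.findall(src):
        if num:
            tokens.append(("num", num))
        elif name:
            tokens.append(("name", name))
        elif op.strip():  # skip stray whitespace matches
            tokens.append(("op", op))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def statement(self):
        # The left side is parsed as a plain expression: no l-value rule here.
        left = self.expr()
        node = left
        if self.peek() == ("op", "="):
            self.take()
            node = ("assign", left, self.expr())
        if self.peek()[0] != "eof":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return node

    def expr(self):
        node = self.atom()
        while self.peek() == ("op", "+"):
            self.take()
            node = ("add", node, self.atom())
        return node

    def atom(self):
        kind, text = self.take()
        if kind in ("num", "name"):
            return (kind, text)
        if (kind, text) == ("op", "("):
            node = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")


def check(node):
    """Semantic pass: only now do we insist on an l-value."""
    if node[0] == "assign":
        if node[1][0] != "name":
            raise LValueExpected(f"cannot assign to a {node[1][0]} expression")
        check(node[2])
    elif node[0] == "add":
        check(node[1])
        check(node[2])


def compile_line(src):
    tree = Parser(tokenize(src)).statement()
    check(tree)
    return tree
```

With this split, `a + 1 = 2` parses fine and is rejected by `check` with `LValueExpected`, while `x = + 2` never gets that far and fails in the parser with `SyntaxError`.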
Just want to add another comment about how great a presentation this was.
Awesome job, both the tree-sitter work and the presentation. Thank you! Just stumbled across this via a pointer on the emacs-devel list (in case anyone wonders :-). The one small thing I have an issue with was the critique of existing code-highlighting practices in the "motivation" part of the talk. In some cases I personally liked the "old-style" highlighting better than the tree-sitter-generated ones; in particular, I have a preference for NAMES OF NEWLY DEFINED THINGS to be highlighted, and that was what the "old" style did in the examples. In fact I think that EVEN OLDER algorithms did it the way Max prefers, i.e. type names one color, variable names a second color, etc.
Anyway, that's a detail. This is REALLY COOL work with lots of potential. Also kudos for respecting the elderly (-: as shown here, old methods such as (G)LR parsing may still have untapped potential. Now excuse me while I get a case of fresh punch cards from the basement and try to write a tree-sitter grammar for LISP
Grammar for LISP? I see what you did there...
This guy is now co-founder behind the new fast code editor Zed. Go figure!
You know VSCode is so popular right now, but I still love and use Atom as my primary editor. I hope that now that Microsoft owns GitHub and Atom, they don't get rid of it in favor of VSCode. Also, I didn't know about this: the syntax highlighting always did look better to me in Atom, but I just assumed the themes were made better. I didn't realize all this tech was going on under the hood.
Two years later and the day has come. What are you using now?
Now Treesitter has given Neovim superpowers by allowing it to understand ASTs
This is basically getting default features of structural (aka projectional) editing into textual editing. Interesting.
Slick! Leveraging GLR for error recovery is a great idea. Question, though: How would you handle a confused lexer? Consider the insertion of a quote mark near the beginning of a file...
tree-sitter is now part of GNU Emacs:)
It is strange that something as cool as tree-sitter is used in only two major, popular code editors. I mean, only NeoVim and Emacs use it (if we talk about popular solutions), and other editors like VSCode, SublimeText4 etc. use regex-based syntax highlighting 🤢
Absolutely amazing presentation (and software)
So informative and the idea is just mind-blowing! Thanks for your well organized presentations.
Wow, this is crazy cool. Might have to give Atom a try after seeing this
Looks interesting. Reminds me of Roslyn for C# in Visual Studio.
Treesitter ships with neovim now
Fantastic news!!
FANTASTIC
Fantastic work!
Fantastic
Awesome!!!
If you need to know what Tree-sitter is, jump to th-cam.com/video/Jes3bD6P0To/w-d-xo.html
Thanks for your presentation. I am learning tree-sitter but I have a problem. I am parsing source code from a file, and I need to get a function's name, not the node's type name. How can I do that? I read all the docs but I can't find the solution. Thanks
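One way to think about this question: a node's type (for example `function_definition`) only says what kind of construct it is, while the function's own name is the source text covered by one of that node's children. The sketch below uses a hypothetical stand-in `Node` class, not the tree-sitter API, and the byte offsets are filled in by hand. In py-tree-sitter the equivalent lookup is `child_by_field_name("name")` on the real node; whether a given grammar defines a `name` field is an assumption to check against that grammar.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a syntax node: a type plus a byte range."""
    type: str
    start_byte: int
    end_byte: int
    fields: Dict[str, "Node"] = field(default_factory=dict)

    def child_by_field_name(self, name: str) -> Optional["Node"]:
        return self.fields.get(name)


def node_text(source: bytes, node: Node) -> str:
    # A node stores no text itself; slice the original source by its range.
    return source[node.start_byte:node.end_byte].decode("utf-8")


def function_name(source: bytes, node: Node) -> Optional[str]:
    name_node = node.child_by_field_name("name")
    return node_text(source, name_node) if name_node is not None else None


source = b"def greet(person):\n    return person\n"

# Hand-built nodes for this example; a real parser would produce these.
identifier = Node("identifier", 4, 9)
func = Node("function_definition", 0, len(source), {"name": identifier})

print(func.type)                    # the node's type name
print(function_name(source, func))  # the function's own name
```

The point is the two-step lookup: find the child that holds the name, then read the source bytes between its start and end offsets.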
I'm not convinced.
For auto-completion and error reporting, language servers already need to parse and analyze the code.
I'd expect that to be much more efficient if information for syntax highlighting also comes from that same parser and analyzer.
I'm not much into the Language Server Protocol, but I got the impression that semantic highlighting is something that is already supported (even though not implemented for every language).
LSP is also designed for incremental changes as far as I know.
Agreed, the main problem with LSP is that it communicates via RPC, which can be quite slow, especially for things that need to be re-highlighted on every keystroke.
Tree-sitter can be embedded directly into the editor as a quick, standard (not necessarily correct) syntax highlighter.
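For context on the RPC point, here is a minimal sketch of how a single LSP request is put on the wire: a JSON-RPC body preceded by a `Content-Length` header. It measures nothing and makes no latency claim; it only shows the serialize, frame, and parse steps a request goes through. The helper names and the file URI are made up for the example; `textDocument/semanticTokens/full` is the request name from the LSP specification.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC payload using LSP base-protocol framing."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def read_lsp_message(data: bytes) -> dict:
    """Parse one framed message back into a Python dict."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        key, _, value = line.partition(b":")
        if key.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


# Hypothetical request asking a server for semantic tokens of one file.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/semanticTokens/full",
    "params": {"textDocument": {"uri": "file:///tmp/example.py"}},
}

wire = frame_lsp_message(request)
assert read_lsp_message(wire) == request
```

An embedded parser skips this round trip entirely, which is the trade-off being discussed here.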
Also, if there are errors on that line, you are going to lose the syntax highlighting or fall back to the basic one.
I mean, as someone who has worked with both, LSP is way too slow for this job. Syntax highlighting is something that should happen in the editor, and quickly. Plus, tree-sitter is not making a value add like LSP does. People can and do whatever they want with the tree, which is something that LSP (at the moment) does not provide.
LSPs are too slow; it was a stupid idea to have to run an entire server that uses JSON for a task that requires minimal latency. It should instead have been a standardized ABI interface that different compilers/interpreters implement, which instantly cuts out the majority of the performance concerns and also removes a ton of complexity. Treesitter is a project that aims to patch over the mistake of LSPs, since unfortunately it's too late to go back and replace them entirely.