"Cursorless: A spoken language for editing code" by Pokey Rule (Strange Loop 2023)

  • Published on 20 Nov 2024
  • If you could design a spoken language from scratch for editing code, how would it look? What would be your nouns? Would they be tokens? Functions? Lines? What would be your verbs, your adjectives, and your adverbs?
    Cursorless is one answer to these questions: a spoken language designed for maximally efficient code editing by voice. Cursorless leverages the tree-sitter real-time parser to enable high-level, "smart" code manipulations while retaining the flexibility to use "dumb" primitives like tokens, lines, delimiter pairs, and regexes when necessary.
    Learn how a handful of simple abstractions - actions, modifiers, marks, and scopes - empower Cursorless users to create powerful and concise command chains that would leave even the most seasoned vim user drooling on their keyboard.
    Pokey Rule
    Creator of Cursorless
    @PokeyRule
    Pokey Rule is the creator and lead maintainer of Cursorless. He releases all of his code under the MIT license and relies on donations from users to sustain the project. Prior to working on Cursorless, Pokey managed a machine learning team at Globality. He studied programming languages and human-computer interaction at Stanford University.
    ----
    Recorded Sept 22, 2023 at Strange Loop 2023 in St. Louis, MO.
    thestrangeloop...
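To make the four abstractions in the description concrete, here is a minimal sketch of how a mark, a scope modifier, and an action could chain together. It is a toy model and not how Cursorless is implemented: `Target`, `mark_token`, `line_scope`, and `chuck` are hypothetical names invented for illustration, and the "mark" is a plain substring search standing in for a hat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A half-open character range [start, end) in the document text."""
    start: int
    end: int


def mark_token(text: str, word: str) -> Target:
    """Toy 'mark': select the first occurrence of `word` (stand-in for a hat)."""
    start = text.index(word)
    return Target(start, start + len(word))


def line_scope(text: str, target: Target) -> Target:
    """Toy 'modifier': expand the target to the full line that contains it."""
    start = text.rfind("\n", 0, target.start) + 1
    end = text.find("\n", target.end)
    if end == -1:
        end = len(text)
    return Target(start, end)


def chuck(text: str, target: Target) -> str:
    """Toy 'action': delete the target range and return the new text."""
    return text[:target.start] + text[target.end:]


text = "let x = 1\nlet y = 2\nprint(x + y)"

# Spoken order would be action, modifier, mark; evaluation runs mark first.
target = line_scope(text, mark_token(text, "y"))
print(repr(text[target.start:target.end]))
print(repr(chuck(text, target)))
```

Each stage only takes and returns a target (or the edited text), which is what lets the stages be chained without naming intermediate values.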

Comments • 85

  • @samwight
    @samwight 1 year ago +386

    Cursorless literally saved my job. When I was hit by a car and broke my wrist, I couldn't use one of my hands for months for typing. I was able to buy a cheap mic, install and learn Talon, and then be just about as productive as I was before. The Talon community, and cursorless especially, has such a special place in my heart because of that. They made it possible for me to turn a very tumultuous time into one that I was easily able to adapt to.

    • @NithinJune
      @NithinJune 1 year ago

      wow !

    • @chromosundrift
      @chromosundrift 10 months ago

      Fantastic! So happy for you that such a scary and horrifying experience had a silver lining!
      Quick note, while cheap mics are pretty good these days, it's worth noting that a _good mic_ (hopefully a mid price) can make voice control much more reliable, especially if it's positioned near and pointed at your mouth. Mic position can be an art but it can make all the difference and many people don't seem to realise. Note too that multiple mics can be better still. Finally, if you change your mic or have multiple setups, you may need to tune things for each. Doing this can be worth the effort.

  • @redbrick808
    @redbrick808 1 year ago +94

    The visual aids on this presentation are so well done and helped me grasp the concepts much quicker. Fantastic presentation, this really held my attention!

  • @brollin_
    @brollin_ 1 year ago +70

    Cursorless was a life changing find for me as well. After I lost the use of my hands, I found Talon which made it possible for me to use a computer again, which was big. Cursorless was a HUGE benefit on top of that, that actually made it fun to program again. 1+ years later of using it and it still feels like magic when you string a novel command together on the fly. :D

  • @JasonStillwell
    @JasonStillwell 1 year ago +33

    This is vi for your throat.

  • @danser_theplayer01
    @danser_theplayer01 1 year ago +10

    Bro presented literal computer incantations.
    10/10

  • @TiredOcean
    @TiredOcean 1 year ago +42

    The part about point-free/tacit programming paradigms being a great fit for interactive use, and drawing the comparison between bash and jq pipelines to Cursorless was a great "aha" moment. It made me realise that maintainability of a programming language is not an unalloyed good, since it very often puts pressure on other aspects of the language that impacts its interactive use - which is a separate, but just as valid use case!
    (One of the things I like most about doing computer work in general is when you've got a problem you're trying to figure out, and are in the fun phase of interactive experimentation. However, when you have found a potential solution/process, you've got to codify it for future use/automation, and there's almost a feeling of starting from scratch as you have to consider maintainability. I wish that transition was simpler but I have no idea what that would look like.)

  • @kenneth_romero
    @kenneth_romero 8 months ago +1

    one of the coolest talks and tech out there.

  • @mattkriese7170
    @mattkriese7170 10 months ago +3

    This is incredible. I have been using talon on and off, and though I was amazed with the speed at which some users could utilize the software... I found that it was hard on my voice and in turn I would get voice strain during long sessions. this really does seem to be a more efficient way, and it excites me that there are so many languages supported.
    Thanks for your work, and for sharing your talent and knowledge. Very cool.

  • @Waitwhat469
    @Waitwhat469 1 year ago +18

    100% agree, accessibility work helps more people than just those that absolutely need it. Another great modern example, look at subtitles! Tons of non-hearing impaired people have them on by default, because there are moments in which that little bit of extra information helps!

  • @robertsmme
    @robertsmme 1 year ago +10

    Really interesting. I liked the sign-off comment about using the restrictions others experience as a lever for innovation, for them and for the greater good.

  • @garthgoldwater5256
    @garthgoldwater5256 1 year ago +28

    this is incredible. super inspiring! great take on pointfree and interactive programming

  • @murtaza6464
    @murtaza6464 1 year ago +8

    This is super awesome, and happened to show up in my feed the same week I've been working on a tree sitter parser of my own. Love seeing innovation in HCI especially in niche areas. Excellent presentation style as well

  • @jimhrelb2135
    @jimhrelb2135 1 year ago +12

    One step for a more coordinated pair programming session. Call it a tool issue if you will, but as a vimmer, I like this a lot :)

  • @marksmod
    @marksmod 1 year ago +5

    This is crazy interesting. Most interesting language I've learnt about in years, it's like completely different

  • @hysenndregjoni853
    @hysenndregjoni853 1 year ago +3

    Spoken VIM, nice

  • @theNoriLi
    @theNoriLi 1 year ago

    What a fantastic tool for accessible programming =] disability and work-related injury make keyboard usage challenging for so many.

  • @capability-snob
    @capability-snob 1 year ago +8

    Wow, I need to add this to my system shell, and would be nice to have it in emacs too. Lovely.

    • @PokeyRuleJams
      @PokeyRuleJams 1 year ago +3

      💯 stay tuned on both of those

  • @melodyogonna
    @melodyogonna 9 months ago +1

    I thought it was going to be about Vim, instead it's something completely different that addresses a fear I've silently had for a long time - that I might lose my hands someday, and with it the possibility to program. My face split into a wide smile immediately I heard what this does.

  • @isaactfa
    @isaactfa 1 year ago +82

    Great presentation! Apart from the specific design of, say, the vocabulary, is Cursorless inherently voice-oriented? I don't really have the desire to talk while coding, but this rigorous point-free mark->target->modifier(scope)->action pipeline approach is really compelling as a vim user.

    • @emma70707
      @emma70707 1 year ago +12

      I've heard Pokey mention in the Slack a keyboard-based version they're testing out. I'm not sure what the priority level on that is though.

    • @SynchronizedRandomness
      @SynchronizedRandomness 1 year ago +10

      The Kakoune editing model is also very similar to this, if you’re interested. It’s also been ported via packages to some other editors (e.g. Emacs).

  • @philltopia
    @philltopia 1 year ago +9

    Definitely not a point free talk! excellent stuff

  • @falkland_pinguin
    @falkland_pinguin 11 months ago +2

    Unnecessary remark: At 12:58, Pokey was probably referring to the NATO "phonetic" alphabet, also known as International Radiotelephony Spelling Alphabet according to Wikipedia. The International Phonetic Alphabet is /ðɪs/.

  • @andreas_arvidsson
    @andreas_arvidsson 1 year ago +6

    Truly an inspiration for the (coding) children! :D

  • @alurma
    @alurma 1 year ago +2

    Amazing

  • @jaysilence3314
    @jaysilence3314 1 year ago +22

    Cursorless should be excellent for live coding on Twitch.TV. Yet there seems to be no one doing this.

    • @josevargas686
      @josevargas686 11 months ago

      it is too difficult, no one is stepping up to the challenge, a gold mine unmined!

  • @RockieYang
    @RockieYang 1 year ago +1

    Thanks for great talk, super inspiring

  • @schubertludwig
    @schubertludwig 1 year ago +30

    Super nice! But isn’t the backwards word order an actual usability issue? If you spoke them beginning with a mark, then modifiers, then actions then you could have on screen feedback for your selection as you’re speaking. Same problem that vim had, no?

    • @PokeyRuleJams
      @PokeyRuleJams 1 year ago +31

      Great question! Flipping the order is something I’m quite curious about, and would actually be trivial to do; ~3 lines of code. I’ve got an open offer to help anyone set that up who’s interested
      That being said, due to the way the Talon engine parses phrases today, you wouldn’t get live feedback. We don’t know what you’ve said until you’ve completed an entire phrase, so we couldn’t highlight the intermediate targets
      If you paused after the mark, then in theory we could show you that target, but it’s actually ambiguous, because if you just say a letter in talon, it will type the letter
      But I really don’t want to discourage anybody from trying the reverse word order, because I’m quite curious how it would work
      Fwiw the experimental Cursorless keyboard interface does actually go in reverse order of the spoken grammar, and we do highlight the intermediate targets as you type

    • @julianferrone9620
      @julianferrone9620 1 year ago +5

      ​@@PokeyRuleJams the whole word order vs. reverse order thing reminds me of Haskell's $ vs & operators-i.e. for functions f, g, h you can compose a pipeline as either "f $ g $ h" (apply f to the result of applying g to the result of applying h to the argument) or you can run it as "h & g & f" (apply h then apply g then apply f)
      Live view would be super cool, I'll have to check out that experimental interface
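
The `$` versus `&` comparison above can be sketched in Python as right-to-left composition versus left-to-right piping. This is only an illustration of the ordering difference, not Cursorless code: `compose` and `pipe` are hypothetical helpers, and `f`, `g`, `h` are arbitrary example functions.

```python
from functools import reduce


def compose(*funcs):
    """Right-to-left composition: compose(f, g, h)(x) == f(g(h(x)))."""
    return lambda x: reduce(lambda acc, fn: fn(acc), reversed(funcs), x)


def pipe(value, *funcs):
    """Left-to-right application: pipe(x, h, g, f) == f(g(h(x)))."""
    return reduce(lambda acc, fn: fn(acc), funcs, value)


def h(x):
    return x + 1


def g(x):
    return x * 2


def f(x):
    return x - 3


# Same computation, written in opposite orders.
right_to_left = compose(f, g, h)(5)   # reads like "f $ g $ h x"
left_to_right = pipe(5, h, g, f)      # reads like "x & h & g & f"
print(right_to_left, left_to_right)
```

Both calls compute the same value; only the reading order changes, which is the same trade-off as speaking the action first versus the mark first.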

    • @schubertludwig
      @schubertludwig 1 year ago +5

      @PokeyRuleJams thanks for the in-depth answer! Once I setup Cursorless and saw it involved Talon I realized the speech recognition wasn’t real time, so probably not immediately a big priority. Also, watching you use the system as an expert in your live coding videos made me think that most targeting doesn’t seem as complicated as I had imagined. What a fun and productive lens through which to look at editing! I didn’t realize how much of my coding tasks were “bring” actions. ;)
      I will try the keyboard interface soon… also so many ideas involving LLMs for natural language editing commands… what a rabbit hole of ideas you’ve opened up for me! \o/

    • @AnthonyBullard
      @AnthonyBullard 1 year ago +4

      You could take the approach Helix did to vim commands and invert the order and highlight the target range as you speak it, and then when the action is spoken you’ll know what you acted on.

    • @PokeyRuleJams
      @PokeyRuleJams 1 year ago

      @@AnthonyBullard yeah that would be cool; see my answer at the top of the thread for some of the practical difficulties there, though

  • @randywest984
    @randywest984 3 months ago

    You have created an incredible product. My question is: can you think of a way a blind user could interact with your product without the ID marks, and therefore without access to the vocabulary to distinguish which tokens are the target?

  • @jacobzimmerman3492
    @jacobzimmerman3492 1 year ago +18

    Hey Pokey, not sure if you’re reading these but there was a recent short presentation/paper called “A Caret for your Thoughts” that I think you’d find interesting.
    Cool stuff here, I love that you could teach someone kind of naturally by conversation or pair programming. I bet an interactive tutorial in the form of a podcast/recording could be fun

    • @MrRonah
      @MrRonah 1 year ago

      Link please? I searched for it and failed to find it :(

    • @cursorless
      @cursorless 1 year ago +5

      Thanks for the pointer! Are you referring to th-cam.com/video/r--d5XlUyT4/w-d-xo.html
      > I bet an interactive tutorial in the form of a podcast/recording could be fun
      Ooh interesting idea. Could you elaborate?

    • @garthgoldwater5256
      @garthgoldwater5256 1 year ago +2

      is that talk the “A Caret for Your Thoughts?” video on the veztron youtube channel?

  • @yash1152
    @yash1152 11 months ago +2

    3:33 no matter how small the delay is, we wanna amortize that
    woww. amortized analysis of complexity. awesome.

  • @aredrih6723
    @aredrih6723 1 year ago +9

    Not sure how much it contrasts with vim motions (with proper tokenization of the AST) outside of the point-free aspect.
    Arbitrary word selection could be one difference, but I'd rather expect eye tracking for that (though the tech is a bit lacking there).
    Unfortunately, the current implementation receives input sentence by sentence, which makes Cursorless incompatible with time-sensitive eye gazing.
    On the point-free example, I'm curious how it would behave in a 1k-line document with instances at the top and at the bottom. Would it replace everything matching? Even outside of your view? What if you want to replace only half?
    Also, I guess the usability depends on the work environment (open space might be a bad mix) and how much feedback speed matters to the user (keypress vs spoken word).
    For the word order, I'm curious what speakers of languages where the verb comes at the end would prefer. Japanese comes to mind (at least to my limited understanding), with its flexible sentence structure (lots of context in no particular order, distinguished by the following particle, followed by a verb (optional of course 🤡)) and its implicit subject carried over from sentence to sentence (if 2 successive sentences talk about the same thing, only the first has the topic); so `this` as the default mark might be less obvious.
    Either way, good job on building something to address your specific condition and building a community around it. I can only wish it good things.
    (and (obvious but always true) don't worry about random internet users, you know you're delivering value to your users ~)
    Good luck on the project.

    • @4xelchess905
      @4xelchess905 1 year ago

      I think one key difference with Vim is that default commands are optimized by syllables, as opposed to key strokes. Syllables are slightly longer to utter, but there are a lot more of them, and we learn them right from birth.
      My guess is that you can restrict edits to scopes and selections, and you have a lot of ways to finely build selections in one or several sentences. With the help of the LSP, you might even be able to select a variable instead of a mere token and have "every" select only that variable rather than all the instances of the name in unrelated scopes.
      In another comment, Pokey talks about an "experimental Cursorless keyboard interface [that] does actually go in reverse order of the spoken grammar, and we do highlight the intermediate targets as you type", à la Kakoune/Helix.
      Also, since it is integrated inside VS code (and more generally the goal seems to be a plug-in rather than a standalone editor), I surmise you can actually seamlessly switch between cursorless and regular commands depending on which is the simplest/fastest for your needs.

  • @strangeWaters
    @strangeWaters 1 year ago

    I use it on and off. It's difficult but workable.

  • @RyanLynch1
    @RyanLynch1 1 year ago

    wow this is actually so cool! i really want to try this

  • @yash1152
    @yash1152 11 months ago

    26:45 tree sitter localizes those errors (i.e. rest doc is still valid)
    26:59 treesitter's parse tree output: dialect of scheme (lisp)

  • @ancbi
    @ancbi 1 year ago +14

    with how much this works like a spoken vim, wouldn't it be funny if you have to speak something unintuitive to exit cursorless?

    • @PokeyRuleJams
      @PokeyRuleJams 1 year ago +12

      Haha. Reminds me of “whale quench”. In Emily’s strange loop talk on voice coding, she said “whale quench” to exit vim, as that’s the talon for pressing wq, and then “whale quench” went viral 😄

    • @jishcatg
      @jishcatg 1 year ago +4

      @@PokeyRuleJams Now I'm going to hear that every time I do it.

  • @danielrhouck
    @danielrhouck 11 months ago +3

    The thing thatʼs weird about this talk is… he doesnʼt mention a place to *start.* How do I enter a *new* line of code, instead of modifying an existing one?
    Iʼm sure the resources he mentions at the end cover that, itʼs just weird

  • @fburton8
    @fburton8 1 year ago +1

    Glassk! Shrock!!

  • @hikesandbiking1181
    @hikesandbiking1181 1 year ago

    pokey you rule!

  • @yash1152
    @yash1152 11 months ago +1

    0:06 "spoken language for editing code"
    i always wanted helix/kakoune/vim modal navigation & editing keybinds to be considered an actual formal language, rather than just "commands" - just like this person considers his creation to be a programming language.

  • @the-pink-hacker
    @the-pink-hacker 1 year ago +2

    So it's vim for the voice?

  • @yash1152
    @yash1152 11 months ago +1

    12:38 _"little hats on a symbol of token, special word for that symbol"_
    ahhw, i'd like to use the IPA flight names then:
    India Golf Niner Nine Charlie Tango

    • @yash1152
      @yash1152 11 months ago +1

      12:57 oh yeah, of course IPA got a mention here (:

  • @ed_halley
    @ed_halley 1 year ago +1

    Just ran into this, and looked at a few jams. I don't see an example of just typing: how do you create content, spell identifiers you can't yank from elsewhere, etc.? There's gotta be a literal mode somewhere.

  • @yash1152
    @yash1152 11 months ago +1

    18:37 _"what paradigm cursorless is as a prog lang"_
    lemme guess, this aspect of it is "array programming" !?

    • @yash1152
      @yash1152 11 months ago

      31:45 yep, uiua tacit array programming language

  • @DebanjanBasu
    @DebanjanBasu 7 months ago

    I see some vim like structure in the grammar

  • @SSJ3Tim
    @SSJ3Tim 1 year ago +5

    @13:07 I really want to hear how you say "jury" with one syllable!

    • @PokeyRuleJams
      @PokeyRuleJams 1 year ago +6

      😂 yeah I’ve wondered the same thing. The creator of Talon is apparently able to do it. I personally just have it remapped to “Jane” 😊

  • @chaquator
    @chaquator 11 months ago

    ive been interested in programming by dictation since seeing the apple vision pro with eye tracking. i would gladly throw away my mouse and maybe even keyboard and just program with my eyes voice and sometimes hands

  • @Ben_EH-Heyeh
    @Ben_EH-Heyeh 10 months ago

    Looking at Cursorless Git...
    All of the language support files are written in Scheme, a dialect of Lisp, why is Scheme not a supported language?

  • @sathirasilva4958
    @sathirasilva4958 11 months ago

    Does anyone know what presentation tool he is using?

  • @yash1152
    @yash1152 11 months ago

    16:09 16:16 english: VAO order
    16:24 different word order, _probably_ wouldnt be hard
    thanks a lot. yeah, i would definitely not use this order.

  • @hytmal
    @hytmal 1 year ago +6

    Mandarin is not a written language. Chinese can be written in Traditional or Simplified. Cantonese and Mandarin are spoken dialects.

    • @PokeyRuleJams
      @PokeyRuleJams 1 year ago +5

      Ha that’s fair. I should probably have thought of that, as someone who studied mandarin in college for 2 years 😅

  • @ShadowDrakken
    @ShadowDrakken 11 months ago

    probably better to say "simple scopes and smart scopes" rather than "dumb scopes" :P

  • @roeniss
    @roeniss 2 months ago

    Point-free programming == pipelining?

  • @wege8409
    @wege8409 1 year ago +4

    I tell you what, coding in the age of AI is really a pleasure

  • @ehsanu1
    @ehsanu1 1 year ago +2

    This seems like a great fit for allowing LLMs to edit code efficiently, with some training.

    • @ueaj4576
      @ueaj4576 1 year ago

      was thinking the same thing

    • @thenwhoami
      @thenwhoami 1 year ago +3

      This is an extra step an LLM would have to take vs. just outputting code directly.

    • @clray123
      @clray123 11 months ago

      ​@@thenwhoami I think what was meant was editing existing code, not outputting new code. But I suppose for that we already have diff/patch, so there is indeed little use for some relative positioning commands if you can just train the LLM to generate well-formed patches.

  • @superscatboy
    @superscatboy 1 year ago +1

    Is "whale" one syllable?
    I would've said it's two. Way-ul.

    • @PokeyRuleJams
      @PokeyRuleJams 1 year ago +3

      Ha never thought about that. It’s “jury” people usually go after 😄

    • @teromc
      @teromc 1 year ago

      Idk, but pronunciation in the dictionary is /weɪl/ (phonetic symbols) and Google says "sounds like wayl".

    • @redpepper74
      @redpepper74 1 year ago

      Ls are kinda weird, sometimes they add like half a syllable if you put them after a vowel. (And yes it is strange to think about fractional syllables but there’s no real reason we can’t)

  • @fennecbesixdouze1794
    @fennecbesixdouze1794 1 year ago +1

    I think that generative AI could be really powerful for doing day-to-day code editing by voice command.
    Soon we will be able to give generative AI instructions to refactor sections or blocks of code, out loud in natural language, and let the AI assistant do the typing for us, within the full context of the file and project.
    Editing as well as navigating project directories, finding files, definitions, functions, etc all by voice in natural language.

  • @thenwhoami
    @thenwhoami 1 year ago +1

    This is really cool, but I can't help but wonder how LLMs will upheave this. Sure, this is much more precise, you can edit something in exactly the way you intend on the first shot, every time. An LLM on the other hand will spew out code that may not have been what you wanted initially, but then you can tell the LLM to make specific edits to your code in (using your natural language), and through that process you can still arrive at what you want.
    Thanks for the talk!

  • @nil0bject
    @nil0bject 1 year ago

    whale is one syllable? natural language models can replace this easily