Enjoyed this demonstration. Installed the app. Using it a lot.
I really like this program. I'm very much disabled and have difficulty using my hands. I used to use only Google's equivalent and didn't really like it, but it was the only thing we had. Now I have something on Linux.
That's great, James!! I was blown away w/ SpeechNote and I still use it b/c it's useful to me - I can only imagine how important it is for you! You might like the new Plasma 6 desktop environment as they have some good accessibility tools in it - I just dropped a video covering installing it on Arch Linux. :P Go take a look - thanks for being here!!
Great video! Thanks a lot!
Glad you liked it!
I agree that Speech Note is not the most polished app, but it's the first I know of that lets you use LLMs without rolling your own Python scripts.
The quality of speech to text depends strongly on which model you use. Vosk and Coqui, as demonstrated, are not much better than what you can get on your phone. My first trial (in English) was with a Whisper model (FasterWhisper Large) on a pretty nerdy bit of medieval history text, and I was genuinely astonished at how good it was. A quick try with the small Vosk model, however, produced output that was good for a laugh, but not much more.
As one quickly sees with transcription of voice to text, the models take so much computation that we're going to have to wait a long time for real-time translation from speech input. For the time being, the workflow is speech to text first, then translating the text. I'm also not sure how good the translation capacity of the present models is: a quick test showed something a bit better than Google Translate (perhaps), but still with the sort of errors that a human learner would have had drilled out of them in 101.
For my needs, the way Speech Note makes the best models accessible locally without programming skills is transformational. I'm sure we will (soon) see front ends with more facilities, but Speech Note is quick enough to get used to if you want to use it. A big plus is that it can read and transcribe a pre-recorded MP3 file, which can be a very good way of working. You have to edit its output, but that's true of stuff you compose at the keyboard, and the output of these systems has at least already been through the spell check. For people with difficulty typing, this is the beginning of something big.
Thanks for the post; it's more info than my video!! :P
What about web reading, or Kindle web reading? And audio file to text?
Hmmm - I'm sure there's a solution out there...
Does this support typing outside of the application? Like into documents?
Doesn't seem to. The copy button works well, and that's what I've been using so far. If there's a way, I haven't found it yet.
DaveSomething replied exactly what I would have... thanks for watching!!
The video title is Speech to Text, not Text to Speech, but the video content is almost all about text to speech.
You're right - sorry about that, I'll make changes.
I used this app on Fedora and it worked well, but now that I've moved to Arch it closes on me and doesn't work, and I don't know what happened. Please, I need help to make it work.
Flathub / Flatpak isn't an option?? I always put Flatpak on Debian to pad its package capabilities...
Does it work on a CLI?
I believe this is a GUI application - but I did not double-check...
I think it must be a dependency.
I'm not OK with Flatpaks, they make Linux worse than Windows (FU Canonical, Hello Debian!). So no thumbs up and subscription for you.
Canonical is snaps, right? I don't like snaps but do utilize Flatpaks at times... why do you hate them so much???
@@techheart6090 Canonical is Flatpak, Snap, Parental control... Flatpak is essentially Microsoft store for Linux, an attempt to corner the market and exclude apt. Flatpak works through portals that can't be controlled by user or sudo. They literally hijack the file system.