Creating a Speech to Text Program with Python

CS Coach

87,091 views

Published on 2 Feb 2025

Comments • 103

@danoconnell5292 10 months ago ⁺⁷
I spent 5 days trying to figure this out, knowing literally nothing about virtual assistants you can talk to. Given the learning curve, I wish I had watched your video first. You're the best to learn from so far, my man. Nice job.
@CSCoach 10 months ago ⁺⁴
That means a lot, man. Thanks so much for the feedback.
@Knot2goodAtIt 7 months ago ⁺⁶
I had nooooo idea this was so straightforward! Thank you! I want to create a translator and I think this is the perfect base!
@File_corupt 1 year ago ⁺³
Yo, this tutorial is great. I've tried to watch other tutorials as a noob, but they talk like I already know the stuff, and I don't. You explain it really well.
@goner007 10 months ago ⁺¹
This worked so well. I never knew it was that easy until you explained it. Hope to learn more from you, thanks.
@DevonAIPublicSecurity 1 year ago ⁺⁶
Hey Oscar, this was a great tutorial. It was very clear and made perfect sense. Keep doing the work you are doing.
@CSCoach 10 months ago
Thanks for the feedback. It means a lot. Going to ramp up the channel with more content in the coming months :)
@artetridimensionale 1 year ago ⁺⁸
OK, so it's a bit simple. The problem comes when you stop talking: you have to make sure the text stays justified and doesn't end up at the end of a line every time you interrupt yourself.
@ymhtpat 11 months ago ⁺⁴
Never mind, figured it out. For those of you asking about the last step on Windows: open Notepad and type " C:\Users\[your computer]> " followed by the first prompt, then do the same for the second. Save the file as a .txt. It should work from there.
@BobJoe-lt1is 4 months ago
What second prompt are you talking about? I wrote that in Notepad and yet my program sometimes outright refuses to work.
@vanillatheneko8473 4 days ago
Yo! I'm having the same issue you had. I'm new to PyCharm, so I'm not sure what you mean by Notepad. Where can I find it in the UI? Did you mean the terminal? Or is there somewhere else I'm supposed to find the 'notepad' window? Sorry if this is a very dumb question lol
@ColinTimmins 1 year ago ⁺⁷
Cool stuff! I have struggled so much with language as I’m extremely dyslexic. ChatGPT has opened up the door 🚪 for me! 😊
@CSCoach 1 year ago
That's awesome :D So glad ChatGPT is able to help you :)
@ymhtpat 11 months ago ⁺⁴
I'm hung up on the last step. I'm on the latest Windows. Are you running "touch output.txt" and the tail command in the Mac equivalent of Command Prompt? I ran it on mine and got " 'touch' is not recognized as an internal or external command, operable program or batch file."
Where did I go wrong?
@joyaljijimon3419 8 months ago
Same here, bruh
@poisoned_durian8 10 months ago ⁺¹
I'm using Windows. I'm not sure whether the problem is the commands "echo. > output.txt" and "type output.txt | more".
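For readers stuck on this step on Windows, here is a minimal sketch of a cross-platform stand-in for `touch output.txt` followed by `tail -f output.txt`, using only the Python standard library. The function names `read_new_text` and `follow` are hypothetical, and the sketch assumes the tutorial script appends plain text to a file called output.txt, as other comments in this thread suggest.

```python
import time
from pathlib import Path


def read_new_text(path, offset):
    """Return (text appended after byte `offset`, updated offset)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data.decode("utf-8", errors="replace"), offset + len(data)


def follow(path, poll_seconds=0.5):
    """Print text as it is appended to `path`. Stop with Ctrl+C.

    Unlike `tail -f`, this starts from the beginning of the file.
    """
    Path(path).touch(exist_ok=True)  # creates the file if missing, like `touch`
    offset = 0
    while True:
        text, offset = read_new_text(path, offset)
        if text:
            print(text, end="", flush=True)
        time.sleep(poll_seconds)
```

Saving this as a separate script, adding a final line `follow("output.txt")`, and running it in a second terminal from the same folder as the speech-to-text script should show new text as it is written, on Windows, macOS, or Linux.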
@mastershonobi110 9 days ago
Gm, great vid! I am using Python 3.12 on a Mac (M1) and I'm unable to import pyaudio. The error states, "Failed to build installable wheels for some pyproject.toml based projects (pyaudio)". I have searched high and low for a fix. Any thoughts or direction? Thanks in advance.
@MrScgaming28 1 year ago ⁺²
Can you tell me how to do the last step of making that output file and tailing the output on Windows?
@anderson3889 11 months ago
Did you find a way?
@arvindh13 10 months ago
On Windows it will record your audio and print it to Notepad, but you need to restart Notepad to see it.
@sandrasajeev8640 6 months ago ⁺¹
@arvindh13 Hey! I'm also stuck at this step. Can you please explain where I have to put the commands to touch and tail the output?
@pattuchiitu8978 1 year ago
Thank you for the tutorial. It works now 👋
@mmjuuno 1 year ago ⁺²
How do I see the appended messages in the terminal? It worked, and I could open the output txt file and see what I was saying, but I'm not sure how to see it in real time (using Windows 10 and the Windows Subsystem for Linux to install Kali Linux, then opened bash in cmd). Also, when I stop the script (using PyCharm) it spits back a bunch of errors and I'm not sure why, or at least I think they're errors.
@CSCoach 1 year ago
I did this by running the tail command in another terminal. Though, you could also change line 43 to be print(text) rather than print("Wrote text")
@kavito5947 1 year ago
@CSCoach what's the name of the terminal you used?
@tugpsx640 1 year ago
This is great, thanks for sharing your tips and tricks.
@CSCoach 1 year ago
You bet!
@KNOCKOUT-t7e 9 months ago
@CSCoach what is the name of the app you used in the video?
@alexanderkartvelian4274 1 year ago
You are awesome! It would be great if you taught us how to send the generated text from the recognizer into "text to speech". Thanks for the video!
@SirMrMystery 11 months ago
He already did something like that in a video where he created a Jarvis-like program: th-cam.com/video/BEw5EFqCCEI/w-d-xo.htmlsi=wEeSWa18kFPppBUF
@loisisnel2955 3 months ago
Thanks for the course, but does anyone know how I can do it using an audio file instead of live voice input?
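One possible answer to the audio-file question, as a hedged sketch: the SpeechRecognition package (imported as `speech_recognition`, which the error messages elsewhere in this thread indicate the tutorial uses) provides `AudioFile`, `Recognizer.record`, and `Recognizer.recognize_google`, and the last accepts a `language` tag such as "fr-FR". The helper names `is_supported` and `transcribe_file` and the extension list are assumptions for illustration, not the video's code. `recognize_google` sends the audio to a web service, so it needs an internet connection.

```python
from pathlib import Path

# Assumed list: formats the SpeechRecognition docs describe for AudioFile.
# Check the documentation for your installed version.
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".flac"}


def is_supported(path):
    """Return True if the file extension looks readable by AudioFile."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def transcribe_file(path, language="en-US"):
    """Transcribe a recorded audio file instead of live microphone input."""
    if not is_supported(path):
        raise ValueError(f"Unsupported audio format: {path}")

    # Third-party package: pip install SpeechRecognition
    # Imported here so the helpers above work even without it installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()  # note the parentheses: this creates an instance
    with sr.AudioFile(str(path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    return recognizer.recognize_google(audio, language=language)
```

For example, `transcribe_file("meeting.wav", language="fr-FR")` would attempt a French transcription of a hypothetical meeting.wav. Reading from a file this way does not use the microphone, so PyAudio should not be needed for this path.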
@ernstb1234 3 months ago
It keeps telling me Zach command not found when I try installing the dependency. Please help.
@aotrakstar 1 year ago ⁺¹
Hi coach. Do you think using OpenAI's Whisper would give a more accurate result for transcribing speech?
@CSCoach 1 year ago
I found the Python library to be enough personally. Though, I would imagine Whisper to be better given that it's made by OpenAI. I'd expect it to be a better-trained AI.
@hjoseph777 4 months ago
You can use Whisper offline. I suggest faster-whisper or whisper.cpp
@mohitpandya_2228 11 months ago
This works fine on the first run. After the first run, the generated text takes about 10 minutes to reach the output. How can I fix that and make it as fast as the first run?
@Quagik 3 months ago
It's saying: AttributeError: module 'speech_recognition' has no attribute 'Recognizer'
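One common cause of this kind of AttributeError (an assumption here, since the commenter's files aren't shown) is a local script or folder named speech_recognition sitting next to the program, which Python imports instead of the installed package. A minimal sketch of a check, with the hypothetical helper name `find_shadowing_files`:

```python
from pathlib import Path


def find_shadowing_files(directory=".", module_name="speech_recognition"):
    """List local files or folders that would shadow an installed module."""
    folder = Path(directory)
    candidates = [folder / f"{module_name}.py", folder / module_name]
    return [path for path in candidates if path.exists()]


# Check the current folder; an empty list means nothing is shadowing the package.
print(find_shadowing_files())
```

If the list is not empty, renaming the local file (for example to my_transcriber.py) may resolve the error. Printing `speech_recognition.__file__` after the import is another way to see which file Python actually loaded.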
@abaizkhan4963 4 months ago ⁺¹
Any git repo link?
@Yajnco 8 months ago
My language, Hmong, is not available for voice to text. I have been searching for programs or guides that show how it's done, but can't find any. Not sure if Python is the solution. It's frustrating not knowing where to start.
@Illogical. 1 year ago
I need a thing that does a few steps less than what this does. I only want it to record phonetics, maybe spacing between words, maybe intonation, and maybe recognize when I'm pausing to remember a word.
@fransuacordero5407 6 months ago
One question: does this work only for the English language, or can it be used with other languages?
@Hbdisaster_28 7 months ago
For me, whenever I say something, the terminal shows "Wrote text" but I don't know where it writes the text (I'm on Windows, btw). Can anyone please help?
@arpitv2003 7 months ago ⁺¹
It would be creating a text file named "output.txt" and writing into it.
@Hbdisaster_28 7 months ago ⁺¹
Thank you
@Kenoki-yi3gf 7 months ago
I am doing this on Windows and I can't seem to find where the text is saved. Can someone please help?
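A small sketch that may help locate the file: assuming the script opens "output.txt" by its bare name (as the reply above suggests), the file lands in whatever folder the script is run from, which in an IDE is usually the project folder. The helper name `output_location` is hypothetical.

```python
from pathlib import Path


def output_location(filename="output.txt"):
    """Return the absolute path that a relative filename resolves to right now."""
    return Path(filename).resolve()


# Shows the full path, including the current working directory.
print(output_location())
```

Adding the print line to the top of the speech-to-text script and running it the usual way should show the exact folder to look in.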
@workstation-s2n 5 months ago
Could you please post a link to the full file?
@JamesDonkor-edu 1 year ago ⁺¹
Hey, great video! Would this work for other languages?
@danielcasas7343 8 months ago
I also have the same question.
@Ddetektiv 1 year ago
How do you get the terminal? When I clicked on Terminal, a new one opened showing PS C:\Users\[my name]>, but as far as I can see, yours doesn't show that. Any way to fix it?
@snaxsammy6472 1 year ago
At the top bar of the terminal where it says "Problems", click the drop-down arrow next to the "+" at the far right of that bar, then click "Command Prompt". You should be able to run the code from there. At least that's what I did and it worked for me.
@anirudhsrisai3397 1 year ago
Think of integrating it with ChatGPT and getting answers just by using the text we have received from speech to text.
@CSCoach 1 year ago ⁺¹
Haha, did you see this video I made :) I believe I do what you suggested in the comment.
th-cam.com/video/BEw5EFqCCEI/w-d-xo.html
@sicfrydred 1 year ago ⁺¹
What program did you use to do this?
@CSCoach 1 year ago ⁺²
Did it in Python :)
@Bartdu59Gaming 1 year ago
He used "VS Code" and the language used for the program is Python.
@kavito5947 1 year ago
@Bartdu59Gaming what's the name of the terminal he used?
@snaxsammy6472 1 year ago ⁺¹
Hi Coach! MrScgaming28 and I, from the comment section, seem to have the same issue. Could you re-explain how to do the last step of making that output file and tailing the output on Windows?
@arvindh13 10 months ago
Create a text document & copy the source --> paste it in the program.
@story-4-you-man 8 months ago ⁺¹
@arvindh13 can you explain this? Program?
@usus8420 9 months ago
Hi, is it a desktop-only solution?
@adrio1569 1 year ago ⁺¹
How can we add languages to it?
@arvindh13 10 months ago ⁺¹
You need pre-designed language libraries, or you need to make a new one with a lot of data.
@OmarAbdelrazek-tn9du 1 year ago
I still don’t understand how to download the libraries. I have a Mac M2 and the documentation you gave didn’t help me. So can you or anyone else help me?
@felixforster5836 1 year ago
You have to type in the terminal "pip3 install ...."
@jimjones26 1 year ago
I am going to work on adding a "trigger" word that will let the program know I want to act.
@CSCoach 1 year ago
That's cool!! If you get that to work, I'd love to know how
@mosawarjamshady2989 2 months ago
Hey, did you end up figuring this out?
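Nobody in the thread posted a solution, so the following is only one possible sketch of the trigger-word idea: once the recognizer returns a line of text, check whether it contains a chosen word and treat whatever follows as the command. The function name `command_after_trigger` and the default trigger "computer" are hypothetical choices.

```python
def command_after_trigger(text, trigger="computer"):
    """Return the words after the first `trigger` word, or None if it is absent."""
    words = text.lower().split()
    # Strip simple punctuation so "Computer," still matches the trigger.
    cleaned = [word.strip(".,!?") for word in words]
    if trigger.lower() not in cleaned:
        return None
    index = cleaned.index(trigger.lower())
    return " ".join(words[index + 1:])


print(command_after_trigger("hey computer open the notes"))
```

In the transcription loop, the recognized text would be passed to this function, and the program would act only when the result is not None.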
@i.a_n_i_l_k_u_m_a_r 11 months ago
How to stop recording the voice?
@Lunaraa0 1 year ago
Hi Oscar, does it work for French?
@mohamed_Sultan1504 9 months ago
Does anyone know how to put it in a user-friendly app so anyone can use it easily?
@user-db4nm2rp2w 11 months ago
If I want to do only speech to text, then I don't need to install pyaudio, right?
@arvindh13 10 months ago
Yes, you need PyAudio and the Speech recognition fork library.
@Der.Mercado 23 days ago
Can't get the code.
@suissegarantiegaming2100 1 year ago
Could this also work with another language?
@CSCoach 11 months ago
Programming language? Or do you mean the text being output in a different language? Either way, I'm pretty sure the answer would be yes :)
@hjoseph777 4 months ago
You did not show much. Also, where is the Git repository link? Smartie
@buyerinsight 10 months ago
Why import pyttsx3? It is never used.
@arvindh13 10 months ago
It's not required, yes.
@KNOCKOUT-t7e 9 months ago
What app did he use in this video?
@stylloofdreams975 9 months ago ⁺¹
Visual Studio Code
@manchishirisha8013 6 months ago
Where are you writing the code?
@ruzix_yeager4938 1 month ago
VS Code
@mohamed_Sultan1504 9 months ago
A real king
@KutuluTuk 1 year ago
I couldn't import pyttsx3. It says unused and there are a lot of errors 😂
@arvindh13 10 months ago
You need to install it; it's not a built-in library.
@thunderchills640 5 months ago
Thanks
@WolFX_FPS 1 year ago ⁺¹
I have an error with the speech_recognition library:
r.adjust_for_ambient_noise(source2, duration=0.2)
TypeError: Recognizer.adjust_for_ambient_noise() missing 1 required positional argument: 'source'
Any ideas?
@arvindh13 10 months ago
You need to provide the source before that. Please check the program again, and if it still does not work, install the Speech recognition fork library.
@gleful262 8 months ago
Got this error, fixed it by adding () to Recognizer
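The fix in the last reply can be illustrated with a stand-in class. This sketch does not use the real speech_recognition package; `Recognizer` below is a hypothetical placeholder with the same method name, just to show why leaving out the parentheses produces the "missing 1 required positional argument: 'source'" error.

```python
class Recognizer:
    """Stand-in for illustration only; not the real speech_recognition class."""

    def adjust_for_ambient_noise(self, source, duration=1):
        return f"calibrating on {source} for {duration}s"


source2 = "microphone"  # placeholder for the real audio source

wrong = Recognizer  # the class itself: no parentheses, so no instance is created
try:
    wrong.adjust_for_ambient_noise(source2, duration=0.2)
except TypeError as error:
    # source2 was consumed as `self`, so `source` is reported as missing.
    print(error)

right = Recognizer()  # parentheses create an instance
print(right.adjust_for_ambient_noise(source2, duration=0.2))
```

In the tutorial script, this would correspond to writing `r = sr.Recognizer()` rather than `r = sr.Recognizer`.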
@hebataha5197 8 months ago
Does this work offline ?
@brianmckeown70 5 months ago
^
@dungnguyen-si2sz 9 months ago
I am having trouble with this error:
1 error generated.
error: command '/usr/bin/clang' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pyaudio
Failed to build pyaudio
ERROR: Could not build wheels for pyaudio, which is required to install pyproject.toml-based projects
I can't install the last two packages you gave us. Hope you reply soon. Thank you for the nice work, sir.
@mikethompson6455 8 months ago
Getting the same error. Hope someone can help.
@mikethompson6455 8 months ago ⁺¹
Did this. I'm on macOS, so:
If you're using macOS, you can install the necessary tools using Homebrew. First, make sure you have Homebrew installed, then run:
brew install portaudio
After installing portaudio, you can try installing PyAudio again using pip3 install pyaudio.
@gleful262 8 months ago
Had this issue on Windows; fixed it by running as admin.
@hjoseph777 4 months ago
Your screen is not very clear.
@odogbolahan8148 7 months ago
My name is Oscar too, haha
@soham0726 8 months ago
Source code?
@phpsolutioncode9309 1 month ago
I'm continuing your work! Thank you! Your video was essential for my idea. th-cam.com/video/W6JcI9Qw8aU/w-d-xo.html
