Transcribe Video to Text with Python and Watson in 15 Minutes

Nicholas Renotte

Views: 52,351

Published on 24 Oct 2024

Comments • 170

@M310GL 3 years ago ⁺⁷
Amazing tutorial, everything worked smoothly. Hopefully, IBM will provide better models for non-English languages in the future.
@juandavidruizcohen1380 3 years ago ⁺⁵
Such a good tutorial! Would love to see some content on the actual write-up and training of these models. Keep up the good work.
@NicholasRenotte 3 years ago ⁺²
Heya @Juan, thanks so much! More of this coming this year!
@kei4838 2 years ago
Great! You can highlight and download all the transcripts with one click with Glasp.
@jloibman 2 years ago ⁺¹
Hi!
When I tried the command "subprocess.call(command, shell=True)" for an mp4 video it returned 1.... Do you know how I can fix that to generate the .wav? Thanks!
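A return code of 1 only says that the command failed; the actual reason is in ffmpeg's error output, which `subprocess.call` does not hand back. A minimal sketch for surfacing it, assuming the failing command is the ffmpeg call from the tutorial. The helper name `run_and_report` is hypothetical and not part of the video's code.

```python
import subprocess


def run_and_report(command):
    """Run a shell command and return (return_code, stderr_text).

    Prints the captured error output when the command fails, so the
    reason behind a bare "1" becomes visible in the notebook.
    """
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True
    )
    if completed.returncode != 0:
        print(f"Command failed with return code {completed.returncode}:")
        print(completed.stderr)
    return completed.returncode, completed.stderr
```

Calling `run_and_report('ffmpeg -i yourfilename.mp4 -ab 160k -ar 44100 -vn audio.wav')` would then print ffmpeg's own message (for example a missing input file or ffmpeg not being installed) instead of just returning 1.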
@harry_code 3 years ago ⁺⁴
Really informative and concise, Nicholas...! Thanks a ton for this awesome tutorial!
@NicholasRenotte 3 years ago ⁺¹
Thanks so much @Hariharan, glad you enjoyed it!
@guyincognito1985 3 years ago ⁺¹
Every so often I find a YT channel so awesome that I say to it... "Where have you been all my life?" Are any of these ML speech-to-text services accurate enough to use? I paused the video and read the transcript and it seemed pretty "garbled".
@NicholasRenotte 3 years ago
Hahahah thanks so much @Guy Incognito! They're good; a lot of the time it really boils down to the quality of the audio and using accent-specific models, e.g. a lot of models would suck for me unless I used an Australian-specific model!
@ansh6848 3 years ago ⁺¹
Wow! That's amazing, but is there any way we can convert text to video?
@thebigbigdaddy 1 year ago
Would you have something that also detects different speakers? Great video!
@MarcVerwerft 2 years ago ⁺²
Absolutely spot on - good content, good explanation, fast tutorial with all the basics. Thanks a million ;-)
@DennyBaso 3 years ago ⁺¹
When I try to run "!brew install ffmpeg", this message shows: >'brew' is not recognized as an internal or external command,
operable program or batch file< How do I fix this? I use the Windows operating system.
@danieljuca 1 year ago ⁺²
Is it possible to convert this text into a subtitle file...?
@rohithkumarbairy6034 3 years ago ⁺²
I'm not able to store the audio. I'm using Windows 10 and Jupyter Notebook; any suggestions?
@NicholasRenotte 3 years ago
Heya @Rohith, does the folder you're trying to put it in exist?
@asmitamondal705 2 years ago
Hello, awesome video... I was just wondering if this would work for videos which do not have a YouTube transcript.
@wannaknowme2841 3 years ago ⁺¹
What command do we use in cmd? On Mac we used "open ."; how about Windows?
@clairematthews2255 4 years ago ⁺²
Thanks for the great videos @nicholas! At step 3 I am getting a "NameError: name 'stt' is not defined" - any tips?
I also wondered how this source code would change if you were using already prepared audio .wav files in the Jupyter folder?
@NicholasRenotte 4 years ago
Heya @Claire! Thanks so much 🙏. Just checking, did the code below run successfully? It's possible that if you weren't able to authenticate then the stt variable wouldn't be available; once the code below runs fine you should be good to go.
authenticator = IAMAuthenticator(apikey)
stt = SpeechToTextV1(authenticator=authenticator)
stt.set_service_url(url)
Ah, if you've already got preprocessed audio files you can skip the audio extraction. Check this out, it's straight Speech (Audio WAV) to Text: th-cam.com/video/A9_0OgW1LZU/w-d-xo.html
@AthulyaPD 2 days ago
youtube-dl doesn't work anymore! I get the message "Due to a ruling of the Hamburg Regional Court, access to this website is blocked." when I open the link to update it.
@francycharuto 3 years ago ⁺²
You're the man! Thanks for putting it together.
@NicholasRenotte 3 years ago
Anytime!!! Pumped it’s proving useful!
@stanleymwangi6524 4 years ago ⁺¹
Awesome tutorial. What if you want to generate srt files instead of a transcript?
@NicholasRenotte 4 years ago
Thanks so much @Stanley Mwangi, I started looking into this yesterday. I've got it added to the list of upcoming vids 👨‍💻
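Several comments ask about subtitle files. One possible approach, sketched here under assumptions rather than taken from the video: request word timings by passing `timestamps=True` to `stt.recognize` (the parameter appears in the function signature shown in the tracebacks in this thread), and assume each alternative then carries a `timestamps` list of `[word, start_seconds, end_seconds]` entries. The helper names `format_srt_time` and `results_to_srt` are hypothetical.

```python
def format_srt_time(seconds):
    """Convert seconds (float) to the SRT timestamp format HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def results_to_srt(res):
    """Build SRT text with one cue per transcribed result.

    Assumes res['results'][i]['alternatives'][0]['timestamps'] is a list
    of [word, start, end] entries; results without timings are skipped.
    """
    cues = []
    for result in res.get("results", []):
        best = result["alternatives"][0]
        words = best.get("timestamps", [])
        if not words:
            continue
        start, end = words[0][1], words[-1][2]
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{best['transcript'].strip()}\n"
        )
    return "\n".join(cues)
```

Writing the returned string to a file such as `output.srt` should give a subtitle file most players accept. When the audio was split into chunks, each chunk's times start at zero, so an offset per chunk would have to be added.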
@patrickjane276 3 years ago ⁺¹
Awesome man, thanks so much! Any idea if there's a Watson model that knows when to add exclamation points or question marks? Trying to come up with a way to show sentence importance.
@NicholasRenotte 3 years ago
Ooooh, I don't think so @Max. What's the goal regarding importance?
@alishaansari9086 3 years ago ⁺²
Amazing!!! This is exactly what I was looking for.
I got this error during execution; I don't know if this is just a server issue or due to my code.
ApiException Traceback (most recent call last)
in
1 with open('audio1.wav','rb') as f:
----> 2 res = stt.recognize(audio=f, content_type='audio1/wav', model='en-US_NarrowbandModel', timestamp=True, continuous=True).get_result()
~\anaconda3\lib\site-packages\ibm_watson\speech_to_text_v1.py in recognize(self, audio, content_type, model, language_customization_id, acoustic_customization_id, base_model_version, customization_weight, inactivity_timeout, keywords, keywords_threshold, max_alternatives, word_alternatives_threshold, word_confidence, timestamps, profanity_filter, smart_formatting, speaker_labels, customization_id, grammar_name, redaction, audio_metrics, end_of_phrase_silence_time, split_transcript_at_phrase_end, speech_detector_sensitivity, background_audio_suppression, **kwargs)
504 data=data)
505
--> 506 response = self.send(request)
507 return response
508
~\anaconda3\lib\site-packages\ibm_cloud_sdk_core\base_service.py in send(self, request, **kwargs)
265 status_code=response.status_code)
266
--> 267 raise ApiException(response.status_code, http_response=response)
268 except requests.exceptions.SSLError:
269 logging.exception(self.ERROR_MSG_DISABLE_SSL)
ApiException: Error:
Internal Server Error
Internal Server Error - Write
The server encountered an internal error or misconfiguration and was unable to
complete your request.
Reference #4.debd7768.1616270159.1230d52
, Code: 503
Really appreciate if you could help me with this.
@NicholasRenotte 3 years ago
Hmmm, without seeing your code I've got a feeling it might be an incorrect APIKEY. Can you check this is correct and also compare your code to the sample code in the description?
@emchivi5780 3 years ago ⁺¹
Hey man, did you find out what the problem was? I'm getting the same error, thanks!
@khubir.4483 3 years ago ⁺¹
'brew' is not recognized as an internal or external command,
operable program or batch file.
I don't know what the problem is.
@NicholasRenotte 3 years ago
Try installing homebrew: brew.sh/
@destinibuckner1773 3 years ago ⁺¹
Is there a different format for what command should equal if you're using Windows? I'm not getting an audio file output.
@NicholasRenotte 3 years ago
It should be the same @Destini, I'm using Windows here as well :)
@sahilgarg4850 3 years ago ⁺¹
Hey man! Great content. Just a quick question: what is the alternative to 'apiKey' and 'URL' if we have to use it for multiple videos and unlimited minutes without paying anything (free)? Is there any other way to do that?
@NicholasRenotte 3 years ago ⁺¹
Could look at open source alternatives. Haven't dug into too many myself.
@varunvora816 3 years ago ⁺¹
Great job. Really helpful!
@tennisboi1 2 years ago ⁺¹
Not sure if this is because of updates they have made. I'm on step 3 using the recognize command and keep getting the error "request() got an unexpected keyword argument 'continuous'". I wonder if you know how they have updated the library to get this step working again.
@user-dg8ys 2 years ago
Same problem here.
@tennisboi1 2 years ago
@@user-dg8ys I ended up just deleting the continuous argument after looking through the library. That worked for me, though I had to run it twice.
@user-dg8ys 2 years ago
@@tennisboi1 It didn't work for me like that, actually. Did you also delete the comma to the left of continuous?
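The tracebacks posted in this thread suggest why `continuous` fails: `recognize` no longer lists it as a parameter, and newer SDK versions forward unknown keyword arguments to the underlying HTTP client, which rejects them. A small sketch of one way to guard against that, by keeping only the keywords the function actually declares. The helper name `keep_supported_kwargs` is hypothetical.

```python
import inspect


def keep_supported_kwargs(func, **kwargs):
    """Return only the kwargs that func declares as named parameters.

    Anything that would merely be swallowed by **kwargs (such as a
    removed option like 'continuous') is dropped.
    """
    named = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    }
    return {key: value for key, value in kwargs.items() if key in named}
```

Used as `stt.recognize(audio=f, **keep_supported_kwargs(stt.recognize, content_type='audio/wav', model='en-US_NarrowbandModel', continuous=True))`, the call would go out without `continuous`. Simply deleting `continuous=True` together with its preceding comma, as described above, has the same effect.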
@eugeneshilow 3 years ago ⁺¹
Worked for me ONE time. But then at Step 3 I started getting this error: ApiException: Error:
"The server encountered an internal error or misconfiguration and was unable to
complete your request.
Reference #4.cdc7b5c.1609227022.c04aea
, Code: 503"
How do I fix it?
@eugeneshilow 3 years ago ⁺¹
More info on the error: "Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/ibm_cloud_sdk_core/base_service.py", line 224, in send
raise ApiException(
ibm_cloud_sdk_core.api_exception.ApiException: "
@eugeneshilow 3 years ago ⁺²
Fixed this. The error was due to a typo in the model's name: en-US_NarrowbandModel
@NicholasRenotte 3 years ago ⁺¹
Heya @Eugene, awesome you got it working!
@patrickjane276 3 years ago ⁺¹
@@eugeneshilow fireeeeeee. Now I love YouTube comments again. Thank you so much for posting this.
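Since a misspelled model name shows up only as an opaque 503, it may help to check the name locally before calling the service. A minimal sketch, assuming you have a list of valid model names at hand; the three names below are the ones that appear in this thread, and fetching the full list from the service (for instance through the SDK's model listing call) is left as an assumption. The helper name `check_model_name` is hypothetical.

```python
import difflib


def check_model_name(name, available):
    """Return (is_valid, suggestions) for a Speech to Text model name.

    suggestions holds up to three close matches from available when
    the name is not found, which makes typos easy to spot.
    """
    if name in available:
        return True, []
    return False, difflib.get_close_matches(name, available, n=3, cutoff=0.6)


# Model names mentioned in the comments; not an exhaustive list.
KNOWN_MODELS = [
    "en-US_NarrowbandModel",
    "en-AU_NarrowbandModel",
    "en-GB_BroadbandModel",
]
```

The same idea applies to `content_type`: one of the tracebacks above passes `'audio1/wav'` where the tutorial uses `'audio/wav'`.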
@DevsLikeUs 4 years ago ⁺²
Awesome tutorial, thank you!
@NicholasRenotte 4 years ago
Thanks a billion! Glad you enjoyed it!! #happycoding
@datareactor4143 2 years ago
Hi Nicholas, I'm unable to get the audio file here. I tried different short video files; do I need to change any other parameters accordingly? I'm not getting any error, but the audio.wav file is not extracted.
@praveenshahani5339 2 years ago
I have a question: is it possible to add a command which can search for the YouTube video and articulate the speech?
@ju1042 3 years ago ⁺¹
Question: would it work with a video in a different format like mp4?
@NicholasRenotte 3 years ago ⁺¹
Yup!
@NicholasPadilha 2 years ago
Is it possible to get the timestamps of what was said in the video / audio?
@xiaohanzhang4052 3 years ago ⁺¹
Fantastic video! Thanks~
I tried your code with your video file. But I got the following error in the final API call:
"It is required that you pass in a value for the "algorithms" argument when calling decode()"
Is anyone else seeing that error? Do you happen to know the reason for it?
@NicholasRenotte 3 years ago
Heya @Xiaohan, yup, one of the other subscribers figured out this was an issue with PyJWT. Try installing 1.7.1 and it should fix the issue. Example install:
pip install PyJWT==1.7.1
@nevaehthompson5818 2 years ago
Great vid, helped a lot. Sadly, however, I'll have to find a different method bc IBM deactivated my account for no reason with no warning.
@MrMwenesi 3 years ago ⁺²
How would you do the same for a live YouTube video?
@NicholasRenotte 3 years ago ⁺¹
Pass through the audio feed to the API!
@kavitham4526 3 years ago ⁺¹
Hi, this is very helpful. Is there any possibility of writing code to convert text to video animation? If you do so, it will be very helpful for us.
@NicholasRenotte 3 years ago ⁺¹
I believe so, I haven't dug into it yet but you could look at using GANs!
@kavitham4526 3 years ago
@@NicholasRenotte Thank you so much
@yogesharora-g6w 2 months ago
Hey, please help. I want the transcript of every video on one YouTube channel. How is that possible?
@gauravmalik3911 2 years ago
Worked for me, cheers
@JustKamKam 1 year ago
Why is it impossible to set up an IBM account? Trying to replicate this and I can't create an IBM Cloud account.
@PakistanInstitute 1 year ago
Bro, I have the VS Code editor and installed youtube-dl in it, but I'm unable to download a YouTube video. In which editor are you running youtube-dl, where you paste a video link and it starts downloading? Please guide me step by step 🙂
@SurajSingh-lu8ei 1 year ago
How can I transcribe a 9-hour video within minutes? Is it possible? Please reply, I am working on a project.
@pepedecastro3352 2 years ago
I keep getting that continuous is not an argument? Any help please.
@davegamboa- 2 years ago
same
@raphaelradespiel9970 3 years ago ⁺¹
Hi, so, I'm trying this project out so that I can speed up my transcription tasks. I was able to fix some previous problems and learn a bit more about Jupyter Notebook, but now I've hit a wall that I just can't seem to find a solution for. In the "Open Audio Source and Convert" part, the first code cell, I've been getting this error that says: ERROR:root:Error in service call
Then the info appears, and there were three consecutive "ConnectionAbortedError: [WinError 10053]" errors.
Could you help me out? I've read it has to do with the anti-virus or my firewall blocking the connection, but I've disabled them all to test this out. Maybe I just haven't configured Python correctly or I missed a step. Anyway, do you have any idea what this could be? I can send you the error message if you want. (Thanks in advance.)
@NicholasRenotte 3 years ago
Heya @Raphael, yup, shoot through the error!
@raphaelradespiel9970 3 years ago
@@NicholasRenotte ok, here it goes:
ERROR:root:Error in service call
Traceback (most recent call last):
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 706, in urlopen
chunked=chunked,
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connection.py", line 234, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1065, in _send_output
self.send(chunk)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 986, in send
self.sock.sendall(data)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 972, in sendall
v = self.send(byte_view[count:])
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 941, in send
return self._sslobj.write(data)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 642, in write
return self._sslobj.write(data)
ConnectionAbortedError: [WinError 10053] Uma conexão estabelecida foi anulada pelo software no computador host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\util\retry.py", line 531, in increment
raise six.reraise(type(error), error, _stacktrace)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\packages\six.py", line 734, in reraise
raise value.with_traceback(tb)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 706, in urlopen
chunked=chunked,
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connection.py", line 234, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1065, in _send_output
self.send(chunk)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 986, in send
self.sock.sendall(data)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 972, in sendall
v = self.send(byte_view[count:])
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 941, in send
return self._sslobj.write(data)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 642, in write
return self._sslobj.write(data)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionAbortedError(10053, 'Uma conexão estabelecida foi anulada pelo software no computador host', None, 10053, None))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\ibm_cloud_sdk_core\base_service.py", line 227, in send
response = requests.request(**request, cookies=self.jar, **kwargs)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionAbortedError(10053, 'Uma conexão estabelecida foi anulada pelo software no computador host', None, 10053, None))
@raphaelradespiel9970 3 years ago ⁺¹
@@NicholasRenotte Hey, never mind, I found out there was a 100MB limit on the audio files. I fixed it and it's working just fine right now. Thanks for the tutorial my dude.
@NicholasRenotte 3 years ago
@@raphaelradespiel9970 anytime!! Glad you got it!
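Because an oversized upload surfaces only as a connection abort, a size check before calling the service can save some debugging. A minimal sketch; the 100MB figure comes from this thread, and treating it as exactly 100 * 1024 * 1024 bytes is an assumption. The names `MAX_UPLOAD_BYTES` and `needs_splitting` are hypothetical.

```python
import os

# Limit reported in the comments; the exact byte count is an assumption.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def needs_splitting(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at path is larger than limit bytes."""
    return os.path.getsize(path) > limit
```

If `needs_splitting('audio.wav')` returns True, the file would need to be split into smaller chunks (or re-encoded at a lower bitrate) before it is sent.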
@rachidaboussaid501 3 years ago ⁺¹
How can I get subtitles with Watson, my dear fellow, as an srt or vtt file to launch with that video?
@NicholasRenotte 3 years ago ⁺¹
Oooh, haven't gone that far unfortunately rachid!
@bronsonranga5770 4 years ago ⁺¹
Very informative, thank you bro ❤️
@NicholasRenotte 4 years ago
Thanks so much @Bronson Ranga, glad you enjoyed it!
@nobleson685 4 years ago ⁺¹
In the third step, "3. Open Audio Source and Convert", the program throws me this error, and I am unable to correct it: ConnectionError: ('Connection aborted.', OSError("(32, 'EPIPE')")). Do you know why it happened? Thanks
@NicholasRenotte 4 years ago ⁺²
Let's dig a little further, what machine/OS are you using and can you paste your code below? :)
@nobleson685 4 years ago ⁺¹
@@NicholasRenotte macOS Mojave 10.14.6
with open('audio.wav', 'rb') as f:
    res = stt.recognize(audio=f, content_type='audio/wav', model='en-US_NarrowbandModel', continuous=True).get_result()
@NicholasRenotte 4 years ago ⁺¹
Is your audio file greater than 100MB? It might be because the file is too large, as the STT service can only handle audio up to 100MB. Try the code below: it'll split your audio file first and loop through the chunks to convert them! I commented #NEW where there are new code bits! Let me know how you go.
# 0. Install and Import Dependencies
!pip install ibm_watson
!brew install ffmpeg
import subprocess
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
# NEW Import os to loop through directory
import os
# 1. Extract Audio
command = 'ffmpeg -i aiml.mkv -ab 160k -ar 44100 -vn audio.wav'
subprocess.call(command, shell=True)
# NEW Split audio files into manageable chunks
command = 'ffmpeg -i audio.wav -f segment -segment_time 240 -c copy out%03d.wav'
subprocess.call(command, shell=True)
# 2. Setup STT Service
apikey = 'YOUR API KEY'
url = 'YOUR URL'
# Setup service
authenticator = IAMAuthenticator(apikey)
stt = SpeechToTextV1(authenticator=authenticator)
stt.set_service_url(url)
# 3. Open Audio Source and Convert
# NEW loop through audio files and convert
results = []
for filename in os.listdir('.'):
    if filename.endswith(".wav"):
        with open(filename, 'rb') as f:
            res = stt.recognize(audio=f, content_type='audio/wav', model='en-AU_NarrowbandModel', continuous=True).get_result()
            results.append(res)
# 4. Process Results and Output to Text
len(res['results'])
# Preprocess transcriptions
text = []
for file in results:
    for result in file['results']:
        text.append(result['alternatives'][0]['transcript'].rstrip() + '.\n')
text = [para[0].title() + para[1:] for para in text]
transcript = ''.join(text)
with open('output.txt', 'w') as out:
    out.writelines(transcript)
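One thing to watch in the loop above: `os.listdir('.')` filtered on `.wav` also matches the original `audio.wav`, and it does not guarantee any order, so the transcript could come out shuffled or include the full file again. A small sketch of a stricter selection, assuming the chunks keep the `out%03d.wav` naming from the split command. The helper name `list_chunks` is hypothetical.

```python
import glob
import os


def list_chunks(directory=".", pattern="out*.wav"):
    """Return the chunk files matching pattern, sorted by name.

    With zero-padded names such as out000.wav, out001.wav, sorting by
    name is the same as sorting by position in the original audio.
    """
    return sorted(glob.glob(os.path.join(directory, pattern)))
```

Replacing the `os.listdir('.')` loop with `for filename in list_chunks():` would then send only the chunks, in playback order.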
@nobleson685 4 years ago
@@NicholasRenotte You are right. The audio file was above 100MB. Thank you for the updated code. I was able to split the audio file into
@NicholasRenotte 4 years ago
Awesome, we're through that! Is that the full error? Also two things: have you updated the API key and URL? And are you behind a firewall at the moment? The request will need to be able to go out to the cloud service.
@alex-vq1yy 7 months ago
Bro, if you have another method without using an IBM key then please tell me.
@humzaali5980 3 years ago
Hello, first of all amazing video and keep it up. I just want to ask you if I can use a path for the input and output files.
@NicholasRenotte 3 years ago
Sure can, just add in the following paths in the read and output sections of the code:
with open('PATH_TO_INPUT_FILE/audio.wav', 'rb') as f:
    res = stt.recognize(audio=f, content_type='audio/wav', model='en-AU_NarrowbandModel', continuous=True).get_result()
...
text = [para[0].title() + para[1:] for para in text]
transcript = ''.join(text)
with open('PATH_TO_OUTPUT_FILE/output.txt', 'w') as out:
    out.writelines(transcript)
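When writing to a custom output path, the target folder has to exist first (the same point raised earlier in the thread about storing the audio). A minimal sketch using `pathlib` that creates the folder if needed before writing; the helper name `write_transcript` and its parameters are hypothetical.

```python
from pathlib import Path


def write_transcript(text_lines, output_dir, filename="output.txt"):
    """Join text_lines and write them to output_dir/filename.

    Creates output_dir (including parents) when it does not exist yet
    and returns the path of the written file.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    out_path.write_text("".join(text_lines), encoding="utf-8")
    return out_path
```

`pathlib` also takes care of the path separator, so the same call works on Windows and macOS.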
@humzaali5980 3 years ago ⁺¹
@@NicholasRenotte Thank you very much for replying Nicholas. I have tried your code but it does not transcribe all videos properly. Can you tell me what I am doing wrong? Thanks
@NicholasRenotte 3 years ago
@@humzaali5980 definitely, what seems to be happening? Any errors?
@humzaali5980 3 years ago ⁺¹
@@NicholasRenotte Just that sometimes it does not transcribe properly, e.g. it outputs something like "so its to solid it here to you". So sometimes it does not come out right, especially with a movie clip.
@NicholasRenotte 3 years ago ⁺¹
Ahhhh @@humzaali5980, you might need to refine the model sometimes depending on the audio quality and accents! cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-customization
@jackReme 2 years ago ⁺¹
Thanks a lot!
@butternuts842 1 year ago
How can I make it so it can tell the difference between 2 or more speakers?
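The `recognize` signature shown in the tracebacks in this thread includes a `speaker_labels` parameter, which looks like the relevant option. A sketch of how its output might be turned into speaker turns, under assumptions that are not confirmed by the video: that the response then carries a top-level `speaker_labels` list of entries with `from`, `to`, and `speaker` keys, and that each `from` value matches the start time of a word in the `timestamps` lists. The helper name `words_by_speaker` is hypothetical.

```python
def words_by_speaker(res):
    """Group consecutive words by speaker.

    Returns a list of (speaker_id, text) tuples in spoken order.
    Assumes res['speaker_labels'] entries have 'from' and 'speaker',
    and that word timings are [word, start, end] lists.
    """
    # Map each word's start time to the speaker who said it.
    speaker_at = {
        label["from"]: label["speaker"]
        for label in res.get("speaker_labels", [])
    }
    turns = []
    for result in res.get("results", []):
        timings = result["alternatives"][0].get("timestamps", [])
        for word, start, _end in timings:
            speaker = speaker_at.get(start)
            if turns and turns[-1][0] == speaker:
                turns[-1][1].append(word)
            else:
                turns.append((speaker, [word]))
    return [(speaker, " ".join(words)) for speaker, words in turns]
```

The call would then look like `stt.recognize(audio=f, content_type='audio/wav', model=..., speaker_labels=True, timestamps=True)`; whether a given model supports speaker labels is something to verify in the service documentation.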
@aryanvijay6081 3 years ago ⁺¹
Hey, I had a question: what happens when we do it on an mp4 file?
@NicholasRenotte 3 years ago
Should still work, just need to change your input format.
@devpriyashivani1855 1 year ago
Hey, I'm getting this error:
TypeError: Session.request() got an unexpected keyword argument 'continuous'
@devpriyashivani1855 1 year ago
When I removed the continuous=True, I got the below error:
ApiException: Error:
Internal Server Error
Internal Server Error - Write
The server encountered an internal error or misconfiguration and was unable to
complete your request.
Reference #4.4d752c31.1672464316.3358daf9
, Code: 503
@1UniverseGames 3 years ago
Is it possible to extract a YouTube video's subtitles/voice into text? Any suggestions?
@NicholasRenotte 3 years ago
Haven't tried it but in theory it should work.
@vaishaligunjal582 1 year ago
Getting an error while importing libraries:
AttributeError: partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
@foxtrothu2831 3 years ago ⁺²
Doesn't work for me! Return Code: 1 instead of 0
@NicholasRenotte 3 years ago ⁺¹
Heya @Foxtrot, any additional errors?
@foxtrothu2831 3 years ago ⁺¹
@@NicholasRenotte Thank you for your reply. I wrote all the code the same as yours in the video.
#### But in the part where audio is extracted from the video, it returns code 1 instead of 0. I have no idea what it means and no additional error explanation is shown. After googling this issue, I decided it was better to give up. So I used PR to convert the mp4 to wav.
#### In the part where audio is converted to text through Watson STT, my connection would be aborted after several minutes of running the code. Maybe it's because I'm in China? Even using a VPN, I am still not able to access it.
Thanks for your video after all. I know it's not easy to make a video. Keep up the great work!
@NicholasRenotte 3 years ago
@@foxtrothu2831 thanks, appreciate the feedback. Weird, I would've thought you could access the API regardless of location.
@anirudhc426 3 years ago ⁺¹
Awesome video!
@NicholasRenotte 3 years ago
Thanks so much 🙏!!
@MrMwenesi 3 years ago ⁺¹
And...can you also add translation with live video?
@NicholasRenotte 3 years ago ⁺¹
Heya man!! It's using PyTorch so the nearest equivalent would be ONNX or CoreML, check this out: github.com/ultralytics/yolov5/issues/251
@MrMwenesi 3 years ago
@@NicholasRenotte That requires more explanation. I am basically a nocode/lowcode developer of conversational agents. How would I integrate it into the Jupyter notebooks?
@NicholasRenotte 3 years ago
@@MrMwenesi oh, I think my comments got mixed up. Ignore that one. Check this out, th-cam.com/video/YCyuZM454_I/w-d-xo.html handles the live audio bit. Could extract the audio feed in real time and do something like that.
@dogs8113 2 years ago
How to make pyttsx3 read subtitles of text from YouTube and at the same time dub with translation?
@openmindjustdoit1306 3 years ago ⁺¹
Good job. Does this support the Arabic language or not?
@NicholasRenotte 3 years ago
Sure does! cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models
@lukajvv.7839 4 years ago
Hey man, I have Python installed on my computer but I keep getting an error stating that 'brew' is not recognized as an internal or external command, operable program or batch file. What should I do?
@lukajvv.7839 4 years ago ⁺¹
Sorry, I am working on a Windows machine, but I am not sure how to install ffmpeg for it.
@NicholasRenotte 4 years ago ⁺²
No stress, so it's a three step process for Windows:
1. Download the ffmpeg source files from here: ffmpeg.org/download.html
2. Unzip the folder where you want the installation to be
3. Update your Windows environment variable PATH to include the path to the bin folder for ffmpeg
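After the three steps above, it is worth confirming from the notebook that ffmpeg is actually visible on PATH, since a missing ffmpeg is one likely cause of the "returns 1" reports in this thread. A minimal sketch using the standard library; the helper name `find_tool` is hypothetical.

```python
import shutil


def find_tool(name="ffmpeg"):
    """Return the full path to an executable on PATH, or None if missing."""
    return shutil.which(name)
```

If `find_tool()` returns None, the notebook kernel cannot see ffmpeg; restarting Jupyter after editing PATH is usually needed for the change to take effect. The `brew` command itself is a macOS/Linux package manager, so `!brew install ffmpeg` is not expected to work in a Windows command prompt.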
@youngboys7342 2 years ago
Thank you sir
@vijayasekaran3144 5 months ago
How can I do it for Instagram videos?
@heartheart5543 3 years ago
Can you do it with R?
@alcidesneves2807 2 years ago
What is it for? What about searching for an exact word and finding a video with that word on YouTube?
@kvafsu225 2 years ago
Fascinating
@sneh5496 6 months ago
youtube-dl didn't work for me
pip install yt-dlp did
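For anyone switching to yt-dlp as suggested here, the download step can be driven from Python in the same way the tutorial drives ffmpeg. A minimal sketch, assuming yt-dlp is installed and on PATH; `-o` is yt-dlp's output template option, while the helper names and the default template are hypothetical.

```python
import subprocess


def build_download_command(url, output_template="video.%(ext)s"):
    """Build the yt-dlp command line as a list of arguments."""
    return ["yt-dlp", "-o", output_template, url]


def download(url, output_template="video.%(ext)s"):
    """Run yt-dlp and return its exit code (0 means success)."""
    return subprocess.call(build_download_command(url, output_template))
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.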
@ahmetozel5112 1 month ago ⁺²
I was really disappointed. You translate the audio, not the video. I thought you did it by processing both audio and video data. Very disappointing.
@achendvankar 4 years ago
I am unable to extract audio using this code:
command = 'ffmpeg -i aiml.mkv -ab 160k -ar 44100 -vn audio.wav'
subprocess.call(command, shell=True)
I get no audio output.
@achendvankar 4 years ago ⁺¹
Any help would be greatly appreciated. Thank you :)
@NicholasRenotte 4 years ago
Definitely @A C, what's the name of your audio file?
@achendvankar 4 years ago
The name of the audio file was audio.wav
@NicholasRenotte 4 years ago
@@achendvankar Your input file was named audio.wav?
@achendvankar 4 years ago ⁺¹
@@NicholasRenotte Yes, it was. Hope I am not doing anything wrong here and unnecessarily troubling you :)
@lavanyakasu8852 3 years ago ⁺¹
Heyy... please tell me how to deploy this model.
@NicholasRenotte 3 years ago
Heya @Lavanya, the tutorial uses a SaaS service via an API, there's no need to deploy it!
@rahulkmail 3 years ago ⁺¹
Excellent
@NicholasRenotte 3 years ago
🙏 thanks so much @Rahul!
@khubir.4483 3 years ago
command = 'ffmpeg -i Spirit.mkv -ab 160k -ar 44100 -vn audio.wav'
subprocess.call(command, shell=True) is giving output 1.
And brew is also not recognised as a command:
!brew install ffmpeg
This is on Windows 10.
Can anyone help me?
@NicholasRenotte 3 years ago
Install brew: brew.sh/
@RunyCalmera 1 year ago
Awesome❤
@銘鋒 3 years ago
Hi, can I know how to install ffmpeg on Windows for use in Jupyter Notebook? Your reply would be appreciated. Thank you!
@NicholasRenotte 3 years ago ⁺¹
www.wikihow.com/Install-FFmpeg-on-Windows
@rohandevaki4349 1 year ago
Does this still work?
@lynnwillis4332 3 years ago
Anyone getting this error:
ApiException Traceback (most recent call last)
in
1 with open('audio.wav', 'rb') as f:
----> 2 res = stt.recognize(audio=f, content_type='audio/wav', model='en-AU_NarrowbandModel', continuous=True).get_result()
~\anaconda3\lib\site-packages\ibm_watson\speech_to_text_v1.py in recognize(self, audio, content_type, model, language_customization_id, acoustic_customization_id, base_model_version, customization_weight, inactivity_timeout, keywords, keywords_threshold, max_alternatives, word_alternatives_threshold, word_confidence, timestamps, profanity_filter, smart_formatting, speaker_labels, customization_id, grammar_name, redaction, audio_metrics, end_of_phrase_silence_time, split_transcript_at_phrase_end, speech_detector_sensitivity, background_audio_suppression, low_latency, **kwargs)
564 data=data)
565
--> 566 response = self.send(request)
567 return response
568
~\anaconda3\lib\site-packages\ibm_cloud_sdk_core\base_service.py in send(self, request, **kwargs)
306 status_code=response.status_code)
307
--> 308 raise ApiException(response.status_code, http_response=response)
309 except requests.exceptions.SSLError:
310 logging.exception(self.ERROR_MSG_DISABLE_SSL)
ApiException: Error:
Internal Server Error
Internal Server Error - Write
The server encountered an internal error or misconfiguration and was unable to
complete your request.
Reference #4.470b3017.1626566664.3077eed1
, Code: 503
@lhlee1580 4 years ago
Can the code be used for mp4 videos?
@NicholasRenotte 4 years ago ⁺¹
Sure can! Just change the command line:
# From this
command = 'ffmpeg -i aiml.mkv -ab 160k -ar 44100 -vn audio.wav'
# To This
command = 'ffmpeg -i yourfilename.mp4 -ab 160k -ar 44100 -vn audio.wav'
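Since only the input file name changes between formats, the command can be generated from the file name. A minimal sketch that reuses the tutorial's ffmpeg options and fails early with a clear message when the input file is missing, which is one plausible cause of the silent "returns 1" cases reported in this thread. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path, audio_path="audio.wav"):
    """Return the tutorial's ffmpeg extraction command as an argument list."""
    return [
        "ffmpeg", "-i", str(video_path),
        "-ab", "160k", "-ar", "44100", "-vn",
        str(audio_path),
    ]


def extract_audio(video_path, audio_path="audio.wav"):
    """Extract audio with ffmpeg; returns ffmpeg's exit code."""
    if not Path(video_path).is_file():
        raise FileNotFoundError(f"Input video not found: {video_path}")
    return subprocess.call(build_extract_command(video_path, audio_path))
```

Using an argument list instead of a single string also handles file names that contain spaces, which a plain `shell=True` string would split apart.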
@ggcharlie8511 3 years ago ⁺¹
The real question is, can it translate Playboi Carti
@NicholasRenotte 3 years ago
😂😂😂 next project man, mumble rap decoder!
@johnclarkson6120 3 years ago ⁺¹
god...so great
@NicholasRenotte 3 years ago
Thanks sooo much 🙏
@loverizer8414 2 years ago
I think this is very complicated.
A layman like me who doesn't know anything about Python can't understand this.
Try to make it simpler; I had to sign out after about 2 minutes because I wasn't following.
@madhurir9646 2 years ago
I am getting this error:
ApiException: Error:
Internal Server Error
Internal Server Error - Write
The server encountered an internal error or misconfiguration and was unable to
complete your request.
Reference #4.733a2f17.1655198247.4e83735e
, Code: 503
@praneethsai8589 3 years ago
I can't download videos:
C:\Users\user>youtube-dl th-cam.com/video/FM6kHcXpw98/w-d-xo.html
'youtube-dl' is not recognized as an internal or external command,
operable program or batch file.
@ibrahelsheikh 2 years ago
Where is the code?
@ЛеонидБорисов-ч5х 8 months ago
The colors of the Python snakes are like the flag of my country.
@pramodsurya.m 2 years ago
I'm stuck at step 4. Can you please help me with this?
After executing these lines that were provided:
--------------------------------------------------------------------------
with open('audio.wav', 'rb') as f:
res = stt.recognize(audio=f, content_type='audio/wav', model='en-GB_BroadbandModel', continuous=True).get_result()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Input In [7], in ()
1 with open('audio.wav', 'rb') as f:
----> 2 res = stt.recognize(audio=f, content_type='audio/wav', model='en-GB_BroadbandModel', continuous=True).get_result()
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\ibm_watson\speech_to_text_v1.py:587, in SpeechToTextV1.recognize(self, audio, content_type, model, language_customization_id, acoustic_customization_id, base_model_version, customization_weight, inactivity_timeout, keywords, keywords_threshold, max_alternatives, word_alternatives_threshold, word_confidence, timestamps, profanity_filter, smart_formatting, speaker_labels, customization_id, grammar_name, redaction, audio_metrics, end_of_phrase_silence_time, split_transcript_at_phrase_end, speech_detector_sensitivity, background_audio_suppression, low_latency, **kwargs)
580 url = '/v1/recognize'
581 request = self.prepare_request(method='POST',
582 url=url,
583 headers=headers,
584 params=params,
585 data=data)
--> 587 response = self.send(request, **kwargs)
588 return response
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\ibm_cloud_sdk_core\base_service.py:306, in BaseService.send(self, request, **kwargs)
304 logger.warning('"%s" has been removed from the request', key)
305 try:
--> 306 response = self.http_client.request(**request,
307 cookies=self.jar,
308 **kwargs)
310 if 200
@muhammadnoval8787 3 years ago
No Arabic language?
@NicholasRenotte 3 years ago
I believe there is one for Arabic :)
@ashilshah3376 1 year ago
4:27
@ashilshah3376 1 year ago
th-cam.com/video/FM6kHcXpw98/w-d-xo.html
@barigerajesh1 2 years ago
Hi Nicholas, great job.
When the following two instructions are run:
command = 'ffmpeg -i Spirit.mkv -ab 160k -ar 44100 -vn audio.wav'
subprocess.call(command, shell=True)
the output is 1 and the audio file is not created.
I'm working on a Windows workstation.
@Nicholas Renotte, I'd really appreciate it if you could help me with this.
@armeniansnoocersnoocer 2 months ago
shame video ! dislike
