Whisper is amazing. Thanks for adding quality CCs.
Your RISC-V video on your main channel has mis-timed captions, noticed it before seeing this video which is kinda funny to me
You are one of few I found on YT that make it look so easy and understandable. Have been thinking of making my own content but want to find some good source for where to start, to figure out what I need of hardware and way of doing things. I guess that it is more important to have an idea of where I want to go than the hardware itself, but could it really be done with a simple thing like a mobile phone?
Yeah, content first, then hardware! I had been writing my blog for over 15 years before I started really investing in YouTube content. For me, 90% of the video is testing, building, then writing up either a detailed outline or a full script... then 10% is recording and editing (though sometimes editing takes more time than I'd like!). But I've done some videos almost entirely on my iPhone (13 Pro), the main thing is to get some decent lighting, whether natural light bounced off a wall, or with a ring light, or a big softbox that you can find on Amazon. For audio, try to get a microphone positioned close to you, or get a little lavaliere mic, even cheaper ones close-ish sound better than expensive ones far away!
@@Level2Jeff maybe I'll do a test and report my results- I'm wondering how it will handle game sound, other streamers on Discord, blerps, etc. I have played with Whisper before and got good results but only tried with test files.
It's because Google, like so many large companies, has figured out that once a company becomes so big, they don't have to try anymore. People will pay for their crappy stuff because they don't know where else to go, or because they themselves have too little life experience to realize that they're paying for crappy stuff.
I'm glad that Whisper exists, but it does need a strong system to get good/fast results. For example I use the cpu fork of the command line version to convert noisy German train driver videos to English. On an N100 with 16GB RAM I can only use the medium model, and it takes longer than realtime. And I would say YouTube's translation has the edge. It's still a good tool to have though. It would be interesting to see how much better the large model would do. Thanks for the video!
Jeff, you're one of the good ones. Those other big "creators" don't give a sh*t about their audience-- only the $$$ their eyeballs represent. Thanks for what you do-- including careful placement of midroll ads which almost no one bothers to do because it's 2 minutes more work than just clicking upload.
Yeah, I usually leave out midrolls for the first 2-4 days, so my normal subscribers don't have to see any. Then I add them in and check the breakpoints, and delete most of the placements. I try to limit it to 1-2 for a video, even a much longer one.
@@Level2Jeff I don’t mind mid rolls… gotta pay the bills. But PLACE them at logical breaks in the content… the biggest YouTubers just slap them every couple of minutes often mid-sentence or even mid-word. #lazy
I see "4K" where you say I should see "CC". I almost always watch via an Nvidia Shield, so that might explain the difference. Thanks for taking the time to offer the better experience.
Hey Jeff, if you are inclined, try doing Spanish, Italian, French and Portuguese. A lot more people in this side of the hemisphere will be able to enjoy your content and you get more views
My native language is Portuguese, but most of the channels I watch on YouTube are in English and CC helps a lot, since I have some difficulty understanding the spoken language.
I almost always use captions (and I have no hearing issues).. Why? Because it tends to be noisy where I live, and being a visual person as well, since i'm already staring at the screen, it helps me follow the pace better not JUST hearing it. I wish more would do this.
I just use DaVinci Resolve's generated captions, if that is your choice of NLE. They are way better in my experience than MacWhisper. I do multilingual YouTube content, so one thing I hope MacWhisper could do better is that the translate feature would translate in context and not one line (half a sentence) at a time. Currently I use the DeepL API with Raycast for translating. Their web app has a character limit that's not enough for one YT video.
For anyone wondering like me, NLE refers to "non linear editor". An older term referring to the order in which video content is edited. Since we've been editing digital video content non linear for decades (I mean, who uses film reel, hah), it has become the de facto standard of video editing. Thanks for the heads up, I'll be taking a look at Davinci Resolve for my next CC creation! 👍
hey jeff, thanks for the caption support! though, you should unpublish youtube's inferior auto-generated track (which is still on for this video), since, if a viewer is coming from a video that only had auto generated ones on, it will default to that version instead of the ones that the creator uploaded themselves
Whisper is fantastic. I use it on Linux with Kdenlive. However, like most caption generators, I need to tweak the results. Rarely, I have to fix an actual mistake, but mostly I need to clean up the timings, because Whisper doesn't always break the text at logical places.
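For anyone who wants to script part of that timing cleanup, here is a minimal sketch. It only trims cues that run into the next one and prints SRT-style timestamps; it does not try to re-break the text at logical places. The function names and the 50 ms gap are illustrative assumptions, not anything Whisper or Kdenlive provides.

```python
def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def trim_overlaps(cues, gap=0.05):
    """Shorten any cue that overlaps the next one.

    cues: list of (start, end, text) tuples in seconds, sorted by start.
    gap:  hypothetical minimum pause (seconds) to leave before the next cue.
    """
    fixed = []
    for i, (start, end, text) in enumerate(cues):
        if i + 1 < len(cues):
            next_start = cues[i + 1][0]
            end = min(end, next_start - gap)
        # Never let a cue end before it starts.
        fixed.append((start, max(end, start), text))
    return fixed


cues = [(0.0, 2.5, "Whisper is fantastic."), (2.0, 4.0, "I use it on Linux.")]
for start, end, text in trim_overlaps(cues):
    print(f"{srt_time(start)} --> {srt_time(end)}  {text}")
```

The first cue originally ran until 2.5 s, past the start of the second cue at 2.0 s; after trimming it ends just before 2.0 s.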
It's not just about the hearing impaired people. If the captions are good, YouTube makes a pretty good job translating it at least to pt-br. I assume it is as good for other languages. I do speak English and one of the main reasons why I follow English-speaking creators is to practice my English-listening skills (no captions for me). But I've seen so much great content in English that I cannot share with friends because they do not speak English. With this, the video is now accessible to them also with good captions and translation.
As a non native English speaker I want to thank you for the effort. Automatic translations are complete garbage and even the automatic subtitles from youtube are really bad. Usually I understand most of what you say but sometimes there is a word that I don't understand or when someone is recording in a noisy place I can't understand what they say so I'm really thankful for the subtitles.
let me look into the future ... ah yes ... a button on YouTube ... 'Click to hear this video in your preferred language' (based on voice to text - text translation - text to voice). It'll be so simple even Red Shirt Jeff will be able to use it :-)
Kdenlive has a Whisper integration and is fully open source and free, and runs on Linux and Windows and I think on Mac, but I do not know if there is a Mac build.
I do wonder how much of the quality is determined by whether you're reading from a written script on a prompter or if you're speaking off the cuff. I've noticed that even good / accurate transcriptions of adhoc messages I send to my wife need a fair amount of editing because I'm not "editing" it before speaking, so I end up going down all kinds of conversational dead ends before settling on what I meant to say. If other creators take a more conversational approach, it might take a chunk of time to edit the captions into something coherent without changing the intent too much. I mean, don't get me wrong, it's absolutely worth doing it just the same. Even if it's just for accessibility although I suspect it doesn't hurt discoverability, either.
I've used it for both, and the accuracy is still 98-99% for conversational content, interviews, etc., even when I'm talking to someone over Zoom and it's potato quality audio. The accuracy may go down with non-native speakers, or in some other circumstances, but I've been quite impressed.
Well it was just one example; I just checked MKBHD, JayzTwoCents, GamersNexus, and JerryRigEverything, and only 2/4 of them have CC on their videos. It seems like it's not very standard yet, but IMO should be, especially at a scale where you have a video production pipeline.
Pretty sure they were using a paid company who did subtitles back when yt allowed community added subtitles, so I’d say it’s a fair point. Excluding a huge part of your audience due to something as simple as subs isn’t smart tbh
Undecided with Matt Ferrell uses the multiple audio track option to upload tracks with his own voice in multiple languages that he auto generates. Like German, Spanish, Portuguese, Hindi, Arabic and Chinese. It sounds super creepy😂. It's good and sounds very convincing but hearing his voice in your native tongue is weird.
Is there a delay on YouTube assigning the "CC" logo? I watched your video last night and followed the steps and there is no "CC" icon when I look at my latest video in the list.
@@Level2Jeff Well at least it is crazy easy -- I've used the free version three times so far since watching your video yesterday -- on woodworking type videos -- and it's far more accurate than the auto-generated subtitles. And only about a million times faster than trying to spellcheck the auto-generated stuff.👍
Likely the reason many content creators don't use this is because using such LLM driven apps comes with questionable ethics as to the dataset used to train the model.
Not dismissing the importance of captions, but the number of viewers who have captions on is skewed by the fact that captions are on by default on smartphones and tablets. And if you don't turn them off on every video then they'll stay on, even if you never meant to look at them. For me it's more annoying because they are distracting and I don't need them most of the time, but there's no setting to change the default. Just saying.
By any chance, do you have captions enabled system-wide? Last time I had this problem, I learned it was because I had it on in either Android Accessibility or iOS. You can choose to turn it off and captions shouldn't start automatically.
@@_jerieljan If you mean in the accessibilities subtitle settings, then no. If there are some more settings that I don't know about, then maybe, but I didn't turn anything on on purpose. It might be on for non-english devices maybe?
If you look on the channel pages (not on the Videos tab), it shows CC, not sure why Google only has it appear there, and in search. Otherwise you have to go into a video and then check the captions in the tools.
YT keeps turning them off after every video finishes and I have to turn them back on every time. So annoying! I don't know how you got this behaviour to happen.
Odd, it doesn't do that for me, but maybe it's something in app settings? I have seen captions on sometimes when I go to my home feed but they don't stay on if I turn them off after tapping on a video.
I'm so tired of constantly turning the captions off. It's a disease. There is no way to permanently turn them off, because YouTube will automatically turn them on when English is not the system language on Android devices.
I've never understood why some people that don't have a hearing issue, aren't in public and aren't in a loud environment, yet still turn on subtitles. I was even a sub only anime watcher for over a decade, but if what I'm watching is already in English, subs are extremely distracting.
Most YouTubers don't even have titles that have anything to do with the actual videos let alone useful descriptions. Closed captions? Nah, let's capitalize a random word in the title and tell people to like and subscribe 5 times.
YES! I speak English, don't have any hearing impairment, and still I watch with captions sometimes, at times when it's impractical to turn on audio. I can watch with the sound turned down on the bus when I'm going somewhere, without having to fool around with earphones. YES PLEASE!
Even while watching a video with audio up, I still use the subtitles... 😅 Just such a better experience
CCs are great for non native speakers. Reading usually comes easier than listening and can function as a backup. A great way to learn =]
Thank you! A lot of channels don’t add captions. It’s a shame that LTT (and others) doesn’t add captions anymore, because they used to…
I am hard of hearing (completely deaf in my left ear and ~ -37dB in my right ear) … and really appreciate you championing this cause. Ironically, I rarely use your captions because your sound quality is always so good. A common problem I encounter on YouTube videos is the background music being too loud _and_ the video having no CCs. So, I'd add encouraging creators to think about their sound design (which includes CCs!) as much as they do their visual design.
True; when I do add any kind of sound bed, it's at least -28 or -30 dB so even if you have headphones, it is way below the recorded speech (which has light compression to keep it higher even when I step away from the mic).
Now that you mention it, I hadn't thought of it before but the lack of loud background music definitely assists in the audio clarity.
You are SO right, why oh why does the "music" have to be at +60 dB and the narration at -3 dB
Greetings from -70 dB left, -90 dB right
I always find it funny when Android Live Caption is more accurate than YouTube auto-captions
lol "my tiny phone processor does a better job in real-time than Google's entire datacenter"!
@@Level2Jeff especially when it's a shrunk down version of what's in Google's data center...
@twochambz Depending on the device and the part of Google doing the transcriptions, the Android phone model is actually the more powerful version.
Google data centers have to process so many queries that the model running on their servers is (sometimes) weaker/smaller than what's on your phone where an AI-optimized SoC only has to process a single person's requests.
I'd think the YT auto captioning is set to a faster/less accurate version on purpose, with the 500+ hours of video being uploaded to YT every minute that's still an insane amount of processing needed. It could cost them 2, 3, 4 times as much processing to get from a 90% accuracy to a 99% accuracy (I'm pulling numbers out of my ass but you get the point), I presume they've done the cost/benefit analysis
What I often miss are timestamps. But that is of course a lot more work.
True; that can be helpful too and is a manual process as well. I do it for all my main channel videos, and as many of the other channel videos as well, because I love them as a viewer (so I think viewers should have them on my videos too!).
@@JeffGeerling Is it really that much time, can't you just add markers during editing?
@@hugevibez For me, my brain doesn't work that way, so I always do it after the entire edit is complete. Also, getting markers exported into timestamp format can be weird/annoying (at least in the editors I've used), so you still have to type them out in the format YouTube likes in the description.
@@JeffGeerling You want something similar to Whisper, but it looks at scene cuts in the video, or I guess it could just look at "topic cuts" in the transcription. In fact I bet that would be much easier. What about just dumping the transcription into an LLM-like service like ChatGPT and asking it to generate some chapters?
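On the "type them out in the format" step: a minimal sketch of turning editor markers into description-ready chapter lines, assuming the markers can be exported as (seconds, title) pairs. The function names and sample markers are made up for illustration, and the one format rule checked here (the first chapter starts at 0:00) is, as far as I know, what the description chapter format expects.

```python
def format_timestamp(seconds: int) -> str:
    """Render seconds as M:SS, or H:MM:SS once the video passes an hour."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def chapters_from_markers(markers):
    """Build chapter lines from (seconds, title) pairs.

    The pairs are a hypothetical export format; real editors differ.
    """
    ordered = sorted(markers)
    if not ordered or ordered[0][0] != 0:
        raise ValueError("the first chapter must start at 0 seconds")
    return "\n".join(f"{format_timestamp(s)} {title}" for s, title in ordered)


markers = [(0, "Intro"), (95, "Generating captions"), (3725, "Wrap-up")]
print(chapters_from_markers(markers))
```

The printed lines can be pasted straight into the video description; an LLM-generated chapter list could be fed through the same function once it is reduced to (seconds, title) pairs.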
Thank you for understanding the importance of closed captions.
Accessibility is so heckin important, and I especially appreciate that you brought attention to the fact that other Creators with considerably more people power and financial resources are not doing something they could be doing.
Keep up the awesome bar raising work.
I'm not a native English speaker but closed captions help understand content much better
Even I, who dream in English, have CC turned on; it's really good if you're unsure whether you heard correctly 😁
Yes! Oh gosh I wish more creators that I watch regularly did this. Grian and Gem from Hermitcraft have created closed captions already - as you say it is really easy to do!
I really appreciate you going above and beyond on accessibility, as I'm deaf with cochlear implants myself and rely on them a lot. As you said, it's bewildering how many creators don't even use the YouTube captions (much less make good ones), they're making it harder for themselves to build their channel by denying viewers.
One of the absolute biggest things I appreciate the most about your videos are actually the subtitles, and how accurate they are too. It goes such a long way, I'm not deaf or hard of hearing in any way, but not only does this make videos actually watchable for them, I still make great use of captioning because 1) English isn't my native language, and English is HARD. 2) I have trouble concentrating on speech very often and sentences completely enter one ear and exit out the other :) Following along with text really helps.
I like this type of content, plz do more of these to introduce us to more tools and features. Great work ! YAY !!
That's what this 2nd channel's all about! I want to share a lot of the little things people might never see in the day to day of making videos, doing electronics work, etc.
And another Thank you for caring about the CC of your material.
Thank you, Jeff! It is attention to detail like this that just adds more authenticity to your content. I appreciate the effort, and so does my hearing disability.
Thanks Jeff.
Really appreciate the effort you put in to add accurate captions. Being somewhat older now my hearing isn't as good as what it used to be so sometimes I'm unable to make out a specific word and accurate captions are really helpful when that happens.
Thank you so much for sharing this!!!!! I have friends with different disabilities and always try to keep their interests at the forefront but I've also never bothered to really take a close look at the generated captions vs. the ones I can quickly generate on my own. This is something I'm doing in all my videos from now on. Thank you again!
Absolutely! I use MacWhisper as well and have been using the medium model with good results. I also use the DeepL integration to translate the captions into other languages, which I choose based on testing and on which languages people select in YouTube's auto-caption translation, to give people a better experience.
I only use them because YT forces me to do so. A setting to have them deactivated at the start of playback would be awesome.
I use whisper to generate captions for old TV and movie content that my mother and I watch. It doesn't do well for content that contains mixed languages, nor for content that has a lot of music, but it still gives good-enough results for content that's less than ideal.
I rely on captions to watch videos on youtube. Thanks for putting in the effort to do them!
Oh hi!
I find it interesting how much those language models are good enough now to caption your more technical videos.
But at the same time it's also baffling how a lot of creators do not do that additional step of adding captions considering there's easier ways to do them now like what you demonstrated.
Though there's a few channels I do watch that do captions regularly like LGR, Technology Connections, 8 Bit Guy, SciShow, VWestlife, Veritasium, and a few more I can't recall right now.
Then there's the very rare ones that do them in more than one language like LowSpecGamer and GMM, both of which do English and Spanish. So far the best one with frequently a lot of languages on their videos I've seen is Fascinating Horror.
I appreciate captions myself because there's times when my hearing is degraded due to sinus congestion from allergies making my ears stuffy, or the host's voice not being that clear or having a heavy accent (e.g. Scottish, Indian, Aussie slang, etc.) that I'm not used to and take a while to understand.
First of all, thank you. Besides the obvious accessibility aspect, they also come in very handy when hovering over videos on the YouTube feed in a quiet environment :D
However that usage percentage might be skewed by YouTube. Being a non-native speaker, I constantly have to keep turning off CC (sometimes it's more relaxing watching a video when you don't see what has not yet been said), but no matter what, YouTube keeps switching them back on. Based on googling for a solution, this seems to not be an isolated issue...
There should be a creator tool that listens for imperial units and provides on-screen unit translation to some SI unit. Especially for temperature. The US is weird in this...
ha, true
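The caption-text half of that idea is easy to prototype. A minimal sketch, assuming the captions are plain text and only handling temperatures written like "185°F" or "212 degrees Fahrenheit"; the regex and the function name are illustrative, and a real tool would need far more patterns (feet, inches, pounds, spoken-out numbers).

```python
import re

# Matches e.g. "185°F", "-40 °F", "212 degrees Fahrenheit" (illustrative pattern only).
FAHRENHEIT = re.compile(r"(-?\d+(?:\.\d+)?)\s*(?:°\s*F\b|degrees Fahrenheit\b)")


def add_celsius(text: str) -> str:
    """Append the Celsius equivalent after each Fahrenheit value found in text."""
    def convert(match):
        celsius = (float(match.group(1)) - 32) * 5 / 9
        return f"{match.group(0)} ({celsius:.0f} °C)"
    return FAHRENHEIT.sub(convert, text)


print(add_celsius("The SoC hit 185°F under load."))
```

Keeping the original value and appending the SI one in parentheses means the caption still matches what was actually said.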
Thank you, I absolutely appreciate the captions! :)
I'm a native English speaker. While my hearing is good, I often lose a few words when listening to someone speak and need to rewind/read the respective subtitles to pick up on what was said.
Fantastic to see @Level2Jeff - closed captions (subtitles here in the UK) are a critical component to all video content ... but, you asked about things tech YouTubers should be doing to support accessibility, so, as a sight impaired content consumer, here's a couple of things for you, specifically from this video (but common to most tech YouTubers) ... First, use a high contrast mouse cursor; move the mouse a little more slowly (it's almost impossible to track where your mouse pointer is in this video for many sight impaired folk). Next, use some tooling that graphically indicates left and right clicks as you make them, or clearly note that you are clicking (and which button) in sync with when you click. Also, use tooling, or editing, to zoom in to the area of your screen that's of interest; very often a full screen is not helpful, particularly if you can't make the entire UI larger (such as in many editors or terminals where you can only increase the primary font size, but not the scaling of the entire UI). Make sure that the area of the screen or app you are interacting with, along with local context, is the primary thing that's visible by zooming in to that area. And not specific to this video, but where there are any cases where you're doing text or graphic overlays in the edit, be certain that they are of a high contrast relative to the video content they're overlaid on top of - there's no point in an overlay if it can't be consumed by everyone. 10 / 10 for the captions, now time to work on improvements for folk with sight impairments too!
Good points! I do think my screen recorder has overlays for keypresses and clicks, I'll have to see how that works.
As a non-native English speaker, this will let me watch more English content and understand it better.
Thanx for this suggestion Jeff. I use captions all the time, as I'm busy multitasking while working, so most times cannot have the sound on. The auto generated captions are terrible to put it mildly! 🤣
This is what AI was MEANT to do, not AI girlfriends.
"Actually useful AI" - image recognition, translation, etc. (of course they were called 'Machine Learning' before AI became the buzzword!).
Do you have translations planned? If so, what tools are you looking at?
@@macTijn Right now, no. Since I can't verify the accuracy, I don't want to incorporate any automated tool into that workflow. I'm okay limiting the audience a bit so I can ensure the accuracy of the subtitles.
What about AI girlfriends with Closed Captioning AND the added ability to read between the lines? I'd be into that.
@@JeffGeerling very responsible decision. AI translation is not accurate at all.
Thank you for this Jeff (not only subtitling your videos, but showing the process)
I posted my first video with killer subtitles tonight all because of your awesome tutorial! Thank you!!
Captions are useful for the hearing-able folks as well - so many thanks for making the effort! I assume you can also offer a transcript easily which could be great for some how-to videos
As part of the 40%...I've always appreciated the fact you have actual subtitles on all your videos! Thanks Jeff and keep up the great work.
Some of the things (besides a good Pi) that I find near perfect on your channels are:
1. No annoying blasting background music
2. Loud and clear talk
3. CC
A few annoying things on your channels:
1. When your passion overtakes you and your word rate exceeds normal speech by 100% 😁
2. eh . . .
Thanks for the tip, will implement when I finally manage to start making videos. Actually I already used this (or something very similar, can't remember) at work. Instead of potentially writing CC for hours, it took less than an hour altogether, even though I speak a language that translators usually suck at (Finnish), and I had to be really pedantic about fixing any mistakes and perfecting the timings, as it was for a client. Definitely the most useful AI tool I've used so far.
Whisper is amazing. Thanks for adding quality CCs.
Thanks Jeff, this is really great. I hope more tech youtubers start doing their own captions, because as you said, Google's just aren't good.
Amazing work Jeff! My wife is legally Deaf, and this kind of effort changes her world! I salute you!
Your RISC-V video on your main channel has mis-timed captions, noticed it before seeing this video which is kinda funny to me
You are one of the few I've found on YT who make it look so easy and understandable. I have been thinking of making my own content but want to find a good source for where to start, to figure out what hardware I need and how to go about things. I guess that it is more important to have an idea of where I want to go than the hardware itself, but could it really be done with something as simple as a mobile phone?
Yeah, content first, then hardware! I had been writing my blog for over 15 years before I started really investing in TH-cam content. For me, 90% of the video is testing, building, then writing up either a detailed outline or a full script... then 10% is recording and editing (though sometimes editing takes more time than I'd like!).
But I've done some videos almost entirely on my iPhone (13 Pro), the main thing is to get some decent lighting, whether natural light bounced off a wall, or with a ring light, or a big softbox that you can find on Amazon. For audio, try to get a microphone positioned close to you, or get a little lavaliere mic, even cheaper ones close-ish sound better than expensive ones far away!
A lot of us REALLY appreciate this!
As someone with a toddler currently screaming in my ear, I greatly appreciate CC.
My wife is a (small potatoes) streamer and I may help her set this up for her VODs
Do it! Would love to know how the accuracy is for that, too (especially if there's other background sound in it).
@@Level2Jeff maybe I'll do a test and report my results. I'm wondering how it will handle game sound, other streamers on Discord, blerps, etc. I have played with Whisper before and got good results but only tried with test files.
"IDK why Google's _____ is so bad" feels like you can say that about all their products. Everything's declining lol
It's because Google, like so many large companies, has figured out that once a company becomes so big, they don't have to try anymore. People will pay for their crappy stuff because they don't know where else to go, or because they themselves have too little life experience to realize that they're paying for crappy stuff.
Google used to be a great product, even their translate thing was great. Then they used AI for translation, and it became really bad.
I'm glad that Whisper exists, but it does need a strong system to get good/fast results. For example I use the cpu fork of the command line version to convert noisy German train driver videos to English. On an N100 with 16GB RAM I can only use the medium model, and it takes longer than realtime. And I would say TH-cam's translation has the edge. It's still a good tool to have though. It would be interesting to see how much better the large model would do. Thanks for the video!
Jeff, you're one of the good ones. Those other big "creators" don't give a sh*t about their audience-- only the $$$ their eyeballs represent. Thanks for what you do-- including careful placement of midroll ads which almost no one bothers to do because it's 2 minutes more work than just clicking upload.
Yeah, I usually leave out midrolls for the first 2-4 days, so my normal subscribers don't have to see any. Then I add them in and check the breakpoints, and delete most of the placements. I try to limit it to 1-2 for a video, even a much longer one.
@@Level2Jeff I don’t mind mid rolls… gotta pay the bills. But PLACE them at logical breaks in the content… the biggest TH-camrs just slap them every couple of minutes often mid-sentence or even mid-word. #lazy
I see "4K" where you say I should see "CC". I almost always watch via an Nvidia Shield, so that might explain the difference. Thanks for taking the time to offer the better experience.
Reviewing the cc before uploading would be a good job for an LLM.
Hey Jeff, if you are inclined, try doing Spanish, Italian, French and Portuguese. A lot more people on this side of the hemisphere will be able to enjoy your content and you get more views
My native language is Portuguese, but most of the channels I watch on TH-cam are in English and CC helps a lot, since I have some difficulty understanding the spoken language .
That's so cool. I'll be using it to make better captions and get quality step by step instructions using AI
When I clicked on this video I was curious what YT and Jeff have to do with Creative Commons. However closed-captions were also interesting.
I almost always use captions (and I have no hearing issues).. Why? Because it tends to be noisy where I live, and being a visual person as well, since i'm already staring at the screen, it helps me follow the pace better not JUST hearing it. I wish more would do this.
I just use Davinci Resolve Generated Captions, if that is your choice of NLE. They are way better in my experience than MacWhisper.
I do multilingual TH-cam content, so one thing I hope MacWhisper could do better is translating in context rather than one line (half a sentence) at a time. Currently I use the DeepL API with Raycast for translating. Their web app has a character limit that is not enough for one YT video.
For anyone wondering like me, NLE refers to "non linear editor". An older term referring to the order in which video content is edited. Since we've been editing digital video content non linear for decades (I mean, who uses film reel, hah), it has become the de facto standard of video editing. Thanks for the heads up, I'll be taking a look at Davinci Resolve for my next CC creation! 👍
@@Biru_to Thanks! Hope it helps!
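On the "translate in context" point above: one way to approach it is to merge caption lines back into whole sentences before handing them to a translator. The sketch below assumes cues are `(start, end, text)` tuples in seconds; `merge_into_sentences`, `translate_cues`, and the `translate` callable are hypothetical names, and `translate` stands in for whatever service you use (it is not the DeepL or MacWhisper API).

```python
def merge_into_sentences(cues):
    """Join consecutive caption cues until one ends a sentence.

    cues: list of (start_seconds, end_seconds, text) tuples.
    Returns a list of (start, end, sentence) tuples.
    """
    merged = []
    buffer = []
    buffer_start = None
    last_end = None
    for start, end, text in cues:
        if buffer_start is None:
            buffer_start = start
        buffer.append(text.strip())
        last_end = end
        # A cue ending in terminal punctuation closes the current sentence
        if text.rstrip().endswith((".", "?", "!")):
            merged.append((buffer_start, end, " ".join(buffer)))
            buffer, buffer_start = [], None
    if buffer:
        # Trailing text without terminal punctuation is kept as its own cue
        merged.append((buffer_start, last_end, " ".join(buffer)))
    return merged


def translate_cues(cues, translate):
    """Translate whole sentences; `translate` is any str -> str callable you supply."""
    return [(s, e, translate(t)) for s, e, t in merge_into_sentences(cues)]


cues = [(0.0, 2.0, "The model runs"), (2.0, 4.0, "on the phone."), (4.0, 5.0, "Nice!")]
print(merge_into_sentences(cues))
```

The trade-off is that merged cues are longer on screen, so you may want to re-split the translated sentences for readability afterwards.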
hey jeff, thanks for the caption support! though, you should unpublish youtube's inferior auto-generated track (which is still on for this video), since, if a viewer is coming from a video that only had auto generated ones on, it will default to that version instead of the ones that the creator uploaded themselves
Oh it does? Yikes! I'll have to see if I can automatically get that disabled. Annoying they don't default to the creator's subtitles...
I believe people also use Whisper for movies, tv shows, etc
Another use case is for people who are not fluent in English; it helps them understand
I use captions when I don't want to turn up the audio or late at night
Whisper is fantastic. I use it on Linux with Kdenlive. However, like most caption generators, I need to tweak the results. Rarely, I have to fix an actual mistake, but mostly I need to clean up the timings, because Whisper doesn't always break the text at logical places.
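As a rough idea of what "breaking the text at logical places" could look like if you have word-level timings, here is a minimal sketch. It assumes your transcription tool can export per-word start and end times; the `Word` class, `group_words`, and the 42-character limit are illustrative assumptions, not something Whisper or Kdenlive provides under these names.

```python
from dataclasses import dataclass


@dataclass
class Word:
    start: float  # seconds
    end: float    # seconds
    text: str


def group_words(words, max_chars=42):
    """Group timed words into cues that end at sentence punctuation or a length limit."""
    cues = []
    current = []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        # Start a new cue if adding this word would exceed the length limit
        if current and len(candidate) > max_chars:
            cues.append(current)
            current = []
        current.append(word)
        # Close the cue at the end of a sentence
        if word.text.endswith((".", "?", "!")):
            cues.append(current)
            current = []
    if current:
        cues.append(current)
    return [(c[0].start, c[-1].end, " ".join(w.text for w in c)) for c in cues]


words = [Word(0.0, 0.3, "It"), Word(0.3, 0.8, "works."),
         Word(1.0, 1.3, "Now"), Word(1.3, 1.6, "try"), Word(1.6, 1.9, "it.")]
print(group_words(words))
```

This only handles sentence ends and a length cap; breaking at commas or natural pauses would still need a manual pass.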
It's not just about hearing impaired people. If the captions are good, TH-cam does a pretty good job translating them, at least to pt-br. I assume it is as good for other languages. I do speak English, and one of the main reasons why I follow English-speaking creators is to practice my English listening skills (no captions for me). But I've seen so much great content in English that I cannot share with friends because they do not speak English. With this, the video is now accessible to them too, with good captions and translation.
As a non native English speaker I want to thank you for the effort. Automatic translations are complete garbage and even the automatic subtitles from youtube are really bad. Usually I understand most of what you say but sometimes there is a word that I don't understand or when someone is recording in a noisy place I can't understand what they say so I'm really thankful for the subtitles.
let me look into the future ... ah yes ... a button on TH-cam ... 'Click to hear this video in your preferred language' (based on voice to text - text translation - text to voice). It'll be so simple even Red Shirt Jeff will be able to use it :-)
Kdenlive has a Whisper integration and is fully open source and free. It runs on Linux and Windows, and I think on Mac, but I do not know if there is a Mac build.
Nvidia Canary is the state of the art right now. Try that out.
Wow, 40%? I would not have guessed. I went to a college that had deaf students so a lot of the classes were interpreted, inclusion is a great thing.
You are so awesome.
I do wonder how much of the quality is determined by whether you're reading from a written script on a prompter or if you're speaking off the cuff. I've noticed that even good / accurate transcriptions of ad hoc messages I send to my wife need a fair amount of editing because I'm not "editing" it before speaking, so I end up going down all kinds of conversational dead ends before settling on what I meant to say. If other creators take a more conversational approach, it might take a chunk of time to edit the captions into something coherent without changing the intent too much.
I mean, don't get me wrong, it's absolutely worth doing it just the same. Even if it's just for accessibility although I suspect it doesn't hurt discoverability, either.
I've used it for both, and the accuracy is still 98-99% for conversational content, interviews, etc., even when I'm talking to someone over Zoom and it's potato quality audio.
The accuracy may go down with non-native speakers, or in some other circumstances, but I've been quite impressed.
Google's closed captions can be quite amusing though. I get a giggle out of "Prague rock" 😁 (say it out loud if you're not getting it).
Taking LTT as a benchmark for Techtuber Creators is a pretty far stretch 😬
Well it was just one example; I just checked MKBHD, JayzTwoCents, GamersNexus, and JerryRigEverything, and only 2/4 of them have CC on their videos. It seems like it's not very standard yet, but IMO should be, especially at a scale where you have a video production pipeline.
Pretty sure they were using a paid company who did subtitles back when yt allowed community added subtitles, so I’d say it’s a fair point. Excluding a huge part of your audience due to something as simple as subs isn’t smart tbh
Jeff Buscemi has done it again
How do you do?
💯
I appreciate the effort you put into this for us hearing impaired folks!
And while you are there, can you make a video on your video production architecture
I think ThioJoe is the one who uses most of these features that nobody else seems to notice exist.
He's certainly had a lot of good videos on his setup, though it's a bit more than I've considered taking on for my process.
AI subtitles are built into Davinci Resolve...
Undecided with Will Ferrell uses the multiple audio track option to upload tracks with his own voice in multiple languages that he auto generates. Like German, Spanish, Portuguese, Hindi, Arabic and Chinese. It sounds super creepy😂. It's good and sounds very convincing but hearing his voice in your native tongue is weird.
Is there a delay on TH-cam assigning the "CC" logo? I watched your video last night and followed the steps and there is no "CC" icon when I look at my latest video in the list.
I think there are some channel pages where it shows (and in Search results), and some where it doesn't for some reason.
@@Level2Jeff Well at least it is crazy easy -- I've used the free version three times so far since watching your video yesterday -- on woodworking type videos -- and it's far more accurate than the auto-generated subtitles. And only about a million times faster than trying to spellcheck the auto-generated stuff.👍
Likely the reason many content creators don't use this is because using such LLM driven apps comes with questionable ethics as to the dataset used to train the model.
So if I used Whisper on one of my videos it will give me English CC, but will TH-cam convert that to other languages?
Anyone using any automatic voice to text for subtitles, please at least do a proofread.
Not dismissing the importance of captions, but the number of viewers who have captions on is skewed by the fact that captions are on by default on smartphones and tablets. And if you don't turn them off on every video then they'll stay on, even if you never meant to look at them. For me it's more annoying because they are distracting and I don't need them most of the time, but there's no setting to change the default. Just saying.
By any chance, do you have captions enabled system-wide? Last time I had this problem, I learned it was because I had it on in either Android Accessibility or iOS.
You can choose to turn it off and captions shouldn't start automatically.
@@_jerieljan If you mean in the accessibilities subtitle settings, then no. If there are some more settings that I don't know about, then maybe, but I didn't turn anything on on purpose. It might be on for non-english devices maybe?
I cannot get thru the checkout process at all no matter what I try. Any suggestions?
But, but, but if creators add the correct captions, then I won't get a laugh out of the auto-generated captions!
What Large model are you using? v1, v2 or v3? Curious if you've tried the different versions and have had better or worse results in newer versions.
Damn it, I thought it meant Creative Commons.
Heh, that would be a bit more commitment, but something I've thought about a lot.
Linus is gonna be pissed at you for calling him out. Tsk-tsk. 😂😂😂
I just looked through all of your channels, and I don't see the CC icon on any of the videos in the main view or overview.
If you look on the channel pages (not on the Videos tab), it shows CC, not sure why Google only has it appear there, and in search. Otherwise you have to go into a video and then check the captions in the tools.
Why is this on your second channel? Seems important enough for the main one.
I have switched off subtitles for about 10.000 times, but TH-cam keeps turning them on.
YT keeps turning them off after every video finishes and I have to turn them back on every time. So annoying! I don't know how you got this behaviour to happen.
Odd, it doesn't do that for me, but maybe it's something in app settings? I have seen captions on sometimes when I go to my home feed but they don't stay on if I turn them off after tapping on a video.
Having a big audience doesn't mean you have quality content...
How have I never seen the CC on your videos??
🤘👏👏👏
I'm so tired of constantly turning the captions off. It's a disease. There is no way to permanently turn them off, because TH-cam will automatically turn them on when English is not the system language on Android devices.
Lol no. I never have this problem and English is not my system language
then you're a lucky one I guess
I keep forgetting to add them even though I have a lot of non-English viewers
comment for the algorithm ❤️
Why is it not captioned in Klingon?
Asking the real questions!
Does it work on Windows devices?
I've never understood why some people who don't have a hearing issue, aren't in public, and aren't in a loud environment still turn on subtitles. I was even a sub-only anime watcher for over a decade, but if what I'm watching is already in English, subs are extremely distracting.
I can only give you one thumbs up. Unfortunately.
there I gave you one more :)
Most TH-camrs don't even have titles that have anything to do with the actual videos let alone useful descriptions. Closed captions? Nah, let's capitalize a random word in the title and tell people to like and subscribe 5 times.
Just wondering...what flavor is that Apple Kool-Aid? Does it taste like apples or a large amount of Benjamins? LOL