I am an Audio DSP engineer, amazed at how much information, is packed coherently into the video. Amazing overview for any engineer who wants to understand audio
Not all correct though. Reducing the sample rate doesn't decrease the signal to noise ratio relative to a band-limited signal because the Nyquist theorem works, bitches!
@@joshuascholar3220 Thx for the comment… though I confess to not having any idea what you are talking about… lol I guess that when you get down into the weeds, nothing is ever simple
@@MrArdytube it means that with perfect filtering, any sample rate over double the rate of the highest frequency will do. If the highest frequency in the signal is 20 khz, then sampling at 96 khz doesn't make the result any more accurate than sampling at 48 khz.
@@joshuascholar3220 Thx for the explanation. On a slightly different topic….can people actually differentiate high frequency tones? For instance, for me…. Higher than 10k all sounds pretty much the same
Mixing engineer here. One thing you should've mentioned is the loudness wars. Over-compression has taken over the music industry. In theory, songs that are louder will stand out more, and thus be more likely to get played. But there is a point at which the song is so over-compressed that it loses it's sonic quality. When this happens, it becomes flat, irritating, and lifeless. Ever notice how some videos are excessively loud at half volume? It creates an adverse effect because the consumer is already annoyed. The hardest part of mixing is finding the balance of compression and dynamic range while also keeping in mind the overall level of it. Truly talented engineers find it. But often times it goes unnoticed because of the current state we're in. Hoping to see it change one day.
tHe LoUdNeSs WaRs my dude the skill to understand what this even means died 20 years ago. There's no loudness war, we're lucky if anything dumped to Spotify cares at all, even slightly, about dynamics. You're an old man yelling at a cloud
I enrolled in an Audio Engineering program at the beginning of this month and the timing of this series is impeccable. Thank you SO much for all your hard work, this was super helpful!
"begging" here meaning if you liked dont forget to subscribe for more. Im guessing youre one of those guys that yells get a job at people with will work for food signs
@@realitynowassigned I'm missing the correlation here. Telling people to do what they're already doing ≠ remembering them an option they might not be familiar with.
@@realitynowassigned no, I do not yell at people down on their luck, but I also do not reward anyone with a subscription who only seem to care about that. Create quality content, and the number of people who want to see more will increase. P.S. I hope you get back on your feet soon.
It should be noted that a major implication of the Nyquist-Shannon sampling theorem is that a signal, band-limited to frequencies below the Nyquist limit, is *always* reproduced *perfectly* with PCM encoding. You’ll only introduce noise if you don’t band-limit your signal to 1/2 the sampling frequency before encoding it. The upshot is that you can simply set the sampling rate to double that of the highest audio frequencies that humans can perceive, remove sounds that is impossible to hear, and encode sounds that are physically impossible to distinguish from anything with greater sampling rates. This is why 44.1kHz was chosen for CD, with a bit of a buffer for equipment quality concerns, plus a bit of math determining the exact rate for technical reasons involving pre-establish standards in other areas, differing between countries, etc. You still have to worry about the analog parts of the process, like microphone quality and the accuracy of the ADC and DACs at either end. You might also just want a higher sampling rate to reduce information loss when editing and applying effects, or for any number of reasons that hopefully don’t include a misguided attempt to ignore absolute mathematical facts.
One minor correction, it's not perfectly reproduced, as you still have the quantization noise. If the samples were perfect then it would be a perfect reproduction but that's obviously impossible with PCM.
@StringerNews1 I'm not one of those analog morons who thinks a record player can hold a candle to the SNR of modern digital audio. What I'm saying is that he's claiming that some arbitrary band limited signal can be reproduced perfectly by PCM because of the Nyquist-Shannon sampling theorem, but that's entirely glossing over that the sampling theorem is dealing with continuous values for discrete samples whereas digital PCM is discrete values for discrete samples. PCM can exactly reproduce arbitrary band limited signals that are already quantized to the bit depth used but you inherently always have some quantization noise if you're using any digital sampling. That quantization noise for anything reasonable like 16 bit LPCM is going to be so low that human ears will never be able to discern the difference, but the original comment claimed that the only noise was if the signal wasn't band limited. There is no discrete value equivalent of the Nyquist-Shannon sampling theorem.
StringerNews1 I’m with Andrew. Since pcm is a digital method it cannot be separated from bit depth. Digital bit depth is the equivalent of analogue fidelity and dynamic range, which is a discussion way beyond Nyquist. I work with small signals in X-ray detection with dsp and sampling rate is only half the story at best.
Digital sampling is rather straight forward. Nyquist tells us the frequency-range that can be reproduced, the bit-depth gives the dynamic range. In total for CD that means we can reproduce any sound perfectly that falls into the 0-22kHz frequency band and is has 96dB dynamic range. All higher frequencies present in the input-signal leads to artefacts due to aliasing and all sounds lower than -96 dB are lost during quantisation and fall into the noise-floor. As we can not hear arbitrary quiet sounds and 0 dB was fixed at the lower threshold of human hearing the signal is, at least for human consumption, equivalent to perfect - unless you want to argue that it should include volumes which cause permanent degradation to your ears.. (And no - transients are nothing special and fall under the exact same rule.) Normally for production audio is sampled at higher rates and higher bitdepth but that change compared to CD is *NOT* something any human can hear but still very necessary for production as it allows alterations to the sound without dropping below the quality that we can notice artefacts again. Sound is thus often captured with a 120dB dynamic range. That means a signal that is too faint to hear in a sound-proof studio can be boosted to whispering volume without introducing any audible noise, or boosting a quiet conversation to the volume of a passing truck. And lastly - you gotta be a special type of retard if you would claim that 96 dB 44kHz is not enough for listening as this can quit literally reproduce the sound of breathing while simultaneously being at a busy intersection.
Wondered when someone would put a decent video about this up... Half way thru I was thinking this was gonna be another yawner about PCM... But he delivered on the goodies
As someone who studied this subject at a basic level many years ago , and I am certainly no expert , I'm still amazed how complicated this topic can be . However what's more amazing is reading the comments below from people , who I guess have expertise in audio engineering , correcting each other. Someone must be wrong . So anyone trying to understand this subject should read a book on it and as always, pay little attention to TH-cam comments , even mine. Happy listening everyone !
Holy s... This channel is pure gold, this video was INCREDIBLE!! Very well done, complete, and, with a little background, very easy to follow and understand.
This is the best explanation I've ever seen! So many times I googled "what is sampling rate" and couldn't understand the explanations. Finally it's clear what is it all about, thank you so much!
@@Strobie1kagobo You've got a point here. I don't know about Andrew Rudolf, but most people these days listen to their music on poor devices. Not to mention they play their music everywhere there is noise around: on buses, in the streets etc. Or on some shitty tiny bluetooth speakers at home. Only a few dinosaur audiophiles are still in quest for the best audio quality... that's the irony of having better techniques nowadays. It's lost its value.
@@lordsharshabeel no, TH-cam compression has nothing to do with it. Listening using at least half decent speakers or headphones the distinction can be clearly heard. (I used half-decent headphones right now at work and I heard the differences). Phones have tiny speakers with absolutely horrible signal to noise ratio and poor reproduction of anything but midrange. In the case of bit depth the higher noise can't be heard because the speaker itself is already noisy. In the case of sampling frequency the difference is in the high frequencies which the tiny speaker can't reproduce accurately so the distinction is blurred. So the reason OP didn't hear the difference is because his listening device is piss poor at sound reproduction.
I could hear the difference on my iPad, maybe the fact your phone is outputting mono sound is the problem? I don’t know though, I barely understood the video
This video was really amazing. Altough I'm studying music & sound design and have learned all of this before, I gained a much better overview over the subject through this! Thank you for this amazing content!
This is a very good video! I do have just one minor nitpick - when you discuss the Nyquist limit at 2:45, you mention that the "highest possible frequency that can be captured" is equal to the Nyquist limit. while it is true that you can represent a sine wave at the Nyquist frequency, you cannot *reliably* capture it at that frequency, because the sampling accuracy is 100% dependent on the phase of the Nyquist tone. It is more accurate to say that the highest possible frequency is just below the Nyquist frequency, or in other words, is strictly less than the Nyquist frequency.
I imagine if you were to capture a frequency at the nyquist limit then, it would be the amplitude that is misrepresented, as it would be capturing at the exact same point on each cycle and if not exactly in phase would never 'see' the full amplitude, whereas lower frequencies can build up a better overall picture.
Great video, but my understanding was that CD formats use a spiral track, not concentric rings. It's a servo-based system, commonly using the signal reflection to 2 or 4 photodiodes to optimize positioning. You seek to a little in front of where you expect the track, wait for a track to come by, lock the servo on it, read the position of the stream, and then determine if you want to play, seek, or wait a little.
It's spiral yes. But your understanding on the rest is off in a couple spots. The tracking is indeed handled by photodiodes that are on the fringe of the sensor array, so that when pits start to show up on them because the spiral is making them drift, the servo can adjust the lens to center it back on the data diode. Focus is handled in a similar way; instead of causing the servo to move toward the center or edge of the disc, when the "fringe" photodiodes both see the same conditions, the beam is out of focus and it adjusts up and down (with a little bit of a wiggle to determine direction, story for another time lol). When the lens is reaching it's limit (the controller knows this by how much current is flowing thru the servo coils), the sled motor kicks the whole assembly over a small amount so the lens servo can continue it's job of keeping things in line. There's also no "waiting for anything to come around", unless the drive is being used for random access (like CDROM, or in the case of audio, doing something like seeking or changing tracks). While playing a CD, the servo follows the entire track end to end, in just about real time (ignoring processing time and buffering), continuously collecting the stream as it goes past the lens, just as record needles follow the groove. The only real difference on that level is with a record, there's a physical guide to keep the needle where it needs to be to play the music... With CD, this has to be done with a non contact method, and so the extra photodiodes to tell the lens where to be. If you watch a CD lens while a disc is playing (from the side of course lol), you will see it compensate for every little ripple and warp in the disc, just at a higher speed, than it's old school mechanical counterpart.
Also the data on a CD is not "stored in pits an lands", as he said it, but instead it's stored in the change between pits and lands. It's not "land = 0, pit = 1", it's when the laser notices a change from pit to land and vice-versa, that it will read 1. As long as it stays in the pit or land, it reads 0. This may be come off as nitpicky, but I think it's a quite interesting little fact
@@MadScientist267 there's one other significant difference from the analogy with a record player - the laser reads one track for one stereo channel, then the next track for the other stereo channel, and then moves on, alternating left and right as it goes, which necessitates one channel always being heard slightly later than the time it was actually read off the disc, by a difference of the time it takes for the laser to read one revolution of the disc. Or so I believe.
This is such an interesting video. It kind of requires you to have a decent grasp of basic but varied technical terms (“but what does 5Hz sound like??”; “what is a bit??”) but if you do it is a strangely efficient, kind of fun experience. Kind of like being in a somewhat advanced class in a subject you like.
I bought my first audio CD in high school in 1988. Thirty Three years later I finally have a clear, concise understanding how the sampling technology impacts the sound quality with examples that demonstrate the point. Thank You!
As I listen to this being streamed over a mobile connection to my bluetooth headset i get the feeling that somewhere along this pathway I'm not getting 48KHz audio.
@Pho Tato Not as a rule and not when TH-cam is set at the default of Normal quality. The streaming m4a (AAC) bitrate will be anywhere from 96 to 128 kbs or so with a minimum of a 44 kHz sampling rate. Upper frequency cutoff is well above 15 kHz and in fact at 128kbs, rolls off smoothly into the upper limit of the format's capabilities. support.google.com/youtubemusic/thread/338369?msgid=348540
None of us are getting the 48KHz audio, not that it matters anyway. The highest quality sound stream available on this video is webm opus @ 160 kbps, 48000 Hz. If your browser can't play back the opus audio, then you're getting mp4a @ 128kbps, 44100 Hz.
"Temporal masking". I knew this phenomenon had a name! I have this issue when watching movies on my car stereo system on my boat. Now I know what keywords to use when researching a solution. Thank you! Excellent video, btw.
It was the noise floor. You can hear a very distinct “fuzz” (aka white noise) in the 8-bit one. Say you have a music track that ends with silence. If the noise floor is low enough, you won’t be able to tell when the track has stopped playing. With a high noise floor, the ‘silence’ will become *more* silent the instant the track ends.
Listening through the Speakers of the ROG Phone 2 I noticed compressed dynamic range, distorted highs and a higher noisefloor. Problem is there are phone speakers that have lower dynamic range as the compression at hand.
TH-cam implements additional compression so everything tend to sound the same. Very little dynamic range. 16 bit has more clear highs, not fizzy, if you notice hard enough.
I am listening to music on very high resolution equipment: like Stax headphones and decent home stereo too. It is amazing how well music is coded even in relatively small data mp3. I have to listen hard to hear difference between 128 and 320 kb/s or even lossless. What is far more important is: How well music was recorded and how good your listening equipment is. Well done digital audio!
Fantastic and informative content as usual. I'd love to see some videos on battery tech, particularly on what separates lithium ion chemistries from each other and what they are used for. Another video on machine learning, particularly applied to self driving would be of interest. Throw in some Tesla references, and watch your channel take off like a rocket. You deserve growth. More people should see your channel. :)
I am a electronic engineering student and an amateur self learning music producer and this has become the best and useful video that I've seen in TH-cam
Don't forget, to get the most out of those 'digital' cables you're going to need 'DIGITAL' speakers and/or headphones. I recently ran into a problem receiving HDTV signals because I was using an old fashioned 'analog' TV antenna! lol
1:00 the scale you put on screen is horizontal, while bits represents the amount of possible values in the vertical spectrum. 1:34 bit nitpicky, but you display an analog mixer. There is no ADC/DAC in that thing. 2:55 this is untrue. There isn't more noise, it just can't capture anything over the niquist so it's way less high fidelity because it just can't capture anything over that frequency. This ofcourse if true if the ADC does not have a filter in place, but all ADCs have a filter. 7:29 I wouldn't put an echo in the range with dynamic processors. If that would work, reverb would be on the same list. 8:52 this is a bit of a misrepresentation. EQ does not transform the signal from the "time domain" to the "frequency domain". It's just a different visual representation, and one that is also possible in the analog domain. The "frequency domain" does not have a time axis, sure, but that doesn't mean the signal now is in a non-time domain.
I just want to touch on how much I admire the beautiful thumbnail of this video. The orange screen contrasting with the white and grey really caught my attention and made me remember this video for later, its wonderful.
Sampling and quantization are not the same thing, they are related but distinct concepts. Sampling is discretization of time while quantization is discretization of signal amplitude. The difference is important, and sampling and quantization are carried out by different blocks in a signal processing circuit. Love your videos.
As a communications engineering major, it’s interesting watch all this being used practically here. I had only studied it so far in my engineering textbooks.
This is no longer the case. TH-cam has for quite some time been using a much better audio codec that will preserve up to 20khz no problem. A quick analysis of the audio in this video confirms this to be the case.
I made a noise album about 5 years ago. It was, of course, mostly noise and purposely degraded sounds. After it was all done and released I wondered what the lowest bitrate was that I could bring it down to. I don't remember what the program was called, but I found one that brought it insanely low. I ended up fitting about 45 minutes of inaudible mess into just under 1.2 MB. It fit on a 💾. Now there's a stack somewhere in my basement of these because I never gave them out. Thank you for this video. It really hit home with me.
@@MilesPrower1992 TH-cam disabled messaging years ago on here so hit me up at Game Interest ok Twitter or Instagram. asingledigit.bandcamp.com/album/plastic-trigger-finger-and-polyrhythm is the album free if you want to download it, but the Bandcamp version is the high quality version, not the ultra low bitrate version.
1:21 when you explain 16-bit sampling and 65536 levels, those levels represent VERTICAL values of sampled audio, not horizontal. It is a fundamental flaw in your lecture. Horizontal values / precision is defined by sampling frequency.
While this is true technically, the Nyquist Shannon sampling theorem proves that 44.1 k recording, which is CD audio, can represent, literally perfectly, up to 22 khz which is 2 kilohertz above what any human can hear.
I agree this could have been shown better if the bracket had been drawn vertically. I had no problem with it, though, maybe because I already knew this. I took it to show that 'the samples we're seeing along here are each recorded at being one of 65,536 different amplitudes' - with the horizontal bracket showing a range of different samples all at different heights. I think due to the commentary and the context, probably very few people got the wrong message.
@@pilotavery With due respect, I don't think you've understood or correctly interpreted Maciej's objection. He's pointing out that the 65,536 levels mentioned are vertical/amplitude levels, yet the bracket on the graphic is drawn horizontally. He has a good point and the presentation would be even clearer if the bracket was drawn vertically.
Also, the nyquist 'limit' is not a technical limit. Off course higher frequencies can be captured. The nyquist curve represents the limits of human hearing, thereby considering higher frequencies obsolete.
@@boydrijkvan6500 the Nyquist Shannon limit, is by definition, the limit. It says nothing about the bandwidth of human hearing. All it says is that you have to sample at twice a maximum frequency you want to capture. If you want to capture frequencies above the Nyquist limit, then it wouldn't be above the Nyquist limit because you would be increasing the sampling rate.
Well done. As I've tried to teach myself electronics over the last few years, many times I've thought to myself, I wish I could find an abstract overview of various subjects. You do a great job of creating such an overview. I wish you had been around a few years ago with this upload. Another subject that I've encountered and wanted an overview of is programming languages. It would be really helpful to have a reference that encompasses the history from punch cards to python. I've wondered how different languages waxed and waned within academic, commercial, and social contexts. What were they used for, how long, why, what were they replaced by, what better options were eclipsed by the momentum of second rate options, etc.? It's just an idea I've often wondered about. Thanks for the upload. -Jake
Been waiting eagerly for this video! Thank you! Love your channel btw and I was here when you had 19k subs, told you that you'd get to 100k very quickly :D nearly there
Such a strange presentation format.. no preamble ,just straight to the info/knowledge 👍 Sadly my ears have degenerated to the point where the difference between pure analog and digital is no longer obvious ( wear hearing protection kids 😁 ) but then again "listening to the stereo" kinda detracts from the pleasure of the actual music
@@ericscaillet2232 I'm fairly certain he is referring to hearing damage. Retraining just isn't always a possibility when your hearing range is shortened and possibly has frequency loss.
With a lossless digital format (CD, FLAC, WAV) there is no difference anyway. On a good lossy format (like OPUS/256kbps) you'll also struggle to hear a difference even with the best ear and gear. (Note that MP3, even at it's highest bitrate of 320kbps is *not* a good lossy format by modern standards! Newer formats such as AAC, OGG and especially OPUS make use of an improved understanding of auditory masking and improved compression algorithms, allowing them to retain a higher fidelity while using less storage space. The only reasons why people still use MP3 is because they don't know or don't care about the numerous better alternatives)
The fact that I can't perceive *ANY* difference when trying ABX tests of 192kbps AAC and lossless audio shows how far compression have improved! (128 mp3 is easily discernable)
@@DumbledoreMcCracken 28, my 24 years old friend that loves audio gear also can't discern between 192AAC and Lossless. I've come to the conclusion that high sound quality (given you don't have REALLY crap headphones) has 90% to do with mastering and mixing, 7% to do with decent audio gear and 3% to do with what kbps you use.
Also 16 bits of audio resolution is still exactly what most compact devices output audio as today like smartphones and tablets for example so even if you have a 32 bit audio file it will still output it as a 16 bit signal that is amplified as analog audio. :)
Techno Universal That’s true. Generally I wouldn’t even use such files on my phone as you wouldn’t be able to tell the difference anyway, even though my phone DAC does support higher bit rates. Someone once sent me a FLAC 24/192 (it was around 200 MB for 4 minutes), and he claimed it was the best thing ever this and that. At first I thought he was right, as it sounded like night and day compared to Spotify. But I wasn’t born yesterday, so I converted the FLAC to MP3. I don’t want to claim there is no difference, but I couldn’t tell any. I let him hear it, and he wasn’t having any of it and he thought I was faking it. I really think some people are delusional haha.
Marvin P. Yeah plus apparently the DACs in IPhones that had headphone jacks never supported anything more than 16 bit audio however it’s still likely that they can output up to 32 bit audio in a digital format through the lightning port or maybe also through Bluetooth! So you would be able to benefit from it if you had the phone connected to a large audio system via the lightning port that supported 32 bit audio however you wouldn’t benefit from it with headphone adapters as the adapters all still have 16 bit DACs built into them for cost saving purposes! :)
Techno Universal I’m pretty sure the iPhone Lightning connector and Bluetooth (using aptX HD/LDAC codec) both top out at 24 bit. At the end of the day, it doesn’t matter too much on a mobile device. Based on my experimentation, I came to the conclusion that the way the music is mastered is more important than these numbers. I don’t think people care too much about it either, as most just use Spotify on their Bluetooth headphones anyway.
Marvin P. Yup while Bluetooth headphones would probably only use 16bit audio anyways because of the limited bandwidth of a Bluetooth connection and to save on manufacturing costs! So yes primarily higher bit rates would be more important for mastering than anything else! :)
This video needs more views. This is a very succinct explanation of digital audio, and the underlying processes. Also useful for use against vinyl fanbois who say digital audio is crap.
Yeah... I’ve been listening to music for years now! I can barely tell the difference between vinyl and compact disc. Unless you have a really good stereo, most f it will sound the same. I will say that Dolby wrecks a piece!
If that's true, you're psychoacustically editing out the clicks, pops, and hiss of vinyl. They are there, your brain is just filtering out the upper frequencies.
Was this presented in a compressed form, like Meaningful Data compression layer 3? Because there was definitely an hours worth of useful information in this video.
Great video. I’ve never dipped into MP3 encoding before so you have helped me understand this better. Just one point. You show three clips of music recorded at increasing sample rates. Please note that viewers will not hear 48 kHz audio through TH-cam - all TH-cam plays at 44.1/16 causing your 48 kHz audio to be down sampled. Artefacts may be created going between 48 and 44.1 kHz as interpolation will be required by TH-cam‘a audio algorithms due to the mismatch in sample rates. For best results these clips should use rates that are powers of 2 of the base rate, such as 11.025 kHz , 22.05 kHz and 44.1 kHz, 88.2 kHz on TH-cam.
Downsampling 48 KHz audio to 44.1 doesn't cause artifacts. 44.1 and 48 KHz are both above the Nyquist frequency, so both capture the entire range of audible frequencies. A 20 KHz signal will not be distorted when samping the audio created by a 48 KHz sampling rate to 44.1 KHz.
7:52 While Fourier transform is used in audio processing, the shown method of frequency filtering is called linear phase equalizer and is only used in special cases for melodic sounds only. It's absolutely terrible with percussion, since it introduces noticeable pre-ringing which destroys vital transients. Moreover, it adds a large processing delay required for the transformation window. It's best to avoid ever using Fourier transform explicitly and instead derive a time domain operator which can do roughly the same job. In fact, this way it's much closer to analog filters.
There's nothing in his high-level description that suggests linear-phase filtering. Theoretically any filter can of course be represented in frequency domain, though practical FFT-based filtering will generally be FIR. But while every linear phase filter is FIR, the converse is not true. (Also, a common application for linear-phase filters is in steep low-pass filters (e.g. as part of resamplers) where they do not introduce ringing on top of whatever is intrinsically present in band-limited signals)
2:55 this is the first time that I've heard someone claim this, so a citation would be nice. By the way an increase in SNR is a good thing so probably the intention was to claim that the SNR decreases
Nissim Trifonov I agree with you, I think he got it backwards. Noise increases as the signal frequency gets closer to the Nyquist frequency, not the SNR. The SNR gets lower.
Try using different headphones. To me they sound totally different and I didn't even use my analytical headphones. If you are using bluetooth or earbuds, that's likely your bottleneck.
Oh my, oh my... this is the first video I see from this channel... 45 seconds and I'm already suscribed... 15min later it's like "oooooooohhh so that's what the professor meant on my DSP class 17 years ago"
For vast majority of people accuracy of determining between 128 and a higher bitrate is pretty much in a guessing range, when it comes to blind testing. And even those in "the know" would only have better chance of telling the difference by actively looking out for signature artifacts of a lower bitrate. So don't spread that cork-sniffing audiophile bullshit. 128kpbs is a perfectly serviceable bitrate for many applications.
@@Quicksilver_Cookie If you can't tell the difference between a 128kbps and a 320kbps mp3, then I'm sorry for you. I mean, sure, I can't tell the difference between 320 mp3 and say a flac, but 128kbps just doesn't cut it. Especially nowadays when even phones have tens if not hundreds of gigabytes of storage.
128 kbps is okay for casual listening. Spotify mobile only 96 kbps. I download most of the podcasts I listen to on my phone at 64kpbs when I have the choice. I don't know think I'd be enjoying them any more at 320 kbps.
Don't know how/ why YT algo suggested this, it is not related to any of my usual content consumption but the production quality and eloquent delivery of the information made it interesting and accessible to me (an illiterate ignoramus). Thumbs up 👍
I'm not sure that's fair. They both address different aspects. TC tends to also concentrate on the machine itself and the different types of encoding/decoding involved. They were really both excellent.
@@rationalmartian yep I understand most of what Tech Connections is talking about, this video has too much jargon I don't have enough background knowledge to comprehend
This was amazing! That was so much good information, and so well-delivered. The visual were extremely helpful, too. I teach an audio production class, and would love to use this video to illustrate some concepts.
I just literally recorded a piece of this video to audacity and a quick spectrogram shows that the audio cuts off at around 20Khz, maybe 19,8khz. so no, that is not correct. Also a google search comes up with 44.1Khz which seems accurate.
I can only discern a lossless from a 320kbps quality audio in very specific genres of music...like hi hats and crash cymbals from heavy metal / Rock (especially those albums that were recorded and mastered using analog tape) jazz and those cellos and percussion from classical music....but I have to be really in a quiet environment using a high end earphones...is the additional space worth it on the lossless? frankly speaking NO
because it is part of a system that was developed for encoding video, inwhich audio is apart of that system, the audio portion is used as standalone for mp3 because music is a thing
Good job, I wonder what the extremes of sensor equipment are. Like the microphone with the highest useful bitrate, or when it comes to video, highest framerate. Oh, what about the ratio between the number of pixels and the framerate for common cameras, and why that is? Okay, just did the math and using a unit I just made up (frames per second per megapixel) VHS Pixels: 159840 Megapixels: 0.15984 Frame Rate: 30 FPSPMP: 4.7952 Generic full hd camera Pixels: 2073600 Megapixels: 2.0736 Frame Rate: 60fps FPSPMP: 124.416 Red Weapon Pixels: 33177600 Megapixels: 33.1776 Frame Rate: 75fps FPSPMP: 2488.32 I know there's probably a much easier way to display that, most likely bit rate or something, but it was still fun for me to do the math.
not sure to understand, microphones are analog equipment, so you don't really have a bitrate associated to it (if I understand what you're saying) you have some limitations, like the bandwith, maximum level, and you could also have the dynamic range, but I guess they never give the dynamic range of a microphone because the preamp of the mic is what is going to limit the most the dynamic range (I guess ?)
I've found many microphones give frequency response curves. For instance the microphone I typically use shows that it responds fairly strongly to frequencies up to about 200 hz, is quite linear up to about 8khz, then drops off sharply, hitting near zero response at or above 12khz. However, the thing about microphones and audio equipment, quite aside from any considerations of analogue equipment... Is it depends on the purpose. There's no real point to creating headphones or speakers that can output much past 20khz if it's intended for humans. But if you're experimenting on mice, cats, dogs or bats or the like, you may suddenly have need of equipment that can handle 40,50,60 khz... Or maybe even 120 khz. There's other use cases too where much higher frequency responses are useful. And for instance an ultrasonic rangefinder might operate in a few tens to hundreds of kilohertz, but fundamentally it's still basically the same tech as audio equipment. In fact, within certain limitations it's possible to use the sound card on a computer to perform tasks that have nothing to do with conventional audio processing... Such as things you'd typically associate with an oscilloscope - since, after all, the inputs are actually electrical, not audio... By the way how did you derive your estimates for VHS? VHS is an analogue standard, and directly encodes TV signals, but there's a wide variation in what precisely it encodes. There's also considerable variation in quality between a PAL and NTSC recorder. Plus it records interlaced video directly. For an NTSC video signal you can expect it to record 262.5 lines per field at roughly 60 fields per second. Luma bandwidth is equivalent to somewhere around 240 to 256 pixels per line (again this is an analogue signal, so these aren't strictly speaking pixels), but chroma bandwidth is closer to 80 pixels per line. PAL would result in 312.5 lines per field at 50 fields per second. Luma bandwidth is comparable to NTSC, but from what I remember reading the chroma bandwidth is about 140 pixels... Except that due to how PAL works this gets averaged over two lines, so the effective chroma bandwidth is even lower, even though it's higher on an individual line. There's other factors; the oldest VHS standards had mono audio; later standards had stereo. HQ VHS has higher resolutions in general due to design changes in how things are recorded and better tape quality. (there is also a HD vhs format, believe it or not. But it's obscure and the tapes are digital; Quality is 1080i with similar quality to bluray) There's also implications caused by LP, SLP and equivalent settings for tape. So... If we assume a PAL recording with Hifi stereo and make some simplifications that aren't strictly speaking accurate for an analogue signal... And assume this works out to 1 byte per channel for the image data. (stored as luma + 2 chroma channels). Audio is broadly equivalent to 22khz 16 bit audio... We get something like Luma 256x625x25 (really 256x312.5x50) + chroma 625x140x25 (really (140/2)x312.5x50 for each of two chroma) channels = ~6,187,500 bytes per second for video. plus about 90,000 bytes for audio. This leads to something close to 6 megabytes a second equivalent on VHS. Again though this is analogue video, not digital. Furthermore it's completely uncompressed. And when you work it out this means an hour of VHS tape stores the equivalent of a good 21.6 gigabytes or so. Does that make sense? Well, there ARE weird archiving devices that store data on VHS - the method they use isn't overly efficient, and they need a lot of error correction to compensate for the analogue format... But it gets to about 2 gigabytes an hour. But as I said, there is an obscure HD format on VHS tape (tape is essentially identical - only the player differs) which stores 1080i compressed digital video. But still depends on the fact that when properly utilised for digital storage, a VHS tape has a capacity on the order of 50-100 gigabytes... So... You know. Older media can be surprising. XD This realisation comes about largely due to the fact that these old methods were essentially uncompressed video; It follows, unsurprisingly, that uncompressed video takes up a LOT of storage space. Hence VHS tape can store a LOT of data, in practice.
@@KuraIthys Honestly I don't know, what I calculated was most likely completely useless for almost any situation, I just like doing conversions. And I just got the basic pixel measurements off wikipedia, it wasn't for vhs in particular, I just looked at the low definition video quality section and vhs was the first one I recognized.
It's funny, I've repeatedly run into 'hearing test' videos on youtube where anything above 15khz is basically inaudible, and anything above 17khz is definitely inaudible. Now, you'd think it might be me, or my equipment... Except I have headphones rated for up to 28 khz, and I've tested this against pure tone generators and can still clearly hear things up past 18-19 khz depending on how loud it is. So... Even if 48khz audio is in use, my practical experience with most videos would suggest other limitations are in effect on youtube.
@@KuraIthys 48000 hz is a sample rate. You're talking about audio spectrum. But you can hear only up to 15000 a 17000 hz on TH-cam? That's not o.k. at all. That means, you're missing all high tones!? Good luck... I do not know what the problem is :-/
I am an Audio DSP engineer, amazed at how much information, is packed coherently into the video.
Amazing overview for any engineer who wants to understand audio
And… for us dilettantes…. It is truly remarkable to learn about the astounding wizardry that is embedded in even the casual sounds that we hear
Not all correct though. Reducing the sample rate doesn't decrease the signal to noise ratio relative to a band-limited signal because the Nyquist theorem works, bitches!
@@joshuascholar3220
Thx for the comment… though I confess to not having any idea what you are talking about… lol
I guess that when you get down into the weeds, nothing is ever simple
@@MrArdytube it means that with perfect filtering, any sample rate over double the rate of the highest frequency will do. If the highest frequency in the signal is 20 khz, then sampling at 96 khz doesn't make the result any more accurate than sampling at 48 khz.
@@joshuascholar3220
Thx for the explanation. On a slightly different topic….can people actually differentiate high frequency tones? For instance, for me…. Higher than 10k all sounds pretty much the same
Mixing engineer here. One thing you should've mentioned is the loudness wars. Over-compression has taken over the music industry. In theory, songs that are louder will stand out more, and thus be more likely to get played.
But there is a point at which the song is so over-compressed that it loses it's sonic quality. When this happens, it becomes flat, irritating, and lifeless. Ever notice how some videos are excessively loud at half volume? It creates an adverse effect because the consumer is already annoyed.
The hardest part of mixing is finding the balance of compression and dynamic range while also keeping in mind the overall level of it.
Truly talented engineers find it. But often times it goes unnoticed because of the current state we're in. Hoping to see it change one day.
This was mentioned in the previous video, specifically to do with advertising.
tHe LoUdNeSs WaRs my dude the skill to understand what this even means died 20 years ago. There's no loudness war, we're lucky if anything dumped to Spotify cares at all, even slightly, about dynamics. You're an old man yelling at a cloud
"This is what the youtube compression sounds like"
lmfao :p
Right
Fact
Fact
Fact
I enrolled in an Audio Engineering program at the beginning of this month and the timing of this series is impeccable. Thank you SO much for all your hard work, this was super helpful!
Your channel is insane... this is the type of content I can't get enough of, your videos are top-notch.
Well said
no they arent
th-cam.com/video/DuMciNIzDtM/w-d-xo.html
@@G-G._ why?
@@RodolfoGeriatra dont @ me mexican
Not only is the content informative, there is no begging for subscriptions at the end.. What a great channel! I subscribed! Keep up the great work!
"begging" here meaning if you liked dont forget to subscribe for more. Im guessing youre one of those guys that yells get a job at people with will work for food signs
@@realitynowassigned I'm missing the correlation here. Telling people to do what they're already doing ≠ remembering them an option they might not be familiar with.
@@realitynowassigned no, I do not yell at people down on their luck, but I also do not reward anyone with a subscription who only seem to care about that. Create quality content, and the number of people who want to see more will increase. P.S. I hope you get back on your feet soon.
It should be noted that a major implication of the Nyquist-Shannon sampling theorem is that a signal, band-limited to frequencies below the Nyquist limit, is *always* reproduced *perfectly* with PCM encoding. You’ll only introduce noise if you don’t band-limit your signal to 1/2 the sampling frequency before encoding it. The upshot is that you can simply set the sampling rate to double that of the highest audio frequencies that humans can perceive, remove sounds that is impossible to hear, and encode sounds that are physically impossible to distinguish from anything with greater sampling rates.
This is why 44.1kHz was chosen for CD, with a bit of a buffer for equipment quality concerns, plus a bit of math determining the exact rate for technical reasons involving pre-establish standards in other areas, differing between countries, etc.
You still have to worry about the analog parts of the process, like microphone quality and the accuracy of the ADC and DACs at either end. You might also just want a higher sampling rate to reduce information loss when editing and applying effects, or for any number of reasons that hopefully don’t include a misguided attempt to ignore absolute mathematical facts.
One minor correction, it's not perfectly reproduced, as you still have the quantization noise. If the samples were perfect then it would be a perfect reproduction but that's obviously impossible with PCM.
@StringerNews1 I'm not one of those analog morons who thinks a record player can hold a candle to the SNR of modern digital audio. What I'm saying is that he's claiming that some arbitrary band limited signal can be reproduced perfectly by PCM because of the Nyquist-Shannon sampling theorem, but that's entirely glossing over that the sampling theorem is dealing with continuous values for discrete samples whereas digital PCM is discrete values for discrete samples. PCM can exactly reproduce arbitrary band limited signals that are already quantized to the bit depth used but you inherently always have some quantization noise if you're using any digital sampling. That quantization noise for anything reasonable like 16 bit LPCM is going to be so low that human ears will never be able to discern the difference, but the original comment claimed that the only noise was if the signal wasn't band limited. There is no discrete value equivalent of the Nyquist-Shannon sampling theorem.
StringerNews1 I’m with Andrew. Since pcm is a digital method it cannot be separated from bit depth. Digital bit depth is the equivalent of analogue fidelity and dynamic range, which is a discussion way beyond Nyquist. I work with small signals in X-ray detection with dsp and sampling rate is only half the story at best.
Digital sampling is rather straight forward.
Nyquist tells us the frequency-range that can be reproduced, the bit-depth gives the dynamic range. In total for CD that means we can reproduce any sound perfectly that falls into the 0-22kHz frequency band and is has 96dB dynamic range. All higher frequencies present in the input-signal leads to artefacts due to aliasing and all sounds lower than -96 dB are lost during quantisation and fall into the noise-floor. As we can not hear arbitrary quiet sounds and 0 dB was fixed at the lower threshold of human hearing the signal is, at least for human consumption, equivalent to perfect - unless you want to argue that it should include volumes which cause permanent degradation to your ears.. (And no - transients are nothing special and fall under the exact same rule.)
Normally for production audio is sampled at higher rates and higher bitdepth but that change compared to CD is *NOT* something any human can hear but still very necessary for production as it allows alterations to the sound without dropping below the quality that we can notice artefacts again.
Sound is thus often captured with a 120dB dynamic range. That means a signal that is too faint to hear in a sound-proof studio can be boosted to whispering volume without introducing any audible noise, or boosting a quiet conversation to the volume of a passing truck.
And lastly - you gotta be a special type of retard if you would claim that 96 dB 44kHz is not enough for listening as this can quit literally reproduce the sound of breathing while simultaneously being at a busy intersection.
@@pieteruys2032 Any proof that humans can distinguish between transients and their band limited versions?
Wow! Dude, you just explained entire DSP(Digital Signal Processing) under 20mins that I took multiple attempts to clear the course in college 😅
Wondered when someone would put a decent video about this up... Half way thru I was thinking this was gonna be another yawner about PCM... But he delivered on the goodies
There's WAY MORE to it, champ. Way more.
Another awesome video. Thank you and I'm glad to see this channel grow like it has. Keep it up!
As someone who studied this subject at a basic level many years ago , and I am certainly no expert , I'm still amazed how complicated this topic can be . However what's more amazing is reading the comments below from people , who I guess have expertise in audio engineering , correcting each other. Someone must be wrong . So anyone trying to understand this subject should read a book on it and as always, pay little attention to TH-cam comments , even mine. Happy listening everyone !
Holy s... This channel is pure gold, this video was INCREDIBLE!! Very well done, complete, and, with a little background, very easy to follow and understand.
This is the best explanation I've ever seen!
So many times I googled "what is sampling rate" and couldn't understand the explanations.
Finally it's clear what is it all about, thank you so much!
the invent of digital audio is nothing short of brilliant. Man kind at its best.
Well I find this fascinating, though whilst listening to this video via a phone. I literally can't hear a difference.
@@Strobie1kagobo You've got a point here. I don't know about Andrew Rudolf, but most people these days listen to their music on poor devices. Not to mention they play their music everywhere there is noise around: on buses, in the streets etc. Or on some shitty tiny bluetooth speakers at home. Only a few dinosaur audiophiles are still in quest for the best audio quality... that's the irony of having better techniques nowadays. It's lost its value.
More likely the bit compression of TH-cam streaming killed any distinction.
@@lordsharshabeel no, TH-cam compression has nothing to do with it. Listening using at least half decent speakers or headphones the distinction can be clearly heard. (I used half-decent headphones right now at work and I heard the differences).
Phones have tiny speakers with absolutely horrible signal to noise ratio and poor reproduction of anything but midrange. In the case of bit depth the higher noise can't be heard because the speaker itself is already noisy. In the case of sampling frequency the difference is in the high frequencies which the tiny speaker can't reproduce accurately so the distinction is blurred.
So the reason OP didn't hear the difference is because his listening device is piss poor at sound reproduction.
@@Strobie1kagobo poetic
I could hear the difference on my iPad, maybe the fact your phone is outputting mono sound is the problem? I don’t know though, I barely understood the video
This video was really amazing. Altough I'm studying music & sound design and have learned all of this before, I gained a much better overview over the subject through this! Thank you for this amazing content!
Your use of visuals has been the best way I have come across to explain this part of digital/analogue audio!
This is a very good video! I do have just one minor nitpick - when you discuss the Nyquist limit at 2:45, you mention that the "highest possible frequency that can be captured" is equal to the Nyquist limit. while it is true that you can represent a sine wave at the Nyquist frequency, you cannot *reliably* capture it at that frequency, because the sampling accuracy is 100% dependent on the phase of the Nyquist tone. It is more accurate to say that the highest possible frequency is just below the Nyquist frequency, or in other words, is strictly less than the Nyquist frequency.
I imagine if you were to capture a frequency at the nyquist limit then, it would be the amplitude that is misrepresented, as it would be capturing at the exact same point on each cycle and if not exactly in phase would never 'see' the full amplitude, whereas lower frequencies can build up a better overall picture.
Exactly
Fantastic and easy to understand explanation. I wish all videos on TH-cam were as clear and concise as this. Congratulations.
Great video, but my understanding was that CD formats use a spiral track, not concentric rings. It's a servo-based system, commonly using the signal reflection to 2 or 4 photodiodes to optimize positioning. You seek to a little in front of where you expect the track, wait for a track to come by, lock the servo on it, read the position of the stream, and then determine if you want to play, seek, or wait a little.
Yep, you can find more in Technology Connections where they have several videos about digital audio and CDs.
It's spiral yes. But your understanding on the rest is off in a couple spots.
The tracking is indeed handled by photodiodes that are on the fringe of the sensor array, so that when pits start to show up on them because the spiral is making them drift, the servo can adjust the lens to center it back on the data diode. Focus is handled in a similar way; instead of causing the servo to move toward the center or edge of the disc, when the "fringe" photodiodes both see the same conditions, the beam is out of focus and it adjusts up and down (with a little bit of a wiggle to determine direction, story for another time lol).
When the lens is reaching it's limit (the controller knows this by how much current is flowing thru the servo coils), the sled motor kicks the whole assembly over a small amount so the lens servo can continue it's job of keeping things in line.
There's also no "waiting for anything to come around", unless the drive is being used for random access (like CDROM, or in the case of audio, doing something like seeking or changing tracks).
While playing a CD, the servo follows the entire track end to end, in just about real time (ignoring processing time and buffering), continuously collecting the stream as it goes past the lens, just as record needles follow the groove. The only real difference on that level is with a record, there's a physical guide to keep the needle where it needs to be to play the music... With CD, this has to be done with a non contact method, and so the extra photodiodes to tell the lens where to be.
If you watch a CD lens while a disc is playing (from the side of course lol), you will see it compensate for every little ripple and warp in the disc, just at a higher speed, than it's old school mechanical counterpart.
Also the data on a CD is not "stored in pits an lands", as he said it, but instead it's stored in the change between pits and lands.
It's not "land = 0, pit = 1", it's when the laser notices a change from pit to land and vice-versa, that it will read 1. As long as it stays in the pit or land, it reads 0.
This may be come off as nitpicky, but I think it's a quite interesting little fact
@@MadScientist267 there's one other significant difference from the analogy with a record player - the laser reads one track for one stereo channel, then the next track for the other stereo channel, and then moves on, alternating left and right as it goes, which necessitates one channel always being heard slightly later than the time it was actually read off the disc, by a difference of the time it takes for the laser to read one revolution of the disc. Or so I believe.
@@barthvapour Buffering fixes that
Underrated channel
You deserve wayyy more views and subscribers
This is such an interesting video. It kind of requires you to have a decent grasp of basic but varied technical terms (“but what does 5Hz sound like??”; “what is a bit??”) but if you do it is a strangely efficient, kind of fun experience. Kind of like being in a somewhat advanced class in a subject you like.
This is hand down the BEST explanation of digital audio I’ve ever seen.
CD doesn't use co-centric tracks, it's a single spiral track.
MyUsername09AZ one track. Just like a side of a record has one groove. Stunning how many TH-cam videos get this simple concept wrong
@@danieldaniels7571 They probably confuse it with floppy disks
Audio CDs don't use concentric tracks, but data CDs do. Probably that's the reason for the confusion.
sitbc
Audio CD and data CD are the exact same thing. A CD-ROM therefore uses a single spiral track as well.
And so do DVD and BluRay.
@@Brinta3 DVD RAM uses concentric circles. It may be the only one. I'm surprised BluRay kept the spiral.
I bought my first audio CD in high school in 1988. Thirty Three years later I finally have a clear, concise understanding how the sampling technology impacts the sound quality with examples that demonstrate the point. Thank You!
As I listen to this being streamed over a mobile connection to my bluetooth headset i get the feeling that somewhere along this pathway I'm not getting 48KHz audio.
@Pho Tato Not as a rule and not when TH-cam is set at the default of Normal quality. The streaming m4a (AAC) bitrate will be anywhere from 96 to 128 kbs or so with a minimum of a 44 kHz sampling rate. Upper frequency cutoff is well above 15 kHz and in fact at 128kbs, rolls off smoothly into the upper limit of the format's capabilities.
support.google.com/youtubemusic/thread/338369?msgid=348540
No need to beg when you're hot...😒
And radio fm only up to 44khz,and let's not dwell on am transmissions....48 kHz in really wasted on pop music though 😂
None of us are getting the 48KHz audio, not that it matters anyway. The highest quality sound stream available on this video is webm opus @ 160 kbps, 48000 Hz. If your browser can't play back the opus audio, then you're getting mp4a @ 128kbps, 44100 Hz.
@@ericscaillet2232 fm radio is 32khz lol
This is by far the coolest digital audio presentation/explanation I've experienced to date!
"Temporal masking". I knew this phenomenon had a name! I have this issue when watching movies on my car stereo system on my boat. Now I know what keywords to use when researching a solution. Thank you! Excellent video, btw.
Damn, this is great, high quality, everything that needs to be said is said, no unnecessary info
i went back and forth between the 8 bit and 16 bit audio and couldn't notice a single difference
i am disappointed in myself and my buying decisions
It was the noise floor. You can hear a very distinct “fuzz” (aka white noise) in the 8-bit one.
Say you have a music track that ends with silence. If the noise floor is low enough, you won’t be able to tell when the track has stopped playing. With a high noise floor, the ‘silence’ will become *more* silent the instant the track ends.
You must be going deaf
Listening through the Speakers of the ROG Phone 2 I noticed compressed dynamic range, distorted highs and a higher noisefloor. Problem is there are phone speakers that have lower dynamic range as the compression at hand.
Same here I didn't notice any difference
TH-cam implements additional compression so everything tend to sound the same. Very little dynamic range. 16 bit has more clear highs, not fizzy, if you notice hard enough.
I am listening to music on very high resolution equipment: like Stax headphones and decent home stereo too. It is amazing how well music is coded even in relatively small data mp3. I have to listen hard to hear difference between 128 and 320 kb/s or even lossless. What is far more important is: How well music was recorded and how good your listening equipment is. Well done digital audio!
Fantastic and informative content as usual. I'd love to see some videos on battery tech, particularly on what separates lithium ion chemistries from each other and what they are used for. Another video on machine learning, particularly applied to self driving would be of interest. Throw in some Tesla references, and watch your channel take off like a rocket. You deserve growth. More people should see your channel. :)
Fuck Tesla. And Elon Musk...
I am a electronic engineering student and an amateur self learning music producer and this has become the best and useful video that I've seen in TH-cam
The HiFi Store sells you high-end digital audio cables for 2000 $ per meter !
Thats a must !
It's made of GOLD bro
Don't forget, to get the most out of those 'digital' cables you're going to need 'DIGITAL' speakers and/or headphones. I recently ran into a problem receiving HDTV signals because I was using an old fashioned 'analog' TV antenna! lol
Lol... That's same as $1500 HDMI cable.
I can't tell you how happy I am to have found this video ! This information is so well presented
[6:42] Correction: the single track on a CD is a spiral, there are not multiple co-centric tracks - that is how hard drives are organized.
You just answered like, 300 questions I've always had about audio in the digital space...in one video. Amazing and thank you!
1:00 the scale you put on screen is horizontal, while bits represents the amount of possible values in the vertical spectrum.
1:34 bit nitpicky, but you display an analog mixer. There is no ADC/DAC in that thing.
2:55 this is untrue. There isn't more noise, it just can't capture anything over the niquist so it's way less high fidelity because it just can't capture anything over that frequency. This ofcourse if true if the ADC does not have a filter in place, but all ADCs have a filter.
7:29 I wouldn't put an echo in the range with dynamic processors. If that would work, reverb would be on the same list.
8:52 this is a bit of a misrepresentation. EQ does not transform the signal from the "time domain" to the "frequency domain". It's just a different visual representation, and one that is also possible in the analog domain. The "frequency domain" does not have a time axis, sure, but that doesn't mean the signal now is in a non-time domain.
I just want to touch on how much I admire the beautiful thumbnail of this video. The orange screen contrasting with the white and grey really caught my attention and made me remember this video for later, its wonderful.
Bravo! Accurate, succinct and clear from A to Z without any dumbing down. Subscribed!
Sampling and quantization are not the same thing, they are related but distinct concepts. Sampling is discretization of time while quantization is discretization of signal amplitude. The difference is important, and sampling and quantization are carried out by different blocks in a signal processing circuit. Love your videos.
As a communications engineering major, it’s interesting watch all this being used practically here. I had only studied it so far in my engineering textbooks.
by far the most detailed explanation on topics that i like!!!!
your channel is the best!!!!!
great video! worth mentioning that youtube cuts everything above 15 kHz though
So does vinyl. But I can't hear anything above 12Khz anyway so...
@@liquidsonly I have vinyl records that go up to 40khz, it all just depends on what it is.
@@cooliofoolio No good to me, I can't hear anything over 12kHz and no adult human can over 20kHz.
This is no longer the case. TH-cam has for quite some time been using a much better audio codec that will preserve up to 20khz no problem. A quick analysis of the audio in this video confirms this to be the case.
@@pkaulf thanks, I wasn't aware of the changes made, and I was starting to worry I had serious hearing problems hehe
The quality is unreal! I could easily binge six hours of these videos
Glad I got recommended this, so much stuff I never knew I never knew
I made a noise album about 5 years ago. It was, of course, mostly noise and purposely degraded sounds. After it was all done and released I wondered what the lowest bitrate was that I could bring it down to. I don't remember what the program was called, but I found one that brought it insanely low. I ended up fitting about 45 minutes of inaudible mess into just under 1.2 MB. It fit on a 💾. Now there's a stack somewhere in my basement of these because I never gave them out.
Thank you for this video. It really hit home with me.
What do i gotta do to get a few of those?
@@MilesPrower1992 TH-cam disabled messaging years ago on here so hit me up at Game Interest ok Twitter or Instagram. asingledigit.bandcamp.com/album/plastic-trigger-finger-and-polyrhythm is the album free if you want to download it, but the Bandcamp version is the high quality version, not the ultra low bitrate version.
1:21 when you explain 16-bit sampling and 65536 levels, those levels represent VERTICAL values of sampled audio, not horizontal. It is a fundamental flaw in your lecture. Horizontal values / precision is defined by sampling frequency.
While this is true technically, the Nyquist Shannon sampling theorem proves that 44.1 k recording, which is CD audio, can represent, literally perfectly, up to 22 khz which is 2 kilohertz above what any human can hear.
I agree this could have been shown better if the bracket had been drawn vertically. I had no problem with it, though, maybe because I already knew this. I took it to show that 'the samples we're seeing along here are each recorded at being one of 65,536 different amplitudes' - with the horizontal bracket showing a range of different samples all at different heights. I think due to the commentary and the context, probably very few people got the wrong message.
@@pilotavery With due respect, I don't think you've understood or correctly interpreted Maciej's objection. He's pointing out that the 65,536 levels mentioned are vertical/amplitude levels, yet the bracket on the graphic is drawn horizontally. He has a good point and the presentation would be even clearer if the bracket was drawn vertically.
Also, the nyquist 'limit' is not a technical limit. Off course higher frequencies can be captured. The nyquist curve represents the limits of human hearing, thereby considering higher frequencies obsolete.
@@boydrijkvan6500 the Nyquist Shannon limit, is by definition, the limit.
It says nothing about the bandwidth of human hearing. All it says is that you have to sample at twice a maximum frequency you want to capture.
If you want to capture frequencies above the Nyquist limit, then it wouldn't be above the Nyquist limit because you would be increasing the sampling rate.
You and your music and the way you do things... and the way you are has always shown just how superior you are. Thank you.
As a WiFi engineer is exciting to see waves working in sound instead of its electromagnetic form
this is the best explanation of digital audio I have ever come across! learnt a lot.
Well done.
As I've tried to teach myself electronics over the last few years, many times I've thought to myself, I wish I could find an abstract overview of various subjects. You do a great job of creating such an overview. I wish you had been around a few years ago with this upload.
Another subject that I've encountered and wanted an overview of is programming languages. It would be really helpful to have a reference that encompasses the history from punch cards to python. I've wondered how different languages waxed and waned within academic, commercial, and social contexts. What were they used for, how long, why, what were they replaced by, what better options were eclipsed by the momentum of second rate options, etc.? It's just an idea I've often wondered about.
Thanks for the upload.
-Jake
Upcycle Electronics
I too would like a programming video
Absolutely the best description for us layman I've ever seen. Bravo 👍👏👏
keep putting out this amazing content!
Soo informative and easy to understand. Gem of a channel. Please don't stop making content
Been waiting eagerly for this video! Thank you! Love your channel btw and I was here when you had 19k subs, told you that you'd get to 100k very quickly :D nearly there
By far the best overview ever made on this topic. Bravo 👏🏼
Such a strange presentation format.. no preamble ,just straight to the info/knowledge 👍
Sadly my ears have degenerated to the point where the difference between pure analog and digital is no longer obvious ( wear hearing protection kids 😁 ) but then again "listening to the stereo" kinda detracts from the pleasure of the actual music
You can retrain yourself. 😒
@@ericscaillet2232 I'm fairly certain he is referring to hearing damage. Retraining just isn't always a possibility when your hearing range is shortened and possibly has frequency loss.
Strange is a derogatory adjective, this guy put this video out to educate people and deserves more respect than that
@@chrisbeard4236 Dunno where you get that idea. If I am gunna be derogatory I don't mince my words. I very much enjoy this channel
With a lossless digital format (CD, FLAC, WAV) there is no difference anyway.
On a good lossy format (like OPUS/256kbps) you'll also struggle to hear a difference even with the best ear and gear.
(Note that MP3, even at it's highest bitrate of 320kbps is *not* a good lossy format by modern standards!
Newer formats such as AAC, OGG and especially OPUS make use of an improved understanding of auditory masking and improved compression algorithms, allowing them to retain a higher fidelity while using less storage space. The only reasons why people still use MP3 is because they don't know or don't care about the numerous better alternatives)
Best explanation of music recording and playback processes I've come across.. thankyou.
I always knew that there was a lot of maths involved in the mp3 standard, but I never realised just how much maths it was. Fascinating. Great video!
Signal processing and compression are deeply fascinating subjects.
This is my new favorite channel ! Bravo !
The fact that I can't perceive *ANY* difference when trying ABX tests of 192kbps AAC and lossless audio shows how far compression have improved!
(128 mp3 is easily discernable)
how old are you? You may have age-related hearing loss that makes your perception of sound fidelity limited.
@@DumbledoreMcCracken 28, my 24 years old friend that loves audio gear also can't discern between 192AAC and Lossless.
I've come to the conclusion that high sound quality (given you don't have REALLY crap headphones) has 90% to do with mastering and mixing, 7% to do with decent audio gear and 3% to do with what kbps you use.
@@HAWXLEADER Unfortunately, I can't hear for crap now. Age has destroyed my high frequency perception (tested by an audiologist).
I'm an audio engineer, specifically a mixing engineer, of 15 years. I also can't tell the difference in a blind test
@@DumbledoreMcCracken fortunately, the audiologist said my hearing was fine a few months ago.
the low 8bit sound example sounded way too good ..
Also 16 bits of audio resolution is still exactly what most compact devices output audio as today like smartphones and tablets for example so even if you have a 32 bit audio file it will still output it as a 16 bit signal that is amplified as analog audio. :)
Techno Universal That’s true. Generally I wouldn’t even use such files on my phone as you wouldn’t be able to tell the difference anyway, even though my phone DAC does support higher bit rates.
Someone once sent me a FLAC 24/192 (it was around 200 MB for 4 minutes), and he claimed it was the best thing ever this and that. At first I thought he was right, as it sounded like night and day compared to Spotify. But I wasn’t born yesterday, so I converted the FLAC to MP3. I don’t want to claim there is no difference, but I couldn’t tell any. I let him hear it, and he wasn’t having any of it and he thought I was faking it. I really think some people are delusional haha.
Marvin P.
Yeah plus apparently the DACs in IPhones that had headphone jacks never supported anything more than 16 bit audio however it’s still likely that they can output up to 32 bit audio in a digital format through the lightning port or maybe also through Bluetooth! So you would be able to benefit from it if you had the phone connected to a large audio system via the lightning port that supported 32 bit audio however you wouldn’t benefit from it with headphone adapters as the adapters all still have 16 bit DACs built into them for cost saving purposes! :)
Techno Universal I’m pretty sure the iPhone Lightning connector and Bluetooth (using aptX HD/LDAC codec) both top out at 24 bit.
At the end of the day, it doesn’t matter too much on a mobile device.
Based on my experimentation, I came to the conclusion that the way the music is mastered is more important than these numbers. I don’t think people care too much about it either, as most just use Spotify on their Bluetooth headphones anyway.
Marvin P.
Yup while Bluetooth headphones would probably only use 16bit audio anyways because of the limited bandwidth of a Bluetooth connection and to save on manufacturing costs! So yes primarily higher bit rates would be more important for mastering than anything else! :)
My PC can output in 24 bit. (Don't know about my mobile devices, but they have 24/192 certification so probably the same)
This video needs more views. This is a very succinct explanation of digital audio, and the underlying processes. Also useful for use against vinyl fanbois who say digital audio is crap.
Yeah... I’ve been listening to music for years now! I can barely tell the difference between vinyl and compact disc. Unless you have a really good stereo, most f it will sound the same. I will say that Dolby wrecks a piece!
If that's true, you're psychoacustically editing out the clicks, pops, and hiss of vinyl. They are there, your brain is just filtering out the upper frequencies.
This is the best explanation of digital audio I've ever heard.
Was this presented in a compressed form, like Meaningful Data compression layer 3? Because there was definitely an hours worth of useful information in this video.
tp7886 r/woosh
Comprehensive and easily understood review of digital audio. Thanks.
Great video. I’ve never dipped into MP3 encoding before so you have helped me understand this better.
Just one point. You show three clips of music recorded at increasing sample rates. Please note that viewers will not hear 48 kHz audio through TH-cam - all TH-cam plays at 44.1/16 causing your 48 kHz audio to be down sampled. Artefacts may be created going between 48 and 44.1 kHz as interpolation will be required by TH-cam‘a audio algorithms due to the mismatch in sample rates. For best results these clips should use rates that are powers of 2 of the base rate, such as 11.025 kHz , 22.05 kHz and 44.1 kHz, 88.2 kHz on TH-cam.
Downsampling 48 KHz audio to 44.1 doesn't cause artifacts. 44.1 and 48 KHz are both above the Nyquist frequency, so both capture the entire range of audible frequencies. A 20 KHz signal will not be distorted when samping the audio created by a 48 KHz sampling rate to 44.1 KHz.
TH-cam audio is 48hz unless on lowest qualities or old devices.
Wow.
I have just learned something i have wanted to know for many years.
That was fantastic. Thank you so much.
Uncompressed means not compressed. The word you should have used is decompressed, which means the inverse of compressed.
@Neb6 "not compressed" is the English word for not compressed. Decompressed does mean previously compressed, now not compressed.
I know nothing about audio and still you got me engaged till the end of the video. You have talent my guys, good video
7:52 While Fourier transform is used in audio processing, the shown method of frequency filtering is called linear phase equalizer and is only used in special cases for melodic sounds only. It's absolutely terrible with percussion, since it introduces noticeable pre-ringing which destroys vital transients. Moreover, it adds a large processing delay required for the transformation window.
It's best to avoid ever using Fourier transform explicitly and instead derive a time domain operator which can do roughly the same job. In fact, this way it's much closer to analog filters.
There's nothing in his high-level description that suggests linear-phase filtering. Theoretically any filter can of course be represented in frequency domain, though practical FFT-based filtering will generally be FIR. But while every linear phase filter is FIR, the converse is not true.
(Also, a common application for linear-phase filters is in steep low-pass filters (e.g. as part of resamplers) where they do not introduce ringing on top of whatever is intrinsically present in band-limited signals)
Wonderful video and the amount of facts crammed in is much appreciated.
It’s funny how the sound quality of recording had increased while the quality of the actual music has plummeted
Subjective
Very illustrative graphics and animations. Liked and subscribed
Okay, 16bit is enough for me.
Yes thank you for explaining that just because the wave is quantized does not mean it is not resolved back to a infinite analog signal
2:55 this is the first time that I've heard someone claim this, so a citation would be nice. By the way an increase in SNR is a good thing so probably the intention was to claim that the SNR decreases
Assuming you mean formula, I've seen it in many places and it makes intuitive sense so I don't see why it needs a reference
Nissim Trifonov I agree with you, I think he got it backwards. Noise increases as the signal frequency gets closer to the Nyquist frequency, not the SNR. The SNR gets lower.
The decrease in SNR is due to the reconstruction filter, is that correct?
The Nyquist sampling theorem really doesn't have a lot to do with SNR.
Perception isn't an illusion. An illusion is something made up by our perception, not something external which is perceived by it.
8 & 16 bit sounded almost identical
Try using different headphones. To me they sound totally different and I didn't even use my analytical headphones. If you are using bluetooth or earbuds, that's likely your bottleneck.
Oh my, oh my... this is the first video I see from this channel... 45 seconds and I'm already suscribed... 15min later it's like "oooooooohhh so that's what the professor meant on my DSP class 17 years ago"
"The 1.41Mbps requirement of CD quality sound could now be enjoyed at
bitrates as low as 128Kbps." Can it be listened to? Sure. Enjoyed? Debatable.
For vast majority of people accuracy of determining between 128 and a higher bitrate is pretty much in a guessing range, when it comes to blind testing. And even those in "the know" would only have better chance of telling the difference by actively looking out for signature artifacts of a lower bitrate. So don't spread that cork-sniffing audiophile bullshit. 128kpbs is a perfectly serviceable bitrate for many applications.
@@Quicksilver_Cookie If you can't tell the difference between a 128kbps and a 320kbps mp3, then I'm sorry for you. I mean, sure, I can't tell the difference between 320 mp3 and say a flac, but 128kbps just doesn't cut it. Especially nowadays when even phones have tens if not hundreds of gigabytes of storage.
@@8grafi8 ok
128 kbps is okay for casual listening. Spotify mobile only 96 kbps. I download most of the podcasts I listen to on my phone at 64kpbs when I have the choice. I don't know think I'd be enjoying them any more at 320 kbps.
Don't know how/ why YT algo suggested this, it is not related to any of my usual content consumption but the production quality and eloquent delivery of the information made it interesting and accessible to me (an illiterate ignoramus).
Thumbs up 👍
Technology Connections has been out done. Never thought I'd see that. Well done.
I'm not sure that's fair. They both address different aspects. TC tends to also concentrate on the machine itself and the different types of encoding/decoding involved.
They were really both excellent.
@@rationalmartian yep I understand most of what Tech Connections is talking about, this video has too much jargon I don't have enough background knowledge to comprehend
This was amazing! That was so much good information, and so well-delivered. The visual were extremely helpful, too. I teach an audio production class, and would love to use this video to illustrate some concepts.
3:33 Are you aware that youtube cuts any frequency beyond 15khz off? That means the sampling rate on youtube is only around 30 khz anyway.
I just literally recorded a piece of this video to audacity and a quick spectrogram shows that the audio cuts off at around 20Khz, maybe 19,8khz. so no, that is not correct. Also a google search comes up with 44.1Khz which seems accurate.
Just found this channel. Just straight info, I love it.
I can only discern a lossless from a 320kbps quality audio in very specific genres of music...like hi hats and crash cymbals from heavy metal / Rock (especially those albums that were recorded and mastered using analog tape) jazz and those cellos and percussion from classical music....but I have to be really in a quiet environment using a high end earphones...is the additional space worth it on the lossless? frankly speaking NO
Have you done ABX testing? If not, it's just placebo
The difference is drastic.. I hear them as completely different sounds..
Loads of information in this video, much appreciated, I'm saving this for later haha
It always makes me laught, that MP in MP3 stands for Moving Pictures...
Motion
because it is part of a system that was developed for encoding video, inwhich audio is apart of that system, the audio portion is used as standalone for mp3 because music is a thing
Xd
Ironically, VHS HiFi VCRs are one of the best devices for recording analog music.
Wow. I thought I understood how music was encoded, but this video blew me away. Thanks.
Good job, I wonder what the extremes of sensor equipment are.
Like the microphone with the highest useful bitrate, or when it comes to video, highest framerate.
Oh, what about the ratio between the number of pixels and the framerate for common cameras, and why that is?
Okay, just did the math and using a unit I just made up (frames per second per megapixel)
VHS
Pixels: 159840
Megapixels: 0.15984
Frame Rate: 30
FPSPMP: 4.7952
Generic full hd camera
Pixels: 2073600
Megapixels: 2.0736
Frame Rate: 60fps
FPSPMP: 124.416
Red Weapon
Pixels: 33177600
Megapixels: 33.1776
Frame Rate: 75fps
FPSPMP: 2488.32
I know there's probably a much easier way to display that, most likely bit rate or something, but it was still fun for me to do the math.
not sure to understand, microphones are analog equipment, so you don't really have a bitrate associated to it (if I understand what you're saying)
you have some limitations, like the bandwith, maximum level, and you could also have the dynamic range, but I guess they never give the dynamic range of a microphone because the preamp of the mic is what is going to limit the most the dynamic range (I guess ?)
I've found many microphones give frequency response curves.
For instance the microphone I typically use shows that it responds fairly strongly to frequencies up to about 200 hz, is quite linear up to about 8khz, then drops off sharply, hitting near zero response at or above 12khz.
However, the thing about microphones and audio equipment, quite aside from any considerations of analogue equipment...
Is it depends on the purpose.
There's no real point to creating headphones or speakers that can output much past 20khz if it's intended for humans.
But if you're experimenting on mice, cats, dogs or bats or the like, you may suddenly have need of equipment that can handle 40,50,60 khz... Or maybe even 120 khz.
There's other use cases too where much higher frequency responses are useful.
And for instance an ultrasonic rangefinder might operate in a few tens to hundreds of kilohertz, but fundamentally it's still basically the same tech as audio equipment.
In fact, within certain limitations it's possible to use the sound card on a computer to perform tasks that have nothing to do with conventional audio processing...
Such as things you'd typically associate with an oscilloscope - since, after all, the inputs are actually electrical, not audio...
By the way how did you derive your estimates for VHS?
VHS is an analogue standard, and directly encodes TV signals, but there's a wide variation in what precisely it encodes.
There's also considerable variation in quality between a PAL and NTSC recorder.
Plus it records interlaced video directly.
For an NTSC video signal you can expect it to record 262.5 lines per field at roughly 60 fields per second. Luma bandwidth is equivalent to somewhere around 240 to 256 pixels per line (again this is an analogue signal, so these aren't strictly speaking pixels), but chroma bandwidth is closer to 80 pixels per line.
PAL would result in 312.5 lines per field at 50 fields per second. Luma bandwidth is comparable to NTSC, but from what I remember reading the chroma bandwidth is about 140 pixels... Except that due to how PAL works this gets averaged over two lines, so the effective chroma bandwidth is even lower, even though it's higher on an individual line.
There's other factors; the oldest VHS standards had mono audio; later standards had stereo.
HQ VHS has higher resolutions in general due to design changes in how things are recorded and better tape quality.
(there is also a HD vhs format, believe it or not. But it's obscure and the tapes are digital; Quality is 1080i with similar quality to bluray)
There's also implications caused by LP, SLP and equivalent settings for tape.
So... If we assume a PAL recording with Hifi stereo and make some simplifications that aren't strictly speaking accurate for an analogue signal...
And assume this works out to 1 byte per channel for the image data. (stored as luma + 2 chroma channels).
Audio is broadly equivalent to 22khz 16 bit audio...
We get something like Luma 256x625x25 (really 256x312.5x50) + chroma 625x140x25 (really (140/2)x312.5x50 for each of two chroma) channels = ~6,187,500 bytes per second for video.
plus about 90,000 bytes for audio.
This leads to something close to 6 megabytes a second equivalent on VHS.
Again though this is analogue video, not digital. Furthermore it's completely uncompressed.
And when you work it out this means an hour of VHS tape stores the equivalent of a good 21.6 gigabytes or so.
Does that make sense?
Well, there ARE weird archiving devices that store data on VHS - the method they use isn't overly efficient, and they need a lot of error correction to compensate for the analogue format...
But it gets to about 2 gigabytes an hour.
But as I said, there is an obscure HD format on VHS tape (tape is essentially identical - only the player differs) which stores 1080i compressed digital video.
But still depends on the fact that when properly utilised for digital storage, a VHS tape has a capacity on the order of 50-100 gigabytes...
So...
You know.
Older media can be surprising. XD
This realisation comes about largely due to the fact that these old methods were essentially uncompressed video;
It follows, unsurprisingly, that uncompressed video takes up a LOT of storage space.
Hence VHS tape can store a LOT of data, in practice.
@@KuraIthys Honestly I don't know, what I calculated was most likely completely useless for almost any situation, I just like doing conversions.
And I just got the basic pixel measurements off wikipedia, it wasn't for vhs in particular, I just looked at the low definition video quality section and vhs was the first one I recognized.
Amazing channel. Can’t get enough of your info!
one day i want to be good enough to play a violin in 8bit
As a DJ, this video was highly enlightening (no pun intended)
48.000 hz? I think that is not noticeable here on TH-cam. :-)
It is
support.google.com/youtube/answer/1722171?hl=en#
@@gheegggggg Thanks. Did not expect that.
HD audio on TH-cam.... I underestimated the channel.
It's funny, I've repeatedly run into 'hearing test' videos on youtube where anything above 15khz is basically inaudible, and anything above 17khz is definitely inaudible.
Now, you'd think it might be me, or my equipment...
Except I have headphones rated for up to 28 khz, and I've tested this against pure tone generators and can still clearly hear things up past 18-19 khz depending on how loud it is.
So... Even if 48khz audio is in use, my practical experience with most videos would suggest other limitations are in effect on youtube.
@@KuraIthys 48000 hz is a sample rate.
You're talking about audio spectrum.
But you can hear only up to 15000 a 17000 hz on TH-cam? That's not o.k. at all.
That means, you're missing all high tones!?
Good luck... I do not know what the problem is :-/
@@KuraIthys You have to use 1080p for high quality audio
This is so beautifully presented. 10/10
I noticed exactly zero difference between all the examples
You must get better audio gear, the difference is definitely there.
@@CockatooDude nope youtube cuts everything above 15 khz.
@@bt3743 The first test was 8 kHz though, there was a definite difference between that and the second one.
Wow! This video can easily be named: "Life of audio signals, from artist's studio to ur home!"