recording engineer for 30 years??? duuuude. i wonder what your home studio looks like @_@ can i get your professional opinion about something? ..i came here because i was wondering if it would be worth it to get a Zoom H2, which records up to 24bit/96kHz. my goal is to get vocal recordings with as low a noise floor as technologically possible, using a Rode NT1 on a Yamaha MG10XU mixer, via XLR Mogami Gold mic cable, with USB output to a laptop, all plugged into a surge suppressor that has an RFI filter. is the Zoom necessary, and can the Zoom record from USB anyway? also, if so, is 24bit/96kHz a bit too high for the output of an NT1? #20questions
@prodigalus I would totally bypass your mixer. Buy a Class A preamp, go condenser mic into that preamp, then out of the preamp into the input of any interface, but make sure that interface channel has zero internal preamp, because you only want your Class A preamp in your audio path. Then for monitoring, open your DAW, load your stereo mixed song stem, and assign your mic to a new track; you can change your buffer to 64 samples so that you're not getting audible latency. Unless, that is, your preamp has real-time zero-latency monitoring like a Focusrite ISA One, which is perfect because you can also connect a compressor to it to do a bit of compression on the way in, which is a really good idea anyway. After all this, you have to make sure your mic setup is in the best spot in your room. If you don't have a treated room, you're always going to get weird artifacts from the room; this is incredibly important. The idea is to get the cleanest source on the way into your DAW. No treated room? Set up your mic where you can surround it with blankets, and stay away from room corners or you will get an ugly residual bass artifact. Do this and you are golden: you will have a super duper clean vocal.
I love this one Justin. If you ask a dozen mastering engineers what they think the difference between 16 and 24 bit audio is, they will fill your ears with superlatives and anecdotes. Please note that there are no converters or preamps that have a dynamic range anywhere near 24 bits, and, when properly dithered, 16 bit audio has "infinite" resolution. So the "greater room for gain-staging errors" in 24 bit is a false premise: you're not gaining the theoretical 48 dB of additional dynamic range, and simply using more bits isn't changing anything except the theoretical noise floor due to rounding errors (and the dither), which the converters, preamps, and especially microphones contribute to far more than even 16 bit dither does.
Also note that floating point cannot be properly dithered, and 32 bit float is a 24 bit file with an 8 bit multiplier... www.thewelltemperedcomputer.com/Lib/floatingdither.pdf
Thanks Justin! The idea of time vs. intensity really helped me separate the two dimensions and get it straight in my head. Really appreciate your explanation here, huge help!!
11:44: "....you have that much more room to basically screw up...." SO TRUE! I came up on 1 mic, 1 mono tape deck, 1 basement, in 1961. "Screwing up" in those days cost you BIG time! Experience is a great teacher!
Most of the point of 32 bit floating point in general is that it lets us go over 0 dBFS. There are any number of reasons we might do that at some point in a signal chain. We'll turn it down or squash it down or round it off somewhere down the line, but right at this point we'd rather just let it go over. 32 bit files are mostly for when we want to "print" a signal that might go over 0 dBFS. One common example is when printing a mix. You've mixed it to where you like it, then put some pseudo-mastering chain on it, then adjusted the mix some to make it maybe respond better to a later mastering step. Now you want to print that mix without the mastering plugins, but when you bypass or remove them, your mix peaks above 0 dBFS. You could turn down the master fader, or just render it at 32 bit FP and move on. We might argue that proper gain staging would have avoided the issue altogether, but in the 32 bit world, gain staging is arbitrary, and it really doesn't matter what we did as long as it sounds good in the end.
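A quick way to see the headroom point above in practice. This is a minimal NumPy sketch (illustrative values, not any particular DAW's code): a mix peaking above 0 dBFS survives a 32 bit float "print" intact, while a 16 bit integer print flat-tops it permanently.

```python
import numpy as np

# Illustrative only: 0 dBFS is +/-1.0 in float audio, +/-32767 in 16 bit PCM.
def print_to_int16(x):
    """'Print' float samples to 16 bit, clipping symmetrically at full scale."""
    return np.clip(np.round(x * 32767), -32767, 32767).astype(np.int16)

t = np.linspace(0, 1, 48000, endpoint=False)
mix = 1.41 * np.sin(2 * np.pi * 440 * t)       # a mix peaking ~+3 dB over 0 dBFS

# 16 bit print: the peaks are clipped off and cannot be recovered later
int16_print = print_to_int16(mix).astype(np.float64) / 32767

# 32 bit float print: the over-0 dBFS peaks survive, so simply turning the
# level down afterwards restores the original waveform shape
float_print = mix.astype(np.float32)
turned_down = float_print / np.abs(float_print).max()

print("float print peak:", np.abs(float_print).max())   # about 1.41, undamaged
print("int16 print peak:", np.abs(int16_print).max())   # pinned at 1.0, flat-topped
```

Turning `turned_down` back up by the same amount reproduces the original mix; doing the same to `int16_print` just gives you louder flat tops.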
You might want to add that in tracking, not every 32 bit recording is necessarily floating point. Fixed-point 32 bit PCM still clips if you hit 0 dBFS. Also, unless the recording hardware itself produces a floating point input that handles signals over 0 dBFS properly, you can still clip while tracking.
Justin, I made the decision to up my home studio skills this year and seek out good instruction on recording. I came across your YouTube channel. I wanted to tell you that I have thoroughly enjoyed hearing all your insights and expertise. You have a wonderful way of teaching and it has helped me so much to understand the basics and beyond. Really appreciate you and your channel :)
This is by far the best video explanation of everything related to bit depth, with a little history added to help you understand even more. Also thank you for being so unbiased about the whole thing. It's rare to find videos and people like you out there lol. Liked and subscribed (:
This is awesome, thanks! I think we can accurately discuss the "data" as having a higher or lower resolution, but the term becomes problematic when applied to the perceived audio. Higher resolution data ≠ higher resolution audio. Higher resolution data = lower noise floor. Ironically, audio data/bit-depth is a great example for understanding what "data resolution" actually means. It's text-book. But the term "resolution" should definitely not be applied to the sonic result of that data (two very different domains). Great video!
You've perfectly articulated why I use 32 bit only for recording and processing, and save everything down to 16 bit (although I'd prefer to do everything in 24 bit just to guarantee there was no way to find the noise floor; I'll explain in a second).

I run an internet radio station. I prepare files for airplay by centering audio files, setting peaks to 100% (actually I set them at -0.1 dB to avoid clipping when converting back to 16 bit), and trimming the opening and closing. Often I create edits and early fades to match the original promo edits of the songs I play. I sincerely thought that when I was editing in 32 bit I was in a kind of "fantasy" bit depth where it's impossible to clip and there's no noise floor. I've never had a peak clip, even at absolutely insane, extreme settings and levels. It makes me feel like I can't make a level error that can't be corrected by re-adjusting the peak back down (or up) to 100%. That made it perfect for editing, because no matter what I apply, I'm not adding noise or distortion.

The other benefit of 32 bit is that with my station, I have broadcast-style multi-band audio processing working overtime to recreate that big, pumpy, 80s FM radio sound. (OK, I'm not quite as aggressive as the 80s stations were with things like composite clipping... but my internet station is loud, which isn't supposed to be a thing.) One of the things I make sure the system does is pull up fades as much as possible to create that "wall of sound" effect they had, where the music never stopped. When you have a song slowly fading out and another slowly fading in, you may be pulling the volume up 40 to 60 dB. Suddenly that 96 dB noise floor is at 36 dB. Doing all I can to avoid adding any noise in that sort of situation is a good thing.

...Now, once I've got everything cleaned up and the file is set at -0.1 dB, I convert back to 16 bit. :-) As you said, NOBODY (not even with compression) is going to hear the 96 dB noise floor... especially when most of these recordings are from analog studio tape that probably had a 60 dB S/N ratio anyway. Your explanation was outstanding, and clarified so much about bit depth that is so often presented across the internet in very muddy terms. Thanks for your efforts!!
Great explanation. In the studio, we typically capture in 24 bit. However, in the field, we often capture at 32 bit FP, since with 32 bit FP there is zero chance of anyone, or any unanticipated event, driving the signal into distortion or into the noise floor. With some types of mic'ing situations in the studio, we will capture at 32 bit as well, since dynamics as a function of loudness grow geometrically with decreasing source distance: a 40 dB dynamic range at 1 meter can easily grow 30-35 dB higher at 1 cm. Just an FYI: Pro Tools, whether in the DAW or in Carbon or MTRX hardware, ALWAYS truncates at around 22 bits, regardless of settings. So unless you want that nasty truncation-related distortion that can ruin a recording, do NOT assume you have the ability to use even a full 24 bits of gain staging in Pro Tools, let alone 32. Avid has admitted this, and numerous YouTube videos, Reddit discussions, and public discourse have made note of it.
Regarding dither: adding it smooths out the fade of the audio as it approaches the bottom of the dynamic range, so that the fade sounds natural, more like it does in analog. Without dither, the last few seconds of a fade will sound brittle and choppy the closer you get to complete silence.
That would be true if you were actually able to hear a noise floor that is 96 dB below your peak level. Since you can't, even trained listeners aren't able to distinguish between dithered and undithered 16 bit audio in proper blind listening tests. -Justin
Kind of mind blowing, sound really is just time and intensity. Our ears have just evolved to take note of the quick pressure changes caused by the creatures and objects moving around and vibrating the air when they do, like ripples in the ocean.
100%. Very insightful! I have a whole article on some of these ideas that you might enjoy: sonicscoop.com/beyond-the-basics-harmonic-motion-and-the-root-of-all-music/ -Justin
I think you have not mentioned that some recorders use dual A/D converters to overcome the ~120 dB dynamic range limitation of the analog circuits when recording in 32 bit float. Each of the A/D converters covers a different range than the other. That way it is possible to achieve a truly wide dynamic range. In field (location sound) recording, it is sometimes invaluable to have room to recover from extreme situations. A good example would be a shoot-and-run documentary. On some occasions, being able to bring a whisper back to an intelligible level without touching the noise floor is crucial. The same goes for recovering a very loud source without clipping. Thanks!
I stopped worrying about bit depth when I listened to a pop song encoded in both 16 and 8 bits side by side and couldn't hear a difference. I'm glad DAWs use floating point though, so I don't have to worry about clipping.
it sounds like working in 32-bit is the audio equivalent of using RAW files, HDR (beyond what can be displayed), or super high resolutions/quality in visual art. Not perceptible or maybe even transmittable in the final product, but keeps you safe for any crazy nonsense you want to do until that final mixdown/render. Thanks for the explanation!
So I guess the 64 bit clock in my interface isn't all that necessary! But I just wanted to thank you for the info on dithering; there just isn't enough out there, and some producers like Bob Katz say they wish they had known more about dithering before they started. Free education has always been my favourite, thanks!
I have a question about an analogy I read: pouring 16 milliliters into a 32 milliliter bottle. That means the other 16 milliliters of space is just silent waste, so when you truncate back down to 16 bit... does it try to include the wasteful air and get squashed together with the 16 milliliters into the same 16 bit size? Or is that empty space discarded, thus making 32 bit perfectly harmless for upconversion? VirtualDub2 doesn't have a 24 bit option and I'm dealing with WMA SRS audio, so I'm concerned. It was only recently that AviDemux 2.8.0 got WMA Lossless, and I'm on 32 bit Windows so I can't use it. Encoding Spirited Away for my XT2041DL via VP9+Opus.
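For what it's worth, the bottle analogy can be checked directly: the "empty space" isn't squashed into the smaller file, because down-conversion just re-quantizes the same full-scale range with coarser steps. A minimal NumPy sketch (illustrative sample values only):

```python
import numpy as np

# A 32 bit float "bottle" holding a signal that only uses part of the range:
x32 = np.float32([0.5, -0.25, 0.999, 0.0])

# Down-conversion to 16 bit: rescale to the integer range and round.
# Nothing gets "squashed" -- the unused headroom simply isn't stored.
x16 = np.round(np.float64(x32) * 32767).astype(np.int16)

# Converting back shows each value lands where it started, give or take
# one 16 bit quantization step (roughly 0.00003):
back = x16 / 32767.0
print("worst-case error:", np.max(np.abs(back - np.float64(x32))))
```

So the empty headroom is discarded, and the only cost of the round trip is one quantization step of rounding error per sample.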
But, in my humble opinion: noise floor isn't the only sonic effect of bit depth. Bit depth, being how finely the signal is sliced dynamically, can make a difference in audio sounding more or less grainy. Analog, regardless of the dynamic range, is a smooth wave. Digital is a stair-stepped wave, which is easier to hear at lower bit depths. Sonically, this used to translate into dirtier or more "chattery" sounding audio on older digital machines, especially as the signal got quieter. Having said that... I record at 48k/24 bit, and mix at that for vinyl, or 44.1k/16 bit for everything else. That seems to be good enough for just about any situation, to my ears. :)
wow. what can i say. i am no professional, but listening to this entire video, i am completely satisfied with the answers you've patiently explained. thank you so much. i truly get it now, and i can't wait to see what else i can learn from your videos!!! but... with my Rode NT1 and Yamaha MG10XU mixer, what is the best way to eliminate as much noise as possible? i have a Mogami Gold mic cable and laptop/mixer, all plugged into a surge suppressor that filters RFI. the mixer outputs 24bit/192kHz audio via USB, but is it comparable or inferior to an audio interface around the $200 price point? (concerning noise floor)
I'm only here because I've been using 48kHz forever with a 256 buffer size. However, I want to start streaming myself producing instrumentals live using OBS, and I don't want higher input delay. Because of that, I'm considering dropping my sample rate to 44.1kHz and also my buffer size down to something much smaller, but I don't want it to have a huge impact on the quality of my recordings/uploads. Would it be a significant difference going from 48kHz down to 44.1kHz, as well as dropping the buffer size from 256 down to maybe 96, or possibly even 32? I can't believe how many arguments online there are about this and everyone seems to claim a million different things! Zero clarity on this subject! (NO PUN INTENDED)
You said that you're not telling us not to dither. But you also said that you can only hear it if it's turned up to an unrealistic level. So there's no point in doing it really? But still do it lol. Thanks though. Great info! I listened to this a couple years ago, but it finally really makes sense now.
Yeah, 24 bit has a theoretical 144 dB. The very best, most expensive AD/DA converters (Prism, Lynx, Apogee, Dangerous, etc.) only give you around 120 dB, if that. So the technology to utilize the full theoretical 24 bits (144 dB) doesn't exist yet as of the year 2020.
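Those figures fall straight out of the usual rule of thumb, roughly 6.02 dB of theoretical dynamic range per bit. A quick sketch of the arithmetic:

```python
import math

# Each added bit doubles the number of amplitude levels, which adds
# 20*log10(2) ~ 6.02 dB of theoretical dynamic range.
def theoretical_range_db(bits):
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits} bit -> {theoretical_range_db(bits):.1f} dB (theoretical)")
# 16 bit lands near the familiar 96 dB figure, 24 bit near 144 dB --
# while the best real converters top out around 120 dB, i.e. roughly
# "20 bits" of usable range.
```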
So are you saying that as a producer I shouldn't stress this topic so much, and I could work at 44.1kHz and set my buffer size at 16 in FL Studio? When you talk about 8 bit, 16 bit, 24 bit, etc., is that the same thing as setting the "BUFFER SIZE" on my Focusrite Scarlett 2i2?
My takeaway: 32-bit audio files are good for digital processing, so you can work with your sound files with the most precision and the least potential for conversion artifacts when sending your signal to an audio processor, or on to another processor. However, 32-bit is useless for final files, because no clear benefit results from converting a 16-bit file to a 32-bit audio file; but files that were processed as 32-bit files before final output are more precise in the end.
Waves plugins do use 32 bit floating-point processing; the L1 Maximizer, for example, also has dither and noise-shaping options, and you can hear it very well. Most material we get for mixing is 24 bit/48k; we rarely see 32 bit or rates higher than 48k, like 192k. Maybe in Atmos you use those figures, because you have a lot more channels going on? And the master is also rendered to 24 bit/48k and sent to the mastering engineer.
So glad you mentioned 32 bit float and clip gain in Pro Tools. I was amazed at how much of a difference that makes, and it honestly sold me on 32 bit float for processing. Thanks Justin! All the best
So it would be like taking pictures with a camera using RAW vs JPEG: RAW gives you more room for error correction, but it will be finalized as a JPEG to be consumed afterwards, just like recording at 24 bit and then converting it to 16 bit for consumption. Sharing a RAW file with people would be overkill unless it's for them to work with in a program? Also, I'd like to think of the kbps of an MP3 as the difference between a very noisy JPEG image and one that has very little noise, but at some point the noise will be too low to notice anyway unless you really push to look for it. So 320kbps is probably about as clean as it gets for most people's listening equipment. I've been getting FLACs from Bandcamp albums I've purchased, but it's really hard to tell whether there's any difference. Mainly I just have the storage space for it, and a portable player that can play them back, has a headphone jack so I can use it without Bluetooth if I want, and supposedly has a nice DAC in it. The reason I looked up this video is that there are 24 bit recordings of Linkin Park, but I know they're still going to have constant loudness because of the loudness war, so what is the point of even paying more for such recordings when the sound is still pushed to the threshold of just below clipping? At least that's how I imagine those CDs were recorded. I don't think Linkin Park has any quiet sound, it's just full throttle xd.
32:10 now this _is_ true. The effective resolution of 16-bit recorders really wasn't that good. When the 24-bit capable recorders like the Korg D1600 came out the 16-bit mode was full depth and the issue became moot.
Justin, would you be able to tell me what kind of desk that is you have behind you? the design looks like something right up my alley and I've been looking for a suitable studio desk for years. Thanks.
Brilliant throughout: no-nonsense, no off-topic jargon. And Focusrite is the best. Saffire Pro 24 for 2.5 years, not a single issue; bumping up to the Liquid series. IDC what anyone says, FireWire is proper real-time data transfer. Great explanation. I opt for the highest bit depth offered, simply for dynamics, and as you said with dithering, a smoother snap. It's nice to have a literal definition put to it; search a million write-ups and they're all basically useless when you're looking for a concise, to-the-point layout. 24/192 is heaven. I have a unique way of doing things... Thanks for the efforts put forth!! Cheers!
Those are pretty dated. Focusrite doesn't support or make drivers for those anymore, especially if you aren't on Windows 10 yet. I've gotten away from Focusrite because they have a tendency to abandon driver support. MOTU still supports and makes drivers for all of their interfaces, even if your interface is 10 or 15+ years old.
#TLDR Bit depth is the RESOLUTION OF AUDIO DYNAMIC RANGE / AMPLITUDE INCREMENT.

#ComparisonToCamera It's not comparable to visual resolution in terms of pixels, but more comparable to visual BRIGHTNESS/LUMINOSITY DYNAMIC RANGE. Higher dynamic range in a camera lets you capture the difference in shades of dark and bright. It might not be so important for most people/consumers, but it's pretty useful for people who want to do post-production. In audio: music producer, mixing engineer, mastering engineer. In visual: photo and/or video editor, retoucher. Just like in visual, to see more shades of brightness you need a camera with those specs to capture it in the first place, and a display monitor to reproduce it. In audio, that means mic and speaker/headphone. 16 bit is enough to cover everything you need. Higher bit depth is useful if you want more control and flexibility during post-processing.

#ComparisonToPhotoshop When you crank up the contrast or brightness using a curve or level adjustment layer to the extreme. Lower bit depth: you get parts in big chunks. Higher bit depth: you get parts in smaller chunks.

#ComparisonToNumerical Translated back to audio, it's in terms of audio level. Lower bit depth: you get 0.01 accuracy. Higher bit depth: you get 0.0001 accuracy.

#ExampleAudioToVisualComparison
1.A. Lower bit depth in audio > RANGE: silence ---> someone screaming 1 meter from you. > A whisper captured from 1 meter away will blend with the noise.
1.B. Lower dynamic range in visual > RANGE: black screen ----> white paper. > You can't capture words written on paper in a room with low light; even if you increase the brightness/ISO, it will just increase the noise. > You can't capture the iris in an eye and the clouds in a backlit photo with a bright sky: you can't get the cloud details even if you darken the whole sky, and you can't get the iris details even if you brighten the eye.
2.A. Higher bit depth in audio > RANGE: silence ---> jet engine 1 meter from you. > A whisper captured from 1 meter away will still be heard above the noise.
2.B. Higher dynamic range in visual > RANGE: black screen in a dark room ----> the sun. > You can capture words written on paper in a room with low light; if you increase the brightness/ISO, it will gradually reveal the words until you push it further and noise shows up. > You can capture the iris in an eye and the clouds in a backlit photo with a bright sky: you can get even more cloud detail if you darken the whole sky, and iris detail if you brighten the eye.

Imagine all of those visual examples by focusing only on the dynamic range, not the pixel resolution (because in real-world scenarios, a camera with higher dynamic range will most likely have higher pixel resolution and a different set of lenses). With that said, imagine all of those examples are for a 24MP camera with the same fixed 50mm lens and the same f-stop, so the only variable left is the sensor's dynamic range capability.
I don’t make records, but am very into listening to them. I’ve come to you via DAC processing per Chord QUTEST and wanting to understand what bit depth and sample rate mean. Your presentation helped a lot. Thank you!
Thank you very much for this video. So working at 44.1 kHz / 16 bits is enough, and the best compromise (quality vs. CPU resources)? My computer is quite old so it suits me, and no need for dithering!
Thanks for such a nice explanation! I had a question on an exam where I missed some points. Could anyone please explain it to me? The question is: signal frequency is 20-20,000 Hz, sampling frequency is 45,000 Hz, bit depth is 8 bit, all coded with PCM. What can you tell about the signal quality from the above?
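One way to reason through those numbers (a sketch of the standard textbook approach; variable names are made up, and this is not necessarily the answer the exam wanted):

```python
import math

# Hypothetical worked answer, reasoning only from the given figures.
signal_band_hz = (20, 20000)
sample_rate_hz = 45000
bits = 8

# Nyquist: sampling at 45 kHz captures everything below 22.5 kHz,
# so the full 20 Hz - 20 kHz band fits, with a little margin for the
# anti-aliasing filter.
nyquist_hz = sample_rate_hz / 2
print("Nyquist limit:", nyquist_hz, "Hz; band fits:", nyquist_hz > signal_band_hz[1])

# Bit depth: ~6.02 dB per bit, so 8 bit PCM gives only ~48 dB of
# theoretical dynamic range. Frequency response is fine, but the noise
# floor is high, so quiet passages will be audibly noisy (or grainy,
# if undithered).
print("dynamic range: %.1f dB" % (20 * math.log10(2 ** bits)))
```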
The only real reason to use 32-bit intermediate audio files is exactly the same as using it internally within the DAW: it's possible in your processing or mixing to do something that pushes the intermediate output over 0 dBFS, in which case you could end up writing an intermediate file that contains clipping, which is then very hard or impossible to fix. I learned this the hard way once…
I've just received a project to mix from a client, 60 tracks at 16bit. I was going to ask her to resend it at 24bit, but after watching this, it would seem that I should be OK. Thanks for the info.
It should be, sure. It's still a best practice to always get the full resolution files, and if they are available, you may as well request and work from them, but it's REALLY not going to make or break the project when they are not available.
@SonicScoop Thanks for the response, it's really appreciated. I've started the mix for her already using the 16bit files. There seems to be more hiss than usual, but I'm not sure if that's how they've been tracked or if it's to do with the noise floor being higher as you've mentioned? I've had to use Waves X-Noise on a couple of them. I think she was using a novice producer/home studio for tracking.
Do I need to use dither on individual stems as well if I render those to 24bit wav files for further editing (my project is 24bit and I use Reaper which processes at 64bit internally)?
I'm so happy I found this. I'm new to the recording world, working with interfaces and a DAW. I have a stupid question to ask that you might help me with: what would be the best settings, 44.1 or 48 kHz, at 16 bit or 24 bit? Thank you
48khz at 16 bit. Anything more than that is diminishing returns and no one would ever notice it (No one has ever said: “Wow, this video sounds like 16 bit”). I think the only exception to go higher is if you work on professional music videos, theatrical movies, etc that will be played on 5.1 and 7.1 speakers/headphones
I am not sure I agree, though I skipped a lot because 30 min is way too long for a simple answer. Things I picked out: white noise literally doesn't have much to do with bit depth; it's more about microphones and headphones vs. monitor headphones, speakers vs. monitors. ... My understanding is you record at 24-32 bit; yes, there is wasted computer code and memory space, but recording at this bit depth allows you to save clipped takes and convert bits later, properly recovering the audio and mastering it undamaged. I believe this video glosses over this and just says it's a waste of space, or skipped the end results, but it was way too dragged out. 16 bit in the end is okay, if you want to delete the unused data space. But if you start at 16 bit you are doing it wrong: if you clip, the file is destroyed. You could think you fixed it with, say, normalize, but all your waves will have a flat-top haircut, permanently destroyed. Always record initially at 24 minimum, but I love 32. The audio software bragging about 64 is definitely overkill, but 32 is where you want to be. You can simply test me: record too loud at 32, save, then save the same thing at 16, and then try to fix both files. You will find the 16 bit permanently destroyed, while the 32 was destroyed and can be made crystal clear. I have ADHD and I honestly think the title of this video is very ironic.
Here’s where I get lost with bit depth and sample rate. I grew up on the Atari and Nintendo consoles. That old 8-bit sound isn't an issue of noise floor or dynamic range. Why is it an issue of resolution at the lower rates, but dynamic range and noise at 24? For sample rate, if you took it to extremes and went to 1 sample per second, I can see it limiting frequency response, but wouldn't it also be losing a lot of information? Like 24 frames per second in film looks smooth, but if you took it down to 12, 6, or 2 frames per second, you'd lose a lot of information and the movement would look odd at best. What's the difference between frames per second and audio samples per second?
I get the confusion there. I had it too at first. But what you're really hearing with old school Atari and Nintendo is primitive 8 bit synthesis, not 8 bit audio recordings. Totally different ballgame! 8 bit digital recordings, when properly dithered, just sound like regular recordings, but with a fairly high noise floor. Try to run this experiment for yourself! Convert a file down to 8 bit (and dither it). Assuming you keep the sample rate the same, it probably won't sound nearly as bad as you expect it to. It'll just sound fairly noisy. If you went down to 1 sample per second, you wouldn't really hear anything, as that would only be capable of playing a sound of 0.5 Hz, well below the range of human hearing. But if you used a sample rate of, say, 8k, you'd only hear frequencies up to around 4k. This was actually a sample rate used in some early digital telephone systems, I believe. Works OK enough for speech (by the analog standards of the day at least), but it's way too dark for music. That would sound even worse than AM radio, which already sounds pretty bad! Hope that helps.
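The experiment suggested above can be sketched in a few lines of NumPy. (A synthesized tone stands in for a real file here so it runs standalone, and the TPDF dither is the textbook version, not any particular converter's implementation.)

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 44100, endpoint=False)
signal = 0.5 * np.sin(2 * np.pi * 440 * t)     # stands in for "a real recording"

step = 2 / (2 ** 8)                            # one 8 bit quantization step

# TPDF dither: two uniform random values summed, one step peak-to-peak each
dither = (rng.uniform(-0.5, 0.5, t.size) + rng.uniform(-0.5, 0.5, t.size)) * step
dithered_8bit = np.round((signal + dither) / step) * step

# The total error behaves like steady, signal-independent hiss -- not a
# mangled version of the audio. Its RMS works out to roughly half a step.
error = dithered_8bit - signal
print("error RMS:", np.sqrt(np.mean(error ** 2)))
print("error peak:", np.abs(error).max())
```

Listening to `dithered_8bit` vs `signal` is exactly Justin's experiment: the music is intact, just sitting on a hiss floor roughly 48 dB down.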
The 8-bit sound you got from an old Nintendo console is because they didn't use a proper anti-aliasing filter and no dithering. So if they used modern gear to record that 8 bit sound, it would actually sound pretty good.
Well and good for a setting where you get to predesign and adapt your sound design to your specific capture goals, but what of the cases when you need screaming, clanging, uncontrolled environments, or you mean to capture live Foley that cracks that noise ceiling? It seems to me like the bias of being an expert in sound makes the issues posed by a limited un-clipped range seem less and less of an issue. As you say, "you get to be sloppier," and yes, yes, but that's critically important headroom for those of us without all the appropriate equipment and training to capture whatever we're going to need to capture (and, presumably, to know and prepare in advance not to clip when one of our actors goes into a primordial tantrum, screaming out the pain of small-town community theater life into an ill-suited-to-the-job dynamic mic they left at the table, across the room from the condensers on the interface).
So I have a 24 bit master. Should I dither on the bounce for digital distribution? Do digital stores already do that process when they receive it? Or would you say that dithering isn't necessary anymore and that it's all a matter of taste?
@nicksterj You have mentioned multiple times that there is no way to hear the difference in bit depth, and that you are skeptical that I could hear it, so I have provided academic sources that specifically cover this issue. For starters, in this paper, "Theiss and M. O. J. Hawksford, “Phantom Source Perception in 24 Bit @ 96 kHz Digital Audio,” 103rd Convention of the Audio Engineering Society (1997 Sep.), convention paper 4561," they tested varying bit depths and sample frequencies, and they found a discrimination rate of 94.1% when comparing 96kHz/24 bit to 48kHz/16 bit, while a discrimination rate of only 64.9% was reported for 96kHz/16 bit vs 48kHz, also at 16 bit. This implies that both sample rate and bit depth are indeed discernible, and that bit depth plays a larger role than sample rate in human perception.
I'm having trouble finding this study. Can you link to it? Was it double blind? Under anything approximating normal listening conditions? Based on the 94.1% discrimination rate, I'm going to guess the answer to that is a resounding "no" :-) The likely scenario is that either it wasn't double blind, or there was some type of trickery being done that does not remotely mimic normal listening conditions. Probably both! I recall a paper submitted to AES where they took the fade-out tail of an extremely quiet moment in a piece of music, jacked up the level by like 70dB, and then had people listen for differences in the noise floor. Yes, THAT is possible, obviously! X-D But the part this leaves out is that if you played back the rest of the recording under these conditions it would probably blow out your speakers and your eardrums :-) Is this that "study"? If so, it fails to demonstrate what it seeks to demonstrate, and confirms what we already know: that the bit depth only makes a difference if the noise floor it adds is loud enough to be heard. That just isn't the case with normal program material mastered properly at 16 bit. Adding more bits above that demonstrably makes no difference at all. This is not really controversial. I wish it were different too, but it doesn't appear to be. I hope that helps make sense of it! -Justin
Well, can't say I agree with the entire discounting of resolution (repeated generations of 16-bit processing on 16-bit data definitely results in a "sheen" that subtly degrades presence). However any advice is good advice if it takes people away from "you gotta record hot" (or worse the utter bollocks that is "you gotta use all the bits").
I'm a computer engineer who makes sound cards. I guess MAYBE you can describe it as noise floor, but it's not accurate: the circuitry around the converter determines the noise floor. 24-bit allows you to record quieter sources and capture UP TO 144 dB dynamic range, compared to 96 dB dynamic range for 16-bit. If your mic has a lot of self-noise, then 24-bit won't help you, because the mic is determining your noise floor. If you have a cheap sound card like a Behringer UMC202/UMC204/UMC404, then the circuitry around the ADC/DAC is low quality and won't sound as good as a MOTU. Cheap op-amps destroy the audio quality. The benefit of 32-bit audio is that it's in floating-point format, and it's trivial to convert from 24-bit to 32-bit float, so it's kind of a gimmick but not useless. The biggest benefit of 24-bit is that you can keep your mic gain lower, avoid using a Cloudlifter for that SM7B, and use digital gain instead. The way audio software works is that it bit-shifts the 16-bit or 24-bit audio samples so they're 32-bit (i.e., the MSB gets shifted to bit 31; bit shifting is a single-cycle instruction), so 32-bit is technically a little lighter on your CPU, but it won't be noticeable unless your CPU is peaking.
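The bit-shifting described above can be illustrated in a couple of lines (the helper names are made up for this sketch; real drivers do this in fixed-point C, not Python):

```python
# Made-up helper names, for illustration only.
def sample16_to_container32(s16):
    """Shift a 16 bit sample so its MSB lands at bit 31 of a 32 bit word."""
    return s16 << 16        # a single shift instruction on real hardware

def sample24_to_container32(s24):
    return s24 << 8

# Full scale in either source format lands at (nearly) the same 32 bit
# level; the extra low-order bits of the 16 bit source just sit empty.
print(sample16_to_container32(32767))    # 2147418112
print(sample24_to_container32(8388607))  # 2147483392
```

This is why mixing 16 and 24 bit sources in one 32 bit engine "just works": alignment is by the most significant bit, so levels match regardless of source depth.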
I needed this information. Everything makes a lot more sense now. I'm stuck with 32 bit. I'm using an Edirol FA-101 on Windows 10 with the Win 8 driver. I can change from 44.1 to the maximum on the dial, but it stays 32 bit. I've been using Mixcraft. I would have loved to be able to lower this, but it's just not possible with the setup I'm using. I actually like the way everything sounds at 44.1, because the computer can handle it. Back in the old days I could get 32 tracks at 24bit/96k on Win XP; nowadays I get fewer tracks, but things sound great.
I had a Boss 16-track recorder that was 24-bit/48k, but I couldn't figure out why it started sounding muddy and cloudy after about 6 tracks. It was only after I got rid of it that I discovered that after so many tracks it reverted to 16 bit, so no science or explanation on earth can convince me bit depth doesn't matter.
Correct me if I am wrong, but are you not encoding an analogue waveform (i.e. the superposition of all harmonics/acoustic reflections/phase etc.) into a digital representation of it using bit depth and sample rate, yes? Then bit depth (as well as sample rate) will determine how well you are able to rebuild that waveform. By not having sufficient steps (i.e. bits), are you not restricting the resolution of fine detail (which equates to harmonics and stereo spatial placement) in the rebuilt waveform? Do a spectrum or Fourier analysis of the original waveform and the reconstructed waveform, and compare. Do this at different bit depths and see what you get. Look forward to your comment.
@MF Nickster Thank you, Nickster. It confirms what I suspected. What you are saying is that the binary coding is used to determine amplitude only. The more bits you have, the louder the amplitude you can take (like the difference between using metal tape and ferric tape in a cassette deck of old), i.e. you can have greater dynamic range between noise and clipping. However, you have a fixed minimum dynamic change as defined by one bit. This is your quantization limit. Like I said before, real music contains a lot of information, some of which may involve voltage fluctuations smaller than defined by one bit. This is the minimum resolution. Signals that fall inside this range will be rounded up (or down). The bottom line is that when you do the D-to-A, your waveform will be missing that info (re harmonic content), replaced with different harmonic content consistent with the statistical error correction used to "join the dots". As I said before, the missing content will hold information detailing the richness of the instrument(s) and, in a stereo recording, positional placement, ambience and a three-dimensional presence. You can increase the time resolution by increasing the sample rate, but not the amplitude resolution. This would explain why so many people think DSD recordings sound so much more real, where your sample rate is the most important variable.
Thanks, Nickster. I think I have finally got my head around this. The key point here is when we talk about dBs. A dB isn't an absolute measurement like a metre, kilometre or second; it is a logarithmic ratio requiring two numbers. In the context of signal amplitude it is essentially:

dB = 20 x log10( |Amplitude_max / Amplitude_min| )

This is the standard definition of a dB for amplitude ratios. When we look at an analogue signal, we can calculate dynamic range in dB by using the quietest bit above the noise floor and the loudest bit. When you are looking to record sound, your medium needs to be able to accommodate this variation without losing detail in the noise floor or clipping. This was true when we used to record music on cassette decks, where the record level had to be adjusted to maximize the headroom available on the tape. It is also true of the digital medium, where the available dynamic range is defined by the number of bits. At this point, what you say about bit numbers is mostly true. There is a rule quoted in various documents saying each bit is approx. 6.02 dB. This number varies with the number of bits employed in your process:

dB = 20 x log10(2^n - 1), where n is the number of bits
(dB per bit = dB / n; dB per step = dB / 2^n)

I built this table based on this formula:

Bits   Steps            dB per step   dB per bit
 1     2                0.0000000     0.0000
 2     4                2.3856063     4.7712
 3     8                2.1127451     5.6340
 4     16               1.4701141     5.8805
 5     32               0.9321011     5.9654
 6     64               0.5622939     5.9978
 7     128              0.3287193     6.0109
 8     256              0.1880110     6.0164
 9     512              0.1057977     6.0187
10     1,024            0.0587866     6.0198
11     2,048            0.0323351     6.0202
12     4,096            0.0176380     6.0204
13     8,192            0.0095540     6.0205
14     16,384           0.0051445     6.0206
15     32,768           0.0027560     6.0206
16     65,536           0.0014699     6.0206
17     131,072          0.0007809     6.0206
18     262,144          0.0004134     6.0206
19     524,288          0.0002182     6.0206
20     1,048,576        0.0001148     6.0206
21     2,097,152        0.0000603     6.0206
22     4,194,304        0.0000316     6.0206
23     8,388,608        0.0000165     6.0206
24     16,777,216       0.0000086     6.0206
25     33,554,432       0.0000045     6.0206
26     67,108,864       0.0000023     6.0206
27     134,217,728      0.0000012     6.0206
28     268,435,456      0.0000006     6.0206
29     536,870,912      0.0000003     6.0206
30     1,073,741,824    0.0000002     6.0206
31     2,147,483,648    0.0000001     6.0206
32     4,294,967,296    0.0000000     6.0206

Clearly, when we talk about bit resolution, we need to be careful about what we are saying. Yes, the number of bits sets the available dynamic range, but the fact that you are counting in bits also limits the amplitude resolution you can encode. When I asked about the quantization limitation, you kindly pointed out that excellent video describing dithering. After reading other documents, it is clear that this is a useful statistical "sleight of hand" where you overwhelm the inharmonic distortion (caused by the "stepping" from one time sample to the next) with a more benign, noise-like distortion. We humans are very, very sensitive to inharmonic distortion but will tolerate fairly high levels of harmonic "natural" distortion. This is why we are happy listening to a valve amp with 2 or 3% harmonic distortion and hate transistor amps with far lower inharmonic levels. There is another type of error that is simply born of the fact that you are sampling. Sample rates can be increased, but bit resolution can't be; it is effectively fixed at around 6.02 dB per bit.
Now remembering that a dB is a ratio and not a fixed quantity, this means you sound engineers need to maximize your record level so that it fully accommodates the actual dynamic range of the music. So if we look at a particular time slice and how the signal has changed from the previous time slice, the difference needs to be greater than the dB-per-step figure:

20 x log10( |V_current - V_previous| / V_previous ) >= dB per step for your bit depth

e.g. for 16 bit it needs to be greater than 0.0014699 dB. This was really the point I was trying to make. Look forward to your take on this.
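The commenter's table can be reproduced with a few lines of Python. This is a sketch of their formula as reconstructed above, not anything from the video:

```python
import math

def table_row(n):
    """One row of the table above. Total dynamic range is taken as
    20 * log10(2**n - 1), then divided per step and per bit."""
    steps = 2 ** n
    total_db = 20 * math.log10(steps - 1)  # n=1 gives log10(1) = 0
    return steps, total_db / steps, total_db / n

# The dB-per-bit column converges on 20 * log10(2), about 6.0206
steps, db_per_step, db_per_bit = table_row(16)
```

Running this for n = 16 reproduces the 65,536 / 0.0014699 / 6.0206 row.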
@MF Nickster Thanks Nickster. I have to say I am enjoying this debate. Remember, I'm coming from a position of almost complete ignorance re sound recording practices. I come from a scientific background, and later data analysis in retail (the last 20+ years!). I have been interested in hi-fi since the late 1970s. I agree with you that there should be a sensible point where the level of resolution becomes nonsensical. My view was, I suppose, centred around live recordings like orchestras. I have heard a very expensive £50,000 violin played by the No. 2 violinist from the LSO and was struck by all the harmonics I was hearing. All of these are very fine details. If you had a whole orchestra, this becomes even more complicated. And then there is stereo content, phase/group delays and the rest... I did read somewhere that we humans are sensitive to phase angles of 30 degrees or more, and that this suggests that whilst we can't hear above 20K, we are still sensitive to information contained there. I have yet to find an official position on this, though. I got my first CD player, a Philips CD303, in 1984 and was struck by the clarity and bass coherence. In 1995 I listened to an LP again on a Linn LP12 and was struck by the depth, feeling, ambience, space and naturalness. Despite the detail and dynamics of the CD, something was missing as well. This is really where I am coming from. I have started listening to high-def audio using my Christmas present, a Cambridge Audio Azur 851N. I have listened to CD 16bit/44.1, 24bit/96K, 24bit/192K and DSD64. To my ears CD was the worst; I felt DSD64 sounded the most natural, with 24bit/192K not far behind. Obviously 50% of the comparisons were done without being certain that they came from the same master sources. My feeling is we still don't know enough about how we hear and all the subtleties of the analogue world we use in building the sound model in our minds.
@MF Nickster Actually I totally agree with you. The problem is not in the technical way we do things, nor in our technical/scientific way of encoding data. Data is king here; information rules. The cutting edge of physics seems to be saying the very essence of reality is information (see th-cam.com/video/XxVlGAFX7vA/w-d-xo.html - this is good!). The problem is how we actually hear, i.e. the biological processes involved in how we assemble auditory information into a coherent sound in our brains. This is why I keep harking back to the degree of resolution and having more than we currently think we need. This, I believe, is where the issues are arising, not just in the digital world but also in analogue. Back in the early 1980s, there were a lot of medium- to high-end Japanese amplifiers (from different manufacturers) that boasted ridiculously low distortion figures of 0.000005%, and yet they sounded awful! The reason was that they used high degrees of negative feedback in esoterically named circuits. This killed phase information and smeared the information in the time domain. It turns out we are very sensitive to this type of distortion. When we hear, we hear with more than our ears. We use low-frequency vibration in our bodies; we are aware of very low-level ambience - so much so that they limit the time people work in anechoic chambers, because near-total silence can have a psychological effect (www.theguardian.com/lifeandstyle/2012/may/18/experience-quietest-place-on-earth). You say you don't have a scientific background (yet!). What are you studying?
But if all these considerations are true, there shouldn't be any reason for normal projects ever to be recorded at a higher "resolution" than 48kHz/16bit. So technically our hearing should actually be perfectly satisfied with a recording at 44.1kHz and about 10 bit, when compared to the noise and quality of a vinyl record. Now let's go down memory lane and remember those now-iconic samplers like the Emax and Fairlight, which produced that grungy 8-bit sound, which in theory should match vinyl quality if those samplers had been able to sample at 44.1kHz. Still, if you turn your bit depth down to 8 bit with a 44kHz sample rate, you still get a more metallic sound, simply because the waveforms get too distorted when your level drops or rises too much between samples, even at a 44kHz sample rate. The problem is really a combination of both sample rate and bit depth, as the different standards used over the years raised the bar on both. Now don't rip my head off if I don't have the exact standards in memory, but typically the old standard of, e.g., the Emax was IIRC 11kHz/8bit (maybe 22kHz was available too). Moving from 8 to 12 bit, with memory chips slowly becoming larger and less expensive, we went through typically 16, 22, 24, 32 and 36kHz, and the gritty sounds began to go away; but it's only when we hit 44.1kHz/16bit with "CD quality" that we reached a usable standard where even people with perfect hearing, which in general extends up to about 16kHz, can't hear any apparent degradation. (Only very few young teenagers are actually able to hear anything over 18kHz, and our ears are so cleverly made that they automatically lower their sensitivity to very high-frequency sounds from 16 to 20kHz so they don't get damaged, unless you're standing very near a sound system where the amp is overdriven, producing excessive high-pitched noise.)
What I think this points out, though, is that especially for musicians who sample sounds from old digital synths and samplers, there really is no point in using your sampler's max resolution of 48/96kHz/24bit for these kinds of sounds. Even though memory usage isn't a big concern anymore, a standard 44kHz/16bit sample will be more than sufficient to capture the details of these devices. This video surely reminded me not to go all crazy on all these new high-end audio interfaces; unless you need to do some really special things, setting your interface to "just" 48kHz/24bit, which everything can run at today, is probably more than enough, and you won't be thinking you "need" a Thunderbolt interface to carry 16+ channels at 192kHz. Most USB 2 devices will be "good enough" for most home studios.
@@RaveyDavey Most DAWs are able to do internal handling at up to 32 bit, but you still need plugins that support it. It's highly debatable, though, whether it makes sense to use the much higher processing power needed to do so. But there is absolutely no point in a home studio tracking (input) at higher than 44.1 or 48k/16bit; I bet no one can hear the difference if you go 24 bit or 96/192k. You are just moving 2nd, 3rd etc. order distortion around. You might get slightly better results, depending on the material you're tracking, by going to a higher sample rate, and you might make it worse. Our ears don't have the ability to detect the dynamic range provided by higher than 16-bit resolution. Theoretically, 44kHz/12bit is enough to cover the human ear's dynamic range.
The new MixPre-3/6/10 II and Zoom F6 offer 32-bit float recording that is "impossible" to destroy; you can even pull a signal back down to 0 dB after overloading the 0 mark by tens of dB. Have you tested this option?
We talk about 32-bit float a bit in this video. It is still possible to destroy if you clip the converters on the way in, or if you export a lower bit depth file that is clipping. It offers no increased dynamic range over 24 bit in capture or playback, but you can do more demented processing to the audio without *internal* hard clipping. Hope that helps!
@@SonicScoop OK - I just heard you state there were no recorders on sale (at that time) that could make use of the 32-bit float files. They might be overkill in terms of bit depth, but I guess more will become available. And people are thrilled so far..
32 float is good WHILE recording; just learning that. A few areas nip that 0 or above, with 4 effects on. First song I'm doing on a DAW, first actual non-rushed song ever: 5 nights, 8 hours per night in. At the start I read up on settings (my PC is very solid for music; it's all it does). So yeah, I can run 32 float, and did. 32 at 94? A few parts sneak into the plus EQ area, but no cracks, nothing. So while recording and mixing, yeah. But as far as mixdown, it depends on where you're going, I think. If I'm going to a master, I'm sending out high. With dumb jam track machines in the past I had one at 16 and one at 24. There was something just a tad more tingly in the 24, no BS. I didn't think there would be, because I'd heard the arguments, but I noticed it. Like more room, tiny frequencies the person that doesn't record won't notice. It was like a fluffier sandwich that wasn't pressed down. Only way to explain it.
Maybe someone here can answer this question. I get that bit depth directly translates into a certain dB range; I think it's 6 dB per bit. I don't understand how that range is divided. What is the smallest possible difference between two amplitude values? The video says that "resolution" is not applicable, but it seems like there would need to be discrete levels or steps to changes in volume.
That's a great question. Intuitively, you would think it would work that way. I did too! But that's not exactly it. Just like with sample rate, you have infinite variability, but within a limited predetermined range. Think of it this way: with a 44.1K sampling rate you can reproduce a sine wave at 1000 Hz or 1001 Hz or anywhere in between those two values. There aren't steps in between those frequency options that are missing. The limitation is simply that you cannot reproduce frequencies above 22k. It is similar with bit depth. You can reproduce -1dBFS or -2dBFS or anywhere in between those two values. The limitation is that you cannot reproduce amplitude levels below -96dB. This seems counterintuitive at first. But recognize that in order to produce a sine wave at 1kHz at -1dBFS, you need that sine wave to pass through the zero crossing, which is infinitely far below 0 dBFS, tens of thousands of times per second. And that is where the resolution loss occurs: at the bottom of the dynamic range, registering as noise. (And fairly ugly-sounding noise too, before it is dithered and made random, like white noise.) That is my understanding of the theory of it. But even if that were wrong, the reality of how it functions in practice can be confirmed by your own tests. Go ahead and load up an 8-bit audio file and dither it. You'd expect it to sound weird and distorted and "crushed" in some way. But it isn't. It doesn't sound like 8-bit video game music or something, which is what people are usually expecting. It's literally just noisy. This shocked me the first time I properly tried it too. Hope that helps make sense of it! -Justin
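For anyone who wants to try Justin's 8-bit experiment numerically, here is a rough sketch (my own illustration, not code from the video) that quantizes a sine wave with and without TPDF dither:

```python
import math, random

def quantize(x, bits, dither=False):
    """Quantize x (in -1..1) onto a grid of 2**bits levels.
    TPDF dither adds +/-1 LSB triangular-distributed noise first."""
    step = 2.0 / (2 ** bits)
    if dither:
        x += (random.random() - random.random()) * step
    return round(x / step) * step

random.seed(1)
sine = [0.5 * math.sin(2 * math.pi * 440 * i / 48000) for i in range(48000)]

def rms(xs):
    return math.sqrt(sum(v * v for v in xs) / len(xs))

err_plain = rms([s - quantize(s, 8) for s in sine])
err_dith = rms([s - quantize(s, 8, dither=True) for s in sine])
# Dither raises the error RMS slightly (it adds noise), but the error
# is now random rather than correlated with the signal: distortion
# becomes hiss, which matches what Justin describes hearing.
```

Both error figures sit down near one 8-bit step; the dithered version is a touch noisier but no longer signal-dependent.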
I've been a recording engineer for 30 years and this is the first time I can understand what this is all about. Thanks you so much!!!!
shame
This is probably the clearest and most complete discussion of bit depth I've seen. Also read your article Justin - thank you!!!
Also note that floating point cannot be properly dithered, and 32 bit float is a 24 bit file with an 8 bit multiplier... www.thewelltemperedcomputer.com/Lib/floatingdither.pdf
Thanks Justin! The idea of time vs. intensity really helped me separate the two dimensions and get it straight in my head. Really appreciate your explanation here, huge help!!
Thank you for everything you've done and please don't hesitate to discuss more complex topics like this in the future ! 😍😍
This video by far is the best on bit depth. The dither and 32FP topics were the best I have heard explained clearly.
At 11:44: "....you have that much more room to basically screw up...."
SO TRUE !
I came up on 1 mic., 1 mono tape deck, 1 basement, in 1961.
"Screwing up" in those days cost you BIGtime !
Experience is a great teacher !
Most of the point of 32 bit floating point in general is that it lets us go over 0dbFS. There are any number of reasons we might do that at some point in a signal chain. We'll turn it down or squash it down or round it off somewhere down the line, but right at this point we'd rather just let it go over. 32 bit files are mostly for when we want to "print" a signal that might go over 0dbFS. One common example is when printing a mix. You've mixed it to where you like it, then put some pseudo-mastering chain on it, then adjusted the mix some to make it maybe respond better to a later mastering step. Now you want to print that mix, but without the mastering plugins, but when you bypass or remove them, your mix peaks above 0. You could turn down the master fader, or just render it at 32 bit FP and move on. We might argue that proper gain staging would have avoided the issue altogether, but in the 32 bit world, gain staging is arbitrary, and it really doesn't matter what we did as long as it sounds good in the end.
You might want to add that in tracking, not every 32-bit recording is necessarily floating point: 32-bit fixed-point PCM still clips if you hit 0dBFS. Also, unless the recording hardware itself produces a floating-point input that preserves signals over 0dBFS properly, you can still clip while tracking.
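The distinction these two comments draw can be shown with a toy example (illustrative only; as noted above, real converters can still clip in the analogue/ADC stage regardless of file format):

```python
def to_int16(x):
    """Fixed-point path: anything past full scale hard-clips."""
    return max(-32768, min(32767, int(round(x * 32767))))

over = 2.0               # a peak about 6 dB over full scale
float_path = over * 0.5  # float keeps the over; pull the fader down later
fixed_path = (to_int16(over) / 32767) * 0.5  # clipped before the fader
# float_path recovers the full peak (1.0); fixed_path is stuck at 0.5,
# because the fixed-point sample was already destroyed at capture.
```

Turning the level down after the fact rescues the float version but not the fixed-point one, which is the whole appeal of 32-bit float "prints".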
Justin, I made the decision to up my home studio skills this year and seek out good instruction on recording. I came across your TH-cam channel. I wanted to tell you that I have thoroughly enjoyed hearing all your insights and expertise. You have a wonderful way of teaching and it has helped me so much to understand the basics and beyond. Really appreciate you and your channel :)
Awesome, so great to hear! Thanks for being here.
This is by far the best video explanation of everything related to bit depth, with a little history added to help you understand even more. Also thank you for being so unbiased to the whole thing. Its rare to find videos and people like you out there lol.
Liked and subscribed (:
I loled pretty hard at your explanation of how loud 24-bit dynamic range would be and how it could kill you... nice podcast.
please do Dithering as a topic next!
One of the most underrated podcasts when it comes to audio!
This is awesome, thanks!
I think we can accurately discuss the "data" as having a higher or lower resolution, but the term becomes problematic when applied to the perceived audio. Higher resolution data ≠ higher resolution audio.
Higher resolution data = lower noise floor.
Ironically, audio data/bit-depth is a great example for understanding what "data resolution" actually means. It's text-book. But the term "resolution" should definitely not be applied to the sonic result of that data (two very different domains).
Great video!
So glad I found your channel. Please keep bring us the best information on TH-cam! Thanks for giving us your time and knowledge
So glad you found it too Jeffrey! I hope you’ll join us for some more videos :-)
-Justin
Greatest teacher for the nerds! Such a great job as always
I think that i'm lucky to understand what you just explained right here because its high level content ! Keep it up and thank you !
You've perfectly articulated why I use 32 bit only for recording and processing, and save everything down to 16 bit (although I'd prefer to do everything in 24 bit just to guarantee there was no way to find the noise floor; I'll explain in a second).
I run an internet radio station. I prepare files for airplay by centering audio files, setting peaks to 100% (actually I set at -0.1 db to avoid clipping when converting back to 16 bit), and trimming the opening and closing. Often I create edits and early fades to match the original promo edits of the songs I play.
I sincerely thought that when I was editing in 32 bit I was in a kind of "fantasy" bit depth where it's impossible to clip and there's no noise floor. I've never had a peak clip, even at absolutely insane, extreme settings and levels. It makes me feel like I can't make a level error that can't be corrected by re-adjusting the peak back down (or up) to 100%. That made it perfect for editing, because no matter what I apply, I'm not adding noise or distortion.
The other benefit of 32 bit is that with my station, I have broadcast-style multi-band audio processing working overtime to recreate that big, pumpy, 80s FM radio sound. (OK, I'm not quite as aggressive as the 80s stations were with things like composite clipping... but my internet station is loud, which isn't supposed to be a thing.) One of the things I make sure the system does is pull up fades as much as possible to create that "wall of sound" effect they had where the music never stopped.
When you have a song slowly fading out and another slowly fading in, you may be pulling the volume up 40 to 60 dB. Suddenly that 96 dB noise floor is at 36 dB. Doing all I can to avoid adding any noise in that sort of situation is a good thing.
...Now, once I've got everything cleaned up and the file is set at -0.1 db, I convert back to 16 bit. :-) As you said, NOBODY (not even with compression) is going to hear the 96 db noise floor... especially when most of these recordings are from analog studio tape that probably had a 60 db S/N ratio anyway.
Your explanation was outstanding, and clarified so much about bit depth that is so often presented across the internet in very muddy terms. Thanks for your efforts!!
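The fade-gain arithmetic in the comment above can be written out explicitly. This uses the theoretical 20*log10(2**bits) floor; as discussed elsewhere in the thread, real-world floors are set by the analogue chain and are usually much higher:

```python
import math

def quantization_floor_dbfs(bits):
    """Theoretical quantization noise floor relative to full scale."""
    return -20 * math.log10(2 ** bits)

gain_db = 60                                        # fade pulled up 60 dB
floor_16_after = quantization_floor_dbfs(16) + gain_db  # ~ -36 dBFS
floor_24_after = quantization_floor_dbfs(24) + gain_db  # ~ -84 dBFS
# A -96 dBFS floor pulled up 60 dB lands around -36 dBFS, plausibly
# audible; a 24-bit floor under the same gain stays far below hearing.
```

This is exactly the "suddenly that 96 dB noise floor is at 36 dB" point: heavy upward gain is the one place the extra headroom of a deeper capture format pays off.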
Clear and informative - thanks for the video!
Good visual description of bit depth at 19:30
Great explanation. In the studio, we typically capture in 24 bit. However, in the field, we often capture at 32-bit FP, since with 32-bit FP there is zero chance of anyone or any unanticipated event driving the signal into distortion, or into the noise floor. With some types of mic'ing situations in the studio, we will capture at 32 bit as well, since dynamics as a function of loudness grow geometrically with decreasing source distance, where a 40 dB dynamic range at 1 meter can easily grow 30-35 dB higher at 1 cm. Just an FYI: Pro Tools, whether in the DAW or in Carbon or MTRX hardware, ALWAYS truncates at around 22 bits, regardless of settings. So unless you want that nasty truncation-related distortion that can ruin a recording, do NOT assume you have the ability to use even a full 24 bits of gain staging in Pro Tools, let alone 32. AVID has admitted this, and numerous TH-cam videos, Reddit discussions and public discourse have made note of this.
Regarding dither, adding it smooths out the fade of the audio when it gets near the lowest decibel in loudness so that the fade sounds natural, more like it does in analogue. Without dither, those last few seconds of a fade will sound brittle and choppy, the closer one gets to complete silence.
That would be true, if you were actually able to hear a noise floor that is 96dB below your peak level.
Since you can’t, trained listeners aren’t able to distinguish between dithered and undithered 16 bit audio in proper blind listening tests.
-Justin
I always learn something new watching your channel- thanks!
The best explanation of bit depth I've ever come across. Thank you.
Nice. Love the in depth explanation. Thank you!
Thanks. Great to hear. Glad to be useful.
It is a pleasure to listen to your explanations. Thank you, Justin.
Thanks man I really appreciate this video keep them coming.
Kind of mind blowing, sound really is just time and intensity. Our ears have just evolved to take note of the quick pressure changes caused by the creatures and objects moving around and vibrating the air when they do, like ripples in the ocean.
And you can also just think of harmonies as polyrhythms
100%. Very insightful! I have a whole article on some of these ideas that you might enjoy:
sonicscoop.com/beyond-the-basics-harmonic-motion-and-the-root-of-all-music/
-Justin
@@SonicScoop Thanks! I'll check it out!
@@SonicScoop I loved the article, however there appears to be some missing links to some of the images
Alternatively, humans were designed...
Many Thanks Justin !! I've learnt a lot Today !! Think I've finally understood what goes on !! You're A Great Teacher !!
I think you have not mentioned that some recorders use a dual A/D converter to overcome the 120db dynamic range limitation of the analog circuits when recording in 32 bit float. Each of the A/D converters covers a different range than the other. That way it is possible to achieve a true widest dynamic range.
In field (location sound) recording sometimes it is invaluable to have room to be able to recover from extreme situations. A good example would be a shoot and run documentary. In some occasions being able to get a whisper back to an intelligible level without touching the noise floor is crucial. The same goes when you can recover a very loud source without clipping.
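A heavily simplified sketch of the dual-converter idea described above (hypothetical function names and numbers; real recorders calibrate and crossfade between the two paths in hardware):

```python
def combine_dual_adc(padded_sample, sensitive_sample, pad_db=30.0, clip=0.99):
    """Merge two ADC readings of the same signal into one float sample.

    sensitive_sample: high-gain converter, best SNR, clips on loud sources
    padded_sample:    same signal attenuated by pad_db, never clips
    """
    if abs(sensitive_sample) < clip:
        return sensitive_sample                  # quiet enough: clean path
    return padded_sample * 10 ** (pad_db / 20.0)  # rebuild the loud peak
```

With a 30 dB pad, a padded reading of 0.05 reconstructs a peak near 1.58, i.e. well over the single converter's full scale, which is how these units claim a combined dynamic range wider than either converter alone.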
Thanks!
This is the way
Thank you, sir. Now I'm confident recording at 16 bit, since that's the only thing I have.
I stopped worrying about bit depth when I listened to a pop song encoded in both 16 and 8 bits side by side and couldn't hear a difference. I'm glad DAWs use floating point though, so I don't have to worry about clipping.
it sounds like working in 32-bit is the audio equivalent of using RAW files, HDR (beyond what can be displayed), or super high resolutions/quality in visual art. Not perceptible or maybe even transmittable in the final product, but keeps you safe for any crazy nonsense you want to do until that final mixdown/render. Thanks for the explanation!
Great explanation, especially explaining dither!
This and mixbustv are quickly becoming my favorite music production channels. Thanks for the info man!
Justin, you're awesome!
No, you!
Wow! you explained it like a knife slicing through butter! You're awesome, Thank you!
Thank you for this. This dispelled all the audio myths I got fed. 24bit at 44.1 or 48kHz is enough for everyday use.
24bit? or did you mean 16?
@@prodigalus 24bit. I still prefer that to 16bit. On playback it makes a difference. Also, the cost of using 24bit over 16bit is next to nothing.
Currently rewatching.
I love your clarity.
Dither *is* noise that is added on purpose to reduce the distortion caused by requantizing to a lower bit depth.
So I guess the 64-bit clock in my interface isn't all that necessary! But I just wanted to thank you for the info on dithering; there just isn't enough out there, and some producers like Bob Katz say they wish they had known more about dithering before they started. Free education has always been my favourite, thanks!
I have a question about an analogy I read: pouring 16 millilitres into a 32-millilitre bottle. That means the other 16 millilitres of space is just silent waste, so when you truncate back down to 16 bit, does it try to include the wasted air and get squashed together with the 16 millilitres into the same 16-bit size? Or is that empty space discarded, thus making 32 bit perfectly harmless for up-conversion? VirtualDub2 doesn't have a 24-bit option and I'm dealing with WMA SRS audio, so I'm concerned. It was only recently that AviDemux 2.8.0 got WMA Lossless, and I'm on 32-bit Windows so I can't use it. Encoding Spirited Away for my XT2041DL via VP9+Opus.
But, in my humble opinion: noise floor isn't the only sonic effect of bit depth. Bit depth, being how finely the signal is sliced dynamically, can make a difference in audio sounding more or less grainy. Analog, regardless of the dynamic range, is a smooth wave. Digital is a stair-stepped wave, which is easier to hear at lower bit depths.
Sonically, this used to translate into dirtier or more “chattery” sounding audio in older digital machines, especially as the signal got quieter.
Having said that....I record at 48k/24bit, and mix at that for vinyl, or 44.1k/16bit for everything else.
That seems to be good enough for just about any situation, to my ears. :)
This is very good stuff. 👊
wow. what can i say. i am no professional, but listening to this entire video, i am completely satisfied at the answers you've patiently explained. thank you so much. i truly get it now, and i can't wait to see what else i can learn from your videos!!!
but... with my Rhode NT1 and Yamaha MG10XU mixer, what is the best way to eliminate as much noise as possible? i have a Mogami Gold mic cord and laptop/mixer and all plugged into a surge suppressor that filters RFI. the mixer outputs 24bit/192kHz audio via USB, but is it comparable to/inferior to an audio interface around the $200 price point? (concerning noise floor.)
This is just THE video I was looking for. Thanks!!!!!
Great stuff Justin 🙂
Awesome video, just as valuable as the sample rate one!
I'm only here because I've been using 48kHz forever with a 256 buffer size. However, I want to start streaming myself producing instrumentals live using OBS, and I don't want higher input delay. Because of that, I'm considering dropping my sample rate to 44100 and also my buffer size down to something much smaller, but I don't want it to have a huge impact on the quality of my recordings/uploads. Would it be a significant difference going from 48kHz down to 44.1kHz, as well as dropping the buffer size from 256 down to maybe 96, or possibly even 32? I can't believe how many arguments online there are about this and everyone seems to claim a million different things! Zero clarity on this subject!
(NO PUN INTENDED)
You said that you're not telling us not to dither. But you also said that you can only hear it if it's turned up to an unrealistic level. So there's no point in doing it really? But still do it lol. Thanks though. Great info! I listened to this a couple years ago, but it finally really makes sense now.
Yes dither your 32 bit floating point audio when converting to 16 or 24 bit because of rounding errors. It's all about the math precision.
Yeah, 24 bit has a theoretical 144 dB. The very best, most expensive AD/DA converters (Prism, Lynx, Apogee, Dangerous, etc.) only give you around 120 dB, if that. So the technology to utilize the theoretical 24 bits (144 dB) doesn't exist yet as of the year 2020.
Truth! The analog part of the converters is the limitation, not the digital part.
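The arithmetic behind those figures is simple enough to sanity-check: an ideal n-bit quantizer gives roughly 6.02 dB of signal-to-noise per bit, plus about 1.76 dB for a full-scale sine. A quick sketch:

```python
import math

def theoretical_snr_db(bits):
    # SNR of an ideal quantizer for a full-scale sine:
    # 20*log10(2**bits) is ~6.02 dB per bit; 10*log10(1.5) adds ~1.76 dB
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

# 16-bit: ~98 dB; 24-bit: ~146 dB, well past the ~120 dB that the best
# converters' analog stages actually achieve.
```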
So are you saying that as a producer I shouldn't stress this topic so much and I could work at 44kHz and set my buffer size at 16 in FL studio?
When you talk about 8 bit, 16 bit, 24 bit etc. that is the same thing as setting the "BUFFER SIZE" on my Scarlett Focusrite 2i2 correct?
My takeaway: 32-bit audio files are good for digital processing, so you can work with your sound files with the most precision and the least potential for conversion artifacts when sending your signal from one audio processor to another. However, 32-bit is useless for the final files, because no clear benefit results from converting a 16-bit audio file to 32-bit; but files that were processed as 32-bit before final output are more precise in the end.
Focusrite is the bomb-diggity. LS56 owner here.
Waves plugins do use 32-bit floating-point processing; the L1 maximizer, for example, also has dither and noise-shaping options, and you can hear it very well. Most material we get for mixing is 24-bit/48k; we rarely see 32-bit or rates higher than 48k, like 192k. Maybe in Atmos you use those figures, because you have a lot more channels going on? And the master is also rendered to 24-bit/48k and sent to the mastering engineer.
So glad you mentioned 32 bit float and clip gain in pro tools . I was amazed at how much a difference that makes and honestly sold me on the 32bit float for processing. Thanks Justin! All the best
So it would be like taking pictures with a camera using RAW vs JPEG, where RAW gives you more room for error correction but it gets finalized as a JPEG to be consumed afterwards, just like recording at 24 bit and then converting it to 16 bit for consumption? Sharing a RAW file with people would be overkill unless it's for them to work with in a program.

Also, I'd like to think of the kbps for MP3 as the difference between a very noisy JPEG image vs one that has very little noise, but at some point the noise will be too low to notice anyway unless one really pushes to look for it. So 320kbps is probably about as clean as it gets for most people's listening equipment. I've been getting FLACs from Bandcamp albums that I've purchased, but it's really hard to tell the difference, if there really is any. Mainly I just have the storage space for it, and a portable player that can play them back, has a headphone jack so I can use it without Bluetooth if I want, and supposedly has a nice DAC in it.

The reason I looked up this video is that there are 24 bit recordings of Linkin Park, but I know they're still going to have constant loudness because of the loudness war, so what is the point of even paying more for such recordings when the sound is still pushed to the threshold just below clipping? At least that's how I imagine those CDs were recorded. I don't think Linkin Park has any quiet sound, it's just full throttle xd.
Really helpful and useful, thank you!
32:10 now this _is_ true. The effective resolution of 16-bit recorders really wasn't that good. When the 24-bit capable recorders like the Korg D1600 came out the 16-bit mode was full depth and the issue became moot.
Great info! Found it hard to understand at times though as you whispered some of your words??
Justin, would you be able to tell me what kind of desk that is you have behind you? the design looks like something right up my alley and I've been looking for a suitable studio desk for years. Thanks.
Brilliant throughout: no-nonsense, no off-topic jargon. And Focusrite is the best. Saffire Pro 24 for 2.5 years, not a single issue; bumping up to the Liquid series. idc what anyone says, FireWire is proper real-time data transfer. Great explanation. I opt for the highest bit depth offered simply for dynamics and, as you said with dithering, smoother snap. It's nice to have a literal definition put to it; search a million write-ups and it's all basically useless when you're looking for a concise, to-the-point layout. 24/192 is heaven. I have a unique way of doing things... Thanks for the efforts put forth!! Cheers!
Those are pretty dated. Focusrite doesn't support or make drivers for those anymore, especially if you aren't on Windows 10 yet. I've moved away from Focusrite because they have a tendency to abandon driver support. MOTU still supports and makes drivers for all of their interfaces, even if your interface is 10 or 15+ years old.
GREAT SHOW !!! WOW ! THANK YOU !
#TLDR
Bit depth is for the RESOLUTION OF AUDIO DYNAMIC RANGE/AMPLITUDE INCREMENT.
#ComparisonToCamera
It's not comparable to visual resolution in terms of pixels,
but more comparable to visual BRIGHTNESS/LUMINOSITY DYNAMIC RANGE.
Higher dynamic range in camera makes you able to capture the difference in the shades of dark and bright.
It might not be so important for most people/consumers, but it's pretty useful for people who want to do post-production.
in Audio: Music producer, mixing engineer, mastering engineer.
In Visual: Photo and/or video editor, Retoucher.
Just like in visual, to be able to see more shades of brightness, you need a camera with the specs to capture it in the first place, and a display monitor to reproduce it.
In audio, it means mic and speaker/headphone.
16 bit is enough to cover everything you need.
Higher bit is useful if you want to have more control and flexibility during post processing.
#ComparisonToPhotoshop
- when you crank up the contrast or brightness using curve or level adjustment layer to the extreme.
Lower bit depth: you get the parts in big chunks.
Higher bit depth: you get the parts in smaller chunks.
#ComparisonToNumerical
- Translate it back to audio, it's in terms of audio level.
Lower bit depth: you get 0.01 accuracy.
Higher bit depth: you get 0.0001 accuracy.
#ExampleAudioToVisualComparison
1.A. Lower bit depth in audio
> RANGE: silent ---> someone screaming 1 meter from you.
> A whisper captured from 1 meter away will blend into the noise
1.B. Lower dynamic range in visual
> RANGE: Black screen ----> white paper.
> You can't capture words written on paper in a low-light room. Even if you increase the brightness/ISO, it will just increase the noise.
> You can't capture the iris of an eye or the clouds in a backlit photo with a bright sky: you can't get the cloud details even if you darken the whole sky, and you can't get the iris details even if you brighten the eye.
2.A. Higher bit depth in audio
> RANGE: silent ---> jet engine 1 meter from you.
> A whisper captured from 1 meter away will still be heard above the noise
2.B. Higher dynamic range in visual
> RANGE: Black Screen in a dark room ----> sun
> You can capture words written on paper in a low-light room. If you increase the brightness/ISO, it will gradually reveal the words, until you push it further and noise shows up.
> You can capture the iris of an eye and the clouds in a backlit photo with a bright sky: you can get even more cloud detail if you darken the whole sky, and you can get the iris detail if you brighten the eye.
Imagine all of that visual example by focusing only on the dynamic range, not the pixel resolution.
(because in a real-world scenario, a camera with higher dynamic range will most likely have higher pixel resolution and a different set of lenses.)
With that said, imagine all of those examples are from a 24MP camera with the same fixed 50mm lens and the same f-stop, so that the only variable left is the sensor's dynamic range capability.
I don’t make records, but am very into listening to them. I’ve come to you via DAC processing per Chord QUTEST and wanting to understand what bit depth and sample rate mean. Your presentation helped a lot. Thank you!
Thank you very much for this video. So working at 44.1 kHz / 16-bit is enough and the best compromise (quality/CPU resources)?
My computer is quite old so it suits me, and no need for dithering!
Good and interesting overview.
Great job! Thank you, very informative!
Thanks for such a nice explanation! I had a question on an exam where I missed some points. Could anyone please explain it to me?
The question is: the signal frequency is 20-20000 Hz, the sampling frequency is 45000 Hz, and the bit depth is 8-bit, all coded with PCM. What can you tell about the signal quality from the above?
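Not the instructor, but the exam answer can be reasoned out from two rules discussed in this thread: the Nyquist criterion for sample rate, and roughly 6.02 dB of dynamic range per bit. A back-of-envelope sketch:

```python
import math

signal_max_hz = 20_000   # top of the stated 20 Hz - 20 kHz band
sample_rate_hz = 45_000
bit_depth = 8

# Nyquist: sampling at 45 kHz captures content up to 22.5 kHz,
# so the full audio band fits with a little filter headroom.
nyquist_ok = sample_rate_hz / 2 >= signal_max_hz

# ~6.02 dB per bit: 8-bit PCM gives only ~48 dB of dynamic range,
# so frequency response is fine but the noise floor is clearly audible.
dynamic_range_db = 20 * math.log10(2 ** bit_depth)
```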
The only real reason to use 32-bit intermediate audio files is exactly the same as using it internally within the DAW: it’s possible in your processing or mixing to do something that pushes the intermediate output over 0db, in which case you could end up writing an intermediate file that contains clipping, which is then very hard or impossible to fix. I learned this the hard way once…
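That failure mode is easy to demonstrate with nothing but struct packing (the sample value below is hypothetical): a float-file sample above full scale survives the round trip, while 16-bit integer storage has no choice but to clamp it.

```python
import struct

hot_sample = 1.7  # an intermediate bounce that went "over" full scale

# 32-bit float storage: the overshoot is preserved and can simply be
# turned down later in the chain
restored = struct.unpack('<f', struct.pack('<f', hot_sample))[0]

# 16-bit integer storage: the value must be clamped, i.e. permanent clipping
i16 = max(-32768, min(32767, int(hot_sample * 32767)))
clipped = i16 / 32767
```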
Could you explain how a bit crusher works and how it relates to this? Thanks!
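Not the channel, but a bit crusher is essentially deliberate, undithered requantization: snap every sample to a coarse grid and keep the quantization error as the effect. A minimal sketch:

```python
def bitcrush(samples, bits):
    """Snap samples in [-1, 1] to 2**bits levels with no dither;
    the resulting quantization distortion IS the 'crushed' sound."""
    levels = 2 ** (bits - 1)
    return [round(s * levels) / levels for s in samples]

crushed = bitcrush([0.05, 0.12, -0.33], 4)  # 4-bit: only 16 levels
```

It ties straight back to the video's point: without dither, quantization error is distortion correlated with the signal, and a crusher just makes that error huge on purpose.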
I've just received a project to mix from a client, 60 tracks at 16bit. I was going to ask her to resend it at 24bit, but after watching this, it would seem that I should be OK. Thanks for the info.
It should be, sure. It's still a best practice to always get the full resolution files, and if they are available, you may as well request and work from them, but it's REALLY not going to make or break the project when they are not available.
@@SonicScoop Thanks for the response, it's really appreciated. I've started the mix for her already using the 16-bit files. There seems to be more hiss than usual, but I'm not sure if that's how they've been tracked, or if it's to do with the noise floor being higher, as you've mentioned. I've had to use Waves X-Noise on a couple of them. I think she was using a novice producer/home studio for tracking.
If the original files are 24 bit, then get the 24 bit files because bit reduction should be the final step.
@@380stroker This.
Do I need to use dither on individual stems as well if I render those to 24bit wav files for further editing (my project is 24bit and I use Reaper which processes at 64bit internally)?
Great video
I'm so happy I found this. I'm new to the recording world, working with interfaces and a DAW, and I have a stupid question to ask that you might help me with: what would be the best settings, 44.1 or 48 kHz, at 16-bit or 24-bit? Thank you
48khz at 16 bit. Anything more than that is diminishing returns and no one would ever notice it (No one has ever said: “Wow, this video sounds like 16 bit”).
I think the only exception to go higher is if you work on professional music videos, theatrical movies, etc that will be played on 5.1 and 7.1 speakers/headphones
Very insightful, thank you!
Good stuff.
Remember that compression brings up the noise floor.
I am not sure I agree, though I skipped a lot because 30 min is way too long for a simple answer. Things I picked out: white noise literally doesn't have much to do with bit depth; it's more about microphones, and headphones vs. monitor headphones, speakers vs. monitors.

My understanding is you record at 24 or 32 bit. Yes, there is wasted computer code and memory space, but recording at this bit depth allows you to save clipped audio and convert bits later, able to properly recover the audio and master it undamaged. I believe this video glosses over this and just says it's a waste of space, or skipped the end results, but it is way too dragged out. 16 bit in the end is okay, if you want to delete the unused data space. But if you start at 16 bit you are doing it wrong: if you clip, the file is destroyed. You could think you fixed it with, say, normalize, but all your waves will have a flat-top haircut, permanently destroyed. Always record initially at 24 minimum, but I love 32. The audio software bragging about 64 is definitely overkill, but 32 is where you want to be.

You can simply test me: record too loud at 32, save, then save the same at 16, and then use both against each other as test subjects and try to fix the audio. You will find the 16 bit permanently destroyed, while the 32 was destroyed and can be made crystal clear. I have ADHD and I honestly think the title of this video is very ironic.
If you encode the same exact audio recording in 24 vs 16 bit, is there more quantization error in 24 bit vs 16 bit?
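Short answer: the opposite. The worst-case error of a rounding quantizer is half a step, and the steps are 256 times finer at 24-bit. A sketch of that arithmetic (normalized to full scale = 1.0):

```python
def max_quantization_error(bits):
    # Worst-case rounding error is half of one quantizer step
    return 0.5 / 2 ** (bits - 1)

# Each extra bit halves the step, so 24-bit error is 2**8 = 256 times
# smaller than 16-bit error (before either is masked by analog noise).
ratio = max_quantization_error(16) / max_quantization_error(24)
```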
Great video!
Here’s where I get lost with bit depth and sample rate. I grew up on the Atari and Nintendo consoles. That old 8-bit sound isn’t an issue of noise floor or dynamic range. Why is it an issue of resolution at the lower rates, but dynamic range and noise at 24?
For sample rate, if you took it to extremes and went to 1 sample per second, I can see it limiting frequency response, but wouldn’t it also be losing a lot of information? Like, 24 frames per second in film looks smooth, but if you took it down to 12, 6 or 2 frames per second, you’d lose a lot of information and the movement would look odd at best. What’s the difference between frames per second and audio samples per second?
I get the confusion there. I had it too at first. But what you're really hearing with old school Atari and Nintendo is primitive 8 bit synthesis. Not 8 bit audio recordings. Totally different ballgame! 8 bit digital recordings, when properly dithered, just sound like regular recordings, but with a fairly high noise floor.
Try to run this experiment for yourself! Convert a file down to 8 bit (and dither it). Assuming you keep the sample rate the same, it probably won't sound nearly as bad as you expect it to sound. It'll just sound fairly noisy.
If you went down to 1 sample per second, you wouldn't really hear anything, as that would only be capable of playing a sound of 0.5 Hz, well below the range of human hearing.
But if you used a sample rate of say 8k, you'd only hear frequencies up to around 4k. This was actually a sample rate used in some early digital telephone systems I believe. Works OK enough for speech. (By the analog standards of the day at least.) But it's way too dark for music. That would sound even worse than AM radio, which already sounds pretty bad!
Hope that helps.
The 8-bit sound you got from an old Nintendo console is because they didn't use a proper anti-aliasing filter and no dithering. So if they used modern gear to record that 8-bit sound, it would actually sound pretty good.
Great that SS is back online. Do you still do comparison vids with gear??
Well and good for a setting where you get to predesign and adapt your sound design to your specific capture goals, but what of the cases when you need screaming, clanging, uncontrolled environments, or you mean to capture live Foley that cracks that noise ceiling? It seems to me like the bias of being an expert in sound makes the issues caused by limited un-clipped range feel less and less of an issue. As you say, "you get to be sloppier," and yes, but that's critically important headroom for those of us without all the appropriate equipment and training to capture whatever we're going to need to capture (and, presumably, to know and prepare in advance not to clip when one of our actors goes into a primordial tantrum, screaming out the pain of small-town community-theater life into an ill-suited-to-the-job dynamic mic they left at the table, across the room from the condensers on the interface).
So I have a 24-bit master. Should I dither on the bounce for digital distribution? Do digital stores already do that process when they receive it? Or would you say that dithering isn't necessary anymore and that it's all a matter of taste?
Is it optimised for digital distro with a -1.0 dB peak level?
You are appreciated
@nicksterj
You have mentioned multiple times that there is no way to hear the difference in bit depth, and that you are skeptical that I could hear it, so I have provided academic sources that specifically cover this issue. For starters, in this paper:
"Theiss and M. O. J. Hawksford, “Phantom Source Perception in 24 Bit @ 96 kHz Digital Audio,” 103rd Convention of the Audio Engineering Society (1997 Sep.), convention paper 4561."
they tested varying bit depths and sample frequencies, and they found a discrimination rate of 94.1% when comparing 96kHz/24-bit to 48kHz/16-bit, while a discrimination rate of only 64.9% was reported for 96kHz/16-bit vs 48kHz/16-bit. This implies that both sample rate and bit depth are indeed discernible, and that bit depth plays a larger role than sample rate in human perception.
I'm having trouble finding this study. Can you link to it?
Was it double blind? Under anything approximating normal listening conditions?
Based on the 94.1% discrimination rate, I'm going to guess the answer to that is a resounding "no" :-)
Likely scenario is that either it wasn't double blind, or there was some type of trickery being done that does not remotely mimic normal listening conditions. Probably both!
I recall a paper submitted to AES where they took the fade out tail of an extremely quiet moment in a piece of music, jacked up the level by like 70dB and then had people listen for differences in the noise floor.
Yes, THAT is possible, obviously! X-D But the part this leaves out is that if you played back the rest of the recording under these conditions, it would probably blow out your speakers and your eardrums :-)
Is this that "study"?
If so, it fails to demonstrate what it seeks to demonstrate, and confirms what we already know: That the bit depth only makes a difference if the noise floor it adds is loud enough to be heard.
That just isn't the case with normal program material mastered properly at 16 bit. Adding more bits above that demonstrably makes no difference at all. This is not really controversial.
I wish it were different too, but it doesn't appear to be.
I hope that helps make sense of it!
-Justin
Well, I can't say I agree with the entire discounting of resolution (repeated generations of 16-bit processing on 16-bit data definitely result in a "sheen" that subtly degrades presence). However, any advice is good advice if it takes people away from "you gotta record hot" (or worse, the utter bollocks that is "you gotta use all the bits").
Have you heard of the new Zoom F6? It records 32bit float files and has no gain setting for the preamps whatsoever. Would love to hear your take on it
I'm a computer engineer who makes sound cards. I guess MAYBE you can describe it as noise floor, but it's not accurate: the circuitry around the converter determines the noise floor. 24-bit allows you to record quieter sources and capture UP TO 144 dB of dynamic range, compared to 96 dB for 16-bit. If your mic has a lot of self-noise, then 24-bit won't help you, because the mic is determining your noise floor. If you have a cheap sound card like a Behringer UMC202/UMC204/UMC404, then the circuitry around the ADC/DAC is low quality and won't sound as good as a MOTU; cheap op-amps destroy the audio quality. The benefit of 32-bit audio is that it's in floating-point format, and it's trivial to convert from 24-bit to 32-bit float, so it's kind of a gimmick but not useless. The biggest benefit of 24-bit is that you can keep your mic gain lower, avoid using a Cloudlifter for that SM7B, and use digital gain. The way audio software works is that it bit-shifts the 16-bit or 24-bit audio samples so they're 32-bit (i.e., the MSB gets shifted to bit 32; bit shifting is a single-cycle instruction), so 32-bit is technically a little lighter on your CPU, but it won't be noticeable unless your CPU is peaking.
What about 16 bit vs 24 bit, would 24 bit be easier on the CPU than 16 bit?
@@swelarra No. Your computer works with CPU words, which are 64-bit. 16-bit audio only saves RAM space, not CPU.
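The bit shift mentioned above is easy to picture. Assuming two's-complement PCM samples, promoting a 16- or 24-bit value into a 32-bit container is just a left shift, and it is perfectly reversible:

```python
def promote_to_32bit(sample, bits):
    """Shift an n-bit signed PCM sample into the top of a 32-bit word.
    The waveform is unchanged; only the container grows."""
    return sample << (32 - bits)

def demote_from_32bit(word, bits):
    # Arithmetic right shift recovers the original sample exactly
    return word >> (32 - bits)

wide = promote_to_32bit(12345, 16)   # a 16-bit sample in a 32-bit word
```

This is why moving between 16/24-bit storage and a 32-bit processing path costs essentially nothing, as the commenter says.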
I needed this information; everything makes a lot more sense now. I'm stuck with 32 bit. I'm using an Edirol FA-101 on Windows 10 by using the Win 8 driver. I can change from 44.1 to the maximum on the dial, but it stays 32 bit. I've been using Mixcraft. I would have loved to be able to lower this; it's just not possible with the setup I'm using. I actually like the way everything sounds at 44.1, because the computer can do it. Back in the old days I could get 32 tracks at 24-bit/96k on Win XP; nowadays I get fewer tracks, but things sound great.
I'm using an old DAW and the manual suggested using dithering when mixing down to a lower format. Is that worth considering, or no problem?
You need to dither when converting down to a lower bit depth, yes
I had a Boss 16-track recorder and it was 24-bit/48k, but I couldn't figure out why, after about 6 tracks, it started sounding muddy and cloudy. It was after I got rid of it that I discovered that after so many tracks it reverted to 16 bit, so no science or explanation on earth can convince me bit depth doesn't matter
Maybe the sample rate went from 48 to 44.1; all I know is it was frustrating
Correct me if I am wrong, but are you not coding an analogue waveform (i.e. the superposition of all harmonics/acoustic reflections/phase etc.) into a digital representation of it using bit depth and sample rate, yes? Then bit definition (as well as sample rate) will determine how well you are able to rebuild that waveform. By not having sufficient steps (i.e. bits), are you not restricting the resolution of fine detail (this equates to harmonics and stereo spatial placement) in the rebuilt waveform? If you do a spectrum or Fourier analysis of the original waveform and the reconstructed waveform, do a comparison. Do this at different bit depths and see what you get. Look forward to your comment.
@MF Nickster Thank you, Nickster. It confirms what I suspected. What you are saying is the binary coding has been used to determine amplitude only. The more bits you have, the louder the amplitude you can take (like the difference between using metal tape and ferric tape in a cassette deck of old), i.e. you can have greater dynamic range between noise and clipping. However, you have a fixed minimum dynamic change, as defined by one bit. This is your quantization limit. Like I said before, real music contains a lot of information, some of which may involve voltage fluctuations smaller than defined by a bit. This is the minimum resolution; signals that fall inside this range will be rounded up (or down). The bottom line is, when you do the D-to-A, your waveform will be missing that info, re harmonic content, replaced with different harmonic content consistent with the statistical error correction used to "join the dots". As I said before, the missing content will hold information detailing the richness of the instrument(s), and in a stereo recording, positional placement, ambiance and a three-dimensional presence. You can increase the time resolution by increasing the sample rate, but not the amplitude resolution. This would explain why so many people think DSD recordings sound so much more real, where your sample rate is the most important variable.
@MF Nickster That video is brilliant. Thanks.
Thanks, Nickster.
I think I have finally got my head around this.
The key point here is when we talk about dB. This isn’t an absolute measurement like, say, a meter, kilometre or second. It is a logarithmic ratio requiring two numbers.
In the context of sound, it is essentially:
dB = 20 × log{Base10}(abs(Amplitude{max} / Amplitude{min}))
This is the standard definition of the dB for amplitude ratios (for power ratios the factor is 10 instead of 20).
When we look at an analogue signal, we can calculate the dynamic range in dB using the quietest part above the noise floor and the loudest part.
When you are looking to record sound, your medium needs to be able to accommodate this variation without losing detail into the noise floor or clipping. This was true when we used to record music on cassette decks, where the record level had to be adjusted to maximize the headroom available on the tape.
This would also be true of the digital medium where the dynamic range available will be defined by the number of bits available.
At this point, what you say about bit numbers is mostly true.
There seems to be a rule quoted by various documents saying each bit is approx. 6.02 dB. This number varies slightly with the number of bits being employed:
= 20 × Log{Base10}(2^n − 1) / n
where n is the number of bits. (For large n this converges on 20 × log10(2) ≈ 6.0206.)
I built this table based on this formula:
No. of Bits    No. of steps    dB per step    dB per bit
1 2 0.0000000 0.0000
2 4 2.3856063 4.7712
3 8 2.1127451 5.6340
4 16 1.4701141 5.8805
5 32 0.9321011 5.9654
6 64 0.5622939 5.9978
7 128 0.3287193 6.0109
8 256 0.1880110 6.0164
9 512 0.1057977 6.0187
10 1,024 0.0587866 6.0198
11 2,048 0.0323351 6.0202
12 4,096 0.0176380 6.0204
13 8,192 0.0095540 6.0205
14 16,384 0.0051445 6.0206
15 32,768 0.0027560 6.0206
16 65,536 0.0014699 6.0206
17 131,072 0.0007809 6.0206
18 262,144 0.0004134 6.0206
19 524,288 0.0002182 6.0206
20 1,048,576 0.0001148 6.0206
21 2,097,152 0.0000603 6.0206
22 4,194,304 0.0000316 6.0206
23 8,388,608 0.0000165 6.0206
24 16,777,216 0.0000086 6.0206
25 33,554,432 0.0000045 6.0206
26 67,108,864 0.0000023 6.0206
27 134,217,728 0.0000012 6.0206
28 268,435,456 0.0000006 6.0206
29 536,870,912 0.0000003 6.0206
30 1,073,741,824 0.0000002 6.0206
31 2,147,483,648 0.0000001 6.0206
32 4,294,967,296 0.0000000 6.0206
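For what it's worth, the right-hand column of the table above can be reproduced with a few lines; it converges on 20·log10(2) ≈ 6.0206 dB per bit as n grows:

```python
import math

def db_per_bit(n):
    # 20*log10 of the largest step count (2**n - 1), averaged over n bits
    steps = 2 ** n - 1
    return 20 * math.log10(steps) / n if steps > 1 else 0.0

# db_per_bit(2) ~= 4.7712 and db_per_bit(16) ~= 6.0206, matching the table
```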
Clearly, when we talk about bit resolution, we need to be careful about what we are saying.
Yes, the number of bits does set the available dynamic range, but the fact that you are talking about bits also limits the amplitude resolution you can code.
When I asked about the quantisation limitation, you kindly pointed out that excellent video describing dithering. After reading other documents, it is clear that this is a useful statistical "sleight of hand" where you overwhelm the anharmonic distortion (caused by the "stepping" from one time sample to the next) with a harmonic distortion.
We as humans are very, very sensitive to anharmonic distortion, but will tolerate fairly high levels of harmonic, "natural" distortion. This is why we are happy listening to a valve amp with 2 or 3% harmonic distortion and hate transistor amps with far lower distortion levels.
There is another type of error that is simply born out of the fact that you are sampling. Sample rates can be increased, but bit resolution can't be; it is effectively fixed at roughly 6.02 dB per bit.
Now, remembering that a dB is a ratio and not a fixed quantity, this means you sound engineers need to maximize your record level such that it accommodates the actual dynamic range of the music fully.
So if we look at a particular time slice and at how the signal has changed from the previous time slice, the difference needs to be greater than the dB per step:
20 × Log{Base 10}(Abs(Voltage{current sample} − Voltage{previous sample}) / Voltage{previous sample}) >= dB per step for your bit level, e.g. for 16-bit it needs to be bigger than 0.0014699 dB.
This was really the point I was trying to make.
Look forward to your take on this.
@MF Nickster Thanks, Nickster. I have to say I am enjoying this debate. Remember I'm coming from a position of almost complete ignorance re sound recording practices: I come from a scientific background, and later data analysis in retail (the last 20+ years!), and I have been interested in hi-fi since the late 1970's. I agree with you that there should be a sensible point where the level of resolution becomes nonsensical. My view was, I suppose, centered around live recordings like orchestras. I have heard a very expensive £50,000 violin played by the No. 2 violinist from the LSO and was struck by all the harmonics I was hearing. All these are very fine details. If you had a whole orchestra, then this becomes even more complicated. And then there is stereo content, phase/group delays and the rest... I did read somewhere that we as humans are sensitive to phase angles of 30 degrees or more, and that this suggests that whilst we can't hear above 20k, we are still sensitive to information contained there. I have yet to find an official position on this, though. I got my first CD player, a Philips CD303, in 1984 and was struck by the clarity and bass coherence. In 1995 I listened to an LP again, on a Linn LP12, and was struck by the depth, the feeling of ambiance, space and naturalness; despite the detail and dynamics of the CD, something was missing as well. This is really where I am coming from. I have started listening to high-def audio using my Christmas present, a Cambridge Audio Azur 851N. I have listened to CD 16-bit/44.1, 24-bit/96k, 24-bit/192k and DSD64. To my ears, CD was the worst; I felt the DSD64 sounded the most natural, with 24-bit/192k not far behind. Obviously 50% of the comparisons were done without being certain they were from the same master sources. My feeling is we still don't know enough about how we hear, and all the subtleties of real analogue that we use in building the sound model in our minds.
@MF Nickster Actually, I totally agree with you. The problem is not in the technical way we do things, nor in our technical/scientific way of encoding data. Data is king here; information rules. The cutting edge of physics seems to be saying the very essence of reality is information (see th-cam.com/video/XxVlGAFX7vA/w-d-xo.html, this is good!). The problem is how we actually hear, i.e. the biological processes involved in how we assemble auditory information into a coherent sound in our brains. This is why I keep harking back to the degree of resolution, and to having more than we currently think we need. This, I believe, is where the issues arise, not just in the digital world but also in analogue. Back in the early 1980s, there were a lot of medium-to-high-end Japanese amplifiers (from different manufacturers) that boasted ridiculously low distortion figures of 0.000005%, and yet they sounded awful! The reason was that they used high degrees of negative feedback in esoterically named circuits. This killed phase information and smeared the information in the time domain; it turns out we are very sensitive to this type of distortion. When we hear, we hear with more than our ears. We use low-frequency vibration in our bodies, and we are aware of very low-level ambiance, so much so that they limit the time people work in anechoic chambers, because near-total silence can have a psychological effect (www.theguardian.com/lifeandstyle/2012/may/18/experience-quietest-place-on-earth). You say you don't have a scientific background (yet!). What are you studying?
But if all these considerations are true, there shouldn't be any reason for normal projects to ever need to be recorded at a higher resolution than 48kHz/16-bit.
So technically our hearing should actually be perfectly satisfied with a recording at 44.1kHz and about 10-bit when compared to the noise and quality of a vinyl record. Now let's go down memory lane and remember those now-iconic samplers like the Emax and Fairlight, which produced that grungy 8-bit sound - which, in theory, should match vinyl quality if those samplers had been able to sample at 44.1kHz. Still, if you turn your bit depth down to 8-bit with a 44kHz sample rate, you still get a more metallic sound, simply because the waveforms get too distorted when the level drops or rises too much between samples, even at a 44kHz sample rate. The problem is really a combination of both sample rate and bit depth, as the different standards used over the years raised the bar on both. Now don't rip my head off if I don't have the exact standards from memory, but typically the old standard of e.g. the Emax was IIRC 11kHz/8-bit, and maybe 22kHz was available too. Moving from 8 to 12 bits, with memory chips slowly becoming larger and less expensive, we went through typically 16, 22, 24, 32 and 36kHz, and the gritty sounds began to go away. But it's only when we hit 44.1kHz/16-bit "CD quality" that we reached a usable standard where even people with perfect hearing - which tops out at about 16kHz in general - can't hear any apparent degradation. (Only very few young teenagers are actually able to hear anything over 18kHz, and our ears are so cleverly made that they automatically lower their sensitivity to very high-frequency sounds from 16 to 20kHz so they don't get damaged, unless you're standing very near a sound system where the amp is overdriven, producing excessive high-pitched noise.)
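That "metallic" quality of coarse bit depths is easy to see numerically. Here is a toy quantizer in plain Python (just an illustration of the arithmetic, not how any real sampler works): a quiet signal only spans a handful of amplitude steps at 8-bit, which is exactly where the grit comes from.

```python
import math

def quantize(x, bits):
    """Round a sample in [-1.0, 1.0] to the nearest of 2**bits levels."""
    levels = 2 ** (bits - 1)  # signed: half the levels per polarity
    return round(x * levels) / levels

# One cycle of a quiet (-40 dB-ish) sine, 32 samples:
quiet = [0.01 * math.sin(2 * math.pi * n / 32) for n in range(32)]
q8 = [quantize(s, 8) for s in quiet]
q16 = [quantize(s, 16) for s in quiet]

# At 8 bits the quiet sine collapses onto just a few steps (a crude
# square-ish wave -> gritty distortion); at 16 bits it keeps its shape.
print("distinct 8-bit values: ", len(set(q8)))
print("distinct 16-bit values:", len(set(q16)))
```

The same experiment with dither applied before rounding turns that correlated grit into plain noise, which is the usual counterargument to judging bit depth by undithered truncation.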
What I think this points out, though, is that especially for musicians who sample sounds from old digital synths and samplers, there really is no point in using your sampler's max resolution of 48/96kHz/24-bit for these kinds of sounds. Even memory usage isn't a big concern anymore; a standard 44kHz/16-bit sample will be more than sufficient for capturing the details of these devices.
This video surely reminded me not to go all crazy on all these new high-end audio interfaces. Unless you need to do some really special things, setting your interface to "just" 48kHz/24-bit - which everything is able to run at today - is probably more than enough, and you won't have to think about "needing" a Thunderbolt interface to carry 16+ channels at 192kHz; most USB 2 devices will be "good enough" for most home studios.
@@RaveyDavey Most DAWs are able to do internal handling at up to 32-bit, but you still need plugins that support it. It's highly debatable, though, whether it makes any sense to use the much higher processing power needed to do so. But there is absolutely no point in a home studio in tracking (input) at higher than 44.1 or 48k/16-bit - I bet no one can hear the difference if you go 24-bit or 96/192k. You are just moving 2nd, 3rd, etc. distortion around. You might get slightly better results by going to a higher sample rate, depending on the material you're tracking, and you might make it worse. Our ears don't have the ability to detect the dynamic range provided by higher than 16-bit resolution. Theoretically, 44kHz/12-bit is enough to cover the human ear's dynamic range.
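The "~6 dB per bit" figure behind these dynamic-range claims falls straight out of each extra bit doubling the number of amplitude levels: theoretical range = 20·log10(2^n) dB. A quick sketch of the arithmetic in plain Python:

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of n-bit linear PCM: 20*log10(2**n),
    which works out to about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 12, 16, 24):
    print(f"{bits:2d} bits -> {dynamic_range_db(bits):6.2f} dB")
# 12 bits gives ~72 dB, 16 bits ~96 dB, 24 bits ~144 dB - all theoretical
# ceilings, before converter noise, preamp noise and dither eat into them.
```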
Great video, Tnx
The new MixPre-3/6/10 II and Zoom F6 offer 32-bit float recording that is "impossible" to destroy - you can even pull the level back down if you overload the 0 dB mark by tens of dB. Have you tested this option?
We talk about 32-bit float a bit in this video. It is still possible to destroy if you clip the converters on the way in, or if you export a lower bit depth file that is clipping.
It offers no increased dynamic range over 24-bit in capture or playback, but you can do more demented processing to the audio without *internal* hard clipping.
Hope that helps!
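The distinction above can be sketched with a toy example in plain Python (floats standing in for a 32-bit float file, made-up sample values): float "overs" survive a later gain pull-down, fixed-point overs do not, and neither format rescues audio that already clipped at the converter.

```python
def to_int16(x):
    """Fixed-point storage: anything past full scale hard-clips."""
    return max(-32768, min(32767, int(round(x * 32767))))

# A take whose digital peaks exceed full scale (e.g. hot preamp gain):
hot = [1.9, -1.5, 0.5]

# 32-bit float file: just stores the numbers. Pull down 6 dB later
# and the waveform is intact - no information was lost.
recovered = [s * 0.5 for s in hot]

# 16-bit fixed point: the overs were flattened at capture, so pulling
# the gain down afterwards can't restore the original shape.
clipped = [to_int16(s) / 32767 for s in hot]

print(recovered)  # peaks back under full scale, undamaged
print(clipped)    # first two samples stuck at the rails
```

The caveat from the comment above still applies: if the analog-to-digital converter itself clipped, the float file faithfully stores an already-squared-off wave.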
@@SonicScoop OK - I just heard you state there were no recorders on sale (at that time) that could possibly make use of the 32-bit float files. They may be overkill in terms of bit depth, but I guess more will become available. And people are thrilled so far..
www.sounddevices.com/32-bit-float-files-explained/
32-bit float is good WHILE recording - just learning that. A few areas nip at that 0 mark or above, with 4 effects, on the first song I'm doing on a DAW - the first actual non-rushed song ever. Five nights, 8 hours per night in. At the start I read up on settings (PCs vary; mine is solid for music, it's all it does). So yeah, I can run 32-bit float, and did. 32 at 94? A few parts sneak into the plus EQ area - no cracks, nothing. So 32-bit float while recording and mixing, yes. But as far as the mixdown goes, it depends on where you're going, I think. If I'm going to a mastering engineer, I'm sending out high. With dumb jam-track machines in the past I had one at 16-bit and one at 24-bit, and there was something just a tad more tingly in the 24. No BS. I didn't think there would be, because I'd heard the arguments, but I noticed it: like more room, tiny frequencies that a person who doesn't record won't catch. It was like a fluffier sandwich that wasn't pressed down. Only way to explain it.
Maybe someone here can answer this question. I get that bit depth directly translates into a certain dB range - I think it's 6 dB per bit. I don't understand how that range is divided. What is the smallest possible difference between two amplitude values? The video says that resolution is not applicable, but it seems like there would need to be discrete levels or steps in changes of volume.
That’s a great question. Intuitively, you would think it would work that way. I did too! But that’s not exactly it.
Just like with sample rate, you have infinite variability, but within a limited predetermined range.
Think of it this way: with a 44.1k sampling rate you can reproduce a sine wave at 1000 Hz or 1001 Hz or anywhere in between those two values. There aren't steps in between those frequency options that are missing.
The limitation is simply that you cannot reproduce frequencies above 22k.
It is similar with bit depth. You can reproduce -1dbfs or -2dbfs or anywhere in between those two values.
The limitation is that you cannot reproduce amplitude levels below -96dB.
This seems counterintuitive at first. But recognize that in order to produce a sine wave at 1khz at -1dbfs, you need that sine wave to pass through the zero crossing - a level infinitely far below full scale - thousands of times per second.
And that is where the resolution loss occurs: at the bottom of the dynamic range, registering as noise. (And fairly ugly-sounding noise too, before it is dithered and made random, like white noise.)
That is my understanding of the theory of it. But even if that were wrong, the reality of how it functions in practice can be confirmed by your own tests.
Go ahead and load up an 8 bit audio file and dither it. You’d expect it to sound weird and distorted and “crushed” in some way. But it isn’t. It doesn’t sound like 8 bit video game music or something, which is what people are usually expecting. It’s literally just noisy. This shocked me the first time I properly tried it too.
Hope that helps make sense of it!
-Justin
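Justin's "try it yourself" experiment can also be sketched numerically with a toy quantizer in plain Python (an illustration of the principle, not any real DAW's dither): a signal smaller than one quantization step simply vanishes without dither, but with TPDF dither its value survives as the average of a noisy output. That is the sense in which dithered audio has "infinite" resolution below the last bit.

```python
import random

random.seed(42)

def quantize(x, bits, dither=False):
    """Round x (in -1..1) to one of 2**bits levels; optionally add
    TPDF (triangular) dither of roughly +/-1 LSB before rounding."""
    levels = 2 ** (bits - 1)
    d = (random.random() - random.random()) if dither else 0.0
    return round(x * levels + d) / levels

# A signal *below* one quantization step (1 LSB = 1/8 at 4 bits):
x = 0.03

# Without dither it vanishes: every sample rounds to zero.
assert quantize(x, 4) == 0.0

# With dither, each sample is noisy, but the average over many samples
# recovers the sub-LSB value: detail below the last bit, traded for noise.
avg = sum(quantize(x, 4, dither=True) for _ in range(100_000)) / 100_000
print(round(avg, 2))  # close to 0.03
```

This matches the listening test described above: properly dithered low-bit audio isn't "crushed" or video-game-sounding, it's just noisier.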
I have seen that recorders like Sound Devices record in 32-bit, and are really useful for not having so much noise in dialogue. What about this?