Video Compression Is Magical
- Published Feb 4, 2025
- I'm a big nerd about all things video, not just MAKING videos lol. Really hyped to talk about h.264, video encoding, mp4's and the magic behind how it all works
MAIN POST
sidbala.com/h-...
ADDITIONAL SOURCES
en.wikipedia.o...
mediaguide.son...
en.wikipedia.o...
• Why Snow and Confetti ...
Check out my Twitch, Twitter, Discord and more at t3.gg
S/O Ph4se0n3 for the awesome edit 🙏
video compression vs that shirt
now it makes sense, he knows how to make youtube servers really pay their rent
😂😂😂
It looks fine on 2k?
90% of the size of this video goes just into that shirt
😂😂😂😂😂
You chose a great shirt to have in a video about compression...
You forget Google developed its own video codec for YouTube. I'm being served this video in the VP9 codec.
They also use AV1 sometimes (A/B testing?)
How about WebM?
You can choose if you want to view YouTube videos in AV1 if there is a version available. YouTube chooses popular videos or videos that are 8K to encode in AV1 since it takes a lot of processing power. Also, WebM is the container format for VP9 and AV1.
@@jit-r5b webm is a container like mkv or mp4 that doesn't imply a single specific video codec. there can be VP8, VP9 or AV1 inside
@Ady-rt1yu videos under 1440p are processed with the worse codec, vp9, while videos over 1440p are processed with av1
Just to clarify, the reason RGB was chosen is that they are the primary colors of light for humans: they are the colors that the cones in our eyes are sensitive to (in other words, nature chose RGB, we merely discovered it). The CMYK palette was chosen for pigments because pigments are subtractive and therefore should be the complements of RGB (cyan is the complement of red, yellow for green and magenta for blue, and K is for black because combining colored pigments doesn't result in the best black). The primary colors we learn in elementary school, namely Red, Blue and Yellow, are for pigments NOT light, and are suboptimal holdovers from before we, as a species, understood RGB.
Yellow complements blue, and magenta complements green
Yellow is the brightest perceptual hue in the color wheel, while blue is the darkest. They're direct opposites
On the other hand, magenta doesn't exist in the physical world. It's something our brains make up when our red and blue eye cones fire, but the green cone (which is supposed to sit between them) doesn't. Magenta is literally our brains rationalizing the absence of green
They use rgb because monitors are black by default if not in use.
To be clear, objects that absorb green light but reflect both blue and red do exist in the physical world. What doesn't exist is a wavelength that is simultaneously blue and red. Meanwhile, if you study photography you will learn that our red cones are also slightly sensitive to some of the blue spectrum, producing one form of purple. The best cameras account for this.@@DubiousNachos
and to clarify additive vs subtractive: additive just means the colours are produced from emitting light and subtractive means the colours are produced from taking away light (white light hits pigment, pigment takes away every colour except for green, pigment appears green). Combining more additive colours = more light -> white, combining more subtractive colours = less light -> black.
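For anyone who wants to see that complement relationship as code, here's a tiny toy sketch (my own example with channels normalized to 0..1, not anything from the video or the article):

```python
# Toy sketch: additive RGB and subtractive CMY are complements,
# so converting between them is just "1 minus the channel".
def rgb_to_cmy(r, g, b):
    """All channels normalized to 0..1. More pigment = less reflected light."""
    return (1 - r, 1 - g, 1 - b)

def cmy_to_rgb(c, m, y):
    return (1 - c, 1 - m, 1 - y)

# Pure green light (0, 1, 0) corresponds to a pigment mix that absorbs
# everything except green: full cyan + full yellow, no magenta.
print(rgb_to_cmy(0.0, 1.0, 0.0))   # (1.0, 0.0, 1.0) -> cyan + yellow
```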
Except that it's even more complicated than that, because the M and L cones overlap a lot and aren't **quite** centered around red green and blue. For example, L cone cells actually have their peak around yellow, not red. However, our perception of colour actually relies more on the differences between different cone responses rather than their absolute values. So I'd say that the primary reason why monitors use RGB is that "Blue" is the wavelength that activates Short cone cells, while Red and Green are the wavelengths that maximize the differential between Medium and Long cone responses without activating the Short cones.
This is some of the best content I've ever seen on your channel. The topic is amazing, and the explanation is superb.
🫡
How unfortunate that he didn't explain it. He also made several mistakes:
11:40: "You ever see this ringing pattern in a video"? You're misusing the term "ringing" here. Ringing is the wobbly artifacts you pointed out in the laptop image. This is banding.
22:30: "You ever heard of hardware encoding and decoding? Usually, this is what it's referring to." No. Hardware encoding is just faster because it's in hardware, not because it's parallelized. Video codecs are a nightmare to parallelize, at least in the short term, because a lot of the information depends on each other, since the aim is to remove redundancy.
Important distinction that most people might misinterpret: you don't choose between additive and subtractive color models.
Pigments absorb wavelengths, using all colors absorbs all wavelengths and therefore is black.
Additive is the result of broadcasting or emitting. We picked RGB because those are the frequencies our eyes detect*. Every other color is the result of our brains doing math over groups of photoreceptors in your eye. So having a yellow pixel would be a waste as your eye doesn't see yellow, it sees a tickle of green and red and maths it to yellow.
Wavelengths, brains, and language is a trippy triangle.
*assuming no chroma deficiencies.
Best video explanation of this subject that I've run into. Great balance between technical terms and simplification. Thanks to Sid for the article and to you for elaborating and saving us the reading part. Cheers!
i know this is primarily a web dev channel but this is by far my favorite topic you’ve covered so far
Terrific content once again! Simply thank you for creating the best dev content on youtube covering the industry with excellent and thought-provoking insights. Hard to skip even one upload ❤
Fascinating. Thank you!
Things like h264 video encoding are why I'll always believe software optimization is as important as hardware. If we didn't have this level of compression, we would only just barely now in modern day be able to MAYBE download a high quality video off the internet in a "reasonable" time period.
Software optimization is significantly better than hardware scaling because it doesn't cause huge ecological damage
@@JasminUwU both are important though, so who cares
@@JasminUwU That goes for optimization and scaling in general, it's not inherent to the differences between hardware and software.
@@rawman44 If you can make more with less then we do care. Even as a company, if you can use the same hardware without having to pay for a brand new component, they'll prefer that option because it's more cost effective
You 100% got me with that 8 year old sub call out. I had to pause the video and go look at the original article because I didn't think you'd even been on youtube that long, only to come back, unpause and have you explain the joke :P
I did the same but instead scrolled down to this comment lmao
You nailed it, thanks for that great breakdown of that article (great article).
Also I think AV1 is interesting too, I'd love to see it in future videos
Once the video gets to a certain number of views, YouTube will re-encode it into AV1. (Currently for me it is encoded in VP9.) AV1 is a codec that is better than H264 and H265 and also does not require licensing costs for implementers. It is cool that an open standard is being adopted on a wide scale.
h264 is a good "starting point" for beginners, as while AV1/VP9 being royalty-free is very important, h264 has wider adoption. You're more likely to have hardware that can decode h264 out of the box.
If you really want to advocate for better video codecs, maybe we should discuss why DRM has prevented wider AV1 adoption, i.e. Netflix will use AV1 on some closed firmware devices (TVs), but you can't use it on a browser even if you have a supported hardware decoder...
@@TaylorBertie Yes, h264 is a good starting point and has wide adoption. Since the article is from 2016, it is ok that it didn't mention a codec that didn't exist yet. Issues like DRM are complicated.
@@TaylorBertie "why DRM has prevented wider AV1 adoption" Netflix is encrypting AV1 from 2021.
"even if you have supported hardware decoder" You can't audit user hardware from browser (imagine fingerprinting if this would be possible, VPN/different browser/different OS would still link to same hardware).
@@AXYZE Audit? No. But you can present different codecs when you start the stream and then the client picks which one it wants, presumably because it has the needed decoding hardware. This is, or rather was a problem because Netflix didn't even present AV1 as a codec option for browsers, due to DRM concerns. It's well documented that even to this day you can have issues with getting 4k Streams on browsers due to these concerns, though at least now it's not Netflix, but OS vendors that do it.
@@TaylorBertie AV1 has been DRM-protected since 2021. Why do you think there is or was any DRM issue? Encryption has nothing to do with the video codec, because it is not the codec that does the encrypting; the packager does the encrypting and then you get a CMAF file.
Netflix uses encrypted AV1 in CMAF and has done so for 3 years.
I love deep dives into nerdy topics, please do more vids like this
I'm not sure if people mentioned it, but yay it's so helpful to see conversion to kilograms at the end. I was clueless the whole video :D
Watching a H264 video in a video with H264
great video, wouldn't have found that article without you. good shit. i guess that's also why yt-dlp or ffmpeg have so much trouble cutting up video or downloading sections and you have to reencode shit all the time to get clips
Ffmpeg supports cutting without reencoding
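A minimal sketch of that, assuming ffmpeg is on your PATH (the filenames and timestamps are placeholders). With stream copy nothing is re-encoded, which is also why the cut can only start cleanly at a keyframe:

```python
import subprocess

# Cut a clip without re-encoding (stream copy). Since no frames are
# re-encoded, the start of the cut snaps to the nearest I-frame.
subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-ss", "00:01:00", "-to", "00:02:00",   # rough in/out points
    "-c", "copy",                           # copy the streams as-is
    "clip.mp4",
], check=True)
```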
The first time I discovered "motion compensation" was in GIF, where every next frame has only the pixels that changed from the previous frame.
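The same idea as a toy sketch (my own illustration, not how GIF literally stores it): keep only the pixels that changed since the previous frame.

```python
import numpy as np

def encode_delta(prev: np.ndarray, curr: np.ndarray):
    """Return (mask, values): where the frame changed, and the new values there."""
    mask = curr != prev                 # boolean per pixel
    return mask, curr[mask]

def decode_delta(prev: np.ndarray, mask: np.ndarray, values: np.ndarray):
    out = prev.copy()
    out[mask] = values                  # patch only the changed pixels
    return out

prev = np.zeros((4, 4), dtype=np.uint8)
curr = prev.copy()
curr[1, 2] = 255                        # one "moving" pixel
mask, values = encode_delta(prev, curr)
assert np.array_equal(decode_delta(prev, mask, values), curr)
print(values.size, "changed pixel(s) stored instead of", curr.size)
```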
The title of that article always stuck in my head for some reason. Now you should explain AV1.
I think I might know a bit too much about this. Was like, what's the big mind fuckery about the Nyquist sampling theorem. Even the Shannon proof of it isn't that tricky, although I will admit I couldn't produce it from memory or first principles anymore.
Stuff I think would be useful in a followup.
1. Realtime (ie. streaming) encoding vs non-realtime (ie. VOD) encoding.
2. CBR vs VBR, why it matters and who cares (obviously somewhat related to 1).
3. Single vs Multipass encoding (also related to 1).
4. Unreliable network transport and video (ie. RTP over UDP), and how data loss is handled by decoders.
Same: It's difficult to prove that Nyquist Shannon is precisely correct, but the takeaway (you need a couple of samples per oscillation) is (even if only in hindsight) obviously roughly correct, just on intuition.
@@carljosephyounger Technically, Shannon and Nyquist just give you an upper bound. You can start going way lower if you take advantage of sparsity; I believe JPEG and MPEG do this by dropping the lowest coefficients from their 8x8 DCT blocks.
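A rough sketch of that sparsity idea (my own toy example with an arbitrary threshold, not the actual JPEG/MPEG pipeline): transform an 8x8 block with a 2-D DCT, zero out the small coefficients, and invert. Most of the block survives with far fewer numbers.

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
block = rng.integers(100, 140, size=(8, 8)).astype(float)   # a low-contrast 8x8 block

coeffs = dctn(block, norm="ortho")       # frequency-domain representation
coeffs[np.abs(coeffs) < 5] = 0.0         # drop the "quiet" frequencies

approx = idctn(coeffs, norm="ortho")     # reconstruct from what's left
kept = np.count_nonzero(coeffs)
print(f"kept {kept}/64 coefficients, max error {np.abs(approx - block).max():.1f}")
```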
This is just so interesting. I literally just finished a personal project to create scrolling spectrograms from audio - very high-frequency change. FFmpeg is such a gift.
Compressing that stache would take a life time for me.
Great example of high frequency content !
The Fourier Transform is the backbone of modern communication technology (signals, images, videos) and it is amazing how people figured this out.
What's more amazing is that the people who invented it did it more than a century before computers existed. Imagine how much time they spent multiplying sines and cosines in the process. And they did it way before it had an actual useful application too
What a great article and a great video on top of it. I’d love to learn more about the technical side of video encoding and streaming.
This was so interesting. Always surprised at how Theo can translate difficult subjects into easy to understand videos. Even if the blog post was really good, Theo adds so much easy to understand context. Thanks!
I remember reading that article years ago. It was quite enlightening.
This was a pretty solid intro. I actually learned a fair bit of this 10-15 years ago as I dug into ripping videos apart and finding ways to re-represent video in other forms. My thumbnail pic is a still from the crane sequence in Casino Royale, but remapping the Cr chrominance values to Predator vision lol. It's one of my oldest random videos. Basically watch any movie in Predator vision lol.
The one thing I think the article skipped though was (not sure of the name) motion differential, which works in conjunction with the motion estimation and vectors.
Think of that tennis scene where only the ball moves and everything else is stationary. Even without motion vectors, you could compress the scene by making your big I-frame, and then every P-frame is just a difference map from the previous frame. The P-frame would have no detail (easy to compress) because 99% of the frame didn't change from one to the next, except for the parts of the frame where the ball is moving.
You can combine this with motion vectors. Start with the I-frame. At compression time, you analyze the motion and basically find that the macro block at position X moves to position Y. The encoder then builds an approximation of the next P-frame by warping the previous frame's contents as per those vectors. Basically, warp the previous frame to mimic the current frame as closely as possible. Then get the difference map of the rebuilt frame and the actual original frame, and the difference map is what you save. It's basically a how-to for approximating the frame based on the previous frame (vectors), plus an appendix to correct the little inconsistencies (difference map), because you'll never build the frame perfectly.
The decoder will restore the I-frame on playback, apply the motion vectors to reconstruct the next frame, then decode and apply the difference map to match the original frame as closely as possible. This works well on frames like a closeup of a face, where the face will always be moving in subtly different directions, but not changing by huge amounts once you account for the motion. You can retain a lot of fidelity with relatively little data.
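Here's a toy version of that warp-then-correct loop (my own sketch: a single whole-frame motion vector instead of per-macroblock vectors, and no lossy step):

```python
import numpy as np

def predict(prev: np.ndarray, dx: int, dy: int) -> np.ndarray:
    """Shift the previous frame by one motion vector (dx, dy)."""
    return np.roll(np.roll(prev, dy, axis=0), dx, axis=1)

def encode_p_frame(prev, curr, dx, dy):
    # Residual = actual frame minus the motion-compensated prediction.
    return curr.astype(int) - predict(prev, dx, dy).astype(int)

def decode_p_frame(prev, residual, dx, dy):
    return (predict(prev, dx, dy).astype(int) + residual).astype(np.uint8)

prev = np.zeros((8, 8), dtype=np.uint8)
prev[2, 2] = 200                         # the "tennis ball"
curr = np.roll(prev, 1, axis=1)          # ball moved one pixel to the right
res = encode_p_frame(prev, curr, dx=1, dy=0)
assert np.array_equal(decode_p_frame(prev, res, 1, 0), curr)
print("nonzero residual entries:", np.count_nonzero(res))   # 0 when the vector is perfect
```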
0:25 Not true, youtube has been phasing it out for years. It is mostly VP9 now, with AV1 rolling out more and more over time.
In fact this video is getting served to me in AV1 right now.
I love how the YT timeline shows massive scrubbing activity at 21:50... where he talks about scrubbing in a YT video
I can't wait for widespread AV1 support
Those of us who write code to manipulate images or video at a low level will often treat the data as having one more dimension - along which each colour sample is aligned. The part of image or video data representing an individual colour component of each pixel (in formats where that makes sense) is often termed a *colour plane*, and these planes are conceptually stacked in that last dimension.
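A quick sketch of that planar view (my own example; numpy, arbitrary 1080p dimensions):

```python
import numpy as np

# The same image as interleaved H x W x 3 samples vs. three stacked colour planes.
interleaved = np.zeros((1080, 1920, 3), dtype=np.uint8)   # R, G, B side by side per pixel
planar = np.moveaxis(interleaved, -1, 0)                  # shape (3, 1080, 1920)

r_plane, g_plane, b_plane = planar                        # each plane is 1080 x 1920
print(planar.shape, r_plane.shape)
```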
I'm not actually using h.264, but I'm guessing it was h.264 at some point so same same. Love this type of content btw. It's easy to forget that this stuff isn't common knowledge. It's really useful to understand even if you only consume content: it helps you understand what you are getting (why 4k doesn't always look good, for example), troubleshoot issues, see why some things are more expensive because of proprietary codecs, and why streaming video is damn hard, as you would know well I'm sure haha
My youtube is AV1 even though I can't use AV1... my GPU doesn't support it and my CPU sits at like 50%... I just use the h264ify extension to force it back to h264... so I'm actually watching in h264 lol
@@reubendeleon-ji6up perfect example of that troubleshooting I can't imagine how many people are using their cpu to heat the room because they didn't notice/understand the issue haha
YouTube has been using Google's open source codec called VP9 and now they're slowly moving to another open source and royalty-free codec called AV1.
great video and great article! You and Sid made it seem pretty simple, thanks!
I never thought I’d sit through this. Very insightful.
The "primary colors" are not red, blue and yellow.
The primary colors for additive mixing, which is a relatively new discovery, are red, green and blue. And the primary colors for subtractive mixing, which we learn about in elementary school, are the complementary colors of those, so Cyan, Magenta and Yellow.
so thankful for this, i’d always wanted to understand more about this specific kind of stuff
thanks for the weight units translation!
loved it, can you do h265 and av1 ?
For anyone who's interested, B-frames and P-frames are very similar to how we control lighting in theatres. We call it tracking, but essentially we only record what changes from the lights in the cue before.
Little nugget of info for anyone who cares :)
Yes and no. The motion compensation is exactly the same as tracking. But P and B frames also encode local new content separately. Something this video completely failed to convey.
so interesting 👀i learned the theory for most of these methods individually in uni, but seeing the results of combining them is still amazing
The primary colors ARE RGB and CMY, for anatomical reasons. Our eyes have 3 sets of color receptors: one that also serves day/night vision (Blue), one for perception (Green), and a second one that, since we are gatherer mammals, helps distinguish fruit from leaf (Red). Anatomically RGB are the primary colors, and that is why CMY, which are their counterparts, are also primary colors. One set is additive (light) and one is subtractive (print).
This error was inherited from historical limitations. Back then, producing dye was really hard. So the closest you could get to a full set was Red (given that Magenta is a strange, almost impossible color to produce, which is why purple was a royal color) and Blue (given that that was the name given to any blue-adjacent color). This palette is so bad as a subtractive primary palette that, if you use it as a reference, you need two types of red to cover a more complete spectrum (Vermilion and Crimson).
If you ever notice that when you rewind, your CPU starts to go crazy, it's a tell that your graphics card isn't doing the video encoding/decoding. This is a common issue on Linux with Nvidia.
13:13 it's not ordered by "changed the most", it's the Fourier representation. The data is approximated using waves, and by cutting off the high-order parts of the approximation (data from the edge) we lose detail without sacrificing the most significant shapes/colors -> it looks like a blurry version of the previous image.
you made this video at the right time lol... i am actually developing my custom video codec to make it work with CosmosOS and this video helps
Codec choice for video editors is really important. There are different classes of codecs which are better for different purposes. Video editors will transcode footage several times in their pipeline.
Transport/Acquisition codecs - As Theo said, the data you get off the camera isn't great for editing. Codecs like AVCHD are optimised to fit lots of data on portable media. It's computationally expensive to decode but uses minimal storage.
Intermediate codecs - These are large, but usually contain all the data, similar to how Theo described an 'all i-frame' video. They also usually have more colour data since the small differences are important when colour correcting or grading.
Delivery codecs - Lossy codecs which optimise for storage, transmission etc.
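A minimal sketch of one hop in that pipeline, assuming ffmpeg with the prores_ks encoder is installed (ProRes is just one common intermediate codec, and the filenames are placeholders):

```python
import subprocess

# Acquisition file (e.g. H.264/AVCHD off the camera) -> big, edit-friendly intermediate.
subprocess.run([
    "ffmpeg", "-i", "camera_clip.mp4",
    "-c:v", "prores_ks", "-profile:v", "3",  # ProRes HQ: large files, cheap to decode
    "-c:a", "copy",                          # leave the audio untouched
    "intermediate.mov",
], check=True)
```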
Hi, I started the H.266 article on English Wikipedia 😇 just wanna say your thumbnail is WAY underrating the concept - unintentionally I expect? At least in the video you give an example of 10,000:1 nearly. Ratios have been 1000:1 for a long time, since DivX used a hacked version of a cracked version of a partial version of H.264 … well, like any “codec”, we mean one implementation of that codec, in other words one encoder that’s compatible with every decoder including the reference one
Anyhow
Uncompressed video is INSANE. 270 megabits per second is boring old 8-bit SDTV, even after you include interlacing which is a 100-year-old hack that uses analog electronics to do 2:1 motion compression of video.
And it's totally reasonable to try to cram that sort of thing (640x480) into 270 KILObits per second now, once you include all the caveats like average bitrate with some nice buffering, maximum-length GOPs and of course a ton of processing and pre-processing - and doing that live on a modern laptop would even work. Back in 1996 the first digital MPEG-2 encoders occupied a shipping container that went to sports stadiums. I bet they had a power draw to match.
That explains why light mode is literally brighter. Good explanation.
My first time encountering Scalable Video Codecs was when shopping around for video conferencing systems back around 2013-ish and coming across the Vidyo conferencing products. Vidyo leveraged SVC video in a multipoint streaming format to adjust bitrate, spatial, and temporal aspects on the fly for the network it ran over, and the size it needed to be shown at on the end displays. I assume Zoom has adopted many of these same techniques now, but back then, looking at this stuff in comparison to dedicated hardware systems with Multipoint Control Units (MCU) adding crazy latency was a mind bender. And SVC was a big component of making it work for Vidyo.
en.wikipedia.org/wiki/Vidyo
YouTube does not use H264 anymore in most cases. That's why you see very high CPU usage on older devices while using YT, as the processors don't have hardware decoding for the newer codecs VP8 and VP9.
6:21 Interesting story. I used to work for a company with a 4D ride. The projected video used only still images for the entire 12-minute film. They were combined onto a set of 4 projectors using software that calculated the overlap and curve of the projected images on the screen. This way it could be completely lossless when replaying the video. You could interface with it and play the video frame by frame as well. Or literally swap a frame with anything you wanted, so long as the naming convention matched the other frames. You could also see exactly how many frames were in the video and replace any if they were somehow corrupted or you edited certain parts of the film. You just replace the stills for those frames. It was interesting.
I’ve never even seen the “RBY” colour model and I think Brittanica should be ashamed to have it. Well, no, I did hear of it 40 years ago when my art teacher told me to use those 3 “primary” paint colours, except they weren’t primary because he also told us (correctly) that mixing all 3 produced brown, so we needed black. Even at 8 years old I knew something smelled wrong. A few years later I told my dad who argued back at me about RGB and didn’t know why I was saying RBY. He was a trained TV technician and when he was in college, colour video was the new hotness, like the internet was for my generation. But neither of us knew yet the names “additive” and “subtractive” primary colours.
Fourier Transform is the greatest Algorithm.
I definitely want more video-nerdy videos!
This is SUPER interesting. Great video :)
Hey Theo, This was amazing. Thank you.
HEVC sucks because you need a licence.
"which is almost certainly the technology you are using to watch this video right now"
actually YouTube uses VP9 in most cases and sometimes AV1
(in fact for 1440p and above YouTube never uses h264; from what I know it only uses h264 as a fallback for compatibility and for small channels)
yes I know I'm being the "ummm aksualy🤓" guy right now but I just want things to be correct :D
here's a little thing: it's taught that it's red + green + yellow or whatever because that's how it works in paintings, that's how you mix things and it makes sense there
but screens base their colours on just human vision, and the pixels work differently to paints.
it doesn't work that way
It's such a cool explanation that we don't need to go to doom9
"almost certainly the technology that you're using to watch this video right now" You're not my mom, you can't tell me what to do! *watches in AV1+Opus out of spite *
Theo, I really like your videos. You choose interesting topics, you always reserve some space to praise people for their good work, you're competent and you have a very strange Mustache. Being a life-long hater of shaving aperati, I know what price you have to pay for that thing. So it's actually not the mustache that's strange, but the fact that you shave and still cannot eat soup in polite company. Why would you do that, not refraining from eating soup but preserving the nose curtain? You're not asking us to like and subscribe even though that's a psychological imperative, so it can't be show biz... So many questions... :)
red blue and yellow are not the primary colors of pigments/subtractive coloring. the actual 3 are yellow cyan and magenta. you get taught ryb in elementary school because it's close enough.
Green wasn’t chosen, RGB are just the primary colors of light while CMY are the primary colors of pigment.
As someone that does videography as a hobby, I'm so glad video compression exists...
Would have been really whack with my 6K camera that has 12 bits of colour depth at 50fps... So about 1.6GB/s.
So that same Blu-ray disc would be filled up in about 30 seconds. D:
I am imagining the “60fps is always better” people saying stuff like “I’d rather download a couple terabytes to watch the movie… why don’t I have that choice”
An average 120-minute flick, 1920 × 1080, 60 fps progressive, 4:4:4 RGB 8-bit color, uncompressed, discounting the audio track and other metadata, would be 2.444TiB.
At 4K, 12-bit colors it would almost fill the largest consumer HDD at 16.31TiB.
For 8K cinema, 16-bit colors: 41.71TiB.
But the real problem is not the storage space, its the bandwidth.
You'd need a medium capable of, respectively, 373.2MB/s, 2.265GB/s (!) and 12.74GB/s (!!!) sustained read speed.
That's why RAID 0 is still used in high-end editing stations and professional editors go to great lengths to use proxy media and only use the actual footage for rendering.
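Redoing the arithmetic for the 1080p case above (8-bit 4:4:4 RGB, 60 fps, 120 minutes), just to show where those numbers come from:

```python
# Uncompressed 1080p60, 8-bit RGB, 120-minute film.
width, height = 1920, 1080
bytes_per_pixel = 3            # 8 bits x 3 channels
fps, seconds = 60, 120 * 60

per_second = width * height * bytes_per_pixel * fps
total = per_second * seconds
print(f"{per_second / 1e6:.1f} MB/s sustained")        # ~373.2 MB/s
print(f"{total / 2**40:.3f} TiB for the whole film")   # ~2.444 TiB
```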
I think the comparison between PNG and MPEG is confusing reducing entropy with boring simple downscaling. Most (but not all) of the differences you’re seeing are just lower resolution.
10:40 does NOT represent that image. It’s just an example of one section of one image that probably isn’t that one. You can tell the explanation is skipping something here because it doesn’t mention “macroblocks” at the same time as quantisation. (Think of them as chunky big pixels, that can change size as areas of the screen change how detailed they are). The quantisation process happens multiple times across a single frame. It’s really easy to learn about this because it’s all the same principles as JPEG images, just with bigger numbers because computers got better between 1995 and 2005.
Seeing the “ringing” isn’t about being aggressively compressed - this is a known problem with early implementations of decoders - and shamefully, should have been solved by the time of the first H.264 decoders.
Fantastic video that was very informative and entertaining
Me, watching this to figure out why I get so many video decoding artifacts on my iPod Video:
i'm sure that I'm not understanding something but isn't it those regions of high detail that you want to preserve, isn't it the block colour areas where you can shave the most baggage with minimal loss to quality
I watched it in 144p and could still visualise all the issues.
@@kaiotellure The african american vernacular made me laugh, how is Brazil these days? :)
This was the reason for the push 10+ years ago to end analog TV broadcasts over the air. Analog TV signals are so inefficient because they couldn't be compressed. That's why there were so few channels. The digital ATSC standard uses the H.264 codec, and now offers HD, and freed up bandwidth for more over-the-air channels.
Except they were already compressed. YCbCr (or YUV) 4:2:0 *is* analog video compression, which was restored to RGB for display. They just couldn't be compressed further, and digital compression offered significant bandwidth benefits at the cost of dependability. A poor all-analog reception was still usable despite the quality loss, whereas digital decoders have a hard time producing anything but a garbled audio-visual experience.
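For reference, here's the digital version of that idea, 4:2:0 chroma subsampling, as a toy sketch (my own example): luma stays at full resolution, and each 2x2 block of pixels shares one Cb and one Cr sample, which alone halves the raw data.

```python
import numpy as np

h, w = 1080, 1920
y  = np.zeros((h, w), dtype=np.uint8)            # full-resolution luma
cb = np.zeros((h // 2, w // 2), dtype=np.uint8)  # quarter-resolution chroma
cr = np.zeros((h // 2, w // 2), dtype=np.uint8)

full_rgb_bytes = h * w * 3
subsampled_bytes = y.size + cb.size + cr.size
print(subsampled_bytes / full_rgb_bytes)          # 0.5 -> half the raw data before any codec
```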
Do you have any insight into automatically generating transcripts for video/audio for free? YouTube does it automatically but only under specific conditions. I feel like the technology should exist open source, not locked behind Office365 Word dictation, or some AI website that you have to pay for.
I thought it was appropriate to ask because this video just took a deep dive into how video compression works, so I imagine research on sound has been mastered and figured out too, to the point where we can get that sound converted into English text.
P.S. Thanks for bleeping the curse words.
@14:20 that's not what quantization is. At least, normally it's not done by discarding all high frequency details willy-nilly while retaining all exact precision of the lower frequency ones. Usually the levels are scaled down by some numbers (the quantization matrix) and then all numbers are stored. The quantization matrix makes *most* of the numbers end up as zero, except for those that are big enough to have a meaningful impact, and those will be small numbers now. Scaled back up it'll be something like 964 instead of the 987 it should be, but you won't notice that difference. Then, it stores it in an order where zeroes are stored very efficiently.
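A toy version of that divide-and-round step (made-up quantization matrix, not the real H.264 tables): big divisors for the high frequencies push most coefficients to zero, small divisors keep the low frequencies nearly exact.

```python
import numpy as np

coeffs = np.array([[987., 42., 9., 3.],
                   [ 31.,  8., 2., 1.],
                   [  6.,  2., 1., 0.],
                   [  2.,  1., 0., 0.]])
quant  = np.array([[ 16., 24.,  40.,  64.],
                   [ 24., 40.,  64.,  96.],
                   [ 40., 64.,  96., 128.],
                   [ 64., 96., 128., 160.]])

stored   = np.round(coeffs / quant)   # this is what gets entropy-coded (mostly zeros)
restored = stored * quant             # the decoder just multiplies back up
print(stored)
print(restored[0, 0])                 # 992.0 instead of 987 -> the "964 vs 987" idea above
```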
I'm pretty sure h264 is finally on its way out. YouTube has mostly dropped it (it remains for back compat), instead using VP9 for high res and AV1 for low res.
Netflix seems to mostly use a combination of av1 and h265.
I think TikTok is still mostly (or entirely?) on h264.
Apple's insistence on h265 (really expensive patents) kept h264 alive, but now that they support VP9 (at least in software) and more recently AV1 in hardware, more roadblocks are falling.
But doesn't everything you said apply to all mainstream video encoders? Isn't this how MPEG-2 (h262) does it? How is h264 different?
Subscribed to Theo mentioned, let's go! 😂
It still amazes me that a handheld device can decode all this in less than 1 Watt of power. Way less even, my phone uses 400mA (1.1W) including the screen while watching this video (according to AccuBattery app).
something I seem to miss in all modern video codecs, that used to be available in almost every one (I think; I might be misremembering here, or suffering from a Mandela effect), is an effort setting. Now there is just a quality slider, which makes sense of course, but I think with Xvid you could keep quality equal, or keep file size equal, and trade effort, and by effort I mean mostly time to encode. So if you had a video that required a lot of detail you could still fit it on a CD-ROM; encoding it would just take much longer if you wanted to preserve quality. (There was also a decoding-speed setting that could be changed, so a Pentium 2 could still decode a fullscreen 800x600 video at 30 fps, where if not taken care of it would be a partial slide presentation, and I mean partial as in only parts of the slides would get updated every second or so :)
That slider still exists -- in OBS Studio, for example, it's called "Encoder Preset" and has options like "Slow (Good Quality)" and "Fast (Low Quality)"; and it does the exact same thing that the slider you mention from Xvid did -- it controls how much effort is spent to try to figure out the best way to encode motion vectors between frames, and when I-frames should be used. When mastering a video file, you can really crank it up because there's no deadline for producing output other than how patient you are to wait for your video file to render; but if you're trying to stream in realtime, you need to be encoding frames at least as fast as they're coming from the source; and so where you can put the preset slider depends on how beefy your hardware is. This is where hardware accelerated encoding is a huge deal.
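To make that concrete, here's roughly what the same knob looks like with ffmpeg's x264 presets (assuming ffmpeg with libx264 is installed; the filenames are placeholders). Same quality target, very different encode times and file sizes:

```python
import subprocess

# Encode the same source at the same quality target (CRF 20) with three
# different effort levels; slower presets search harder and compress better.
for preset in ("veryfast", "medium", "veryslow"):
    subprocess.run([
        "ffmpeg", "-i", "master.mov",
        "-c:v", "libx264", "-preset", preset, "-crf", "20",
        f"out_{preset}.mp4",
    ], check=True)
```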
Nice vid.
Now take into account LLM databases and combine that with on device NPU and voila one can go a lot further..
0:30 correction: youtube uses vp9 compression, which is slightly different
Is this why.. when I film heavy rain or snow fall.. and then send it to a friend on WhatsApp, they’re like what snow? What rain?
Yep. Small details are high frequency data which is discarded.
Dude am I going crazy or was your video of you talking edited like a million times? 😂 I was thinking about I-frames and was watching you speak and started to think I was seeing them. haha How long does that take you to edit? I hope you made a program to edit out all your non-speaking moments. All this time I thought you just talked that quick. Good video! Learned a lot.
Happy to use vp9 instead of this proprietary patent polluted h.264 nightmare.
The world will be switching to AV1 even for live video. Some may switch to h265 or VVC.
Thanks Theo, this was so interesting.
TIL 24bit color = 24 colors
TIL that when I encode FHD 60fps at the highest quality possible with AVC, it is already cut by more than ninety-nine percent from what it would be uncompressed.
What happened to the new codecs you were going to talk about near the end...
YouTube doesn't serve H.264 at all, wdym?
Funny thing. I'm watching this video on YouTube using the AV1 codec, not H.264. :)
Is there a part of this video which isn't literally you reading a webpage to us? I can read the webpage, I clicked the video for a multimedia presentation, not a lecture. I'll just click the link and skip subscribing to the channel if it's all like this.
Nope, skipped ahead, it's all him reading a webpage. Glad the content seems to be working for you, but I'd prefer if you made the content in the future.
26:55 thanks, bro. Some of us don't speak Imperial
yay... just a small mistake. Colors of subpixels are not about "brighter", it's about what colors you can display (e.g. maximum color palette or color standard)... absorbent and radiant colors "add" completely differently, and this is the only reason why we use RGB pixels. It also has a lot to do with how our eyes perceive colors (not brightness).
why RGB? RGB is specific to our eyes because our retinas contain red, green and blue cone cells that detect light at those wavelengths.
plus, we also have rod cells that detect light intensity. there are waaay more (20x) of them, which is why we are more sensitive to light intensity than color
7:30 You overestimated YouTube compression, the compression effect was clear as a day's blue sky xd Impossible not to see it
2:10 - Isn't the reason we use green and not yellow due to the color cones in our eyes? Yeah, that's additive colors, but it's more to do with how our eyes work and trying to trick our eyes.
Could be wrong here, but I think the cones are an important detail.
The company I work for specialises in encoding live video, in particular for stability over changeable/cellular connections. It's black magic stuff. Give me a message if you're interested in hearing about it!
OMG please show us pages with Dark Reader or something. My eyes are trashed from radiation burn
Why not divide frames into pixel blocks and decide on important parts based on some energy function, and spend your bits more on the important blocks?