Ep 020: Unicode Code Points and UTF-8 Encoding

Intermation

41,244 views


Published on 24 Dec 2024

Comments

@_vehicle 3 years ago ⁺¹⁹
Aw man! You are awesome! I study computer science and I had to write a program that decodes signs to their Unicode representation... The script from my lecturer was so complicated and unclear... I spent a lot of hours trying to write this program, and suddenly I found your channel and understood most of the material in a little more than 20 minutes (including pauses). You are an awesome teacher!
@guchierrez 3 years ago ⁺²¹
thanks Heisenberg!
@kennyhuang7393 3 years ago ⁺²
I am leaving this comment since this helped me during my first year of college majoring in CS. Thank you.
@dubshelb 3 years ago ⁺⁸
This video finally helped me grasp this topic. You explained this far better than my computer science professor did :) thank you!
@vinodcs80 3 years ago ⁺²
I read many articles, but the concept of Unicode and UTF-8 only became clear after watching this video. Thank you very much, I appreciate it.
@mrx-qi8th 3 years ago ⁺¹
I'm so lucky I found this channel, thank u
@lorensims4846 3 years ago ⁺⁵
Using 7 bits for ASCII let them use the eighth bit for parity checking. They could put a 1 or 0 in that eighth bit to make the count of 1 bits always come out even (or odd, depending on the local standard). If a byte came through that did NOT have an even bit count, it was clearly in error.
It was a very useful feature back in the day.
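A minimal Python sketch of the parity scheme described in the comment above, assuming even parity with the parity bit stored in the top (eighth) position. The function names are illustrative only and do not come from the video.

```python
def add_even_parity(code: int) -> int:
    """Return an 8-bit value: the 7-bit ASCII code plus an even-parity bit on top."""
    if not 0 <= code < 128:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")
    parity_bit = ones % 2  # 1 only when the 7-bit code has an odd number of 1 bits
    return (parity_bit << 7) | code


def has_even_parity(byte: int) -> bool:
    """Check that the total number of 1 bits in the byte is even."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity(ord("A"))  # 'A' is 0x41 = 1000001, which has two 1 bits
corrupted = sent ^ 0b00000100     # simulate a single bit flipped in transit

print(f"{sent:08b}", has_even_parity(sent))            # 01000001 True
print(f"{corrupted:08b}", has_even_parity(corrupted))  # 01000101 False
```

A single flipped bit always changes the parity, so it is detected; two flipped bits in the same byte cancel out and go unnoticed.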
@trayfor 3 years ago ⁺¹
Thank You! This was extremely helpful. I struggled to find resources that explained it as proficiently and eloquently as you.
@ayys_habu 4 years ago ⁺⁴
I am confused regarding 11:55
Shouldn't it be 128 to 2175 (2047+128)?
@mikeyamaro9035 3 years ago
No. 2-byte UTF-8 only has 11 bits to store the Unicode code point. 2^11 = 2048.
That's the maximum number of code points it can represent.
Of those 2048 values, the first 128 are the ASCII characters, which already have a 1-byte encoding.
So (0 - 127) is for ASCII characters and (128 - 2047) for other characters.
On top of that, 2047+128 doesn't make sense because we would need more than 11 bits for that.
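The ranges discussed in this reply can be checked with a short Python sketch. The helper names `utf8_length` and `encode_two_bytes` are hypothetical, and the code assumes the standard UTF-8 layout in which a 2-byte sequence is `110xxxxx 10xxxxxx` (5 + 6 = 11 payload bits).

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses for a code point (shortest form)."""
    if code_point < 0x80:      # 0 - 127: 7 bits, one byte
        return 1
    if code_point < 0x800:     # 128 - 2047: 11 bits, two bytes
        return 2
    if code_point < 0x10000:   # 2048 - 65535: 16 bits, three bytes
        return 3
    return 4                   # up to 0x10FFFF: four bytes


def encode_two_bytes(code_point: int) -> bytes:
    """Pack an 11-bit code point into the 110xxxxx 10xxxxxx layout."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("2-byte UTF-8 covers only 128 - 2047")
    lead = 0b11000000 | (code_point >> 6)                # top 5 bits
    continuation = 0b10000000 | (code_point & 0b111111)  # low 6 bits
    return bytes([lead, continuation])


print(encode_two_bytes(0x80).hex())   # c280, the first 2-byte code point (128)
print(encode_two_bytes(0x7FF).hex())  # dfbf, the last 2-byte code point (2047)
```

The payload bits are the code point itself, not an offset from 128, which is why the range tops out at 2047 rather than 2175.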
@brod515 3 years ago ⁺¹
@mikeyamaro9035 I don't think you understood what he was asking. If a byte starts with 0, we are using 1-byte encoding, so we simply look at the next 7 bits, which give us values from 0 to 127. That's already taken care of, right? So if we meet a byte whose first three bits are 1 1 0, we are using 2-byte encoding, so we have 11 bits and can represent 2048 values. Since we already know the code point can't be from 0 to 127, the first value must be 128, and we have 2048 other values to represent. @Habu Ayush was asking whether the representable values should run from 128 to (127 + 2048). We start at 128 and we have 2048 values we can represent. I'm also wondering the same thing.
@MccZerk 3 years ago ⁺¹
I feel like he should have written 0 -> 2047, but normally, if the number can be represented in between 0 -> 127, they use 1-byte encoding to save sending two bytes.
@TayakornRakwetpakorn 3 years ago ⁺¹
I thought about this too when I watched the video. In theory, 2-byte UTF-8 should be able to store another 2^11 values. But in practice it doesn't, so 128 combinations are wasted there. I guess this is because when a UTF-8 decoder decodes bytes, it just converts bytes to code points. It doesn't remember how many bytes the code point was read from. This is clearer when converting code points to UTF-8. Say U+0041 ('A'), for example, needs 7 bits up to its most significant 1. So 1-byte UTF-8 (which has 7 bits) is enough. We could also use 2 bytes, 3 bytes, etc. for the same code point U+0041. But if we did that, a code point would have many UTF-8 representations, and somehow they decided that this is not good (why? still figuring this out), so we only have a 1-to-1 mapping from code point to UTF-8. For example, 2-byte 'A' in hex is C1 81, which is invalid UTF-8. onlineutf8tools.com/convert-hexadecimal-to-utf8 will give an error.
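The C1 81 example can be reproduced in Python, assuming the built-in `utf-8` codec as the decoder. One commonly cited reason for banning such "overlong" forms is security: if the same character has several byte spellings, a filter that checks for one spelling can be bypassed with another. The sketch below builds the overlong bytes by hand and shows a strict decoder rejecting them.

```python
code_point = ord("A")  # 0x41

# Force 'A' into the 2-byte layout 110xxxxx 10xxxxxx (an overlong encoding).
lead = 0b11000000 | (code_point >> 6)
continuation = 0b10000000 | (code_point & 0b111111)
overlong = bytes([lead, continuation])
print(overlong.hex())  # c181

try:
    overlong.decode("utf-8")
    accepted = True
except UnicodeDecodeError:
    accepted = False
print(accepted)  # False: the decoder refuses the overlong form

# The only valid encoding of 'A' is the shortest one, a single byte.
print("A".encode("utf-8").hex())  # 41
```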
@rubinanazir2662 3 years ago ⁺²
I am confused.
A represents 41???
9:18
@humza4848 3 years ago ⁺¹
dope vid bro
@aWorldview 3 years ago
Excellent presentation!
@syedtafhimshams5547 3 years ago ⁺¹
Thank You! This was extremely helpful.
@tymothylim6550 3 years ago ⁺²
Thank you very much for this video! It was very interesting and educational for me! :)
@fatimazohrabennai6949 3 years ago ⁺²
thank you so much
@yanchaoli551 4 years ago ⁺²
For 4-byte UTF-8 encoding, the total available length is 21 bits, which goes up to 0x1F FF FF. Why do the Unicode code points end at 0x10 FF FF? What happens to the rest of that top nibble?
@mikeyamaro9035 3 years ago
No Unicode characters exist past 0x10FFFF; the standard caps the code space there, which is also the highest value UTF-16 can reach.
This link shows the list of Unicode code point ranges (in hexadecimal) for different languages:
www.unicode.org/Public/UNIDATA/Blocks.txt
At the bottom you'll see that the last range ends at 0x10 FF FF.
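The gap between the 21-bit capacity and the actual limit can be illustrated in Python. This is a sketch under the assumption that Python's `chr` follows the Unicode code space, which it does: it accepts values up to 0x10FFFF and raises `ValueError` beyond that. The limit matches what a UTF-16 surrogate pair can address (0x10000 plus 2^20 values).

```python
MAX_CODE_POINT = 0x10FFFF

# Four-byte UTF-8 has 3 + 6 + 6 + 6 = 21 payload bits.
four_byte_capacity = (1 << 21) - 1
print(hex(four_byte_capacity))  # 0x1fffff

# The highest code point still encodes to four bytes.
print(chr(MAX_CODE_POINT).encode("utf-8").hex())  # f48fbfbf

# Anything above the cap is not a code point at all.
try:
    chr(MAX_CODE_POINT + 1)
    beyond_ok = True
except ValueError:
    beyond_ok = False
print(beyond_ok)  # False

# A UTF-16 surrogate pair carries 20 bits on top of the base 0x10000.
utf16_limit = 0x10000 + (1 << 20) - 1
print(hex(utf16_limit))  # 0x10ffff
```

So the unused part of the 4-byte range (0x110000 to 0x1FFFFF) is simply never valid in UTF-8.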
@ssangkal1069 3 years ago ⁺⁴
Wait, he is writing stuff flipped horizontally?!!?
Btw, great video. Thanks
@PHrushikesh 3 years ago ⁺²
Breaking Bad vibes 😉😉😉
@tiburciolapanak 3 years ago ⁺¹
Awesome
@salvadorestrella8 3 years ago ⁺¹
This is a great presentation! thank you. Let me know if you need help to sell that blue meth
