Why I can type ±©♥🔥🂱Ʊ in this title

  • Published on 26 Nov 2024

Comments • 691

  • @EternalGoldenBraid 1 year ago +838

    Your take on the typical YouTube video is depressingly accurate

  • @OmegaGlops 1 year ago +505

    Has Unicode added a character for the "Cool S" yet? I feel like that's a crucial piece of historical iconography that definitely needs to be included!

    • @yesterdaydream 1 year ago +104

      Someday kids are gonna see that and be like, "wow I can't believe letters used to look like THAT."

    • @PhilEdwardsInc 1 year ago +242

      THIS NEEDS TO BE HEARD UNICODE!

    • @willowmaine2254 1 year ago +47

      You can write a unicode proposal for them to add it as an emoji. The process takes a while though, and it may be hard to argue against its transience ⌚

    • @ZolaniZweni 1 year ago +38

      The fact that we know what you mean is enough reason to add it

    • @danielqualls9321 1 year ago +7

      @philedwards I would love to see a video on the cool S. It would make a great follow-up to calculator games.

  • @tehbertl7926 1 year ago +472

    One of my favorite parts of Unicode and emoji is that there's a sort of "glue" character that can be used to alter or modify an already existing emoji. It allows people to specify a skin color for emojis of people, and the pride flag is basically "basic flag emoji + glue + rainbow". It's just such a neat trick that the Unicode people used to greatly increase the range of emoji that can be supported.

    • @qwertyTRiG 1 year ago +91

      ZWJ: zero-width joiner

    • @1leon000 1 year ago +45

      Yea, it's the ZWJ. So the pride flag is "flag + zwj + rainbow"

    • @JorWat25 1 year ago +54

      Similarly, there are no specific country flags saved in Unicode, just a set of letters. So if you want the US flag, you essentially enter flag+U+S. That way, they don't have to deal with the ever-changing list of countries.

    • @PhilEdwardsInc 1 year ago +113

      That is pretty cool. I also didn't get into control characters which are pretty neat too.

    • @qwertyTRiG 1 year ago +37

      @JorWat25 It's actually "Country indicator U" + "Country indicator S", two characters.
      It does mean that the apparent length of the text depends on whether the font supports the flag.
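
A minimal Python sketch of the sequences described in this thread (Python is chosen here only for illustration, and every variable and function name is hypothetical). The rainbow flag is the white flag, an emoji variation selector, the zero-width joiner, and the rainbow. Country flags are pairs of characters that Unicode calls regional indicator symbols. Skin tones work slightly differently from the flag: a modifier character follows the base emoji directly, with no joiner.

```python
ZWJ = "\u200d"             # ZERO WIDTH JOINER, the "glue"
VS16 = "\ufe0f"            # VARIATION SELECTOR-16, requests emoji presentation
WHITE_FLAG = "\U0001F3F3"
RAINBOW = "\U0001F308"
WAVING_HAND = "\U0001F44B"
MEDIUM_SKIN_TONE = "\U0001F3FD"  # one of the five skin tone modifiers


def regional_flag(country_code: str) -> str:
    """Turn a two-letter code such as 'US' into two regional indicator symbols."""
    base = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
    return "".join(chr(base + ord(c) - ord("A")) for c in country_code.upper())


pride_flag = WHITE_FLAG + VS16 + ZWJ + RAINBOW
us_flag = regional_flag("US")
waving_hand = WAVING_HAND + MEDIUM_SKIN_TONE  # no ZWJ involved

# Show each sequence next to the code points it is built from.
for label, text in [("pride", pride_flag), ("US", us_flag), ("wave", waving_hand)]:
    print(label, text, [f"U+{ord(ch):04X}" for ch in text])
```

Whether each sequence is drawn as one glyph or several depends on the font, so the visible length can differ from what `len()` reports.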

  • @jsonlee01 1 year ago +119

    As a developer I’m familiar with ASCII and Unicode, but this was still a very informative and entertaining video. Keep up the good work, and thanks!

  • @vKevlar 1 year ago +191

    That Ron/Don/ducks bit was absolute gold. Great job with this and all your videos! It's been exciting to see your trajectory from Fake Science into Vox and now your solo adventures!

  • @robinmichel9048 1 year ago +150

    When I worked at a national wildlife refuge, we made maps in ArcGIS to hand out to people who wanted to bird watch. We'd mark spots where you could see cranes, eagles, ducks, etc. ArcGIS doesn't easily allow for images to be imported into a map, so we used a bird font. I forget exactly what letter corresponded to what bird but it made labelling bird watching locations super easy. 😂

    • @PhilEdwardsInc 1 year ago +42

      that's a useful font!

    • @Bruno-cb5gk 1 year ago +16

      this reminds me of the time I made a custom font to make a computer vision task much easier and more reliable, custom fonts are always a funny workaround

    • @kaitlyn__L 1 year ago +19

      Yeah I kinda miss symbol fonts where everything was “really” a letter, they’re kinda unnecessary now that Unicode has so many unique symbols.
      See also the mild controversy about whether to include stuff like the Prince symbol, since that was distributed as a vector font for publications to use, but isn’t public domain. But Unicode’s mission of being able to encode and support every possible symbol kind of suggests it should have it? But it can’t due to rights issues.

    • @PhilEdwardsInc 1 year ago +16

      @kaitlyn__L Ha great minds! I read an article about the Prince symbol while I was researching this and saw all the fans desperate to hack together the best possible comp.

    • @corgi_dad 1 year ago +6

      Very nice! I've used ArcGIS for years and haven't seen a bird font, but then I haven't made maps with birds. Now, the base for our published maps starts in GIS, but they are cleaned and finalized in Illustrator.

  • @kavich 1 year ago +53

    Honestly, the production quality of your videos makes me feel like i'm watching Netflix, it's pristine!
    And i find it incredible that even with the fancy edits and topics you can still make it entertaining and funny!

    • @PhilEdwardsInc 1 year ago +8

      that's nice and appreciated - thanks! still room to improve!

  • @niranjanr9064 1 year ago +64

    Learning Unicode's history was never this fun, and it made me realize its importance.
    Also, I genuinely love your style and take on video essays like this. Must've taken some time to reach that balance.

  • @MrNicePotato 1 year ago +108

    I am a Chinese speaker and I am learning Japanese. I know the hiragana and katakana pretty well but sometimes I don't know the pronunciation of a kanji. I would just switch to Chinese and type the Chinese equivalent, and I am always surprised how it knows it is supposed to be a Japanese kanji and not a Chinese character. Turns out someone (probably a group of people) painstakingly identified every Japanese kanji that is exactly the same as its Chinese counterpart and gave both the same Unicode code point!

    • @Radien 1 year ago +6

      That's reassuring!!
      Though, cataloging all widely used kanji is a huge project to begin with, in any language.

    • @dapuslearning3828 1 year ago +5

      Hello, can you share your experience, like, is it easier for a Chinese speaker to learn Japanese compared to rest of the people who don't know about kanji (Chinese characters)? Can it be easier for a Japanese speaker to learn Chinese?

    • @ZiRR0 1 year ago +1

      @dapuslearning3828 yeah, the difficulty of learning a language depends on your native one

    • @doak_ 1 year ago +2

      @dapuslearning3828 as far as im aware, yes: a large part of japanese vocabulary is from chinese, and most kanji (used in japanese) have hanzi (used in chinese) counterparts with the same or similar meaning and pronunciation*. its still better to confirm how the kanji is actually pronounced though
      in terms of native japanese words or loanwords from other languages like english, and also japanese grammar and pitch accent in general that cant be learned with the help of chinese
      im not sure how it'd be like for a japanese speaker learning chinese, but again the grammar is different and compared to japanese theres fewer english loanwords in chinese which would be expressed entirely using the chinese meaning of hanzi instead of transliteration (e.g. japanese for computer is 'konpyuuta' while in chinese its 'dian nao', literally meaning electric brain)
      *im talking about the onyomi pronunciation which is derived from chinese, kunyomi which is the japanese pronunciation can't be learned from chinese

  • @vectorhacker-r2 1 year ago +32

    As a computer geek this video tickles me nicely. We deal with Unicode and its weirdness as software developers all the time. It's nice for the general public to understand why sometimes we have this weirdness.

  • @CrocodileWhispers 1 year ago +12

    great editing as always Mr. Edwards. Thank you for your work, quite entertaining.

  • @farphos 1 year ago +56

    Repeatedly hearing the letter "Ä" being called "A" gave my swedish ass an aneurysm, even though i know he's not wrong and that the point being made is that you can add diacritics like an umlaut/diaeresis to letters using unicode.

    • @PhilEdwardsInc 1 year ago +38

      If I'd pronounced the accented As aloud, your reaction would have been even worse!

    • @FindecanorNotGmail 1 year ago +7

      Umlaut ≠ diaeresis, but Unicode does not distinguish between them

    • @vmusatov 1 year ago +7

      I guess it's the same as Russian Е and Ё: to non-Russians they might look like they sound similar, but for me they are completely different. I mean, they're "ye" and "yo" respectively

    • @interbeamproductions 11 months ago

      @PhilEdwardsInc should've said "A with acute", "A with grave", etc.
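
To make the diacritics discussion concrete, here is a small Python sketch using the standard `unicodedata` module (the variable names are illustrative only). It shows that "Ä" can be stored either as one precomposed code point or as "A" plus a combining mark, and that the mark's official name is COMBINING DIAERESIS whether a language treats it as an umlaut or a diaeresis.

```python
import unicodedata

decomposed = "A\u0308"   # LATIN CAPITAL LETTER A + COMBINING DIAERESIS
precomposed = "\u00c4"   # LATIN CAPITAL LETTER A WITH DIAERESIS

# The two spellings look the same on screen but are different strings.
print(decomposed == precomposed)

# Normalization converts between them: NFC composes, NFD decomposes.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed, nfd == decomposed)

# One name covers both the umlaut and the diaeresis use.
print(unicodedata.name("\u0308"))
```

Comparing user-entered text after normalizing both sides to the same form is a common way to avoid surprises from the two spellings.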

  • @EvenFilms 1 year ago +11

    Okay I legitimately blurted out laughing when the B&W guy asked if he could leave after his segment was over. 😂 You keep coming up with great ways of playing with tropes of educational videos. I immediately imagined all talking head interviewees as being trapped in the video, waiting to be cut to again.
    You’re killing it, Phil! Best video on this topic I have ever seen.

  • @TristouMTL 1 year ago +2

    Thank you for making your videos so enjoyable beyond just the information you give us... it looks like you have fun making them, and that's fun to watch!

  • @affechristoph 1 year ago +7

    I'm a very technical guy who is not afraid of using the command line and its text-based applications. Having Unicode there is very useful if you want things to be more useful or just to be more beautiful. Also, I learn Japanese using the internet, which is now possible due to Unicode.
    Well made video, ありがとうございます!

  • @liriohardy7240 1 year ago +3

    i feel so ꧁ special ꧂ as someone who has captions on hehe, i enjoyed that lil message ☆✪ൠ

  • @NadiimNafei 1 year ago +14

    I have a vague recollection of reading somewhere that the addition of Egyptian Hieroglyphs to Unicode was a pretty significant step in Egyptological research as it helped simplify communication in research (referring to hieroglyphs in writing in a shared, standardized way)
    I have a hard time finding the article I read, so perhaps my memory is off, but that’s what I remember 😅

    • @lawrencedoliveiro9104 1 year ago +1

      That sounds entirely plausible. Also the addition of other ancient scripts, like the many varieties of cuneiform. Also runes!

    • @PhilEdwardsInc 1 year ago +2

      I read the same though I also ran into some stuff saying scholars sometimes preferred pictures to give historical context.

    • @lawrencedoliveiro9104 1 year ago +2

      @PhilEdwardsInc Details of stylistic variations, say, over the centuries, are obviously best conveyed by pictures. But conveying the actual text, independent of such issues, would be most easily done by using an actual text encoding.

  • @palmercolson7037 1 year ago +19

    I remember Ron and Don's issue well. In the 1980s, I was using a Unix computer that used ASCII for its characters and I had to produce a tape of W-2 data for the IRS, which had to be in EBCDIC, the code used by IBM. I had to use the dd command to convert the text, write it to tape, and produce a file that the IRS would accept. For some reason, it took a lot of work to get it to work right.
    The other similar issue was that Unix used linefeed to terminate text lines, while both VAX/VMS and the new PCs used carriage return/linefeed combinations. Transferring files had to take that into account.

    • @RaymondHng 1 year ago +4

      So did I. I had to write a program to convert the variable record length ASCII data of our employees on our Pick operating system-based computer onto a tape of EBCDIC data for the Employee's Retirement System's IBM computer. It was dictionary driven. It ran perfectly after the second compile.

    • @lawrencedoliveiro9104 1 year ago +2

      VMS originally used record formats that did not have explicit line-ending characters. Then I think in version 3 they added 3 “stream” formats: STREAM_LF, STREAM_CR and STREAM_CRLF. One of these is equivalent to the CP/M/MS-DOS/Windows style, another to the Unix style, and the third to the old Macintosh style. So you see, it could handle them all natively.
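
As a rough modern illustration of the two conversions in this thread, here is a Python sketch. It assumes the `cp037` codec (one common EBCDIC variant that ships with Python); the thread does not say which EBCDIC variant the tapes actually required, and the helper names and sample record are made up.

```python
def to_ebcdic(text: str) -> bytes:
    """Encode text as EBCDIC, assuming code page 037 (US/Canada)."""
    return text.encode("cp037")


def to_crlf(text: str) -> str:
    """Convert Unix-style LF line endings to CR/LF without doubling existing CRs."""
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


record = "EMPLOYEE 0001"

# The same characters map to very different byte values in each encoding.
print(record.encode("ascii").hex(" "))
print(to_ebcdic(record).hex(" "))

# Round-tripping through the codec recovers the original text.
print(to_ebcdic(record).decode("cp037") == record)
print(repr(to_crlf("line one\nline two\n")))
```

This only covers the character mapping. Record lengths, blocking, and tape formats, which were likely part of the "lot of work", are outside the sketch.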

  • @stevenjlovelace 1 year ago +45

    I know you didn't want to get too technical, but UTF-8 had a big effect on Unicode's adoption. ASCII used 7 or 8 bits of data, while the earliest Unicode standards used 16. But for ASCII text (which includes, say, the HTML that makes up every web page, even in non-Latin languages), half the data would just be a bunch of zeroes. If they wanted to include more than ~65,000 characters, they'd have to extend it to 32 bits, and a full three quarters of plain Latin text would be zeroes. This would suck even now, but in an age of dialup modems, it was a non-starter. UTF-8 allows us to vary the number of bits for each character: 8, 16, 24, or 32, or even more if we include combining characters that allow you to change the color of hair on an emoji character.

    • @lawrencedoliveiro9104 1 year ago +8

      UTF-16 is a whole sorry saga in itself. Unicode was originally going to be a fixed-length 16-bit code, a.k.a. “UCS-2”. With a bit of a struggle over the CJKV unification, this was deemed sufficient for all writing systems currently in use.
      Certain vendors who were adopting Unicode at this time took the spec creators at their word, and began to hard-code the assumption of fixed-length 16-bit characters into their platforms: Microsoft with Windows NT, Sun with Java etc.
      Then with Unicode version 2 the spec creators changed their mind, and decided they would need more characters to support historical writing systems, special symbols, emojis etc. So characters--or actually, “code points”--would need more than 16 bits.
      So what happened to those products that had already been designed around the assumption of a fixed-length UCS-2 character set? What was originally “UCS-2” was now redefined as “UTF-16”, so a single Unicode code point could be represented by one or two 16-bit codes, using the newly-defined “upper surrogate” and “lower surrogate” ranges.
      But the sad thing is, what was supposed to be fixed-length code is now no longer fixed-length. What happened to all the software written with the assumption that the code was fixed-length? Don’t ask ...
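
The "one or two 16-bit codes" scheme described above can be sketched in Python as follows. The function name is hypothetical; the arithmetic is the standard UTF-16 surrogate calculation, in which the two halves are usually called the high and low surrogate.

```python
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000       # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


fire = "\U0001F525"  # the fire emoji from the video title
high, low = to_surrogates(ord(fire))
print(f"{high:04X} {low:04X}")

# Python's own UTF-16 encoder produces the same two 16-bit units.
print(fire.encode("utf-16-be").hex(" "))
```

Software that still assumes one 16-bit unit per character will count this emoji as two characters, which is the kind of breakage the comment alludes to.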

    • @static-san 1 year ago +2

      UTF-8 also very cleverly solved a subtle problem in string processing when dealing with multi-byte characters: going backwards. It's a bit complicated, but Shift-JIS, GB-2312 and nearly all of the other systems encoded second and later characters of multi-byte sequences in ways that clashed with first bytes. But UTF-8 encodes subsequent bytes very slightly differently than the initial byte so you can always tell which is which, whatever way you're moving through the text bytes.

    • @lawrencedoliveiro9104 1 year ago +3

      With UTF-8, you know how many bytes make up the current code point just from looking at the first byte. ASCII characters represent themselves in a single byte.

    • @AaronOfMpls 1 year ago

      @static-san Yup.
      - If a byte starts with 0, you _know_ it's a 1-byte ASCII character (Unicode range 0-127). No additional bytes are needed; any bytes to either side are part of other characters.
      - If a byte starts with 10, you _know_ it's the second (or third, or fourth, etc) byte of a multi-byte character. Move backward one or more bytes to find the start of the character. Or move forward one or more bytes to find the start of the _next_ character.
      - If a byte starts with 11, you _know_ it's the _first_ byte of a multi-byte character. The number of 1s before the first 0 (originally 110xxxxx - 1111110x) tells you how many bytes long the entire character is. Move forward to read those other bytes.
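
The three rules above can be sketched in Python like this. The function names are hypothetical, and the code assumes its input is valid UTF-8; it does no error checking. Note that current UTF-8 stops at four bytes per code point, even though the original design allowed longer sequences.

```python
def classify(byte: int) -> str:
    """Classify a single UTF-8 byte by its leading bits."""
    if (byte & 0b1000_0000) == 0:
        return "ascii"          # 0xxxxxxx
    if (byte & 0b1100_0000) == 0b1000_0000:
        return "continuation"   # 10xxxxxx
    return "lead"               # 11xxxxxx


def char_start(data: bytes, index: int) -> int:
    """Walk backwards from any byte offset to the start of its character."""
    while index > 0 and classify(data[index]) == "continuation":
        index -= 1
    return index


text = "a€🔥"
data = text.encode("utf-8")

# Bytes needed per character: ASCII, a 3-byte symbol, a 4-byte emoji.
print([len(ch.encode("utf-8")) for ch in text])
print([classify(b) for b in data])

# Starting in the middle of the emoji still finds where it begins.
print(char_start(data, len(data) - 1))
```

Because continuation bytes never look like lead bytes, the backward walk needs no extra state, which is the property the thread praises.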

  • @mikea.1586 1 year ago +8

    LOL who remembers receiving text messages with a bunch of squares thinking what is wrong with my phone.

    • @PhilEdwardsInc 1 year ago

      definitely

    • @kaitlyn__L 1 year ago

      Those $5 emoji-enabling apps

    • @lawrencedoliveiro9104 1 year ago

      I remember coming across the lovely phrase “the empty rectangle of Unicode disappointment”.
      I think web developers used the term “tofu” for those squares. And so Google created its “Noto” fonts (“no tofu”) to try to cover all the code points in a meaningful way, to put an end to them.

  • @lasinhouseinthetrees1928 1 year ago +1

    thanks for covering this :) I asked you in a tweet a while back and thanks so much for the video

    • @PhilEdwardsInc 1 year ago +2

      oh dang i'm sorry i forgot to say thanks in the video! i put it in my giant brainstorm doc and lost track. thanks for the suggestion!

    • @lasinhouseinthetrees1928 1 year ago

      @Phil Edwards no worries at all :) thanks for making it, it answered a lot of my questions!

  • @fgsaldanha 1 year ago +11

    In the late 90s and early 2000s, when browsers did not fully support Unicode yet, non-English speakers had to adapt to writing without special characters. This was the beginning of a kind of "dialect", initially intended to circumvent those technical limitations, that quickly became youth slang (and the terror of many teachers and parents).

    • @user-jk2zm7uq5s 1 year ago +2

      Chat-Arabic comes to mind, which incorporated numbers as letters. Pretty useful actually.

  • @justinrau1999 1 year ago +27

    Crazy that this is a single character ﷽

    • @PhilEdwardsInc 1 year ago +5

      I think the code is U+FDFA

    • @RusNad 1 year ago +6

      @PhilEdwardsInc It's U+FDFD

    • @aryan_kumar 1 year ago

      I remember on my Samsung phone it used to appear as a short sentence but then they released an update which compressed it into a character slightly wider than ―

    • @kaitlyn__L 1 year ago

      I guess that’s a ligature? Technically?

    • @RusNad 1 year ago +2

      @kaitlyn__L It's a bit more than a ligature cause it's the whole phrase "In the name of Allah, the Compassionate, the Merciful" which opens every chapter of the Quran.
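
One way to settle which code point is which is to ask the Unicode character database. A short Python sketch with the standard `unicodedata` module (the `names` dictionary is just an illustrative variable) looks up both candidates mentioned in this thread:

```python
import unicodedata

# The two code points mentioned above.
candidates = (0xFDFA, 0xFDFD)
names = {cp: unicodedata.name(chr(cp)) for cp in candidates}

for cp, name in names.items():
    print(f"U+{cp:04X}", chr(cp), name)

# However wide it is drawn, the character is a single code point.
print(len("\ufdfd"))
```

The official names include the word "ligature", which fits the question raised above, even though the glyph spells out an entire phrase.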

  • @n.kutalia 1 year ago +4

    Wow, the visuals and the storytelling are amazing

  • @iamindin001 1 year ago +2

    Convo between Ron and Don made me sub!

  • @kaitlyn__L 1 year ago +4

    The spoofs of 90s VHS presentations and modern overhyped YouTubers were great. I can tell how much effort went into framing, scripting, and then the chroma key background creation (more involved for the former admittedly).

  • @DS-oopa 1 year ago +2

    This answered questions I never knew I had. I'm a computer tinkerer/novice programmer, so I was already somewhat familiar with what ASCII & Unicode are, but never knew the history behind them. Thank you for this entertaining educational video!

  • @matts7327 1 year ago +51

    I find it interesting that there is actually a common language (of sorts) that Unicode doesn't fully support: math expressions. A format called LaTeX is much more common for typesetting math equations, and while Unicode does have some stuff for this, it's not nearly as good.

    • @NigelMelanisticSmith 1 year ago +17

      Musical language is another example where pure Unicode doesn't really work, and you need formatting like TeX provides

    • @qwertyTRiG 1 year ago +7

      See also Sutton SignWriting. Unicode supports all the symbols, but not their complex positioning.

    • @peterk7931 1 year ago +3

      I would argue that LaTeX is a programming language, not a character set.

    • @mfaizsyahmi 1 year ago +3

      Unicode is for natural speech, not math nerdspeak 😂

    • @1224chrisng 1 year ago +4

      @mfaizsyahmi I wouldn't exactly call Hieroglyphs natural language, not since Roman times anyways

  • @petercraft1928 1 year ago +3

    always interested in your content, feels like curiosity at its peak

  • @SimplyDudeFace 1 year ago +9

    What I learned from this video: Unicode reaches back into the 80’s, and they tried to cap it at 16 bits by limiting the CJK glyphs.

  • @OllieWille 1 year ago +3

    I'm glad I'm not the only one who finds it fun to look at Unicode characters

  • @ljphoenix4341 1 year ago

    The humor in this video is on point! A super interesting video, yet entertaining as well. Great job with this one, Phil!

  • @neskey 1 year ago +1

    I would kill for that 90s style presentation that takes three hours of monotone narration over the meaning of every single symbol. you have no idea how badly I want it

  • @zyansheep 1 year ago +6

    You forgot about m̶̢̓̒o̸̼͊͜͠d̶̪͌i̵͙͚͠f̴͍͎̄ȉ̸̡ė̸͕r̴̨͛̎ ̴̣͘ͅc̴̲͘h̴̙̿a̷̹͛̑r̸̫̈́ͅa̸̳̚c̵̣̑ͅt̸̳̀͝e̵͉̍͜r̸͕̤̒s̸̱̣̓!

    • @PhilEdwardsInc 1 year ago +5

      ũ̈́g̃̈͂h̆̉͛ ̞̮̹h̤̥̦o̧̨̩w̪̫̭ ̬̯̰ḏ̲̳i̴̵̶d͇͈͉ ͊͋͌i͍͎͏ ͐͑͒m͓͔͕i͖͗͘s͙͚͛s͛͜͝ ͟͞͠tͣ͢͡hͤͥͦiͧͨͩsͪͫͬ

    • @kaitlyn__L 1 year ago

      Which also makes it take 10x as long to read on a screen reader lol
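
For the curious, the stacked "glitch" text above is ordinary letters followed by many combining marks. A hedged Python sketch of one way to strip them is below. The `strip_marks` helper is hypothetical, and it deliberately removes every nonspacing mark (Unicode category `Mn`), so it would also remove legitimate accents and is not suitable for scripts that depend on such marks.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Remove nonspacing combining marks after decomposing the text."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# "mo" with a few stacked marks, in the style of the comment above.
glitchy = "m\u0336\u0322\u0313o\u033c\u034a"
print(len(glitchy), glitchy)
print(strip_marks(glitchy))

# Side effect: precomposed accented letters lose their accents too.
print(strip_marks("\u00c4"))
```

A gentler variant might cap the number of marks per base letter instead of removing them all, which would keep normal accented text intact.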

  • @nyuh 1 year ago +2

    hell yeah !! i love unicode !!
    finally a look into its history !!

  • @kip_c 1 year ago +2

    excellent quality as always Phil

  • @dylancorp4897 1 year ago +1

    Great vid. This made me look into what a Japanese typewriter looked like and Wow! The technical expertise it must have taken to use those things. Makes the QWERTY evolution seem tame.

    • @PhilEdwardsInc 1 year ago

      yes indeed! i believe johnny harris made one about the chinese keyboard
      th-cam.com/video/hBDwXipHykQ/w-d-xo.html
      and i have made a qwerty vid!
      th-cam.com/video/c8f6us-Sjlo/w-d-xo.html

    • @dylancorp4897 1 year ago

      @Phil Edwards Saw your QWERTY vid but hadn't heard of Johnny Harris before. Checking it out now, thanks for sending!

  • @alecgolas8396 1 year ago +7

    I'm always down to listen to a boring 30 minute video by "The Computer Guru" while I make my spaghetti

  • @herzogsbuick 1 year ago

    i didn't expect to wake up to find a history of unicode video in my feed, let alone from someone i enjoy as much as you, Phil. let the breakfast, and history, begin

  • @TheNucaKola 1 year ago +1

    Thank you for calling out how obnoxious some YouTube videos can be at the beginning

  • @ian4846 1 year ago

    Love your videos. Not too in depth, long and dry videos but also not too superficial and surface level.

  • @TSZatoichi 1 year ago +2

    The level of ASCII usage will fall off a cliff now that Dwarf Fortress has been ported to Steam.

    • @leap123_ 1 year ago

      *Microsoft's ASCII extension. Dwarf Fortress uses CP-437, Microsoft's ASCII extension used in MS-DOS. So it's not the level of ASCII usage that fell off the cliff, it's the level of Bill's ASCII usage that did.
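
The difference between plain ASCII and code page 437 is easy to see with Python's built-in `cp437` codec. This is only an illustrative sketch; the byte values and variable names are chosen for the demo and have nothing to do with how Dwarf Fortress itself stores its tiles.

```python
# Four bytes above 0x7F: meaningless in ASCII, shade and block glyphs in CP437.
raw = bytes([0xB0, 0xB1, 0xB2, 0xDB])
blocks = raw.decode("cp437")
print(blocks)

# Bytes in the 7-bit range decode the same way under both encodings.
plain = b"Urist"
print(plain.decode("ascii") == plain.decode("cp437"))

# Strict ASCII rejects the high bytes outright.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)
```

Decoding to a Python string maps each CP437 byte to its Unicode equivalent, which is how such text survives on modern systems.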

  • @DeclanMBrennan 1 year ago +5

    Insisting that every Han character gets its own code point by itself: *Han Solo* (Cue Wookiee sound.)
    Thank you. I'll show myself out.

    • @PhilEdwardsInc 1 year ago +3

      Don't show yourself out. I had a Han unification joke that I cut out from the video. We're only human.

    • @DeclanMBrennan 1 year ago

      @PhilEdwardsInc 🙂

  • @LethalBubbles 1 year ago +2

    why do you need so much justification for including hieroglyphics? I'm so glad it tries to preserve every historical language. it is a good thing in itself.

  • @DiannaCarney 1 year ago +3

    Just starting the video now- but I’m so surprised, my TV even recognizes the symbols!

  • @camposfb 1 year ago +3

    That HAI mention in the beginning! 🤩

    • @PhilEdwardsInc 1 year ago +4

      wasn't surprised to find he'd done something unicodey

  • @JC20XX 1 year ago

    This video was good at showing social impacts which I hadn't seen featured before. Thank you.

  • @paulcooper3611 1 year ago +3

    And here is another bit of unnecessary history: Baudot encoding was invented in the 1870s and was used in teleprinters. Letters were encoded in five bits, which only left room for capital letters. ASCII had seven bits, which allowed for lower case letters and more punctuation, but the first iteration only had upper case letters because that is what Baudot had. The Baudot ITA2 code, standardized in 1924, listed the letters in the order they appeared on the QWERTY keyboard. When the ASCII table was laid out they realized that the keyboard order no longer mattered, so the letters are in alphabetical order.
    In 1974 I worked at a company in Freeport, Texas. The blueprint printer I used was right next to the telex machine used to communicate with the company office in Houston. They would send us the daily schedules and we would send our production data back to them. When the telex received a message, in addition to printing it out, it would punch it out on paper tape, which was stored and could be printed out again. It was sort of the early version of the disk drive.
    The telex is pretty well gone now, but the Baudot code is still used in nautical radio messaging to conserve bandwidth.
    The big competitor to ASCII was EBCDIC, but I don't want to go into that right now, since it doesn't really bear on the development of Unicode.

    • @kaitlyn__L 1 year ago

      My favourite part of Baudot code was how you could still do lowercase letters, you just had to hope the receiving end wasn’t out of sync on the status of the carriage shift… (so of course most didn’t do it, and the early teleprinters didn’t even bother having compatibility)

    • @PhilEdwardsInc 1 year ago

      Thanks for adding that Paul. And Kaitlyn you are gradually revealing you are a character code/Unicode genius.

    • @FindecanorNotGmail 1 year ago +1

      There used to be standardised "ASCII keyboard" modules that produced ASCII codes in parallel when you pressed a key, or key combination. I think both the Apple I and Apple II were made for ASCII keyboards: for the Apple I you had to buy one separately.

    • @kaitlyn__L 1 year ago

      @PhilEdwardsInc just regurgitating stuff I read 15 years ago lol, a certain amount gets reinforced watching people repair mechanical 1930s teletypewriters too! But thank you very much nevertheless, I suppose it does all add extra background info to the subject of the video :)

    • @lawrencedoliveiro9104 1 year ago

      IBM’s excuse for creating EBCDIC is that they were bringing out their brand-spanking new System/360 machines in 1964, and the ASCII spec wasn’t quite finished yet. (I think it was actually finalized that same year.)
      So, for the sake of a few months, the computer industry was saddled with decades of compatibility headaches between the two character encodings.
      Actually, nobody used EBCDIC apart from IBM and some plug-compatible mainframe vendors. The entire rest of the industry adopted ASCII, plus national variants thereof.

  • @dustinbelle3251 1 year ago +2

    1:35 shots fired at Scishow! 💀

  • @IMPERIALYT 1 year ago +4

    Cool stuff!

  • @TwoWrights 1 year ago +8

    This was super interesting. All your videos are. They just should be a little longer. They always leave me wanting more.

    • @SimplyDudeFace 1 year ago +1

      I get the same feeling. Maybe it’s that these videos cover topics I already know, or maybe they could use an extra minute to go into a little more detail. An important point that was skipped from this video is the fact that the rendering of the glyph is not part of Unicode. Unicode is the text description of the glyph.

    • @TwoWrights 1 year ago +1

      @SimplyDudeFace I think it’s a Vox thing. Or it was a Cracked thing that evolved into the Vox thing of giving you the light information but not coming to a conclusion or opinion at the end. I just want him to dig one layer deeper. These videos spark my interest in a topic and then they’re over.

  • @CubeAtlantic 1 year ago +2

    Some of these unicodes are so high-quality & interesting ngl.

  • @aperson1 1 year ago +6

    This video is 𓎼𓂋𓄿𓏏.

    • @PhilEdwardsInc 1 year ago +5

      𓃀

    • @evanmcgurrin 1 year ago +1

      I'm sad how far I had to scroll to find interesting unicode in the comments, but I'm glad I found it!

  • @paulafunaro1890 1 year ago

    I never loved history, but the way you tell the stories makes me like it ever so slightly more. Thank you!

  • @jonduke4472 1 year ago +2

    There's also precedent because some of the earliest word processors felt it was important to be able to write chess books

  • @TonyBullard 1 year ago +4

    The ever-growing inclusivity of Unicode makes me feel warm and fuzzy inside and hopeful for the future.

  • @BlairCarlyle 1 year ago

    The humor in this video is on another level
    Great video as always Phil, thanks!

  • @LargeCrocodile 1 year ago +1

    Great editing, fitting music for the atmosphere, and your watch facing the camera for some reason kept me fixated. 9/10

    • @PhilEdwardsInc 1 year ago

      lol the watch has triggered so many people haha. i am going to start it as my retention strategy.

  • @dan339dan 1 year ago +7

    7:06 As a Chinese user, I think those are still very much the same Chinese characters. So I don't think anyone would oppose these characters from a practical standpoint. Han unification also deals with more visible differences; good examples are listed on Wikipedia.
    As for the hieroglyphics, I haven't checked, but I assume these are added for research purposes. How else would you do a text search on ancient Egyptian writings if there isn't a standard?

    • @kaitlyn__L 1 year ago

      I remember reading most opposition came from the types of people who also opposed Simplified Chinese and the postwar Japanese kanji simplification.

  • @matthew2532 1 year ago

    "Sometimes a solution looks like a foot" - likely the most confusing quote without context.

  • @isaacweymouth5795 1 year ago

    I appreciate the details you put into your youtuber spoof at 1:30. Abrasive colors, lighting, faces and cuts; just needed some face swirling and zooming.

  • @siimseiin
    @siimseiin ปีที่แล้ว +1

    Thank you for posting!

  • @alecparker6009
    @alecparker6009 ปีที่แล้ว +17

    𓃀

  • @JMartinsATV
    @JMartinsATV ปีที่แล้ว +1

    I can tell you’re finding your groove with every new video. In my opinion this topic isn’t all that interesting but your delivery was really funny and entertaining. And I still learned something so congrats!

  • @calyco2381
    @calyco2381 ปีที่แล้ว +3

    That reminds me of a Short with an AO3 fun fact: someone is posting fic written in hieroglyphs. When I checked AO3, I found something way crazier.
    There is fic written in fkn cuneiform.
    Like, HOW DO YOU EVEN TYPE THAT?! 💀

  • @10HW
    @10HW ปีที่แล้ว

    The Calvin peeing video + the "Ron, Don, Ron, Don" joke got me subscribed :)

  • @さゆぬ-x7i
    @さゆぬ-x7i ปีที่แล้ว +12

    Some people opposed the Han unification back then, but as a Japanese person I would say the majority of us, including me, think it was a sensible thing to do. But it was not a simple task, was not executed perfectly, and left some irreversible inconsistencies in the code chart, like the fact that the character 浅 is defined so it can be drawn with two or three horizontal bars in the right‐hand part while 桟 and 栈 are encoded separately. Overall it more or less worked.
    It is hard to come up with a Latin alphabet equivalent, but it is like an alternate universe where one half of the world ended up with a tradition of consistently writing the letter A without a horizontal bar, while the other half exclusively used an A with two horizontal bars to write their own languages for centuries. Both evolved from the same letter A, but for each side “the other version” would just look weird and alien. And it’s not just the letter A: many letters throughout the alphabet would have such small differences, some because of natural evolution, some because countries decided to purposefully modify their writing systems. They speak different languages, so texts written in one region’s version of the alphabet rarely meet the eyes of others, and wide variation within each region never becomes normalized. (So for developers supporting East Asian languages it is important not just to translate the text but also to apply appropriate fonts designed for each region so that it looks acceptable.)

    • @kaitlyn__L
      @kaitlyn__L ปีที่แล้ว +6

      I think a comparison to the identical or nigh-identical letters in Greek or Cyrillic is an appropriate counterpart, since they have similar evolutionary divergences as the different versions of Han characters. There’s less overlap, but still enough to make the point imo. Epsilon isn’t the same as E, but to an outsider they’d look the same.

    • @さゆぬ-x7i
      @さゆぬ-x7i ปีที่แล้ว +4

      ​@@kaitlyn__L That can be regarded as similar in a broad sense, but then we can not explain why they are treated differently in Unicode (and in other older standards) using the analogy. Among CJK languages some of the Han characters look a bit different (which are unified encoding‐wise), some look completely dissimilar (encoded separately), but many others look identical. When they are similar enough, native speakers regard them as the same character. The character 百 that appears in Mandarin text and 百 in Japanese are seen as the same, so it is desirable to be able to type or search the “Chinese 百” and the “Japanese 百” the same way. That is unlike the E/Epsilon pair, and more similar to the relationship between “English E” and “French E”.
      If we want a real‐world western analogy, I suppose it is how traditional cursive handwriting of the same alphabet differs from region to region in Latin‐ or Cyrillic‐using countries. That kind of divergence extends to block‐style glyphs in CJK, so to speak.

    • @kaitlyn__L
      @kaitlyn__L ปีที่แล้ว +2

      @@さゆぬ-x7i certainly, there’s no perfect analogy. But for explaining the specific backlash, I think it’s at least better than A and À! As the audience more quickly grasps both the core difference and why an outside observer might want to combine them. But certainly there’s nothing exactly the same in Europe.

    • @さゆぬ-x7i
      @さゆぬ-x7i ปีที่แล้ว +2

      ​@@kaitlyn__L Yeah, it’s better than A and Á, that is for sure. With that analogy alone the audience would not understand that the unification is not as far‐fetched as the idea of consolidating Latin, Greek and Cyrillic into one alphabet, so in the end we need to explain the situation specific to CJK, either way.

    • @GurtBFroe1
      @GurtBFroe1 ปีที่แล้ว

      ​@@kaitlyn__L What about writing Z or 7 with or without a middle bar?

  • @robdavlin
    @robdavlin ปีที่แล้ว

    “The quick brown fox jumps over the lazy dog.”
    I wonder what sentence captures the most Unicode characters.

    • @the_linguist_ll
      @the_linguist_ll ปีที่แล้ว

      I mean since sentences can be of infinite length in any language, I guess it comes down to whichever language uses the most Unicode characters.
      Assuming we're disallowing the sentence "I just typed "🎉" and "❤" and "ɞ" and "ɮ" and "Ꮝ" and "Ꮿ" and "L"..." in which case it's that one

  • @pdpUU
    @pdpUU ปีที่แล้ว +2

    9:34 ^me sipping my mds sprite in bed wearing my bathrobe^ Ahhh yes, *we* did good.

  • @The_Sofa_King
    @The_Sofa_King ปีที่แล้ว +1

    Another solid video Phil!

  • @tomasc7728
    @tomasc7728 ปีที่แล้ว +1

    This is amazing! I just attended the funeral of one Jon Jenkins, who was one of the key Unicode engineers at Apple for many years and helped with Chinese characters and other alphabets. Idk, this stuff just really intrigues me.

  • @elideaver
    @elideaver ปีที่แล้ว +1

    Newspapers have headlines: youtube videos have titles

  • @sonny5068
    @sonny5068 ปีที่แล้ว

    This is a phenomenal video! Loved every second of it!

  • @josephrion3514
    @josephrion3514 ปีที่แล้ว +5

    If you can, please explore more of the big bumps in that chart of upticks in characters: what were they adding at those points? What did the code look like before Han characters were added? How many characters did the Romance languages fill up? Which languages got supported in which order? And discuss the emoji takeover in more detail. Fascinating topic; it reminds me a bit of the Helvetica documentary my art class showed.

    • @PhilEdwardsInc
      @PhilEdwardsInc  ปีที่แล้ว +1

      Yeah definitely more there - I'd basically be doing this version history: unicode.org/history/publicationdates.html

    • @josephrion3514
      @josephrion3514 ปีที่แล้ว

      @@PhilEdwardsInc I'll take my time to browse it. maybe I'll find some interesting things! Thank you.

  • @Nick-Lab
    @Nick-Lab ปีที่แล้ว

    His typical 90s character revealed a lot about what he was like 30 years ago lol

  • @mgetommy
    @mgetommy ปีที่แล้ว +1

    Enjoyed this video Phil

  • @AyushBakshi
    @AyushBakshi ปีที่แล้ว +5

    01:56
    In graphic design, they are more precisely called typefaces. Fonts are flavours: bold, italics, narrow and so on.

  • @willychilton
    @willychilton ปีที่แล้ว +1

    Phil’s alter egos are becoming the best part of this channel. Laughed my ass off at 1960s Phil 😹

  • @glowing_kitty
    @glowing_kitty ปีที่แล้ว +1

    Asking GPT4 about this gives some fun results:
    Here are 50 unusual Unicode characters from various scripts and symbol sets:
    1. ƃ - Latin Small Letter B with Topbar
    2. Ƕ - Latin Capital Letter Hwair
    3. Ɇ - Latin Capital Letter E with Stroke
    4. ʬ - Modifier Letter Small Turned W
    5. ˁ - Modifier Letter Low Left Arrowhead
    6. ˤ - Modifier Letter Reversed Glottal Stop
    7. Ֆ - Armenian Capital Letter Fe
    8. ֍ - Armenian Eternity Sign
    9. ٭ - Arabic Five Pointed Star
    10. ں - Arabic Letter Noon Ghunna
    11. ۝ - Arabic End of Ayah
    12. ॐ - Om Symbol
    13. ॡ - Devanagari Letter Vocalic Ll
    14. ৺ - Bengali Isshar
    15. ੴ - Gurmukhi Ek Onkar
    16. ઽ - Gujarati Sign Avagraha
    17. ଃ - Oriya Sign Visarga
    18. ஃ - Tamil Sign Visarga
    19. ఀ - Telugu Sign Combining Candrabindu Above
    20. ಄ - Kannada Sign Siddham
    21. ഽ - Malayalam Sign Avagraha
    22. ඞ - Sinhala Letter Alpapraana Nayanna
    23. ฦ - Thai Character Lue
    24. ༄ - Tibetan Mark Initial Yig Mgo Mdun Ma
    25. ᚠ - Runic Letter Fehu
    26. ᛝ - Runic Letter Ingwaz
    27. ᜀ - Tagalog Letter A
    28. ᜵ - Philippine Single Punctuation
    29. ᠆ - Mongolian Todo Soft Hyphen
    30. ᥆ - Tai Le Letter Tone-2
    31. ᨗ - Buginese Letter Ara
    32. ᬆ - Balinese Letter Oriya
    33. ᱚ - Chakma Letter Dda
    34. ₯ - Drachma Sign
    35. ₸ - Tenge Sign
    36. ⃖ - Combining Leftwards Arrow Above
    37. ⃬ - Combining Parentheses Above
    38. ⌘ - Place of Interest Sign
    39. ⌬ - Increment
    40. ⍼ - Apl Functional Symbol Circle Star
    41. ⏏ - Eject Symbol
    42. ⏣ - Benzene Ring with Circle
    43. ␛ - Symbol for Escape
    44. ⑀ - Circled Digit Zero
    45. ⑆ - Circled Number Ten on Black Square
    46. ⓪ - Circled Digit Zero
    47. ⚆ - White Circle with Dot Right
    48. ⚙ - Gear
    49. ⛢ - White Diamond in Square
    50. ⛯ - Map Symbol for Lighthouse
    Please note that the appearance of these characters may vary depending on the device and font used.

    • @PhilEdwardsInc
      @PhilEdwardsInc  ปีที่แล้ว

      i may have done this to find fun ones for the video!
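Several of the names in the list above do not match the official Unicode character names, so it can be worth checking them locally. Here is a minimal sketch using Python's standard `unicodedata` module; the `CLAIMS` sample and the `check_claims` helper are made up for illustration and cover only five of the fifty entries.

```python
import unicodedata

# A few (character, claimed name) pairs copied from the list above.
# Escapes are used so the code does not depend on font or editor support.
CLAIMS = [
    ("\u0183", "Latin Small Letter B with Topbar"),
    ("\u2318", "Place of Interest Sign"),
    ("\u232c", "Increment"),
    ("\u24ea", "Circled Digit Zero"),
    ("\u2699", "Gear"),
]


def check_claims(claims):
    """Return (char, claimed, official, matches) for each claimed pair."""
    results = []
    for char, claimed in claims:
        # unicodedata.name() returns the official name in upper case.
        official = unicodedata.name(char, "<unnamed>")
        results.append((char, claimed, official, claimed.upper() == official))
    return results


if __name__ == "__main__":
    for char, claimed, official, ok in check_claims(CLAIMS):
        print(f"U+{ord(char):04X} claimed={claimed!r} official={official!r} match={ok}")
```

In this sample, four names line up with the standard, while the character labelled "Increment" is officially named BENZENE RING.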

  • @Juho.S.
    @Juho.S. ปีที่แล้ว +4

    Alt + 3 (in numpad) = ♥

  • @DeactivatedCharcoal
    @DeactivatedCharcoal ปีที่แล้ว

    One day, while waiting in a parking lot, I checked to see if I could get any WiFi signal on my phone. Lots of boring SSID names (and a few clever ones), as you might find downtown. One of the WiFi names was 💩. Yes, you can use emojis and most Unicode characters. I haven't had this much fun since the DOS days, when I discovered you can use character 255, the "invisible character", in file names. People would try to enter it as the "space bar". Another fun one was character #7, the "bell character": if you printed it, the printer would buzz or sound a beeper.

  • @christophedevos3760
    @christophedevos3760 ปีที่แล้ว

    Well... With the use of emoji we're back to the origin of writing, so the (re)introduction of hieroglyphs is a logical step.

  • @TheTerranInformed
    @TheTerranInformed ปีที่แล้ว +3

    For a while, I had been frustrated that I had to look up the character for pi every time I wanted to type it! But now I have downloaded the Greek alphabet on my phone, so that I can say:
    π π π π π π π π π π π π π π π π π!!!!!!!!

    • @PhilEdwardsInc
      @PhilEdwardsInc  ปีที่แล้ว

      π

    • @FindecanorNotGmail
      @FindecanorNotGmail ปีที่แล้ว

      ... and then find that Unicode differentiates between "GREEK SMALL LETTER PI" and "MATHEMATICAL ITALIC SMALL PI" and your software may only recognise one or the other ...
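The distinction between the two pi characters can be seen with Python's standard `unicodedata` module. This is only a sketch: the constant names and the `same_after_nfkc` helper are illustrative, and whether a given application should fold the two characters together is a design decision, not something Unicode mandates.

```python
import unicodedata

GREEK_PI = "\u03c0"      # GREEK SMALL LETTER PI
MATH_PI = "\U0001d70b"   # MATHEMATICAL ITALIC SMALL PI


def same_after_nfkc(a, b):
    """Compare two strings after compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


if __name__ == "__main__":
    print(unicodedata.name(GREEK_PI), unicodedata.name(MATH_PI))
    print("raw equal:", GREEK_PI == MATH_PI)
    print("equal after NFKC:", same_after_nfkc(GREEK_PI, MATH_PI))
```

Software that compares raw code points treats the two as different characters; software that applies NFKC first treats them as the same letter.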

    • @lawrencedoliveiro9104
      @lawrencedoliveiro9104 ปีที่แล้ว

      On Linux systems, you can define a “compose” key which lets you type mnemonic sequences for a large range of characters on something as ordinary as a US keyboard. E.g. compose-lessthan-doublequote for the opening “ symbol, compose-greaterthan-doublequote for the close ”, compose-plus-minus for ± and so on.
      But there was no sequence for π. Then I discovered you can define your own custom compose sequences. So I defined compose-p-i to let me type π, and compose-m-u for µ.

  • @zarinloosli5338
    @zarinloosli5338 ปีที่แล้ว

    As someone who is pretty familiar with Unicode and has kind of gotten over its wonder, I enjoyed you introducing Unicode to Stranger Things music

  • @dogone1t
    @dogone1t ปีที่แล้ว +1

    Awesome videos, super informative

  • @yorktown99
    @yorktown99 ปีที่แล้ว

    In one of his live talks, Tom Scott says that the watershed moment was when a version of ASCII included a smiley face.

  • @rakninja
    @rakninja ปีที่แล้ว +2

    point of contention: Korean is written in Hangul, a phonetic alphabet with 24 letters. Koreans also use some Chinese characters, in much the same way as Japan and, I suspect, for many of the same reasons.

    • @static-san
      @static-san ปีที่แล้ว +1

      This is correct. Korean and Japanese are both "agglutinative" languages, which means words are constructed from root elements and extended with prefixes, suffixes and other words. Chinese languages are not structured that way; they are "analytical" languages. So both the Japanese and Korean people had the same problem using Chinese characters to write their own languages. They eventually solved it by creating their own characters (kana and Hangul respectively) to do for their languages what the Chinese characters could not. However, Korea largely abandoned mixed text for pure Hangul quite some time ago. Japan was on the verge of doing the same with kana in the 1940s (I think?), but there was protest from some quarters and it didn't happen.
      I found a carved monument in Seoul that had mixed Hanja+Hangul text. Quite interesting to see!

  • @user-ht5ce2it3z
    @user-ht5ce2it3z ปีที่แล้ว +2

    Brilliant and well made as always, Phil. Love your stuff at Vox and love your solo style and how it's your own. By the way, has anyone ever mentioned you look so much like a younger Gary Oldman? Gary Not-So-Oldman? Also, are these TH-cam emoji also Unicode based? Time to do my own research.

  • @joshuaevans4301
    @joshuaevans4301 ปีที่แล้ว +2

    Pro tip: Press "Win + :" (so, press the Windows key and colon at the same time) to bring up the Windows emoji menu. Now you can Unicode anywhere!

    • @FindecanorNotGmail
      @FindecanorNotGmail ปีที่แล้ว

      I think that Windows should have adopted the *Compose* key from Unix, and let people _type_ emojis instead of having to browse for them.
      For instance *Compose* *

    • @joshuaevans4301
      @joshuaevans4301 ปีที่แล้ว

      @@FindecanorNotGmail The windows emoji menu actually works like this :D
      You just press "Win + :" and start typing the name of the emoji. The emoji list will be filtered, and when you select one, the text you typed for the search will be replaced with the selected emoji

  • @pħi
    @pħi ปีที่แล้ว +1

    i can type ඩාьЮœßλΨўωħǶƕ🌄ʑ̴̪̤̰̺̻̼̝̞̘̙̠̯̟̜̩̥̬͈͉̻͎͔͍͕͇̺̪̃̑̍̽͗̈̌̊̌̂᷄᷅᷈̏̄̄́̋͊͆̈͋͆͌̚͜͢͡📲🛞қ№😇фʠ for no reason

  • @r0kus
    @r0kus ปีที่แล้ว +2

    Thank you for this informative, fun, and well-acted video. 😑
    I realize you did not want to get too nerdy with this, but I feel two critical steps from ASCII to Unicode bore mention. The first is ANSI, Microsoft's ASCII extension for Windows that included various European characters not part of ASCII. The second is the formal international standard, ISO-8859, for the 256 character encodings a byte could hold. 8859 in particular remains significant because the first 256 characters of Unicode *are* ISO-8859.⬅
    Not mentioning these is like walking the ASCII path to the front door of the Unicode house, but missing the two steps up the porch, allowing easy access to said door.👟

    • @PhilEdwardsInc
      @PhilEdwardsInc  ปีที่แล้ว

      thank you! appreciate the steps!
      (I tried to kinda do this when I mention Bill's standards, but no arguing that this wasn't a full transition.)

    • @FindecanorNotGmail
      @FindecanorNotGmail ปีที่แล้ว +1

      Actually, Microsoft's "ANSI" was based on an ANSI draft of what eventually became the standard ISO-8859-1 ("ISO Latin 1"). And there is more than one ISO-8859 part, each with different characters (up to ISO-8859-16).
      Microsoft's "ANSI", (also called Windows-1252) is actually a super-set of ISO-8859-1 with printable characters in the range 128..159 that the ISO standard left undefined. Unicode is a superset of ISO-Latin-1 but with control characters in that range instead.
      ISO-Latin-1 used to be the default on the web before the change to Unicode/UTF-8, and there are many web sites that still haven't converted.
      Some notable characters in Windows-1252's extra range that are not in ISO-Latin-1 are the fancier quotation marks. You sometimes still see just those characters rendered wrong when Windows-1252 text has been mistaken for ISO-Latin-1.
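The quotation-mark problem described above is easy to reproduce with Python's built-in codecs. The following is a minimal sketch; the sample bytes and helper names are made up for illustration.

```python
# Bytes as a Windows-1252 editor would save curly-quoted text:
# 0x93 and 0x94 are the left and right double quotation marks there.
RAW = b"\x93quoted\x94"


def decode_both(raw):
    """Decode the same bytes as Windows-1252 and as ISO-8859-1."""
    return raw.decode("cp1252"), raw.decode("latin-1")


def latin1_matches_unicode():
    """Check that Latin-1 byte values map 1:1 onto the first 256 code points."""
    return bytes(range(256)).decode("latin-1") == "".join(chr(i) for i in range(256))


if __name__ == "__main__":
    as_cp1252, as_latin1 = decode_both(RAW)
    print(repr(as_cp1252))  # curly quotes, U+201C and U+201D
    print(repr(as_latin1))  # invisible C1 control characters U+0093 and U+0094
    print(latin1_matches_unicode())
```

Decoded as Latin-1, the two quote bytes turn into C1 control characters, which is why only those characters look broken while the rest of the text survives.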

    • @r0kus
      @r0kus ปีที่แล้ว

      @@FindecanorNotGmail Thank you for the more complete information. I have noticed annoying blocks & such showing up when I view a Windows-1252 page. It seems to be a lot less common now than it used to be.

  • @Sbrasher13
    @Sbrasher13 ปีที่แล้ว

    I had no idea so much work went into the new emoji releases

  • @bradbennett1420
    @bradbennett1420 ปีที่แล้ว +2

    New graphics are really shaping up dude. Every video is better and better

  • @tj_mora
    @tj_mora ปีที่แล้ว

    One reason why the scripts of extinct languages, like the hieroglyphics, were added is science. Archeologists, anthropologists, linguists, ethnologists, historians, etc. all need a way to digitally write whatever was written on a piece of clay, rock or papyrus that they have found. This is actually still an ongoing problem for them. Like what Unicode did with Han unification, Unicode also combined the different time periods of these ancient scripts. Ideally there should have been different sets of code points for sets of hieroglyphics that came from different periods of time, but Unicode unified those code points too. So just as Han unification means there have to be distinct Japanese, Korean or Chinese fonts to distinguish what each code point truly represents, the same became the reality for hieroglyphics and other writing systems: they need different fonts for different periods of time. What Unicode did is self-defeating. Our computers today are powerful enough to render up to millions of code points, not just the over 140,000 characters Unicode has today. Why not do away with these unifications and assign every character from every dialect and time period its own code point, so that none of us has to rely on installing distinct fonts just to display accurately what a code point represents?
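For a sense of scale, the Unicode code space is fixed at 17 planes of 65,536 code points each, so the ceiling is a little over 1.1 million rather than many millions. The arithmetic can be sketched as follows; the constant and function names are illustrative.

```python
import sys

PLANES = 17                        # planes 0 through 16
CODE_POINTS_PER_PLANE = 0x10000    # 65,536 code points per plane
SURROGATES = 0xDFFF - 0xD800 + 1   # 2,048 code points reserved for UTF-16


def code_space_size():
    """Total number of code points, U+0000 through U+10FFFF."""
    return PLANES * CODE_POINTS_PER_PLANE


def scalar_value_count():
    """Code points that can ever stand for characters (surrogates excluded)."""
    return code_space_size() - SURROGATES


if __name__ == "__main__":
    print(code_space_size())        # 1114112
    print(scalar_value_count())     # 1112064
    print(hex(sys.maxunicode))      # 0x10ffff, the highest code point
```

Even so, the roughly 140,000 characters mentioned above fill only a fraction of that space.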

  • @ChristianJiang
    @ChristianJiang ปีที่แล้ว +10

    6:53 I wouldn’t say “A” and “Á” is a good example… Maybe “A” and “𝐀”? But in a world where the second 𝐀 is the only acceptable form of A in certain countries. (I know, weird, but there are no actual examples for this.)
    Imagine being used to only seeing 𝐀 in your day-to-day life, and then suddenly it gets rendered as “A”. You’d still recognise it, but you’d think it looks distinctly foreign.
    The same goes for Han characters such as 八. The font is the only thing that can determine a character’s “regional” look. Now imagine a Chinese text displayed with a Japanese font. All the characters would stay legible to a Chinese audience, but many of them will look Japanese.

    • @kaitlyn__L
      @kaitlyn__L ปีที่แล้ว +3

      Yeah I thought a comparison to similar looking letters in Cyrillic or Greek would have been more appropriate. Like “imagine if a Japanese computer giant decided A and alpha were the same”

    • @PhilEdwardsInc
      @PhilEdwardsInc  ปีที่แล้ว +1

      yeah totally fair. I thought about that, but I wanted to get at the nationalistic/cultural element, and I felt the only way for a Western audience to vibe with that was to see a character that was clearly from another country (another being from my perspective, as an American).

    • @ChristianJiang
      @ChristianJiang ปีที่แล้ว +1

      @@PhilEdwardsInc A better example would be to have Latin A and Cyrillic А encoded as one single character! (I wonder why I didn’t think of that earlier haha!) Or Latin M and Greek Μ. And only a font would display lowercase M as either m or μ. So you’d need a Greek-specific font to display μ, or else it’d incorrectly be rendered as “m”.
      This is quite crazy if you think about it, yet Chinese and Japanese users have to deal with it on a daily basis! Not only that, but Hong Kong and Taiwan also display characters differently (and Korea as well, although Chinese characters aren’t that commonly used in Korean anymore). 真, for instance, would be displayed with the central box “detached” from the lower horizontal stroke in Japanese, whereas in Chinese it connects directly to it. The same can be said for 直. Who knows what your device displays it as! But yeah, the Japanese way of writing the character would be considered wrong in Chinese, and I believe vice versa as well… Imagine seeing that in a printed text!
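The contrast drawn in this thread can be made concrete with Python's standard `unicodedata` module: the look-alike capital A of Latin, Cyrillic and Greek each has its own code point, while 真 has a single code point whose regional shape is left to the font. This is only an illustrative sketch, and the names `LOOKALIKES` and `HAN_ZHEN` are made up.

```python
import unicodedata

# Three capital letters that usually render identically but are encoded apart.
LOOKALIKES = {
    "Latin": "\u0041",
    "Cyrillic": "\u0410",
    "Greek": "\u0391",
}

# One unified Han character: the same code point in Chinese and Japanese text.
HAN_ZHEN = "\u771f"


def describe(char):
    """Return the code point and official name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


if __name__ == "__main__":
    for script, char in LOOKALIKES.items():
        print(script, describe(char))
    print("Han", describe(HAN_ZHEN))
```

Nothing in the encoded text says whether 真 should be drawn the Chinese or the Japanese way, so that choice falls to the font or to language tagging around the text.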

  • @itryen7632
    @itryen7632 ปีที่แล้ว

    I've always been interested in this kinda stuff

  • @Doggieman1111
    @Doggieman1111 ปีที่แล้ว

    You're my new favorite channel

  • @Thebreakdownshow1
    @Thebreakdownshow1 ปีที่แล้ว +1

    Why is fire so fire indeed.

  • @forivall
    @forivall ปีที่แล้ว

    Mojibake! EBCDIC! SHIFT-JIS! Combining diacritics!
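Combining diacritics, mentioned above, are separate code points that attach to the preceding base character. Here is a small sketch with Python's standard `unicodedata` module, showing composition of a base letter with a mark and the stripping of stacked marks; the `strip_marks` helper is illustrative, not a standard function.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points, one glyph.
DECOMPOSED = "e\u0301"


def compose(text):
    """Fold base + combining sequences into precomposed characters (NFC)."""
    return unicodedata.normalize("NFC", text)


def strip_marks(text):
    """Drop every combining mark, keeping only the base characters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


if __name__ == "__main__":
    print(len(DECOMPOSED), len(compose(DECOMPOSED)))   # 2 1
    # "the" with a circumflex, a double macron below and a ring above.
    print(strip_marks("t\u0302h\u035fe\u030a"))        # the
```

Because marks can be stacked without any limit, a single visible "letter" can contain dozens of code points.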

    • @PhilEdwardsInc
      @PhilEdwardsInc  ปีที่แล้ว

      t̂h͟e̊ p̃ös͒s̈i͟b͟ỉl̈ît͟i͟ẽs͟ är̆ẽ e͟ñd͟l̈e͟s͟s̃"

  • @darkfent
    @darkfent ปีที่แล้ว

    Oof that unicode felt so mid 00s and the bro guy felt so early to mid 2010s