Plain Text • Dylan Beattie • GOTO 2023

  • Published on 21 Nov 2024

Comments • 67

  • @liorbunzl
    @liorbunzl 8 months ago +14

    Dylan, every lecture you give is a masterpiece. We're not worthy.

  • @SpiritmanProductions
    @SpiritmanProductions 8 months ago +5

    It's worth mentioning that .NET does indeed use two bytes per character, but only for characters in the Basic Multilingual Plane. It supports characters outside the BMP with surrogate pairs, a variable-length scheme similar in spirit to UTF-8. A character like 😊 requires four bytes to store instead of two.
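
A minimal Python sketch of the surrogate-pair point above. It uses Python's built-in `utf-16-le` codec as a stand-in for the in-memory layout of .NET strings (an assumption for illustration; no .NET API is called), and the helper name `utf16_code_units` is hypothetical.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the 16-bit code units that UTF-16 uses to store `text`."""
    data = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


bmp_char = "A"         # inside the Basic Multilingual Plane
smiley = "\U0001F60A"  # the smiling face from the comment, outside the BMP

# One code unit (2 bytes) for the BMP character.
print([hex(u) for u in utf16_code_units(bmp_char)])
# Two code units (4 bytes) for the emoji: a high and a low surrogate.
print([hex(u) for u in utf16_code_units(smiley)])
```

The high surrogate falls in the range 0xD800 to 0xDBFF and the low surrogate in 0xDC00 to 0xDFFF, which is how a decoder can tell a pair from two ordinary characters.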

  • @bilguunbo
    @bilguunbo 1 year ago +37

    Was genuinely impressed by the simplicity and complexity

  • @andreydeev4342
    @andreydeev4342 8 months ago +2

    Great talk! Шикарный доклад! Շատ լա՜վն ելույթն ա

  • @SumanthVepa
    @SumanthVepa 1 year ago +15

    Absolutely a must watch for any programmer that needs to deal with strings! Fantastic!

    • @Lemmy4555
      @Lemmy4555 1 year ago +2

      what about dates and timezones

    • @SaHaRaSquad
      @SaHaRaSquad 1 year ago

      @Lemmy4555 For the sake of your sanity I'd recommend built-in functions & existing libraries for dates and timezones. And for Unicode grapheme clusters.

    • @AlisherU262
      @AlisherU262 6 months ago +1

      @Lemmy4555
      *Nails scratching on chalkboard*
      We don’t talk about those two.

  • @stefanlagrange188
    @stefanlagrange188 1 month ago +1

    Great talk once again! Working at a company where the core processing still runs on an IBM mainframe (EBCDIC encoding), websites that use UTF8 and support an additional language that is not English, we've had some of these issues before..! 😁
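
A small Python sketch of the kind of mismatch described above. The code page is an assumption: `cp037` is one EBCDIC variant that ships with Python's standard codecs, and the mainframe in the comment may well use a different one. The sample word is also just an illustration.

```python
text = "Grüße"  # hypothetical sample containing non-English letters

utf8_bytes = text.encode("utf-8")
ebcdic_bytes = text.encode("cp037")  # assumption: EBCDIC code page 037

print(utf8_bytes.hex(" "))    # 7 bytes: ü and ß take two bytes each in UTF-8
print(ebcdic_bytes.hex(" "))  # 5 bytes: one byte per character in EBCDIC

# Decoding with the wrong codec does not raise an error.
# It quietly produces different characters instead.
garbled = ebcdic_bytes.decode("latin-1")
print(repr(garbled))

# Decoding with the matching codec recovers the original text.
restored = ebcdic_bytes.decode("cp037")
print(restored)
```

The point of the sketch is that the bytes alone do not say which encoding they are in; both sides of the interface have to agree on it out of band.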

  • @GG-uz8us
    @GG-uz8us 1 year ago +39

    The Harry Potter email story is very impressive, even the people working at the post office understand encoding.

    • @Joetorres3
      @Joetorres3 1 year ago +4

      My guess is that they had trouble with commercial software in the post office all the time

  • @kalmarnagyandras
    @kalmarnagyandras 1 year ago +16

    Laughed all the way through, very informative and entertaining. Greetings from the land of Ő and Ű ;)

  • @RomainBertrand23
    @RomainBertrand23 3 months ago +1

    can't believe 43 minutes passed, thanks Dylan for those awesome 43 minutes with you :)

  • @nefrace
    @nefrace 1 year ago +3

    And only after watching this do I have some understanding of how UTF-8 works. Thank you!

  • @jupiter909
    @jupiter909 1 year ago +4

    Very nice talk! Have been down the rabbit hole many times with various encodings. Will give all new starters this video to watch as a primer for how crazy the landscape is :D

  • @manueldippold5124
    @manueldippold5124 6 months ago +2

    I stared at the options for ordering the city names for several minutes, 'cause we don't have a definite rule either on whether, e.g., Ö comes right after O or whether Ä, Ö and Ü are all just crammed in after Z.
    My German gut said Ö is a type of O. So Österreich comes after Ostern but before Zürich. :D

  • @wrjacqmein
    @wrjacqmein 1 year ago +43

    "Politics creates the problems technology tries to solve" - Dylan Beattie

  • @jukkanikki3395
    @jukkanikki3395 1 year ago +4

    Hyvää Syntymäpäivää! ;) And thanks for great talk..

    • @agnishom
      @agnishom 1 year ago +1

      Torilla Tavataan!

  • @kousheralam8657
    @kousheralam8657 9 months ago +1

    WoW Impressive !!! অসাধারণ ।

  • @facundoramallo64
    @facundoramallo64 1 year ago +7

    A lot of stuff I didn’t know! Great talk!

  • @Fanatic17
    @Fanatic17 5 months ago +1

    This was extremely interesting and entertaining

  • @mikdore2522
    @mikdore2522 1 year ago +1

    Excellent talk!

  • @gdargdar91
    @gdargdar91 1 year ago +10

    41:00 You don’t drive cars out of the Soviet Union, only tanks.

  • @slalomsk8er397
    @slalomsk8er397 1 year ago +5

    Oh, I know the Chinese problem, as I got it a lot while using copy-paste from Linux to Windows over Synergy. Pasting without formatting helped ;)

  • @debbh274
    @debbh274 1 year ago +4

    Very nice talk.

  • @rfvtgbzhn
    @rfvtgbzhn 11 months ago +1

    20:24 Even in German-speaking countries this is not always handled the same way: some of the dictionaries, encyclopediae, phone books, etc. in these countries treat Ä, Ö and Ü just like A, O and U, others like AE, OE and UE (which is where these letters come from historically), and some put them at the end of the alphabet.

    • @FindecanorNotGmail
      @FindecanorNotGmail 8 months ago

      I suppose some dictionaries also (correctly) would distinguish diaeresis from umlaut, and sort them differently.
      (Which you can't do with just Unicode and a Locale string: you'd need a proper dictionary with the words in them)

    • @rfvtgbzhn
      @rfvtgbzhn 8 months ago

      @FindecanorNotGmail Diaeresis is hardly used in German. I know it only from surnames like Groër (a former Austrian cardinal).
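
The three conventions described in this thread can be sketched as three different sort keys. This is a toy Python illustration only: the function names and the word list are made up for the example, it handles nothing beyond the German umlauts, and real applications would normally rely on a proper collation library rather than hand-written keys.

```python
import unicodedata


def key_base_letter(word: str) -> str:
    """Treat Ä, Ö, Ü like A, O, U by stripping combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# Expansion table for the AE/OE/UE convention.
EXPAND = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue",
                        "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"})


def key_expanded(word: str) -> str:
    """Treat Ä, Ö, Ü like AE, OE, UE."""
    return unicodedata.normalize("NFC", word).translate(EXPAND).casefold()


ALPHABET_UMLAUTS_LAST = "abcdefghijklmnopqrstuvwxyzäöü"


def key_after_z(word: str) -> list[int]:
    """Put Ä, Ö, Ü at the end of the alphabet, after Z."""
    folded = unicodedata.normalize("NFC", word).casefold()
    # Characters outside the toy alphabet sort after everything else.
    return [ALPHABET_UMLAUTS_LAST.find(c) if c in ALPHABET_UMLAUTS_LAST
            else len(ALPHABET_UMLAUTS_LAST) for c in folded]


words = ["Zürich", "Österreich", "Ostern", "Oder"]

as_base = sorted(words, key=key_base_letter)
as_expanded = sorted(words, key=key_expanded)
as_last = sorted(words, key=key_after_z)

print(as_base)
print(as_expanded)
print(as_last)
```

Each key gives a different, internally consistent order for the same four words, which is why a sort needs to know the intended convention and not just the characters.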

  • @curious968
    @curious968 1 year ago +3

    In some languages, such as Swedish, "ae" is a letter, not a ligature.

    • @linco95
      @linco95 1 year ago +2

      Æ would be Norwegian or Danish. In Sweden it's Ä.

  • @LyleSeaman-x8i
    @LyleSeaman-x8i 1 month ago

    you can't indent with vertical tabs. when you type vertical tab on a TTY, the page advances by "a bunch"

  • @renemarot544
    @renemarot544 1 year ago +1

    Hi and thanks for this nice dive in the matter. A bit disappointed you didn't mention EBCDIC :-) must be because it was a microcomputer oriented talk but thanks again anyway.

    • @AbAb-th5qe
      @AbAb-th5qe 1 year ago

      Yeah, the speaker didn't mention why 8 bits were available in the first place. Or which company invented the concept of a codepage either.

  • @mattsadventureswithart5764
    @mattsadventureswithart5764 1 year ago +13

    I love that gay pirates are winning, and I happen to be straight.

    • @FindecanorNotGmail
      @FindecanorNotGmail 8 months ago

      Does Windows support the Ninja emoji? 🥷🆚🏴‍☠ → 🏁 ?

  • @bronkolie
    @bronkolie 3 months ago

    16:02 If you live in IJmuiden, Steam will remember your address as Ijmuiden, which looks weird and I need to correct it every time. I guess even in Unicode you don't escape the Anglocentrism.
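
A short Python sketch of how a locale-unaware capitalisation routine produces "Ijmuiden". Nothing here is known about what Steam actually does; `str.capitalize()` simply stands in for any generic "uppercase the first letter, lowercase the rest" step, and `capitalize_dutch` is a hypothetical helper written for this example.

```python
def capitalize_dutch(name: str) -> str:
    """Capitalise a single word, keeping the Dutch IJ digraph together."""
    lowered = name.lower()
    if lowered.startswith("ij"):
        # In Dutch, both letters of an initial IJ are capitalised.
        return "IJ" + lowered[2:]
    return lowered[:1].upper() + lowered[1:]


# The generic routine lowercases the J, even when the input was correct.
print("IJmuiden".capitalize())

# The Dutch-aware helper keeps the digraph intact.
print(capitalize_dutch("ijmuiden"))
print(capitalize_dutch("amsterdam"))
```

The helper covers only word-initial IJ in a single word, so it is a demonstration of the rule rather than a general Dutch title-casing function.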

  • @blacklion79
    @blacklion79 8 months ago +1

    CP/M worked on microcomputers, not minicomputers!

  • @Nik930714
    @Nik930714 1 year ago

    14:40 As a Bulgarian, yes, that used to be a thing, yes we hated it, yes I hate being reminded that that was a thing. Thank god we don't have to deal with that BS anymore.

  • @alexandershendi7428
    @alexandershendi7428 10 months ago

    Ironically the Net uses NETASCII with CR LF line terminators ;)

  • @rfvtgbzhn
    @rfvtgbzhn 11 months ago

    8:49 strange that they included both kinds of phi, but no psi. Psi is used a lot in physics, even in high school physics.

  • @mrmimeisfunny
    @mrmimeisfunny 1 year ago +1

    5:16 Minor nitpick: you said there were a lot of 4-bit microprocessors when ASCII was designed. ASCII was first published in 1963, and the first 4-bit microprocessor was introduced in 1971.

    • @SaHaRaSquad
      @SaHaRaSquad 1 year ago +2

      No, he said it's "fast even on a 4-bit microprocessor". And bit masking was likely a thing long before that.

    • @mrmimeisfunny
      @mrmimeisfunny 4 months ago

      @SaHaRaSquad He said immediately that there were 4-bit microprocessors when ASCII was designed.
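
For readers wondering what the bit masking mentioned in this thread looks like, here is a hedged Python sketch of two well-known properties of the ASCII layout. It does not settle what the speaker said at 5:16; the function names are invented for the example.

```python
CASE_BIT = 0x20  # 'A' is 0x41 and 'a' is 0x61: they differ only in this bit


def toggle_ascii_case(ch: str) -> str:
    """Flip the case of one ASCII letter by flipping a single bit."""
    if not (ch.isascii() and ch.isalpha()):
        return ch  # leave digits, punctuation and non-ASCII untouched
    return chr(ord(ch) ^ CASE_BIT)


def ascii_digit_value(ch: str) -> int:
    """Read the numeric value of an ASCII digit from its low four bits."""
    if not ("0" <= ch <= "9"):
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) & 0x0F  # '0' is 0x30 ... '9' is 0x39


print(toggle_ascii_case("a"), toggle_ascii_case("Q"))
print(ascii_digit_value("7"))
```

Both operations need only a single XOR or AND on the character code, with no lookup table, which is the sense in which they are cheap on very small hardware.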

  • @goldnutter412
    @goldnutter412 1 year ago +5

    38:06 🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣

    • @smayds
      @smayds 1 year ago +3

      My take on 🏳️‍🌈🏴‍☠️🏁 is "Gay Pirate Racing" because I sure as heck want to watch that on weekends!

  • @dudedavid522
    @dudedavid522 3 months ago

    Please leave politics out of plain text. Sincerely, guy who'd watch more hours of this enigma wrapped in an anecdote wrapped in a vest ❤

  • @ben.alldridge
    @ben.alldridge 8 months ago

    Dylan is essentially hbomberguy a bit older with more hair.

  • @SRG-Learn-Code
    @SRG-Learn-Code 1 year ago

    So..., what should I use? Is there a real fix for this madness?

    • @koljatm8987
      @koljatm8987 1 year ago +7

      UTF-8 helps a lot

  • @vitezslavackermannferko7163
    @vitezslavackermannferko7163 11 months ago

    Damn, I think I know this guy 🤔, can somebody please remind me what he is known for?

  • @wyleong4326
    @wyleong4326 11 months ago

    0:25 No one laughed? That was pretty funny...

  • @rfvtgbzhn
    @rfvtgbzhn 11 months ago +2

    Windows now supports 8 flags, but still no real national flag: 🏁🚩🎌🏴🏳🏳‍🌈🏳‍⚧🏴‍☠
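
A brief Python sketch of why these flags behave differently from national ones at the encoding level. National flags are encoded as a pair of regional indicator symbols, while a flag such as the pirate flag is a zero-width-joiner sequence; whether either is drawn as a single flag is up to the platform's font. The helper names and the "GB" example are illustrative assumptions.

```python
import unicodedata

REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
ZWJ = "\u200d"                  # ZERO WIDTH JOINER


def national_flag(country_code: str) -> str:
    """Build a flag sequence from a two-letter country code such as 'GB'."""
    code = country_code.upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected two ASCII letters, got {country_code!r}")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


def describe(sequence: str) -> list[str]:
    """List the code points that make up one visible emoji."""
    return [f"U+{ord(c):04X} {unicodedata.name(c, '<unnamed>')}" for c in sequence]


# Waving black flag + ZWJ + skull and crossbones, as in the comment above.
pirate_flag = "\U0001F3F4" + ZWJ + "\u2620"

for line in describe(national_flag("GB")):
    print(line)
for line in describe(pirate_flag):
    print(line)
```

On a system whose emoji font lacks national flag glyphs, the first sequence typically shows up as two boxed letters, even though the text itself is valid Unicode.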

  • @klumpeet
    @klumpeet 1 year ago +4

    Please keep politics in software.

  • @cbecht
    @cbecht 1 year ago +30

    Leave politics out of software. Thank you.

    • @RoamingAdhocrat
      @RoamingAdhocrat 1 year ago +19

      oh my sweet summer child

    • @RoamingAdhocrat
      @RoamingAdhocrat 1 year ago +14

      Poe's Law dictates no-one can tell if you're joking

    • @masheroz
      @masheroz 1 year ago +5

      @RoamingAdhocrat Did you watch the presentation? Specifically, the last minute?

    • @AbdallahTeach
      @AbdallahTeach 1 year ago +1

      @masheroz must be a John!

    • @funkmedaddy
      @funkmedaddy 1 year ago

      John u're an idiot

  • @unpronouncable2442
    @unpronouncable2442 3 months ago

    please leave politics out of software

  • @ZapOKill
    @ZapOKill 1 year ago +1

    ♲ a recycled talk of in-cohesive random facts ♲

    • @ThisIsAGoodUserNameToo
      @ThisIsAGoodUserNameToo 1 year ago +16

      A polished talk of historical artifacts

    • @FindecanorNotGmail
      @FindecanorNotGmail 8 months ago +1

      He has done the same talk multiple times, yes. You don't put down this amount of work to just do it once.
      IMHO, this talk is only scratching the surface.