Writing a compiler with LLVM - Cailin Smith - NDC Oslo 2022

  • Published 28 Dec 2022
  • LLVM is a "bytecode"-like language and ecosystem that allows high-level languages to be compiled into a common representation, which can then be further compiled into native executables for the target architecture. In this talk, we will go over the basics of what LLVM is, how it works at a deeper level, and how MethodScript is planning on using it going forward, along with plenty of code examples.
    Check out more of our featured speakers and talks at
    ndcconferences.com/
    ndcoslo.com/
  • Science & Technology

Comments • 28

  • @HarshKapadia
    @HarshKapadia 1 year ago +6

    Really enjoyed the talk! Thank you! It's crazy how compilers optimize code!

  • @benjaminscherrey2479
    @benjaminscherrey2479 1 year ago +7

    Good LLVM intro - especially in showing how optimizations work, and static single assignment if one has not encountered it before.

  • @MattKnowsTech
    @MattKnowsTech 1 year ago +4

    Also, Apple's Swift programming language builds on top of LLVM :)

  • @tordjarv3802
    @tordjarv3802 1 year ago +6

    Very interesting talk; unfortunately the audio level is very low (and I have maxed it out on my machine), so I had a hard time hearing what was said.

  • @karloes0
    @karloes0 1 year ago

    Good talk. Nice that you mentioned that LLVM has a Windows linker (I think almost fully) compatible with the Visual Studio linker, so you can live without a VS installation (can be helpful if you can't install VS for some reason). Thanks for sharing the lld-link command for Windows.

  • @artist6000ish
    @artist6000ish 1 year ago +48

    I don't really understand why she's making the claim that prior to LLVM, compilers didn't have front-end and back-end subsystems, where the front-end would target generic intermediate code. This was very common in compiler technology. For instance, that's how HP's compilers worked.
    I realize that LLVM has made this accessible to the masses, but just because you didn't see it doesn't mean that's not how most proprietary language compilers were architected.
    I don't want to take away from the talk, as this point wasn't really the point of what is being discussed, but I thought a point of clarification was in order.

    • @peterfireflylund
      @peterfireflylund 1 year ago +6

      I don’t think she knew any better, unfortunately. GCC also did it.

    • @NeinStein
      @NeinStein 1 year ago +1

      Where was this statement made exactly? You can put timestamps in YouTube comments in mm:ss format.
      PS: Overall, you are very much taking away from the talk. Have you been an HP developer before? You really fail to address the gist of the topic at hand.

    • @TeslaPixel
      @TeslaPixel 1 year ago +1

      The Dragon Book even directly uses the words "front end" and "back end".

    • @banned_from_eating_cookies
      @banned_from_eating_cookies 16 days ago

      She never once said that. She made no claims about LLVM being the first to have frontends and backends. I wonder what on earth motivated you to make up something like that? Anyone who has watched the video will instantly know you have fabricated this. Shame on you.

    • @banned_from_eating_cookies
      @banned_from_eating_cookies 16 days ago

      @@peterfireflylund she never made any such claim

  • @deNudge
    @deNudge 1 year ago +3

    I always thought compiler writers first write their compiler in C and finally rewrite it in their own language once it's mature enough. But here it seems this is more or less an abstract assembly language, and not any easier. Crazy!

    • @thisisreallyme3130
      @thisisreallyme3130 8 months ago +1

      Not any easier? All progress comes at the "cost" of abstraction. By targeting LLVM, language writers don't need to concern themselves with juggling, and being experts in, x86 assembly, all the ARM chips... oh, and even WebAssembly has joined the party. Some folks even got Rust and modern C working on the venerable 6502 (LLVM-MOS). Optimizations are free (courtesy of the people more interested in CPU code than high-level languages).
      But this is all way above what I do, so I could be overlooking other, larger benefits.

    • @about2mount
      @about2mount 3 months ago

      The full LLVM libs package can be over 24 to 48 gigs in size. It's not only crazy, it's absurd and backwards. CPython has the superior and far less costly way of doing things, honestly. And CPython is an already-compiled language even before you write any program with it. And it works one-to-one with C++.
      CPython reads in your application's syntax, lexes it, then evaluates it as compiled byte code instantly, and doesn't require the lexer after the first execution. It then stores an already-byte-code representation of your application by name as a dot pyw file if no new changes are made.
      Then for production, that dot pyw byte code becomes your compiled byte code, making it an instantly executed application. And CPython only requires 101 MB of space.

  • @androth1502
    @androth1502 5 months ago

    hm. it might be a cool idea to write an 'IR assembly' language compiler - kind of a more human-readable IR assembly transpiler to LLVM IR - then maybe incorporate concepts from the HLA language. i'll have to juggle this around in my head for a bit.

  • @JohnWasinger
    @JohnWasinger 1 year ago +1

    Does the LLVM system use lex and yacc?

  • @GTGTRIK
    @GTGTRIK 11 months ago +1

    Why is the audio level so low?!

  • @Antiorganizer
    @Antiorganizer 1 year ago +3

    The JVM doesn't interpret the AST (or whatever other code). The JVM compiles to an intermediate bytecode, interprets that, and *while* running, decides how to compile to machine code. The end result is near, or sometimes even faster than, optimized C++ code.
    People need to stop associating the JVM with interpreting bytecodes.

    • @Knirin
      @Knirin 1 month ago

      The JVM per spec is only an interpreter. All JIT work is on the side. Dalvik is also the only JIT that regularly caches the JIT results.
      Is it faster than C++? Only when you amortize the load cost from the interpreter out to infinity.

    • @Antiorganizer
      @Antiorganizer 1 month ago

      @@Knirin The JVMs that have run on PCs and Android for decades now have compiled to machine code in the background. The result is top-notch performance.
      Again, it must not be seen as an interpreter, because it simply isn't.
      And because the compilation happens at runtime, it can instrument what and how it can best optimize. This is why some Java code can run faster than C++ code.
      However, Java uses high-overhead objects a lot, so C++ is more often a tad faster, but not by that much; and also, C++ does not always win.

    • @Knirin
      @Knirin 1 month ago

      @@Antiorganizer I don’t care about the performance of a hot loop or some special benchmark. While they aren’t perfect examples, Minecraft Java Edition and Minecraft Bedrock Edition make decent comparisons. I can find places where both games have identical frame rates. However, generally speaking, Bedrock will load maps and jump into menus and the inventory faster than Java Edition will.
      The JIT isn’t a silver bullet. It costs memory and time to do the compilation. None of the OpenJDK-based runtimes store the generated native code anywhere but RAM. If you close the program, all of the work done by the JIT compiler is thrown away. Dalvik, Android’s JVM implementation, does keep the native code around in a cache.
      Another point: none of the OpenJDK-based runtimes JIT the entire application. Given some design choices in the JVM, a full application JIT or ahead-of-time compilation largely nerfs the introspection and runtime code generation capabilities used by many frameworks.
      The tl;dr is that Java was designed to run on an interpreted virtual machine. Generics, annotations, introspection, runtime code generation, and other language features depend on it staying interpreted. While JIT compilation can speed up individual methods, its application to the bigger memory layout and runtime method dispatching slowdowns is limited and will likely remain so.

    • @Antiorganizer
      @Antiorganizer 1 month ago

      @@Knirin When re-implementing a game, of course, once one has the know-how of the specifics of the algorithms, one is going to do a much better and smarter job.
      It makes no sense at all to compare the performance of an old code base (that might have been worked on the entire time, even) and a newly implemented one.
      If you do that, then you'd be cherry-picking to achieve a desired perception, and that would be disingenuous.
      The JIT is a beauty. Much of the code ends up being more efficient than precompiled C++ code.
      Memory is cheap.
      Runtime compilation is quick and non-intrusive.
      It's also able to focus JIT action where it's needed the most.
      Many benchmarks play it unfairly and don't give the JIT mechanisms the chance to re-compile. For example, writing a quick "main", running a bunch of algorithms inside a method, and then timing that does not show how fast it can be.
      When you run the method with the algorithm twice, you can already see that the second run is so much faster than the first.
      Some people have a personal hate and vendetta against Java and intentionally try to make it look bad by cherry-picking comparisons. There was a very big benchmarking effort in the past that intentionally did not allow JVM warmup (as it's called). The guy running that benchmark site had a personal hate and got a kick out of making Java look less good. I personally fought that f-er many times, but he kept insisting he was right because he could not handle losing the argument.
      Anyway, the JVM should not be seen as an interpreter. It's mostly a dynamic compilation process.
      Heck, even the very initial run before JITing is already JITed, technically. I don't think it actually "interprets" JVM bytecodes at all. It's already converted to an intermediate form upon loading.

  • @kaos092
    @kaos092 11 months ago

    Why is she talking about the IR like she's making edits to it directly? You're supposed to make the edits in the source files. Show us that. Are you going to fix the bugs in the IR every time you compile it?

  • @AndrewTSq
    @AndrewTSq 1 year ago +4

    ngl, that MethodScript looks like assembler compared to JavaScript. Interesting talk anyway. Only worry is that Microsoft is interested in it; that means it'll probably be as bad as other Microsoft initiatives in the coding world. edit: sorry, I can't believe kids think that MethodScript is easy... it looks very, very, very complicated for something that should be simple.

    • @kanoalgiz
      @kanoalgiz 1 year ago +1

      Can answer as a person for whom MethodScript was an invaluable experience years ago and motivated me to learn many programming concepts from scratch (I'm a middle Java dev, 4 years on enterprise projects now, so I cannot be more grateful to Cailin 😁)
      The example given in the video really doesn't show the actual simplicity - but newbies usually do not start with developing functionality that utilizes game event bindings or sending sms 😄
      Really, it feels a lot like js, but without its... quirks =) If I recall correctly, specifying variable types explicitly is a new (and optional) thing; declaring a variable before using it was not required either; control flow structures are pretty standard; no specific data structures (arrays are for everything); zero boilerplate for simple programs... but what I think made it perfect for a total newbie - back then MethodScript had no OOP stuff whatsoever, as well as no member functions for types - everything is a procedure. You cannot call 'length' on an array or a string - you write "length(@someString)"; if you want to use an array as a stack - you just write "array_push(array, value)", so everything works exactly the same. Maybe not too pretty, but extremely straightforward. There is much more in the language, but a beginner is not forced in any way to use or understand features like closures or exception handling. You can just enrich your code when you're ready.
      But that would not work without the marvelous documentation for the language - with examples for almost every procedure and very beginner-friendly articles on its website - definitely made with love =)
      And yeah, the fact that it is the most powerful tool to write plugins for your Minecraft server if you don't know Java should greatly motivate learning.

  • @robertbarta2793
    @robertbarta2793 1 year ago

    If I have 50 minutes, then I do not waste 6 of them at the start on trivia/irrelevant remarks. Otherwise, important material.