Intrinsics: Low-level engine development with Burst - Unite Copenhagen

  • Published on 11 Jul 2024
  • This session addresses how we are expanding the scope of the Burst Compiler to enable even the most demanding, hand-coded engine and gameplay problems to be expressed in HPC# via direct CPU intrinsics. Andreas shares the reasoning and use cases; as well as discussing implementation challenges, debugging, and performance along with comparisons to C++ code.
    Speaker: Andreas Fredriksson - Unity
    Slides available here: www.slideshare.net/unity3d/in...
  • Games

Comments • 17

  • @Valentyn90A 4 years ago +3

    My man! Your talk is amazing and EVERYTHING you said was SO EASY to follow.

  • @CBaggers 4 years ago

    So happy to see the way this is going. Good luck.

  • @Valentyn90A 4 years ago +1

    The best. On point, super informative. I'm a fan!

  • @armpap1 4 years ago +6

    There is a huge difference between not having the option and never needing it, and having the option even if you don't need it. That will be the case for 95% of programmers who work with Unity, but this WILL eventually affect all 100% in one way or another. In the future, these are the things that will appear because of all this low-level stuff: 1) Better engine code, 2) Blazing fast editor, 3) More quality assets that you can just plug into your code, and the most important of all IMO, 4) Lowering the barrier to entry for AAA (at least on the programming side).
    And I mean, performance is like money: you can never have enough of it. If your game runs at 60 FPS and you don't need the performance, you can still use it to add something juicy or help artists with some procedural stuff, etc. You can never have enough performance.

    • @asiseverything3404 4 years ago

      well put

    • @tatoforever 4 years ago

      Indeed, if you are in the game industry, the only thing that matters is performance. Programmers who care about speed will sacrifice almost everything (including code clarity) to squeeze as much as they can out of every bit of the hardware! I'm one of those! :D

  • @MarZandvliet 4 years ago +5

    Whoa, intrinsics!! I've been coding up a Burst-backed fixed point library, and while I've been able to massage the code paths to increasingly fit the Burst/LLVM autovectorization patterns, at times I've longed to just write the code I want it to be. About a month ago I was like: well shoot, maybe I should write this in Rust (with its intrinsics library) after all, and use it as a plugin. But with this addition to the Burst repertoire I might actually be able to express all the computations properly. Very nice, will wait patiently (impatiently) for the first release.

    • @ronakmachchhar3054 4 months ago

      Did you manage to complete it? I'm currently using a 48.16 precision fixed-point math library, but I take a hit since execution is serial.

  • @tatoforever 4 years ago

    I've been programming at the semi low level (well not really low level but in between) and writing shaders for a long time. As soon as I read low level I got excited!

  • @michalkracik1473 4 years ago +1

    It's great to write wide SIMD code like this, but then don't we also have to maintain a scalar version of all the code for cases when the number of items is less than 4 or not a multiple of 4?
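One common answer to the question above is to keep the wide loop for whole groups of four and finish the leftovers with a short scalar loop; padding the data to a multiple of four is another option. The following is a minimal sketch in plain C rather than the talk's HPC#. `add_delta` and its parameters are hypothetical names, and the unrolled loop body only stands in for what would be a single 4-wide SIMD instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: add `delta` to every element of `values`. */
void add_delta(float *values, size_t count, float delta)
{
    assert(values != NULL || count == 0);

    size_t i = 0;
    size_t wide_end = count - (count % 4); /* largest multiple of 4 <= count */

    /* Main loop: whole groups of four. Each iteration stands in for one
       4-wide SIMD add; a real version would use an intrinsic here. */
    for (; i < wide_end; i += 4) {
        values[i + 0] += delta;
        values[i + 1] += delta;
        values[i + 2] += delta;
        values[i + 3] += delta;
    }

    /* Scalar tail: at most three leftover elements. */
    for (; i < count; ++i) {
        values[i] += delta;
    }
}
```

The tail loop is the only extra scalar code in this sketch. If the array is padded to a multiple of four when it is built, the tail can be dropped at the cost of a few wasted lanes.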

  • @march4369 4 years ago

    Hi, I understand the overall concept of SIMD operations and the optimization they bring. I might even consider the examples given to be within my reach. But what I fail to grasp is this: when I design a level, I create my doors individually. My typical job system will take care of looping through all of them. Am I supposed to prepare my doors into groups before my job somehow?

    • @danielkandersen6599 2 years ago

      It's more that you store the doors in a way that lets you retrieve them in groups, i.e. use the property that NativeArrays store elements linearly so you can ask for four elements at a time :)
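The idea in the reply above can be sketched as one contiguous array per door field, read four entries at a time. This is an illustrative layout in plain C, not the code from the talk: the `Doors` struct, `mark_doors_in_radius`, and the assumption that the door count is padded to a multiple of four are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: one contiguous array per field
   instead of one struct per door. */
typedef struct {
    const float *x;  /* door x positions */
    const float *y;  /* door y positions */
    size_t count;    /* assumed padded to a multiple of 4 */
} Doors;

/* Sets should_open[i] for every door within `radius` of (px, py). */
void mark_doors_in_radius(const Doors *doors, float px, float py,
                          float radius, unsigned char *should_open)
{
    assert(doors->count % 4 == 0);
    float r2 = radius * radius;

    /* Outer loop walks the arrays one group of four doors at a time. */
    for (size_t base = 0; base < doors->count; base += 4) {
        /* Inner loop stands in for the four lanes of a SIMD register. */
        for (size_t lane = 0; lane < 4; ++lane) {
            size_t i = base + lane;
            float dx = doors->x[i] - px;
            float dy = doors->y[i] - py;
            should_open[i] |= (unsigned char)(dx * dx + dy * dy <= r2);
        }
    }
}
```

Doors can still be authored individually in the level. The grouping happens when their fields are copied into the arrays, not at design time.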

  • @Mr_Yeah 4 years ago

    @30:54
    Is there a difference between `shouldOpen |= (inRadius & teamMatches) ? true : false;` and `shouldOpen |= inRadius & teamMatches;`?

    • @Valentyn90A 4 years ago +1

      Of course! A huge difference! In the first case a ternary operator is used. A ternary is actually SLOWER than an if/else and creates a branch, which is slow, just as he said. In the second case we simply set bytes with a logical AND comparison, which is the fastest possible solution.
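In terms of the value computed, the two expressions in the question agree for every input, assuming `inRadius`, `teamMatches`, and `shouldOpen` are plain booleans. The sketch below checks this exhaustively in C rather than C#; the function names are hypothetical. It says nothing about generated code: whether a compiler emits a branch for the ternary depends on the compiler and its settings, so the assembly output is the place to verify that.

```c
#include <assert.h>
#include <stdbool.h>

/* Form 1 from the question: result passed through a ternary. */
bool open_with_ternary(bool should_open, bool in_radius, bool team_matches)
{
    should_open |= (in_radius & team_matches) ? true : false;
    return should_open;
}

/* Form 2 from the question: result OR-ed in directly. */
bool open_without_ternary(bool should_open, bool in_radius, bool team_matches)
{
    should_open |= in_radius & team_matches;
    return should_open;
}

/* Tries all 8 input combinations and counts disagreements. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int bits = 0; bits < 8; ++bits) {
        bool s = (bits & 1) != 0;
        bool r = (bits & 2) != 0;
        bool t = (bits & 4) != 0;
        if (open_with_ternary(s, r, t) != open_without_ternary(s, r, t)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```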

  • @people_fly13 4 years ago

    Any news here?

  • @Valentyn90A 4 years ago

    33:54 replace j * 4 with j << 2

    • @petrusion2827 3 years ago +5

      I know this is a year-old comment, but you shouldn't do that. Multiplying integers by powers of two will always get compiled as bit shifting, and I'm not even just speaking about the Burst Compiler. Any non-crappy compiler, including any JIT or JVM, will do that for you. By writing it explicitly you are only confusing yourself. You should use bitwise operators only when you actually need to do bit operations, not to act as a compiler.
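The arithmetic behind this reply is that multiplying an unsigned integer by 4 and shifting it left by 2 give the same result, including when the value wraps around. A small sketch in C, with hypothetical function names, that states both forms side by side:

```c
#include <assert.h>
#include <stdint.h>

/* Readable form: the intent (four items per group) is visible. */
uint32_t index_mul(uint32_t j)
{
    return j * 4u;
}

/* Hand-written shift: same value, less obvious intent. */
uint32_t index_shift(uint32_t j)
{
    return j << 2;
}
```

Since the values are identical, the choice comes down to readability. Which instruction is actually emitted is up to the compiler and can be confirmed in its assembly output.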