CppCon 2019: Matt Godbolt “Path Tracing Three Ways: A Study of C++ Style”

แชร์
ฝัง
  • เผยแพร่เมื่อ 3 ม.ค. 2025

ความคิดเห็น • 56

  • @darkwinter6028
    @darkwinter6028 5 ปีที่แล้ว +88

    So YOU’RE the one responsible! I use your website ALL the time; especially when doing embedded stuff. Thanks! 👍😀

  • @youdonotknowmyname9663
    @youdonotknowmyname9663 3 ปีที่แล้ว +26

    Wow, this guy ...
    Just casually writing a path-tracing application in about 100 lines of code ....
    Awsome!

    • @CppCon
      @CppCon  3 ปีที่แล้ว +5

      Glad you like it!

  • @ChiliTomatoNoodle
    @ChiliTomatoNoodle 5 ปีที่แล้ว +40

    Great discussion! Rare to see talks like this that don't just devolve into a polemic.

  • @alex_d_lee
    @alex_d_lee 5 หลายเดือนก่อน +1

    Out of all the cool points made in this talk, I think the most important one is how important actually testing your performance is. Just like Godbolt, I've been proven wrong so many times by benchmarks when I thought something I was doing was obviously going to be faster.

  • @LogicEu
    @LogicEu 3 ปีที่แล้ว +9

    Path tracing on its own is already pretty interesting, but this talk is on anther level.

  • @DrGeoxion
    @DrGeoxion 5 ปีที่แล้ว +30

    I really like Matt! The enthusiasm is contagious 😅

  • @narnbrez
    @narnbrez 4 ปีที่แล้ว +64

    You mean Godbolt isn't a catchy name the dev threw on the tool? lol I had no idea. Great name

    • @MattGodbolt
      @MattGodbolt 3 ปีที่แล้ว +23

      I know, right?!

    • @Evan490BC
      @Evan490BC 3 ปีที่แล้ว +3

      @@MattGodbolt It's like in the old Rowan Atkinson sketch, "The School Master", but in reverse 😉

  • @nazavode
    @nazavode 5 ปีที่แล้ว +24

    One of the most fun and enjoyable talks I've seen recently. Actually using the language to do real stuff is something I was missing in these kind of conference talks. This should have taken twice as much time tho, looking forward to part II!

  • @parnmatt
    @parnmatt 5 ปีที่แล้ว +7

    watched the streams as Matt was coding this; fantastic to finally see the talk!

  • @femloh
    @femloh 4 ปีที่แล้ว +2

    Goodness... Just that change alone! Huge increase! This was a really great talk. Learnt a lot.

  • @NoNameAtAll2
    @NoNameAtAll2 5 ปีที่แล้ว +25

    Godbolt is always so exciting to listen to
    I wonder if "likely/unlikely" advice to compiler could be used to make it (and not programmer) think about branch prediction

    • @notnullnotvoid
      @notnullnotvoid 5 ปีที่แล้ว +4

      At the point where you're thinking about likely vs unlikely, you're really already thinking about branch prediction. If you want the compiler to know without having to think about it, that's where profile-guided optimization comes in. Unfortunately the tools for that in practice are pretty scant.

    • @ChristianBrugger
      @ChristianBrugger 2 ปีที่แล้ว

      In the original version this wouldn't have helped, as it was a 50/50 chance for each of the branches. Only if as you say it would have made the programmer rethink it.

  • @michaelthompson7217
    @michaelthompson7217 5 ปีที่แล้ว +2

    His talks are always solid

  • @LostieTrekieTechie
    @LostieTrekieTechie 5 ปีที่แล้ว +2

    Always a great talk when Goldbolt takes the stage!

  • @slipcurve1410
    @slipcurve1410 5 ปีที่แล้ว +3

    I love compiler explorer

  • @TOMiX1024
    @TOMiX1024 3 ปีที่แล้ว +2

    54:00 that is actually what Mitsuba 2 does, i.e. `struct { float x[N], float y[N], float z[N] };`.

  • @Meow_YT
    @Meow_YT 5 ปีที่แล้ว +1

    Really interesting and nice dive into the CPU.

  • @dmitriykarbovskiy1790
    @dmitriykarbovskiy1790 ปีที่แล้ว +3

    Data oriented design is not only more optimal, it's *optimiseable*. It's trivial to add SIMD and run it in multiple threads and gain multiple times the performance. You can't really scale OOP code. Also, this example is really tiny. It's computation-heavy, not memory-heavy, not memory-bound. It does not really show the difference on that tiny scale when everything fits in the cache anyway.

  • @axelwimmer4601
    @axelwimmer4601 5 ปีที่แล้ว +1

    Absolutely awesome!

  • @aaardvaaark
    @aaardvaaark 5 ปีที่แล้ว +1

    Really enjoyed this talk, thanks.

  • @zvxcvxcz
    @zvxcvxcz 4 ปีที่แล้ว +13

    38:45 It's simple, test functions instead of objects. This data based approach is definitely the one that comes most naturally to me. Adding other objects, well the difficulty would mainly be in intersection... and actually you should probably break most other shapes into triangles and then do that calculation. I don't think it actually works out to be harder to change if you give a bit of thought to how you would change it before you write it. I don't think difficult adaptability is intrinsic to the approach.
    42:50 Hmm, some issue with the triangle intersection code? The spheres were fine.
    A few minutes later: Lol, branch missing, heh.

  • @hamesjetfield0
    @hamesjetfield0 5 ปีที่แล้ว +4

    I thought the OOP version was fairly functional (maybe not aesthetically but from a conceptual point of view). Most functions I saw were pure in terms of this and params. I'd say that's pretty functional). Most mutations seemed to be local.
    And the virtual inheritance part is just a character of C++ rather than OOP. In a more functional language (with typeclasses or even Rust traits), you'd be able to do the virtual stuff without inheritance so I don't see virtual dispatch as a purely OOP concept.

    • @darkengine5931
      @darkengine5931 4 ปีที่แล้ว +5

      It's a path tracer though. A path tracer barely has to deal with mutable state beyond the image it outputs at the end. It spends almost the entire time as well as 99% or so of the code purely reading data in an immutable fashion. So I actually think it's a poor example of highlighting differences in these styles, especially from a code and maintenance perspective. Particularly with a small and simple path tracer like this, you could probably even write it very procedural and just use a dozen global variables and the end result would still be quite easy to maintain (maybe even easiest to maintain). There just isn't much mutable program state at all.
      Where the code would vary so much more between these while still keeping the project reasonably small would be something like a text editor or CAD software where the user can edit all sorts of stuff, undo/redo operations, save and load, etc. For example, the object-oriented approach to the undo system might use the Command pattern with a whole bunch of command objects that provide virtual undo/redo methods, the functional version immutable/persistent data structures, and the DOD style maybe some very efficient diff/delta system over the data (maybe with hashes and memcmps over memory blocks) to store changesets.

  • @OskarSigvardsson
    @OskarSigvardsson 5 ปีที่แล้ว +4

    Excellent talk! Is there a link to the github page?
    On the final thing: I've hit that same || vs. | issue in a couple of hot loops, though never with this spectacular difference. "Avoid branches in hot loops" is very good advice, even short-circuited ones (I might have even replaced the "if (dist < minDist) minDist = dist" with a call to min function). I don't think the "extra work" is a concern here at all. The few extra comparisons will be pipelined and auto-vectorized up the wazoo.

    • @SuperLugol
      @SuperLugol 5 ปีที่แล้ว +2

      github.com/mattgodbolt/pt-three-ways

    • @MattGodbolt
      @MattGodbolt 5 ปีที่แล้ว +6

      The code is on GitHub at github.com/mattgodbolt/pt-three-ways

  • @defense200x
    @defense200x 5 ปีที่แล้ว

    Very interesting talk!

  • @younghsiang2509
    @younghsiang2509 2 ปีที่แล้ว +1

    Does anyone know the tool that was used in this talk to generate the profiling data such as: branch-misses, branches, and instructions?

    • @tkava7906
      @tkava7906 ปีที่แล้ว +3

      It comes with Linux kernel. Look up Linux perf tutorial to get started.

  • @statebased
    @statebased 5 ปีที่แล้ว +1

    Fun talk, thanks. Note that technically the data oriented approach can be seen also as a functional approach. For example you could see it as an indexed comonadic container-like structure which contains the intersection “functions” of the scenery, which then can be efficiently “joined” with the ray container (plus recursion for reflections). The difficulty is to capture these complex types. Even in Haskell these indexed types (if we want to allow the scene objects to individual types, eg sphere, cube, ...) are not an easy task.

    • @w-mwijnja8919
      @w-mwijnja8919 5 ปีที่แล้ว +1

      Correct, although the same could be said of the 'object-oriented' (class-oriented) approach. That one can also be described by a Hindley-Milner-esque type system. Of course, at the end of the day 'functional programming' first and foremost means 'that functions are used as first-class values', regardless of whether immutability, sum types, monoids, functors etc end up being used.

  • @briannoel7398
    @briannoel7398 8 หลายเดือนก่อน

    I wonder if the reason why the DOD method had so many branch-misses was because he didn't generate data tables with valid information. If I'm thinking about this correctly, you want to check validity when inserting data into a table. That way, when you do your operations on that data, you won't need to check for validity.

  • @Jkauppa
    @Jkauppa 3 ปีที่แล้ว

    explicit (fixed) sampling is better than random sampling distribution, its more stable, better as single frame, no inter-frame averaging required

    • @Jkauppa
      @Jkauppa 3 ปีที่แล้ว

      if you are poor, select only the main peak of the distribution, fixed, always, 1 ray per pixel

    • @Jkauppa
      @Jkauppa 3 ปีที่แล้ว

      wrap the triangle with a sphere collision ray tracing volume, center of mass centered, for ray intersection check boost

  • @TheBlenderblob
    @TheBlenderblob 5 ปีที่แล้ว

    great talk

  • @mikaurtela8047
    @mikaurtela8047 5 ปีที่แล้ว +5

    The problems with DoD approach being more difficult to change afterwards would be alleviated if we used an actual ecs library instead of a hand written SoA layout.

    • @darkengine5931
      @darkengine5931 4 ปีที่แล้ว +9

      Yeah, I felt like the DoD section was the weakest and obviously where Matt had the least experience. But he also missed the single greatest optimization that such a codebase could have with DoD by still keeping his triangle points in interleaved XYZ AoS format (he did mention that, but that's like the biggest and number one optimization afforded with DoD that he neglected) when he's sequentially looping through them. It's often easy enough to get order of magnitude improvements over OO in those cases with a serious DoD mindset, and sometimes 100x or faster without algorithmic improvements, in simple loopy code over data like his path tracer. I also wanted to facepalm a bit when he didn't even bother to profile the code until he had written it all and not only then, but only after he noticed something weird in the benchmarks.
      Also maybe this is a bit contentious, but from my standpoint, DoD is more top-down design than bottom-up as he described. It's OO that's bottom-up, not the other way around. DoD gets to the heart of the design requirements of the software by looking at it from a data-centric (information-centric, i.e.) standpoint. It organizes the data for efficient access and then builds the minimum layer of functionality and abstractions on top. The code it tends to produce tends to be minimalist and highly-tailored for the problem at hand. It's OO where I tend to find programmers start going really bottom-up and sometimes moving far, far away from the business requirements by building all sorts of monstrous libraries and frameworks before they start actually building the software and writing and using anywhere from tens to thousands of times more code than actually required for the problem at hand.

  • @pavelcelba6976
    @pavelcelba6976 5 ปีที่แล้ว +1

    Don't believe in splitting X, Y, Z - maybe having one big array of X,Y,Z, X2, Y2, Z2 maybe

  • @sideparting6845
    @sideparting6845 5 ปีที่แล้ว +7

    Thanks for saying zed
    I recognized your BBC owl and Exile on your emulator
    Yes I am mid 30s and a Brit

  • @satellite964
    @satellite964 3 ปีที่แล้ว

    DoD is the most clear. FP was the most opaque.

  • @danielkrajnik3817
    @danielkrajnik3817 3 ปีที่แล้ว

    funny, Compiler Explorer is called the same name

  • @pushqrdx
    @pushqrdx 3 ปีที่แล้ว

    15:20

  • @spiceEx
    @spiceEx 5 ปีที่แล้ว +1

    Can someone make a subtitle so the people who English Listening Comprehension not well can follow the talk?

  • @sebastiantill8019
    @sebastiantill8019 2 ปีที่แล้ว

    Nice bloke, but his naive implementation of path tracing is a worst case scenario for data coherence and thus an awful example for the comparison he is trying to make.

    • @MattGodbolt
      @MattGodbolt 2 ปีที่แล้ว +1

      Of course: it's a talk, and it wasn't designed to be anything but a demonstration of techniques and styles. If it was even slightly serious, then some kind of BV-hierarchy would be the first thing to do... but anyway! Hope it was interesting. To get the stats I used Linux `perf`

  • @skepticmoderate5790
    @skepticmoderate5790 5 ปีที่แล้ว +1

    Cartesian product? Bro, you C++ programmers are weird. It's called zip.

    • @xarcaz
      @xarcaz 5 ปีที่แล้ว +12

      Lol, no.

    • @IsmeGenius
      @IsmeGenius 5 ปีที่แล้ว +15

      Zip is inner product. Cartesian product is outer product where you visit all the possible combinations.

    • @AntiProtonBoy
      @AntiProtonBoy 5 ปีที่แล้ว +1

      @@IsmeGenius My understanding of zip is you parallel combine pair of input ranges, producing an output range. Whereas inner product parallel combines pair of input ranges, then reduces into a single element for output.

    • @zanityplays
      @zanityplays ปีที่แล้ว +2

      What else should Cartesian product be called even in python it's called product in the itertools library