Microarchitecture knowledge is essential.
I would say this statement is highly misleading. You are likely to see noticeable improvements with SIMD. Maybe not 30x, but a 4x-8x improvement for non-trivial code is easily reachable in my experience. And it's not going to take that long to write a decent SIMD function.
The sample given is just about the worst example you could possibly show. There is no useful computation, so it's mostly showing loads and stores. As soon as you have chained, meaningful operations like shifts, multiplications, square roots, and masks, you start noticing a big difference.
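For instance, something like this rough SSE sketch (my own illustration, not the talk's example; it assumes SSE support, a length that is a multiple of 4, and non-negative inputs so the square root is defined):

#include <immintrin.h>

// Chained work per element: multiply, square root, clamp.
void scale_sqrt_clamp(const float* in, float* out, int n) {
    const __m128 scale = _mm_set1_ps(2.5f);
    const __m128 limit = _mm_set1_ps(100.0f);
    for (int i = 0; i < n; i += 4) {
        __m128 v = _mm_loadu_ps(in + i);  // load 4 floats
        v = _mm_mul_ps(v, scale);         // multiply all 4 lanes
        v = _mm_sqrt_ps(v);               // square root of all 4 lanes
        v = _mm_min_ps(v, limit);         // clamp each lane to an upper bound
        _mm_storeu_ps(out + i, v);        // store 4 results
    }
}

The longer the chain of real work between the load and the store, the more the 4-wide (or 8-wide with AVX) arithmetic dominates, and the closer you get to the full lane-count speedup.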
Also, it's really cherry-picking a compiler and code example, because as soon as you have a non-trivial code fragment, you will often see the compiler emit scalar code. Sometimes it might even use xmm registers, but it will still perform a scalar operation on just the first float in the register, for example. Even this misleading example will not be vectorized by all compilers.
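A typical offender (again, a toy example of my own): a data-dependent early exit creates a control dependency that most vectorizers give up on, so you get scalar SSE, one float in the low lane of an xmm register.

// Often compiled to scalar code (movss/comiss on a single lane):
float first_above(const float* a, int n, float threshold) {
    for (int i = 0; i < n; ++i)
        if (a[i] > threshold)  // early exit depends on the data
            return a[i];
    return 0.0f;
}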
My experience so far has been that for very trivial code the compiler manages to make sense of, intrinsics versus a good compiler will be about the same speed, but not worse. But if you compile on multiple compilers, intrinsics will be noticeably faster on some of them. And for non-trivial cases, you are easily looking at a 4x-7x improvement. That means a computation that would take 7 seconds finishes in 1 second, and 7 hours of computation finish in 1 hour.
I find this common disregard for genuinely useful technology extremely harmful. As a good developer you should know where code can be optimized, and you should know when SIMD makes sense. And you should use it, because it can give you at least 5x savings in hardware, energy, and wait times.
Speaker here. I agree with some of the points you made and disagree with others.
Regarding cherry-picking a compiler: there are basically two compilers. And, yes, Clang is better at autovectorization. GCC is worse, but it will autovectorize this example. But you are correct that I intentionally chose the compiler that does a better job. I'm sorry if you felt that was dishonest. It wasn't meant to be.
I chose this example not to cherry pick, but because in a lightning talk, we can't sit there for a minute waiting for the audience to figure out what the code is doing. It has to be simple.
Two comments, regarding both autovectorization and intrinsics. This video is from over a year ago; the compilers keep getting better, and I increasingly prefer to use std::assume_aligned and various branch-free techniques to help guide the compiler to generate good SIMD code. When that fails, turning to Google Highway or intrinsics has been another, sometimes very good, route. You say autovectorization only works on simple examples, but give it a shot. Also, make use of the optimization remarks the compilers output. They can point you to where the compiler failed to vectorize something, and often it's something subtle that many programmers don't understand. For example, a dot product on floats doesn't vectorize because rearranging the order of the adds can produce different answers, so you need the fast-math flag.
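To make that concrete, here is a minimal sketch of what I mean (my own illustration; the 32-byte alignment and the exact flags are assumptions, not something from the video):

#include <memory>

float dot(const float* a, const float* b, int n) {
    // Promise the compiler 32-byte alignment (C++20) so it can use
    // aligned vector loads without a runtime check.
    const float* pa = std::assume_aligned<32>(a);
    const float* pb = std::assume_aligned<32>(b);
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += pa[i] * pb[i];  // FP reduction: vectorizing reassociates
                               // the adds, so it needs -ffast-math or
                               // -fassociative-math to be legal
    return sum;
}

And to see why a loop didn't vectorize, ask for the remarks:
clang++ -O3 -Rpass-missed=loop-vectorize file.cpp
g++ -O3 -fopt-info-vec-missed file.cpp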
As for really big (30x) speedups, I've seen them mostly in operations on char vectors. But yes, that's uncommon.
This is so true 😅
For a moment I thought he was going to show how to use SIMD for destructing objects. I was expecting some reinterpret_cast into an object of the same size but with only primitive types, and then, with some magic, parallelized destruction... But I guess not.