Pure gold for beginners. Thank you Timur, you are a good teacher.
I concur.
Timur, I want to thank you for this talk. In my synthesizer project I simply moved all my graph-updating logic to the GUI thread and used atomic shared_ptr operations to swap references. Simply moving all the locks, memory allocation, and file I/O out of the callback has made my program crash 100% less than it did before (the crashes used to be totally random). I know I'm probably not doing everything totally correctly, but I can feel an enormous difference when playing my synthesizer. It is so satisfying when apps 'just work'. These talks are invaluable for audio programmers.
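For anyone wanting to try the same pattern, here is a minimal sketch of the pre-C++20 approach using the atomic free functions for shared_ptr (the Graph type and the function names are placeholders, not code from the talk or the commenter's project):

```cpp
#include <atomic>
#include <memory>

struct Graph { /* nodes, connections, ... */ };

std::shared_ptr<Graph> currentGraph;  // touched only via the atomic free functions

// GUI thread: build the new graph, then atomically swap it in.
void updateGraph(std::shared_ptr<Graph> newGraph)
{
    std::atomic_store(&currentGraph, std::move(newGraph));
}

// Audio callback: grab a local reference for the duration of the block.
void processBlock(float* buffer, int numSamples)
{
    auto graph = std::atomic_load(&currentGraph);
    // ... render into buffer using *graph ...
    // Caveat from the talk: if this local shared_ptr ends up as the last
    // owner, the old graph would be deleted here, on the audio thread.
}
```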
This is good stuff, but a thing to note at 46:00 is that the atomic free functions for manipulating shared_ptr are not lock-free. Moreover, the delete-in-the-callback problem at 48:00 can be solved in a lock-free manner using either hazard pointers or an epoch-based garbage collector.
this is such a good introductory lesson
A couple of points you didn't cover: 1. Never rely on calloc or zero-initializing C++ new to give you memory that is already paged in. The zeroing may be done via VM tricks that will cause a page fault upon first touching the memory. 2. Code can page fault too. You should pre-warm any code before you run it on the audio thread.
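A minimal sketch of the pre-warming idea for data buffers (the page size is an assumption here; a real program would query the OS, and pre-warming code pages, e.g. by calling each function once up front, follows the same principle):

```cpp
#include <cstddef>
#include <vector>
#ifdef __unix__
  #include <sys/mman.h>  // mlock
#endif

// Touch every page of a buffer on a non-real-time thread, so the
// first-touch page faults happen here rather than in the audio callback.
void prewarm(std::vector<float>& buffer)
{
    const std::size_t pageSize = 4096;  // assumption; query the OS in real code
    const std::size_t numBytes = buffer.size() * sizeof(float);
    auto* bytes = reinterpret_cast<volatile char*>(buffer.data());
    for (std::size_t i = 0; i < numBytes; i += pageSize)
        bytes[i] = 0;                   // fault the page in now

#ifdef __unix__
    mlock(buffer.data(), numBytes);    // optionally pin the pages in RAM
#endif
}
```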
This is a treasure! Thanks a lot! Being a beginner in the topic, it's one of the most useful conference talks I've ever watched. Hopefully, it won't be too difficult to apply these patterns in Rust.
re 44:47 - Having looked at cppreference and having tried std::atomic_is_lock_free(const std::shared_ptr<T>* p) with T = juce::String, I think the atomic access on shared pointers is not lock-free, at least when T is not a basic type. From cppreference: "These functions are typically implemented using mutexes, stored in a global hash table where the pointer value is used as the key."
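A quick way to reproduce that finding (a minimal check, using std::string as a stand-in for juce::String):

```cpp
#include <atomic>
#include <iostream>
#include <memory>
#include <string>

int main()
{
    auto p = std::make_shared<std::string>("hello");
    // Typically prints 0 (false): the shared_ptr atomic free functions are
    // usually implemented with a global hash table of mutexes.
    std::cout << std::atomic_is_lock_free(&p) << '\n';
}
```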
REALLY helpful video! Everything was explained in a nice way. I want more stuff like this!
I'm always perplexed as to how people can come up with this. This is an insane amount of genius; I feel so dumb. I wish I could come up with everything he said. :(
Excellent talk. About the atomic shared pointer: C++20 now provides the std::atomic<std::shared_ptr<T>> specialization.
but 'atomic' sounds so cool T_T
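For reference, a minimal sketch of that C++20 specialization in use (the names here are illustrative); note that most implementations still lock internally, so it is more convenient, not necessarily lock-free:

```cpp
#include <atomic>
#include <memory>

// C++20 partial specialization for shared_ptr.
std::atomic<std::shared_ptr<int>> current;

void publish(int value)   // e.g. from the GUI thread
{
    current.store(std::make_shared<int>(value), std::memory_order_release);
}

int read()                // e.g. from a consumer thread
{
    auto p = current.load(std::memory_order_acquire);
    return p ? *p : 0;
}
```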
Thanks Timur, a good intro for me.
39:24 If you use load/store without extra arguments, sequentially-consistent ordering will be used, which is stronger (and on some platforms slower) than you usually need. You should always explicitly specify the memory ordering for std::atomic if you care about performance.
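A minimal sketch of explicitly specified orderings for a single shared parameter (the gain example is illustrative, not from the talk):

```cpp
#include <atomic>

std::atomic<float> gain{1.0f};

// GUI thread: publish a new value.
void setGain(float g)
{
    gain.store(g, std::memory_order_release);
}

// Audio thread: acquire pairs with the release above; for a single
// independent value, memory_order_relaxed would also be enough.
float currentGain()
{
    return gain.load(std::memory_order_acquire);
}
```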
A note: it's perfectly okay to wait on a mutex on the audio thread if you can predict how long it will be blocked. Think of the audio thread waiting for all RT synthesis threads to finish. Locking and waiting is NOT BAD in itself; it's bad when you wait on threads with unpredictable timing.
Michał Gawron why would you have RT synthesis done in separate threads?
Devin Samarin You know, CPUs nowadays have multiple cores. ;-)
It was my understanding that mt19937 is O(1) *amortized*. Most of the time it is just a few ops, but periodically it recomputes a large buffer.
Great content, thx! Would love to see a follow-up talk on this (maybe at the JUCE Summit in November?). It would definitely make a nice tutorial and add significant value to the JUCE docs imo.
Also, the Projucer seems like an interesting interactive IDE; I don't think that talk has been posted yet, though...
Awesome tech, even without dedicated audio AFEs and audio DACs. Great job of coding.
Great pack of useful info
Glad it was helpful!
15:05 Whoa, am I seeing a slider in the IDE setting a value?
I was impressed too, but JUCE does indeed do that. They should add this to Visual Studio.
He is a knowledgeable person.
Interesting that 10 ms is the number shown as the high threshold for acceptable latency. I can't speak for keyboards, but for guitar anything above 5 ms starts to noticeably affect the feel of playing; 5 ms or below is definitely preferable. Historically, coming up with an affordable audio interface and computer pairing that could achieve this was quite the challenge, which of course was one of the strongest arguments for dedicated hardware (a la AxeFX, Kemper, etc.) vs. software-based amps.
+Gerald Hinson Oh, forgot to say the obvious: Great presentation!
+Gerald Hinson 10 ms? That's a joke. If I go above 7 ms, I can't play the fast parts on guitar anymore (shredding); the delay is literally throwing me off track. I hate it when people say you can't notice anything under 16 ms, or some bullshit like that.
Remember the speed of sound in air is ~343 m/s, i.e. roughly 2.9 ms per meter.
So if the distance to your guitar cab is more than about 2.4 meters, is your latency too big for playing fast?
Monitors are rarely further than 3 meters away from me. I sometimes go off the stage, and if I go too far away I start noticing the delay. And if the software says 7 ms round-trip, it's usually a bit more, because the software calculates the best case (not including all variables).
Well I can explain:
When you're playing live with an amp and cab, you are playing loud and you don't hear your own e-guitar itself at all; you only hear the signal from the amp/cab. You easily learn to adjust to the small latency caused by the distance you stand in front of your amp. But if you are sitting in front of your DAW or a VST on your PC with your guitar in your hands, you will never turn the volume up as high as at a live gig or rehearsal. So you hear the sound of the e-guitar itself (which can be remarkably loud, by the way) as well as the processed sound with latency - and that is what will really throw you off track if the total is higher than, let's say, 10 ms (better to have it around 5 ms).
Please help: do I need to write a program against the ASIO API to get the right data from the sound device into the DAW? Or is selecting the ASIO driver inside Pro Tools enough to record bit-perfect 24-bit audio?
Hello!
How can I create a random value between two bounds to control a parameter in Faust?
Great talk, full of useful info!
Great talk!
Great talk, thank you for sharing your experience!
@Taylor Holliday Probably to preserve the deleter.
Really excellent talk!
(49:00) Why doesn't ReleasePool::add just take a shared_ptr to void? The template seems unnecessary.
When checking if(object.empty()), you'd need to dereference to the T type to check whether the actual T object is nullptr or not. I think. :)
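For context, a minimal reconstruction of the release-pool idea (not the talk's exact code). It also speaks to the question above: storing std::shared_ptr<void> does still run ~T correctly, because the deleter is type-erased into the control block:

```cpp
#include <algorithm>
#include <memory>
#include <mutex>
#include <vector>

class ReleasePool
{
public:
    template <typename T>
    void add(const std::shared_ptr<T>& object)
    {
        std::lock_guard<std::mutex> lock(mutex);  // non-RT threads only
        pool.push_back(object);                   // converts to shared_ptr<void>
    }

    // Call periodically from a timer on a non-real-time thread.
    void releaseUnused()
    {
        std::lock_guard<std::mutex> lock(mutex);
        pool.erase(std::remove_if(pool.begin(), pool.end(),
                                  [](const auto& p) { return p.use_count() <= 1; }),
                   pool.end());
    }

private:
    std::vector<std::shared_ptr<void>> pool;  // deleter lives in the control
    std::mutex mutex;                         // block, so ~T still runs
};
```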
Bravo! Great talk!
@Timur Doumler Is there source code for this project anywhere? I'd love to see how the oscilloscope works. I checked the CppCon GitHub for this presentation, but it was not there. I've been researching how to implement one but am confused. Thanks.
14:50 I have heard so many times "I know XXXX isn't the correct way of doing random numbers". What should I google for the correct way? C/C++, no libraries??
Check out the C++ standard library (<random>).
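A minimal sketch of both options: <random> for general use, and a tiny constant-time generator of the kind often used on audio threads (Marsaglia's xorshift32; the seed here is arbitrary):

```cpp
#include <cstdint>
#include <random>

// General-purpose answer: the <random> header.
float randomFloat(float lo, float hi)
{
    static std::mt19937 engine{std::random_device{}()};
    std::uniform_real_distribution<float> dist(lo, hi);
    return dist(engine);
}

// On the audio thread a tiny constant-time generator is a common choice.
struct Xorshift32
{
    std::uint32_t state = 0x12345678u;  // any non-zero seed

    std::uint32_t next()
    {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
};
```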
Perfect explanation! Thanks!
great stuff, good job, thanx
38:20
Volatile IS the correct solution here - and it fixes several of the "problems" listed here.
Also, it only prevents the compiler from doing 2 things: optimising away any checks on that variable (we want the compiler to keep them) and reordering volatile accesses (with the read and write in the same function, this is also what we want).
So... no, for the situation used as an example it is even the best possible solution: the least overhead, short and concise.
Volatile also prevents the compiler from deferring the write to main memory.
Also it seems like your assumptions about volatile are based on Microsoft's implementation of volatile, which apparently is redefined as a full memory barrier, unlike on other compilers.
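For readers weighing this thread against the talk: the pattern at 38:20 is a flag shared between the GUI thread and the audio callback. A minimal sketch of the std::atomic version the talk advocates (the names are illustrative):

```cpp
#include <atomic>

std::atomic<bool> shouldPlay{false};  // written by GUI thread, read in callback

void audioCallback(float* buffer, int numSamples)
{
    if (!shouldPlay.load(std::memory_order_acquire))
        return;  // unlike volatile, atomics also give well-defined
                 // inter-thread visibility and ordering guarantees
    // ... render into buffer ...
}
```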
GOAT
Instead of working on my own soft synth I'm watching a video which explains stuff I already know. :P
It's nice to have a talk on that topic. I am really interested in making music programs and plugins; does anyone know if there are opportunities in that field?
Hey friend, did you ever find a job? I'm interested in the same line of work, and I'm curious if you have any tips.
@tissuepaper9962 If you get no answer, it generally means that the people already working in the field don't want more people to enter. I.e. it's probably paradise for the few who have made the venture.
@oonmm Nah, I think the set of musically inclined people who are also advanced C++ programmers and really want to write audio software is just small.
nice talk
Please, someone answer: ASIO4ALL can record audio and you can choose the buffer size. What is the difference between ASIO4ALL and coding it with C++??
Repeat the audience questions... you are presenting at CppCon.
57:50 "Nobody will notice if you drop like one of the 60 frames visually"
And then you enter the world of video games.
Still not so important. Let's say you have recorded 1 min of a game as video at 60 fps, and in one of those 60 seconds there were actually only 59 unique frames, not 60. Can you watch this video and tell exactly where the second with 59 fps is? No, you can't. Now let's say one of those 60 seconds contains 16 ms of silence. Can you listen to this audio and tell where the silence occurred? Sure you can, on the first attempt. That means a dropout in audio matters more than a dropped frame in video.
14:36 - This is a terrible example of aliasing which everybody in the audio industry should be very well aware of. You might consider putting a comment in the video to point out that this is not the proper way to generate a square wave in the digital domain.
This paper covers the basics of alias-free digital synthesis - ccrma.stanford.edu/~stilti/papers/blit.pdf
Have fun diving into band limiting at CppCon... maybe not the appropriate spot for that. :D
A comment about aliasing would be nice, that's true, but it's just an overview, no need to go into details.
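For the curious, one simple alias-free construction the linked paper builds on, sketched under the assumption of a phase in [0, 1) (BLIT/BLEP methods are what you would use in practice, since they are much cheaper):

```cpp
#include <cmath>

// Additive synthesis: sum the odd harmonics of the square wave's Fourier
// series up to Nyquist. Alias-free by construction, but O(harmonics).
float bandlimitedSquare(float phase /* 0..1 */, float freq, float sampleRate)
{
    constexpr float pi = 3.14159265358979f;
    const float nyquist = 0.5f * sampleRate;
    float sum = 0.0f;
    for (int k = 1; k * freq < nyquist; k += 2)           // odd harmonics only
        sum += std::sin(2.0f * pi * k * phase) / float(k);
    return (4.0f / pi) * sum;                             // Fourier amplitude
}
```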
One way to reduce CPU usage significantly is to pre-render some tracks and run effects in real time only on the tracks we are working on at the moment.
The buffer size should be 1 for the best audio performance. You need zero latency for the best performance.
Sending 1 frame at a time? That would be a lot of overhead. Regardless, it's how many frames ahead you are that matters, not how many you send at a time, unless your resolution is less than something like 200/sec. You're not going to be able to get audio to work at all if you plan to be only 1 frame ahead.
I can't go lower than 64 and I have a really powerful machine.
... and unnecessary emplace_back, when plain push_back would do
What's your problem with emplace_back? It's much more elegant and can be faster.
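For readers wondering what the actual difference is, a minimal illustration (the values are hypothetical):

```cpp
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> v;

    v.push_back("hello");     // constructs a temporary std::string, then moves it
    v.emplace_back("world");  // constructs the std::string in place from the args

    // When the argument already has the element type, the two calls do
    // exactly the same work, so push_back reads more explicitly.
    std::string s = "!";
    v.push_back(s);
    v.emplace_back(s);
}
```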