Loved your animation and how you explained this concept in a systematic yet easy to understand manner.
Your content is head and shoulders above the rest on this topic. Kudos! Reward +1000! The music unnerves me as I try to concentrate on the ideas, though... reward -0.1
Amazing explanation! Loved your content. Keep making such awesome videos!
This is explained so well! Hope you continue making content : )
Thank you so much! I like your explaination:D
Your explanations are very clear and understandable. Thank you :)
Great video! Thank you for it!
Thank you for all the effort! Great video!
This channel is awesome!
Great explanation. Thanks!
Amazing content.
Do we need a neural network to generate data for experience replay?
Running the max() function over the complete priority buffer hinders performance by a substantial amount. I would store the max priority in a variable, compare it against newly added errors, and update the variable accordingly. This can then be used when adding new experience priorities (see the sketch after this thread).
What about normalizing priorities (0 to 1)? This way we could just set max_priority to 1, and I think it would positively affect performance, keeping it stable!
What do you think?
A better way is to use Segment Trees...
@@julioresende1521 But wouldn't the time complexity of using a segment tree to compute the max from index 0 to index N - 1 (the length of the array) be the same as running the max function over the array?
@@Небудьбараном-к1м I think in order to normalize priorities you first need to compute the max priority.
@@youcantellimreallybored3034 You can use a single variable to store the max value. The segment tree (a sum tree) is useful for the roulette-wheel (proportional) sampling.
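A minimal sketch of the running-max idea from this thread, in plain Python (this is not the video's code; the class and variable names are hypothetical): new experiences are inserted with a tracked max_priority instead of calling max() over the whole buffer.

import random
from collections import deque

class PrioritizedBuffer:
    """Toy prioritized replay buffer that tracks the max priority in one variable."""

    def __init__(self, capacity=100_000, eps=1e-6):
        self.buffer = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.max_priority = 1.0  # new experiences start at the current max
        self.eps = eps

    def add(self, experience):
        # O(1): no max() scan over the whole priority buffer
        self.buffer.append(experience)
        self.priorities.append(self.max_priority)

    def update_priority(self, index, td_error):
        priority = abs(td_error) + self.eps
        self.priorities[index] = priority
        # keep the running max up to date as TD errors come in
        self.max_priority = max(self.max_priority, priority)

    def sample(self, batch_size):
        # naive proportional ("roulette") sampling; a sum/segment tree makes this O(log N)
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        indices = random.choices(range(len(self.buffer)), weights=weights, k=batch_size)
        return indices, [self.buffer[i] for i in indices]

One caveat: once old experiences are evicted, the stored max can become stale (larger than any priority still in the buffer), which is the trade-off versus recomputing the max or keeping a max segment tree.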
Awesome work !
pie torch :)
Very nice work! :D
Good joke at 1:43
Love them too. PieTorch is the best xD
Subscribed!
Awesome!! Great work!
Nice content, except I have to watch it at 0.5x speed.
I watch it at 1.2 speed lol
Please explain it with a real-time example.
Hi, how do I save a trained agent?
Hi :) You can save the weights of the neural networks at the end of the training process. In my case, I use the tensorflow.keras library, where the models have a method called save_weights.
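For reference, a minimal sketch of that approach with tf.keras (the architecture and file name below are placeholders, not the video's code):

import tensorflow as tf

# assume `model` is the trained Q-network, e.g. a small tf.keras Sequential
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2),
])

# save only the weights at the end of training ...
model.save_weights("dqn_agent.weights.h5")

# ... then later rebuild the same architecture and load them back
model.load_weights("dqn_agent.weights.h5")

If you want the architecture stored together with the weights, model.save(...) saves the whole model instead.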
Is there anyone like me who got lost when he started writing the code?
Why not normalize the priorities? I think that would boost performance a lot.
Hard paper to understand
Can anyone write out all these steps one by one?
Hmm, I'm getting an error: AttributeError: 'DoubleDQNAgent' object has no attribute 'sess'. Not sure why, as a lot of the code looks the same as the previous one.
Had that too. It disappeared after I switched from Jupyter to Colab.