Loved your animation and how you explained this concept in a systematic yet easy to understand manner.
This is explained so well! Hope you continue making content : )
Amazing explanation! Loved your content. Keep making such awesome videos!
Your content is head and shoulders above the rest on this topic. Kudos! Reward +1000! The music unnerves me as I try to concentrate on the ideas, though... reward -0.1
Your explanations are very clear and understandable. Thank you :)
This channel is awesome!
Thank you for all the effort! Great video!
Amazing content.
Great video! Thank you for it!
Thank you so much! I like your explanation :D
Great explanation. Thanks!
Awesome work !
Very nice work! :D
awesome!! great work
pie torch :)
Subscribed!
Running the max() function over the complete priority buffer hinders performance by a substantial amount. I would store the max priority in a variable, compare it against newly added errors, and update the variable accordingly. That variable can then be used when assigning priorities to new experiences.
What about normalizing priorities (0 to 1)? This way we could just set max_priority to 1, and I think it would positively affect performance, keeping it stable!
What do you think?
A better way is to use Segment Trees...
@@julioresende1521 But wouldn't the time complexity for using segment trees to compute max from index 0 to index N - 1 (length of the array) be the same as running the max function over the array?
@@Небудьбараном-к1м I think in order to normalize priorities you first need to compute the max priority.
@@youcantellimreallybored3034 you can use one variable to store the max value. The segment tree (over sums) is useful for the roulette-wheel sampling step.
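The cached-max idea from this thread can be sketched roughly like this. This is a minimal illustration, not code from the video: the class and method names are made up, and a real PER buffer would also handle the caveat that the cached max never decreases when the true max entry is evicted or down-weighted (which is where a segment tree helps).

```python
class PrioritizedBuffer:
    """Sketch of a replay buffer that caches the running max priority
    instead of calling max() over the whole priority list on every insert."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.priorities = []
        self.max_priority = 1.0  # running max, updated in O(1)

    def add(self, experience):
        # New experiences get the current max priority so they are
        # sampled at least once soon after being stored.
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(experience)
        self.priorities.append(self.max_priority)

    def update_priority(self, index, td_error):
        # Small epsilon keeps every priority strictly positive.
        priority = abs(td_error) + 1e-6
        self.priorities[index] = priority
        # O(1) running-max update instead of max(self.priorities).
        self.max_priority = max(self.max_priority, priority)
```

Note the trade-off: if the largest-priority experience is later updated downward or evicted, `max_priority` stays stale (too high), which only means new experiences are sampled a bit more eagerly. A sum segment tree fixes the sampling cost the same way, giving O(log N) weighted draws.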
Good jokes on 1:43
Love them too. PieTorch is the best xD
Do we need a neural network to generate data for experience replay?
Please explain it with a real-time example.
Hi, how do I save a trained agent?
Hi :) You can save the weights of the neural networks at the end of the training process. In my case, I use the tensorflow.keras library, where the models have a method called save_weights.
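A minimal sketch of what that comment describes, assuming a tf.keras model (the tiny two-layer network here is a stand-in, not the agent from the video; the filename is arbitrary). `save_weights` stores only the parameters, so you need to rebuild an identically-shaped model before calling `load_weights`:

```python
import tensorflow as tf

def build_model():
    # Hypothetical tiny Q-network: 4 state inputs, 2 action values.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(24, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(2),
    ])

model = build_model()
model.compile(optimizer="adam", loss="mse")

# At the end of training: persist the weights only.
# (Keras 3 requires the ".weights.h5" suffix; older Keras accepts plain ".h5".)
model.save_weights("dqn_agent.weights.h5")

# Later: rebuild the same architecture and restore the weights.
restored = build_model()
restored.load_weights("dqn_agent.weights.h5")
```

If you want to save the architecture and optimizer state along with the weights, `model.save(...)` writes the whole model instead.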
Nice content, except I have to watch it at 0.5 speed.
I watch it at 1.2 speed lol
Hmm, I'm getting an error: AttributeError: 'DoubleDQNAgent' object has no attribute 'sess'. Not sure why, as a lot of the code looks like the previous video's.
Had that too. It disappeared after I switched from Jupyter to colab
Why not normalize priorities? I think that would boost performance a lot.
Hard paper to understand
Is there anyone else like me who got lost when he started writing the code?
Can anyone write out all these steps one by one?