Man, I love your tutorials so far! You are a life-saver. I had to pause the video every 30 seconds to make out what's happening (the explanation is very fast for me), but this is the video from which I've actually learned the most about Q-Learning.
I thank you from the bottom of my heart.
I'm just starting with machine learning and this channel has helped me a lot with understanding the basics. Thanks a lot!
Hi Shawn. Thanks for this tutorial. I have a question about lines 30 and 38, after you've coded the DQN. Shouldn't you be calling self.q_action instead of self.q_state and then taking the max of the values?
Thanks!
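Not the author, but in the usual setup q_state holds the Q-values for all actions at the input state, while q_action is the Q-value of the single action that was actually taken, so taking the max over q_state for action selection is intentional. Here's a rough sketch of that distinction; the names and sizes are my guesses, not the exact code from the video:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

state_size, action_size = 16, 4                                # example sizes, e.g. FrozenLake
state_in  = tf.placeholder(tf.int32, shape=[1])
action_in = tf.placeholder(tf.int32, shape=[1])
target_in = tf.placeholder(tf.float32, shape=[1])

state_onehot  = tf.one_hot(state_in, depth=state_size)
action_onehot = tf.one_hot(action_in, depth=action_size)

q_state  = tf.layers.dense(state_onehot, units=action_size)    # Q-values for ALL actions, shape [1, action_size]
q_action = tf.reduce_sum(q_state * action_onehot, axis=1)      # Q-value of the taken action, shape [1]

greedy_action = tf.argmax(q_state, axis=1)                     # action selection uses q_state
loss = tf.reduce_sum(tf.square(target_in - q_action))          # the training loss uses q_action
optimizer = tf.train.AdamOptimizer(1e-3).minimize(loss)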
Hi, great tutorial!
Cannot feed value of shape () for Tensor 'Placeholder_2:0', which has shape '(1,)'
It comes from self.sess.run(self.optimizer, feed_dict=feed).
So tf.reduce_sum(tf.square(agent.target_in - agent.q_action)) returns a tensor with shape=(); is this the issue?
But if I change self.target_in = tf.compat.v1.placeholder(tf.float32, shape=[1]) to shape=[], it runs?
However, the rewards over 200 episodes never go over 1 or 2?
Thanks!
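In case it helps: that error usually just means a plain Python scalar is being fed into a placeholder declared as a 1-element vector, so either feeding a 1-element list or declaring the placeholder as a scalar will work. A minimal sketch (target_vec and target_scalar are my names, not the tutorial's):

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

target_vec = tf.placeholder(tf.float32, shape=[1])      # expects a 1-element vector like [3.7]
# feed_dict={target_vec: 3.7}    -> "Cannot feed value of shape () for Tensor ..." error
# feed_dict={target_vec: [3.7]}  -> fine, shapes match

target_scalar = tf.placeholder(tf.float32, shape=[])    # expects a plain scalar
# feed_dict={target_scalar: 3.7} -> fine, which is why changing shape=[1] to shape=[] runs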
Good tutorial, but the pace is on the higher side.
Great video, bro.
Just want to know how to work with an image as my observation space.
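Not the author, but the usual approach for image observations is to replace the one-hot state input with an image placeholder and run it through a few convolutional layers before the final Q-value layer. A very rough sketch, with made-up names and sizes rather than anything from the video:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

action_size = 4
frame_in = tf.placeholder(tf.float32, shape=[None, 84, 84, 4])     # e.g. a stack of 4 grayscale frames

conv1   = tf.layers.conv2d(frame_in, filters=32, kernel_size=8, strides=4, activation=tf.nn.relu)
conv2   = tf.layers.conv2d(conv1, filters=64, kernel_size=4, strides=2, activation=tf.nn.relu)
flat    = tf.layers.flatten(conv2)
hidden  = tf.layers.dense(flat, units=256, activation=tf.nn.relu)
q_state = tf.layers.dense(hidden, units=action_size)               # Q-values for all actions, as before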
Hey, why does your AI always train a lot better than mine? On the first 100 episodes we get similar results, but on the next 100 I either get 3/4 (or 0) and you get like 50!
Hi Shawn, thank you for such a detailed tutorial. Could you please give some tips on how to rewrite your code for TensorFlow 2.0?
What version of TensorFlow are you using here, and how would you implement this in 2.0? I'm using 2.0, and when I copy and paste the code you provide (which should work), I get an error: "module 'tensorflow' has no attribute 'placeholder'". Any ideas?
# In TF 2.x, replace "import tensorflow as tf" with the v1 compatibility shim:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
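If you'd rather go TF2-native instead of the compat shim, the same kind of linear Q-model can be written with tf.keras and a GradientTape. This is only a sketch of the idea with my own names and sizes, not a port of the exact code from the video:

import tensorflow as tf

state_size, action_size = 16, 4
model = tf.keras.Sequential([tf.keras.layers.Dense(action_size, input_shape=(state_size,))])
optimizer = tf.keras.optimizers.Adam(1e-3)

def train_step(state_onehot, action_index, target):
    # state_onehot: float32 array of shape [1, state_size]; target: scalar TD target
    with tf.GradientTape() as tape:
        q_state = model(state_onehot)            # Q-values for all actions, shape [1, action_size]
        q_action = q_state[0, action_index]      # Q-value of the action actually taken
        loss = tf.square(target - q_action)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))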
Great video. However, I don't think your statement about the NN updating multiple weights per iteration is true. In each iteration only a single state and action are active, hence the gradients of only the weights attached to them are non-zero, so only those get updated; that would be only one weight.
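If it helps, that claim can be checked numerically, assuming the setup is a single linear layer on a one-hot state with only the taken action's Q-value in the loss (my assumption about the code in the video). Under that assumption exactly one entry of the weight matrix gets a non-zero gradient; with hidden layers or non-one-hot features, many weights would move per step, so both views can be right depending on the architecture.

import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

state_size, action_size = 16, 4
state_in  = tf.placeholder(tf.float32, shape=[1, state_size])
action_in = tf.placeholder(tf.float32, shape=[1, action_size])
target_in = tf.placeholder(tf.float32, shape=[1])

weights  = tf.Variable(tf.random_uniform([state_size, action_size]))
q_state  = tf.matmul(state_in, weights)                    # Q-values for all actions
q_action = tf.reduce_sum(q_state * action_in, axis=1)      # Q-value of the taken action
loss     = tf.reduce_sum(tf.square(target_in - q_action))
grad     = tf.gradients(loss, weights)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    s = np.eye(state_size, dtype=np.float32)[[3]]          # one-hot state 3
    a = np.eye(action_size, dtype=np.float32)[[2]]         # one-hot action 2
    g = sess.run(grad, feed_dict={state_in: s, action_in: a, target_in: [1.0]})
    print(np.count_nonzero(g))                             # prints 1: only weights[3, 2] has a non-zero gradient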
Anyone else having trouble with the reward never going over 1?
Yep, mine is not updating randomly anymore.