I'm a beginner. It was your videos that started my journey. Thank you so much!
Nice! Good Luck!
Is there a way to quickly see what the finished machine has learned on the game screen?
I'm not too clear what you are asking here. The agent's final score will usually reflect how much it has learned.
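If the question is about watching the trained agent play on screen, that is roughly what observe.py is for; below is a stripped-down sketch of the same idea. The class name, constructor, checkpoint path, and environment ID are placeholders, and the repo's Atari preprocessing wrappers are omitted, so treat it as an assumption rather than the repo's exact code.

import gym
import torch
from dqn import Network                                 # the same Network discussed below; assumed importable

env = gym.make('BreakoutNoFrameskip-v4')                # placeholder env; the real script wraps it for preprocessing
net = Network(env)                                      # assumed constructor signature
net.load_state_dict(torch.load('trained_model.pack'))   # placeholder checkpoint path

obs = env.reset()
done = False
while not done:
    env.render()                                        # pops up a window showing the game screen
    obs_t = torch.as_tensor(obs, dtype=torch.float32).unsqueeze(0)
    action = net(obs_t).argmax(dim=1).item()            # greedy action from the predicted Q-values
    obs, reward, done, info = env.step(action)
env.close()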
Hello brthor, what version of Python did you use for this project? I used 3.9, but I had problems with ROMs. I read somewhere that atari_py only supports Python 3.7 and lower. Thanks in advance.
I believe I used Python 3.6 for this one.
I have a question. If the model is defined in the dqn.py file, can't you run it from the observe.py file by using dqn.py, as a kind of rotation?
I'm not sure what you mean by rotation here, but the Network model in dqn.py and observe.py is identical, so you could simply import the network into observe.py from dqn.py. You just need to protect the training code from executing on import by checking whether dqn.py is the main script before starting training.
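Concretely, the guard looks like the sketch below. The placeholder layer and the train() stub are not the repo's actual architecture or training loop, just stand-ins to show the pattern.

# dqn.py (sketch)
import torch.nn as nn

class Network(nn.Module):
    def __init__(self, num_actions=4):
        super().__init__()
        self.fc = nn.Linear(84 * 84 * 4, num_actions)   # placeholder layer, not the real conv net

    def forward(self, x):
        return self.fc(x)

def train():
    print("training...")                                 # stands in for the full DQN training loop

if __name__ == '__main__':
    # Runs only when you execute `python dqn.py` directly,
    # not when another file does `import dqn`.
    train()

# observe.py (sketch)
from dqn import Network                                  # reuses the exact same model; training does not start
net = Network()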
Hello brthor. A high LR is faster at the beginning but hits a ceiling in the late phases of training; a low LR is slower, but its ceiling is much higher. Solution: set a high LR and lower it during training, e.g. LR = n / attempt_no. Did you try that?
Anyway, great job, thanks!
This is a common approach for longer training runs, but in this case I was replicating the methods of the paper.
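For anyone who wants to try that kind of 1/t decay, here is a minimal PyTorch sketch. The base LR and the dummy model are placeholders; as noted above, the run in the video keeps the paper's fixed LR instead.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                                  # stand-in for the DQN network
optimizer = torch.optim.Adam(model.parameters(), lr=2.5e-4)
# lr_lambda scales the base LR, so this gives lr = 2.5e-4 / (1 + step)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda step: 1.0 / (1.0 + step))

for step in range(1000):
    loss = model(torch.randn(8, 4)).pow(2).mean()        # dummy loss, stands in for the TD error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                                     # lowers the LR after each update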