At 33:00, Tucker said "we do not have enough information about reward of the alternative actions to update the Q-table". However, in the case of returns, the reward for being long when we should have been short is just the opposite sign, right? In my experience, a full update of the Q-table (every action, not only the one taken) yields much better results.
I made my own deep reinforcement learning AI in Excel to SELL stocks only; buying is still on me. I think I have all the missing parameters for this one. It's now online. Results-wise, it's good: 17% return from May 2020 to date. That data has noise due to COVID-19, so it needs a longer time frame of data to verify normalisation. Since it is in Excel, it's self-documenting.
If the agent's actions don’t affect the environment, why not use contextual bandits to model this instead?
Interesting point.
The problem is, when the agent is actually trading on markets, there will be an impact on the environment, but this impact might be close to impossible to model in a simulation with historical data
Hope you still check the comments on this video. About the 3 months' worth of data: what if the market keeps going in one direction? This will create a serious bias, which may cause a loss of money if the market trends the other way when you plug in your bot.
Hi Oliver -- that's a fair point. However, there are ways to combat that as well. A forward window retraining policy is one way. Alternatively, you can look at ensemble voting of experts, in which bearish models and bullish models move in and out of contention based on tallied votes that take into account changes in market regime. I welcome you to peruse my blog, as I cover these concepts periodically.
The important thing is to allow new data to adjust the model's output accordingly. Obviously, the expectation is that there will be an "adjustment period" in which the model doesn't perform as well, just like humans when we adjust to a new market regime.
One last point: when the model starts to perform against its historical guidelines, some may decide to move into cash until the model is consistent between training, validation and perpetually; others understand that no performance moves up in a straight line and accept periods of underperformance as par for the course.
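One way to picture the forward window retraining mentioned in this reply is as a rolling split: train on a window, evaluate on the block right after it, then slide forward. This is only a sketch under that assumption; the function name `walk_forward_windows` and the window sizes are hypothetical and not taken from the video or the blog.

```python
def walk_forward_windows(n_samples, train_size, test_size):
    """Yield (train_range, test_range) index ranges that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide forward by one test block


# Toy example: 10 bars, retrain on the last 4 bars, evaluate on the next 2.
windows = list(walk_forward_windows(n_samples=10, train_size=4, test_size=2))
for train, test in windows:
    print(list(train), "->", list(test))
```

Each test block only ever sees a model trained on data that came before it, so a one-directional stretch in the training window shows up as degraded out-of-sample performance instead of being hidden.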
@@Lucenaresearch you make sense. where is your blog please?
@@oliverli9630 - lucenaresearch.com/resources/#resources
@@Lucenaresearch looking cooool
It's quite interesting. I have a few queries regarding implementation.
1. Instead of Q-Learning, shouldn't time series learning be used? Especially one that can keep the history, e.g. a recurrent NN?
2. There was a question in the Q&A part regarding why 4 different models are being used. I personally think one model should be enough. One reason is that it solves the problem of choosing which action to take. Secondly, actions are related: if the first action is "Left Thrust", then the next action most probably should be "Right Thrust", unless there is a strong wind from right to left. Choosing the action with the highest value is questionable.
3. Can you please share the code for further experiments?
Thanks and best regards
1. The agent should be provided with a Markov state, i.e. a state that contains all the information it needs to predict the future reward.
Huge respect and admiration for your work, Dr Balch.
Do you consider hyper-parameterization a source of bias? It seems to me that any hyper-parameterization is essentially a form of overfitting. Tuning parameters for optimal performance on a given data set is really just fitting your model to that set. Thoughts?
Sad but probably true
Thanks
This is gold
nice one.
Why do you teach this if it doesn't work?
12:00
Using traditional TA indicators and charts is about as useful as using astrology to make your trade decisions. Try using DOM ladder price data to measure order flow.
Hi Devon,
Thank you for the comment.
Indeed -- the type of data you use and the features you derive from it are critical. I would suggest that DOM ladder price alone will not be sufficient either. See my blog for more concrete examples.
Good luck!
Those who fail to find, assume it simply must not exist. Optimal as well as creative feature engineering / selection, understanding the dynamics of how various indicators interact to both drive as well as measure price action, and robust trade policy structuring heavily influence predictive power. I seldom respond to comments, but I'll make an exception - I have a 3.5-yr. live track of realized Sharpe 2.34 on a ~150MM GMV book using only traditional TA indicators. Running into comments about the uselessness of TA always makes me smile.
It is not gonna work like this.
Why not sir?
@@fernandolener1106 I have been working on RL applications in trading stocks and Forex for more than three years. Simply put, you are missing an important piece of information in your states. Without that, the model won't generalize.
Rasoul Mojtahedzadeh I know which information you are talking about, I just wanted to be sure. I’ve been studying markets for two years already too :)
The only thing I don’t yet know is whether technical indicators are of any help.
Hello Rasoul - Technical information alone is not sufficient. Depending on the asset you're targeting and the investment type you will need an array of features from uncorrelated data sources. Fundamentals, Technicals, Global Macro and most important - alternative data!
@@RasoulMojtahedzadeh I have worked for 1 year on forex with a DDQN AI (dual deep Q-learning network). The AI can BUY, SELL OR WAIT, every time with 50 pip TP / 20 pip SL. In the first version 23% of all trades were winning trades, and the version that I am testing atm is at 35%. With 34% winning trades the AI made a profit. It works :P
My inputs are 34200 TICKS, and I train my AI with 15 000 000 ticks of AUDUSD, 15 000 000 ticks of USDCAD, 15 000 000 ticks of EURUSD and 15 000 000 ticks of AUDUSD.
Yes, my AI can trade ALL forex pairs... not only one :P hihi
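For context on those win rates, here is a rough check of what a 50 pip TP / 20 pip SL setup needs to break even. It assumes every trade closes at exactly the take-profit or the stop-loss and ignores spread, commission and slippage; the function names are hypothetical.

```python
def breakeven_win_rate(tp_pips, sl_pips):
    """Win rate at which expected pips per trade is zero (costs ignored)."""
    return sl_pips / (tp_pips + sl_pips)


def expected_pips(win_rate, tp_pips, sl_pips):
    """Expected pips per trade for a fixed TP/SL, ignoring spread and slippage."""
    return win_rate * tp_pips - (1 - win_rate) * sl_pips


tp, sl = 50, 20
print(f"break-even win rate: {breakeven_win_rate(tp, sl):.1%}")
for wr in (0.23, 0.35):
    print(f"win rate {wr:.0%}: {expected_pips(wr, tp, sl):+.1f} pips per trade")
```

Under these assumptions the break-even win rate is 20 / 70, about 28.6%, so a 23% win rate loses on average while 34-35% is positive before costs.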
hahahaha, I created an AI better than this, applied it to stocks, and it's easy money. There are a lot of items that need to change before this approach becomes profitable.