At the beginning of each trading day, only the Open price is known. High, Low and Volume are not yet known, so they cannot be used as features to predict that day's Close price.
Exactly, so what is the solution?
@kibs_neville Using the Open price to predict.
You introduced lookahead bias into your model training by using High, Low and Volume, which are unknown at the open of the candle. What you could do is shift your Close column for your y variable, to try predicting the next candle's close price.
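For anyone wondering what that shift looks like in pandas, here is a minimal sketch. The toy numbers and the `NextClose` column name are made up for illustration, and the OHLCV column names are an assumption about the video's CSV.

```python
import pandas as pd

# Hypothetical toy candles; column names are assumed, not taken from the video's file.
df = pd.DataFrame({
    "Open":   [100.0, 102.0, 101.0, 105.0],
    "High":   [103.0, 104.0, 106.0, 107.0],
    "Low":    [99.0, 100.0, 100.0, 103.0],
    "Close":  [102.0, 101.0, 105.0, 106.0],
    "Volume": [1000, 1200, 900, 1500],
})

# Target: the NEXT candle's close. shift(-1) moves each value up by one row,
# so row t holds the close of row t+1.
df["NextClose"] = df["Close"].shift(-1)

# The last row has no next candle, so its target is NaN and the row is dropped.
supervised = df.dropna(subset=["NextClose"])

# Every feature in row t is fully known once candle t has closed.
X = supervised[["Open", "High", "Low", "Close", "Volume"]]
y = supervised["NextClose"]
```

With this layout, High, Low and Volume are legitimate inputs, because they describe a candle that has already finished by the time the prediction is made.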
No train-test split. This is the equivalent of giving the model the answer sheet to the test, so you don’t get an accurate picture of model performance.
The y value should come from a different row than the current open, low, high and volume. Should we use other data, rather than open, low, high and volume, to predict the future stock price?
You can use indicator values (EMA crosses, MACD, RSI, etc.) as features instead of the OHLV values.
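A rough sketch of what such indicator features could look like with plain pandas. The function name, the window lengths and the toy prices are all hypothetical, and the RSI here uses simple rolling averages rather than Wilder's smoothing, so treat it as one possible variant.

```python
import pandas as pd

def add_indicators(frame, fast=3, slow=5, rsi_window=3):
    """Return a copy of frame with EMA and RSI columns derived from Close.

    Window lengths are illustrative defaults, not recommendations.
    """
    out = frame.copy()
    close = out["Close"]

    # Exponential moving averages and their difference (a crossover signal).
    out["ema_fast"] = close.ewm(span=fast, adjust=False).mean()
    out["ema_slow"] = close.ewm(span=slow, adjust=False).mean()
    out["ema_diff"] = out["ema_fast"] - out["ema_slow"]

    # RSI from simple rolling means of gains and losses.
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(rsi_window).mean()
    loss = (-delta.clip(upper=0)).rolling(rsi_window).mean()
    out["rsi"] = 100 - 100 / (1 + gain / loss)
    return out

# Hypothetical closing prices for illustration only.
prices = pd.DataFrame({"Close": [100.0, 102.0, 101.0, 105.0, 106.0, 104.0, 108.0]})
features = add_indicators(prices)
```

These columns only use closes up to and including the current row, so they avoid lookahead as long as the target is a later candle.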
Nice work, thanks for sharing.
Thanks for watching!
Do you not need to split the dataset?
He did, but without using the split method. He manually assigned X as all the rows except the last one, excluding the Close column. He then assigned Y as the Close values of all the rows except the last one, since the last row is the test set.
He then trained the model on those and ran the test on the last row's X values (the test data, i.e. the columns excluding Close) to predict Y (the Close value for the last row). It's not the best approach, as the split is very unbalanced. One needs more data to make it more accurate.
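A more balanced split than "everything except the last row" could look like the sketch below. The helper name and the 20% test fraction are assumptions for illustration. The point is only that the split stays in time order, with the latest rows held out.

```python
import pandas as pd

def chronological_split(frame, test_fraction=0.2):
    """Split rows in time order: earliest rows for training, latest for testing."""
    n_test = max(1, int(len(frame) * test_fraction))
    return frame.iloc[:-n_test], frame.iloc[-n_test:]

# Hypothetical data: ten consecutive closing prices, oldest first.
df = pd.DataFrame({"Close": [float(c) for c in range(100, 110)]})
train, test = chronological_split(df, test_fraction=0.2)
```

Shuffling before splitting would mix future rows into the training set, which is why the rows are sliced by position here instead.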
Please include a link to the data. Thanks a lot.
Why would the model predict only 263 when the last couple of days are already above 270? Those values are included in the data used for the prediction, so why 263 and not 270-280?
How would you make a graph based on this? Thank you
Bogus exercise. The features are already part of the future data, so making predictions with them makes no sense.
Exactly. The y value should come from a different row than the current open, low, high and volume. Should we use other data, rather than open, low, high and volume, to predict the future stock price?
Thank you!!
Excellent. Could you make a video on Portfolio Optimization using Black Litterman Model?
You are using the High and Low of the hour, but you will only know this information once the hour is finished. These two features don't make sense. Thanks anyway for the video.
I am working on a similar project on Colab, but I cannot import sklearn ensemble RandomForestEnsemble. Please help me.
Man I'm working on a trading bot. How much for your help?
Thanks (:
Argh. I get a FileNotFoundError at the line -- df = pd.read_csv('stock_data.csv')
You would need to include the file path to your stock_data.csv file: pd.read_csv('/path/to/file'). That error means that your notebook can't find the CSV file.
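If it helps, here is a small sketch that checks the path before reading, so the error message shows where the notebook is actually looking. `load_prices` is a hypothetical helper name, not something from the video.

```python
from pathlib import Path

import pandas as pd

def load_prices(csv_path):
    """Read a CSV, failing with a message that shows the resolved path."""
    path = Path(csv_path).expanduser()
    if not path.is_file():
        # Show both the resolved path and the current working directory,
        # since relative paths are resolved against the latter.
        raise FileNotFoundError(
            f"No CSV at {path.resolve()} (current directory: {Path.cwd()})"
        )
    return pd.read_csv(path)
```

Calling `load_prices('stock_data.csv')` only works if the file sits in the notebook's current working directory. Otherwise pass the full path.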
@ElectricSH33P Ah, thank you (I'm very new to this.)