A Theoretical Approach to the Study of Stock Price Prediction-Comparative Analysis using ARIMA and Random Forest Model
Authors : Dr. Srinivas R, Archana S and Geethanjali
Abstract :
Stock price forecasting is the estimation of a stock's future price. It supports financial planning, efficient decision making, risk management and investment planning. This study compares the performance of a machine learning algorithm (Random Forest) and a time series model (ARIMA) in predicting stock prices using historical data from Apple Inc. The data are collected from various financial sources and consist mainly of daily stock prices with the features Open, High, Low, Close, Volume, and Adjusted Close. The study uses the Close price as the target variable. Data pre-processing involves handling missing values and creating lag features (lag1, lag2, lag3) to enhance the predictive capability of the models. The data are split into 80% training data and 20% testing data for evaluation. The ARIMA model, with parameters (p=5, d=1, q=0), is fitted to the training data and used to forecast stock prices for the test data. The Random Forest Regressor is trained on the lag features and used to predict stock prices. Both models are evaluated using Mean Squared Error (MSE) and Mean Absolute Error (MAE). The results indicate that the Random Forest model outperforms ARIMA on both MSE and MAE, suggesting that it is better able to capture complex relationships in the data. However, ARIMA remains valuable in scenarios that require explainable and interpretable models. The study concludes that the choice between ARIMA and Random Forest should depend on specific requirements such as interpretability, accuracy, and computational cost. Future work includes combining both models and exploring advanced techniques such as LSTM, GRU, and transformer networks for improved stock price forecasting.
Keywords :
Stock price forecasting, ARIMA, Random Forest, MSE (Mean Squared Error), MAE (Mean Absolute Error), machine learning.
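
The pre-processing and evaluation steps named in the abstract (lag features lag1 to lag3, an 80%/20% split, MSE and MAE) can be illustrated with a minimal sketch. It is not the authors' code. The helper names (`make_lag_rows`, `chronological_split`), the synthetic price series, and the assumption that the split is chronological rather than shuffled are all hypothetical choices made for illustration. No Random Forest or ARIMA model is fitted here; a naive "tomorrow equals today" forecast stands in as the predictor so that the example needs only the Python standard library.

```python
import math

def make_lag_rows(close, n_lags=3):
    """Pair each close price with the n_lags preceding closes (lag1, lag2, ...)."""
    rows = []
    for t in range(n_lags, len(close)):
        lags = [close[t - k] for k in range(1, n_lags + 1)]  # [lag1, lag2, lag3]
        rows.append((lags, close[t]))
    return rows

def chronological_split(rows, train_frac=0.8):
    """Split rows in time order: first train_frac for training, the rest for testing."""
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]

def mse(actual, predicted):
    """Mean Squared Error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Synthetic closing prices (assumption: illustrative only, not Apple Inc. data).
close = [100.0 + 0.5 * t + 2.0 * math.sin(t / 3.0) for t in range(50)]

rows = make_lag_rows(close, n_lags=3)
train, test = chronological_split(rows, train_frac=0.8)

# Stand-in predictor (assumption): forecast each close as its lag1 value.
actual = [target for _, target in test]
naive_pred = [lags[0] for lags, _ in test]

print("train rows:", len(train), "test rows:", len(test))
print("naive MSE:", mse(actual, naive_pred))
print("naive MAE:", mae(actual, naive_pred))
```

In a fuller pipeline, the lag rows in `train` would be passed to a regressor such as scikit-learn's `RandomForestRegressor`, and the raw training series to an ARIMA implementation such as the one in statsmodels. Both models' test-set predictions would then be scored with the same `mse` and `mae` functions so the comparison is like for like.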