Research Area:  Machine Learning
With the ubiquity of the Internet, platforms such as Google and Wikipedia can provide insights into firms' financial performance and capture the collective interest of traders through search trends, web page visitor counts, and/or financial news sentiment. Information emanating from these platforms can significantly affect, or be affected by, changes in the stock market. The overarching goal of this paper is to develop a financial expert system that incorporates these features to predict short-term stock prices. Our expert system comprises two main modules: a knowledge base and an artificial intelligence (AI) platform. The “knowledge base” for our expert system captures: (a) historical stock prices; (b) several well-known technical indicators; (c) counts and sentiment scores of published news articles for a given stock; (d) trends in Google searches for the given stock ticker; and (e) the number of unique visitors to pertinent Wikipedia pages. Once the data is collected, we use a structured approach for data preparation. Then, the AI platform trains four machine learning ensemble methods: (a) a neural network regression ensemble; (b) a support vector regression ensemble; (c) a boosted regression tree; and (d) a random forest regression. In the cross-validation phase, the AI platform picks the “best” ensemble for a given stock. To evaluate the efficacy of our expert system, we first present a case study based on the Citi Group stock ($C) with data collected from 01/01/2013 to 12/31/2016. We show that the expert system can predict the 1-day ahead $C stock price with a mean absolute percent error (MAPE) ≤ 1.50% and the 1–10 day ahead price with a MAPE ≤ 1.89%, which is better than the results reported in the literature. We show that features extracted from online sources do not substitute for the traditional financial metrics, but rather supplement them to improve the prediction performance of machine learning based methods.
To highlight the utility and generalizability of our expert system, we predict the 1-day ahead price of 19 additional stocks from different industries, volatilities and growth patterns. We report an overall mean for the MAPE statistic of 1.07% across our five different machine learning models, including a MAPE of under 0.75% for 18 of the 19 stocks for the best ensemble (boosted regression tree).
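The MAPE statistic and the "pick the best model per stock" step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two forecasters (`naive_last`, `moving_average_3`), the walk-forward scoring scheme, and the price series are hypothetical stand-ins chosen only to keep the example self-contained. The paper's four ensembles and its cross-validation procedure are not reproduced here.

```python
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percent error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


# Hypothetical stand-in forecasters (assumptions, not the paper's models):
# each maps a price history to a 1-day-ahead prediction.
def naive_last(history):
    return history[-1]


def moving_average_3(history):
    window = history[-3:]
    return sum(window) / len(window)


def walk_forward_mape(prices, forecaster, min_history=3):
    """Score a forecaster by predicting each day from the days before it."""
    actual = prices[min_history:]
    predicted = [forecaster(prices[:t]) for t in range(min_history, len(prices))]
    return mape(actual, predicted)


def pick_best(prices, candidates):
    """Return (name, score) of the candidate with the lowest MAPE."""
    scores = {name: walk_forward_mape(prices, f) for name, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


prices = [50.0, 51.0, 52.0, 53.0, 54.0, 55.0]  # made-up illustrative prices
candidates = {"naive_last": naive_last, "moving_average_3": moving_average_3}
best_name, best_score = pick_best(prices, candidates)
```

Scoring each prediction only on earlier days keeps the evaluation free of look-ahead, which matters for time-ordered data such as daily prices.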
Online Data Sources
Author(s) Name:  Bin Weng, Lin Lu, Xing Wang, Fadel M. Megahed and Waldyn Martinez
Journal name:  Expert Systems with Applications
Publisher name:  ELSEVIER
Volume Information:  Volume 112, 1 December 2018, Pages 258-273
Paper Link:   https://www.sciencedirect.com/science/article/abs/pii/S0957417418303622