Research Area:  Machine Learning
In this manuscript, we propose a Machine Learning approach to a binary classification problem: predicting the magnitude (high or low) of future stock price variations for individual companies of the S&P 500 index. Sets of lexicons are generated from globally published articles, with the goal of identifying the words that have the greatest impact on the market in a specific time interval and within a certain business sector. Features are then engineered from the generated lexicons and fed to a Decision Tree classifier. The predicted label (high or low) indicates whether the underlying company's stock price variation on the next day is higher or lower than a certain threshold. The performance evaluation, carried out through a walk-forward strategy and against a set of solid baselines, shows that our approach clearly outperforms the competitors. Moreover, the devised Artificial Intelligence (AI) approach is explainable, in the sense that we analyze the white-box model behind the classifier and provide a set of explanations of the obtained results.
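The labeling and walk-forward evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the absolute relative price change as the "magnitude", and the fixed-size rolling window are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def label_variations(prices: List[float], threshold: float) -> List[str]:
    """Label each day by the size of the next day's price variation.

    Assumption: "variation" is the absolute relative change between
    consecutive prices; the paper may define it differently.
    """
    labels = []
    for today, tomorrow in zip(prices, prices[1:]):
        variation = abs(tomorrow - today) / today
        labels.append("high" if variation > threshold else "low")
    return labels


def walk_forward_splits(
    n_samples: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges that roll forward in time.

    Each test window directly follows its training window, so the
    model is never evaluated on data older than what it was fit on.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide the window by one test block


# Hypothetical closing prices, used only to exercise the functions.
prices = [100.0, 101.0, 104.0, 104.5]
labels = label_variations(prices, threshold=0.02)
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
```

In a full pipeline, a classifier such as a decision tree would be fit on the lexicon-derived features of each training window and scored on the test window that follows it.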
Keywords:  
Forecasting
Social networking (online)
Companies
Stock markets
Feature extraction
Task analysis
Prediction algorithms
Author(s) Name:  Salvatore M. Carta; Sergio Consoli; Luca Piras
Journal name:  IEEE Access
Conference name:  
Publisher name:  IEEE
DOI:  10.1109/ACCESS.2021.3059960
Volume Information:  Volume: 9
Paper Link:   https://ieeexplore.ieee.org/abstract/document/9355141/