Research Area:  Machine Learning
Depression is a serious challenge to public health. Many of those who suffer from this disease use social media for information or relief. The text data produced by these users can be used to support research in this field. However, this raw information is not always suitable for direct use in machine learning. Hence, a comparative analysis of different preprocessing techniques was performed to assess their impact on the effectiveness of early depression detection on social media. The results show that preprocessing increases prediction effectiveness. Moreover, mapping emoticons to real emotion words was decisive not only for improving the model's effectiveness, but also for keeping the balance between different evaluation measures.
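As an illustration of the emoticon-mapping step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. The lookup table `EMOTICON_WORDS` and the function name `map_emoticons` are hypothetical, and the emoticon-to-word pairs are assumed for the example, since the abstract does not list the mapping actually used.

```python
import re

# Hypothetical emoticon-to-word table (assumed for illustration only;
# the mapping used in the paper is not given in this record).
EMOTICON_WORDS = {
    ":)": "happy",
    ":-)": "happy",
    ":D": "joyful",
    ":(": "sad",
    ":-(": "sad",
    ":'(": "crying",
    ";)": "playful",
}

# Try longer emoticons first so that three-character forms such as ":-("
# are preferred over any shorter alternative.
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_WORDS, key=len, reverse=True))
)


def map_emoticons(text: str) -> str:
    """Replace each known emoticon with its emotion word."""
    # Pad the replacement with spaces so it never fuses with adjacent words.
    replaced = _PATTERN.sub(lambda m: f" {EMOTICON_WORDS[m.group(0)]} ", text)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join(replaced.split())


if __name__ == "__main__":
    print(map_emoticons("I can't sleep again :( but thanks :)"))
```

After this step the emotional signal carried by the emoticon survives later cleaning such as punctuation removal, which would otherwise discard it.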
Keywords:  
Depression
public health
text data
machine learning
depression detection
social media
evaluation measure
Author(s) Name:  Jose Solenir Lima Figueredo, Rodrigo Tripodi Calumby
Journal name:  Deep Learning
Conference name:  
Publisher name:  ResearchGate
DOI:  10.5753/sbcas.2020.11504
Volume Information: