Research Area:  Machine Learning
Electronic mail is the primary source of various cyber scams, so identifying the author of an email is essential; the result forms significant documentary evidence in digital forensics. This paper presents a model for email author identification (attribution) that uses deep neural networks and model-based clustering techniques. Stylometric features have gained considerable importance in authorship identification because they improve the accuracy of the author attribution task. The experiments were performed on the publicly available benchmark Enron dataset, considering many authors. With the deep neural network technique, the proposed model achieves an accuracy of 94% on five authors, 90% on ten authors, 86% on 25 authors, and 75% on the entire dataset, which is a good level of accuracy for highly imbalanced data. The second, cluster-based technique yielded an excellent 86% accuracy on the entire dataset, with the number of authors considered chosen according to their contribution to the aggregate data.
Author(s) Name:  K. A. Apoorva & S. Sangeetha
Journal name:  SN Applied Sciences
Publisher name:  Springer
Volume Information:  volume 3, Article number: 348 (2021)
Paper Link:  https://link.springer.com/article/10.1007/s42452-020-04127-6