Research Area:  Machine Learning
Encrypted communication on the Internet using the HTTPS protocol represents a challenge for network intrusion detection systems. While encryption significantly helps to preserve users' privacy, it also limits a detection system's ability to understand the traffic and effectively identify malicious activities. In this work, we propose a method for modeling and representing encrypted communication from logs of web communication. The idea is based on introducing communication snapshots of individual users' activity that model contextual information of the encrypted requests. This helps to compensate for the information hidden by the encryption. We then propose statistical descriptors of the communication snapshots that can be consumed by various machine learning algorithms for either supervised or unsupervised analysis of the data. In the experimental evaluation, we show that the presented approach can be used even on a large corpus of network traffic logs, as the creation of the descriptors can be effectively implemented on a Hadoop cluster.
Keywords:  
Communication Patterns
Malware Discovery
Machine Learning
Deep Learning
Author(s) Name:  Jan Kohout, Tomáš Komárek, Přemysl Čech, Jan Bodnár and Jakub Lokoč
Journal name:  Expert Systems with Applications
Conference name:  
Publisher name:  Elsevier
DOI:  10.1016/j.eswa.2018.02.010
Volume Information:  Volume 101, 1 July 2018, Pages 129-142
Paper Link:   https://www.sciencedirect.com/science/article/abs/pii/S0957417418300794