#5, First Floor, 4th Street , Dr. Subbarayan Nagar, Kodambakkam, Chennai-600 024 pro@slogix.in

Machine learning based heterogeneous web advertisements detection using a diverse feature set - 2018

Author(s) Name:  Ab Shaqoor Nengroo and K.S. Kuppusamy
Journal name:  Future Generation Computer Systems
Conference name:  
Publisher name:  ELSEVIER
DOI:  10.1016/j.future.2018.06.028
Research Area:  Machine Learning
Abstract:

Advertisement identification and filtering in web pages have gained significance due to factors such as accessibility, security, privacy, and obtrusiveness. Current practice relies on maintaining URL-based regular expressions called filter lists: each URL found on a web page is matched against the filter list. While effective, this procedure scales poorly because the filter list demands regular maintenance. To counter these limitations, we devise a machine learning based advertisement detection system that uses a diverse feature set to distinguish advertisement blocks from non-advertisement blocks. The method can serve as a base for accessibility-related features such as smooth browsing and text summarization for persons with visual impairments, cognitive impairments, and photosensitive epilepsy. A classifier trained on the proposed feature set achieves 98.6% accuracy in identifying advertisements.
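As an illustration of the filter-list baseline that the abstract contrasts against, the following is a minimal Python sketch of matching page URLs against URL-based regular expressions. The patterns, the function names `is_ad_url` and `split_urls`, and the example URLs are all hypothetical and are not taken from the paper or from any real filter list; the paper's own machine learning approach and feature set are not reproduced here.

```python
import re

# Hypothetical filter rules for illustration only; real filter lists
# contain many thousands of entries and need ongoing maintenance.
FILTER_PATTERNS = [
    r"^https?://ads\.",   # hosts beginning with "ads."
    r"/banners?/",        # paths containing /banner/ or /banners/
    r"[?&]ad_id=",        # query strings carrying an ad_id parameter
]
COMPILED = [re.compile(p) for p in FILTER_PATTERNS]


def is_ad_url(url: str) -> bool:
    """Return True if the URL matches any rule in the filter list."""
    return any(rx.search(url) for rx in COMPILED)


def split_urls(urls):
    """Partition URLs into (advertisement, non-advertisement) lists."""
    ads = [u for u in urls if is_ad_url(u)]
    content = [u for u in urls if not is_ad_url(u)]
    return ads, content


if __name__ == "__main__":
    page_urls = [
        "https://ads.example.com/x.js",
        "https://example.com/banner/top.png",
        "https://example.com/article?id=3",
    ]
    ads, content = split_urls(page_urls)
    print("ads:", ads)
    print("content:", content)
```

Every URL that no rule covers passes through as content, which is why such a list must be updated continually as new advertisement sources appear.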

Volume Information:  Volume 89, December 2018, Pages 68-77
Journal Link:

https://www.sciencedirect.com/science/article/abs/pii/S0167739X17328777