Research Area:  Machine Learning
Data streaming is an evolving concept in big data, driven by the rapidly growing volume of data from social media, trending websites, and mobile applications. Streaming applications now collect data from live streams to run analyses and generate reports for prediction. Acquiring data from a live stream normally requires skilled professionals who can write complex code and queries. This research work overcomes that drawback by implementing a streaming algorithm that fetches data from Twitter using a keyword search. The Twitter data visualization application is designed for data visualization, report generation, and analysis. Live Twitter data is fetched by configuring the system with Hadoop, the Hive warehouse, and Apache Flume. A keyword file is placed on the Hadoop cluster, and a Flume agent uses it to acquire relevant data, pass it through a Flume channel, and sink the collected data into the Hadoop Distributed File System (HDFS). The application then creates a database in Hive and imports the collected data into a Hive table for visualization. The output is categorized for report generation, and a graphical representation is used for sentiment analysis. The positive, negative, and neutral opinions obtained from the tweets can be used for decision making. The application can be deployed to analyze real-time public opinion about elections, government activities, and any topic of societal interest.
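The final categorization step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how tweets are scored, so the word lists, the function names classify_tweet and summarize_opinions, and the simple word-count rule below are all assumptions chosen for illustration, not the authors' method.

```python
from collections import Counter

# Hypothetical word lists: the abstract does not specify a lexicon or classifier.
POSITIVE_WORDS = {"good", "great", "support", "win", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "corrupt", "fail", "angry"}


def classify_tweet(text: str) -> str:
    """Label a tweet positive, negative, or neutral by counting lexicon hits."""
    # Lowercase each token and strip common punctuation and Twitter symbols.
    words = [w.strip(".,!?#@:;\"'").lower() for w in text.split()]
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize_opinions(tweets):
    """Tally how many tweets fall into each opinion category."""
    return Counter(classify_tweet(t) for t in tweets)


if __name__ == "__main__":
    sample = [
        "Great win for the city!",
        "Bad roads and poor planning",
        "Polling starts at 8 am",
    ]
    print(summarize_opinions(sample))
```

In a pipeline like the one described, the input tweets would come from the Hive table rather than an in-memory list, and the resulting counts would feed the report and chart.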
Author(s) Name:   G. Kavitha; B. Saveen; Nomaan Imtiaz
Conference name:   2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET)
Publisher name:  IEEE
Paper Link:   https://ieeexplore.ieee.org/abstract/document/8821105