Effective Prediction of Missing Data on Apache Spark over Multivariable Time Series - 2018

Research Area:  Big Data

Abstract:

Data are being generated in greater volumes, and in more areas, than ever before. In practice, however, some values in the collected data are always missing, which makes it harder to extract maximal value from these large-scale data sets. For multivariable time series in particular, most existing methods may be either infeasible or inefficient at predicting the missing data. In this paper, we take up the challenge of missing data prediction in multivariable time series by employing improved matrix factorization techniques. Our approaches are designed to make full use of both the internal patterns of each time series and the information shared across time series from multiple sources. Based on this idea, we impose three different regularization terms to constrain the objective functions of matrix factorization and build five corresponding models. Extensive experiments on real-world and synthetic data sets demonstrate that the proposed approaches can effectively improve the performance of missing data prediction in multivariable time series. Furthermore, we demonstrate how to take advantage of the high processing power of Apache Spark to perform missing data prediction in large-scale multivariable time series.
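
The general idea of regularized matrix factorization for missing values can be illustrated with a minimal sketch. This is not one of the paper's five models, and the record above does not state what the three regularization terms are. The sketch assumes a standard L2 penalty plus a temporal-smoothness penalty on neighbouring time factors, fitted by plain stochastic gradient descent on a single machine rather than on Apache Spark. All names (`factorize`, `impute`, `lam`, `mu`) and the toy data are hypothetical.

```python
import random

def factorize(X, rank=1, lam=0.01, mu=0.1, lr=0.01, epochs=2000, seed=0):
    """Fit X[i][t] ~ dot(U[i], V[t]) on observed entries only.

    X is a list of rows (one per time series); None marks a missing value.
    lam weights the L2 penalty; mu weights an assumed smoothness penalty
    that pulls each time factor V[t] toward its neighbours V[t-1], V[t+1].
    Both penalties are applied once per observed entry (a common SGD shortcut).
    """
    rng = random.Random(seed)
    n, T = len(X), len(X[0])
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(T)]
    observed = [(i, t) for i in range(n) for t in range(T) if X[i][t] is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, t in observed:
            pred = sum(U[i][k] * V[t][k] for k in range(rank))
            err = X[i][t] - pred
            for k in range(rank):
                u, v = U[i][k], V[t][k]
                # Gradient of the smoothness penalty with respect to V[t][k].
                smooth = 0.0
                if t > 0:
                    smooth += v - V[t - 1][k]
                if t < T - 1:
                    smooth += v - V[t + 1][k]
                U[i][k] += lr * (err * v - lam * u)
                V[t][k] += lr * (err * u - lam * v - mu * smooth)
    return U, V

def impute(X, U, V):
    """Return a copy of X with each None replaced by the model's prediction."""
    return [
        [x if x is not None else sum(a * b for a, b in zip(U[i], V[t]))
         for t, x in enumerate(row)]
        for i, row in enumerate(X)
    ]

# Toy data: three series that share one slowly rising temporal pattern.
scales = [1.0, 2.0, 0.5]
truth = [[s * (1 + 0.1 * t) for t in range(10)] for s in scales]
hidden = {(0, 3), (1, 7), (2, 5)}
X = [[None if (i, t) in hidden else truth[i][t] for t in range(10)]
     for i in range(3)]

U, V = factorize(X)
filled = impute(X, U, V)
```

In this sketch each hidden entry is predicted from two sources at once: the other time steps of the same series (through `U[i]`) and the other series at the same time step (through `V[t]`), which mirrors the abstract's distinction between internal patterns and cross-series information. Time steps with no observed value in any series receive no updates here, so a practical version would need to handle that case.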

Keywords:  

Author(s) Name:  Weiwei Shi, Yongxin Zhu, Philip S. Yu, Jiawei Zhang, Tian Huang, Chang Wang and Yufeng Chen

Journal name:  IEEE Transactions on Big Data

Conference name:  

Publisher name:  IEEE

DOI:  10.1109/TBDATA.2017.2719703

Volume Information:  Dec. 2018, pp. 473-486, vol. 4