Efficient Recommendation of De-Identification Policies Using MapReduce - 2017

Research Area:  Big Data

Abstract:

Many data owners are required to release their data in a variety of real-world applications, since it is vitally important to discover the valuable information hidden in the data. However, existing re-identification attacks on the AOL and ADULTS datasets have shown that publishing such data directly may pose tremendous threats to individual privacy. Thus, it is urgent to resolve all kinds of re-identification risks by recommending effective de-identification policies that guarantee both the privacy and the utility of the data. De-identification policies are one of the models that can be used to meet such requirements; however, the number of de-identification policies is exponentially large due to the broad domain of quasi-identifier attributes. To better control the trade-off between data utility and data privacy, skyline computation can be used to select such policies, but efficient skyline processing over a large number of policies remains challenging. In this paper, we propose a parallel algorithm called SKY-FILTER-MR, which is based on MapReduce, to overcome this challenge by computing skylines over large-scale de-identification policies that are represented by bit-strings. To further improve performance, a novel approximate skyline computation scheme is proposed to prune unqualified policies using an approximate domination relationship. With the approximate skyline, the filtering power in the policy space generation stage is greatly strengthened, effectively decreasing the cost of skyline computation over alternative policies. Extensive experiments over both real-life and synthetic datasets demonstrate that the proposed SKY-FILTER-MR algorithm substantially outperforms the baseline approach, running up to four times faster in the optimal case, which indicates good scalability over large policy sets.
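
The skyline idea in the abstract can be illustrated with a small sketch. This is not the SKY-FILTER-MR algorithm itself. It is a generic partition-then-merge skyline written in a MapReduce style, and it omits the approximate domination step. The two scores per policy (privacy risk and utility loss, lower is better for both), the example bit-strings and their values, and the names `dominates`, `skyline`, and `mapreduce_skyline` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical policy record: (bit-string, privacy_risk, utility_loss).
# Lower is better on both scores; the scores below are made up.
Policy = Tuple[str, float, float]


def dominates(a: Policy, b: Policy) -> bool:
    """a dominates b if a is no worse on both scores and strictly better on one."""
    return a[1] <= b[1] and a[2] <= b[2] and (a[1] < b[1] or a[2] < b[2])


def skyline(policies: List[Policy]) -> List[Policy]:
    """Keep only the policies that no other policy dominates (quadratic scan)."""
    return [p for p in policies if not any(dominates(q, p) for q in policies)]


def mapreduce_skyline(policies: List[Policy], num_partitions: int = 2) -> List[Policy]:
    """Map: local skyline per partition. Reduce: skyline of the merged survivors."""
    partitions = [policies[i::num_partitions] for i in range(num_partitions)]
    local_skylines = [skyline(part) for part in partitions]          # map phase
    merged = [p for part in local_skylines for p in part]
    return skyline(merged)                                           # reduce phase


policies = [
    ("1100", 0.10, 0.80),
    ("1010", 0.30, 0.50),
    ("0110", 0.35, 0.60),  # dominated by "1010"
    ("0011", 0.60, 0.20),
    ("0001", 0.90, 0.25),  # dominated by "0011"
]
result = mapreduce_skyline(policies)
print([p[0] for p in result])
```

Computing local skylines first is safe because a policy dominated inside its own partition is also dominated in the full set, so discarding it early cannot change the final skyline; it only shrinks the input to the merge step.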

Keywords:  

Author(s) Name:  Xiaofeng Ding; Li Wang; Zhiyuan Shao; Hai Jin (Services Computing Technology and System Lab, Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan, China)

Journal name:  IEEE Transactions on Big Data

Conference name:  

Publisher name:  IEEE

DOI:  10.1109/TBDATA.2017.2690660

Volume Information:  Volume: 5, Issue: 3, Sept. 1 2019, Page(s): 343-354