Main Reference Paper: Large-Scale Data Pollution with Apache Spark, IEEE Transactions on Big Data, January 2017 [Java/Spark].
+ Description
  • DaPo, an Apache Spark-based distributed data pollution framework, is proposed for generating large, realistic test data sets for duplicate detection. It is designed to be efficient, scalable, and domain-independent.
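
As a rough illustration of what "polluting" a clean data set means, the following minimal sketch copies clean records and injects a simple typing error into each copy. It uses plain Java collections rather than Spark so that it stays self-contained. The class name `PollutionSketch`, its methods, the adjacent-character swap, and the `duplicateRate` parameter are all hypothetical and are not taken from DaPo itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of data pollution; this is not DaPo's API. */
class PollutionSketch {

    /**
     * Swaps two adjacent characters at a random position to imitate a typing error.
     * If the two characters happen to be equal, the value is returned unchanged.
     */
    static String swapAdjacent(String value, Random rng) {
        if (value.length() < 2) {
            return value; // nothing to swap
        }
        int i = rng.nextInt(value.length() - 1);
        char[] chars = value.toCharArray();
        char tmp = chars[i];
        chars[i] = chars[i + 1];
        chars[i + 1] = tmp;
        return new String(chars);
    }

    /**
     * Returns all clean records followed by their polluted duplicates.
     * Each clean record is duplicated with probability duplicateRate (assumed parameter).
     */
    static List<String> pollute(List<String> clean, double duplicateRate, long seed) {
        Random rng = new Random(seed);
        List<String> result = new ArrayList<>(clean);
        for (String record : clean) {
            if (rng.nextDouble() < duplicateRate) {
                result.add(swapAdjacent(record, rng));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> clean = Arrays.asList("Anna Schmidt", "John Miller", "Maria Garcia");
        for (String record : pollute(clean, 1.0, 42L)) {
            System.out.println(record);
        }
    }
}
```

In a distributed setting the per-record step would presumably run as a transformation over a partitioned data set; that mapping is an assumption here, not a description of the paper's implementation.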

+ Aim & Objectives
  • To ensure efficient, scalable, and schema-independent data generation.

  • To evaluate the quality of new duplicate detection algorithms using the generated data.
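
The second objective can be sketched as follows, assuming the generator records which record pairs it injected as duplicates (a gold standard) and that a detection algorithm reports its own set of pairs. Precision and recall over those pairs are one common way to measure quality. The class `PairEvaluation`, the pair encoding, and the sample ids are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: scores detected duplicate pairs against a known gold standard. */
class PairEvaluation {

    /** Canonical key for an unordered pair of record ids, so (4, 1) equals (1, 4). */
    static String pair(int a, int b) {
        return Math.min(a, b) + "-" + Math.max(a, b);
    }

    /** Fraction of detected pairs that are true duplicates; 0.0 if nothing was detected. */
    static double precision(Set<String> gold, Set<String> detected) {
        if (detected.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        Set<String> hits = new HashSet<>(detected);
        hits.retainAll(gold);
        return (double) hits.size() / detected.size();
    }

    /** Fraction of true duplicate pairs that were detected; 0.0 if the gold standard is empty. */
    static double recall(Set<String> gold, Set<String> detected) {
        if (gold.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        Set<String> hits = new HashSet<>(gold);
        hits.retainAll(detected);
        return (double) hits.size() / gold.size();
    }

    public static void main(String[] args) {
        // Pairs injected by the generator (assumed to be known).
        Set<String> gold = new HashSet<>(Arrays.asList(pair(1, 4), pair(2, 5), pair(3, 6)));
        // Pairs reported by the algorithm under test (made-up sample).
        Set<String> detected = new HashSet<>(Arrays.asList(pair(4, 1), pair(2, 5), pair(2, 3)));
        System.out.println("precision = " + precision(gold, detected));
        System.out.println("recall = " + recall(gold, detected));
    }
}
```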

+ Contribution
  • An efficient approach is contributed that supports more sophisticated handling of data, representation, and error schemas.

+ Software Tools & Technologies
  • Java JDK 1.8, MySQL 5.5.40, Apache Spark 1.6.2.

  • NetBeans 8.0.1 & J2EE.

+ Project Recommended For
  • M.E / M.Tech / MS / Ph.D. - Customized according to client requirements.

+ Order To Delivery
  • No ready-made projects; delivery depends on the complexity of the project and its requirements.

Professional Ethics: We at S-Logix appreciate students who willingly contribute at least one line of thinking of their own while preparing the project with us. The project we provide should be treated only as a model project. Apply it with confidence, contribute your own ideas through our expert guidance, and enrich your knowledge.

[/vc_column_text][/vc_column][/vc_row]