Data Mining (DM) is the process of extracting and discovering patterns in large data sets. It is the core step, and often the most challenging and interesting one, in uncovering hidden knowledge and insights from large volumes of data. Data mining has been applied in many fields, including business, medicine, science, and engineering, and it has led to numerous services that benefit both the providers and the consumers of those services.
DM involves six common classes of tasks: anomaly detection, association rule learning, clustering, classification, regression, and summarization. Applying existing data mining algorithms and techniques to real-world problems has recently run into many challenges because of inadequate scalability and other limitations. As a result, current data mining techniques and algorithms are not yet ready to meet the new challenges of big data.
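To make two of the task classes listed above concrete, the following is a minimal sketch of summarization and anomaly detection on a small numeric column. The function names, the sample `transactions` values, and the z-score threshold of 2.0 are illustrative assumptions, not part of any particular data mining system, and a simple z-score rule is only one of many ways to flag anomalies.

```python
from statistics import mean, stdev


def summarize(values):
    """Summarization: a compact description of a numeric column."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


def find_anomalies(values, threshold=2.0):
    """Anomaly detection: flag values whose z-score exceeds `threshold`.

    The z-score uses the sample standard deviation. The threshold is an
    assumed, tunable parameter, not a universal constant.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts with one obvious outlier.
transactions = [12.0, 15.0, 11.0, 14.0, 13.0, 12.0, 16.0, 250.0]

summary = summarize(transactions)
anomalies = find_anomalies(transactions)
```

Here `find_anomalies(transactions)` flags only the 250.0 entry. Both functions need the whole column in memory, which is one small illustration of why such straightforward approaches struggle to scale.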
Mining big data demands highly scalable strategies and algorithms, more effective preprocessing steps such as data filtering and integration, advanced parallel computing environments, and intelligent and effective user interaction. DM runs on both new and legacy systems. It helps businesses make informed decisions quickly, helps detect credit risk and fraud, and helps data scientists analyze enormous amounts of data quickly.
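The combination of a filtering step with parallel processing described above can be sketched as follows. This is an illustrative example under stated assumptions: the record layout (`category`, `amount`), the helper names, and the chunk size are all hypothetical. It uses a thread pool from the Python standard library only to show the split, process, and merge pattern. For CPU-bound work on genuinely large data, a process pool or a distributed framework would likely be needed instead.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def clean(records):
    """Filtering step: keep records with a category and a positive numeric amount."""
    kept = []
    for r in records:
        amount = r.get("amount")
        if r.get("category") and isinstance(amount, (int, float)) and amount > 0:
            kept.append(r)
    return kept


def chunks(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def count_chunk(chunk):
    """Per-chunk work: count records per category."""
    return Counter(r["category"] for r in chunk)


def parallel_category_counts(records, chunk_size=2, workers=4):
    """Filter the records, count each chunk in a worker, then merge the partial counts."""
    filtered = clean(records)
    totals = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_chunk, chunks(filtered, chunk_size)):
            totals.update(partial)
    return totals


# Hypothetical input, including records the filter should drop.
records = [
    {"category": "food", "amount": 12.5},
    {"category": "fuel", "amount": 40.0},
    {"category": "", "amount": 9.0},         # missing category
    {"category": "food", "amount": 7.25},
    {"category": "fuel", "amount": -3.0},    # invalid amount
    {"category": "food", "amount": None},    # missing amount
]

counts = parallel_category_counts(records)
```

Because each chunk is counted independently and the partial results are merged afterwards, the same structure carries over to larger worker pools without changing the per-chunk logic.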