

Mrs. N. Nithya, Research Scholar, and Dr. N. Balakumar, Assistant Professor, Pioneer
College of Arts and Science, Jothipuram, Coimbatore-47.



Data mining is the process of analyzing data
from different views and summarizing it into useful information. “Data mining, also
popularly referred to as knowledge discovery from data (KDD), is the automated
or convenient mining of patterns representing data implicitly stored or captured
in large databases, data warehouses, the Web, other massive information repositories
or data streams.” This paper surveys data mining techniques such as
classification, clustering, regression, and summarization, and also discusses
some data mining applications.

information discovery in data, data mining application, descriptive model,
predictive model.


Data mining, the discovery of hidden predictive
information in large data sets, is a powerful technology with great
potential to help companies focus on the most important information in their
data warehouses. Data mining (sometimes called data or knowledge discovery) is
the process of analyzing data from different perspectives and summarizing it
into useful information, that is, information that can be used to increase revenue,
cut costs, or both.

In the knowledge discovery process, data mining techniques fall
into two major categories, descriptive and predictive. Each category
comprises several different approaches.




Data mining, also known as Knowledge
Discovery in Databases, refers to finding or “mining” knowledge from large
amounts of data. Data mining techniques operate on large volumes of
data to discover hidden patterns and relationships that help in decision making.
For this reason, many people use the term “knowledge discovery in data”, or KDD, for data
mining [1].

i) Data cleaning: This is the first step, in which
noisy and irrelevant data are removed from the collected raw data.

ii) Data integration: At this step, various
data sources are combined into meaningful and useful data.

iii) Data selection: Here, data relevant to
the analysis are retrieved from the various sources.

iv) Data transformation: In this step, data are
converted or consolidated into the forms required for mining by performing
operations such as smoothing, normalization, or aggregation.

v) Data mining: At this step, various
intelligent techniques and tools are applied in order to extract data patterns.

vi) Pattern evaluation: At this step, interesting
patterns representing knowledge are identified based on given measures.

vii) Knowledge representation: This is the
last stage, in which visualization and knowledge representation techniques are
used to help users understand and interpret the mined knowledge.
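
The first five steps above can be sketched as a small pipeline. This is only an illustrative sketch: the records, the field names, and the toy "mining" rule (flagging records above the mean) are assumptions made for the example, not part of the surveyed methods.

```python
from statistics import mean

# Hypothetical raw records from two sources; None marks a missing value.
store_a = [{"id": 1, "age": 25, "spend": 120.0},
           {"id": 2, "age": None, "spend": 80.0}]
store_b = [{"id": 3, "age": 40, "spend": 300.0},
           {"id": 4, "age": 31, "spend": 60.0}]

def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Data selection: keep only the attributes relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]

def normalize(records, field):
    """Data transformation: min-max normalization of one field to [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**r, field: (r[field] - lo) / span} for r in records]

def mine(records, field):
    """Data mining (toy stand-in): keep records above the mean of a field."""
    threshold = mean(r[field] for r in records)
    return [r for r in records if r[field] > threshold]

def run_pipeline():
    data = integrate(clean(store_a), clean(store_b))
    data = select(data, ["id", "spend"])
    data = normalize(data, "spend")
    return mine(data, "spend")

if __name__ == "__main__":
    print(run_pipeline())
```

In a real project each function would be replaced by a far richer procedure; the point here is only the order in which the steps feed one another.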



The data mining
process extracts information from large data sets and transforms it
into an understandable form for further use, which helps to achieve
specific objectives. The goal of a data mining effort is normally either to
create a descriptive model or a predictive model [7]. A descriptive model presents the data in a concise form that is
essentially a summary of the data points; it finds patterns in the data and
captures the relationships between the attributes the data represent. The
descriptive model includes tasks such as clustering, association rules,
summarization, and sequence discovery. The predictive model predicts unknown values of data
using known results found in other data sets [5]. The predictive data
mining model includes classification, prediction, regression, and time series
analysis, as shown in Figure 1.

Figure 1: Data Mining Techniques
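
The difference between a descriptive and a predictive model, as described above, can be illustrated with a minimal sketch. The data (hours studied and exam scores) and the function names are hypothetical; summary statistics stand in for a descriptive model and an ordinary least-squares line stands in for a predictive one.

```python
from statistics import mean

# Hypothetical observations: hours studied (x) and exam score (y).
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

def describe(values):
    """Descriptive model: summarize the data that has been observed."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}

def fit_line(xs, ys):
    """Predictive model: least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept

if __name__ == "__main__":
    print(describe(scores))                    # summary of known data
    print(predict(fit_line(hours, scores), 5.0))  # estimate for unseen input
```

The descriptive function only restates what is in the data, while the fitted line is used to estimate a value for an input that was never observed.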

• Classification: mining patterns that can classify future data into known classes.

• Association rule mining: mining any rule of the form X → Y, where X and Y
are sets of data items.

• Clustering: identifying a set of similarity groups in the data.

• Sequential rule mining: a sequential rule A → B says that event A will be
immediately followed by event B with a certain confidence.

• Deviation detection: discovering the most significant changes in data.

• Data visualization: using graphical methods to show patterns in data.
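
As a small illustration of the sequential rule A → B from the list above, the sketch below estimates the rule's confidence on a single event sequence. Treating confidence as the share of occurrences of A that are immediately followed by B is an assumption made for this example, and the event log is hypothetical.

```python
def sequential_confidence(events, a, b):
    """Confidence of the rule a -> b on one sequence.

    Counts every occurrence of `a` that has a successor and returns the
    fraction of those occurrences immediately followed by `b`.
    """
    followers = [nxt for cur, nxt in zip(events, events[1:]) if cur == a]
    if not followers:
        return 0.0
    return followers.count(b) / len(followers)

# Hypothetical event log of one user session.
log = ["login", "search", "login", "search", "login", "logout"]

if __name__ == "__main__":
    print(sequential_confidence(log, "login", "search"))
```

Here "login" occurs three times and is followed by "search" in two of them, so the rule login → search holds with confidence 2/3 on this log.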



Data mining applications are widely used in
diverse areas such as retail stores, hospitals, banks, and insurance companies
[2]. Domains such as health care, finance, insurance, and retail combine
data mining with statistics, pattern recognition, and other
important tools to perform data analytics. Data mining is used primarily for
decision making.

Health care: In medicine, data mining makes it possible to
characterize patient activity in order to anticipate incoming office visits. It also helps
identify the patterns of successful medical therapies for different illnesses.

Educational Data Mining (EDM) is a growing field that extracts knowledge from
data generated in educational environments. The goals of EDM include predicting
students’ learning behavior, emotions, and skills [3]. Such studies improve
teaching methods by building an understanding of the student and support accurate decisions.

Market Basket Analysis: Market basket analysis is a technique that uses
association rule mining to understand the purchasing behavior of customers.
It also allows sellers to understand their business and their customers’ needs, and to
make profitable changes accordingly.
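
The association rules used in market basket analysis are usually judged by their support and confidence. The sketch below computes both for a rule X → Y over a handful of hypothetical baskets; the basket contents and function names are assumptions made for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, x, y):
    """Confidence of X -> Y: support(X and Y together) / support(X)."""
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0
    return support(transactions, set(x) | set(y)) / support_x

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

if __name__ == "__main__":
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
```

With these baskets, bread and milk appear together in two of four transactions (support 0.5), and two of the three baskets containing bread also contain milk (confidence 2/3).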

Banking: Data mining can contribute to solving business problems in banking
and finance by finding patterns, causalities, and correlations in business
information and market prices. Managers may use this information for
better segmenting, targeting, acquiring, retaining, and maintaining a profitable
customer base.
Analysis: Data mining is very useful in data pre-processing and the integration
of databases. It allows researchers to identify co-occurring
sequences and the correlations between activities. Data visualization and
visual data mining give the researcher a clear view of the data.


The techniques surveyed above show that data mining
is a powerful and essential means of
processing data: it yields accurate, targeted
results from large and rapidly growing data sets worldwide. This paper discussed the
idea of data mining, the KDD process, and techniques such as
clustering, association, classification, and prediction. It also
offered some insights into data mining applications.


[1] Aarti Sharma et al.,
“Application of Data Mining – A Survey Paper”, International Journal of
Computer Science and Information Technologies, Vol. 5 (2), 2014.

[2] Smita, Priti and Sharma, “Use of Data
Mining in Various Field: A Survey Paper”, IOSR Journal of Computer Engineering,
Volume 16, Issue 3, Ver. V (May-Jun. 2014).

[3] Brijesh Kumar Baradwaj, Saurabh
Pal, “Mining Educational Data to Analyze Students Performance”, (IJACSA)
International Journal of Advanced Computer Science and Applications, Vol. 2,
No. 6, 2011.

[4] J. Han and M. Kamber, “Data
Mining, Concepts and Techniques”, Morgan Kaufmann, 2000.

[5] Nikita Jain, Vishal Srivastava, “Data
Mining Techniques: A Survey Paper”, IJRET: International Journal of Research in
Engineering and Technology, Volume 02, Issue 11, Nov-2013.

[6] Prof. Dr. Wolfgang Karl Hardle, “Time
Series Data Mining Methods: A Review”, Berlin, March 25, 2015.

[7] Pradnya P. Sondwale, “Overview of
Predictive and Descriptive Data Mining Techniques”, IJARCSSE, Volume 5, Issue 4,
April 2015.

[8] Data Mining: Concepts and Techniques,
Third Edition (The Morgan Kaufmann Series in Data Management Systems) 3rd