Data mining techniques can be classified broadly as predictive or descriptive.
Predictive:
- a. Classification
- b. Regression
- c. Time series Analysis
- d. Prediction
Descriptive:
- a. Clustering
- b. Summarization
- c. Association Rules
- d. Sequence Discovery
You can refer to the chart below for an overview.
Now I am going to give a brief introduction to each type.
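Classification:
Classification is a form of what is often referred to as “supervised learning”. It starts from a predefined set of groups (classes), and a model built from already labeled examples is used to predict which group a new item belongs to.
(e.g) Airport security maintains a set of metrics and tries to predict whether a passenger is a potential terrorist.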
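The idea of predicting a group from labeled examples can be sketched with a 1-nearest-neighbor classifier, which is only one of many classification methods. The feature values, the labels and the names `TRAINING` and `classify` are invented for illustration and are not taken from any real screening system.

```python
import math

# Hypothetical labeled training data: (feature vector, class label).
# Both the features and the labels are made up for this sketch.
TRAINING = [
    ((1.0, 1.0), "low risk"),
    ((1.5, 2.0), "low risk"),
    ((8.0, 9.0), "high risk"),
    ((9.0, 8.5), "high risk"),
]


def classify(point, training=TRAINING):
    """Return the label of the training example closest to `point`."""
    nearest = min(training, key=lambda example: math.dist(point, example[0]))
    return nearest[1]


print(classify((2.0, 1.5)))
print(classify((8.5, 9.0)))
```

The predefined groups are the labels in the training data; a new point simply inherits the label of its closest known example.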
Regression:
Regression assumes that the data fits a known functional form, such as linear or logistic, and that future data will follow the same form. It then predicts a value by fitting that function to the data set.
(e.g) Investing in a pension fund. You calculate your annual income and estimate what you will need after you retire. Then, based on the gap between your present income and the income you will need, you make investment decisions. The prediction can be made with a simple regression formula and revised every year.
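As a minimal sketch of the simple regression formula, the code below fits a straight line by ordinary least squares and extrapolates it. The income figures and the names `fit_line`, `years` and `income` are assumptions made up for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical annual income (in thousands) for five consecutive years.
years = [1, 2, 3, 4, 5]
income = [50.0, 52.0, 54.0, 56.0, 58.0]

slope, intercept = fit_line(years, income)
predicted_year_10 = slope * 10 + intercept
print(slope, intercept, predicted_year_10)
```

Refitting the line each year with the newest figure is one way to revise the prediction annually.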
Time series Analysis:
With time series analysis, the value of an attribute is examined as it changes over time, usually at regular intervals.
(e.g) Buying company stock. Take the month-by-month performance of companies X, Y and Z, try to predict their growth over the next year, and buy stocks based on the predicted growth.
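One very simple way to turn a monthly series into a forecast is a moving average over the most recent values. This is a sketch of a single basic technique, not a recommended way to pick stocks; the prices and the name `moving_average_forecast` are hypothetical.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    if window <= 0 or len(series) < window:
        raise ValueError("window must be positive and no longer than the series")
    return sum(series[-window:]) / window


# Hypothetical month-end prices for one company, oldest first.
monthly_price = [100.0, 102.0, 101.0, 105.0, 107.0, 109.0]

forecast = moving_average_forecast(monthly_price, window=3)
print(forecast)
```

The order of the values matters here, which is what distinguishes time series data from an unordered data set.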
Prediction:
Prediction is related to time series analysis but is not time bound. It is used to predict a value based on past and current data.
(e.g) The water flow of a river is measured by various monitors at different levels and at different time intervals. That information is then used to predict the future water flow.
Clustering:
Clustering is widely called unsupervised learning. It is similar to classification except that it has no predefined groups; instead, the data itself defines the groups.
(e.g) Consider a supermarket that records buying details such as age, job and purchase amount. We can group customers by age against percentage, as well as by job against percentage, to make meaningful business decisions about which customer group to target.
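To show how the data itself can define the groups, here is a minimal one-dimensional k-means sketch that splits customer ages into two clusters. The ages, the starting centers and the name `k_means_1d` are assumptions for illustration; k-means is just one of several clustering algorithms.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers with k-means."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical customer ages; no group labels are supplied.
ages = [21, 23, 25, 47, 52, 55]

centers, clusters = k_means_1d(ages, centers=[20.0, 60.0])
print(centers)
print(clusters)
```

Unlike the classification case, nothing tells the algorithm in advance what the groups mean; it only finds values that lie close together.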
Summarization:
Summarization associates a subset of the data with a short description or snippet.
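A short description of a subset can be as simple as a few descriptive statistics. The sketch below uses Python's standard `statistics` module; the purchase amounts are made up for the example.

```python
import statistics

# Hypothetical purchase amounts for one subset of customers.
purchase_amounts = [12.5, 40.0, 7.5, 20.0, 20.0]

summary = {
    "count": len(purchase_amounts),
    "mean": statistics.mean(purchase_amounts),
    "median": statistics.median(purchase_amounts),
    "min": min(purchase_amounts),
    "max": max(purchase_amounts),
}
print(summary)
```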
Association Rules:
This is also called link analysis. It is all about uncovering relationships among data.
(e.g) Amazon's “People who bought this also bought this” model
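The strength of a rule such as "people who buy bread also buy milk" is commonly measured by its support and confidence. The following sketch computes both over a handful of invented shopping baskets; it does not describe how any real retailer implements its recommendations.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated probability of `consequent` given `antecedent` in a basket."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
```

A rule is usually kept only if both numbers exceed thresholds chosen by the analyst.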
Sequence Discovery:
Sequence discovery is about finding the order in which activities occur.
(e.g) In a shop, people may often buy toothpaste after a toothbrush. It is all about the sequence in which users buy products; based on that, the shop owner can arrange the items near each other.
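A minimal sketch of this idea counts how many customers bought one item before another. The purchase histories and the name `ordered_pair_counts` are hypothetical, and real sequence mining algorithms are considerably more elaborate.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, one list per customer, in buying order.
sequences = [
    ["toothbrush", "toothpaste", "floss"],
    ["toothbrush", "toothpaste"],
    ["soap", "toothbrush", "toothpaste"],
    ["toothpaste", "soap"],
]


def ordered_pair_counts(sequences):
    """Count, per (earlier, later) item pair, how many customers bought in that order."""
    counts = Counter()
    for sequence in sequences:
        # combinations() keeps positional order, so each pair is (earlier, later).
        # The set ensures a pair is counted at most once per customer.
        counts.update(set(combinations(sequence, 2)))
    return counts


counts = ordered_pair_counts(sequences)
print(counts.most_common(1))
```

Unlike association rules, the order matters: the pair (toothbrush, toothpaste) is counted separately from (toothpaste, toothbrush).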