Data Mining Notes
Steps
1. Define the problem;
2. Gather the dataset;
3. Dataset cleaning:
A. Based on the problem definition, get rid of data that is not useful;
B. Fill in missing values, using the average (mean) of the observed values or another imputation method;
4. Analysis
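As a rough illustration of step 3B, the sketch below fills missing entries with the mean of the observed values. The function name `impute_mean`, the use of `None` to mark a missing entry, and the sample column are all assumptions made for this example; they do not come from the notes.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical feature column with two missing entries.
column = [4.0, None, 6.0, 8.0, None]
print(impute_mean(column))
```

Mean imputation is only one option; the median or a model-based estimate may suit skewed or strongly correlated data better.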
Analysis Method
1. Feature Extraction Methods
Target: rank the features in order to determine which ones are more important for recognizing the class
A. T-test
Target: use the t statistic to measure how different the two classes are on a feature. The larger t is, the more useful that feature is for distinguishing the classes.
Steps
a. Calculate the average (u) of Class 1 and Class 2
b. Calculate s, the spread (standard deviation) term, from the samples of the two classes
c. Calculate t from the difference of the two averages divided by s
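The notes do not spell out the formulas for s and t. One common choice, assumed here rather than taken from the notes, is the two-sample t statistic with separate class variances (Welch's form), where n_k is the number of samples in class k and x_{k,i} is the i-th sample of class k:

```latex
u_k = \frac{1}{n_k} \sum_{i=1}^{n_k} x_{k,i},
\qquad
s_k^2 = \frac{1}{n_k - 1} \sum_{i=1}^{n_k} \left( x_{k,i} - u_k \right)^2,
\qquad
s = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},
\qquad
t = \frac{\left| u_1 - u_2 \right|}{s}
```

When both classes have the same number of samples, the pooled-variance form of the t-test gives the same value of t.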
Example:
Class 1 | Class 2
25      | 5
16      | 12
21      | 9
18      | 13
32      | 19
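The example values above can be run through the t-test steps with a short script. This is a minimal sketch that assumes the Welch form of the statistic (separate sample variances per class); the notes do not confirm which form was intended, and the function name `t_statistic` is made up for this example.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(class_1, class_2):
    """Two-sample t statistic, Welch form (an assumption, see lead-in)."""
    # Step a: average of each class.
    u1, u2 = mean(class_1), mean(class_2)
    # Step b: spread term; variance() is the sample variance (divides by n - 1).
    s = sqrt(variance(class_1) / len(class_1) + variance(class_2) / len(class_2))
    # Step c: difference of the averages relative to the spread.
    return abs(u1 - u2) / s

class_1 = [25, 16, 21, 18, 32]
class_2 = [5, 12, 9, 13, 19]
t = t_statistic(class_1, class_2)
print(f"u1 = {mean(class_1)}, u2 = {mean(class_2)}, t = {t:.3f}")
```

For these values the averages are 22.4 and 11.6 and t comes out at about 2.95. Because both classes have five samples, the pooled-variance form gives the same number.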
B. Principal Component Analysis (PCA)
Target: find the directions that represent the data most efficiently, i.e. the directions along which the data varies the most.
Example: input (x1, x2), output (u1, u2)
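To make the (x1, x2) to (u1, u2) mapping concrete, here is a minimal two-dimensional PCA sketch using only the standard library. It relies on the closed-form eigen-decomposition of a 2x2 covariance matrix; the function name `pca_2d` and the sample points are assumptions for illustration, and a real project would more likely use a linear algebra library.

```python
from math import hypot, sqrt
from statistics import mean

def pca_2d(points):
    """Return (u1, u2, variances, projected) for a list of (x1, x2) pairs."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_gap = sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1 = (a + c) / 2 + half_gap
    lam2 = (a + c) / 2 - half_gap
    # Eigenvector for the larger eigenvalue (first principal direction).
    if abs(b) > 1e-12:
        vx, vy = b, lam1 - a
    elif a >= c:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = hypot(vx, vy)
    u1 = (vx / norm, vy / norm)
    u2 = (-u1[1], u1[0])  # perpendicular to u1
    # Coordinates of each centred point along u1 and u2.
    projected = [((x - mx) * u1[0] + (y - my) * u1[1],
                  (x - mx) * u2[0] + (y - my) * u2[1]) for x, y in points]
    return u1, u2, (lam1, lam2), projected

# Hypothetical points lying roughly along the diagonal.
points = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8)]
u1, u2, variances, projected = pca_2d(points)
print("u1 =", u1, "u2 =", u2, "variances =", variances)
```

Here u1 is the direction of largest variance and u2 is perpendicular to it; keeping only the coordinate along u1 reduces the data to one dimension while losing the least variance.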
C. Linear Discriminant Analysis (LDA)
Target: find a linear subspace that maximizes the separability between classes for the feature vectors.
Example: input (x1, x2, class), output (u1, u2)
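A minimal sketch of the two-class case (Fisher's linear discriminant) is shown below, again using only the standard library. With two classes this yields a single discriminant direction w, proportional to Sw^-1 (m_b - m_a), rather than two output axes. The function names and the sample points are assumptions for illustration, not data from the notes.

```python
from math import hypot
from statistics import mean

def scatter(points, m):
    """Within-class scatter entries (sxx, sxy, syy) around the class mean m."""
    sxx = sum((x - m[0]) ** 2 for x, _ in points)
    sxy = sum((x - m[0]) * (y - m[1]) for x, y in points)
    syy = sum((y - m[1]) ** 2 for _, y in points)
    return sxx, sxy, syy

def fisher_direction(class_a, class_b):
    """Unit vector w proportional to Sw^-1 (m_b - m_a) for 2-D points."""
    ma = (mean([x for x, _ in class_a]), mean([y for _, y in class_a]))
    mb = (mean([x for x, _ in class_b]), mean([y for _, y in class_b]))
    # Total within-class scatter matrix Sw = Sa + Sb.
    sxx, sxy, syy = (p + q for p, q in zip(scatter(class_a, ma), scatter(class_b, mb)))
    det = sxx * syy - sxy ** 2
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # Apply the closed-form inverse of the 2x2 matrix Sw to the mean difference.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = hypot(wx, wy)
    return wx / norm, wy / norm

# Hypothetical labelled points (x1, x2) for two classes.
class_a = [(4, 2), (2, 4), (2, 3), (3, 6), (4, 4)]
class_b = [(9, 10), (6, 8), (9, 5), (8, 7), (10, 8)]
w = fisher_direction(class_a, class_b)
print("w =", w)
print("class a:", [round(x * w[0] + y * w[1], 2) for x, y in class_a])
print("class b:", [round(x * w[0] + y * w[1], 2) for x, y in class_b])
```

Projecting each point onto w gives one number per sample; for this sample data the two classes end up in non-overlapping ranges along w.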
2. Classification
A. Prototype based methods: K-nearest Neighbour (KNN)
B. Boundary based methods: Multilayer Perceptron (MLP)
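As an illustration of the prototype based approach, here is a minimal KNN sketch: a query point gets the majority label among its k nearest training points. The function name `knn_predict`, the Euclidean distance, the choice k = 3, and the training data are all assumptions for this example.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to query."""
    # Sort training items by Euclidean distance to the query and keep k of them.
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical training set: ((x1, x2), label).
train = [
    ((1, 1), "class 1"), ((2, 1), "class 1"), ((1, 2), "class 1"),
    ((6, 6), "class 2"), ((7, 6), "class 2"), ((6, 7), "class 2"),
]
print(knn_predict(train, (2, 2)))
print(knn_predict(train, (6, 5)))
```

An odd k avoids tied votes in the two-class case, and features on very different scales should be normalized first so that one feature does not dominate the distance.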