
Ch#4

1. What is data reduction?

Answer:
Data reduction is the process of reducing the volume of data while producing the same or similar analytical results.
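As a concrete illustration (not from the chapter itself), dimensionality reduction with PCA shows the idea: the dataset shrinks from 4 columns to 2 while most of its variance, and hence most of its analytical value, is retained. This sketch uses scikit-learn and the Iris dataset purely as an example.

```python
# Illustrative sketch of data reduction via PCA (example only; the chapter
# itself discusses WEKA-based feature selection, not PCA).
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                 # 150 samples x 4 features
pca = PCA(n_components=2)            # reduce to 2 dimensions
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                          # (150, 2)
print(pca.explained_variance_ratio_.sum())      # fraction of variance retained
```

Even after dropping half the columns, the two principal components keep well over 90% of the original variance, so downstream analysis gives similar results on far less data.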


2. Why do we need data reduction?

Answer:

  • Improves model performance (speed & accuracy)

  • Helps in data visualization

  • Reduces dimensionality

  • Removes noise

  • Leads to simpler, faster, and more accurate models


3. What is feature selection (aka attribute/variable selection)?

Answer:
It’s the process of selecting an optimal subset of features from the data that contribute most to the model, based on a specific evaluation criterion.


5. What are the main techniques for feature selection (data reduction)?

  Method          | Description                                                      | Tools/Details
  Wrapper Method  | Uses a classifier to evaluate feature subsets by performance     | Generates candidate subsets; uses a search technique to find the best one
  Filter Method   | Ranks features with an attribute evaluator and keeps the top ones | Does not rely on a classifier; WEKA example: InfoGainAttributeEval + Ranker
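A minimal sketch of the filter method, using scikit-learn as a stand-in for WEKA's InfoGainAttributeEval + Ranker: features are scored with mutual information (an information-gain-style measure) and the top k are kept, with no classifier involved. The dataset and k value are illustrative.

```python
# Filter method sketch: rank features by mutual information and keep the top k.
# Mirrors WEKA's InfoGainAttributeEval + Ranker, here via scikit-learn.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=mutual_info_classif, k=2)
X_top = selector.fit_transform(X, y)

print(selector.scores_)   # one relevance score per feature
print(X_top.shape)        # (150, 2): only the two top-ranked features remain
```

Because the scores come from a statistical measure rather than a trained model, this runs fast, but it judges each feature independently.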

6. Difference between Wrapper and Filter methods?

  Feature       | Wrapper Method                     | Filter Method
  Based on      | Classifier performance             | Statistical evaluation
  Speed         | Slower (computationally expensive) | Faster
  Accuracy      | Generally more accurate            | May not consider interaction between features
  Tool example  | Classifier + subset evaluator      | InfoGain + Ranker in WEKA
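The wrapper side of the comparison can be sketched in the same way. Here a classifier (KNN, chosen only for illustration) scores candidate feature subsets via cross-validation, and a greedy forward search picks the best-performing subset; this is what makes wrappers slower but often more accurate than filters.

```python
# Wrapper method sketch: a classifier evaluates candidate feature subsets,
# and a greedy forward search selects the subset that performs best.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3)
sfs = SequentialFeatureSelector(knn, n_features_to_select=2,
                                direction="forward")
sfs.fit(X, y)                     # trains the classifier many times internally

print(sfs.get_support())          # boolean mask of the selected features
print(sfs.transform(X).shape)     # (150, 2)
```

Note the cost difference: the filter example scores each feature once, while this wrapper retrains the classifier for every candidate subset at every search step.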
