A quick summary of ISLR Ch1-4
Below are the key points I summarized from Chapters 1-4 in An Introduction to Statistical Learning with Applications in R (ISLR).
Two Main Types of Learning Tasks
Supervised learning: predict y (response) using x (predictors)
Unsupervised learning: only x (predictors) are available, so we try to learn relationships among them
Every model faces a bias-variance trade-off. A very flexible model tends to have low bias but high variance. Flexible means the model can fit the training data very closely, which gives low bias and low training error. However, such a close fit often indicates overfitting: the fitted model changes a lot from one training set to another (high variance), so it does poorly on new data (high test error). Ideally, we want a model with both low bias and low variance.
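The trade-off can be written as a decomposition of the expected test error at a new point x_0, which is how Chapter 2 of ISLR states it. The notation here is an assumption of this note, not something defined earlier in the summary: \hat{f} is the fitted model, y_0 is the true response at x_0, and \varepsilon is the irreducible noise.

```latex
E\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \operatorname{Var}\left(\hat{f}(x_0)\right)
  + \left[\operatorname{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \operatorname{Var}(\varepsilon)
```

More flexibility usually shrinks the bias term and grows the variance term. The last term cannot be reduced by any model.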
Parametric Approach
Logistic Regression: It models the probability of a class as a logistic function of the predictors. Coefficients are estimated using maximum likelihood. It's most commonly used for two-class classification.
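To make "estimated using maximum likelihood" concrete, here is a minimal sketch with one predictor. It climbs the log-likelihood by plain gradient ascent. That choice is mine for readability and is not the book's method (R's glm uses a different fitting routine). The function names, the toy data, and the step size are all made up for illustration.

```python
import math

def predict_prob(b0, b1, x):
    """P(y = 1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def fit_logistic(xs, ys, lr=0.5, steps=20000):
    """Estimate (b0, b1) by gradient ascent on the average log-likelihood.

    lr and steps are illustrative values chosen for this toy data set.
    """
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - predict_prob(b0, b1, x)
            g0 += residual        # d(log-likelihood)/d(b0)
            g1 += residual * x    # d(log-likelihood)/d(b1)
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Toy two-class data (hypothetical). The classes overlap around x = 2 to 2.5,
# so the maximum likelihood estimate is finite.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print("P(y=1 | x=0.5):", predict_prob(b0, b1, 0.5))
print("P(y=1 | x=4.0):", predict_prob(b0, b1, 4.0))
```

A point is then classified as class 1 when its predicted probability exceeds a threshold, typically 0.5.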
Linear Discriminant Analysis: Observations within each class are assumed to come from a normal distribution with a class-specific mean and a common variance (a common covariance matrix when there is more than one predictor). The estimates for these parameters are plugged into the Bayes classifier. It is often preferred over logistic regression when there are more than two classes.
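Here is a rough sketch of LDA with a single predictor and three classes. It estimates the class priors, the class means, and one pooled variance. It then assigns a point to the class with the largest linear discriminant score, delta_k(x) = x * mu_k / sigma^2 - mu_k^2 / (2 * sigma^2) + log(pi_k). The function names and the data are invented for this example.

```python
import math

def fit_lda(xs, ys):
    """Estimate priors, class means, and the pooled (common) variance."""
    classes = sorted(set(ys))
    n = len(xs)
    priors, means = {}, {}
    pooled_ss = 0.0
    for k in classes:
        xk = [x for x, y in zip(xs, ys) if y == k]
        priors[k] = len(xk) / n
        means[k] = sum(xk) / len(xk)
        pooled_ss += sum((x - means[k]) ** 2 for x in xk)
    var = pooled_ss / (n - len(classes))  # one variance shared by all classes
    return priors, means, var

def predict_lda(model, x):
    """Pick the class with the largest linear discriminant score."""
    priors, means, var = model

    def delta(k):
        return x * means[k] / var - means[k] ** 2 / (2 * var) + math.log(priors[k])

    return max(priors, key=delta)

# Hypothetical three-class data with one predictor.
xs = [1.0, 1.5, 2.0, 2.5, 5.0, 5.5, 6.0, 6.5, 9.0, 9.5, 10.0, 10.5]
ys = ["a"] * 4 + ["b"] * 4 + ["c"] * 4

model = fit_lda(xs, ys)
print(predict_lda(model, 3.0), predict_lda(model, 4.5), predict_lda(model, 9.0))
```

Because the score is linear in x, the boundary between any two classes is a single cut point. With equal priors it sits halfway between the two class means.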
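Quadratic Discriminant Analysis: Like LDA, it assumes normally distributed classes, but each class has its own variance (its own covariance matrix with several predictors) instead of a common one, which makes the decision boundary quadratic. The model is more suitable when the training set is large or the assumption of a common covariance matrix can't be justified.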
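A small sketch of what class-specific variances buy you, again with one predictor. Each class gets its own mean and variance, and the score becomes delta_k(x) = -(x - mu_k)^2 / (2 * sigma_k^2) - log(sigma_k) + log(pi_k), which is quadratic in x. The names and the data are hypothetical. I picked two classes with the same mean but very different spread, a case a single linear cut point can't separate.

```python
import math

def fit_qda(xs, ys):
    """Estimate a prior, mean, and variance separately for each class."""
    n = len(xs)
    model = {}
    for k in sorted(set(ys)):
        xk = [x for x, y in zip(xs, ys) if y == k]
        mean = sum(xk) / len(xk)
        var = sum((x - mean) ** 2 for x in xk) / (len(xk) - 1)
        model[k] = (len(xk) / n, mean, var)
    return model

def predict_qda(model, x):
    """Pick the class with the largest quadratic discriminant score."""
    def delta(k):
        prior, mean, var = model[k]
        # 0.5 * log(var) is log(sigma_k)
        return -(x - mean) ** 2 / (2 * var) - 0.5 * math.log(var) + math.log(prior)

    return max(model, key=delta)

# Hypothetical data: both classes are centered at 0, one tight and one spread out.
xs = [-0.5, -0.2, 0.0, 0.2, 0.5, -6.0, -3.0, 0.0, 3.0, 6.0]
ys = ["narrow"] * 5 + ["wide"] * 5

model = fit_qda(xs, ys)
print(predict_qda(model, 0.0), predict_qda(model, 4.0), predict_qda(model, -4.0))
```

Points near 0 go to the tight class and points far away on either side go to the spread-out class, so there are two cut points rather than one.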
Non-Parametric Approach
K-Nearest Neighbors: For a new data point, it finds the K training observations closest to it and assigns the point to the class that is most common among those K neighbors. No assumptions are made about the shape of the decision boundary.
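The whole KNN classifier fits in a few lines, so here is a minimal sketch. It uses Euclidean distance and a simple majority vote. The function name, the default k, and the toy points are my own assumptions. It relies on math.dist, which needs Python 3.8 or newer.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, point, k=3):
    """Classify `point` by majority vote among its k nearest training points."""
    # Sort training observations by Euclidean distance to the new point.
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], point),
    )
    # Count class labels among the k closest observations.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data with two predictors and two classes.
train_x = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_y = ["blue", "blue", "blue", "orange", "orange", "orange"]

print(knn_predict(train_x, train_y, (2.0, 2.0)))
print(knn_predict(train_x, train_y, (6.0, 6.5)))
```

There is no fitting step at all: the training data is the model. A small K gives a very flexible boundary (low bias, high variance), and a large K gives a smoother one.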