Notes based on lecture slides from INFSCI 2595, taught by Dr. Mai Abdelhakim.

## Introduction

### What is Machine Learning?

• Subfield of artificial intelligence
• Field of study that gives computers the ability to learn without being explicitly programmed

### How can we build computer systems that learn and improve with experience?

• Statistics lets us draw conclusions from data and estimate the reliability of those conclusions
• Optimization and computing power let us solve the resulting problems

A machine learns with respect to a particular **task T**, **performance metric P**, and **experience E** if its performance P on task T improves with experience E (e.g., a spam filter whose filtering accuracy P improves as it gains experience E from labeled emails).

### Why Machine Learning is Important

• Provides solutions to complex problems that cannot be easily programmed by hand
• Can adapt to new data
• Helps us understand complicated phenomena
• Can outperform humans on some tasks

### Machine Learning Algorithms

#### Supervised Learning

1. Learn using labeled data (the correct answers are given in the learning phase)
2. Make predictions on previously unseen data
3. Two types of problems:
• Regression: target values (Y) are continuous/quantitative
• Classification: target values (Y) are discrete/finite/qualitative

#### Unsupervised Learning

1. Learn from unlabeled data (no correct answers are given)
2. Clustering analysis, e.g., finding groups of similar users
3. Detecting abnormal patterns
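The clustering idea can be sketched with a minimal one-dimensional k-means on made-up, unlabeled data (the dataset and variable names below are illustrative, not from the slides):

```python
# Unlabeled 1-D observations (e.g., hours of activity per user; values made up).
data = [0.9, 1.1, 1.0, 7.8, 8.2, 8.0]

# Minimal k-means with k = 2: alternate between assigning each point to its
# nearest centroid and recomputing each centroid as its group's mean.
c1, c2 = data[0], data[-1]  # initialize centroids at two observations
for _ in range(10):
    g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
    g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
    c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)

# g1 and g2 are the discovered groups; no labels were used anywhere.
```

On this toy data the two groups of similar users emerge after a single pass, purely from the feature values.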

## Machine Learning Models and Trade-offs

### Why do we need a model? Why estimate f?

• Predictions: Make predictions for new inputs/features
• Inference: understand the way Y is affected by each feature
• Which features have the strongest impact on the response?
• Is the relationship positive or negative?
• Is the relationship linear or more complicated?

### How to estimate f?

1. Parametric approach
• First, assume a functional form for f (e.g., linear)
• Second, use the training data to fit the model
2. Non-parametric approach
• No explicit form of the function f is assumed
• Seek an estimate of f that gets as close as possible to the data points
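The contrast can be sketched on toy data (the dataset and helper names are illustrative): the parametric model commits to a linear form with only two parameters, while the non-parametric k-NN estimate makes no such assumption and simply averages nearby training responses.

```python
# Toy training data: (x, y) pairs (values made up for illustration).
train = [(1.0, 1.2), (2.0, 3.9), (3.0, 9.1), (4.0, 15.8)]

# Parametric: assume f(x) = a*x + b, then fit a and b by least squares.
n = len(train)
mx = sum(x for x, _ in train) / n
my = sum(y for _, y in train) / n
a = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
b = my - a * mx

def parametric_pred(x0):
    return a * x0 + b

# Non-parametric: no form assumed; predict the average response of the
# k nearest training points (k-NN regression).
def knn_pred(x0, k=2):
    nearest = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k
```

Note the trade-off: the linear model is easy to interpret (slope `a`, intercept `b`) but may misfit a curved relationship, while the k-NN estimate tracks the data more closely but offers no compact summary.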

(Figure: trade-off between model flexibility and interpretability)

### Model Accuracy

1. In the regression setting, a common measure of accuracy is the mean squared error (MSE): $MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - \hat{f}(x_{i})\right)^{2}$
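For example, the MSE can be computed by hand on made-up observed and predicted values:

```python
# MSE sketch: average squared gap between observed targets and predictions
# (values made up for illustration).
y_true = [3.0, 5.0, 7.0]
y_pred = [2.5, 5.5, 6.0]

mse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true)
# (0.25 + 0.25 + 1.0) / 3 = 0.5
```

Evaluating this on held-out test data (test MSE), rather than the training data, is what reveals overfitting.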

#### Overfitting and Underfitting

Two things we need to avoid:

• Overfitting: building a model that is too complex; it fits the training data very well but fails to generalize to new data (e.g., large test MSE)
• Underfitting: building a model that is too simple to capture the variability in the data

In short: simple models may not capture the variability in the data, while complex models may not generalize.

• Variance: the amount by which $\hat{f}$ would change if we estimated it using a different training set
• Bias: the error introduced by approximating a real-life problem with a simpler model
1. Classification setting
• $\hat{y}_{0} = \hat{f}(x_{0})$ is the predicted output class
• Test error rate: $\mathrm{Ave}\left(I(y_{0} \neq \hat{y}_{0})\right)$, the fraction of test observations that are misclassified
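Computed on made-up test labels and predictions, the test error rate is just the misclassified fraction:

```python
# Test error rate sketch: fraction of test observations that are misclassified
# (labels made up for illustration).
y_true = ["a", "b", "a", "a"]
y_pred = ["a", "b", "b", "a"]

error_rate = sum(yt != yp for yt, yp in zip(y_true, y_pred)) / len(y_true)
# 1 mistake out of 4 observations = 0.25
```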

#### Bayes classifier

• The Bayes classifier assigns each observation to the most likely class given the feature values.
• Assign $x_{0}$ to the class $j$ that has the largest $\Pr(Y = j \mid X = x_{0})$
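A sketch of the rule, assuming the true conditional probabilities at $x_{0}$ were somehow known (the numbers below are invented for illustration):

```python
# Hypothetical true conditional probabilities Pr(Y = j | X = x0) at one point x0.
cond_prob = {1: 0.2, 2: 0.7, 3: 0.1}

# The Bayes classifier assigns x0 to the class with the largest probability.
bayes_class = max(cond_prob, key=cond_prob.get)
```

In practice these probabilities are unknown, which is why methods such as K-nearest neighbors try to estimate them from data.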

#### K-Nearest Neighbors

• Choose a positive integer K
• For each test observation $x_{0}$, identify the K points in the training data that are closest to $x_{0}$, referred to as $N_{0}$
• Estimate the conditional probability for class $j$ as the fraction of points in $N_{0}$ whose response values equal $j$, and assign $x_{0}$ to the class with the largest estimated probability
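The steps above can be sketched in a few lines on a toy one-dimensional training set (data and names illustrative):

```python
from collections import Counter

# Toy labeled training data: (feature value, class label), values made up.
train = [(1.0, "red"), (1.5, "red"), (2.9, "blue"), (3.5, "blue"), (2.4, "blue")]

def knn_classify(x0, k=3):
    # Steps 1-2: identify the K training points closest to x0 (the set N0).
    n0 = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    # Step 3: estimate Pr(Y = j | X = x0) as the fraction of N0 in class j,
    # then assign the class with the largest estimated probability.
    counts = Counter(label for _, label in n0)
    return counts.most_common(1)[0][0]
```

Note that K controls flexibility: small K gives a highly flexible (low-bias, high-variance) boundary, while large K gives a smoother one.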

