**Reference from some lecture slides of INFSCI 2595 lectured by Dr. Mai Abdelhakim**

## Introduction

### What is Machine Learning?

- Subfield of artificial intelligence
- Field of study that gives computers the ability to learn without being explicitly programmed

### How can we build computer systems that learn and improve with experience?

- Statistics: draw conclusions from data, and estimate the reliability of those conclusions
- Optimization and computing power: solve the resulting problems at scale

A machine learns with respect to a particular **task T**, **performance metric P**, and **experience E** if its performance P on task T improves with **experience E**.

### Why Machine Learning is Important

- Provides solutions to complex problems that cannot be easily programmed
- Can adapt to new data
- Helps us understand complicated phenomena
- Can exceed human performance

### Machine Learning Algorithms

#### Supervised Learning

- Learn using labeled data (correct answers are given in learning phase)
- Make predictions on previously unseen data
- Two types of problems
  - **Regression**: target values (Y) are continuous/quantitative
  - **Classification**: target values (Y) are discrete/finite/qualitative

#### Unsupervised Learning

- Clustering analysis
- Finding groups of similar users
- Detecting abnormal patterns

## Machine Learning Models and Trade-offs

### Why do we need a model? Why estimate f?

- **Prediction**: make predictions for new inputs/features
- **Inference**: understand how Y is affected by each feature
  - Which feature has a stronger impact on the response?
  - Is the relation positive or negative?
  - Is the relationship linear or more complicated?

### How to estimate f?

- Parametric Approach
  - First, assume a functional form for f
  - Second, use training data to fit the model
- Non-Parametric Approach
  - No explicit form of the function f is assumed
  - Seek to estimate f as close as possible to the data points
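The parametric approach can be sketched as follows: assume f is linear, f(x) = b0 + b1·x, then fit the two parameters to training data by ordinary least squares. This is a minimal illustration in pure Python; the function and data below are assumptions for the example, not from the slides.

```python
def fit_linear(xs, ys):
    """Closed-form least-squares fit for simple linear regression y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of (x, y) divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x  # intercept passes through the means
    return b0, b1

# Toy training data generated by y = 1 + 2x (no noise), so the fit recovers it.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_linear(xs, ys)
print(b0, b1)  # → 1.0 2.0
```

Once the form is assumed, "learning" reduces to estimating a small, fixed number of parameters; a non-parametric method would instead let the shape of f be driven by the data points themselves.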

### Trade-off: Model Flexibility vs Model Interpretability

- More flexible models can fit the data more closely, but are generally harder to interpret

### Model Accuracy

- In the regression setting, a common measure is the mean squared error (MSE): $\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{f}(x_i))^2$
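The MSE above is just the average squared gap between observed targets and predictions; a minimal sketch (names and data are illustrative):

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared prediction errors."""
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n

# Errors are 1, 0, and -2, so MSE = (1 + 0 + 4) / 3 = 5/3.
print(mse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))  # → 1.666...
```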

#### Overfitting and Underfitting

Two things we need to avoid:

- **Overfitting**: building a model that is too complex; it fits the training data very well but fails to generalize to new data (e.g., large test MSE)
- **Underfitting**: building a model that is too simple to capture the variability in the data

- Simple models may not capture the variability in the data
- Complex models may not generalize

#### Bias-Variance Tradeoff

- Variance: the amount by which $\hat{f}$ would change if we estimated it using a different training set
- Bias: error from approximating a real-life problem by a simpler model
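The trade-off can be made precise with the standard decomposition of the expected test MSE at a point $x_0$, assuming the model $y_0 = f(x_0) + \varepsilon$ with $E[\varepsilon] = 0$:

$$
E\left[(y_0 - \hat{f}(x_0))^2\right] = \mathrm{Var}(\hat{f}(x_0)) + \left[\mathrm{Bias}(\hat{f}(x_0))\right]^2 + \mathrm{Var}(\varepsilon)
$$

Increasing flexibility typically lowers bias but raises variance; $\mathrm{Var}(\varepsilon)$ is the irreducible error that no model can remove.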

- Classification Setting
  - $\hat{y}_0 = \hat{f}(x_0)$ is the predicted output class
  - Test error rate: $\text{Ave}\left(I(y_0 \neq \hat{y}_0)\right)$, the fraction of test observations that are misclassified

#### Bayes classifier

- The Bayes classifier assigns each observation to the most likely class given the feature values.
- Assign $x_{0}$ to the class $j$ that has the largest $Pr(Y = j|X = x_{0})$

#### K-Nearest Neighbors

- Define a positive integer K
- For each test observation $x_{0}$, identify the K points in the training data that are closest to $x_{0}$, referred to as $N_{0}$
- Estimate the conditional probability for class $j$ as the fraction of points in $N_{0}$ whose response values equal $j$
- Classify $x_{0}$ to the class with the largest estimated probability
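The steps above can be sketched in pure Python for 1-D features. `train`, `knn_classify`, and the toy data are assumptions for this illustration, not from the slides; taking the majority label among the K neighbors is equivalent to choosing the class with the largest estimated conditional probability.

```python
from collections import Counter

def knn_classify(train, x0, k):
    """Classify x0 by majority vote among its k nearest training points."""
    # Sort training pairs (x, label) by distance to x0; the k closest form N_0.
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x0))[:k]
    # Fraction of N_0 with label j estimates Pr(Y = j | X = x0);
    # the most common label is the class with the largest fraction.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy training set: two clusters of 1-D points with labels A and B.
train = [(1.0, "A"), (1.2, "A"), (3.0, "B"), (3.1, "B"), (2.9, "B")]
print(knn_classify(train, 1.1, 3))  # → A (two of the three nearest are A)
print(knn_classify(train, 3.0, 3))  # → B (all three nearest are B)
```

Small K gives a flexible, low-bias but high-variance decision boundary; large K smooths the boundary at the cost of more bias, mirroring the trade-off discussed above.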