Notes based on lecture slides from INFSCI 2595, taught by Dr. Mai Abdelhakim.

## Introduction

### What is Machine Learning?

• Subfield of artificial intelligence
• Field of study that gives computers the ability to learn without being explicitly programmed

### How can we build computer systems that learn and improve with experience?

• Statistics lets us draw conclusions from data and estimate the reliability of those conclusions
• Optimization and computing power let us solve the resulting problems

A machine learns with respect to a particular **task T**, **performance metric P**, and **experience E** if its performance P on task T improves with experience E (e.g., a spam filter whose filtering accuracy P improves as it gains experience E from labeled emails).

### Why Machine Learning is Important

• Provides solutions to complex problems that cannot be easily programmed by hand
• Can adapt to new data
• Helps us understand complicated phenomena
• Can outperform humans on some tasks

### Machine Learning Algorithms

#### Supervised Learning

1. Learn using labeled data (the correct answers are given in the learning phase)
2. Make predictions on previously unseen data
3. Two types of problems:
• Regression: target values (Y) are continuous/quantitative
• Classification: target values (Y) are discrete/finite/qualitative

#### Unsupervised Learning

1. Learn from unlabeled data (no correct answers are given)
2. Clustering analysis, e.g., finding groups of similar users
3. Detecting abnormal patterns
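The clustering idea can be sketched with a minimal one-dimensional k-means on made-up, unlabeled data (the dataset and variable names below are illustrative, not from the slides):

```python
# Unlabeled 1-D observations (e.g., hours of activity per user; values made up).
data = [0.9, 1.1, 1.0, 7.8, 8.2, 8.0]

# Minimal k-means with k = 2: alternate between assigning each point to its
# nearest centroid and recomputing each centroid as its group's mean.
c1, c2 = data[0], data[-1]  # initialize centroids at two observations
for _ in range(10):
    g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
    g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
    c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)

# g1 and g2 are the discovered groups; no labels were used anywhere.
```

On this toy data the two groups of similar users emerge after a single pass, purely from the feature values.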

## Machine Learning Models and Trade-offs

### Why do we need a model? Why estimate f?

• Predictions: Make predictions for new inputs/features
• Inference: understand the way Y is affected by each feature
• Which features have the strongest impact on the response?
• Is the relationship positive or negative?
• Is the relationship linear or more complicated?

### How to estimate f?

1. Parametric approach
• First, assume a functional form for f (e.g., linear)
• Second, use the training data to fit the model
2. Non-parametric approach
• No explicit form of the function f is assumed
• Seek an estimate of f that gets as close as possible to the data points
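The contrast can be sketched on toy data (the dataset and helper names are illustrative): the parametric model commits to a linear form with only two parameters, while the non-parametric k-NN estimate makes no such assumption and simply averages nearby training responses.

```python
# Toy training data: (x, y) pairs (values made up for illustration).
train = [(1.0, 1.2), (2.0, 3.9), (3.0, 9.1), (4.0, 15.8)]

# Parametric: assume f(x) = a*x + b, then fit a and b by least squares.
n = len(train)
mx = sum(x for x, _ in train) / n
my = sum(y for _, y in train) / n
a = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
b = my - a * mx

def parametric_pred(x0):
    return a * x0 + b

# Non-parametric: no form assumed; predict the average response of the
# k nearest training points (k-NN regression).
def knn_pred(x0, k=2):
    nearest = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k
```

Note the trade-off: the linear model is easy to interpret (slope `a`, intercept `b`) but may misfit a curved relationship, while the k-NN estimate tracks the data more closely but offers no compact summary.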

(Figure: trade-off between model flexibility and interpretability)

### Model Accuracy

1. In the regression setting, a common measure of accuracy is the mean squared error (MSE): $MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - \hat{f}(x_{i})\right)^{2}$
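For example, the MSE can be computed by hand on made-up observed and predicted values:

```python
# MSE sketch: average squared gap between observed targets and predictions
# (values made up for illustration).
y_true = [3.0, 5.0, 7.0]
y_pred = [2.5, 5.5, 6.0]

mse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true)
# (0.25 + 0.25 + 1.0) / 3 = 0.5
```

Evaluating this on held-out test data (test MSE), rather than the training data, is what reveals overfitting.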

#### Overfitting and Underfitting

Two things we need to avoid:

• Overfitting: building a model that is too complex; it fits the training data very well but fails to generalize to new data (e.g., large test MSE)
• Underfitting: building a model that is too simple to capture the variability in the data

In short: simple models may not capture the variability in the data, while complex models may not generalize.

• Variance: the amount by which $\hat{f}$ would change if we estimated it using a different training set
• Bias: the error introduced by approximating a real-life problem with a simpler model
1. Classification setting
• $\hat{y}_{0} = \hat{f}(x_{0})$ is the predicted output class
• Test error rate: $\mathrm{Ave}\left(I(y_{0} \neq \hat{y}_{0})\right)$, the fraction of test observations that are misclassified
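Computed on made-up test labels and predictions, the test error rate is just the misclassified fraction:

```python
# Test error rate sketch: fraction of test observations that are misclassified
# (labels made up for illustration).
y_true = ["a", "b", "a", "a"]
y_pred = ["a", "b", "b", "a"]

error_rate = sum(yt != yp for yt, yp in zip(y_true, y_pred)) / len(y_true)
# 1 mistake out of 4 observations = 0.25
```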

#### Bayes classifier

• The Bayes classifier assigns each observation to the most likely class given the feature values.
• Assign $x_{0}$ to the class $j$ that has the largest $\Pr(Y = j \mid X = x_{0})$
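A sketch of the rule, assuming the true conditional probabilities at $x_{0}$ were somehow known (the numbers below are invented for illustration):

```python
# Hypothetical true conditional probabilities Pr(Y = j | X = x0) at one point x0.
cond_prob = {1: 0.2, 2: 0.7, 3: 0.1}

# The Bayes classifier assigns x0 to the class with the largest probability.
bayes_class = max(cond_prob, key=cond_prob.get)
```

In practice these probabilities are unknown, which is why methods such as K-nearest neighbors try to estimate them from data.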

#### K-Nearest Neighbors

• Choose a positive integer K
• For each test observation $x_{0}$, identify the K points in the training data that are closest to $x_{0}$, referred to as $N_{0}$
• Estimate the conditional probability for class $j$ as the fraction of points in $N_{0}$ whose response values equal $j$, and assign $x_{0}$ to the class with the largest estimated probability
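The steps above can be sketched in a few lines on a toy one-dimensional training set (data and names illustrative):

```python
from collections import Counter

# Toy labeled training data: (feature value, class label), values made up.
train = [(1.0, "red"), (1.5, "red"), (2.9, "blue"), (3.5, "blue"), (2.4, "blue")]

def knn_classify(x0, k=3):
    # Steps 1-2: identify the K training points closest to x0 (the set N0).
    n0 = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    # Step 3: estimate Pr(Y = j | X = x0) as the fraction of N0 in class j,
    # then assign the class with the largest estimated probability.
    counts = Counter(label for _, label in n0)
    return counts.most_common(1)[0][0]
```

Note that K controls flexibility: small K gives a highly flexible (low-bias, high-variance) boundary, while large K gives a smoother one.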

