Introduction to Machine Learning and Data Mining

What is Machine Learning?

Algorithms, methods, and techniques for learning and prediction based upon data

Use existing data (offline task) or obtain new data as it arrives (online task)

Analyze the data, build a model, update an existing model

Use the knowledge/model to understand the data or make predictions

Supervised learning

Unsupervised learning

Reinforcement learning

and more

Problem domain	Class (output)
Handwritten character recognition	letters/ASCII
Face detection	Bounding box/features of person
Spam detection	True/False
Protein classification	Protein type/attributes
Astronomical phenomena	Orientation/Location/Star type
Prey in the wild	Predator classification

Classification: predict which discrete category an observation belongs to, based on the features of that observation.

Is an email spam? Return a boolean.

What character is this? Return a value from a list.
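The spam question above can be sketched as a tiny classifier. This is a minimal illustration, not a recommended spam filter: the two features (link count, exclamation-mark count), the training examples, and the function name `is_spam` are all made up for the example, and 1-nearest-neighbour is just one simple classification method among many.

```python
import math

# Hypothetical labelled data: (number of links, number of '!') -> is it spam?
TRAINING = [
    ((0, 0), False),
    ((1, 0), False),
    ((0, 1), False),
    ((5, 7), True),
    ((8, 3), True),
    ((6, 9), True),
]

def is_spam(features):
    """Classify by copying the label of the closest training example (1-NN)."""
    _, label = min(TRAINING, key=lambda example: math.dist(example[0], features))
    return label

print(is_spam((7, 6)))  # many links and '!' -> nearest neighbours are spam
print(is_spam((0, 2)))  # few links and '!'  -> nearest neighbours are not spam
```

The output is a boolean, i.e. one of a fixed set of discrete categories, which is what makes this a classification task.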

Regression: predict a continuous value, usually a number.

What is the value of a stock price? Return a positive floating-point number.

What equation best describes a collection of numbers?
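The "what equation represents these numbers" question can be sketched with the simplest regression model, a straight line fitted by ordinary least squares. The data and the function name `fit_line` are assumptions chosen for illustration; real data would be noisy and might need a different model.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Predict a continuous value for an unseen input.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```

Unlike the classification case, the prediction here is a number on a continuous scale rather than a label from a list.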

Problem domain	Class (output)
Tweets	Topic of message
All images on the internet	What is in the image?
Gene arrays	What genes are coexpressed?
Marketing surveys	What groups of consumers are there?
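To make the "what groups are there?" question concrete, here is a minimal k-means sketch. No labels are given; the algorithm only groups points that lie close together. The survey data (age, monthly spend), the starting centroids, and the function name `k_means` are assumptions for illustration, and k-means is only one of many clustering methods.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            # Mean of each coordinate; keep the old centroid if a cluster is empty.
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical survey responses: (age, monthly spend).
points = [(20, 30), (22, 35), (21, 32), (60, 200), (62, 210), (61, 205)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(clusters)
```

The number of groups (two here) is chosen by the analyst, not learned from the data, which is a known limitation of this method.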

Problem domain	Class (output)
Robot control	Determine a sequence of actions to carry out (drive a car, fly a quadcopter)
Game playing	Play backgammon, play as an NPC in games
Elevator control	Move people between floors as efficiently as possible
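Learning a sequence of actions from reward alone can be sketched with tabular Q-learning on a toy problem. Everything here is an assumption for illustration: a five-cell corridor with a reward at the right end, a purely random exploration policy, and the constants `ALPHA` and `GAMMA`. Real control tasks such as the ones in the table are far larger.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left, step right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)

def step(state, action):
    """Move within the corridor; reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))      # explore uniformly at random
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move towards reward + discounted best next value.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q = train()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [ACTIONS[row.index(max(row))] for row in q[:-1]]
print(policy)
```

No one tells the agent that "right" is correct; the learned policy steps right in every cell because that is the shortest route to the reward.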