As Wikipedia defines it: “Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions.”
In other words: you collect a bunch of face images and non-face images, choose an algorithm, and wait for the computation to finish. This is the spirit of machine learning. “Machine learning” emphasizes that the computer program (or machine) must do some work after it is given data.
Loosely speaking, ML algorithms most often work on a precise set of features extracted from your raw data. Features can be very simple, such as pixel values for images or temporal values for a signal, or more complex, such as a Bag of Words representation for text. Most known ML algorithms perform only as well as the features represent the data. Identifying features that closely represent all the states of your data is therefore a crucial step.
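For instance, here is a minimal sketch of Bag of Words feature extraction using scikit-learn’s CountVectorizer (the two example documents are made up purely for illustration):

```python
# Minimal Bag of Words sketch with scikit-learn.
# The example documents are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "machine learning learns from data",
    "deep learning learns features from data",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)  # sparse matrix: one row per document

print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(X.toarray())                         # word counts = the feature vectors
```

Each document is reduced to a fixed-length vector of word counts, which is exactly the kind of precise feature set an ML algorithm can consume.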
Importance of the feature extractor:
Building a correct feature extractor is a great deal of science in itself. Most of these feature extractors are very specific in function and utility. For example, for face detection one needs a feature extractor that correctly represents the parts of a face and is resistant to spatial aberrations, etc. Each type of data and task may have its own class of feature extraction (e.g., speech recognition, image recognition).
These feature extractors can then be used to extract the correct features for a given sample and pass this information to a classifier/predictor.
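As a hedged sketch of such a classic pipeline, the snippet below pairs a HOG feature extractor with a linear SVM, roughly the kind of handcrafted setup used for face detection before deep learning; the random arrays here stand in for real face/non-face images:

```python
# Hypothetical sketch: handcrafted HOG features feeding a linear SVM.
# Random arrays stand in for real face / non-face image data.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
images = [rng.random((64, 64)) for _ in range(20)]  # stand-in grayscale images
labels = rng.integers(0, 2, size=20)                # 1 = face, 0 = non-face

def extract_features(image):
    # HOG summarizes local gradient orientations, which gives some
    # resistance to small spatial aberrations.
    return hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

X = np.array([extract_features(img) for img in images])
clf = LinearSVC().fit(X, labels)   # the classifier/predictor
print(clf.predict(X[:1]))          # classify a sample from its features
```

Note that the feature extractor (HOG) is fixed by hand; only the classifier on top of it learns from data.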
How is Deep Learning different?
Deep Learning is a broader family of Machine Learning methods that tries to learn high-level features from the given data. Thus, the problem it solves is removing the need to build a new feature extractor for each and every type of data (speech, images, etc.).
In the last example, deep learning algorithms will try to learn features such as the difference between a human face, a dog, and the structure of a room when an image recognition task is presented to them. They may then use this information for classification, prediction, and other tasks. Thus, this is a major step away from earlier “shallow learning” algorithms.
The main difference is that regular machine learning involves a lot of handcrafted feature extraction, while deep learning does the feature extraction by itself.
So, deep learning is essentially a set of techniques that help you parameterize deep neural network structures: neural networks with many, many layers and parameters.
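As a rough sketch of what “deep” means structurally (using PyTorch here only as one illustrative framework; the layer sizes are arbitrary):

```python
# Minimal sketch of a deep (multi-layer) network in PyTorch.
# Layer sizes are arbitrary, chosen only for illustration.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # layer 1: low-level features
    nn.Linear(256, 128), nn.ReLU(),   # layer 2: combinations of them
    nn.Linear(128, 64),  nn.ReLU(),   # layer 3: higher-level features
    nn.Linear(64, 10),                # output layer: class scores
)

# Every Linear layer adds its own weight matrix and bias vector,
# which is where the "many, many parameters" come from.
print(sum(p.numel() for p in model.parameters()))
```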
It’s a growing trend in ML due to some favorable results in applications where the target function is very complex and the datasets are large. For example, in Hinton et al. (2012), Hinton and his students managed to beat the status-quo prediction systems on five well-known datasets: Reuters, TIMIT, MNIST, CIFAR and ImageNet. This covers speech, text and image classification, and these are quite mature datasets, so a win on any of them gets some attention; a win on all of them gets a lot of attention. Deep learning networks differ from “normal” neural networks and SVMs because they can be trained in an UNSUPERVISED or SUPERVISED manner, for both UNSUPERVISED and SUPERVISED learning tasks.
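One classic recipe for the unsupervised side is the autoencoder: train the network to reconstruct its own input with no labels, then reuse the learned encoder for a supervised task. A minimal sketch, with random tensors standing in for real unlabeled data:

```python
# Illustrative autoencoder sketch: unsupervised training of a deep net.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())
decoder = nn.Sequential(nn.Linear(64, 784))
autoencoder = nn.Sequential(encoder, decoder)

optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(32, 784)        # stand-in for a batch of unlabeled data
for step in range(100):        # unsupervised: the target is the input itself
    reconstruction = autoencoder(x)
    loss = loss_fn(reconstruction, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained `encoder` can then be reused (fine-tuned) for a
# supervised task such as classification.
```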
Prof. Andrew Ng remarks that deep learning refocuses on the original aim of “one learning algorithm”, the ideal algorithm envisioned for an AI.