Transfer Learning with KNN

In this module, we are going to discover Transfer Learning, a technique whose goal is to reuse the "expertise" of an existing model for another application. We are going to use MobileNet together with an algorithm called KNN to create a custom image classifier.

Who are my neighbors?

KNN exists for both classification and regression problems; in this module, we are going to use it to build a custom classification model. Let's explain how it works with a theoretical case: say we want to build a model able to guess which breed a dog belongs to, given its height and weight.

Let's start by gathering and labeling data about a selection of small dogs.

The result of our survey
If we put our data into a graph, we can clearly see that dogs of the same breed form "clusters" in a 2D space.

Now, let's say I want to classify a dog of an unknown breed.

A new dog
Mr. Youpee is 27 cm tall and weighs 12.6 kg

A KNN algorithm looks at the Euclidean distance between our new example and our previous data points to determine which ones are its closest neighbors. Then, by looking at the category those neighbors belong to, it can predict which breed our new dog is. When doing that, the KNN only takes into account the k closest ones, where k is a number determined beforehand, hence the name k-Nearest Neighbors.

A new dog
Kind of like finding who won a game of pétanque

In our case, if we choose k = 2, it seems that Mr. Youpee is a French Bulldog (if we had picked a higher k value, the classifier would have also found a little bit of Dachshund).
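To make this concrete, here is a toy KNN in plain JavaScript. Note that the heights, weights, and breeds below are invented for this illustration; they are not the actual survey data from the figure:

    // A toy k-Nearest Neighbors classifier in plain JavaScript.
    // The dataset below is invented for illustration purposes.
    const dogs = [
      { height: 24, weight: 11.0, breed: 'French Bulldog' },
      { height: 28, weight: 13.0, breed: 'French Bulldog' },
      { height: 21, weight: 9.0, breed: 'Dachshund' },
      { height: 23, weight: 10.0, breed: 'Dachshund' },
      { height: 25, weight: 7.0, breed: 'Jack Russell' },
    ];

    function classify(height, weight, k) {
      const neighbors = dogs
        // 1. Compute the Euclidean distance to every known dog
        .map((dog) => ({
          breed: dog.breed,
          dist: Math.hypot(dog.height - height, dog.weight - weight),
        }))
        // 2. Keep only the k closest neighbors
        .sort((a, b) => a.dist - b.dist)
        .slice(0, k);
      // 3. Majority vote among those k neighbors
      const votes = {};
      for (const n of neighbors) votes[n.breed] = (votes[n.breed] || 0) + 1;
      return Object.keys(votes).sort((a, b) => votes[b] - votes[a])[0];
    }

    console.log(classify(27, 12.6, 2)); // "French Bulldog"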

In this example, we used a 2D graphical representation to give you a better understanding, but in reality, a KNN may be looking for neighbors in a 100-dimensional space, something that our minds cannot even apprehend!

If you want to see a KNN algorithm in action, you can try this interactive demo.

KNN is a really powerful (yet simple) tool that you can use to do machine learning without even using a neural network! Now, how can we use this new tool with MobileNet?

Transfer Learning

Our previous example with dogs was only using two input features, height and weight. When doing image classification, we quickly have to deal with millions of inputs.

An RGB image of 1000x1000 pixels has 3,000,000 dimensions (1000 × 1000 pixels × 3 color channels)

If we wanted to use a KNN for image classification, it would be almost impossible to gather enough data to represent every possible image. That's when Transfer Learning is going to save us!

As you may already know, MobileNet is a Convolutional Neural Network (CNN). CNNs can quickly spot features in a complex input (sound, image, or video, for example).

This CNN can identify handwritten digits in a 32x32-pixel image (that's 1024 inputs)
See how each layer, from bottom to top, reduces the number of values describing the input.

What if we could use this ability to detect key features in an image to build a custom classifier? That's exactly what we are going to do: thanks to MobileNet, we are going to reduce any input image to 1000 inputs (which is still a lot less than 3 million!).

Have a look at this interactive documentation to learn more about how we are going to reuse MobileNet using the .infer() function.
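As a quick preview, here is roughly what that looks like with the @tensorflow-models/mobilenet package. This is only a minimal sketch: it assumes the MobileNet <script> tag is already loaded, and 'dogImage' is a hypothetical <img> element in your page:

    // Sketch of MobileNet used as a feature extractor (assumes the
    // @tensorflow-models/mobilenet <script> tag is loaded).
    mobilenet.load().then((model) => {
      const img = document.getElementById('dogImage'); // hypothetical image
      const features = model.infer(img); // a tensor of roughly 1000 values
      features.print();
    });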

Back to the coding board

The good news is that we won't have to recode a KNN from scratch: TensorFlow.js already includes a KNN classifier.

How to use it:

  1. Import TensorFlow's KNN classifier by adding a <script> tag to your HTML file. See TensorFlow's GitHub repository to find it.
  2. The function knnClassifier.create() initializes and returns an instance of this classifier. Call it in your setup() function.
  3. You can add a new example to your classifier by calling myClassifier.addExample(newData, className), where newData is the result of MobileNet's .infer() function and className is a number or a string.
  4. Finally, you can start a prediction by calling myClassifier.predictClass(newData).then(callback), as shown in the sketch below.
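Put together, a minimal sketch could look like this. Keep in mind this is only an illustrative example: the key bindings, class names, and canvas size are arbitrary choices, and it assumes the script tags for tfjs, mobilenet, and the KNN classifier are all present in your HTML file:

    // Minimal p5.js sketch combining MobileNet and the KNN classifier.
    let video;
    let net;        // the MobileNet model
    let classifier; // the KNN classifier

    function setup() {
      createCanvas(320, 240);
      video = createCapture(VIDEO);
      video.size(320, 240);
      video.hide();
      classifier = knnClassifier.create(); // step 2
      mobilenet.load().then((model) => { net = model; });
    }

    function draw() {
      image(video, 0, 0);
    }

    function keyPressed() {
      if (!net) return; // MobileNet is still loading
      const features = net.infer(video.elt); // reduce the frame to ~1000 values
      if (key === 'a') classifier.addExample(features, 'classA'); // step 3
      if (key === 'b') classifier.addExample(features, 'classB');
      if (key === 'p') {
        // Step 4: only works once at least one example has been added
        classifier.predictClass(features).then((result) => {
          console.log(result.label, result.confidences);
        });
      }
    }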

You can find a working p5.js implementation of a KNN using MobileNet here.

If you want to learn more about KNN and Transfer Learning, have a look at this 3-part video series by The Coding Train on how to build an image classifier using ml5.js, a library built on top of TensorFlow.js.

Now, let's have some fun!

Train your KNN to detect two or more things: for example, positions of your hand, facial expressions (smiling or not), you and your friends, different objects... Play with your new model and try to answer the following:

  1. What's the minimum number of examples needed to get a good result?
  2. Does the prediction still make sense if you change the camera angle between training and predictions?
  3. What small details in your picture make the network "click" and switch from one category to another? Can you find a way to reliably trick it?
  4. Try to train your model to detect the difference between you smiling and you not smiling (or anything with your face). Does it work with someone else's face without re-training it?

By doing these little exercises, you should have experienced overfitting and/or a biased model first-hand. We will see the consequences of this in the chapter Ethics & AI.


Quiz

Answer the quiz to make sure you understand the main notions. Some questions may require a quick Internet search!

You can answer this quiz as many times as you want, only your best score will be taken into account. Simply reload the page to get a new quiz.



Assignment

Project

  1. Create a p5.js sketch (from scratch or using this code as a base) that implements a custom image classifier using KNN and MobileNet.
  2. Train your model to detect at least three different classes.

  3. Make a short video of your application. This video must meet the following requirements:
    • be between 40 and 60 seconds long
    • be in French or in English
    • show a working detection of your three classes
    • describe a use case associated with your application (who would be your users and how they would benefit from it)
    Bonus: Show us a way to trick your model!

    What do we expect when we ask for a "use case"?

    Generally, explaining the problem you are solving, giving a basic description of your users, and showing how your project will help them is enough.

    For example:

    • This is a game for children learning numbers.
    • This is an example of a game that could be part of a bigger series on letters, animals, and jobs. Children from ages 3 to 6 (kindergarten/pre-school) would play with them at home. To help our model with recognition (and increase revenues!), we could also sell a playing card set so that instead of recognizing a number on any picture, we could focus our work on recognizing 10 specific cards perfectly.

    Another example:

    • This is an app for colorblind people to tell them the color of an object.
    • This app was made to help colorblind people get dressed (especially people with monochromatic vision) by taking a picture of their clothes in the morning. Using AI, we can detect both the color and the type of clothes to suggest an outfit. With further development, we could store all the clothes that a user owns to directly suggest items that would work with the clothes in the picture.

    That does not mean that a long answer is mandatory, but give us a bit of context; if your use case is 50 characters long, there's probably a problem!

    If you prefer, you can also follow the Five Ws method.

    You can either show the training process or train your model before shooting the video, but a video only presenting an untrained model will get you a 0/20.

    You can either shoot a short video of yourself using it or record a screencast of your computer.

    We won't judge the technical and/or artistic quality of the video, so don't lose too much time on it; we just want to clearly see your model running and hear you explain what it is doing and what your use case is.

  4. Upload your video to Imgur, include a link to your p5.js project in your video description, and copy the link in the form below.

Evaluation criteria

The video shows a working model (5 pts)

  • The video is clear: we can see a well-trained model running with at least three classes and making acceptable predictions.
  • The class names are relevant to the use case.

Code customization (3 pts)

  • The app presents the results of the classification in an interesting way using HTML elements.

Use case: Relevance (5 pts)

  • The use case describes a real problem.

Use case: Description (5 pts)

  • Clear explanation of why, how, and for whom the project is relevant.

Quality of oral (or written) communication (2 pts)

  • Students explain themselves clearly and with a professional tone.

Bonus: The video presents a way to trick the model (1 pt)
