ML One
Lecture 03
Introduction to data types and face detection
Welcome 👩‍🎤🧑‍🎤👨‍🎤
By the end of this lecture, we'll have learnt about:
The theoretical:
- Introduction to data types in data science
- Introduction to face detection
The practical:
- Two example Apps that use Apple's face detection model
First of all, don't forget to confirm your attendance on the Seats App!
Recap
Representation 🧠
- descriptive, perspective, and contextual
Numeric representation 🌶️
- How we use numbers to represent image, audio and text
- How we use numbers (with an interpretation guide) to represent image classes (🐶 or 🐼)
Image classification 🕹️
- Given an input image of a pre-defined size, an IC model predicts the probability of that image belonging to each class in a pre-defined set of classes.
- Image classes == image categories in this unit.
- We have seen an example of deploying a ready-to-use IC model in a Swift playground to predict the image class of your favourite image from the internet.
- We have NOT talked about how IC models work in low-level computational detail, or how to make one from scratch (these are saved for later).
Morning noodling time!
Imagine I have an awesome image classification model for detecting which season it is (here in UK)...
🌶️ Q1: How many classes are there?
4 classes
Morning noodling time!
Imagine I have an awesome image classification model for detecting which season it is (here in UK)
🌶️🌶️ Q2: How can we use numbers to represent each class?
There are many ways!
For instance,
we can use 1 for spring, 2 for summer, 3 for autumn and 4 for winter
A machine-learning-convention way:
[1,0,0,0] for spring
[0,1,0,0] for summer
[0,0,1,0] for autumn
[0,0,0,1] for winter
BTW this number representation for classes is called "one-hot encoding"
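Just to make this concrete, here is a tiny sketch (my own illustration, not code from the slides) of one-hot encoding the four seasons in Swift:

```swift
// A minimal sketch of one-hot encoding, assuming the class order
// [spring, summer, autumn, winter] used on this slide.
let seasons = ["spring", "summer", "autumn", "winter"]

/// Returns a vector with 1 at the position of `label` and 0 everywhere else.
func oneHot(_ label: String, classes: [String]) -> [Double] {
    return classes.map { $0 == label ? 1.0 : 0.0 }
}

print(oneHot("spring", classes: seasons))  // [1.0, 0.0, 0.0, 0.0]
print(oneHot("autumn", classes: seasons))  // [0.0, 0.0, 1.0, 0.0]
```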
Morning noodling time!
Imagine I have an awesome image classification model for detecting which season it is (here in UK)
🌶️🌶️🌶️ Q3: How can we use numbers to represent
"I think there is a 10% chance that this is a spring image, 20% for summer, 70% for autumn, and 0% for winter"?
🌶️🌶️🌶️
[0.1, 0.2, 0.7, 0] for the win!
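And going the other way (again a hedged sketch of mine, not from the slides): given such a probability vector, the predicted class is simply the one with the largest entry, often called the "argmax".

```swift
// The same class order as before, and the probabilities from the question.
let seasons = ["spring", "summer", "autumn", "winter"]
let probabilities = [0.1, 0.2, 0.7, 0.0]

// Pick the index of the largest probability ("argmax") to get the prediction.
if let bestIndex = probabilities.indices.max(by: { probabilities[$0] < probabilities[$1] }) {
    print(seasons[bestIndex], probabilities[bestIndex])  // autumn 0.7
}
```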
The end of recap
Data types in a programming language (e.g. Swift): Float, Int, etc.
Data types in data science: the roles numbers play in describing the world
numerical (for quantitative data)
categorical (for qualitative data)
numerical data
- discrete type
- continuous type
Numerical data
- Usually discrete values occur as the result of counting something
- and continuous values occur as the result of measuring something
but exceptions may apply
- 🌶️ Can you think of an exception where the data comes from measuring but is of a discrete type?
shoe sizes
Categorical data
- ordinal type (categories with an implied order)
- nominal type (named category, no order implied)
Quiz time! Which data type is it?
🦿 The number of legs this desk has
Numerical data
- discrete type
- continuous type
Categorical data
- ordinal type (categories with an implied order)
- nominal type (named category, no order implied)
🦿 The number of legs this desk has
- Numerical and discrete
The weight of our Camberwell building
The weight of our Camberwell building
- Numerical and continuous
The floor number of CCI
The floor number of CCI
- Categorical and ordinal
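If it helps to see the four data types side by side, here is a rough sketch (the enum and its names are mine, not from the slides) in Swift:

```swift
// A sketch of the four statistical data types from this quiz, as a Swift enum.
enum DataType {
    case numericalDiscrete     // from counting, e.g. the number of legs a desk has
    case numericalContinuous   // from measuring, e.g. the weight of the Camberwell building
    case categoricalOrdinal    // ordered categories, e.g. the floor number of CCI
    case categoricalNominal    // named categories with no order, e.g. eye colour
}

let quizAnswers: [String: DataType] = [
    "number of desk legs": .numericalDiscrete,
    "building weight": .numericalContinuous,
    "floor number": .categoricalOrdinal,
]
```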
Face detection
What we will talk about today:
- What is face detection?
- What can a face detection model do?
What we will NOT talk about today:
- How does a face detection model work internally?
- How to make a face detection model from scratch?
Face detection
- It is a computer vision task.
- It involves automatically identifying and locating human faces within digital images or videos.
- It takes digital images or videos as input.
- Depending on the model/system, its output usually includes bounding boxes and landmark coordinates.
This is an example input image for face detection model
one type of face detection model output: a bounding box around the detected face
another type of face detection model output: detected facial landmarks
Quiz time!
Which question is more difficult?
- 1. Is there a face in this image?
- 2. Which grid/square in this image contains a face?
Quiz time!
Which question is more difficult?
- 1. Is there a face in this image? (classification) 🌶️
- 2. Which grid/square in this image contains a face? (bounding box detection) 🌶️🌶️
Quiz time!
Which question is more difficult?
- 1. Which grid/square in this image contains a face?
- 2. Where exactly in the image does the detected face have a right eye, a left eye, a nose, etc.?
Quiz time!
Which question is more difficult?
- 1. Which grid/square in this image contains a face? (bounding box detection) 🌶️🌶️
- 2. Where exactly in the image does the detected face have a right eye, a left eye, a nose, etc.? (landmark detection) 🌶️🌶️🌶️
Quiz time! 🌶️
How to use numbers to represent the answer to this question?
- 1. Is there a face in this image?
- hint: this is a classic classification label
[0, 1]
where the first number corresponds to the class "HasFace" and the second number corresponds to the class "NoFace"
Quiz time! 🌶️
How to use numbers to represent the answer to this?
- The coordinate of a point within an image.
- hint: there are different ways...
One way of representing the point coordinate (using upper-left corner as the origin [0,0])
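Here is a small sketch of that idea (my own convention for illustration, not necessarily the one Vision uses): store the point as two numbers, and optionally divide by the image size so the representation no longer depends on the resolution.

```swift
import CoreGraphics

// A pixel coordinate, with the origin [0, 0] at the upper-left corner.
let pixelPoint = CGPoint(x: 320, y: 240)

// The same point normalised to the 0...1 range for a 640x480 image.
let imageSize = CGSize(width: 640, height: 480)
let normalisedPoint = CGPoint(x: pixelPoint.x / imageSize.width,
                              y: pixelPoint.y / imageSize.height)
print(normalisedPoint)  // (0.5, 0.5)
```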
Quiz time! 🌶️🌶️
How to use numbers to represent this?
- The location of a rectangle (bounding box) within an image.
- hint: there are different ways...
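One common answer (a sketch of one possible convention, there are others) is to store a corner point plus a width and a height, which is exactly what a CGRect holds:

```swift
import CoreGraphics

// Four numbers: the box's origin (one corner) plus its width and height.
let faceBox = CGRect(x: 120, y: 80, width: 200, height: 200)

// An equivalent four-number representation: two opposite corner points.
let topLeft = CGPoint(x: faceBox.minX, y: faceBox.minY)
let bottomRight = CGPoint(x: faceBox.maxX, y: faceBox.maxY)
```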
Quiz time! 🌶️🌶️
How to use numbers to represent the answer to this?
- 3. Which points in the image correspond to the right eye centre, the left eye centre, the nose tip, etc.?
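One possible answer (again a sketch of mine; the landmark names are not Vision's) is a named list of points, one per landmark:

```swift
import CoreGraphics

// A sketch of landmark output: one labelled point per facial landmark,
// here written in normalised [0, 1] coordinates.
let landmarks: [String: CGPoint] = [
    "rightEyeCentre": CGPoint(x: 0.35, y: 0.40),
    "leftEyeCentre":  CGPoint(x: 0.65, y: 0.40),
    "noseTip":        CGPoint(x: 0.50, y: 0.60),
]
```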
Till now we have looked at:
- Bounding boxes and facial landmarks as the face detection model's output
- Bounding boxes
- How bounding boxes are represented in Apple's Vision framework
- Landmarks
- How landmarks are represented in Apple's Vision framework (a short sketch of both follows below)
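For the curious, here is a short hedged sketch of both. In Vision, a detected face comes back as a VNFaceObservation: its boundingBox is a CGRect in normalised coordinates (0 to 1) with the origin at the lower-left of the image, and landmark points are normalised too, so they usually need converting before drawing. The helper function below is mine; the Vision types and conversion calls are the real API.

```swift
import CoreGraphics
import Vision

// A hypothetical helper (not from the example Apps) showing how Vision's
// normalised outputs can be converted into pixel coordinates.
func describe(_ face: VNFaceObservation, imageWidth: Int, imageHeight: Int) {
    // boundingBox is normalised (0...1) with a lower-left origin.
    let pixelBox = VNImageRectForNormalizedRect(face.boundingBox, imageWidth, imageHeight)
    print("Face bounding box in pixels:", pixelBox)

    // Landmarks (if requested) come back as groups of normalised points.
    if let leftEye = face.landmarks?.leftEye {
        let size = CGSize(width: imageWidth, height: imageHeight)
        print("Left eye points in pixels:", leftEye.pointsInImage(imageSize: size))
    }
}
```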
That's quite a lot, congrats!
Now let's take a look at two example Apps that use Apple's face detection model
What can we do with detected bounding boxes?
We can count how many faces there are in the image and draw the bounding boxes on the image!
Please download the Apps
here
- All the code is prepared.
- We only need to make some minor modifications to get the App running on your phone.
Connect your phone to the MacBook and open the Xcode project
Here are the steps for getting the App running on your phone
- There might be some issues coming up, let me know!!!
This is what the App looks like when it runs on your phone
Don't be scared by the big chunk of code
- We are not expected to write this from scratch at the moment.
- A lot of it will become more familiar after Coding and Product One!
- Most of the code is for building the basic functionality of the App (building the UI, waking up the camera on demand, etc.).
- That means most of it is directly reusable for your own projects!
Little task:
- Can you find "VNDetectFaceRectanglesRequest()" in Faces.swift?
- That's where we tell the system to run face detection to produce bounding box output.
Little task:
- Can you find "VNDetectFaceRectanglesRequest()" in Faces.swift?
- That's where we tell the system to run face detection to produce bounding box output.
- It's on line 16 of Faces.swift
Just for your curiosity,
- Line 61 in Faces.swift is where we retrieve the detected bounding boxes
- (and then draw that rectangle on the image)
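If you want to see the whole bounding-box flow in one place, here is a minimal hedged sketch of running VNDetectFaceRectanglesRequest on a UIImage. The helper name countFaces is mine and the real App in Faces.swift does more (camera handling, drawing), but the Vision calls are the same ones used there.

```swift
import UIKit
import Vision

/// A hypothetical helper (not from the App) that counts faces in an image.
func countFaces(in image: UIImage, completion: @escaping (Int) -> Void) {
    guard let cgImage = image.cgImage else { completion(0); return }

    // The request that tells Vision to produce face bounding boxes.
    let request = VNDetectFaceRectanglesRequest { request, _ in
        let faces = request.results as? [VNFaceObservation] ?? []
        completion(faces.count)
    }

    // Run the request on the image, off the main thread.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}
```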
What can we do with detected landmarks?
We can use the landmarks to overlay emojis nicely over the detected faces!
Please download the Apps
here
- All the code is prepared.
- We only need to make some minor modifications to get the App running on your phone.
Connect your phone to the MacBook and open the Xcode project
Here are the steps for getting the App running on your phone
- There might be some issues coming up, let me know!!!
This is what the App looks like when it runs on your phone
Don't be scared by the big chunk of code
- We are not expected to write this from scratch at the moment.
- A lot of it will become more familiar after Coding and Product One!
- Most of the code is for building the basic functionality of the App (building the UI, waking up the camera on demand, etc.).
- That means most of it is directly reusable for your own projects!
Little task:
- Can you find "VNDetectFaceLandmarksRequest()" in Faces.swift?
- That's where we tell the system to run face detection to produce landmarks output.
Little task:
- Can you find "VNDetectFaceLandmarksRequest()" in Faces.swift?
- That's where we tell the system to run face detection to produce landmarks output.
- It's on line 14 of Faces.swift
Recall from the previous App,
- We use VNDetectFaceRectanglesRequest() for detecting bounding boxes.
In this app,
- We use VNDetectFaceLandmarksRequest() for detecting landmarks.
Just for your curiosity,
- Line 106 in Faces.swift is where we retrieve the detected landmarks for anchoring the emoji.
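And the landmarks flow looks almost identical; here is a minimal hedged sketch (the helper name and the printing are mine, while VNDetectFaceLandmarksRequest and the landmark properties are the real Vision API used in Faces.swift):

```swift
import UIKit
import Vision

/// A hypothetical helper (not from the App) that prints each face's nose landmark.
func printNoseLandmarks(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    // This request produces landmarks (eyes, nose, mouth, ...) as well as boxes.
    let request = VNDetectFaceLandmarksRequest { request, _ in
        let faces = request.results as? [VNFaceObservation] ?? []
        let size = CGSize(width: cgImage.width, height: cgImage.height)
        for face in faces {
            // Landmark points are normalised; convert them to pixel coordinates.
            if let nosePoints = face.landmarks?.nose?.pointsInImage(imageSize: size) {
                print("Nose points:", nosePoints)
            }
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```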
The scope of these examples is for you to see face detection in action in Apps, well done everyone! ๐
Take a moment and think about what you would do with Apple's face detection model
Today we have looked at:
- One-hot encoding for class labels 🔥
- Face detection
-- Bounding boxes and landmarks as output
- Two example Apps using face detection
A COOL AI project, borrowed from Murad's slides:
In the artwork Pareidolia*, facial detection is applied to grains of sand. A fully automated robot search engine examines the grains of sand in situ. When the machine finds a face in one of the grains, the portrait is recorded.
We'll see you next Thursday same time and same place!