ML One
Lecture 03
Introduction to data types and face detection 😎
Welcome 👩‍🎤🧑‍🎤👨‍🎤
By the end of this lecture, we'll have learnt about:
The theoretical:
- Introduction to data types in data science
- Introduction to face detection
The practical:
- Two example Apps that use Apple's face detection model
First of all, don't forget to confirm your attendance on Seats App!
A nice tune with music video from my favourite UK Jazz musician to wake us up
Recap
Representation 🧠
- descriptive, perspective, and contextual
Numeric representation 🌶️
- How we use numbers to represent images, audio and text
- How we use numbers (with an interpretation guide) to represent image classes (🐶 or 😼)
Image classification 🕹️
- Given an input image of a pre-defined size, an IC model predicts the probability that the image belongs to each class in a pre-defined set of classes.
- Image classes == image categories in this unit.
- We have seen an example of deploying a ready-to-use IC model in a Swift playground to predict the image class of your favourite image from the internet.
- We have NOT talked about how IC models work at a low computational level or how to make one from scratch (these are saved for later).
Morning noodling time!
Imagine I have an awesome image classification model for detecting which season it is (here in UK)...
🌶️ Q1: How many classes are there?
4 classes 🍀
🌶️🌶️ Q2: How can we use numbers to represent each class?
There are many ways!
For instance,
we can use 1 for spring, 2 for summer, 3 for autumn and 4 for winter
A more conventional way in machine learning:
[1,0,0,0] for spring
[0,1,0,0] for summer
[0,0,1,0] for autumn
[0,0,0,1] for winter
BTW this number representation for classes is called "one-hot encoding"
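As a minimal sketch of one-hot encoding in Swift: the helper name `oneHot` and the `seasons` array are made up for this example. The only thing that matters is that the class order is fixed and agreed in advance (the "interpretation guide").

```swift
// The four season classes from the example above, in a fixed order.
let seasons = ["spring", "summer", "autumn", "winter"]

/// Returns a one-hot vector for `label`, or nil if the label is unknown.
/// `oneHot` is a hypothetical helper written for this example.
func oneHot(_ label: String, classes: [String]) -> [Int]? {
    guard let index = classes.firstIndex(of: label) else { return nil }
    var vector = [Int](repeating: 0, count: classes.count)
    vector[index] = 1  // exactly one position is "hot"
    return vector
}

if let autumn = oneHot("autumn", classes: seasons) {
    print(autumn) // [0, 0, 1, 0]
}
```

Changing the order of `seasons` changes every vector, which is why the class order has to be shared between whoever writes the labels and whoever reads them.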
🌶️🌶️🌶️ Q3: How can we use numbers to represent
"I think there is a 10% chance for this image to be a spring image, 20% for summer, 70% for autumn, 0% for winter"?
[0.1, 0.2, 0.7, 0] for the win!
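A small sketch of how such a probability vector could be read back in Swift, assuming the same class order as before (spring, summer, autumn, winter). The helper `mostLikelyClass` is hypothetical and only picks the class with the largest probability.

```swift
let seasons = ["spring", "summer", "autumn", "winter"]
let probabilities: [Double] = [0.1, 0.2, 0.7, 0.0]

/// Returns the class name with the highest probability (hypothetical helper).
/// Returns nil if the two arrays do not line up.
func mostLikelyClass(_ probabilities: [Double], classes: [String]) -> String? {
    guard probabilities.count == classes.count,
          let best = probabilities.indices.max(by: { probabilities[$0] < probabilities[$1] })
    else { return nil }
    return classes[best]
}

// The four probabilities describe one image, so they should add up to 1
// (allowing for tiny floating-point rounding).
let total = probabilities.reduce(0, +)
print(abs(total - 1.0) < 1e-9)  // true
print(mostLikelyClass(probabilities, classes: seasons) ?? "unknown")  // autumn
```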
The end of recap
Data types in a programming language (e.g. Swift): float, integer, etc.
Data types in data science: the roles numbers play in describing the world
numerical (for quantitative data)
categorical (for qualitative data)
numerical data
- discrete type
- continuous type
Numerical data
- Usually discrete values occur as the result of counting something
- and continuous values occur as the result of measuring something
but exceptions may apply
- 🌶️ Can you think of an exception where the data comes from measuring but is of discrete type?
Shoe sizes 👟
Categorical data
- ordinal type (categories with an implied order)
- nominal type (named category, no order implied)
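One way to picture the two categorical types in Swift is with enums. This is only an illustrative sketch: `Pet` and `TShirtSize` are made-up examples, and the only difference between them is whether comparing two values with `<` makes sense.

```swift
// Nominal: named categories with no implied order.
enum Pet: String, CaseIterable {
    case dog, cat
}

// Ordinal: categories with an implied order, so comparison makes sense.
// `TShirtSize` is an illustrative example that does not come from the lecture.
enum TShirtSize: Int, Comparable, CaseIterable {
    case small = 0, medium, large

    static func < (lhs: TShirtSize, rhs: TShirtSize) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

print(Pet.allCases.count)                   // 2
print(TShirtSize.small < TShirtSize.large)  // true
```

Note that the raw numbers behind `TShirtSize` only carry the order: "large minus small" has no meaning, which is what separates ordinal categories from numerical data.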
hint: 🍀
Quiz time! Which data type is it?
🦿 The number of legs this desk has
📏 Numerical data
- discrete type
- continuous type
📦 Categorical data
- ordinal type (categories with an implied order)
- nominal type (named category, no order implied)
- Answer: numerical and discrete
🧞 The weight of our Camberwell building
- Answer: numerical and continuous
🧞 The floor number of CCI
- Answer: categorical and ordinal
Face detection 😎
What we will talk about today:
- What is face detection?
- What can a face detection model do?
What we will NOT talk about today:
- How does a face detection model work internally?
- How to make a face detection model from scratch?
Face detection
- It is a computer vision task.
- It involves automatically identifying and locating human faces within digital images or videos.
- It takes digital images or videos as input.
- Depending on the model/system, its output usually includes bounding boxes and landmark coordinates.
This is an example input image for a face detection model
one type of face detection model output: a bounding box around the detected face
another type of face detection model output: detected facial landmarks
another slide to flex this manga landmark detection model
Quiz time!
Which question sounds harder?
- 1. Is there any face in this image?
- 2. Which grid cell in the image has a face?
Answer:
- 1. is classification 🌶️
- 2. is bounding box detection 🌶️🌶️ (harder)
Quiz time!
Which question sounds harder?
- 1. Which grid cell in the image has a face?
- 2. Where exactly in the detected face are the right eye, the left eye, the nose, etc.?
Answer:
- 1. is bounding box detection 🌶️🌶️
- 2. is landmark detection 🌶️🌶️🌶️ (harder)
Quiz time! 🌶️
How can we use numbers to represent the answer to this question?
- 1. Is there any face in this image?
- hint: this is a classic classification label
[0, 1]
where the first number corresponds to the class "HasFace" and the second number corresponds to the class "NoFace", so [0, 1] here means "NoFace"
Quiz time! 🌶️
How can we use numbers to represent the answer to this?
- The coordinate of a point within an image.
- hint: there are different ways...
One way of representing the point coordinate (using upper-left corner as the origin [0,0])
The Apple way of representing a point coordinate within an image
- With the lower-left corner as the origin point [0, 0]
- One number specifying the x-coordinate of the point.
- One number specifying the y-coordinate of the point.
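The two conventions above differ only in where y is measured from. A minimal sketch of converting between them, assuming pixel coordinates and a known image height (`PixelPoint` and `flipVertically` are hypothetical names for this example):

```swift
/// A point in pixel coordinates.
struct PixelPoint: Equatable {
    var x: Double
    var y: Double
}

/// Converts between a lower-left origin and an upper-left origin.
/// Only y changes; applying the function twice returns the original point.
func flipVertically(_ point: PixelPoint, imageHeight: Double) -> PixelPoint {
    PixelPoint(x: point.x, y: imageHeight - point.y)
}

// In an image that is 80 pixels tall, a point 10 pixels above the bottom edge
// is 70 pixels below the top edge.
let lowerLeftPoint = PixelPoint(x: 30, y: 10)
let upperLeftPoint = flipVertically(lowerLeftPoint, imageHeight: 80)
print(upperLeftPoint.x, upperLeftPoint.y) // 30.0 70.0
```

Forgetting this flip is a common reason for boxes or emojis appearing upside down relative to the face.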
Quiz time! 🌶️🌶️
How can we use numbers to represent this?
- The location of a rectangle (bounding box) within an image.
- hint: there are different ways...
The Apple way of representing a rectangle (bounding box) within an image
- Two numbers specifying the coordinate of the lower-left corner of the rectangle.
- One number specifying the width of the rectangle.
- One number specifying the height of the rectangle.
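Here is a rough sketch of those four numbers in Swift. It assumes the values are fractions of the image size between 0 and 1 (this is how Vision's `boundingBox` property is documented) and converts them to pixels with an upper-left origin. The types `NormalizedBox` and `PixelBox` and the function `toPixelBox` are made up for this example.

```swift
/// A rectangle described by its lower-left corner plus width and height.
/// Assumption: all values are fractions of the image size (0...1).
struct NormalizedBox {
    var x: Double       // lower-left corner, horizontal
    var y: Double       // lower-left corner, vertical (measured from the bottom)
    var width: Double
    var height: Double
}

/// A rectangle in pixels with an upper-left origin.
struct PixelBox: Equatable {
    var left: Double
    var top: Double
    var width: Double
    var height: Double
}

func toPixelBox(_ box: NormalizedBox, imageWidth: Double, imageHeight: Double) -> PixelBox {
    PixelBox(
        left: box.x * imageWidth,
        // The box's top edge sits at (y + height) from the bottom, so flip it.
        top: (1 - box.y - box.height) * imageHeight,
        width: box.width * imageWidth,
        height: box.height * imageHeight
    )
}

let face = NormalizedBox(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let pixels = toPixelBox(face, imageWidth: 400, imageHeight: 200)
print(pixels.left, pixels.top, pixels.width, pixels.height) // 100.0 50.0 200.0 50.0
```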
Quiz time! 🌶️🌶️
How can we use numbers to represent the answer to this?
- 3. Which point in the image corresponds to the right eye centre, the left eye centre, the nose tip, etc.?
The Apple way of representing facial landmarks within an image
- A set of coordinates with one coordinate for each landmark.
- Which landmarks are used by Apple?
- Let's take a look at the document!
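To make "a set of coordinates" concrete, here is a toy sketch in plain Swift. The landmark names and values are invented for illustration and are not Apple's actual list, which is in the documentation. The helper `midpoint` shows the kind of calculation an app might do with landmarks, for example to find a spot between the eyes.

```swift
struct Point: Equatable {
    var x: Double
    var y: Double
}

// One coordinate per landmark. The names and values are made up for illustration.
let landmarks: [String: Point] = [
    "leftEye": Point(x: 0.25, y: 0.75),
    "rightEye": Point(x: 0.75, y: 0.75),
    "nose": Point(x: 0.5, y: 0.5),
]

/// Returns the point halfway between two landmarks, or nil if either is missing.
func midpoint(of first: String, and second: String, in landmarks: [String: Point]) -> Point? {
    guard let a = landmarks[first], let b = landmarks[second] else { return nil }
    return Point(x: (a.x + b.x) / 2, y: (a.y + b.y) / 2)
}

if let betweenEyes = midpoint(of: "leftEye", and: "rightEye", in: landmarks) {
    print(betweenEyes.x, betweenEyes.y) // 0.5 0.75
}
```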
Till now we have looked at:
- Bounding boxes and facial landmarks as a face detection model's output
- Bounding boxes
-- How bounding boxes are represented in Apple's Vision framework
- Landmarks
-- How landmarks are represented in Apple's Vision framework
That's quite a lot, congrats! 🎉
Female Figure by Jordan Wolfson, an installation that uses good old face detection models
Now let's take a look at two example Apps that use Apple's face detection model
What can we do with detected bounding boxes?
We can count how many faces there are in the image and draw the bounding boxes on the image!
Please download the Apps here 🎉
- All the code is prepared.
- We only need to make some minor modifications to get the Apps running on your phone.
If you have not enabled developer mode on your device
Connect your phone to the MacBook and open the Xcode project
Here are the steps for getting the App running on your phone
- There might be some issues coming up, let me know!!!
This is what the App looks like when it runs on your phone
Don't be scared by the big chunk of code
- We are not expected to write this from scratch at the moment.
- A lot of it will become more familiar after Coding and Product One!
- Most of the code is for building the basic functionality of the App (building the UI, waking up the camera on demand, etc.).
- That means most of it is directly re-usable for your own project!
Little task:
- Can you find "VNDetectFaceRectanglesRequest()" in Faces.swift?
- That's where we tell the system to run face detection and produce bounding box output.
- It's on line 16 of Faces.swift
Just for your curiosity,
- Line 61 in Faces.swift is where we retrieve the detected bounding boxes
- (and then draw that rectangle on the image)
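For the curious, here is a rough sketch of what running this request can look like on its own. It is not the code in Faces.swift: the function name `detectFaceBoxes` is made up, and it assumes a recent Apple SDK where `results` is already typed as face observations. It only compiles on Apple platforms, hence the `canImport` guard.

```swift
#if canImport(Vision)
import Vision
import CoreGraphics

/// Runs face rectangle detection on a CGImage and returns one
/// bounding box per detected face.
/// `detectFaceBoxes` is a hypothetical helper, not the code in Faces.swift.
func detectFaceBoxes(in image: CGImage) throws -> [CGRect] {
    let request = VNDetectFaceRectanglesRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    // Each observation describes one face. Its boundingBox uses a
    // lower-left origin and values between 0 and 1.
    return (request.results ?? []).map { $0.boundingBox }
}
#endif
```

Counting the faces is then just `detectFaceBoxes(in: image).count`, and each returned rectangle can be scaled to pixels before drawing.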
What can we do with detected landmarks?
We can use the landmarks to overlay emojis nicely over the detected faces!
Please download the Apps here 🎉
- All the code is prepared.
- We only need to make some minor modifications to get the Apps running on your phone.
Connect your phone to the MacBook and open the Xcode project
Here are the steps for getting the App running on your phone
- There might be some issues coming up, let me know!!!
This is what the App looks like when it runs on your phone
Don't be scared by the big chunk of code
- We are not expected to write this from scratch at the moment.
- A lot of it will become more familiar after Coding and Product One!
- Most of the code is for building the basic functionality of the App (building the UI, waking up the camera on demand, etc.).
- That means most of it is directly re-usable for your own project!
Little task:
- Can you find "VNDetectFaceLandmarksRequest()" in Faces.swift?
- That's where we tell the system to run face detection and produce landmark output.
- It's on line 14 of Faces.swift
Recall from the previous App,
- We use VNDetectFaceRectanglesRequest() for detecting bounding boxes.
In this app,
- We use VNDetectFaceLandmarksRequest() for detecting landmarks.
Just for your curiosity,
- Line 106 in Faces.swift is where we retrieve the detected landmarks for anchoring the emoji.
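As with the bounding boxes, here is a rough standalone sketch of reading landmarks from Vision. It is not the code in Faces.swift: `detectLeftEyes` is a made-up name, the left eye is picked only as an example, and a recent Apple SDK is assumed. It only compiles on Apple platforms.

```swift
#if canImport(Vision)
import Vision
import CoreGraphics

/// Returns the left-eye landmark points of every detected face,
/// converted to image coordinates (still with a lower-left origin).
/// `detectLeftEyes` is a hypothetical helper, not the code in Faces.swift.
func detectLeftEyes(in image: CGImage) throws -> [[CGPoint]] {
    let request = VNDetectFaceLandmarksRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    let imageSize = CGSize(width: image.width, height: image.height)
    return (request.results ?? []).compactMap { face in
        // landmarks and leftEye are optional: a region may not be detected.
        face.landmarks?.leftEye?.pointsInImage(imageSize: imageSize)
    }
}
#endif
```

Note that in Vision a landmark such as the left eye is a small group of points outlining the region, so an app that needs a single anchor point typically averages them.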
The aim of these examples is for you to see face detection in action in Apps. Well done everyone! 🎉
Take a moment and think about what you would do with Apple's face detection model 🎉
Today we have looked at:
- One-hot encoding for class labels 🔥
- Face detection 😎
-- Bounding boxes and landmarks as output
- Two example Apps using face detection
a COOL AI project borrowed from Murad's slides
In the artwork Pareidolia* facial detection is applied to grains of sand. A fully automated robot search engine examines the grains of sand in situ. When the machine finds a face in one of the grains, the portrait is recorded.
We'll see you next Thursday same time and same place!