
ML One
Lecture 03
Introduction to data types and face detection 😎

Welcome 👩‍🎤🧑‍🎤👨‍🎤

By the end of this lecture, we'll have learnt about:
The theoretical:
- Introduction to data types in data science
- Introduction to face detection
The practical:
- Two example Apps that use Apple's face detection model

First of all, don't forget to confirm your attendance on Seats App!

A nice tune with music video from my favourite UK Jazz musician to wake us up

Recap

Representation 🧠
- descriptive, perspective, and contextual
Numeric representation 🌶️
- How do we use numbers to represent images, audio and text?
- How do we use numbers (with an interpretation guide) to represent image classes (🐶 or 😼)?

Image classification 🕹️
- Given an input image of a pre-defined size, an IC model predicts the probability that the image belongs to each class in a pre-defined set of classes.
- Image classes == image categories in this unit.
- We have seen an example of deploying a ready-to-use IC model to predict the image class for your favourite image from the internet in Swift playground.
- We have NOT talked about how IC models work in computational low-level details and how to make one from scratch (these are saved for later).

Morning noodling time!
Imagine I have an awesome image classification model for detecting which season it is (here in the UK)...
🌶️ Q1: How many classes are there?

4 classes 🍀

Morning noodling time!
Imagine I have an awesome image classification model for detecting which season it is (here in the UK)
🌶️🌶️ Q2: How can we use numbers to represent each class?

There are many ways!
For instance,
we can use 1 for spring, 2 for summer, 3 for autumn and 4 for winter

A more conventional way in machine learning:
[1,0,0,0] for spring
[0,1,0,0] for summer
[0,0,1,0] for autumn
[0,0,0,1] for winter

BTW this number representation for classes is called "one-hot encoding"
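
A minimal Swift sketch of one-hot encoding for the season example. The function name `oneHot` and the class order are our own choices for illustration, not part of any library.

```swift
// Hypothetical class list for the season example; the order is our own choice.
let seasons = ["spring", "summer", "autumn", "winter"]

/// Returns a one-hot vector: 1 at the position of `label`, 0 everywhere else.
/// Returns nil if the label is not one of the known classes.
func oneHot(_ label: String, classes: [String]) -> [Int]? {
    guard let index = classes.firstIndex(of: label) else { return nil }
    var vector = [Int](repeating: 0, count: classes.count)
    vector[index] = 1
    return vector
}

print(oneHot("autumn", classes: seasons) ?? [])  // prints [0, 0, 1, 0]
```

The list of class names plays the role of the "interpretation guide": without it, the vector alone does not tell us which season is meant.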

Morning noodling time!
Imagine I have an awesome image classification model for detecting which season it is (here in the UK)
🌶️🌶️🌶️ Q3: How can we use numbers to represent
"I think there is a 10% chance that this image is a spring image, 20% for summer, 70% for autumn, 0% for winter"?

🌶️🌶️🌶️
[0.1, 0.2, 0.7, 0] for the win!
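
A small Swift sketch of how such a probability vector could be read back into a class name. The helper `mostLikelyClass` is a hypothetical name for illustration; it simply picks the position with the largest number.

```swift
// Same hypothetical class order as in the one-hot example.
let seasons = ["spring", "summer", "autumn", "winter"]
let probabilities = [0.1, 0.2, 0.7, 0.0]

/// Returns the class name with the highest probability,
/// or nil if the input is empty or the two arrays differ in length.
func mostLikelyClass(_ probabilities: [Double], classes: [String]) -> String? {
    guard probabilities.count == classes.count,
          let best = probabilities.indices.max(by: { probabilities[$0] < probabilities[$1] })
    else { return nil }
    return classes[best]
}

// The probabilities over all classes should add up to 1 (allowing for rounding).
let total = probabilities.reduce(0, +)
print(abs(total - 1.0) < 1e-9)                                        // prints true
print(mostLikelyClass(probabilities, classes: seasons) ?? "unknown")  // prints autumn
```

A one-hot vector is just the special case where the model is 100% sure about one class.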

The end of recap

data types in programming languages (e.g. Swift): float, integer, etc.

data types in data science: the roles of numbers for describing the world

numerical (for quantitative data)

categorical (for qualitative data)

numerical data
- discrete type
- continuous type

Numerical data
- Usually discrete values occur as the result of counting something
- and continuous values occur as the result of measuring something
but exceptions may apply
- 🌶️ can you think of an exception case where the data comes from measuring but is of discrete type?

shoe sizes 👟

Categorical data
- ordinal type (categories with an implied order)
- nominal type (named category, no order implied)

23
hint: 🏀

Quiz time! Which data type is it?

🦿 The number of legs this desk has

📏Numerical data
- discrete type
- continuous type
📦Categorical data
- ordinal type (categories with an implied order)
- nominal type (named category, no order implied)

🦿 The number of legs this desk has
- Numerical and discrete

🧞 The weight of our Camberwell building

🧞 The weight of our Camberwell building
- Numerical and continuous

🧞 The floor number of CCI

🧞 The floor number of CCI
- Categorical and ordinal

Face detection 😎

What we will talk about today:
- What is face detection?
- What can a face detection model do?
What we will NOT talk about today:
- How does a face detection model work internally?
- How to make a face detection model from scratch?

Face detection
- It is a computer vision task.
- It involves automatically identifying and locating human faces within digital images or videos.
- It takes digital images or videos as input.
- Depending on the model/system, its output usually includes bounding boxes and landmark coordinates.

This is an example input image for face detection model

one type of face detection model output: a bounding box around the detected face

another type of face detection model output: detected facial landmarks

another slide to flex this manga landmark detection model

Quiz time!
Which question sounds harder?
- 1. Is there any face in this image?
- 2. Which grid cell in the image contains a face?

Quiz time!
Which question sounds harder?
- 1. Is there any face in this image? (classification) 🌶️
- 2. Which grid cell in the image contains a face? (bounding box detection) 🌶️🌶️

Quiz time!
Which question sounds harder?
- 1. Which grid cell in the image contains a face?
- 2. Where exactly within the detected face are the right eye, the left eye, the nose, etc.?

Quiz time!
Which question sounds harder?
- 1. Which grid cell in the image contains a face? (bounding box detection) 🌶️🌶️
- 2. Where exactly within the detected face are the right eye, the left eye, the nose, etc.? (landmark detection) 🌶️🌶️🌶️

Quiz time! 🌶️
How to use numbers to represent the answer to this question?
- 1. Is there any face in this image?
- hint: this is a classic classification label

[0, 1]
where the first number corresponds to the class "HasFace" and the second number corresponds to the class "NoFace"

Quiz time! 🌶️
How to use numbers to represent the answer to this?
- The coordinate of a point within an image.
- hint: there are different ways...

One way of representing the point coordinate (using upper-left corner as the origin [0,0])

The Apple way of representing a point coordinate within an image
- With the lower-left corner as the origin point [0, 0]
- One number specifying the x-coordinate of the point.
- One number specifying the y-coordinate of the point.
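
A short Swift sketch of moving a point between the two conventions above (upper-left origin and lower-left origin). `PixelPoint` and `flipVertically` are hypothetical names for illustration, and the coordinates are treated as continuous positions rather than pixel indices.

```swift
/// A point inside an image. Hypothetical type for illustration only.
struct PixelPoint: Equatable {
    var x: Double
    var y: Double
}

/// Converts a point between an upper-left-origin and a lower-left-origin convention.
/// The same formula works in both directions: x is unchanged, y is mirrored.
func flipVertically(_ point: PixelPoint, imageHeight: Double) -> PixelPoint {
    PixelPoint(x: point.x, y: imageHeight - point.y)
}

// A point 20 units below the top edge of an image that is 100 units tall...
let upperLeftPoint = PixelPoint(x: 30, y: 20)
// ...is 80 units above the bottom edge.
let lowerLeftPoint = flipVertically(upperLeftPoint, imageHeight: 100)
print(lowerLeftPoint.x, lowerLeftPoint.y)  // prints 30.0 80.0
```

Forgetting this flip is a common reason for boxes or emojis appearing upside down relative to the face.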

Quiz time! 🌶️🌶️
How to use numbers to represent this?
- The location of a rectangle (bounding box) within an image.
- hint: there are different ways...

The Apple way of representing a rectangle (bounding box) within an image
- Two numbers specifying the coordinate of the lower-left corner of the rectangle.
- One number specifying the width of the rectangle.
- One number specifying the height of the rectangle.
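
A Swift sketch of turning such a four-number box into something we can draw. It assumes the box is normalized (each number is a fraction of the image width or height, between 0 and 1), which is how Vision reports bounding boxes as far as we know. `Box` and `pixelBox` are hypothetical names; Vision itself hands back a CGRect.

```swift
/// A rectangle stored as four numbers: corner x, corner y, width, height.
/// Hypothetical type for illustration only.
struct Box: Equatable {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// Converts a normalized box with a lower-left origin into a pixel box
/// with an upper-left origin (the convention most drawing code uses).
func pixelBox(fromNormalized box: Box, imageWidth: Double, imageHeight: Double) -> Box {
    let width = box.width * imageWidth
    let height = box.height * imageHeight
    let x = box.x * imageWidth
    // The top edge of the box is (y + height) above the bottom of the image,
    // so its distance from the top of the image is 1 - y - height.
    let y = (1.0 - box.y - box.height) * imageHeight
    return Box(x: x, y: y, width: width, height: height)
}

let face = Box(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let drawn = pixelBox(fromNormalized: face, imageWidth: 400, imageHeight: 200)
print(drawn.x, drawn.y, drawn.width, drawn.height)  // prints 100.0 50.0 200.0 50.0
```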

Quiz time! 🌶️🌶️
How to use numbers to represent the answer to this?
- 3. Which point in the image corresponds to the right eye centre, the left eye centre, the nose tip, etc.?

The Apple way of representing facial landmarks within an image
- A set of coordinates with one coordinate for each landmark.
- Which landmarks are used by Apple?
- Let's take a look at the document!
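
A simplified Swift sketch of "a set of coordinates" as a dictionary from landmark name to point. The names and values below are made up for illustration (the real set is in Apple's documentation). It also assumes each landmark is given relative to the face's bounding box, so we map it into image coordinates before using it.

```swift
/// Hypothetical types for illustration only.
struct Point: Equatable {
    var x: Double
    var y: Double
}

struct Box {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// Maps a landmark given relative to the face box (0...1 in each direction)
/// to image coordinates, using the same origin and units as the box.
func imagePoint(for landmark: Point, in faceBox: Box) -> Point {
    Point(x: faceBox.x + landmark.x * faceBox.width,
          y: faceBox.y + landmark.y * faceBox.height)
}

// Made-up landmark names and positions, one coordinate per landmark.
let landmarks: [String: Point] = [
    "leftEye": Point(x: 0.25, y: 0.75),
    "rightEye": Point(x: 0.75, y: 0.75),
    "noseTip": Point(x: 0.5, y: 0.5),
]

let faceBox = Box(x: 100, y: 40, width: 200, height: 80)
let nose = imagePoint(for: landmarks["noseTip"]!, in: faceBox)
print(nose.x, nose.y)  // prints 200.0 80.0
```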

Till now we have looked at:
- Bounding boxes and facial landmarks as a face detection model's output
- Bounding boxes
- How bounding boxes are represented in Apple's Vision framework
- Landmarks
- How landmarks are represented in Apple's Vision framework

That's quite a lot, congrats! 🎉

Female Figure by Jordan Wolfson, an installation that uses good old face detection models

Now let's take a look at two example Apps that use Apple's face detection model

What can we do with detected bounding boxes?

We can count how many faces there are in the image and draw the bounding boxes on the image!

Please download the Apps here 🎉
- All the code is prepared.
- We only need to make some minor modifications to get the Apps running on your phone.

If you have not enabled Developer Mode on your device

Connect your phone to the MacBook and open the Xcode project

Here are the steps for getting the App running on your phone
- There might be some issues coming up, let me know!!!

This App looks like this if it runs on your phone

Don't be scared about the big chunk of code
- We are not expected to write these from scratch at the moment.
- A lot of them will become more familiar after Coding and Product One!
- Most of the code is for building the basic functionality (build the UI, wake up the camera on demand, etc.) of the App.
- That means most of them are directly re-usable for your own project!

Little task:
- Can you find "VNDetectFaceRectanglesRequest()" in Faces.swift ?
- That's where we tell the system to run the face detection for producing bounding box output.

Little task:
- Can you find "VNDetectFaceRectanglesRequest()" in Faces.swift ?
- That's where we tell the system to run the face detection for producing bounding box output.
- It's on line 16 of Faces.swift

Just for your curiosity,
- Line 61 in Faces.swift is where we retrieve the detected bounding boxes
- (and then draw that rectangle on the image)

What can we do with detected landmarks?

We can use the landmarks to overlay emojis nicely over the detected faces!

Please download the Apps here 🎉
- All the code is prepared.
- We only need to make some minor modifications to get the Apps running on your phone.

Connect your phone to the MacBook and open the Xcode project

Here are the steps for getting the App running on your phone
- There might be some issues coming up, let me know!!!

This App looks like this if it runs on your phone

Little task:
- Can you find "VNDetectFaceLandmarksRequest()" in Faces.swift ?
- That's where we tell the system to run the face detection for producing landmarks output.

Little task:
- Can you find "VNDetectFaceLandmarksRequest()" in Faces.swift ?
- That's where we tell the system to run the face detection for producing landmarks output.
- It's on line 14 of Faces.swift

Recall from the previous App,
- We use VNDetectFaceRectanglesRequest() for detecting bounding boxes.
In this app,
- We use VNDetectFaceLandmarksRequest() for detecting landmarks.
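
A rough sketch of how the two requests could be run on a single image outside the Apps. This is not the code from Faces.swift; `summarizeFaces` is a hypothetical helper, and it assumes a recent Apple SDK (iOS 15 / macOS 12 or later) where `results` is already typed as face observations.

```swift
#if canImport(Vision)
import Vision
import CoreGraphics

/// Runs both face requests on one image and prints a short summary.
/// Hypothetical helper for illustration; only builds on Apple platforms.
func summarizeFaces(in image: CGImage) throws {
    let rectanglesRequest = VNDetectFaceRectanglesRequest()
    let landmarksRequest = VNDetectFaceLandmarksRequest()

    // One handler can perform several requests on the same image.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([rectanglesRequest, landmarksRequest])

    // Bounding boxes: one observation per detected face.
    let faces = rectanglesRequest.results ?? []
    print("Found \(faces.count) face(s)")
    for face in faces {
        print("Bounding box:", face.boundingBox)
    }

    // Landmarks: each observation also carries landmark regions such as the nose.
    let imageSize = CGSize(width: image.width, height: image.height)
    for face in landmarksRequest.results ?? [] {
        if let nose = face.landmarks?.nose {
            print("Nose points:", nose.pointsInImage(imageSize: imageSize))
        }
    }
}
#endif
```

The Apps do more than this (camera, UI, drawing), but the request-then-read-results pattern is the part to look for in Faces.swift.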

Just for your curiosity,
- Line 106 in Faces.swift is where we retrieve the detected landmarks for anchoring the emoji.

The aim of these examples is for you to see face detection in action in Apps, well done everyone! 🎉

Take a moment and think about what you would do with Apple's face detection model 🎉

Today we have looked at:
- One-hot encoding for class labels 🔥
- Face detection 😎
-- Bounding boxes and landmarks as output
- Two example Apps using face detection

a COOL AI project borrowed from Murad's slides

In the artwork Pareidolia* facial detection is applied to grains of sand. A fully automated robot search engine examines the grains of sand in situ. When the machine finds a face in one of the grains, the portrait is recorded.

We'll see you next Thursday same time and same place!