Face Detection Tutorial Using the Vision Framework for iOS

Precise gene editing technology has been around since about 2012. So why don’t we all have super powers yet?!?

And what’s the greatest super power? No. Not flying. That’s far too dangerous.

The correct answer is laser heat vision!

Imagine what you could do with laser heat vision! You could save money on a microwave, easily light any candle in sight and don’t forget the ability to burn your initials into your woodworking projects. How cool would that be?

Well, apparently real life superpowers aren’t here yet, so you’ll have to deal with the next best thing. You’ll have to use your iPhone to give you pretend laser heat vision.

Fortunately, Apple has a framework that can help you out with this plan B.

In this tutorial, you’ll learn how to use the Vision framework to:

  • Create requests for face detection and face landmark detection.
  • Process these requests.
  • Overlay the results on the camera feed to get real-time, visual feedback.

Get ready to super power your brain and your eyes!

Getting Started

Click the Download Materials button at the top or bottom of this tutorial. Open the starter project and explore to your heart’s content.

Note: The starter project uses the camera, which means you’ll get a crash if you try to run it in the Simulator. Make sure to run this tutorial on an actual device so you can see your lovely face!

Currently, the Face Lasers app doesn’t do a whole lot. Well, it does show you your beautiful mug!

There’s also a label at the bottom that reads Face. You may have noticed that if you tap the screen, this label changes to read Lasers.

That’s exciting! Except that there don’t seem to be any lasers. That’s less exciting. Don’t worry — by the end of this tutorial, you’ll be shooting lasers out of your eyes like Super(wo)man!

You’ll also notice some useful Core Graphics extensions. You’ll make use of these throughout the tutorial to simplify your code.
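The exact contents of those extensions aren’t reproduced here, but judging by how they’re used later in this tutorial (cgPoint, cgSize and absolutePoint(in:)), they look roughly like the following sketch; the starter project’s versions may differ slightly:

import CoreGraphics

extension CGSize {
  // Reinterpret a size as a point, handy when converting normalized sizes.
  var cgPoint: CGPoint {
    return CGPoint(x: width, y: height)
  }
}

extension CGPoint {
  // Reinterpret a point as a size.
  var cgSize: CGSize {
    return CGSize(width: x, height: y)
  }

  // Scale a normalized (0.0 to 1.0) point into the coordinate space of a rect.
  func absolutePoint(in rect: CGRect) -> CGPoint {
    return CGPoint(
      x: rect.origin.x + x * rect.size.width,
      y: rect.origin.y + y * rect.size.height)
  }
}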

Vision Framework Usage Patterns

All Vision framework APIs use three constructs:

  1. Request: The request defines the type of thing you want to detect and a completion handler that will process the results. This is a subclass of VNRequest.
  2. Request handler: The request handler performs the request on the provided pixel buffer (think: image). This will be either a VNImageRequestHandler for single, one-off detections or a VNSequenceRequestHandler to process a series of images.
  3. Results: The results will be attached to the original request and passed to the completion handler defined when creating the request. They are subclasses of VNObservation.

Simple, right?
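To make the pattern concrete, here’s a minimal sketch of all three constructs working together on a single still image. The function and its input are placeholders for illustration; the tutorial’s own code works with the camera feed instead:

import UIKit
import Vision

func detectFaces(in image: UIImage) {
  guard let cgImage = image.cgImage else { return }

  // 1. Request: what to detect, plus a completion handler for the results.
  let request = VNDetectFaceRectanglesRequest { request, error in
    // 3. Results: VNObservation subclasses attached to the original request.
    let faces = request.results as? [VNFaceObservation] ?? []
    print("Detected \(faces.count) face(s)")
  }

  // 2. Request handler: performs the request on the provided image.
  let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
  try? handler.perform([request])
}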

Writing Your First Face Detector

Open FaceDetectionViewController.swift and add the following property at the top of the class:

var sequenceHandler = VNSequenceRequestHandler()

This defines the request handler you’ll be feeding images to from the camera feed. You’re using a VNSequenceRequestHandler because you’ll perform face detection requests on a series of images, instead of on a single static one.

Now scroll to the bottom of the file where you’ll find an empty captureOutput(_:didOutput:from:) delegate method. Fill it in with the following code:

// 1
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
  return
}

// 2
let detectFaceRequest = VNDetectFaceRectanglesRequest(completionHandler: detectedFace)

// 3
do {
  try sequenceHandler.perform(
    [detectFaceRequest], 
    on: imageBuffer, 
    orientation: .leftMirrored)
} catch {
  print(error.localizedDescription)
}

With this code, you:

  1. Get the image buffer from the passed-in sample buffer.
  2. Create a face detection request with detectedFace(request:error:) as its completion handler.
  3. Use your sequence request handler to perform the face detection request on the image buffer. The orientation parameter tells the request handler what the orientation of the input image is.

Now you may be wondering: What about detectedFace(request:error:)? In fact, Xcode is probably wondering the same thing.

You’ll define that now.

Add the following code for detectedFace(request:error:) to the FaceDetectionViewController class, wherever you like:

func detectedFace(request: VNRequest, error: Error?) {
  // 1
  guard 
    let results = request.results as? [VNFaceObservation],
    let result = results.first 
    else {
      // 2
      faceView.clear()
      return
  }
    
  // 3
  let box = result.boundingBox
  faceView.boundingBox = convert(rect: box)
    
  // 4
  DispatchQueue.main.async {
    self.faceView.setNeedsDisplay()
  }
}

In this method you:

  1. Extract the first result from the array of face observation results.
  2. Clear the FaceView if something goes wrong or no face is detected.
  3. Set the bounding box to draw in the FaceView after converting it from the coordinates in the VNFaceObservation .
  4. Call setNeedsDisplay() to make sure the FaceView is redrawn.

The result’s bounding box coordinates are normalized, ranging from 0.0 to 1.0 relative to the input image, with the origin at the bottom-left corner. That’s why you need to convert them to something useful.

Unfortunately, this function doesn’t exist. Fortunately, you’re a talented programmer!

Right above where you placed the method definition for detectedFace(request:error:) , add the following method definition:

func convert(rect: CGRect) -> CGRect {
  // 1
  let origin = previewLayer.layerPointConverted(fromCaptureDevicePoint: rect.origin)
  
  // 2
  let size = previewLayer.layerPointConverted(fromCaptureDevicePoint: rect.size.cgPoint)
  
  // 3
  return CGRect(origin: origin, size: size.cgSize)
}

Here you:

  1. Use layerPointConverted(fromCaptureDevicePoint:) on the AVCaptureVideoPreviewLayer to convert the normalized origin into the preview layer’s coordinate system.
  2. Convert the normalized size the same way, using the Core Graphics extensions to treat it as a point.
  3. Build a CGRect from the converted origin and size.
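As an aside, Vision also ships a helper, VNImageRectForNormalizedRect(_:_:_:), that converts a normalized rect straight into image pixel coordinates. That’s not what you need here, since you want preview-layer coordinates, but a quick sketch shows the idea; the width and height below are placeholders for your pixel buffer’s actual dimensions:

import Vision

// Hedged aside: convert a normalized bounding box into image pixel
// coordinates. The 1080 x 1920 dimensions are placeholders.
func imageSpaceRect(for observation: VNFaceObservation) -> CGRect {
  return VNImageRectForNormalizedRect(observation.boundingBox, 1080, 1920)
}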

You’re probably tempted to build and run this. And if you did, you would be disappointed to see nothing on the screen except your own face, sadly free of lasers.

Currently FaceView has an empty draw(_:) method. You need to fill that in if you want to see something on screen!

Switch to FaceView.swift and add the following code to draw(_:) :

// 1
guard let context = UIGraphicsGetCurrentContext() else {
  return
}

// 2
context.saveGState()

// 3
defer {
  context.restoreGState()
}
    
// 4
context.addRect(boundingBox)

// 5
UIColor.red.setStroke()

// 6
context.strokePath()

With this code, you:

  1. Get the current graphics context.
  2. Push the current graphics state onto the stack.
  3. Restore the graphics state when this method exits.
  4. Add a path describing the bounding box to the context.
  5. Set the color to red.
  6. Draw the actual path described in step four.

Phew! You’ve been coding for quite some time, and it’s finally time to see the results.

Go ahead and build and run your app.

facelasers-face-bounding-box-231x500.png

What a good looking detected face!

What Else Can You Detect?

Aside from face detection, the Vision framework has APIs you can use to detect all sorts of things.

  • Rectangles: With VNDetectRectanglesRequest , you can detect rectangles in the camera input, even if they are distorted due to perspective.
  • Text: You can detect the bounding boxes around individual text characters by using VNDetectTextRectanglesRequest . Note, however, this doesn’t recognize what the characters are, it only detects them.
  • Horizon: Using VNDetectHorizonRequest , you can determine the angle of the horizon in images.
  • Barcodes: You can detect and recognize many kinds of barcodes with VNDetectBarcodesRequest; see the sketch after this list. Apple’s documentation lists the supported symbologies.
  • Objects: By combining the Vision framework with CoreML, you can detect and classify specific objects using VNCoreMLRequest .
  • Image alignment: With VNTranslationalImageRegistrationRequest and VNHomographicImageRegistrationRequest you can align two images that have overlapping content.
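Each of these requests follows the same request/handler/results pattern you saw earlier. As a quick, hedged sketch, here’s what barcode detection on a single CGImage might look like; the input image is a placeholder and isn’t part of the starter project:

import Vision

func detectBarcodes(in cgImage: CGImage) {
  // Request: detect barcodes and hand the results to the completion handler.
  let request = VNDetectBarcodesRequest { request, error in
    guard let barcodes = request.results as? [VNBarcodeObservation] else {
      return
    }
    for barcode in barcodes {
      // payloadStringValue holds the decoded contents, when available.
      print(barcode.symbology, barcode.payloadStringValue ?? "<no payload>")
    }
  }

  // Request handler: a one-off handler, since this is a single image.
  let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
  try? handler.perform([request])
}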

Amazing, right?

Well, there’s one more very important thing you can detect with the Vision framework. You can use it to detect face landmarks! Since this tutorial is all about face detection, you’ll be doing that in the next section.

Detecting Face Landmarks

The first thing you need to do is update your Vision request to detect face landmarks. To do this, open FaceDetectionViewController.swift and in captureOutput(_:didOutput:from:) replace the line where you define detectFaceRequest with this:

let detectFaceRequest = VNDetectFaceLandmarksRequest(completionHandler: detectedFace)

If you were to build and run now, you wouldn’t see any difference from before. You’d still see a red bounding box around your face.

Why?

Because VNDetectFaceLandmarksRequest will first detect all faces in the image before analyzing them for facial features.
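As an optional aside: if you already have face rectangles from an earlier request, you can hand them to the landmarks request through its inputFaceObservations property so Vision skips that redundant detection pass. A minimal sketch, assuming the reused observations come from an earlier VNDetectFaceRectanglesRequest:

import Vision

// A hedged sketch: reuse faces found by an earlier request so the landmarks
// request can skip its own face detection pass.
func makeLandmarksRequest(
  reusing previousFaces: [VNFaceObservation],
  completionHandler: @escaping VNRequestCompletionHandler
) -> VNDetectFaceLandmarksRequest {
  let request = VNDetectFaceLandmarksRequest(completionHandler: completionHandler)
  request.inputFaceObservations = previousFaces
  return request
}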

Next, you’re going to need to define some helper methods. Right below convert(rect:) , add the following code:

// 1
func landmark(point: CGPoint, to rect: CGRect) -> CGPoint {
  // 2
  let absolute = point.absolutePoint(in: rect)
  
  // 3
  let converted = previewLayer.layerPointConverted(fromCaptureDevicePoint: absolute)
  
  // 4
  return converted
}

With this code, you:

  1. Define a method which converts a landmark point to something that can be drawn on the screen.
  2. Calculate the absolute position of the normalized point by using a Core Graphics extension defined in CoreGraphicsExtensions.swift .
  3. Convert the point to the preview layer’s coordinate system.
  4. Return the converted point.

Below that method, add the following:

func landmark(points: [CGPoint]?, to rect: CGRect) -> [CGPoint]? {
  return points?.compactMap { landmark(point: $0, to: rect) }
}

This method takes an array of these landmark points and converts them all.

Next, you’re going to refactor some of your code to make it easier to work with and add functionality. Add the following method right below your two new helper methods:

func updateFaceView(for result: VNFaceObservation) {
  defer {
    DispatchQueue.main.async {
      self.faceView.setNeedsDisplay()
    }
  }

  let box = result.boundingBox    
  faceView.boundingBox = convert(rect: box)

  guard let landmarks = result.landmarks else {
    return
  }
    
  if let leftEye = landmark(
    points: landmarks.leftEye?.normalizedPoints, 
    to: result.boundingBox) {
    faceView.leftEye = leftEye
  }
}

The only new code here is the first if statement in the function. It uses your new helper methods to convert the normalized points that make up the leftEye landmark into coordinates that work with the preview layer. If everything goes well, those converted points are assigned to the leftEye property of the FaceView.

The rest looks familiar because you already wrote it in detectedFace(request:error:) . So, you should probably clean that up now.

In detectedFace(request:error:) , replace the following code:

let box = result.boundingBox
faceView.boundingBox = convert(rect: box)
    
DispatchQueue.main.async {
  self.faceView.setNeedsDisplay()
}

with:

updateFaceView(for: result)

This calls your newly defined method to handle updating the FaceView .

There’s one last step before you can try out your code. Open FaceView.swift and add the following code to the end of draw(_:) , right after the existing statement context.strokePath() :

// 1
UIColor.white.setStroke()
    
if !leftEye.isEmpty {
  // 2
  context.addLines(between: leftEye)
  
  // 3
  context.closePath()
  
  // 4
  context.strokePath()
}

Here you:

  1. Set the stroke color to white, so the eye’s outline stands out from the red bounding box.
  2. Add lines between the points that make up the leftEye, as long as there are points to draw.
  3. Close the path to form the eye’s outline.
  4. Draw the path with the current stroke color.

Time to build and run!

Note: You’ve added code to annotate the left eye, but what does that mean? With Vision, you should expect to see the outline drawn not on your left eye, but on the eye that appears on the left side of the image.

A fun game with computer vision APIs is to look for words like left and right and guess what they mean. It’s different every time!

facelasers-left-eye-231x500.png

Awesome! If you open your eye wide or squint, you should see the drawn eye change shape slightly, though not as dramatically as your real eye does.

This is a fantastic milestone. You may want to take a quick break now, as you’ll be adding all the other face landmarks in one fell swoop.

facelasers-coffee-break-500x500.png

Back already? You’re industrious! Time to add those other landmarks.

While you still have FaceView.swift open, add the following to the end of draw(_:) , after the code for the left eye:

if !rightEye.isEmpty {
  context.addLines(between: rightEye)
  context.closePath()
  context.strokePath()
}
    
if !leftEyebrow.isEmpty {
  context.addLines(between: leftEyebrow)
  context.strokePath()
}
    
if !rightEyebrow.isEmpty {
  context.addLines(between: rightEyebrow)
  context.strokePath()
}
    
if !nose.isEmpty {
  context.addLines(between: nose)
  context.strokePath()
}
    
if !outerLips.isEmpty {
  context.addLines(between: outerLips)
  context.closePath()
  context.strokePath()
}
    
if !innerLips.isEmpty {
  context.addLines(between: innerLips)
  context.closePath()
  context.strokePath()
}
    
if !faceContour.isEmpty {
  context.addLines(between: faceContour)
  context.strokePath()
}

Here you’re adding drawing code for the remaining face landmarks. Note that leftEyebrow, rightEyebrow, nose and faceContour are left as open paths; closing them would make them look funny.

Now, open FaceDetectionViewController.swift again. At the end of updateFaceView(for:) , add the following:

if let rightEye = landmark(
  points: landmarks.rightEye?.normalizedPoints, 
  to: result.boundingBox) {
  faceView.rightEye = rightEye
}
    
if let leftEyebrow = landmark(
  points: landmarks.leftEyebrow?.normalizedPoints, 
  to: result.boundingBox) {
  faceView.leftEyebrow = leftEyebrow
}
    
if let rightEyebrow = landmark(
  points: landmarks.rightEyebrow?.normalizedPoints, 
  to: result.boundingBox) {
  faceView.rightEyebrow = rightEyebrow
}
    
if let nose = landmark(
  points: landmarks.nose?.normalizedPoints, 
  to: result.boundingBox) {
  faceView.nose = nose
}
    
if let outerLips = landmark(
  points: landmarks.outerLips?.normalizedPoints, 
  to: result.boundingBox) {
  faceView.outerLips = outerLips
}
    
if let innerLips = landmark(
  points: landmarks.innerLips?.normalizedPoints, 
  to: result.boundingBox) {
  faceView.innerLips = innerLips
}
    
if let faceContour = landmark(
  points: landmarks.faceContour?.normalizedPoints, 
  to: result.boundingBox) {
  faceView.faceContour = faceContour
}

With this code, you add the remaining face landmarks to the FaceView and that’s it! You’re ready to build and run!

facelasers-full-face-231x500.png

Nice work!

Using Detected Faces

Face detection is something you’ve probably been seeing more of recently. It can be especially useful for image processing, when you want to really make the people in the images shine.

But you’re going to do something way cooler than that. You’re going to shoot lasers out of your eyes!

Time to get started.

While still in FaceDetectionViewController.swift , right below updateFaceView(for:) , add the following method:

// 1
func updateLaserView(for result: VNFaceObservation) {
  // 2
  laserView.clear()
    
  // 3
  let yaw = result.yaw ?? 0.0
    
  // 4
  if yaw == 0.0 {
    return
  }
    
  // 5
  var origins: [CGPoint] = []
    
  // 6
  if let point = result.landmarks?.leftPupil?.normalizedPoints.first {
    let origin = landmark(point: point, to: result.boundingBox)
    origins.append(origin)
  }
    
  // 7
  if let point = result.landmarks?.rightPupil?.normalizedPoints.first {
    let origin = landmark(point: point, to: result.boundingBox)
    origins.append(origin)
  }
}

Whew! That was quite a bit of code. Here’s what you did with it:

  1. Define a new method that will update the LaserView . It’s a bit like updateFaceView(for:) .
  2. Clear the LaserView .
  3. Get the yaw from the result. The yaw is a number that tells you how much your face is turned. If it’s negative, you’re looking to the left. If it’s positive, you’re looking to the right.
  4. Return if the yaw is 0.0. If you’re looking straight ahead, you don’t get any face lasers.
  5. Create an array to store the origin points of the lasers.
  6. Add a laser origin based on the left pupil.
  7. Add a laser origin based on the right pupil.

Note: Although the Vision framework includes a left and right pupil among detected face landmarks, it turns out that these are just the geometric centers of the eyes. They’re not actually detected pupils. If you were to keep your head still, but look to the left or right, the pupils returned in the VNFaceObservation would not move.

OK, you’re not quite done with that method, yet. You’ve determined the origin of the lasers. However, you still need to add logic to figure out where the lasers will be focused.

At the end of your newly created updateLaserView(for:) , add the following code:

// 1
let avgY = origins.map { $0.y }.reduce(0.0, +) / CGFloat(origins.count)

// 2
let focusY = (avgY < midY) ? 0.75 * maxY : 0.25 * maxY

// 3
let focusX = (yaw.doubleValue < 0.0) ? -100.0 : maxX + 100.0
    
// 4
let focus = CGPoint(x: focusX, y: focusY)
    
// 5
for origin in origins {
  let laser = Laser(origin: origin, focus: focus)
  laserView.add(laser: laser)
}

// 6
DispatchQueue.main.async {
  self.laserView.setNeedsDisplay()
}

Here you:

  1. Calculate the average y coordinate of the laser origins.
  2. Determine what the y coordinate of the laser focus point will be based on the average y of the origins. If your pupils are above the middle of the screen, you'll shoot down. Otherwise, you'll shoot up. You calculated midY in viewDidLoad() .
  3. Calculate the x coordinate of the laser focus based on the yaw . If you're looking left, you should shoot lasers to the left.
  4. Create a CGPoint from your two focus coordinates.
  5. Generate some Laser s and add them to the LaserView .
  6. Tell the iPhone that the LaserView should be redrawn.

Now you need to call this method from somewhere. detectedFace(request:error:) is the perfect place! In that method, replace the call to updateFaceView(for:) with the following:

if faceViewHidden {
  updateLaserView(for: result)
} else {
  updateFaceView(for: result)
}

This logic chooses which update method to call based on whether or not the FaceView is hidden.
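The starter project already wires up the tap gesture and the faceViewHidden flag for you. Conceptually, the toggle looks something like the sketch below, though the starter’s actual code, including how it updates the Face/Lasers label, may differ:

// A sketch of how the starter presumably flips between the two overlays
// when you tap the screen; the real implementation may differ.
@IBAction func handleTap(_ sender: UITapGestureRecognizer) {
  faceViewHidden.toggle()
  faceView.isHidden = faceViewHidden
  laserView.isHidden = !faceViewHidden
}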

Currently, if you were to build and run, you would only shoot invisible lasers out of your eyes. While that sounds pretty cool, wouldn't it be better to see the lasers?

To fix this, you need to tell the iPhone how to draw the lasers.

Open LaserView.swift and find the draw(_:) method. It should be completely empty. Now add the following code to it:

// 1
guard let context = UIGraphicsGetCurrentContext() else {
  return
}
    
// 2
context.saveGState()

// 3
for laser in lasers {
  // 4
  context.addLines(between: [laser.origin, laser.focus])
      
  context.setStrokeColor(red: 1.0, green: 1.0, blue: 1.0, alpha: 0.5)
  context.setLineWidth(4.5)
  context.strokePath()
      
  // 5
  context.addLines(between: [laser.origin, laser.focus])
      
  context.setStrokeColor(red: 1.0, green: 0.0, blue: 0.0, alpha: 0.8)
  context.setLineWidth(3.0)
  context.strokePath()
}

// 6
context.restoreGState()

With this drawing code, you:

  1. Get the current graphics context.
  2. Push the current graphics state onto the stack.
  3. Loop through the lasers in the array.
  4. Draw a thicker white line in the direction of the laser.
  5. Then draw a slightly thinner red line over the white line to give it a cool laser effect.
  6. Pop the current graphics context off the stack to restore it to its original state.

That's it. Build and run time!

Tap anywhere on the screen to switch to Lasers mode.

facelasers-lasers-231x500.png

Great job!

Where to Go From Here?

You can do all sorts of things with your newfound knowledge. Imagine combining face detection with depth data from the camera to create cool effects focused on the people in your photos. To learn more about using depth data, check out this tutorial on working with image depth maps and this tutorial on working with video depth maps.

Or how about trying out a Vision and CoreML tag team? That sounds really cool, right? If that piques your interest, we have a tutorial for that!

You could learn how to do face tracking using ARKit with this awesome tutorial.

There are, of course, plenty of other Vision APIs you can play with. Now that you have a foundational knowledge of how to use them, you can explore them all!

We hope you enjoyed this tutorial and, if you have any questions or comments, please join the forum discussion below!

