Source: https://www.raywenderlich.com/32611432-vision-framework-tutorial-for-ios-contour-detection

Vision Framework Tutorial for iOS: Contour Detection

Learn how to detect and modify image contours in your SwiftUI iOS apps in a fun and artistic way using the Vision framework.

By Yono Mittlefehldt May 16 2022 · Article (25 mins)


Art is a very subjective thing. However, coding is not. Sure, developers can be opinionated at times, but how a computer interprets your code is very much not a matter of opinion.

So how can you, a developer, use code to create art? Maybe it’s not how you code but rather what you choose to do with it. Getting creative with the tools available to you can significantly affect the output.

Think about how Apple has been pushing the limit on computational photography. Most digital photography is about post-processing the pixels that come from a sensor. Different sets of algorithms change how the final output looks and feels. That’s art!

You can even use computer vision algorithms to create exciting filters and effects for images and photos. For instance, if you detected all of the contours in an image, you might have some cool material to make an artsy-looking drawing of the image. And that’s what this tutorial is all about!

In this tutorial, you’ll learn how to use the Vision framework to:

  • Create requests to perform contour detection.
  • Tweak settings to get different contours.
  • Simplify the contours to create an artistic effect.

Sounds like fun, right? Exactly… art should be fun!

Getting Started

Click the Download Materials button at the top or bottom of this tutorial. The starter project includes some extensions, model files, and the UI.

If you build and run now, you’ll see instructions for tapping the screen and a functional settings icon.

A white screen with black lettering showing instructions to tap the screen, plus a settings icon

You might notice that tapping the screen doesn’t do anything right now, but before you can get to the Vision part of this tutorial, you need to display an image on the screen.

Displaying an Image

While going through the starter project, you might have noticed an ImageView connected to the image property of the ContentViewModel. If you open up ContentViewModel.swift, you’ll see that image is a published property, but nothing is assigned to it.

The first thing you’ll need to do is change that!

Start by adding the following code directly after the three defined published properties in ContentViewModel.swift:

init() {
  let uiImage = UIImage(named: "sample")
  let cgImage = uiImage?.cgImage
  self.image = cgImage  
}

This code loads the image called sample.png from the asset catalog and obtains a CGImage for it, before assigning it to the image published property.

With that small change, go ahead and build and rerun the app and you’ll see the image below on your screen:

Drawn image of a volcano, dinosaurs, the moon, a rocket, and Gus the owl astronaut

Now, when you tap on the screen, it should toggle between the above image and the blank screen you initially saw.

The blank screen will eventually contain the contours you detect using the Vision framework.

Vision API Pipeline

Before you start writing some code to detect contours, it’ll be helpful to understand the Vision API pipeline. Once you know how it works, you can easily include any of the Vision algorithms in your future projects; that’s pretty slick.

The Vision API pipeline consists of three parts:

  1. The first is the request, which is a subclass of VNRequest – the base class for all analysis requests. This request is then passed to a handler.
  2. The handler can be one of two types, either a VNImageRequestHandler or a VNSequenceRequestHandler.
  3. Finally, the result, a subclass of VNObservation, is returned as a property on the original request object.

Vision API Pipeline showing the request, the handler, and the result

Often, it’s quite easy to tell which result type goes with which request type, as they’re named similarly. For instance, if your request is a VNDetectFaceRectanglesRequest, then the result returned will be a VNFaceObservation.

For this project, the request will be a VNDetectContoursRequest, which will return the result as a VNContoursObservation.

Whenever you’re working with individual images, as opposed to frames in a sequence of images, you’ll use a VNImageRequestHandler. A VNSequenceRequestHandler is used when working on sequences of images where you want to apply requests to a sequence of related images, for example frames from a video stream. In this project, you’ll use the former for single image requests.

Now that you have the background theory, it’s time to put it into practice!

Contour Detection

To keep the project nicely organized, right-click the Contour Art group in the project navigator and select New Group. Name the new group Vision.

Right-click the new Vision group and select New File…. Choose Swift File and name it ContourDetector.

Replace the contents of the file with the following code:

import Vision

class ContourDetector {
  static let shared = ContourDetector()
  
  private init() {}
}

All this code does is set up a new ContourDetector class as a Singleton. The singleton pattern isn’t strictly necessary, but it ensures that you only have one ContourDetector instance running around the app.

Performing Vision Requests

Now it’s time to make the detector class do something.

Add the following property to the ContourDetector class:

private lazy var request: VNDetectContoursRequest = {
  let req = VNDetectContoursRequest()
  return req
}()

This will lazily create a VNDetectContoursRequest the first time you need it. The Singleton structure also ensures there’s only one Vision request, which can be reused throughout the app’s lifecycle.

Now add the following method:

private func perform(request: VNRequest,
                     on image: CGImage) throws -> VNRequest {
  // 1
  let requestHandler = VNImageRequestHandler(cgImage: image, options: [:])
  
  // 2
  try requestHandler.perform([request])
  
  // 3
  return request
}

This method is simple but powerful. Here you:

  1. Create the request handler and pass it a supplied CGImage.
  2. Perform the request using the handler.
  3. Return the request, which now has the results attached.

In order to use the results from the request, you’ll need to do a bit of processing. Below the previous method, add the following method to process the returned request:

private func postProcess(request: VNRequest) -> [Contour] {
  // 1
  guard let results = request.results as? [VNContoursObservation] else {
    return []
  }
    
  // 2
  let vnContours = results.flatMap { contour in
    (0..<contour.contourCount).compactMap { try? contour.contour(at: $0) }
  }
      
  // 3
  return vnContours.map { Contour(vnContour: $0) }
}

In this method, you:

  1. Check that results is an array of VNContoursObservation objects.
  2. Convert each result into an array of VNContours.
    • flatMap the results into a single flattened array.
    • Iterate over each observation's contours, using compactMap to ensure only non-nil values are kept.
    • Use contour(at:) to retrieve a contour object at a specified index.
  3. Map the array of VNContours into an array of your custom Contour models.

Note: The reason you convert from VNContour to Contour is to simplify some SwiftUI code. Contour conforms to Identifiable, so it's easy to loop through an array of them. Check out the ContoursView.swift to see this in action.
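If you're curious, a wrapper like Contour might look something like the hypothetical sketch below. The real model ships with the starter project and may differ; treat this purely as an illustration of why Identifiable makes the SwiftUI side easy:

import Vision

// A hypothetical sketch of an Identifiable wrapper around VNContour.
// The actual Contour model lives in the starter project.
struct Contour: Identifiable {
  let id = UUID()
  let vnContour: VNContour

  // The contour's path in a normalized (0...1) coordinate space.
  var normalizedPath: CGPath { vnContour.normalizedPath }
}

// Because Contour is Identifiable, a SwiftUI view can loop over an array of them
// directly, e.g. ForEach(contours) { contour in /* draw contour.normalizedPath */ }.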

Processing Images in the Detector

Now you just need to tie these two private methods together somewhere that is callable from outside the class. Still in ContourDetector.swift, add the following method:

func process(image: CGImage?) throws -> [Contour] {
  guard let image = image else {
    return []
  }
    
  let contourRequest = try perform(request: request, on: image)
  return postProcess(request: contourRequest)
}

Here you're checking that there's an image, then using perform(request:on:) to run the request, and finally returning the processed results via postProcess(request:). This is the method your view model will call to detect contours in an image, which is exactly what you'll do next.

Open ContentViewModel.swift and add the following method to the end of the class:

func asyncUpdateContours() async -> [Contour] {
  let detector = ContourDetector.shared
  return (try? detector.process(image: self.image)) ?? []
}

In this code, you're creating an asynchronous method to detect contours. Why asynchronous? Although detecting contours is generally relatively quick, you still don't want to tie up the UI while waiting for the API call results. The asynchronous method returns an empty array if the detector doesn't find any contours. Also, spoiler alert, you'll add a lot more logic here later, which will tax your device's processor. :]

However, you still need to call this method from somewhere. Find the method stub for updateContours, and fill it in with the following code:

func updateContours() {
  // 1
  guard !calculating else { return }
  calculating = true
  
  // 2
  Task {
    // 3
    let contours = await asyncUpdateContours()
    
    // 4
    DispatchQueue.main.async {
      self.contours = contours
      self.calculating = false
    }
  }
}

With this code, you:

  1. Do nothing if you're already calculating contours. Otherwise, set a flag to indicate that a calculation is in progress. The UI will then be able to inform the user, so they remain patient.
  2. Create an asynchronous context, from which to run the contour detector. This is necessary for asynchronous work.
  3. Kick off the contour detection method and await its results.
  4. Set the results back on the main thread and clear the calculating flag. Since both contours and calculating are published properties, they should only be assigned on the main thread.

This update method needs to be called from somewhere and the bottom of init is as good a place as any! Find init and add the following line to the bottom:

updateContours()

It's now time to build and run your app. After the app loads and you see the image, tap the screen to show its detected contours using the default settings.

Contours detected from the sample image

Great job!

VNContoursObservation and VNContour

At the time of writing, a VNDetectContoursRequest never seems to return more than one VNContoursObservation in the results array. Instead, all the contours you see, 43 in total in the previous screenshot, are referenced by that single VNContoursObservation.

Note: The code you wrote handles multiple VNContoursObservation results, just in case Apple ever decides to change how this works.

Each individual contour is described by a VNContour and is organized hierarchically. A VNContour can have child contours. To access them, you have two options:

  1. Index the childContours property, which is an array of VNContours.
  2. Use the childContourCount integer property in conjunction with the childContour(at: Int) method to loop through and access each child contour.

As any VNContour can have child VNContours, you'll have to access them recursively if you need to preserve the hierarchical information.

If you don't care about the hierarchy, VNContoursObservation gives you an easy way to access all the contours in a simple manner. It has a contourCount integer property and a contour(at: Int) method that let you access every contour as if they were a flat data structure.

However, if hierarchy is important to you, you need to access the topLevelContours property, which is an array of VNContours. From there, you can access each contour's child contours.

If you were to write some simple code to count top-level and child contours, you'd find that the sample image, with default settings, has four top-level contours and 39 child contours, for a total of 43.
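If you want to try this yourself, here's a rough sketch of how such a count might look. The helper name and structure are just an illustration; it assumes you already have a VNContoursObservation from the detector:

import Vision

// A hypothetical helper that counts top-level and child contours in an observation.
func countContours(in observation: VNContoursObservation) -> (topLevel: Int, children: Int) {
  // Recursively count every descendant of a contour.
  func descendantCount(of contour: VNContour) -> Int {
    contour.childContours.reduce(0) { $0 + 1 + descendantCount(of: $1) }
  }

  let topLevel = observation.topLevelContours.count
  let children = observation.topLevelContours.reduce(0) { $0 + descendantCount(of: $1) }
  return (topLevel, children)
}

With the sample image and default settings, you'd expect it to report four top-level contours and 39 children.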

VNDetectContoursRequest Settings

So far, you've created a VNDetectContoursRequest without experimenting with the various settings available to you. Currently, there are four properties you can change to achieve different results (a short configuration sketch follows this list):

  1. contrastAdjustment: The algorithm has a built-in way to adjust the contrast of the image prior to performing contour detection. Adjusting the contrast tries to darken the dark parts of the image and lighten the light parts to exaggerate their differences. This float property ranges from 0.0 to 3.0, with a default value of 2.0. The higher the value, the more contrast will be applied to the image, making it easier to detect some contours.
  2. contrastPivot: How does the algorithm know what part of the image should be considered dark vs. light? That's where the contrast pivot comes in. It's an optional NSNumber property ranging from 0.0 to 1.0, with a default of 0.5. Any pixels below this value will be darkened, and any pixels above will be lightened. You can also set this property to nil to have the Vision framework automatically detect what this value "should" be.
  3. detectsDarkOnLight: This boolean property is a hint to the contour detection algorithm. The default is set to true, which means it should look for dark objects on a light background.
  4. maximumImageDimension: Since you can pass any size image to the request handler, this integer property lets you set the maximum image dimension to use. If your image has a dimension larger than this value, the API scales the image so that the larger of the two dimensions equals maximumImageDimension. The default value is 512. Why would you want to change this? Contour detection requires quite a bit of processing power: the larger the image, the more it needs. However, a larger image can also yield more accurate contours. This property lets you fine-tune that trade-off for your needs.
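Here's the configuration sketch mentioned above. It simply sets all four properties on a request with example values, so you can see where each one lives:

import Vision

// A minimal sketch of configuring a VNDetectContoursRequest (values are just examples).
let request = VNDetectContoursRequest()
request.contrastAdjustment = 2.0              // 0.0 to 3.0, default 2.0
request.contrastPivot = NSNumber(value: 0.5)  // 0.0 to 1.0, or nil to auto-detect
request.detectsDarkOnLight = true             // hint: dark objects on a light background
request.maximumImageDimension = 512           // cap on the processed image size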

Changing the Contrast

Now that you understand the settings available to you, you'll write some code to change the values for the two contrast settings. For this tutorial, you'll leave the detectsDarkOnLight and maximumImageDimension properties alone and just use the default values for them.

Open ContourDetector.swift and add the following methods to the bottom of ContourDetector:

func set(contrastPivot: CGFloat?) {
  request.contrastPivot = contrastPivot.map {
    NSNumber(value: $0)
  }
}

func set(contrastAdjustment: CGFloat) {
  request.contrastAdjustment = Float(contrastAdjustment)
}

These methods change the contrastPivot and contrastAdjustment on the VNDetectContoursRequest, respectively, with a little extra logic to allow you to set the contrastPivot to nil.

You'll recall that request is a lazy var, meaning if it hasn't been instantiated by the time you've called one of these methods, it will be now.

Next, open ContentViewModel.swift and find asyncUpdateContours. Update the method so it looks like this:

func asyncUpdateContours() async -> [Contour] {
  let detector = ContourDetector.shared

  // New logic    
  detector.set(contrastPivot: 0.5)
  detector.set(contrastAdjustment: 2.0)
    
  return (try? detector.process(image: self.image)) ?? []
}

Those two new lines hard code values for the contrastPivot and the contrastAdjustment.

Build and run the app and experiment with different values for these settings (you'll need to change the values and then build and run again). Here are some screenshots of different values in action:

Sample image contours with the contrastPivot set to 0.2 and the contrastAdjustment set to 2.0

Sample image contours with the contrastPivot set to 0.8 and the contrastAdjustment set to 2.0

Sample image contours with the contrastPivot set to 0.5 and the contrastAdjustment set to 1.0

Sample image contours with the contrastPivot set to 0.5 and the contrastAdjustment set to 3.0

OK, now you're getting some interesting results. However, it's a bit annoying that there's no magical setting to get all the contours from the image and combine them into one result.

But… there's a solution for that.

When exploring the starter project, you might have tapped the settings icon in the bottom-right corner. If you did, you saw sliders for the minimum and maximum contrast pivot and adjustment.

You'll use these sliders to create ranges for these settings and loop through them. Then you'll combine all the contours from each setting pair to create a more complete set of contours for the image.

Note: The larger the range for each setting, the more Vision requests you run. This can be a slow process and is not recommended on older devices unless you're very patient. It runs well on newer iPhones, iPads, and M1-based Macs.

If you don't still have ContentViewModel.swift open, go ahead and open it. Delete the entire contents of asyncUpdateContours and replace it with the following code:

// 1
var contours: [Contour] = []

// 2
let pivotStride = stride(
  from: UserDefaults.standard.minPivot,
  to: UserDefaults.standard.maxPivot,
  by: 0.1)
let adjustStride = stride(
  from: UserDefaults.standard.minAdjust,
  to: UserDefaults.standard.maxAdjust,
  by: 0.2)

// 3
let detector = ContourDetector.shared

// 4
for pivot in pivotStride {
  for adjustment in adjustStride {
    
    // 5
    detector.set(contrastPivot: pivot)
    detector.set(contrastAdjustment: adjustment)
    
    // 6
    let newContours = (try? detector.process(image: self.image)) ?? []
    
    // 7
    contours.append(contentsOf: newContours)
  }
}

// 8
return contours

In this new version of asyncUpdateContours, you:

  1. Create an empty array of Contours to store all the contours in.
  2. Set up the strides for the contrastPivot and contrastAdjustment values to loop through.
  3. Get a reference to the ContourDetector singleton.
  4. Loop through both strides. Notice that this is a nested loop, so each value of contrastPivot is paired with each value of contrastAdjustment.
  5. Change the settings for the VNDetectContoursRequest using the accessor methods you created.
  6. Run the image through the Vision contour detector API.
  7. Append the results to the list of Contours and…
  8. Return this list of Contours.

Phew! That was a lot, but it'll be worth it. Go ahead and build and run the app and change the sliders in the settings menu. After you dismiss the settings menu, by swiping down or tapping outside it, the app will begin recalculating the contours.

The ranges used in the screenshot below are:

  • Contrast Pivot: 0.2 - 0.7
  • Contrast Adjustment: 0.5 - 3.0

Sample image contours combined from many different settings

Very cool!

Thinning the Contours

This is a pretty cool effect, but you can do even better!

You might notice that some contours now look thick while others are thin. The "thick" contours are actually multiple contours of the same area but slightly offset from one another due to how the contrast was adjusted.

If you could detect duplicate contours, you'd be able to remove them, which should make the lines look thinner.

An easy way to determine whether two contours are the same is to look at how much overlap they have. It's not exactly 100% accurate, but it's a relatively fast approximation. To determine overlap, you can calculate the intersection-over-union of their bounding boxes.

Intersection over union, or IoU, is the intersection area of two bounding boxes divided by the area of their union.

Intersection over union diagram

When the IoU is 1.0, the bounding boxes are exactly the same. If the IoU is 0.0, there's no overlap between the two bounding boxes.

You can use this as a threshold to filter out bounding boxes that seem "close enough" to be the same.
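The starter project's Contour model already provides intersectionOverUnion(with:), so you won't write this yourself, but here's a rough sketch of the underlying CGRect math for two bounding boxes:

import CoreGraphics

// A minimal sketch of intersection-over-union for two bounding boxes.
func intersectionOverUnion(_ a: CGRect, _ b: CGRect) -> CGFloat {
  let intersection = a.intersection(b)
  guard !intersection.isNull else { return 0 }
  let intersectionArea = intersection.width * intersection.height
  let unionArea = a.width * a.height + b.width * b.height - intersectionArea
  return unionArea > 0 ? intersectionArea / unionArea : 0
}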

Back in asyncUpdateContours in ContentViewModel.swift, add the following code just before the return statement:

// 1
if contours.count < 9000 {
  // 2
  let iouThreshold = UserDefaults.standard.iouThresh
  
  // 3
  var pos = 0
  while pos < contours.count {
    // 4
    let contour = contours[pos]
    // 5
    contours = contours[0...pos] + contours[(pos+1)...].filter {
      contour.intersectionOverUnion(with: $0) < iouThreshold
    }
    // 6
    pos += 1
  }
}

With this code, you:

  1. Only run this step if the number of contours is less than 9,000. This can be the slowest part of the entire function, so you limit when it runs.
  2. Grab the IoU threshold setting, which can be changed in the settings screen.
  3. Loop through each contour. You use a while loop here because you'll be dynamically changing the contours array. You don't want to end up indexing outside of the array's size accidentally!
  4. Index the contour array to get the current contour.
  5. Keep only the contours after the current contour whose IoU with it is less than the threshold. Remember, if the IoU is greater than or equal to the threshold, you've determined the contour is similar to the current one, so it should be removed.
  6. Increment the indexing position.
Note: There's probably a more efficient way to accomplish this, but this is the simplest way to explain the concept.

Go ahead and build and run the app.

Sample image contours, after filtering out similar contours

Notice how many of the thick contours are now significantly thinner!

Simplifying the Contours

You can use another trick to add an artistic flair to your contour art. You can simplify the contours.

VNContour has a method called polygonApproximation(epsilon:), which does just that: it returns an approximation of the contour with fewer points.

The choice of epsilon will determine how simplified the returned contour can be. A larger epsilon will result in a simpler contour with fewer points, whereas a smaller epsilon will return a contour closer to the original.
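To get a feel for epsilon, you could compare point counts on a single contour, along the lines of this rough sketch (the helper name is made up; vnContour is any VNContour you've already retrieved):

import Vision

// A hypothetical helper comparing point counts at two epsilon values.
func comparePointCounts(for vnContour: VNContour) throws {
  let fine = try vnContour.polygonApproximation(epsilon: 0.001)
  let coarse = try vnContour.polygonApproximation(epsilon: 0.01)
  print("original: \(vnContour.pointCount) points")
  print("epsilon 0.001: \(fine.pointCount) points, epsilon 0.01: \(coarse.pointCount) points")
}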

Open ContourDetector.swift. At the top of ContourDetector, add the following property:

private var epsilon: Float = 0.001

Next, at the bottom of ContourDetector, add the following method:

func set(epsilon: CGFloat) {
  self.epsilon = Float(epsilon)
}

Still within the same class, find postProcess(request:) and replace the return statement at the bottom of the method with the following code:

let simplifiedContours = vnContours.compactMap {
  try? $0.polygonApproximation(epsilon: self.epsilon)
}
        
return simplifiedContours.map { Contour(vnContour: $0) }

This code simplifies each of the detected contours based on the current value for epsilon before returning them.

Before trying out this new feature, you need to connect the epsilon setting to the ContourDetector. You'll read it from UserDefaults, which the settings screen writes to.

Open ContentViewModel.swift and find asyncUpdateContours once more. Then, just below the line where you define the detector constant, add the following line:

detector.set(epsilon: UserDefaults.standard.epsilon)

This will ensure that the detector gets the latest value for epsilon each time it needs to update the displayed contours.

For the final time, go ahead and build and run!

Sample image contours, but simplified

This example used a value of 0.01 for the Polygon Approximation Epsilon setting.

Now that is contour art with style. ;]

Where To Go From Here?

If you've gotten to this point, you've gone through a lot of code and concepts! Pat yourself on the back; you deserve it. You can download the finished project using the Download Materials link at the top or bottom of the tutorial.

By learning how the Vision API pipeline works, you can now use any of the other algorithms provided by Apple within the Vision framework. Think of the possibilities!

If you're interested in more tutorials on Vision APIs, check out the other Vision framework tutorials on raywenderlich.com.

I hope you enjoyed this tutorial, and if you have any questions or comments, please join the forum discussion below!


