


README.md
Watson Developer Cloud Swift SDK
Overview
The Watson Developer Cloud Swift SDK makes it easy for mobile developers to build Watson-powered applications. With the Swift SDK you can leverage the power of Watson's advanced artificial intelligence, machine learning, and deep learning techniques to understand unstructured data and engage with mobile users in new ways.
There are many resources to help you build your first cognitive application with the Swift SDK:
- Read the Readme
- Follow the QuickStart Guide
- Review a Sample Application
- Browse the Documentation
Contents
General
- Requirements
- Installation
- Service Instances
- Custom Service URLs
- Custom Headers
- Sample Applications
- Synchronous Execution
- Objective-C Compatibility
- Linux Compatibility
- Contributing
- License
Services
- Assistant
- Discovery
- Language Translator V2
- Language Translator V3
- Natural Language Classifier
- Natural Language Understanding
- Personality Insights
- Speech to Text
- Text to Speech
- Tone Analyzer
- Visual Recognition
Requirements
- iOS 8.0+
- Xcode 9.0+
- Swift 3.2+ or Swift 4.0+
Installation
Dependency Management
We recommend using Carthage to manage dependencies and build the Swift SDK for your application.
You can install Carthage with Homebrew:
$ brew update
$ brew install carthage
Then, navigate to the root directory of your project (where your .xcodeproj file is located) and create an empty Cartfile there:
$ touch Cartfile
To use the Watson Developer Cloud Swift SDK in your application, specify it in your Cartfile:
github "watson-developer-cloud/swift-sdk"
In a production app, you may also want to specify a version requirement.
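For example, a Cartfile entry that pins to a compatible release series (the version shown here is only illustrative; adjust it to the release you target):
github "watson-developer-cloud/swift-sdk" ~> 0.28.0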
Then run the following command to build the dependencies and frameworks:
$ carthage update --platform iOS
Finally, drag-and-drop the built frameworks into your Xcode project and import them as desired. If you are using Speech to Text, be sure to include both SpeechToTextV1.framework and Starscream.framework in your application.
Swift Package Manager
Add the following to your Package.swift file to identify the Swift SDK as a dependency. The package manager will clone the Swift SDK when you build your project with swift build.
dependencies: [
    .package(url: "https://github.com/watson-developer-cloud/swift-sdk", from: "0.28.0")
]
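For context, here is a minimal sketch of a complete Package.swift that wires the dependency into a target. The target name ("MyWatsonApp") and the product name ("AssistantV1") are assumptions; match them to your own target and to the module you import.
// swift-tools-version:4.0
// Minimal sketch, not taken from the SDK itself. Adjust the target name and
// the product name ("AssistantV1") to your project and the module you use.
import PackageDescription

let package = Package(
    name: "MyWatsonApp",
    dependencies: [
        .package(url: "https://github.com/watson-developer-cloud/swift-sdk", from: "0.28.0"),
    ],
    targets: [
        .target(name: "MyWatsonApp", dependencies: ["AssistantV1"]),
    ]
)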
Service Instances
IBM Watson offers a variety of services for developing cognitive applications. The complete list of Watson services is available from the products and services page. Services are instantiated using the IBM Cloud platform.
Follow these steps to create a service instance and obtain its credentials:
- Log in to IBM Cloud at https://bluemix.net.
- Create a service instance:
- From the Dashboard, select "Use Services or APIs".
- Select the service you want to use.
- Click "Create".
- Copy your service credentials:
- Click "Service Credentials" on the left side of the page.
- Copy the service's username and password (or api_key for Visual Recognition).

let textToSpeech = TextToSpeech(username: "your-username-here", password: "your-password-here")
Note that service credentials are different from your IBM Cloud username and password.
See Getting started with Watson and IBM Cloud for details.
Authentication
There are three ways to authenticate with IBM Cloud through the SDK: using a username and password, using an api_key, and with IAM.
See above for the steps to obtain the credentials for your service.
In your code, you pass these values in the service constructor when instantiating your service. Here are some examples:
Username and Password
let discovery = Discovery(username: "your-username-here", password: "your-password-here", version: "your-version-here")
API Key
Note: This type of authentication only works with Visual Recognition, and for instances created before May 23, 2018. Newer instances of Visual Recognition use IAM.
let visualRecognition = VisualRecognition(apiKey: "your-apiKey-here", version: "your-version-here")
Using IAM
When authenticating with IAM, you have the option of supplying:
- the IAM API key and, optionally, the IAM service URL. The IAM service URL defaults to 'https://iam.bluemix.net/identity/token'.
- an access token for the service.
If you supply an IAM API key, the SDK will request and refresh access tokens on your behalf. If you supply only the IAM access token, you are responsible for refreshing the access token as needed.
Supplying the IAM API key
let discovery = Discovery(version: "your-version-here", apiKey: "your-apikey-here")
Supplying the accessToken
let discovery = Discovery(version: "your-version-here", accessToken: "your-accessToken-here")
Updating the accessToken
discovery.accessToken("new-accessToken-here")
Custom Service URLs
You can set a custom service URL by modifying the serviceURL property. A custom service URL may be required when running an instance in a particular region or connecting through a proxy.
For example, here is how to connect to a Tone Analyzer instance that is hosted in Germany:
let toneAnalyzer = ToneAnalyzer(
    username: "your-username-here",
    password: "your-password-here",
    version: "yyyy-mm-dd"
)
toneAnalyzer.serviceURL = "https://gateway-fra.watsonplatform.net/tone-analyzer/api"
Custom Headers
There are different headers that can be sent to the Watson services. For example, Watson services log requests and their results for the purpose of improving the services, but you can include the X-Watson-Learning-Opt-Out header to opt out of this.
We have exposed a defaultHeaders public property in each class to allow users to easily customize their headers:
let naturalLanguageClassifier = NaturalLanguageClassifier(username: username, password: password)
naturalLanguageClassifier.defaultHeaders = ["X-Watson-Learning-Opt-Out": "true"]
Each service method also accepts an optional headers parameter, which is a dictionary of request headers to be sent with that request only.
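For example, here is a minimal sketch of a per-request header, reusing the classifier from the previous example. The exact position of the headers parameter in the method signature is an assumption; check the generated documentation for the method you call.
// Sketch: send a header with a single request rather than with every request
// via defaultHeaders. The placement of `headers` in the parameter list is an
// assumption; consult the method's documentation for the exact signature.
let perRequestHeaders = ["X-Watson-Learning-Opt-Out": "true"]
let failure = { (error: Error) in print(error) }
naturalLanguageClassifier.classify(
    classifierID: "your-classifier-id",
    text: "your-text-here",
    headers: perRequestHeaders,
    failure: failure) { classification in
    print(classification)
}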
Sample Applications
- Simple Chat (Swift)
- Simple Chat (Objective-C)
- Visual Recognition with Core ML
- Visual Recognition and Discovery with Core ML
- Speech to Text
- Text to Speech
- Cognitive Concierge
Synchronous Execution
By default, the SDK executes all networking operations asynchronously. If your application requires synchronous execution, you can use a DispatchGroup. For example:
let dispatchGroup = DispatchGroup()
dispatchGroup.enter()
assistant.message(workspaceID: workspaceID) { response in
    print(response.output.text)
    dispatchGroup.leave()
}
dispatchGroup.wait(timeout: .distantFuture)
Objective-C Compatibility
Please see this tutorial for more information about consuming the Watson Developer Cloud Swift SDK in an Objective-C application.
Linux Compatibility
To use the Watson SDK in your Linux project, please follow the Swift Package Manager instructions. Note that Speech to Text and Text to Speech are not supported because they rely on frameworks that are unavailable on Linux.
Contributing
We would love any and all help! If you would like to contribute, please read our CONTRIBUTING documentation with information on getting started.
License
This library is licensed under Apache 2.0. Full license text is available in LICENSE.
This SDK is intended for use with an Apple iOS product and intended to be used in conjunction with officially licensed Apple development tools.
Assistant
With the IBM Watson Assistant service you can create cognitive agents--virtual agents that combine machine learning, natural language understanding, and integrated dialog scripting tools to provide outstanding customer engagements.
The following example shows how to start a conversation with the Assistant service:
import AssistantV1

let username = "your-username-here"
let password = "your-password-here"
let version = "YYYY-MM-DD" // use today's date for the most recent version
let assistant = Assistant(username: username, password: password, version: version)

let workspaceID = "your-workspace-id-here"
let failure = { (error: Error) in print(error) }
var context: Context? // save context to continue conversation
assistant.message(workspaceID: workspaceID, failure: failure) { response in
    print(response.output.text)
    context = response.context
}
The following example shows how to continue an existing conversation with the Assistant service:
let input = InputData(text: "Turn on the radio.")
let request = MessageRequest(input: input, context: context)
let failure = { (error: Error) in print(error) }
assistant.message(workspaceID: workspaceID, request: request, failure: failure) { response in
    print(response.output.text)
    context = response.context
}
Context Variables
The Assistant service allows users to define custom context variables in their application's payload. For example, a workspace that guides users through a pizza order might include a context variable for pizza size: "pizza_size": "large".
Context variables are get/set using the var additionalProperties: [String: JSON] property of a Context model. The following example shows how to get and set a user-defined pizza_size variable:
// get the `pizza_size` context variable
assistant.message(workspaceID: workspaceID, request: request, failure: failure) { response in
    if case let .string(size) = response.context.additionalProperties["pizza_size"]! {
        print(size)
    }
}

// set the `pizza_size` context variable
assistant.message(workspaceID: workspaceID, request: request, failure: failure) { response in
    var context = response.context // `var` makes the context mutable
    context?.additionalProperties["pizza_size"] = .string("large")
}
For reference, the JSON type is defined as:
/// A JSON value (one of string, number, object, array, true, false, or null).
public enum JSON: Equatable, Codable {
    case null
    case boolean(Bool)
    case string(String)
    case int(Int)
    case double(Double)
    case array([JSON])
    case object([String: JSON])
}
The following links provide more information about the IBM Watson Assistant service:
Discovery
IBM Watson Discovery makes it possible to rapidly build cognitive, cloud-based exploration applications that unlock actionable insights hidden in unstructured data — including your own proprietary data, as well as public and third-party data. With Discovery, it only takes a few steps to prepare your unstructured data, create a query that will pinpoint the information you need, and then integrate those insights into your new application or existing solution.
Discovery News
IBM Watson Discovery News is included with Discovery. Watson Discovery News is an indexed dataset with news articles from the past 60 days — approximately 300,000 English articles daily. The dataset is pre-enriched with the following cognitive insights: Keyword Extraction, Entity Extraction, Semantic Role Extraction, Sentiment Analysis, Relation Extraction, and Category Classification.
The following example shows how to query the Watson Discovery News dataset:
import DiscoveryV1

let username = "your-username-here"
let password = "your-password-here"
let version = "YYYY-MM-DD" // use today's date for the most recent version
let discovery = Discovery(username: username, password: password, version: version)

let failure = { (error: Error) in print(error) }
discovery.query(
    environmentID: "system",
    collectionID: "news-en",
    query: "enriched_text.concepts.text:\"Cloud computing\"",
    failure: failure) { queryResponse in
    print(queryResponse)
}
Private Data Collections
The Swift SDK supports environment management, collection management, and document uploading. But you may find it easier to create private data collections using the Discovery Tooling instead.
Once your content has been uploaded and enriched by the Discovery service, you can search the collection with queries. The following example demonstrates a complex query with a filter, query, and aggregation:
import DiscoveryV1

let username = "your-username-here"
let password = "your-password-here"
let version = "YYYY-MM-DD" // use today's date for the most recent version
let discovery = Discovery(username: username, password: password, version: version)

let failure = { (error: Error) in print(error) }
discovery.query(
    environmentID: "your-environment-id",
    collectionID: "your-collection-id",
    filter: "enriched_text.concepts.text:\"Technology\"",
    query: "enriched_text.concepts.text:\"Cloud computing\"",
    aggregation: "term(enriched_text.concepts.text,count:10)",
    failure: failure) { queryResponse in
    print(queryResponse)
}
You can also upload new documents into your private collection:
import DiscoveryV1

let username = "your-username-here"
let password = "your-password-here"
let version = "YYYY-MM-DD" // use today's date for the most recent version
let discovery = Discovery(username: username, password: password, version: version)

let failure = { (error: Error) in print(error) }
let file = Bundle.main.url(forResource: "KennedySpeech", withExtension: "html")!
discovery.addDocument(
    environmentID: "your-environment-id",
    collectionID: "your-collection-id",
    file: file,
    fileContentType: "text/html",
    failure: failure) { response in
    print(response)
}
The following links provide more information about the IBM Discovery service:
Language Translator V2
The IBM Watson Language Translator service lets you select a domain, customize it, then identify or select the language of text, and then translate the text from one supported language to another.
The following example demonstrates how to use the Language Translator service:
import LanguageTranslatorV2

let username = "your-username-here"
let password = "your-password-here"
let languageTranslator = LanguageTranslator(username: username, password: password)

let failure = { (error: Error) in print(error) }
let request = TranslateRequest(text: ["Hello"], source: "en", target: "es")
languageTranslator.translate(request: request, failure: failure) { translation in
    print(translation)
}
Language Translator V3
The IBM Watson Language Translator service lets you select a domain, customize it, then identify or select the language of text, and then translate the text from one supported language to another.
The following example demonstrates how to use the Language Translator service:
import LanguageTranslatorV3

let username = "your-username-here"
let password = "your-password-here"
let version = "yyyy-mm-dd" // use today's date for the most recent version
let languageTranslator = LanguageTranslator(username: username, password: password, version: version)

let failure = { (error: Error) in print(error) }
let request = TranslateRequest(text: ["Hello"], source: "en", target: "es")
languageTranslator.translate(request: request, failure: failure) { translation in
    print(translation)
}
The following links provide more information about the IBM Watson Language Translator service:
- IBM Watson Language Translator - Service Page
- IBM Watson Language Translator - Documentation
- IBM Watson Language Translator - Demo
Natural Language Classifier
The IBM Watson Natural Language Classifier service enables developers without a background in machine learning or statistical algorithms to create natural language interfaces for their applications. The service interprets the intent behind text and returns a corresponding classification with associated confidence levels. The return value can then be used to trigger a corresponding action, such as redirecting the request or answering a question.
The following example demonstrates how to use the Natural Language Classifier service:
import NaturalLanguageClassifierV1

let username = "your-username-here"
let password = "your-password-here"
let naturalLanguageClassifier = NaturalLanguageClassifier(username: username, password: password)

let classifierID = "your-trained-classifier-id"
let text = "your-text-here"
let failure = { (error: Error) in print(error) }
naturalLanguageClassifier.classify(classifierID: classifierID, text: text, failure: failure) { classification in
    print(classification)
}
The following links provide more information about the Natural Language Classifier service:
- IBM Watson Natural Language Classifier - Service Page
- IBM Watson Natural Language Classifier - Documentation
- IBM Watson Natural Language Classifier - Demo
Natural Language Understanding
The IBM Natural Language Understanding service explores various features of text content. Provide text, raw HTML, or a public URL, and IBM Watson Natural Language Understanding will give you results for the features you request. The service cleans HTML content before analysis by default, so the results can ignore most advertisements and other unwanted content.
Natural Language Understanding has the following features:
- Concepts
- Entities
- Keywords
- Categories
- Sentiment
- Emotion
- Relations
- Semantic Roles
The following example demonstrates how to use the service:
import NaturalLanguageUnderstandingV1

let username = "your-username-here"
let password = "your-password-here"
let version = "yyyy-mm-dd" // use today's date for the most recent version
let naturalLanguageUnderstanding = NaturalLanguageUnderstanding(username: username, password: password, version: version)

let features = Features(concepts: ConceptsOptions(limit: 5))
let parameters = Parameters(features: features, text: "your-text-here")
let failure = { (error: Error) in print(error) }
naturalLanguageUnderstanding.analyze(parameters: parameters, failure: failure) { results in
    print(results)
}
500 errors
Note that you are required to include at least one feature in your request. You will receive a 500 error if you do not include any features in your request.
The following links provide more information about the Natural Language Understanding service:
- IBM Watson Natural Language Understanding - Service Page
- IBM Watson Natural Language Understanding - Documentation
- IBM Watson Natural Language Understanding - Demo
Personality Insights
The IBM Watson Personality Insights service enables applications to derive insights from social media, enterprise data, or other digital communications. The service uses linguistic analytics to infer personality and social characteristics, including Big Five, Needs, and Values, from text.
The following example demonstrates how to use the Personality Insights service:
import PersonalityInsightsV3

let username = "your-username-here"
let password = "your-password-here"
let version = "yyyy-mm-dd" // use today's date for the most recent version
let personalityInsights = PersonalityInsights(username: username, password: password, version: version)

let failure = { (error: Error) in print(error) }
personalityInsights.profile(text: "your-input-text", failure: failure) { profile in
    print(profile)
}
The following links provide more information about the Personality Insights service:
- IBM Watson Personality Insights - Service Page
- IBM Watson Personality Insights - Documentation
- IBM Watson Personality Insights - Demo
Speech to Text
The IBM Watson Speech to Text service enables you to add speech transcription capabilities to your application. It uses machine intelligence to combine information about grammar and language structure to generate an accurate transcription. Transcriptions are supported for various audio formats and languages.
The SpeechToText class is the SDK's primary interface for performing speech recognition requests. It supports the transcription of audio files, audio data, and streaming microphone data. Advanced users, however, may instead wish to use the SpeechToTextSession class that exposes more control over the WebSockets session.
Please be sure to include both SpeechToTextV1.framework and Starscream.framework in your application. Starscream is a recursive dependency that adds support for WebSockets sessions.
Beginning with iOS 10+, any application that accesses the microphone must include the NSMicrophoneUsageDescription key in the app's Info.plist file. Otherwise, the app will crash. Find more information about this here.
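As a complementary runtime step, here is a minimal sketch that requests record permission before streaming audio, using standard AVFoundation APIs (the Info.plist key above is still required):
import AVFoundation

// Ask the user for microphone access before starting a recognition request.
// iOS still requires the NSMicrophoneUsageDescription key in Info.plist.
AVAudioSession.sharedInstance().requestRecordPermission { granted in
    if granted {
        print("Microphone access granted")
    } else {
        print("Microphone access denied")
    }
}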
Recognition Request Settings
The RecognitionSettings class is used to define the audio format and behavior of a recognition request. These settings are transmitted to the service when initiating a request.
The following example demonstrates how to define a recognition request that transcribes WAV audio data with interim results:
var settings = RecognitionSettings(contentType: "audio/wav")
settings.interimResults = true
See the class documentation or service documentation for more information about the available settings.
Microphone Audio and Compression
The Speech to Text framework makes it easy to perform speech recognition with microphone audio. The framework internally manages the microphone, starting and stopping it with various function calls (such as recognizeMicrophone(settings:model:customizationID:learningOptOut:compress:failure:success) and stopRecognizeMicrophone(), or startMicrophone(compress:) and stopMicrophone()).
There are two different ways that your app can determine when to stop the microphone:
- User Interaction: Your app could rely on user input to stop the microphone. For example, you could use a button to start/stop transcribing, or you could require users to press-and-hold a button to start/stop transcribing.
- Final Result: Each transcription result has a final property that is true when the audio stream is complete or a timeout has occurred. By watching for the final property, your app can stop the microphone after determining when the user has finished speaking.
To reduce latency and bandwidth, the microphone audio is compressed to OggOpus format by default. To disable compression, set the compress parameter to false.
It's important to specify the correct audio format for recognition requests that use the microphone:
// compressed microphone audio uses the opus format
let settings = RecognitionSettings(contentType: "audio/ogg;codecs=opus")

// uncompressed microphone audio uses a 16-bit mono PCM format at 16 kHz
let settings = RecognitionSettings(contentType: "audio/l16;rate=16000;channels=1")
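Putting the two together, here is a minimal sketch of an uncompressed microphone request. It assumes a configured speechToText instance (see the examples below) and the recognizeMicrophone parameter list shown earlier in this section; the position of compress relative to failure is an assumption.
// Sketch: disable compression and use the matching uncompressed audio format.
// `speechToText` is a configured SpeechToText instance (see the examples below).
var settings = RecognitionSettings(contentType: "audio/l16;rate=16000;channels=1")
settings.interimResults = true
let failure = { (error: Error) in print(error) }
speechToText.recognizeMicrophone(settings: settings, compress: false, failure: failure) { results in
    print(results)
}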
Recognition Results Accumulator
The Speech to Text service may not always return the entire transcription in a single response. Instead, the transcription may be streamed over multiple responses, each with a chunk of the overall results. This is especially common for long audio files, since the entire transcription may contain a significant amount of text.
To help combine multiple responses, the Swift SDK provides a SpeechRecognitionResultsAccumulator object. The accumulator tracks results as they are added and maintains several useful instance variables:
- results: A list of all accumulated recognition results.
- speakerLabels: A list of all accumulated speaker labels.
- bestTranscript: A concatenation of transcripts with the greatest confidence.
To use the accumulator, initialize an instance of the object then add results as you receive them:
var accumulator = SpeechRecognitionResultsAccumulator()
accumulator.add(results: results)
print(accumulator.bestTranscript)
Transcribe Recorded Audio
The following example demonstrates how to use the Speech to Text service to transcribe a WAV audio file.
import SpeechToTextV1

let username = "your-username-here"
let password = "your-password-here"
let speechToText = SpeechToText(username: username, password: password)
var accumulator = SpeechRecognitionResultsAccumulator()

let audio = Bundle.main.url(forResource: "filename", withExtension: "wav")!
var settings = RecognitionSettings(contentType: "audio/wav")
settings.interimResults = true
let failure = { (error: Error) in print(error) }
speechToText.recognize(audio, settings: settings, failure: failure) { results in
    accumulator.add(results: results)
    print(accumulator.bestTranscript)
}
Transcribe Microphone Audio
Audio can be streamed from the microphone to the Speech to Text service for real-time transcriptions. The following example demonstrates how to use the Speech to Text service to transcribe microphone audio:
import SpeechToTextV1

let username = "your-username-here"
let password = "your-password-here"
let speechToText = SpeechToText(username: username, password: password)
var accumulator = SpeechRecognitionResultsAccumulator()

func startStreaming() {
    var settings = RecognitionSettings(contentType: "audio/ogg;codecs=opus")
    settings.interimResults = true
    let failure = { (error: Error) in print(error) }
    speechToText.recognizeMicrophone(settings: settings, failure: failure) { results in
        accumulator.add(results: results)
        print(accumulator.bestTranscript)
    }
}

func stopStreaming() {
    speechToText.stopRecognizeMicrophone()
}
Session Management and Advanced Features
Advanced users may want more customizability than provided by the SpeechToText class. The SpeechToTextSession class exposes more control over the WebSockets connection and also includes several advanced features for accessing the microphone. The SpeechToTextSession class also allows users more control over the AVAudioSession shared instance. Before using SpeechToTextSession, it's helpful to be familiar with the Speech to Text WebSocket interface.
The following steps describe how to execute a recognition request with SpeechToTextSession:
- Connect: Invoke connect() to connect to the service.
- Start Recognition Request: Invoke startRequest(settings:) to start a recognition request.
- Send Audio: Invoke recognize(audio:) or startMicrophone(compress:) / stopMicrophone() to send audio to the service.
- Stop Recognition Request: Invoke stopRequest() to end the recognition request. If the recognition request is already stopped, then sending a stop message will have no effect.
- Disconnect: Invoke disconnect() to wait for any remaining results to be received and then disconnect from the service.
All text and data messages sent by SpeechToTextSession are queued, with the exception of connect(), which immediately connects to the server. The queue ensures that the messages are sent in-order and also buffers messages while waiting for a connection to be established. This behavior is generally transparent.
A SpeechToTextSession also provides several (optional) callbacks. The callbacks can be used to learn about the state of the session or access microphone data.
- onConnect: Invoked when the session connects to the Speech to Text service.
- onMicrophoneData: Invoked with microphone audio when a recording audio queue buffer has been filled. If microphone audio is being compressed, then the audio data is in OggOpus format. If uncompressed, then the audio data is in 16-bit PCM format at 16 kHz.
- onPowerData: Invoked every 0.025s when recording with the average dB power of the microphone.
- onResults: Invoked when transcription results are received for a recognition request.
- onError: Invoked when an error or warning occurs.
- onDisconnect: Invoked when the session disconnects from the Speech to Text service.
Note that the AVAudioSession.sharedInstance() must be configured to allow microphone access when using SpeechToTextSession. This allows users to set a particular configuration for the AVAudioSession. An example configuration is shown in the code below.
The following example demonstrates how to use SpeechToTextSession to transcribe microphone audio:
import SpeechToTextV1
import AVFoundation

let username = "your-username-here"
let password = "your-password-here"
let speechToTextSession = SpeechToTextSession(username: username, password: password)
var accumulator = SpeechRecognitionResultsAccumulator()

do {
    let session = AVAudioSession.sharedInstance()
    try session.setActive(true)
    try session.setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers, .defaultToSpeaker])
} catch {
    print(error.localizedDescription)
}

func startStreaming() {
    // define callbacks
    speechToTextSession.onConnect = { print("connected") }
    speechToTextSession.onDisconnect = { print("disconnected") }
    speechToTextSession.onError = { error in print(error) }
    speechToTextSession.onPowerData = { decibels in print(decibels) }
    speechToTextSession.onMicrophoneData = { data in print("received data") }
    speechToTextSession.onResults = { results in
        accumulator.add(results: results)
        print(accumulator.bestTranscript)
    }

    // define recognition request settings
    var settings = RecognitionSettings(contentType: "audio/ogg;codecs=opus")
    settings.interimResults = true

    // start streaming microphone audio for transcription
    speechToTextSession.connect()
    speechToTextSession.startRequest(settings: settings)
    speechToTextSession.startMicrophone()
}

func stopStreaming() {
    speechToTextSession.stopMicrophone()
    speechToTextSession.stopRequest()
    speechToTextSession.disconnect()
}
Customization
There are a number of ways that Speech to Text can be customized to suit your particular application. For example, you can define custom words or upload audio to train an acoustic model. For more information, refer to the service documentation or API documentation.
Additional Information
The following links provide more information about the IBM Speech to Text service:
- IBM Watson Speech to Text - Service Page
- IBM Watson Speech to Text - Documentation
- IBM Watson Speech to Text - Demo
Text to Speech
The IBM Watson Text to Speech service synthesizes natural-sounding speech from input text in a variety of languages and voices that speak with appropriate cadence and intonation.
The following example demonstrates how to use the Text to Speech service:
import TextToSpeechV1
import AVFoundation

let username = "your-username-here"
let password = "your-password-here"
let textToSpeech = TextToSpeech(username: username, password: password)

// The AVAudioPlayer object will stop playing if it falls out-of-scope.
// Therefore, to prevent it from falling out-of-scope we declare it as
// a property outside the completion handler where it will be played.
var audioPlayer = AVAudioPlayer()

let text = "your-text-here"
let accept = "audio/wav"
let failure = { (error: Error) in print(error) }
textToSpeech.synthesize(text: text, accept: accept, failure: failure) { data in
    audioPlayer = try! AVAudioPlayer(data: data)
    audioPlayer.prepareToPlay()
    audioPlayer.play()
}
The Text to Speech service supports a number of voices for different genders, languages, and dialects. The following example demonstrates how to use the Text to Speech service with a particular voice:
import TextToSpeechV1
import AVFoundation

let username = "your-username-here"
let password = "your-password-here"
let textToSpeech = TextToSpeech(username: username, password: password)

// The AVAudioPlayer object will stop playing if it falls out-of-scope.
// Therefore, to prevent it from falling out-of-scope we declare it as
// a property outside the completion handler where it will be played.
var audioPlayer = AVAudioPlayer()

let text = "your-text-here"
let accept = "audio/wav"
let voice = "en-US_LisaVoice"
let failure = { (error: Error) in print(error) }
textToSpeech.synthesize(text: text, accept: accept, voice: voice, failure: failure) { data in
    audioPlayer = try! AVAudioPlayer(data: data)
    audioPlayer.prepareToPlay()
    audioPlayer.play()
}
The following links provide more information about the IBM Text To Speech service:
- IBM Watson Text To Speech - Service Page
- IBM Watson Text To Speech - Documentation
- IBM Watson Text To Speech - Demo
Tone Analyzer
The IBM Watson Tone Analyzer service can be used to discover, understand, and revise the language tones in text. The service uses linguistic analysis to detect three types of tones from written text: emotions, social tendencies, and writing style.
Emotions identified include things like anger, fear, joy, sadness, and disgust. Identified social tendencies include things from the Big Five personality traits used by some psychologists. These include openness, conscientiousness, extraversion, agreeableness, and emotional range. Identified writing styles include confident, analytical, and tentative.
The following example demonstrates how to use the Tone Analyzer service:
import ToneAnalyzerV3

let username = "your-username-here"
let password = "your-password-here"
let version = "YYYY-MM-DD" // use today's date for the most recent version
let toneAnalyzer = ToneAnalyzer(username: username, password: password, version: version)

let toneInput = ToneInput(text: "your-input-text")
let failure = { (error: Error) in print(error) }
toneAnalyzer.tone(toneInput: toneInput, contentType: "text/plain", failure: failure) { tones in
    print(tones)
}
The following links provide more information about the IBM Watson Tone Analyzer service:
- IBM Watson Tone Analyzer - Service Page
- IBM Watson Tone Analyzer - Documentation
- IBM Watson Tone Analyzer - Demo
Visual Recognition
The IBM Watson Visual Recognition service uses deep learning algorithms to analyze images (.jpg or .png) for scenes, objects, faces, text, and other content, and return keywords that provide information about that content. The service comes with a set of built-in classes so that you can analyze images with high accuracy right out of the box. You can also train custom classifiers to create specialized classes.
The following example demonstrates how to use the Visual Recognition service:
import VisualRecognitionV3

let apiKey = "your-apikey-here"
let version = "YYYY-MM-DD" // use today's date for the most recent version
let visualRecognition = VisualRecognition(version: version, apiKey: apiKey)

let url = "your-image-url"
let failure = { (error: Error) in print(error) }
visualRecognition.classify(image: url, failure: failure) { classifiedImages in
    print(classifiedImages)
}
Note: a different initializer is used for authentication with instances created before May 23, 2018:
let visualRecognition = VisualRecognition(apiKey: apiKey, version: version)
Using Core ML
The Watson Swift SDK supports offline image classification using Apple Core ML. Classifiers must be trained or updated with the coreMLEnabled flag set to true. Once the classifier's coreMLStatus is ready, a Core ML model is available to download and use for offline classification.
Once the Core ML model is in the device's file system, images can be classified offline, directly on the device.
let classifierID = "your-classifier-id"
let failure = { (error: Error) in print(error) }
let image = UIImage(named: "your-image-filename")
visualRecognition.classifyWithLocalModel(image: image, classifierIDs: [classifierID], failure: failure) { classifiedImages in
    print(classifiedImages)
}
The local Core ML model can be updated as needed.
let classifierID = "your-classifier-id"
let failure = { (error: Error) in print(error) }
visualRecognition.updateLocalModel(classifierID: classifierID, failure: failure) {
    print("model updated")
}
The following example demonstrates how to list the Core ML models that are stored in the filesystem and available for offline use:
let localModels = try! visualRecognition.listLocalModels()
print(localModels)
If you would prefer to bypass classifyWithLocalModel and construct your own Core ML classification request, then you can retrieve a Core ML model from the local filesystem as follows:
let classifierID = "your-classifier-id"
let localModel = try! visualRecognition.getLocalModel(classifierID: classifierID)
print(localModel)
The following example demonstrates how to delete a local Core ML model from the filesystem. This saves space when the model is no longer needed.
let classifierID = "your-classifier-id"
visualRecognition.deleteLocalModel(classifierID: classifierID)
Bundling a model directly with your application
You may also choose to include a Core ML model with your application, enabling images to be classified offline without having to download a model first. To include a model, add it to your application bundle following the naming convention [classifier_id].mlmodel. This will enable the SDK to locate the model when using any function that accepts a classifierID argument.
The following links provide more information about the IBM Watson Visual Recognition service: