

Audio File Transcription, for Super-Efficient Recording
source link: https://dev.to/vivijiangclevercoder/audio-file-transcription-for-super-efficient-recording-33f5
Jul 12
Introduction
Converting audio into text has a wide range of applications, such as generating video subtitles, taking meeting minutes, and writing interview transcripts. The audio file transcription service in HUAWEI ML Kit makes this easier by converting audio files into accurate, correctly punctuated text.
Actual Effects
Build and run an app with audio file transcription integrated. Then, select a local audio file and convert it into text.
Development Preparations
For details about configuring the Huawei Maven repository and integrating the audio file transcription SDK, please refer to the Development Guide of ML Kit on HUAWEI Developers.
Declaring Permissions in the AndroidManifest.xml File
Open the AndroidManifest.xml file in the main folder. Add the network connection, network status access, and storage read permissions before the <application> element. Note that these permissions must also be requested dynamically at runtime. Otherwise, a Permission Denied error is reported.
Development Procedure
Creating and Initializing an Audio File Transcription Engine
Override onCreate in MainActivity to create an audio transcription engine.
```java
private MLRemoteAftEngine mAnalyzer;

// In onCreate: obtain the engine instance, initialize it, and register the listener.
mAnalyzer = MLRemoteAftEngine.getInstance();
mAnalyzer.init(getApplicationContext());
mAnalyzer.setAftListener(mAsrListener);
```
Use MLRemoteAftSetting to configure the engine. The service currently supports Mandarin Chinese and English, that is, the options of mLanguage are zh and en.
```java
MLRemoteAftSetting setting = new MLRemoteAftSetting.Factory()
        .setLanguageCode(mLanguage)
        .enablePunctuation(true)
        .enableWordTimeOffset(true)
        .enableSentenceTimeOffset(true)
        .create();
```
enablePunctuation indicates whether to automatically punctuate the converted text. The default value is false.
If this parameter is set to true, the converted text is automatically punctuated. If it is set to false, no punctuation is added.
enableWordTimeOffset indicates whether to generate the text transcription result of each audio segment with the corresponding offset. The default value is false. You need to set this parameter only when the audio duration is less than 1 minute.
If this parameter is set to true, the offset information is returned along with the text transcription result. This applies to the transcription of short audio files with a duration of 1 minute or shorter.
If this parameter is set to false, only the text transcription result of the audio file will be returned.
enableSentenceTimeOffset indicates whether to output the offset of each sentence in the audio file. The default value is false.
If this parameter is set to true, the offset information is returned along with the text transcription result.
If this parameter is set to false, only the text transcription result of the audio file will be returned.
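The one-minute remark about enableWordTimeOffset can be turned into a small check that an app runs before building its settings. The following is only a sketch in plain Java: the class and method names are hypothetical, and treating exactly one minute as still eligible is an assumption, since the text above gives both "less than 1 minute" and "1 minute or shorter".

```java
import java.time.Duration;

public class WordOffsetCheck {
    // Assumed cutoff, taken from the "1 minute" remark; the boundary is treated as inclusive.
    static final Duration WORD_OFFSET_LIMIT = Duration.ofMinutes(1);

    /** Returns true when word-level offsets are worth requesting for audio of this length. */
    static boolean wordOffsetsApply(Duration audioLength) {
        return !audioLength.isNegative() && audioLength.compareTo(WORD_OFFSET_LIMIT) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(wordOffsetsApply(Duration.ofSeconds(45))); // true
        System.out.println(wordOffsetsApply(Duration.ofMinutes(5)));  // false
    }
}
```

The result could then be passed to enableWordTimeOffset instead of a hard-coded true.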
Creating a Listener Callback to Process the Transcription Result
```java
private MLRemoteAftListener mAsrListener = new MLRemoteAftListener() { /* callback overrides go here */ };
```
After initialization is complete, call startTask in the onInitComplete callback of MLRemoteAftListener to start the transcription.
```java
@Override
public void onInitComplete(String taskId, Object ext) {
    Log.i(TAG, "MLRemoteAftListener onInitComplete " + taskId);
    mAnalyzer.startTask(taskId);
}
```
Override onUploadProgress, onEvent, and onResult in MLRemoteAftListener.
```java
@Override
public void onUploadProgress(String taskId, double progress, Object ext) {
    Log.i(TAG, "MLRemoteAftListener onUploadProgress is " + taskId + " " + progress);
}

@Override
public void onEvent(String taskId, int eventId, Object ext) {
    Log.i(TAG, "MLAsrCallBack onEvent " + eventId);
    if (MLAftEvents.UPLOADED_EVENT == eventId) { // The file is uploaded successfully.
        showConvertingDialog();
        startQueryResult(); // Start polling for the transcription result.
    }
}

@Override
public void onResult(String taskId, MLRemoteAftResult result, Object ext) {
    Log.i(TAG, "onResult get " + taskId);
    if (result != null) {
        Log.i(TAG, "onResult isComplete " + result.isComplete());
        if (!result.isComplete()) {
            return; // Not finished yet; wait for the next query.
        }
        // The transcription is complete, so stop polling.
        if (null != mTimerTask) {
            mTimerTask.cancel();
        }
        if (result.getText() != null) {
            Log.i(TAG, result.getText());
            dismissTransferringDialog();
            showCovertResult(result.getText());
        }
        // Log the offsets of each segment, word, and sentence, if returned.
        List<MLRemoteAftResult.Segment> segmentList = result.getSegments();
        if (segmentList != null && !segmentList.isEmpty()) {
            for (MLRemoteAftResult.Segment segment : segmentList) {
                Log.i(TAG, "MLAsrCallBack segment text is : " + segment.getText() + ", startTime is : " + segment.getStartTime() + ", endTime is : " + segment.getEndTime());
            }
        }
        List<MLRemoteAftResult.Segment> words = result.getWords();
        if (words != null && !words.isEmpty()) {
            for (MLRemoteAftResult.Segment word : words) {
                Log.i(TAG, "MLAsrCallBack word text is : " + word.getText() + ", startTime is : " + word.getStartTime() + ", endTime is : " + word.getEndTime());
            }
        }
        List<MLRemoteAftResult.Segment> sentences = result.getSentences();
        if (sentences != null && !sentences.isEmpty()) {
            for (MLRemoteAftResult.Segment sentence : sentences) {
                Log.i(TAG, "MLAsrCallBack sentence text is : " + sentence.getText() + ", startTime is : " + sentence.getStartTime() + ", endTime is : " + sentence.getEndTime());
            }
        }
    }
}
```
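The sentence offsets logged in onResult could feed one of the use cases mentioned in the introduction: video subtitles. The sketch below formats sentences as SubRip (SRT) cues in plain Java. The Cue class is a hypothetical stand-in for MLRemoteAftResult.Segment, and the code assumes that getStartTime and getEndTime report offsets in milliseconds, which this article does not state.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SrtSketch {
    /** Hypothetical stand-in for a transcription segment: text plus start/end offsets. */
    static final class Cue {
        final String text;
        final long startMs; // assumed to be milliseconds from the start of the audio
        final long endMs;

        Cue(String text, long startMs, long endMs) {
            this.text = text;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    /** Formats a millisecond offset as an SRT timestamp, HH:MM:SS,mmm. */
    static String timestamp(long ms) {
        long hours = ms / 3_600_000;
        long minutes = (ms / 60_000) % 60;
        long seconds = (ms / 1_000) % 60;
        long millis = ms % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, millis);
    }

    /** Builds SRT text: a 1-based index, a time range, the cue text, then a blank line. */
    static String toSrt(List<Cue> cues) {
        StringBuilder sb = new StringBuilder();
        int index = 1;
        for (Cue cue : cues) {
            sb.append(index++).append('\n')
              .append(timestamp(cue.startMs)).append(" --> ").append(timestamp(cue.endMs)).append('\n')
              .append(cue.text).append("\n\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Cue> cues = Arrays.asList(
                new Cue("Hello, everyone.", 0, 1_500),
                new Cue("Let's get started.", 1_500, 3_250));
        System.out.print(toSrt(cues));
    }
}
```

In an app, the list of cues would be built from result.getSentences() inside onResult, once isComplete returns true.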
Processing the Transcription Result in Polling Mode
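After the file is uploaded, call getLongAftResult periodically to query the transcription result. The code below sends the first query after 5 seconds and then repeats it every 10 seconds. onResult cancels the timer task once the result is complete.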
```java
private void startQueryResult() {
    Timer mTimer = new Timer();
    mTimerTask = new TimerTask() {
        @Override
        public void run() {
            getResult();
        }
    };
    // Query the long audio transcription result after 5 s, then every 10 s.
    mTimer.schedule(mTimerTask, 5000, 10000);
}

private void getResult() {
    Log.i(TAG, "getResult");
    mAnalyzer.setAftListener(mAsrListener);
    // mLongTaskId is the ID of the transcription task being queried; its assignment is not shown here.
    mAnalyzer.getLongAftResult(mLongTaskId);
}
```
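The polling pattern can be tried outside Android with a plain Java sketch. Here a BooleanSupplier is a hypothetical stand-in for the query, and completion is checked synchronously in one place, whereas the article's code detects it asynchronously in onResult. The sketch also cancels the Timer itself rather than only the TimerTask, so the timer thread is released when polling ends. The delays are shortened so that the example finishes quickly.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class PollingSketch {
    /**
     * Calls isComplete after initialDelayMs and then every periodMs until it returns true,
     * then cancels the timer. Returns a latch that opens once polling has stopped.
     */
    static CountDownLatch pollUntilComplete(BooleanSupplier isComplete, long initialDelayMs, long periodMs) {
        CountDownLatch done = new CountDownLatch(1);
        Timer timer = new Timer("aft-poll", true); // daemon thread, so it cannot keep the JVM alive
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                if (isComplete.getAsBoolean()) {
                    timer.cancel(); // no further runs are scheduled and the timer thread can exit
                    done.countDown();
                }
            }
        }, initialDelayMs, periodMs);
        return done;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger polls = new AtomicInteger();
        // Hypothetical stand-in for a result query: reports completion on the third poll.
        CountDownLatch done = pollUntilComplete(() -> polls.incrementAndGet() >= 3, 50, 100);
        boolean finished = done.await(5, TimeUnit.SECONDS);
        System.out.println("finished=" + finished + ", polls=" + polls.get()); // finished=true, polls=3
    }
}
```

With the article's values, the call would use an initial delay of 5000 and a period of 10000.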