Tutorial: Play with a Speech-to-Text API using Node.js

 3 years ago
source link: https://dev.to/yongchanghe/tutorial-play-with-a-speech-to-text-api-using-nodejs-2527
Try out Deepgram's API for converting an audio file or audio stream into written text

I wrote this post as a detailed record of the steps I took and as a memo for my own Node.js learning.
If you are interested and want to get your hands dirty, follow the steps below and have fun!

Prerequisites

  • Node.js installed
  • A command line interface (CLI / terminal)
  • Your favourite code editor or IDE (e.g. VS Code)
  • A Deepgram account

Getting started

First, navigate to a directory of your choice and create a folder (e.g. named sttApp) using this command:

mkdir sttApp

Then open the folder in your favourite IDE. Mine is VS Code. The directory is empty at this point.

Next, in your terminal, navigate into the new directory /sttApp:

cd sttApp

Then run the following command to initialize a new application:

npm init

Press Enter several times to accept the default for each prompt. npm then shows the package.json it is about to write and asks for a final confirmation.

Next, we install the Deepgram Node.js SDK using the following:

npm install @deepgram/sdk

If all the previous steps went well, the directory shown in your IDE should now contain package.json, package-lock.json, and a node_modules folder.
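
One way to double-check the project layout from code is a small script like the one below. It is only a sketch: the helper name missingEntries and the list of expected entries are my own choices for illustration, based on what npm init and npm install normally create.

```javascript
const fs = require('fs');
const path = require('path');

// Entries that `npm init` and `npm install` normally create in the project folder
const EXPECTED_ENTRIES = ['package.json', 'package-lock.json', 'node_modules'];

// Hypothetical helper: return the expected entries that are missing from `dir`
function missingEntries(dir, expected = EXPECTED_ENTRIES) {
  return expected.filter((name) => !fs.existsSync(path.join(dir, name)));
}

// Check the directory the script is run from
const missing = missingEntries(process.cwd());
console.log(
  missing.length === 0
    ? 'Project layout looks complete.'
    : `Missing: ${missing.join(', ')}`
);
```

Save it under any name inside /sttApp and run it with node to see whether anything is missing.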

Now create a file named index.js in the current directory (/sttApp), paste the following code into it, and save the file:

const { Deepgram } = require('@deepgram/sdk');
const fs = require('fs');

// Your Deepgram API key (revealed in the Deepgram console)
const deepgramApiKey = 'YOUR_API_KEY';

// Replace with your file path and audio mimetype
const pathToFile = 'SOME_FILE.wav';
const mimetype = 'audio/wav';

// Initializes the Deepgram SDK
const deepgram = new Deepgram(deepgramApiKey);

console.log('Requesting transcript...');
console.log('Your file may take up to a couple minutes to process.');
console.log('While you wait, did you know that Deepgram accepts over 40 audio file formats? Even MP4s.');
console.log('To learn more about customizing your transcripts check out developers.deepgram.com.');

deepgram.transcription.preRecorded(
  { buffer: fs.readFileSync(pathToFile), mimetype },
  { punctuate: true, language: 'en-US' },
)
.then((transcription) => {
  console.dir(transcription, {depth: null});
})
.catch((err) => {
  console.error(err);
});

The next step is to log in to your Deepgram account, navigate to your Dashboard, and choose Get a Transcript via API or SDK.

Click Reveal Key and copy your API KEY SECRET.

Next, paste your API KEY SECRET into line 5 of index.js, replacing the YOUR_API_KEY placeholder.
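
If you would rather not keep the secret inside the source file, one common alternative is to read it from an environment variable. The sketch below is an assumption on my part, not part of the Deepgram instructions: the variable name DEEPGRAM_API_KEY and the helper getApiKey are hypothetical.

```javascript
// Hypothetical helper: read the API key from an environment variable.
// The variable name DEEPGRAM_API_KEY is an assumption made for this sketch.
function getApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('DEEPGRAM_API_KEY is not set');
  }
  return key.trim();
}

// Demonstration with a stand-in environment object (not a real key)
console.log(getApiKey({ DEEPGRAM_API_KEY: 'example-key' })); // prints "example-key"
```

In index.js, line 5 could then read const deepgramApiKey = getApiKey(); after you export the variable in your terminal.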

Then replace lines 8 and 9 with the path to your audio file and its MIME type.
(Hint: open a new terminal, navigate to the directory where your audio file is located, and run pwd to get the absolute path.)
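
The path and MIME type can also be worked out in code. This is a minimal sketch under my own assumptions: the helper guessMimetype and its lookup table are hypothetical and cover only a few common audio formats, so extend the table for your own files.

```javascript
const path = require('path');

// Small lookup table; only a few common formats are listed here
const MIME_BY_EXTENSION = {
  '.wav': 'audio/wav',
  '.mp3': 'audio/mpeg',
  '.flac': 'audio/flac',
  '.ogg': 'audio/ogg',
  '.m4a': 'audio/mp4',
};

// Hypothetical helper: choose a MIME type from the file extension
function guessMimetype(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  const mimetype = MIME_BY_EXTENSION[ext];
  if (!mimetype) {
    throw new Error(`Unknown audio extension: ${ext || '(none)'}`);
  }
  return mimetype;
}

// path.resolve turns a relative path into an absolute one,
// much like combining the output of pwd with the file name
const pathToFile = path.resolve('SOME_FILE.wav');
console.log(pathToFile);
console.log(guessMimetype(pathToFile)); // prints "audio/wav"
```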

Finally, run the application with the following command (make sure you are in /sttApp):

node index.js

You’ll receive a JSON response that includes the transcript along with word arrays, timings, and confidence scores.
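
To pick just the text out of that response, a small helper can be handy. This is a minimal sketch that assumes the transcript sits under results.channels[0].alternatives[0], which is how the pre-recorded response is typically laid out. The helper names extractTranscript and formatWords and the sample object are made up for illustration, so compare them against your own output.

```javascript
// Hypothetical helper: return the first alternative of the first channel, if any.
// Assumes the shape results.channels[0].alternatives[0]; check your own output.
function firstAlternative(response) {
  return response?.results?.channels?.[0]?.alternatives?.[0];
}

// Return the plain transcript text, or an empty string when it is missing
function extractTranscript(response) {
  const alternative = firstAlternative(response);
  return alternative ? alternative.transcript : '';
}

// Return one line per word with its start time, end time, and confidence
function formatWords(response) {
  const words = firstAlternative(response)?.words ?? [];
  return words.map(
    (w) => `${w.start.toFixed(2)}s-${w.end.toFixed(2)}s ${w.word} (${w.confidence.toFixed(2)})`
  );
}

// Made-up sample data, for illustration only
const sample = {
  results: {
    channels: [
      {
        alternatives: [
          {
            transcript: 'Hello world.',
            confidence: 0.98,
            words: [
              { word: 'hello', start: 0.1, end: 0.5, confidence: 0.99 },
              { word: 'world', start: 0.6, end: 1.0, confidence: 0.97 },
            ],
          },
        ],
      },
    ],
  },
};

console.log(extractTranscript(sample));
formatWords(sample).forEach((line) => console.log(line));
```

In index.js, the same helpers could be called inside the .then callback in place of console.dir.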

Pretty COOL!

If anything above is still unclear, feel free to leave a message below or refer to my git repository for the whole project: linkToGit

References

https://console.deepgram.com/project/850abca5-449a-47fa-8c40-6a463e59ad00/mission/transcript-via-api-or-sdk
https://dev.to/devteam/join-us-for-a-new-kind-of-hackathon-on-dev-brought-to-you-by-deepgram-2bjd

Overview of My Submission

A tutorial for beginners learning Node.js with the STT API from Deepgram.

Submission Category:

Analytics Ambassadors

Link to Code on GitHub

linkToGit

Additional Resources / Info

