
Training Text Classification Models with Watson NLP

source link: http://heidloff.net/article/training-text-classification-models-with-watson-nlp/


This post is part of a mini series which describes an AI text classification sample end to end using Watson NLP (Natural Language Understanding) and Watson Studio. This part explains how to train custom models with Watson NLP and Watson Studio.

IBM Watson NLP (Natural Language Understanding) containers can be run locally, on premises, or on Kubernetes and OpenShift clusters. Via REST and gRPC APIs, AI can easily be embedded in applications. IBM Watson Studio is used for training the model.

The code is available as open source in the text-classification-watson-nlp repo.

Here are all parts of the series:

Overview

The training is done via IBM Watson Studio. IBM partners can get a test environment on IBM TechZone.

For the training, an ensemble model is used which combines three classification models. It computes the weighted mean of the classification predictions using confidence scores. It depends on the syntax model and the GloVe and USE embeddings.

  • CNN with GloVe features
  • SVM with TF-IDF features
  • SVM with USE (Universal Sentence Encoder) features
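The weighted-mean voting described above can be sketched in plain Python. The class names, confidence values, and equal weights below are made-up illustrations, not Watson NLP internals:

```python
def ensemble_confidence(predictions, weights):
    """Weighted mean of per-class confidence scores across several models.

    predictions: list of dicts mapping class name -> confidence (one per model)
    weights:     list of floats, one weight per model
    """
    total = sum(weights)
    classes = set().union(*predictions)
    return {
        c: sum(w * p.get(c, 0.0) for p, w in zip(predictions, weights)) / total
        for c in classes
    }

# Three hypothetical member models voting on two classes:
preds = [
    {"Heating": 0.7, "Cooling": 0.3},   # e.g. CNN
    {"Heating": 0.6, "Cooling": 0.4},   # e.g. SVM with TF-IDF
    {"Heating": 0.9, "Cooling": 0.1},   # e.g. SVM with USE
]
combined = ensemble_confidence(preds, weights=[1.0, 1.0, 1.0])
best = max(combined, key=combined.get)  # class with highest mean confidence
```

With equal weights this reduces to a simple average per class; unequal weights let stronger member models dominate the vote.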

You can find more details about how to train Watson NLP models in the Cloud Pak for Data documentation.

Architecture:

step3.jpeg

Walk-through

Let’s take a look at Watson Studio and the notebook. See the screenshots for a detailed walk-through.

Choose the runtime template ‘DO + NLP Runtime 22.1 on Python 3.9 (4 vCPU and 16 GB RAM)’ and upload the notebook from the repo.

training04.png

Before you run the notebook, create a project token with editor access so that the notebook can access project artifacts. Delete the first cell in the notebook and add the token to the notebook by clicking on the icon at the top with the three dots.

training02.png

The ‘slightly modified’ CSV from the previous step is available on Box and is downloaded automatically by the notebook.

training09.png

The data is split into training and testing sets.
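A split like the one in the notebook can be sketched with pandas. The 80/20 ratio, column names, and toy data here are illustrative assumptions:

```python
import pandas as pd

# Toy data standing in for the downloaded CSV; 'text'/'label' columns assumed.
df = pd.DataFrame({
    "text": [f"customer complaint {i}" for i in range(10)],
    "label": [f"product {i % 2}" for i in range(10)],
})

# Random 80/20 split: sample the training rows, keep the rest for testing.
train_df = df.sample(frac=0.8, random_state=42)
test_df = df.drop(train_df.index)
```

Fixing `random_state` keeps the split reproducible across notebook runs.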

training10.png

The data is modified to match the expected input of Watson NLP.
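As the `DataStreamResolver` call below expects records with a `text` string and a `labels` list, the conversion can be sketched as follows. The column names and sample rows are assumptions about the CSV from the previous step:

```python
import json
import pandas as pd

# Toy rows standing in for the downloaded CSV ('text'/'label' columns assumed).
df = pd.DataFrame({
    "text": ["the heater is broken", "the fan is too loud"],
    "label": ["Heating", "Cooling"],
})

# Each training record carries the text plus its label(s) wrapped in a list,
# matching expected_keys={'text': str, 'labels': list}.
train_records = [
    {"text": row["text"], "labels": [row["label"]]}
    for _, row in df.iterrows()
]
train_json = json.dumps(train_records)
```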

training11.png

Here are snippets from the notebook showing how to run the training.

# Load the pretrained syntax model and the USE embeddings
syntax_model = watson_nlp.load(watson_nlp.download('syntax_izumo_en_stock'))
use_model = watson_nlp.load(watson_nlp.download('embedding_use_en_stock'))

# Stream the training data: each record has a 'text' string and a 'labels' list
training_data_file = train_file
data_stream_resolver = DataStreamResolver(target_stream_type=list, expected_keys={'text': str, 'labels': list})
training_data = data_stream_resolver.as_data_stream(training_data_file)
text_stream, labels_stream = training_data[0], training_data[1]

# Run syntax analysis and compute USE embeddings over the text stream
syntax_stream = syntax_model.stream(text_stream)
use_train_stream = use_model.stream(syntax_stream, doc_embed_style='raw_text')
use_svm_train_stream = watson_nlp.data_model.DataStream.zip(use_train_stream, labels_stream)

# Train the ensemble model on top of the stock syntax model and embeddings
stopwords = watson_nlp.download_and_load('text_stopwords_classification_ensemble_en_stock')
ensemble_model = Ensemble.train(train_file, 'syntax_izumo_en_stock', 'embedding_glove_en_stock', 'embedding_use_en_stock', stopwords=stopwords)

# Classify a single text with the trained ensemble
def predict_product(text):
    ensemble_preds = ensemble_model.run(text)
    predicted_ensemble = ensemble_preds.to_dict()["classes"][0]["class_name"]
    # The ensemble prediction fills both result columns in this snippet
    return (predicted_ensemble, predicted_ensemble)

# Predict on the test set and merge the predictions with the expected labels
predictions = test_orig_df[text_col].apply(lambda text: predict_product(text))
predictions_df = pd.DataFrame.from_records(predictions, columns=('Predicted SVM', 'Predicted Ensemble'))
result_df = test_orig_df[[text_col, "label"]].merge(predictions_df, how='left', left_index=True, right_index=True)
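Once predictions and expected labels sit side by side in a frame like `result_df`, a quick accuracy check can be sketched as below. The column names follow the snippet above; the rows are made-up examples:

```python
import pandas as pd

# Made-up merged results; 'label' holds the ground truth, 'Predicted Ensemble'
# the model output, mirroring the columns built in the notebook snippet.
result_df = pd.DataFrame({
    "label": ["Heating", "Cooling", "Heating", "Cooling"],
    "Predicted Ensemble": ["Heating", "Cooling", "Cooling", "Cooling"],
})

# Fraction of rows where the prediction matches the expected label.
accuracy = (result_df["label"] == result_df["Predicted Ensemble"]).mean()
```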

The results are displayed after the training and testing.

training13.png

What’s next?

Check out my blog for more posts related to this sample or get the code from GitHub. To learn more about IBM Watson NLP and Embeddable AI, check out these resources.
