README.md

PyTorch implementation of OpenAI's Finetuned Transformer Language Model

This is a PyTorch implementation of the TensorFlow code provided with OpenAI's paper "Improving Language Understanding by Generative Pre-Training" by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.

This implementation includes a script that loads the weights pre-trained by the authors with the TensorFlow implementation into the PyTorch model.

The model classes and loading script are located in model_py.py.

The module names in the PyTorch model follow the variable names in the TensorFlow implementation. This implementation tries to follow the original code as closely as possible to minimize discrepancies.

This implementation therefore also includes the modified Adam optimization algorithm used in OpenAI's paper, with:

fixed weight decay following the work of Loshchilov et al., and
a scheduled learning rate, as commonly used for Transformers.
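
The two modifications can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's optimizer: the function names `warmup_linear` and `adam_step`, the default hyper-parameters, and the scalar (single-parameter) form are hypothetical, and the schedule is a generic linear warmup multiplied by a linear decay.

```python
import math


def warmup_linear(progress, warmup=0.002):
    """Learning-rate multiplier: linear warmup times linear decay to zero.

    `progress` is the fraction of training completed, in [0, 1].
    The default `warmup` fraction is an assumption for illustration.
    """
    return min(progress / warmup, 1.0) * (1.0 - progress)


def adam_step(p, grad, m, v, t, lr, b1=0.9, b2=0.999, eps=1e-8, weight_decay=0.01):
    """One Adam update on a scalar parameter with decoupled weight decay.

    `t` is the 1-based step count, used for bias correction.
    """
    m = b1 * m + (1 - b1) * grad           # first-moment estimate
    v = b2 * v + (1 - b2) * grad * grad    # second-moment estimate
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # The decay term acts on the weight directly instead of being
    # folded into the gradient (the "fixed" weight decay idea).
    p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * p)
    return p, m, v
```

In a training loop, the effective learning rate at each step would be the base rate multiplied by `warmup_linear(step / total_steps)`. Because the decay is decoupled, a weight still shrinks slightly on a step where its gradient is zero.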

Requirements

To use the model itself by importing model_py.py, you only need:

PyTorch (version >=0.4)

To run the classifier training script in train.py, you will additionally need:

tqdm
sklearn
spacy
ftfy
pandas

You can download OpenAI's pre-trained weights by cloning Alec Radford's repo and placing its model folder, which contains the pre-trained weights, in this repo.

Using the pre-trained model as a Transformer Language Model

The model can be used as a transformer language model with OpenAI's pre-trained weights as follows:

from model_py import Model, load_openai_pretrained_model, DEFAULT_CONFIG

args = DEFAULT_CONFIG
vocab = 40000 # Size of your vocabulary
model = Model(vocab, args)
load_openai_pretrained_model(model)

This model outputs the Transformer's hidden states. You can use the LMHead class in model_py.py to add a decoder whose weights are tied to those of the encoder and get a full language model. You can also use the ClfHead class in model_py.py to add a classifier on top of the transformer and get a classifier as described in OpenAI's publication. See the __main__ block of train.py for an example of both.

To use the positional encoder of the transformer, you should encode your dataset using the encode_dataset() function of utils.py. Please refer to the beginning of the __main__ block in train.py to see how to properly define the vocabulary and encode your dataset.
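
As a rough illustration of what such an encoding can look like, the sketch below pairs each token id with a position id. It is not the code of encode_dataset(): the helper name `add_position_ids` is hypothetical, and the layout (position ids drawn from the same index space as the tokens, starting right after the vocabulary and special tokens, with 0 as padding) is an assumption that should be checked against train.py.

```python
def add_position_ids(token_ids, n_vocab, n_special, n_ctx):
    """Pair each token id with a position id, padding to n_ctx entries.

    Assumption: positions are indexed after the vocabulary and the
    special tokens, so position i maps to n_vocab + n_special + i.
    """
    if len(token_ids) > n_ctx:
        raise ValueError("sequence is longer than the context size")
    first_position = n_vocab + n_special
    # Pad with 0 so every sequence has exactly n_ctx entries.
    padded = list(token_ids) + [0] * (n_ctx - len(token_ids))
    return [(tok, first_position + i) for i, tok in enumerate(padded)]
```

With a shared index space like this, the size passed to the model would have to cover tokens, special tokens, and positions together, which is worth confirming in the training script before relying on it.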

Fine-tuning the pre-trained model on a classification task

This model can also be integrated into a classifier as detailed in OpenAI's paper. An example of fine-tuning on the ROCStories Cloze task is included with the training code in train.py.

The ROCStories dataset can be downloaded from the associated website.
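
In the Cloze task, each example is a story context with two candidate endings, and the classifier picks the correct one. Following the input transformation described in the paper (start token, context, delimiter, answer, extract token), each example becomes two token sequences. The sketch below assumes already-tokenized inputs; the helper name `build_cloze_inputs` and its parameters are hypothetical and are not taken from train.py.

```python
def build_cloze_inputs(story, ending1, ending2, start, delimiter, classify, max_len):
    """Build one input sequence per candidate ending.

    `story`, `ending1`, and `ending2` are lists of token ids; `start`,
    `delimiter`, and `classify` are the ids of the special tokens.
    Story and endings are each truncated to `max_len` tokens.
    """
    def one_sequence(ending):
        return [start] + story[:max_len] + [delimiter] + ending[:max_len] + [classify]

    return one_sequence(ending1), one_sequence(ending2)
```

The classifier head would then read the hidden state at the position of the final `classify` token in each sequence and score the two candidates against each other.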

As with the TensorFlow code, this code implements the ROCStories Cloze Test experiment reported in the paper. The result can be reproduced by running:

python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path to data here]

Accuracy on the ROCStories test set

Fine-tuning the PyTorch model for 3 epochs on ROCStories takes 10 minutes on a single NVIDIA K80.

The test accuracy of this PyTorch version (with the default TensorFlow hyper-parameters) is 83.43%.

The authors report a median accuracy of 85.8% over 10 runs with the TensorFlow code. The paper reports a best accuracy of 86.5%.

As noted by the authors, the code can be non-deterministic due to various GPU ops.

TO-DO list

Add Multi-GPU training logic

GitHub - huggingface/pytorch-openai-transformer-lm: A PyTorch implementation of...
