
GPT2-Pytorch with Text-Generator

Better Language Models and Their Implications

Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper. — from the OpenAI blog

This repository is a simple implementation of the GPT-2 text generator in PyTorch, with compact code.

Quick Start

  1. Download the GPT-2 pre-trained model in PyTorch, which huggingface/pytorch-pretrained-BERT has already converted. (Thanks for sharing! It solved my problem of transferring the TensorFlow checkpoint (ckpt) files to a PyTorch model!) A quick sanity check of the download is sketched just after this list.
$ git clone https://github.com/graykode/gpt-2-Pytorch && cd gpt-2-Pytorch
# download huggingface's pytorch model 
$ curl --output gpt2-pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin
# setup requirements
$ pip install -r requirements.txt
  2. Now you can run it like this:
$ python main.py --text "Once when I was six years old I saw a magnificent picture in a book, called True Stories from Nature, about the primeval forest."
  3. You can also quick-start in Google Colab.
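
As a quick sanity check (not part of the original README), assuming the curl step above succeeded, you can verify that gpt2-pytorch_model.bin is a loadable PyTorch state dict:

import torch

# Load the downloaded weights onto the CPU; expect a dict of name -> tensor.
state_dict = torch.load("gpt2-pytorch_model.bin", map_location="cpu")
print(len(state_dict), "parameter tensors")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))  # embedding and transformer weights

If this fails, the download is likely incomplete or corrupted; re-run the curl command.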

Option

  • --text : the sentence to begin generation with.
  • --quiet : suppress extraneous output such as the "================" separators.
  • --nsamples : number of samples drawn per batch when multinomial sampling is used.
  • --unconditional : if true, generate unconditionally (ignore the prompt).
  • --batch_size : batch size.
  • --length : length of the generated sequence (must be less than the context size).
  • --temperature : sampling temperature applied to the output distribution (default 0.7).
  • --top_k : sample only from the k largest logits at each step (default 40).

See a more detailed explanation of the temperature and top_k options here.
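
As a rough illustration (a minimal sketch, not the repo's actual API; the function and variable names here are made up), this is what --temperature and --top_k do to the model's logits at each generation step:

import torch
import torch.nn.functional as F

def sample_next_token(logits, temperature=0.7, top_k=40):
    """Sample one token id from a 1-D logits vector of size vocab_size."""
    logits = logits / temperature  # <1 sharpens, >1 flattens the distribution
    if top_k > 0:
        # Mask out everything below the k-th largest logit.
        kth_largest = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth_largest, float("-inf"))
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)  # the multinomial draw behind --nsamples

# Random logits standing in for a real model output (50257 = GPT-2 vocab size).
print(int(sample_next_token(torch.randn(50257))))

With the defaults above, sampling is restricted to the 40 most likely tokens, and temperature 0.7 tilts the distribution toward the most likely ones.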

Dependencies

  • PyTorch 0.4.1+
  • regex 2017.4.5

Author

Tae Hwan Jung (@graykode)

License

OpenAI/GPT-2 follows the MIT license, and huggingface/pytorch-pretrained-BERT is under the Apache license. This repository follows the MIT license, as the original GPT-2 repository does.

Acknowledgement

Jeff Wu (@WuTheFWasThat) and Thomas Wolf (@thomwolf), for allowing their code to be referenced.

