
Multi-Task Deep Neural Networks for Natural Language Understanding

This PyTorch package implements the Multi-Task Deep Neural Networks (MT-DNN) for Natural Language Understanding, as described in:

Xiaodong Liu*, Pengcheng He*, Weizhu Chen and Jianfeng Gao
Multi-Task Deep Neural Networks for Natural Language Understanding
arXiv version
*: Equal contribution
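
At a glance, MT-DNN shares its lower layers (a BERT encoder) across all tasks and adds a small task-specific output layer per task. The following is a minimal conceptual sketch of that structure in PyTorch; the class and argument names are illustrative, not this package's actual API.

    import torch.nn as nn

    class MultiTaskModel(nn.Module):
        """Shared encoder (e.g., BERT) plus one output head per task."""
        def __init__(self, encoder, hidden_size, task_num_labels):
            super().__init__()
            self.encoder = encoder  # shared across all tasks
            # one linear head per task, keyed by task name
            self.heads = nn.ModuleDict({
                task: nn.Linear(hidden_size, n)
                for task, n in task_num_labels.items()
            })

        def forward(self, input_ids, task):
            # encode once with the shared layers, then apply the task head
            hidden = self.encoder(input_ids)   # [batch, hidden_size]
            return self.heads[task](hidden)    # [batch, num_labels]

    # Training alternates mini-batches across tasks, so the shared layers
    # see every task while each head sees only its own task's data.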

Quickstart

Setup Environment

Install via pip:

  1. Install Python 3.6

  2. Install the requirements
    > pip install -r requirements.txt
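
For isolation, you may want to install into a fresh Python 3.6 virtual environment first (a minimal sketch using the standard venv module):
    > python3.6 -m venv mt-dnn-env
    > source mt-dnn-env/bin/activate
    > pip install -r requirements.txt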

Use docker:

  1. Pull the Docker image
    > docker pull allenlao/pytorch-mt-dnn:v0.1

  2. Run the container
    > docker run -it --rm --runtime nvidia allenlao/pytorch-mt-dnn:v0.1 bash
    If you are new to Docker, please refer to the documentation first: https://docs.docker.com/
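
If you want your local clone of the repository (code and data) visible inside the container, you can additionally mount it as a volume; the /workspace mount point below is an assumption, not something the image necessarily defines:
    > docker run -it --rm --runtime nvidia -v $(pwd):/workspace allenlao/pytorch-mt-dnn:v0.1 bash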

Train a toy MT-DNN model

  1. Download the data
    > sh download.sh
    For details on the GLUE benchmark and its datasets, please refer to: https://gluebenchmark.com/

  2. Preprocess the data
    > python prepro.py

  3. Train
    > python train.py

Note that we ran experiments on 4 V100 GPUs for the base MT-DNN models. You may need to reduce the batch size on GPUs with less memory (a sketch follows).
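
If you hit out-of-memory errors, reducing the batch size is the first thing to try. The flag below is an assumption about the training script's interface; check python train.py --help for the exact option names:
    > python train.py --batch_size 8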

Reproducing the GLUE Results

  1. MTL refinement: refine the MT-DNN shared layers, initialized with the pre-trained BERT model, via MTL on all GLUE tasks excluding WNLI to learn a new shared representation.
    Note that we ran this experiment on 8 V100 GPUs (32G) with a batch size of 32.

    • Preprocess the GLUE data via the aforementioned script
    • Training:
      > scripts/run_mt_dnn.sh
  2. Finetuning: finetune MT-DNN on each of the GLUE tasks to get task-specific models.
    Here, we provide two examples, STS-B and RTE. You can use similar scripts to finetune on all of the GLUE tasks.

    • Finetune on the STS-B task
      > scripts/run_stsb.sh
      You should get about 90.5/90.4 on the STS-B dev set in terms of Pearson/Spearman correlation (see the evaluation sketch after this list).
    • Finetune on the RTE task
      > scripts/run_rte.sh
      You should get about 83.8 accuracy on the RTE dev set.
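
The STS-B numbers above are Pearson and Spearman correlations between the model's predicted similarity scores and the gold scores. A minimal sketch of computing them with SciPy; the prediction and label arrays are placeholders:

    from scipy.stats import pearsonr, spearmanr

    predictions = [4.2, 1.1, 3.7, 0.5]  # placeholder: model outputs on STS-B dev
    gold_scores = [4.0, 1.3, 3.9, 0.2]  # placeholder: gold similarity labels

    pearson, _ = pearsonr(predictions, gold_scores)    # (statistic, p-value)
    spearman, _ = spearmanr(predictions, gold_scores)
    print("Pearson: %.4f, Spearman: %.4f" % (pearson, spearman))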

Reproducing the SciTail & SNLI Results (Domain Adaptation)

  1. Domain adaptation on SciTail
    > scripts/scitail_domain_adaptation_bash.sh

  2. Domain adaptation on SNLI
    > scripts/snli_domain_adaptation_bash.sh
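
In the paper, domain adaptation is evaluated by finetuning the shared MT-DNN representation on increasing fractions of the target task's training data (0.1%, 1%, 10%, and 100%). A minimal sketch of producing such random subsamples; the input file path is hypothetical, and the scripts above handle this for you:

    import random

    def subsample(rows, fraction, seed=0):
        # keep a random fraction of the training rows, at least one
        rng = random.Random(seed)
        k = max(1, int(len(rows) * fraction))
        return rng.sample(rows, k)

    with open("data/scitail_train.tsv") as f:  # hypothetical path
        rows = f.readlines()

    for frac in (0.001, 0.01, 0.1, 1.0):
        with open("data/scitail_train_%g.tsv" % frac, "w") as out:
            out.writelines(subsample(rows, frac))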

Notes and Acknowledgments

The BERT PyTorch implementation is from: https://github.com/huggingface/pytorch-pretrained-BERT
BERT: https://github.com/google-research/bert
We also used some code from: https://github.com/kevinduh/san_mrc

How do I cite MT-DNN?

For now, please cite the arXiv version:

@article{liu2019mt-dnn,
  title={Multi-Task Deep Neural Networks for Natural Language Understanding},
  author={Liu, Xiaodong and He, Pengcheng and Chen, Weizhu and Gao, Jianfeng},
  journal={arXiv preprint arXiv:1901.11504},
  year={2019}
}

A new version of the paper will be shared later.

Typo: there is no activation function in Equation 2.

Contact Information

For help or issues using MT-DNN, please submit a GitHub issue.

For personal communication related to MT-DNN, please contact Xiaodong Liu ([email protected]), Pengcheng He ([email protected]), Weizhu Chen ([email protected]) or Jianfeng Gao ([email protected]).

