README.md

ZEN

ZEN is a BERT-based Chinese (Z) text encoder Enhanced by N-gram representations, where different combinations of characters are considered during training. Potential word and phrase boundaries are explicitly pre-trained and fine-tuned together with the character encoder (BERT), so ZEN combines information from the character sequence with information from the words and phrases it contains. The structure of ZEN is illustrated in the figure below.
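To make "different combinations of characters" concrete, here is a minimal sketch of how candidate n-grams might be matched against a character sequence and recorded in a character-to-n-gram matrix. It is an illustration only, not the code used in this repository: the function names `match_ngrams` and `matching_matrix`, the toy lexicon, and the `max_n` limit are all assumptions made for the example.

```python
# Illustrative sketch only: names, lexicon and max_n are hypothetical,
# not taken from the ZEN code base.

def match_ngrams(chars, lexicon, max_n=4):
    """Return (start, end, ngram) for every lexicon n-gram found in chars."""
    matches = []
    for n in range(2, max_n + 1):
        for start in range(len(chars) - n + 1):
            ngram = "".join(chars[start:start + n])
            if ngram in lexicon:
                matches.append((start, start + n, ngram))
    return matches


def matching_matrix(num_chars, matches):
    """matrix[i][j] is 1 when character i lies inside matched n-gram j."""
    matrix = [[0] * len(matches) for _ in range(num_chars)]
    for j, (start, end, _) in enumerate(matches):
        for i in range(start, end):
            matrix[i][j] = 1
    return matrix


sentence = list("粤港澳大湾区")
toy_lexicon = {"粤港澳", "港澳", "大湾区", "湾区"}  # hypothetical n-gram lexicon

matches = match_ngrams(sentence, toy_lexicon)
matrix = matching_matrix(len(sentence), matches)

for start, end, ngram in matches:
    print(start, end, ngram)
```

Each column of `matrix` marks which characters a matched n-gram covers, which is one simple way to let an n-gram representation be associated with the characters it spans.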

Citation

If you use or extend our work, please cite the following paper:

@article{Sinovation2019ZEN,
  title="{ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations}",
  author={Shizhe Diao and Jiaxin Bai and Yan Song and Tong Zhang and Yonggang Wang},
  journal={ArXiv},
  year={2019},
  volume={abs/1911.00720}
}

Quick tour of pre-training and fine-tuning with ZEN

The library comprises several example scripts for conducting Chinese NLP tasks:

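run_pre_train.py: an example of pre-training ZEN
run_sequence_level_classification.py: an example of fine-tuning ZEN on sequence-level classification tasks: document classification (DC), sentiment analysis (SA), sentence pair matching (SPM) and natural language inference (NLI)
run_token_level_classification.py: an example of fine-tuning ZEN on token-level classification tasks: Chinese word segmentation (CWS), part-of-speech (POS) tagging and named entity recognition (NER)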
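The difference between the two fine-tuning scripts comes down to the shape of the labels. The sketch below shows that contrast with plain Python data. The field names, the sentiment label and the BIO tags are hypothetical and do not reflect the input format the scripts actually expect.

```python
# Illustrative sketch only: field names and label values are hypothetical,
# not the data format used by the ZEN scripts.

sentence = "我爱北京"

# Sequence-level classification: one label for the whole input.
sequence_example = {"text": sentence, "label": "positive"}

# Token-level classification: one label per character (BIO tags for NER here).
token_example = {
    "chars": list(sentence),
    "labels": ["O", "O", "B-LOC", "I-LOC"],
}


def is_aligned(example):
    """Token-level examples need exactly one label per character."""
    return len(example["chars"]) == len(example["labels"])


print(is_aligned(token_example))
```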

Examples of pre-training and fine-tuning with ZEN.

Contact information

For help or issues using ZEN, please submit a GitHub issue.

For personal communication related to ZEN, please contact chenguimin([email protected]).
