GitHub - PetrochukM/PyTorch-NLP: Text utilities and datasets for PyTorch
source link: https://github.com/PetrochukM/PyTorch-NLP
PyTorch-NLP
PyTorch-NLP is a library for Natural Language Processing (NLP) in Python. It's built with the very latest research in mind, and was designed from day one to support rapid prototyping. PyTorch-NLP comes with pre-trained embeddings, samplers, dataset loaders, metrics, neural network modules and text encoders. It's open-source software, released under the BSD3 license.
Documentation
The complete documentation for PyTorch-NLP is available via our ReadTheDocs website.
Installation
Make sure you have Python 3.5+ and PyTorch 0.2.0 or newer. You can then install pytorch-nlp using pip:
pip install pytorch-nlp
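Before installing, it can help to confirm that the environment meets the stated minimums (Python 3.5+ and PyTorch 0.2.0). The following is a minimal sketch, not part of PyTorch-NLP itself; the helper names version_tuple and meets_minimum are hypothetical, and the parsing assumes version strings begin with dot-separated integers, as PyTorch version strings such as "1.13.0+cu117" do.

```python
import re
import sys


def version_tuple(version):
    """Return the leading numeric components, e.g. '1.13.0+cu117' -> (1, 13, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric version found in {!r}".format(version))
    return tuple(int(part) for part in match.group(0).split("."))


def _pad(parts, width):
    """Right-pad a version tuple with zeros so (0, 2) compares equal to (0, 2, 0)."""
    return parts + (0,) * (width - len(parts))


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (both given as strings)."""
    found, required = version_tuple(version), version_tuple(minimum)
    width = max(len(found), len(required))
    return _pad(found, width) >= _pad(required, width)


if __name__ == "__main__":
    # Minimum versions taken from the installation notes above.
    print("Python OK:", sys.version_info[:2] >= (3, 5))
    try:
        import torch
        print("PyTorch OK:", meets_minimum(str(torch.__version__), "0.2.0"))
    except ImportError:
        print("PyTorch is not installed")
```

Suffixes such as ".post2" or "+cu117" are ignored by this sketch, so it compares release numbers only.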
Optional requirements
If you want to use the English tokenizer from SpaCy (http://spacy.io/), you need to install SpaCy and download its English model:
pip install spacy
python -m spacy download en_core_web_sm
Alternatively, you might want to use the Moses tokenizer from NLTK (http://nltk.org/). You have to install NLTK and download the data it needs:
pip install nltk
python -m nltk.downloader perluniprops nonbreaking_prefixes
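Because both tokenizer backends are optional, a script may want to check which of them is importable before choosing one. The sketch below uses only the standard library; the function name available_backends is hypothetical and not part of PyTorch-NLP. It reports whether the packages are installed, not whether the SpaCy model or the NLTK data has been downloaded.

```python
import importlib.util


def available_backends(candidates=("spacy", "nltk")):
    """Map each top-level package name to True if it can be imported."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in candidates
    }


if __name__ == "__main__":
    for name, found in available_backends().items():
        print("{}: {}".format(name, "installed" if found else "missing"))
```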
Contributing
We've released PyTorch-NLP because we found a lack of basic toolkits for NLP in PyTorch. We hope that other organizations can benefit from the project. We are thankful for any contributions from the community.
Contributing Guide
Read our contributing guide to learn about our development process, how to propose bugfixes and improvements, and how to build and test your changes to PyTorch-NLP.
License
PyTorch-NLP is BSD3 licensed.