GitHub - pyannote/pyannote-audio: Neural building blocks for speaker diarization...

 5 years ago
source link: https://github.com/pyannote/pyannote-audio

Announcement

Open PhD/postdoc positions at LIMSI combining machine learning, NLP, speech processing, and computer vision.

pyannote-audio

Neural building blocks for speaker diarization

Installation

$ conda create --name pyannote python=3.6 anaconda
$ source activate pyannote
$ conda install -c conda-forge yaafe
$ conda install cmake
$ pip install -U pip setuptools
$ pip install --process-dependency-links pyannote.audio

Citation

If you use pyannote.audio in your research, please use the following citations.

  • Speech activity and speaker change detection
    @inproceedings{Yin2017,
      Author = {Ruiqing Yin and Herv\'e Bredin and Claude Barras},
      Title = {{Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks}},
      Booktitle = {{18th Annual Conference of the International Speech Communication Association, Interspeech 2017}},
      Year = {2017},
      Month = {August},
      Address = {Stockholm, Sweden},
      Url = {https://github.com/yinruiqing/change_detection}
    }
    
  • Speaker embedding
    @inproceedings{Bredin2017,
        author = {Herv\'{e} Bredin},
        title = {{TristouNet: Triplet Loss for Speaker Turn Embedding}},
        booktitle = {42nd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017},
        year = {2017},
        url = {http://arxiv.org/abs/1609.04301},
    }
    
  • Speaker diarization pipeline
    @inproceedings{Yin2018,
      Author = {Ruiqing Yin and Herv\'e Bredin and Claude Barras},
      Title = {{Neural Speech Turn Segmentation and Affinity Propagation for Speaker Diarization}},
      Booktitle = {{19th Annual Conference of the International Speech Communication Association, Interspeech 2018}},
      Year = {2018},
      Month = {September},
      Address = {Hyderabad, India},
    }
    

Tutorials

Documentation

The API is unfortunately not documented yet.
