GitHub - chakki-works/sumeval: Well tested & Multi-language evaluation frame...

Well tested & Multi-language
evaluation framework for Text Summarization.

Well tested
- The ROUGE-X scores are tested against the original Perl script (ROUGE-1.5.5.pl).
- The BLEU score is calculated by SacréBLEU, which produces the same values as the official script (mteval-v13a.pl) used by WMT.
Multi-language
- Japanese is supported in addition to English. Support for other languages is easy to add.

Of course, the implementation is pure Python!

How to use

from sumeval.metrics.rouge import RougeCalculator


rouge = RougeCalculator(stopwords=True, lang="en")

rouge_1 = rouge.rouge_n(
            summary="I went to the Mars from my living town.",
            references="I went to Mars",
            n=1)

rouge_2 = rouge.rouge_n(
            summary="I went to the Mars from my living town.",
            references=["I went to Mars", "It's my living town"],
            n=2)

rouge_l = rouge.rouge_l(
            summary="I went to the Mars from my living town.",
            references=["I went to Mars", "It's my living town"])

# You need spaCy to calculate ROUGE-BE

rouge_be = rouge.rouge_be(
            summary="I went to the Mars from my living town.",
            references=["I went to Mars", "It's my living town"])

print("ROUGE-1: {}, ROUGE-2: {}, ROUGE-L: {}, ROUGE-BE: {}".format(
    rouge_1, rouge_2, rouge_l, rouge_be
).replace(", ", "\n"))

from sumeval.metrics.bleu import BLEUCalculator


bleu = BLEUCalculator()
score = bleu.bleu("I am waiting on the beach",
                  "He is walking on the beach")

bleu_ja = BLEUCalculator(lang="ja")
score_ja = bleu_ja.bleu("私はビーチで待ってる", "彼がベンチで待ってる")
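
To make the ROUGE-N numbers above less opaque, here is a minimal sketch of the underlying idea: count the n-grams shared by the summary and one reference, then combine precision and recall with an alpha-weighted F-measure. This is not sumeval's implementation. The function names (`ngrams`, `rouge_n_sketch`) are made up for illustration, the tokenizer is a plain whitespace split with no stopword removal or stemming, and the exact F-measure form is an assumption, so the values will generally differ from what RougeCalculator returns.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_sketch(summary, reference, n=1, alpha=0.5):
    """Illustrative ROUGE-N: clipped n-gram overlap with a weighted F-measure.

    Assumption: whitespace tokenization, lowercasing, a single reference.
    """
    summary_grams = ngrams(summary.lower().split(), n)
    reference_grams = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram.
    matches = sum((summary_grams & reference_grams).values())
    summary_total = sum(summary_grams.values())
    reference_total = sum(reference_grams.values())

    precision = matches / summary_total if summary_total else 0.0
    recall = matches / reference_total if reference_total else 0.0

    # With alpha = 0.5 this reduces to the usual F1 score.
    denom = (1.0 - alpha) * precision + alpha * recall
    return precision * recall / denom if denom else 0.0


summary = "I went to the Mars from my living town."
reference = "I went to Mars"

print("ROUGE-1 sketch: {:.3f}".format(rouge_n_sketch(summary, reference, n=1)))  # 0.615
print("ROUGE-2 sketch: {:.3f}".format(rouge_n_sketch(summary, reference, n=2)))  # 0.364
```

In this example all 4 reference unigrams appear among the 9 summary tokens, so recall is 1 and precision is 4/9, which gives 8/13. For bigrams only 2 of the 3 reference bigrams are matched against 8 summary bigrams, which gives 4/11.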

From the command line

sumeval r-nlb "I'm living New York its my home town so awesome" "My home town is awesome"

Output:

{
  "options": {
    "stopwords": true,
    "stemming": false,
    "word_limit": -1,
    "length_limit": -1,
    "alpha": 0.5,
    "input-summary": "I'm living New York its my home town so awesome",
    "input-references": [
      "My home town is awesome"
    ]
  },
  "averages": {
    "ROUGE-1": 0.7499999999999999,
    "ROUGE-2": 0.6666666666666666,
    "ROUGE-L": 0.7499999999999999,
    "ROUGE-BE": 0
  },
  "scores": [
    {
      "ROUGE-1": 0.7499999999999999,
      "ROUGE-2": 0.6666666666666666,
      "ROUGE-L": 0.7499999999999999,
      "ROUGE-BE": 0
    }
  ]
}
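
Because the command prints JSON, the result can be consumed from a script. The following is a minimal sketch, assuming the output keeps the shape shown above ("options", "averages", "scores"). The sample string is abbreviated from that output, and `read_averages` is a hypothetical helper, not part of sumeval.

```python
import json

# Abbreviated copy of the CLI output shown above, used as sample input.
sample_output = """
{
  "options": {"stopwords": true, "stemming": false, "alpha": 0.5},
  "averages": {
    "ROUGE-1": 0.7499999999999999,
    "ROUGE-2": 0.6666666666666666,
    "ROUGE-L": 0.7499999999999999,
    "ROUGE-BE": 0
  },
  "scores": [
    {
      "ROUGE-1": 0.7499999999999999,
      "ROUGE-2": 0.6666666666666666,
      "ROUGE-L": 0.7499999999999999,
      "ROUGE-BE": 0
    }
  ]
}
"""


def read_averages(raw):
    """Parse the JSON text and return the dict of averaged scores."""
    result = json.loads(raw)
    return result["averages"]


averages = read_averages(sample_output)
for name, value in sorted(averages.items()):
    print("{}: {:.3f}".format(name, value))
```

In practice the raw text would come from the captured standard output of the sumeval command rather than from a literal string.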

You can also use file input. Run sumeval -h for details.

Install

pip install sumeval

Dependencies

BLEU depends on SacréBLEU.
To calculate ROUGE-BE, spaCy is required.
To use lang ja, janome or MeCab is required.
- To calculate ROUGE-BE, GiNZA is additionally required.
To use lang zh, jieba is required.
- To calculate ROUGE-BE, pyhanlp is additionally required.

sumeval uses two packages to test its scores.

pythonrouge
- It calls the original Perl script.
- pip install git+https://github.com/tagucci/pythonrouge.git
rougescore
- It's a simple Python implementation of the ROUGE score.
- pip install git+git://github.com/bdusell/rougescore.git

Contributions Welcome

Add supported language

The tokenization and dependency parsing logic for each language is located in sumeval/metrics/lang.

You can add a language class by inheriting from BaseLang.
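
The shape of such a subclass might look like the sketch below. Everything here is hypothetical: `BaseLangSketch` is a self-contained stand-in so the example runs without sumeval installed, and its methods (`tokenize`, `is_stop_word`) are assumptions about what a language class needs to provide. Check the real BaseLang in sumeval/metrics/lang for the actual method names and constructor before writing a subclass.

```python
import re


class BaseLangSketch:
    """Hypothetical stand-in for sumeval's BaseLang; the real interface may differ."""

    def __init__(self, lang):
        self.lang = lang
        self.stopwords = set()

    def tokenize(self, text):
        raise NotImplementedError("subclasses must implement tokenize")

    def is_stop_word(self, word):
        return word in self.stopwords


class LangFR(BaseLangSketch):
    """Illustrative French support with a tiny, made-up stopword list."""

    def __init__(self):
        super().__init__("fr")
        self.stopwords = {"le", "la", "les", "de", "et"}

    def tokenize(self, text):
        # Lowercase and split on runs of word characters.
        return re.findall(r"\w+", text.lower())


lang = LangFR()
tokens = [t for t in lang.tokenize("Le chat et la souris") if not lang.is_stop_word(t)]
print(tokens)  # ['chat', 'souris']
```

The point of the design is that language-specific behavior (tokenization, stopwords, and in the real library dependency parsing for ROUGE-BE) lives in one class per language, so the metric code does not need to change when a language is added.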
