
GitHub - liao961120/cqls: Interpret Corpus Query Language (CQL) into a li...

source link: https://github.com/liao961120/cqls

Corpus Query Language Subset

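Parse a Corpus Query Language (CQL) query string into a list of queries in JSON format. Quantifiers are expanded, so each query in the list is a fixed-length sequence of token descriptions.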

Installation

pip install cqls

Usage

>>> import cqls
>>> cql = '''
... "我" [pos="V."]+
... '''
>>> cqls.parse(cql, default_attr="word", max_quant=5)
[
  [{'match': {'word': ['我']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}], 
  [{'match': {'word': ['我']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}], 
  [{'match': {'word': ['我']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}], 
  [{'match': {'word': ['我']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}], 
  [{'match': {'word': ['我']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}, {'match': {'pos': ['V.']}, 'not_match': {}}]
]
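
The parsed result is a plain list of token sequences, so it can be consumed without cqls itself. The following is a minimal sketch of one way to do that, not part of the package: `token_matches` and `find_matches` are hypothetical helpers, the token dictionaries are made-up sample data, and treating each value as a regular expression that must match the whole attribute is an assumption.

```python
import re

def token_matches(token, spec):
    """Check one corpus token (attribute -> value) against one parsed token spec."""
    # Assumption: every pattern under 'match' must fully match the attribute value.
    for attr, patterns in spec["match"].items():
        if not all(re.fullmatch(p, token.get(attr, "")) for p in patterns):
            return False
    # Assumption: no pattern under 'not_match' may fully match the attribute value.
    for attr, patterns in spec["not_match"].items():
        if any(re.fullmatch(p, token.get(attr, "")) for p in patterns):
            return False
    return True

def find_matches(tokens, queries):
    """Return sorted (start, end) spans where any query matches consecutive tokens."""
    spans = set()
    for query in queries:
        n = len(query)
        for start in range(len(tokens) - n + 1):
            window = tokens[start:start + n]
            if all(token_matches(t, s) for t, s in zip(window, query)):
                spans.add((start, start + n))
    return sorted(spans)

# The first two expanded queries from the output above, copied by hand.
queries = [
    [{'match': {'word': ['我']}, 'not_match': {}},
     {'match': {'pos': ['V.']}, 'not_match': {}}],
    [{'match': {'word': ['我']}, 'not_match': {}},
     {'match': {'pos': ['V.']}, 'not_match': {}},
     {'match': {'pos': ['V.']}, 'not_match': {}}],
]

# Hypothetical tagged sentence; the POS tags are illustrative only.
tokens = [
    {"word": "我", "pos": "Nh"},
    {"word": "想", "pos": "VE"},
    {"word": "去", "pos": "VC"},
    {"word": "學校", "pos": "Nc"},
]

spans = find_matches(tokens, queries)
print(spans)  # [(0, 2), (0, 3)]
```

Because the quantifier has already been expanded, the matcher only needs to compare fixed-length windows and never has to backtrack.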

Supported CQL features

  • token: [], "我", [word="我"], [word!="我" & pos="N.*"]
  • token-level quantifier: +, *, ?, {n,m}
  • grouping: ("a" "b"? "c"){1,2}
  • label: lab1:[word="我" & pos="N.*"] lab2:("a" "b")
