extratools: 145+ extra higher-level functional tools
source link: https://www.tuicool.com/articles/hit/UrMrErR
Featured on GitHub's Trending Python repos on May 25, 2018. Thank you so much for your support!
145+ extra higher-level functional tools that go beyond the standard library's itertools, functools, etc. and popular third-party libraries like toolz, fancy, and more-itertools.
- Like toolz and others, most of the tools are designed to be efficient, pure, and lazy. Several useful yet non-functional tools are also included.
- While toolz and others target basic scenarios, this library targets more advanced and higher-level scenarios.
- A few useful CLI tools for respective functions are also installed. They are available as extratools-[func].
Full documentation is available here.
Why this library?
Typical pseudocode has fewer than 20 lines, where each line is a higher-level description. However, when implementing it, many lower-level details have to be filled in.
This library reduces the burden of writing and refining the lower-level details again and again, by including an extensive set of carefully designed general purpose higher-level tools.
Current status and future plans?
There are currently 140+ functions across 17 categories, 3 data structures, and 3 CLI tools.
- Currently adopted by TopSim and PrefixSpan-py.
This library is under active development, and new tools are added on a weekly basis.
- Any idea or contribution is highly welcome.
Besides many other interesting ideas, I am planning to make the following updates in the coming days/weeks/months.
- Add dicttools.unflatten and jsontools.unflatten.
- Add trie and suffixtree (according to generalized suffix tree).
- Update seqtools.align to support more than two sequences.
No plan to implement tools that are well covered by other popular libraries.
Which tools are available?
- Function Categories:
  - debugtools
  - dicttools
  - gittools
  - graphtools
  - htmltools
  - jsontools
  - mathtools
  - misctools
  - printtools
  - rangetools
  - recttools
  - seqtools
  - settools
  - sortedtools
  - stattools
  - strtools
  - tabletools
- Data Structures:
  - defaultlist
  - disjointsets
  - segmenttree
- CLI Tools:
  - dicttools.remap
  - jsontools.flatten
  - stattools.teststats
Any example?
Here are ten examples out of our hundreds of tools.
- jsontools.flatten(data, force=False) flattens a JSON object by returning all the tuples, each with a path and the respective value.

    import json
    from extratools.jsontools import flatten

    flatten(json.loads("""{
      "name": "John",
      "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York"
      },
      "phoneNumbers": [
        {
          "type": "home",
          "number": "212 555-1234"
        },
        {
          "type": "office",
          "number": "646 555-4567"
        }
      ],
      "children": [],
      "spouse": null
    }"""))
    # {'name': 'John',
    #  'address.streetAddress': '21 2nd Street',
    #  'address.city': 'New York',
    #  'phoneNumbers[0].type': 'home',
    #  'phoneNumbers[0].number': '212 555-1234',
    #  'phoneNumbers[1].type': 'office',
    #  'phoneNumbers[1].number': '646 555-4567',
    #  'children': [],
    #  'spouse': None}
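To make the path notation concrete, here is a minimal pure-Python sketch that produces the same kind of path-to-value mapping. It is not the library's implementation: the name flatten_sketch is hypothetical, the force parameter is not modeled, and treating empty containers as leaf values is an assumption read off the output shown for this example.

```python
import json


def flatten_sketch(data, prefix=""):
    """Return a dict mapping each path to its leaf value (illustrative only)."""
    # Assumption: empty dicts/lists and scalars are kept as leaf values.
    if isinstance(data, dict) and data:
        result = {}
        for key, value in data.items():
            # Nested object keys are joined with a dot.
            path = f"{prefix}.{key}" if prefix else key
            result.update(flatten_sketch(value, path))
        return result
    if isinstance(data, list) and data:
        result = {}
        for index, value in enumerate(data):
            # List positions are appended as [index].
            result.update(flatten_sketch(value, f"{prefix}[{index}]"))
        return result
    return {prefix: data}


record = json.loads(
    '{"address": {"city": "New York"},'
    ' "phoneNumbers": [{"type": "home"}],'
    ' "children": [], "spouse": null}'
)
print(flatten_sketch(record))
# {'address.city': 'New York', 'phoneNumbers[0].type': 'home', 'children': [], 'spouse': None}
```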
- rangetools.gaps(covered, whole=(-inf, inf)) computes the uncovered ranges of the whole range whole, given the covered ranges covered.

    from math import inf
    from extratools.rangetools import gaps

    list(gaps(
        [(-inf, 0), (0.1, 0.2), (0.5, 0.7), (0.6, 0.9)],
        (0, 1)
    ))
    # [(0, 0.1), (0.2, 0.5), (0.9, 1)]
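One way to compute such gaps is a left-to-right sweep over the sorted covered ranges. The sketch below illustrates that idea under the assumption that ranges are (start, end) tuples; gaps_sketch is a hypothetical name and the code is not taken from the library.

```python
from math import inf


def gaps_sketch(covered, whole=(-inf, inf)):
    """Yield the sub-ranges of `whole` that no range in `covered` overlaps."""
    start, end = whole
    # Sweep the covered ranges in order, tracking where coverage currently ends.
    for lo, hi in sorted(covered):
        if lo >= end:
            break
        if lo > start:
            # Nothing covers the stretch between the current position and `lo`.
            yield (start, lo)
        start = max(start, hi)
    if start < end:
        yield (start, end)


print(list(gaps_sketch(
    [(-inf, 0), (0.1, 0.2), (0.5, 0.7), (0.6, 0.9)],
    (0, 1)
)))
# [(0, 0.1), (0.2, 0.5), (0.9, 1)]
```

Sorting first means overlapping ranges such as (0.5, 0.7) and (0.6, 0.9) are absorbed by the running maximum rather than needing an explicit merge step.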
- recttools.heatmap(rect, rows, cols, points, usepos=False) computes the heatmap within rectangle rect by a grid of rows rows and cols columns.

    from extratools.recttools import heatmap

    heatmap(
        ((1, 1), (3, 4)),
        3, 4,
        [(1.5, 1.25), (1.5, 1.75), (2.75, 2.75), (2.75, 3.5), (3.5, 2.5)]
    )
    # {1: 2, 7: 1, 11: 1, None: 1}

    heatmap(
        ((1, 1), (3, 4)),
        3, 4,
        [(1.5, 1.25), (1.5, 1.75), (2.75, 2.75), (2.75, 3.5), (3.5, 2.5)],
        usepos=True
    )
    # {(0, 1): 2, (1, 3): 1, (2, 3): 1, None: 1}
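The output keys can be read as grid cells. The following sketch reproduces the numbers above under several assumptions inferred from that output rather than from the library's source: rect is ((x1, y1), (x2, y2)), y selects the row and x the column, the cell index is row * cols + col, and the rectangle is half-open on its upper edges. The name heatmap_sketch is hypothetical.

```python
from collections import Counter


def heatmap_sketch(rect, rows, cols, points, usepos=False):
    """Count points per grid cell; points outside `rect` are counted under None."""
    (x1, y1), (x2, y2) = rect
    counts = Counter()
    for x, y in points:
        # Assumption: the rectangle is half-open, [x1, x2) by [y1, y2).
        if not (x1 <= x < x2 and y1 <= y < y2):
            counts[None] += 1
            continue
        # Assumption: y selects the row and x selects the column.
        row = int((y - y1) / (y2 - y1) * rows)
        col = int((x - x1) / (x2 - x1) * cols)
        counts[(row, col) if usepos else row * cols + col] += 1
    return dict(counts)


points = [(1.5, 1.25), (1.5, 1.75), (2.75, 2.75), (2.75, 3.5), (3.5, 2.5)]
print(heatmap_sketch(((1, 1), (3, 4)), 3, 4, points))
# {1: 2, 7: 1, 11: 1, None: 1}
```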
- settools.setcover(whole, covered, key=len) solves the set cover problem by covering the universe set whole as best as possible, using a subset of the covering sets covered.

    from extratools.settools import setcover

    list(setcover(
        {1, 2, 3, 4, 5},
        [{1, 2, 3}, {2, 3, 4}, {2, 4, 5}]
    ))
    # [{1, 2, 3}, {2, 4, 5}]
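A common approximation for set cover is the greedy heuristic: repeatedly pick the set that covers the most still-uncovered elements. The sketch below shows that heuristic only; whether the library uses it is not stated here, the key parameter is not modeled, and setcover_sketch is a hypothetical name.

```python
def setcover_sketch(whole, covered):
    """Greedily yield covering sets until nothing more of `whole` can be covered."""
    remaining = set(whole)
    candidates = list(covered)
    while remaining and candidates:
        # Pick the candidate covering the most uncovered elements
        # (ties go to the earliest candidate).
        best = max(candidates, key=lambda s: len(s & remaining))
        if not best & remaining:
            # No candidate adds coverage; stop with a partial cover.
            break
        yield best
        remaining -= best
        candidates.remove(best)


print(list(setcover_sketch(
    {1, 2, 3, 4, 5},
    [{1, 2, 3}, {2, 3, 4}, {2, 4, 5}]
)))
# [{1, 2, 3}, {2, 4, 5}]
```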
- seqtools.compress(data, key=None) compresses the sequence data by encoding continuous identical items to a tuple of item and count, according to run-length encoding.

    from extratools.seqtools import compress

    list(compress([1, 2, 2, 3, 3, 3, 4, 4, 4, 4]))
    # [(1, 1), (2, 2), (3, 3), (4, 4)]
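Run-length encoding itself is short to express with the standard library. This sketch uses itertools.groupby and stays lazy; it omits the key parameter, and compress_sketch is a hypothetical name rather than the library's code.

```python
from itertools import groupby


def compress_sketch(data):
    """Lazily yield (item, run_length) for each run of equal adjacent items."""
    for item, run in groupby(data):
        # `run` is an iterator over the consecutive equal items; count them.
        yield (item, sum(1 for _ in run))


print(list(compress_sketch([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])))
# [(1, 1), (2, 2), (3, 3), (4, 4)]
```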
- seqtools.mergeseqs(seqs, default=None, key=None) merges the sequences of equal length in seqs into a single sequence. Returns None if there is a conflict in any position.

    from extratools.seqtools import mergeseqs

    seqs = [
        (0   , 0   , None, 0   ),
        (None, 1   , 1   , None),
        (2   , None, None, None),
        (None, None, None, None)
    ]

    list(mergeseqs(seqs[1:]))
    # [2,
    #  1,
    #  1,
    #  None]

    list(mergeseqs(seqs))
    # None
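The merge rule can be stated per position: ignore the default placeholder, and require that all remaining values agree. The sketch below implements that reading; it returns a plain list (or None on conflict), ignores the key parameter, assumes hashable items, and uses the hypothetical name mergeseqs_sketch.

```python
def mergeseqs_sketch(seqs, default=None):
    """Merge equal-length sequences position by position; None on any conflict."""
    merged = []
    for column in zip(*seqs):
        # Collect the distinct non-default values seen at this position.
        values = {v for v in column if v != default}
        if len(values) > 1:
            # Two sequences disagree here, so the merge fails as a whole.
            return None
        merged.append(values.pop() if values else default)
    return merged


seqs = [
    (0, 0, None, 0),
    (None, 1, 1, None),
    (2, None, None, None),
    (None, None, None, None),
]
print(mergeseqs_sketch(seqs[1:]))
# [2, 1, 1, None]
print(mergeseqs_sketch(seqs))
# None
```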
- strtools.smartsplit(s) finds the best delimiter to automatically split string s. Returns a tuple of delimiter and split substrings.

    from extratools.strtools import smartsplit

    smartsplit("abcde")
    # (None,
    #  ['abcde'])

    smartsplit("a b c d e")
    # (' ',
    #  ['a', 'b', 'c', 'd', 'e'])

    smartsplit("/usr/local/lib/")
    # ('/',
    #  ['', 'usr', 'local', 'lib', ''])

    smartsplit("a ::b:: c :: d")
    # ('::',
    #  ['a ', 'b', ' c ', ' d'])

    smartsplit("{1, 2, 3, 4, 5}")
    # (', ',
    #  ['{1', '2', '3', '4', '5}'])
- strtools.learnrewrite(src, dst, minlen=3) learns the respective regular expression and template to rewrite src to dst.

    from extratools.strtools import learnrewrite

    learnrewrite(
        "Elisa likes Apple.",
        "Apple is Elisa's favorite."
    )
    # ('(.*) likes (.*).',
    #  "{1} is {0}'s favorite.")
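The learned pair can be applied to new strings with the standard library alone. The sketch below hard-codes the pattern and template shown in the output above, and assumes the template follows str.format indexing of the regex groups starting at 0; apply_rewrite is a hypothetical helper, not part of extratools.

```python
import re

# The pair shown in the example output, copied as literals.
pattern = "(.*) likes (.*)."
template = "{1} is {0}'s favorite."


def apply_rewrite(pattern, template, text):
    """Rewrite `text` with a learned (regex, template) pair, or None if no match."""
    match = re.fullmatch(pattern, text)
    if match is None:
        return None
    # Captured groups fill the template positionally: {0}, {1}, ...
    return template.format(*match.groups())


print(apply_rewrite(pattern, template, "Elisa likes Apple."))
# Apple is Elisa's favorite.
print(apply_rewrite(pattern, template, "Tom likes Kiwi."))
# Kiwi is Tom's favorite.
```

Note that the trailing "." in the pattern is an unescaped regex wildcard, so it matches any final character, not only a period.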
- tabletools.parsebymarkdown(text) parses a text of multiple lines to a table, according to Markdown format.

    from extratools.tabletools import parsebymarkdown

    list(parsebymarkdown("""
    | foo | bar |
    | --- | --- |
    | baz | bim |
    """))
    # [['foo', 'bar'],
    #  ['baz', 'bim']]
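For simple pipe tables, the parsing step amounts to splitting each line on "|" and dropping the separator row. The sketch below covers only that simple case (no escaped pipes, and a data row made entirely of dashes would be mistaken for a separator); parse_markdown_table is a hypothetical name and not the library's implementation.

```python
import re


def parse_markdown_table(text):
    """Lazily yield each row of a simple Markdown pipe table as a list of cells."""
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the outer pipes, then split the row into trimmed cells.
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row, e.g. | --- | :---: |
        if all(re.fullmatch(r":?-+:?", cell) for cell in cells):
            continue
        yield cells


table_text = """
| foo | bar |
| --- | --- |
| baz | bim |
"""
print(list(parse_markdown_table(table_text)))
# [['foo', 'bar'], ['baz', 'bim']]
```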
- tabletools.hasheader(data) returns the confidence (between 0 and 1) of whether the first row of the table data is a header.

    from extratools.tabletools import hasheader

    t = [
        ['Los Angeles'  , '34°03′'   , '118°15′'  ],
        ['New York City', '40°42′46″', '74°00′21″'],
        ['Paris'        , '48°51′24″', '2°21′03″' ]
    ]

    hasheader(t)
    # 0.0

    hasheader([
        ['City', 'Latitude', 'Longitude']
    ] + t)
    # 0.6666666666666666

    hasheader([
        ['C1', 'C2', 'C3']
    ] + t)
    # 1.0
How to install?
This package is available on PyPI. Just use pip3 install -U extratools to install it.
How to cite?
When using this library for research purposes, please cite it as follows.

    @misc{extratools,
      author = {Chuancong Gao},
      title = {{extratools}},
      howpublished = "\url{https://github.com/chuanconggao/extratools}",
      year = {2018}
    }
Any recommended library?
There are several great libraries recommended to use together with extratools:
- regex
- sortedcontainers
- toolz
- sh