Convert a number to an approximated text expression
source link: https://github.com/tokenmill/numberwords
Number Words
Number Words will build numeric expressions for natural numbers, percentages and fractions. For example:
- 0.231 will be converted to less than a quarter,
- 102 to over one hundred.
Supports multiple languages.
The implementation is based on ideas expressed in Generating Numerical Approximations.
Numerical Approximations
Numerical approximations are common in texts built from data:
- Water temperature is below 10C (input data would be 9.53C)
- A third of students failed the exam (34.3%)
- Q2 sales were around 1M$ (1,002,184 $)
Numeric data about a metric of interest often comes with more precision than we need. If we see 9.382%, the information we want is likely almost 10% rather than the precise number. Furthermore, different approximation strategies are often used within one report for the same metric. At the beginning of the report we might say almost 10% or below 10%, while later in the text we might choose a more precise expression: around 9.4%.
Number Words helps you build such numerical approximations and makes them available to text generation systems.
Features
Number Words uses the following abstractions:
- Actual Value is a number which needs to be approximated, an input to the approximation function. In the examples above it is the temperature 9.53C or the percentage 34.3%.
- Scale of approximation. It is a snapping grid across the range of numbers along which the approximation is done. The scale to use is determined by the domain. For example:
  - 1/4 scale will form approximation steps starting at 0, then 1/4, 1/2, 3/4, ending with 1;
  - 1/10 scale will express percentages with one precision point;
  - scales which are multiples of 10 are useful for natural number approximation. The 10 will round to tens: 1007 -> 1010, the 100 to hundreds: 1003 -> 1000, and so on.
- The result of actual value approximation to a given scale provides:
  - Given Value, a discrete value along the scaled number range to which the actual value is the closest.
  - Hedge, a commonly used word describing the relation between the actual and given values. An actual value of 9.5 is below a given value of 10. An actual value of 101 is over a given value of 100.
  - Text, a textual spell-out of the given value. 2666 is Two thousand six hundred sixty six.
  - Favorite Number expresses common language names for certain numbers. 0.25 is a favorite number in that it has the name a quarter.
A full approximation result returns three such approximation data structures, one for each given value which is:
- smaller than the actual value on the scaled number range.
- greater than the actual value on the scaled number range.
- around the actual value on the scaled number range. For this, whichever of the above two is closer to the actual value is chosen.
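The snapping described above can be sketched in a few lines of plain Clojure. This is only an illustration under stated assumptions, not the library's internal code: the names snap and relation are hypothetical, the input is assumed to be non-negative, and ties in the around case are resolved towards the smaller value as an arbitrary choice.

```clojure
;; Hypothetical helpers for illustration; they are not part of Number Words.

(defn snap
  "Snap a non-negative `actual` value onto the grid defined by `scale`.
  Returns the grid points at or below and at or above the value, plus the
  one of the two that is closer. Exact rational arithmetic is used so that
  values such as 0.258 do not suffer from floating-point drift."
  [actual scale]
  (let [x      (rationalize actual)
        lower  (* scale (quot x scale))            ; last grid point <= x
        upper  (if (= lower x) lower (+ lower scale))
        ;; Assumption: on a tie the smaller grid point wins.
        around (if (<= (- x lower) (- upper x)) lower upper)]
    {:smaller lower :greater upper :around around}))

(defn relation
  "Describe how `actual` relates to a `given` grid value. The keywords
  mirror the hedge groups :more, :less and :equal."
  [actual given]
  (let [c (compare (rationalize actual) given)]
    (cond (pos? c) :more
          (neg? c) :less
          :else    :equal)))

(snap 0.258 1/4)     ; => {:smaller 1/4, :greater 1/2, :around 1/4}
(snap 1007 10)       ; => {:smaller 1000, :greater 1010, :around 1010}
(relation 0.231 1/4) ; => :less
```

The relation keyword is what would select a hedge group, so 0.231 against a given value of 1/4 leads to a less than style hedge.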
Languages
Numeric approximation has two language-dependent parts:
- Hedges, which differ from language to language. See the Configuration section for how this can be controlled.
- Text, the number-to-text translation for a given value. For this translation Number Words relies on ICU4J.
Currently supported languages:
- English
- German
Usage
Number Words exposes approximation functionality through the approximations function, which takes the following parameters:
- language - :de or :en
- actual-value - the number to approximate
- scale - at which the approximation is to be performed.
(require '[numberwords.core :as nw])

(nw/approximations :en 0.258 1/4)
=>
#:numwords{:around    #:numwords{:hedges          #{"approximately" "about" "around"}
                                 :text            "zero point two five"
                                 :given-value     1/4
                                 :favorite-number #{"a quarter"}}
           :more-than #:numwords{:hedges          #{"over" "more than"}
                                 :text            "zero point two five"
                                 :given-value     1/4
                                 :favorite-number #{"a quarter"}}
           :less-than #:numwords{:hedges          #{"nearly" "under" "less than"}
                                 :text            "zero point five"
                                 :given-value     1/2
                                 :favorite-number #{"a half"}}}
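A result like the one above still has to be turned into a phrase. The following is a minimal sketch of one way to do that. The phrase function is hypothetical and not part of Number Words, and the result map is copied from the example output so the snippet runs without the library. Picking the alphabetically first hedge is an arbitrary choice made only to keep the output deterministic.

```clojure
;; Result map copied from the example output, so no dependency is needed.
(def result
  #:numwords{:around    #:numwords{:hedges          #{"approximately" "about" "around"}
                                   :text            "zero point two five"
                                   :given-value     1/4
                                   :favorite-number #{"a quarter"}}
             :more-than #:numwords{:hedges          #{"over" "more than"}
                                   :text            "zero point two five"
                                   :given-value     1/4
                                   :favorite-number #{"a quarter"}}
             :less-than #:numwords{:hedges          #{"nearly" "under" "less than"}
                                   :text            "zero point five"
                                   :given-value     1/2
                                   :favorite-number #{"a half"}}})

(defn phrase
  "Hypothetical helper: build \"<hedge> <number>\" for one relation key,
  e.g. :numwords/around. A favorite-number name is preferred over the
  spelled-out text when one is available."
  [approximations relation]
  (let [{:numwords/keys [hedges text favorite-number]} (get approximations relation)
        hedge  (first (sort hedges))                       ; assumption: first alphabetically
        number (or (first (sort favorite-number)) text)]   ; fall back to :text
    (str hedge " " number)))

(phrase result :numwords/around)    ; => "about a quarter"
(phrase result :numwords/more-than) ; => "more than a quarter"
(phrase result :numwords/less-than) ; => "less than a half"
```

A text generation system would more likely vary the hedge across a report than always take the first one; that selection policy is left open here.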
Configuration
Hedges and favorite numbers can be modified, and new languages added, by changing the configuration file resources/numwords.edn
{;; Configuration is structured by language
 :en {;; Hedges section specifies which words are associated with actual to given value relations
      :hedges {:equal  #{"exactly"}
               :around #{"around" "approximately" "about"}
               :more   #{"more than" "over"}
               :less   #{"less than" "under" "nearly"}}
      ;; Favorite numbers map a special number to its textual expressions
      :favorite-numbers {1/4 #{"a quarter" "a fourth"}
                         1/2 #{"a half"}}}}
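To make the shape of this configuration concrete, here is a small sketch that parses the same structure with clojure.edn and looks values up by language. How Number Words itself loads the file is not shown in this document, so the inline string below stands in for resources/numwords.edn, and hedges-for and favorite-names are hypothetical helper names.

```clojure
(require '[clojure.edn :as edn])

;; Inline stand-in for resources/numwords.edn, so the sketch is self-contained.
(def config
  (edn/read-string
   "{:en {:hedges {:equal  #{\"exactly\"}
                   :around #{\"around\" \"approximately\" \"about\"}
                   :more   #{\"more than\" \"over\"}
                   :less   #{\"less than\" \"under\" \"nearly\"}}
          :favorite-numbers {1/4 #{\"a quarter\" \"a fourth\"}
                             1/2 #{\"a half\"}}}}"))

(defn hedges-for
  "Hypothetical helper: hedge words for a language and relation keyword."
  [config language relation]
  (get-in config [language :hedges relation]))

(defn favorite-names
  "Hypothetical helper: common names for a given value, or nil if none."
  [config language given-value]
  (get-in config [language :favorite-numbers given-value]))

(hedges-for config :en :more)     ; the set of 'more than' style hedges
(favorite-names config :en 1/4)   ; the set of names for one quarter
(favorite-names config :en 3/4)   ; nil, no favorite name is configured
```

A new language would presumably be added as another top-level key next to :en with the same :hedges and :favorite-numbers sections.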
License
Copyright © 2020 TokenMill UAB.
Distributed under the Apache License, Version 2.0.