GitHub - chanced/caps: a case conversion library for Go
source link: https://github.com/chanced/caps
caps is a case conversion library
caps is a Unicode-aware case conversion library for Go. It was built with the following priorities in mind: configurability, consistency, correctness, ergonomics, and reasonable performance, in that order.
The library has built-in, configurable initialism replacements (e.g. "Uuid" -> "UUID"), supports special-case Unicode (e.g. Turkish, Azeri, ...), and is modular: each step in the case conversion process can be configured, augmented, or replaced.
The API should be stable as of v0.7.0.
Supported Case Conversions
The following case conversions are available:
- Camel Case (e.g. CamelCase)
- Lower Camel Case (e.g. lowerCamelCase)
- Snake Case (e.g. snake_case)
- Screaming Snake Case (e.g. SCREAMING_SNAKE_CASE)
- Kebab Case (e.g. kebab-case)
- Screaming Kebab Case (e.g. SCREAMING-KEBAB-CASE)
- Dot Notation Case (e.g. dot.notation.case)
- Screaming Dot Notation Case (e.g. DOT.NOTATION.CASE)
- Title Case (e.g. Title Case)
- Other delimiters
Install
go get github.com/chanced/caps
Example
package main

import (
	"fmt"

	"github.com/chanced/caps"
)

func main() {
	fmt.Println(caps.ToCamel("http request"))
	// Output:
	// HTTPRequest

	fmt.Println(caps.ToLowerCamel("some_id"))
	// Output:
	// someID

	fmt.Println(caps.ToLowerCamel("SomeID", caps.WithReplaceStyleCamel()))
	// Output:
	// someId

	// Alternatively:
	capsJS := caps.New(caps.CapsOpts{
		AllowedSymbols: "$",
		ReplaceStyle:   caps.ReplaceStyleCamel,
	})
	fmt.Println(capsJS.ToLowerCamel("SomeID"))
	// Output:
	// someId
}
Word boundaries
Word boundaries are determined by the caps.Converter. The provided implementation, caps.StdConverter, delegates the boundary detection to caps.Tokenizer. The provided implementation, caps.StdTokenizer, uses the following rules:
- The following characters are considered word breaks unless present in AllowedSymbols: " _.!?:;$-(){}[]#@&+~"
- Strings with all upper case characters are split by the above symbols or by numbers, unless the character is allowed in a number based on the following rules:
  - 'v' or 'V' followed by numbers
  - '.' before/after a number and only once
  - 'e' or 'E' if in the fractional part of a number and only once
  - '-', '+' if at the start and followed by either a number or '.' and a number, or in the fractional part preceded by 'e' or 'E'
  - additional rules can be added through the number rules (e.g. WithNumberRules)
- NOTE: If '.', '+', '-' are not in AllowedSymbols, they are considered breaks even for numbers
- When a string consists of both upper case and lower case letters, upper case letters are considered boundaries (e.g. "ThisVar" would be tokenized into ["This", "Var"])
- When mixed with lower and upper case characters, sequences of upper case are broken up into tokens (e.g. "SomeID" would be tokenized into ["Some", "I", "D"]).
- Replacement rules are then evaluated based on the tokens, which may combine them based on the rules below.
Replacements
caps.StdConverter also allows users to register caps.Replacements for initialism replacements. Each Replacement is indexed in a trie (see Index).
- Multi-rune tokens are searched independently unless followed by a number (e.g. "ID", "UTF8").
- Sequences of single rune tokens (e.g. ["U", "U", "I", "D"]) are evaluated as a potential Replacement until a non-match is found or the sequence is broken by a token with more than one rune.
Default replacements
{"Acl", "ACL"}
{"Api", "API"}
{"Ascii", "ASCII"}
{"Cpu", "CPU"}
{"Css", "CSS"}
{"Dns", "DNS"}
{"Eof", "EOF"}
{"Guid", "GUID"}
{"Html", "HTML"}
{"Http", "HTTP"}
{"Https", "HTTPS"}
{"Id", "ID"}
{"Ip", "IP"}
{"Json", "JSON"}
{"Lhs", "LHS"}
{"Qps", "QPS"}
{"Ram", "RAM"}
{"Rhs", "RHS"}
{"Rpc", "RPC"}
{"Sla", "SLA"}
{"Smtp", "SMTP"}
{"Sql", "SQL"}
{"Ssh", "SSH"}
{"Tcp", "TCP"}
{"Tls", "TLS"}
{"Ttl", "TTL"}
{"Udp", "UDP"}
{"Ui", "UI"}
{"Uid", "UID"}
{"Uuid", "UUID"}
{"Uri", "URI"}
{"Url", "URL"}
{"Utf8", "UTF8"}
{"Vm", "VM"}
{"Xml", "XML"}
{"Xmpp", "XMPP"}
{"Xsrf", "XSRF"}
{"Xss", "XSS"}
If you would like to add or remove entries from that list, you have a few options. See below.
Customizing the Converter
Using caps.Caps
This is likely your best option. You can create and utilize an instance of caps.Caps, which has all of the conversions as methods:
package main

import (
	"fmt"

	"github.com/chanced/caps"
)

func main() {
	c := caps.New(caps.CapsOpts{
		Replacements: []caps.Replacement{
			{"Ex", "EX"},
			// ... your replacements
		},
	})
	fmt.Println(c.ToCamel("some ex"))
	// Output:
	// SomeEX
}
Creating isolated caps.StdConverter instances
You can pass a new instance of caps.StdConverter with a new set of caps.Replacements.
package main

import (
	"fmt"

	"github.com/chanced/caps"
	"github.com/chanced/caps/token"
)

func main() {
	replacements := []caps.Replacement{
		{"Ex", "EX"},
		// ... your replacements
	}
	converter := caps.NewConverter(replacements, caps.DefaultTokenizer, token.DefaultCaser)

	fmt.Println(caps.ToCamel("ex id", caps.WithConverter(converter)))
	// note: ID was not in the replacement list above
	// Output:
	// EXId

	fmt.Println(caps.ToCamel("ex id"))
	// Output:
	// ExID
}
Modifying the caps.DefaultConverter global
You can update caps.DefaultConverter. You should set it before you make any conversions. Otherwise, you'll need to guard your usage of the library accordingly (e.g. with a mutex).
package main

import (
	"fmt"

	"github.com/chanced/caps"
)

func main() {
	caps.DefaultConverter.Set("Gcp", "GCP")
	fmt.Println(caps.ToCamel("some_gcp_var"))
	// Output:
	// SomeGCPVar
}
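If the global has to change after conversions have started, one way to guard it is a read/write mutex. The sketch below shows the pattern on a hypothetical `registry` type that stands in for the shared converter. It is not part of caps, and it assumes all reads and writes go through the same wrapper.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for a shared, mutable converter.
type registry struct {
	mu           sync.RWMutex
	replacements map[string]string
}

// Set registers a replacement while holding the write lock.
func (r *registry) Set(camel, screaming string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.replacements[camel] = screaming
}

// Lookup reads under the read lock, so it never races with Set.
func (r *registry) Lookup(camel string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.replacements[camel]
	return v, ok
}

func main() {
	reg := &registry{replacements: map[string]string{"Id": "ID"}}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			reg.Set("Gcp", "GCP") // concurrent writers are serialized
			reg.Lookup("Id")      // readers may run in parallel with each other
		}()
	}
	wg.Wait()

	v, _ := reg.Lookup("Gcp")
	fmt.Println(v) // GCP
}
```

Configuring the converter once at startup, before any goroutine performs a conversion, avoids the need for locking altogether.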
Creating a custom caps.Converter
Finally, if you are so inclined, you can create your own caps.Converter. This could be as simple as implementing the single Convert method, calling caps.DefaultConverter.Convert, and then modifying the result.
package main

import (
	"fmt"

	"github.com/chanced/caps"
)

type MyConverter struct{}

func (MyConverter) Convert(req caps.ConvertRequest) string {
	res := caps.DefaultConverter.Convert(req)
	if req.Style.IsLowerCamel() && req.ReplaceStyle.IsCamel() && res == "id" {
		return "_id"
	}
	return res
}

func main() {
	fmt.Println(caps.ToLowerCamel("ID", caps.WithReplaceStyleCamel(), caps.WithConverter(MyConverter{})))
	// Output:
	// _id
}
Support for special case unicode (e.g. Turkish, Azeri)
caps supports Turkish and Azeri through the token.Caser interface. It is satisfied by unicode.TurkishCase and unicode.AzeriCase. token.TurkishCaser and token.AzeriCaser are available as pointers to those variables (although you can use the unicode variables directly).
For example, to use Turkish, you would need to instantiate a few variables:
package main

import (
	"fmt"

	"github.com/chanced/caps"
	"github.com/chanced/caps/token"
)

func main() {
	tokenizer := caps.NewTokenizer(caps.DEFAULT_DELIMITERS, token.TurkishCaser)
	// I suppose these would need to be specific to Turkish?
	// if not, you can just use caps.DefaultReplacements
	replacements := []caps.Replacement{
		{Camel: "Http", Screaming: "HTTP"}, // just an example
	}
	turkish := caps.NewConverter(replacements, tokenizer, token.TurkishCaser)

	// to use this as your default throughout your application
	// you can overwrite caps.DefaultConverter
	//
	// caps.DefaultConverter = turkish
	//
	// otherwise, you can pass in the converter to the config for each call:
	fmt.Println(caps.ToScreamingKebab("i ı", caps.WithConverter(turkish)))
	// Output:
	// İ-I
}
Benchmarks
input: "Example Uuid."
goos: darwin
goarch: arm64
pkg: github.com/chanced/caps
Using a caps.Caps instance:
BenchmarkCapsToTitle
BenchmarkCapsToTitle-10 2418796 473.9 ns/op 208 B/op 9 allocs/op
BenchmarkCapsToCamel
BenchmarkCapsToCamel-10 2572858 466.4 ns/op 208 B/op 9 allocs/op
BenchmarkCapsToLowerCamel
BenchmarkCapsToLowerCamel-10 2607831 458.1 ns/op 208 B/op 9 allocs/op
BenchmarkCapsToSnake
BenchmarkCapsToSnake-10 2567318 466.2 ns/op 208 B/op 9 allocs/op
BenchmarkCapsToScreamingSnake
BenchmarkCapsToScreamingSnake-10 2380708 504.7 ns/op 256 B/op 10 allocs/op
BenchmarkCapsToKebab
BenchmarkCapsToKebab-10 2408059 501.0 ns/op 256 B/op 10 allocs/op
BenchmarkCapsToScreamingKebab
BenchmarkCapsToScreamingKebab-10 2383544 500.0 ns/op 256 B/op 10 allocs/op
BenchmarkCapsToDotNotation
BenchmarkCapsToDotNotation-10 2418474 495.8 ns/op 256 B/op 10 allocs/op
BenchmarkCapsToScreamingDotNotation
BenchmarkCapsToScreamingDotNotation-10 2375731 504.0 ns/op 256 B/op 10 allocs/op
Using top-level functions:
BenchmarkToTitle
BenchmarkToTitle-10 2349061 505.8 ns/op 256 B/op 10 allocs/op
BenchmarkToCamel
BenchmarkToCamel-10 2406518 503.7 ns/op 256 B/op 10 allocs/op
BenchmarkToLowerCamel
BenchmarkToLowerCamel-10 2391112 503.9 ns/op 256 B/op 10 allocs/op
BenchmarkToSnake
BenchmarkToSnake-10 2392641 503.3 ns/op 256 B/op 10 allocs/op
BenchmarkToScreamingSnake
BenchmarkToScreamingSnake-10 2311863 517.7 ns/op 256 B/op 10 allocs/op
BenchmarkToKebab
BenchmarkToKebab-10 2390890 503.5 ns/op 256 B/op 10 allocs/op
BenchmarkToScreamingKebab
BenchmarkToScreamingKebab-10 2313615 519.0 ns/op 256 B/op 10 allocs/op
BenchmarkToDotNotation
BenchmarkToDotNotation-10 2344407 510.3 ns/op 256 B/op 10 allocs/op
BenchmarkToScreamingDotNotation
BenchmarkToScreamingDotNotation-10 2294145 521.2 ns/op 256 B/op 10 allocs/op
License