Fancy strings in Scala 3

 2 years ago
source link: https://blog.softwaremill.com/fancy-strings-in-scala-3-37346b7a6a5a

Let’s put some of the new Scala 3 features to work! While we try to avoid using Strings as much as possible, we still end up manipulating them in our codebases daily. Quite often, however, these are not arbitrary strings but ones with some special properties. In these cases, Scala's compiler might be able to offer some help.

Our goal will be to create dedicated types for non-empty and lowercase strings. We’ll use opaque types and inlines, which are new features of the Scala 3 compiler.

Non-empty strings

First, let’s create a zero-cost abstraction that will represent non-empty strings. For that, we’ll define an opaque type NonEmptyString:

opaque type NonEmptyString = String

At runtime, values of type NonEmptyString will simply be Strings (hence, there's no additional cost to the abstraction that we are introducing). At compile-time, however, the compiler treats these two types as completely unrelated, except within the scope in which the type alias is defined. If this is a top-level definition, that scope is the entire file; if we created the type alias in an object, the scope in which a NonEmptyString is known to be a String would be that object.
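To illustrate the scoping rule, here is a minimal sketch with the alias placed inside an object. The names `strings`, `unsafeWrap` and `length` are hypothetical, introduced only for this example:

```scala
object strings:
  opaque type NonEmptyString = String

  // Inside `strings` the compiler knows that NonEmptyString is a String,
  // so a String can be returned where a NonEmptyString is expected.
  // Deliberately unchecked: this helper only demonstrates scoping.
  def unsafeWrap(s: String): NonEmptyString = s

  // The reverse direction also type-checks inside the defining scope.
  def length(nes: NonEmptyString): Int = nes.length

@main def scopeDemo(): Unit =
  val nes: strings.NonEmptyString = strings.unsafeWrap("abc")
  println(strings.length(nes)) // prints 3
  // Outside `strings` the two types are unrelated, so these would not compile:
  // val s: String = nes
  // val n: strings.NonEmptyString = "abc"
```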

A type alias is a good start, but we’ll also need some way to “lift” values from a String into our new type. First, given an arbitrary value at runtime, we can write a function that returns an optional NonEmptyString. Note that this definition needs to be placed next to the opaque type alias as the compiler needs to know that these types are indeed equal:
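A minimal sketch of such a function, assuming it lives in a companion-style object in the same file as the alias. The name `apply` follows the article's closing summary:

```scala
opaque type NonEmptyString = String

object NonEmptyString:
  // Runtime check: None for the empty string, Some(s) otherwise.
  // Returning `s` as a NonEmptyString compiles only because this object
  // is in the same scope (here: the same file) as the opaque alias.
  def apply(s: String): Option[NonEmptyString] =
    if s.isEmpty then None else Some(s)
```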

It’s worth noting that here we do have some additional runtime cost: allocating the option. However, we can do better with constants, since we can check at compile time whether they are empty. For this, we’ll use an inline method, which is guaranteed to be expanded by the compiler at the call site during compilation. In the inline method’s definition, we’ll use an inline if, which can be used to check a constant condition at compile time:
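One way this could look is sketched below. The method name `from` follows the article's closing summary, and the error message text is an assumption:

```scala
import scala.compiletime.{error, requireConst}

opaque type NonEmptyString = String

object NonEmptyString:
  // `inline s` makes the argument expression itself available at compile time.
  inline def from(inline s: String): NonEmptyString =
    // Rejects arguments that are not compile-time constants.
    requireConst(s)
    // The condition is evaluated during compilation; only the chosen branch remains.
    inline if s == "" then error("got an empty string") else s
```

With this in place, `NonEmptyString.from("abc")` compiles to the plain string constant, while `NonEmptyString.from("")` is rejected by the compiler.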

We’re also using scala.compiletime.requireConst to get a nice error message if the value passed as a parameter is not a constant (but e.g. a value reference). If the string is empty, we're using scala.compiletime.error to report a custom error message.

Finally, we need a way to upcast a NonEmptyString into a String. This can be done using an implicit conversion. In order to avoid an additional runtime method call, we define the conversion as an inline method as well (evaluated at compile-time). This conversion will be added automatically by the compiler, given that we import it into scope.
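A sketch of the conversion on its own, assuming the `given ... with` syntax of Scala 3.0 and later:

```scala
opaque type NonEmptyString = String

object NonEmptyString:
  // Upcast back to String. Because `apply` is inline, the compiler can
  // replace each conversion call with the underlying value itself.
  given Conversion[NonEmptyString, String] with
    inline def apply(nes: NonEmptyString): String = nes
```

Callers would bring it into scope with `import NonEmptyString.given`.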

Here’s the entire NonEmptyString abstraction that we have created:
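Putting the pieces together, the abstraction might look as follows. This is a sketch consistent with the description above rather than a verbatim listing, and the error message is an assumption:

```scala
import scala.compiletime.{error, requireConst}

opaque type NonEmptyString = String

object NonEmptyString:
  // Runtime lifting of arbitrary strings.
  def apply(s: String): Option[NonEmptyString] =
    if s.isEmpty then None else Some(s)

  // Compile-time lifting of constants.
  inline def from(inline s: String): NonEmptyString =
    requireConst(s)
    inline if s == "" then error("got an empty string") else s

  // Upcast back to String.
  given Conversion[NonEmptyString, String] with
    inline def apply(nes: NonEmptyString): String = nes
```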

Time for some tests! We need to put them in a different file or a different scope so that the opaque type is really opaque:
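A sketch of such checks follows. To keep it self-contained in one file, the definitions are nested in a hypothetical `strings` object, so that the code outside it really sees an opaque type; in a real project they would simply live in another file:

```scala
import scala.compiletime.{error, requireConst}
import scala.language.implicitConversions

object strings:
  opaque type NonEmptyString = String

  object NonEmptyString:
    def apply(s: String): Option[NonEmptyString] =
      if s.isEmpty then None else Some(s)

    inline def from(inline s: String): NonEmptyString =
      requireConst(s)
      inline if s == "" then error("got an empty string") else s

    given Conversion[NonEmptyString, String] with
      inline def apply(nes: NonEmptyString): String = nes

import strings.NonEmptyString
import strings.NonEmptyString.given

@main def nonEmptyStringDemo(): Unit =
  // Runtime lifting.
  assert(NonEmptyString("").isEmpty)
  assert(NonEmptyString("abc").isDefined)

  // Compile-time lifting of a constant, then upcasting via the conversion.
  val x: NonEmptyString = NonEmptyString.from("abc")
  val y: String = x
  assert(y.length == 3)

  // None of the following would compile:
  // NonEmptyString.from("")                   // empty constant
  // val z: NonEmptyString = "abc"             // String is not a NonEmptyString here
  // val s: String = "abc"; NonEmptyString.from(s) // not a constant
```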

What’s important in the above design is that we’re creating no runtime overhead — all NonEmptyString values at runtime are the same String objects from which they are created; it's just at compile-time that these types are distinct.

Lowercase strings

Let’s look at a slightly more complex example: creating a type that represents lowercase strings. We start the same way, with an opaque type, an implicit conversion to upcast back to String, and a method to lift a String into our LowerCased type. This time, the lifting works slightly differently, as we simply lowercase the parameter:
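A minimal sketch of this starting point, mirroring the structure used for NonEmptyString:

```scala
opaque type LowerCased = String

object LowerCased:
  // Lifting always succeeds: the input is normalized rather than validated.
  def apply(s: String): LowerCased = s.toLowerCase

  // Upcast back to String.
  given Conversion[LowerCased, String] with
    inline def apply(lc: LowerCased): String = lc
```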

What about typing known constants as LowerCased? Here, inlines are not powerful enough: in an inline if, we can only check constant conditions. To verify that a constant string is already lower case, we'd have to check that LowerCased(s) == s at compile time. Luckily, unlike inlines, macros let us run arbitrary code during compilation. We won't go in-depth into quoting & splicing (see the links to articles dedicated to macros in the last section); instead, we'll focus on the macro logic that is required here.

We’ll need two methods. The first is a user-facing one that “lifts” the parameter it receives into macro-land (so that we can inspect & manipulate it at compile-time) and “splices” the result of our manipulations back, so that it is compiled normally. This method needs to be inline so that it is expanded at compile-time, and its body calls the macro implementation. Quoting and splicing are written using '{} and ${}, respectively.

Additionally, we’ll need to define the macro implementation method, which operates on abstract syntax trees (ASTs), represented in code as values of type Expr, corresponding to the chunks of code that are passed as macro parameters:
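The wiring between the two methods can be sketched as below. The name `fromImpl` is an assumption, and its body is only a placeholder that returns the argument unchecked:

```scala
import scala.quoted.*

opaque type LowerCased = String

object LowerCased:
  def apply(s: String): LowerCased = s.toLowerCase

  // User-facing entry point: quotes the argument ('s) to hand its AST to the
  // macro, and splices (${ ... }) the resulting expression back into the caller.
  inline def from(inline s: String): LowerCased = ${ fromImpl('s) }

  // Macro implementation: runs inside the compiler, taking and returning Exprs.
  // Placeholder body for this sketch: the argument is returned without any check.
  def fromImpl(s: Expr[String])(using Quotes): Expr[LowerCased] = s
```

Returning `s` where an `Expr[LowerCased]` is expected type-checks only because the macro is defined in the same scope as the opaque alias.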

In the implementation, we’ll inspect the abstract syntax tree corresponding to the parameter s. If it is a constant string that is already lower case, the macro succeeds. If it is not a constant, or if lowercasing would change it, we report a compile-time error:
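A sketch of this logic, under a few assumptions: it uses `Expr.value` to extract the constant rather than matching on the tree directly, it relies on `report.errorAndAbort` (available from Scala 3.1), and the error messages and the name `fromImpl` are illustrative:

```scala
import scala.quoted.*

opaque type LowerCased = String

object LowerCased:
  def apply(s: String): LowerCased = s.toLowerCase

  inline def from(inline s: String): LowerCased = ${ fromImpl('s) }

  def fromImpl(s: Expr[String])(using Quotes): Expr[LowerCased] =
    import quotes.reflect.report
    // `value` yields Some(str) only when the argument is a compile-time constant.
    s.value match
      case Some(str) if LowerCased(str) == str =>
        s // already lower case: reuse the original expression unchanged
      case Some(str) =>
        report.errorAndAbort(s"'$str' is not a lower case string", s)
      case None =>
        report.errorAndAbort("expected a constant string", s)

// Usage, from a different source file (a macro cannot be expanded in the
// file that defines it):
//   val ok: LowerCased = LowerCased.from("abc") // compiles
//   LowerCased.from("Abc")                      // compile-time error
```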

Note that in case of success, the result of the method is s - the expression that is passed as a parameter. Since the macro is defined next to the opaque type alias, the compiler knows that LowerCased == String.

Let’s test our solution:

As a result, we’ve created two “subtypes” of String, with methods to downcast (from and apply) and upcast (implicit conversions). We've tried to move as many checks as possible to compile-time, which is one of the overarching goals when using statically typed languages.

Going further

If you’d like to learn more about macros in Scala, take a look at our tutorial and tips & tricks articles. Magda Stożek explored opaque types more in-depth in her talk on the subject.

Have fun exploring Scala 3! :)
