Type Checking in Ruby — Check Yo Self Before You Wreck Yo Self

source link: https://blog.appsignal.com/2019/08/27/ruby-magic-type-checking-in-ruby.html

Ruby Magic

Michael Kohl on Aug 27, 2019

Let’s start this post with a fun little guessing game: what do you think is the most common error tracked by AppSignal in Ruby applications?

It’s fair to assume that many of you answered this question with NoMethodError, an exception that is caused by calling a non-existent method on an object. Occasionally, this may be caused by a typo in the method name, but more often it’s the result of calling a method on an object of the wrong type, which often happens to be an unexpected nil. Is there something we can do as Ruby developers to reduce the frequency of such errors?

Types to the Rescue?

Except for the choice of text editor or programming language, few topics can spiral into heated debates faster than discussions of type systems. We won’t have time to go into the details here, but Chris Smith’s post “What To Know Before Debating Type Systems” does an excellent job of covering them.

In the broadest terms, type checking can be divided into two main categories: static and dynamic. Static type checking happens ahead of time (either via the compiler or a separate tool), while dynamic type checking occurs at runtime, where it may lead to exceptions if the actual types don’t align with the developer’s expectations.

Proponents of both philosophies have strong opinions, but alas, there are also many misconceptions floating around: static typing does not require copious type annotations—many modern compilers can figure out the types on their own, a process known as “type inference”. On the other hand, dynamically typed languages don’t seem to exhibit significantly higher defect rates than their statically typed counterparts.

Duck Typing

Ruby itself is a dynamically type-checked language and follows a “duck typing” approach:

If it walks like a duck and it quacks like a duck, then it must be a duck.

What this means is that Ruby developers generally don’t worry too much about an object’s type, but rather about whether it responds to certain “messages” (or methods).
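
As a minimal sketch of duck typing, consider the snippet below. The Duck, Robot and make_it_quack names are made up for this illustration; the point is that the method never checks the class of its argument, only (implicitly) whether it responds to quack:

```ruby
# Hypothetical classes for illustration; they share no superclass
# other than Object.
class Duck
  def quack
    "Quack!"
  end
end

class Robot
  def quack
    "Beep quack."
  end
end

# Accepts anything that responds to #quack, regardless of its class.
def make_it_quack(thing)
  thing.quack
end

make_it_quack(Duck.new)   # returns "Quack!"
make_it_quack(Robot.new)  # returns "Beep quack."

# nil does not respond to #quack, so the mistake only surfaces at runtime.
begin
  make_it_quack(nil)
rescue NoMethodError => e
  puts "#{e.class}: #{e.message}"
end
```

The last call is exactly the kind of NoMethodError on nil mentioned at the start of this post: nothing complains until the line actually runs.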

So why bother with static typing in Ruby then, you may ask? While it certainly is no panacea that will make your code magically bug free, it does provide certain benefits:

  • Correctness: static typing is good at preventing certain classes of bugs, like the aforementioned NoMethodError.
  • Tooling: having static type information available during development often leads to better tooling, such as refactoring support in IDEs.
  • Documentation: many statically typed languages have great built-in documentation tools. Haskell’s Hoogle uses this to great effect by offering a search engine where functions can be looked up by their type signatures.
  • Performance: the more information the compiler has, the more performance optimizations it can potentially apply.

This list is not exhaustive, and one can find counterexamples for most of these points, but there’s certainly a core of truth to them.

Gradual Type Checking

In recent years, an approach commonly referred to as “gradual type checking” has made inroads into various dynamically type-checked languages: from TypeScript for JS to Hack for PHP and mypy for Python. What these tools have in common is that they aren’t all-or-nothing: developers can gradually add type information to variables and expressions as they see fit. This is especially useful for existing large codebases, where one can statically check the most critical parts of the system while still leaving the rest untyped and checked at runtime. All the type checking solutions for Ruby that we’ll explore in the rest of this article follow the same approach.

Options

After looking at why Ruby developers may want to add static type checking to their development workflows, it’s time to explore some of the currently popular options for doing so. However, it’s important to note that the idea of adding static type checking to Ruby isn’t new. Researchers from the University of Maryland worked on a Ruby extension named Diamondback Ruby (DRuby) as early as 2009. In 2013, the Tufts University Programming Language Group released a paper called The Ruby Type Checker, which eventually led to the RDL project, a library that offers type checking and design-by-contract capabilities.

Sorbet

Developed by Stripe, Sorbet is currently the most talked-about type checking solution for Ruby, not least because big companies like Shopify, GitLab, Kickstarter and Coinbase were early adopters during its closed beta phase. It was originally announced at RubyKaigi 2018 and saw its first public release on June 20, 2019. Sorbet is written in modern C++ and, despite Matz’s preferences (quote: “I hate type annotations”), opted for an approach based on type annotations. One particularly interesting thing about Sorbet is that it combines static and dynamic type checking, since Ruby’s extremely dynamic nature and metaprogramming capabilities are challenging for static type systems.

# typed: true
class Test
  extend T::Sig

  sig {params(x: Integer).returns(String)}
  def to_s(x)
    x.to_s
  end
end

To enable type checking, we first need to add the # typed: true magic comment and extend our class with the T::Sig module. The actual type annotation is specified with the sig method:

sig {params(x: Integer).returns(String)}

which specifies that this method takes a single argument named x that is of type Integer and returns a String. Trying to call this method with the wrong argument type will lead to an error:

Test.new.to_s("42")
# Expected Integer but found String("42") for argument x
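
Since Sorbet combines static and dynamic checking, it can help to picture the runtime half as an explicit guard at the top of the method. The following is a rough mental model only: it is hand-written plain Ruby with made-up error messages, not how Sorbet actually implements sig.

```ruby
# Plain-Ruby illustration only; the guards are an assumption standing in
# for the checks a signature describes, not Sorbet's actual mechanism.
class Test
  def to_s(x)
    # Corresponds to params(x: Integer)
    unless x.is_a?(Integer)
      raise TypeError, "Expected Integer but found #{x.class} for argument x"
    end

    result = x.to_s

    # Corresponds to returns(String)
    unless result.is_a?(String)
      raise TypeError, "Expected String return value but found #{result.class}"
    end
    result
  end
end

Test.new.to_s(42)      # returns "42"

begin
  Test.new.to_s("42")
rescue TypeError => e
  puts e.message       # prints "Expected Integer but found String for argument x"
end
```

The difference with a static checker is that the mistake is reported before the program runs, rather than when the offending call is executed.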

Apart from these basic checks, Sorbet has quite a few more tricks up its sleeve. For example, it can save us from the dreaded NoMethodError on nil:

users = T::Array[User].new
user = users.first
user.username

# Method username does not exist on NilClass component of T.nilable(User)

The snippet above defines an empty array of User objects, and when we try to access the first element (which will return nil), Sorbet correctly warns us that no method named username is available on NilClass. However, if we are sure that a certain value can never be nil, we can use T.must to let Sorbet know this:

users = T::Array[User].new
user = T.must(users.first)
user.username

While the above code will now type check, it will still raise an exception at runtime if users.first turns out to be nil (as it does here, since the array is empty), so use this feature with care.
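
To see why care is needed, here is a plain-Ruby sketch of the idea behind T.must. The must helper and the User struct below are hypothetical stand-ins written for this example, not Sorbet’s code: the helper returns its argument unchanged unless that argument is nil, in which case it raises.

```ruby
# Hypothetical stand-in for the idea behind T.must
# (not Sorbet's implementation).
def must(value)
  raise TypeError, "Passed nil into must" if value.nil?
  value
end

# Hypothetical User model with a single field.
User = Struct.new(:username)

users = [User.new("matz")]
must(users.first).username   # returns "matz"

empty = []
begin
  must(empty.first).username
rescue TypeError => e
  puts e.message             # prints "Passed nil into must"
end
```

In other words, the assertion moves the nil check from the type checker back to runtime, which is only a good trade when the value really cannot be nil.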

There’s a lot more that Sorbet can do for us: dead code detection, type pinning (essentially committing a variable to a certain type, for example, once it has been assigned a string, it can never be assigned an integer), or the ability to define interfaces.

Additionally, Sorbet can work with “Ruby Interface” (.rbi) files, which it keeps in a sorbet/ folder in your current working directory. This allows us to generate interface definitions for all the gems a project uses, which can help us find even more type errors.

There’s much more to Sorbet than we can cover in a single article (e.g. the varying strictness levels or metaprogramming plugins), but its documentation is pretty good already and open for PRs.

Steep

The most widely known alternative to Sorbet is Steep by Soutaro Matsumoto. It does not use annotations and doesn’t do any type inference on its own. Instead, it completely relies on .rbi files in the sig directory.

Let’s start from the following simple Ruby class:

class User
  attr_reader :first_name, :last_name, :address

  def initialize(first_name, last_name, address)
    @first_name = first_name
    @last_name = last_name
    @address = address
  end

  def full_name
    "#{first_name} #{last_name}"
  end
end

We can now scaffold an initial user.rbi file with the following command:

$ steep scaffold user.rb > sig/user.rbi

This results in the following file, which is intended as a starting point (illustrated by the fact that almost all types have been specified as any, which provides no safety):

class User
  @first_name: any
  @last_name: any
  @address: any
  def initialize: (any, any, any) -> any
  def full_name: () -> String
end

However, if we try to type check at this point, we’ll encounter some errors:

$ steep check
user.rb:11:7: NoMethodError: type=::User, method=first_name (first_name)
user.rb:11:21: NoMethodError: type=::User, method=last_name (last_name)

The reason we’re seeing these is that Steep needs a special comment to know which methods have been defined through attr_reader, so let’s add that:

# @dynamic first_name, last_name, address
attr_reader :first_name, :last_name, :address
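
The underlying reason is that attr_reader is an ordinary method call that defines the readers while the class body executes, so there is no def first_name in the source for a tool to find. A small plain-Ruby sketch (independent of Steep; the ADDED constant exists only for this demonstration) makes this visible:

```ruby
class User
  # No reader methods exist before attr_reader runs.
  before = instance_methods(false)

  attr_reader :first_name, :last_name, :address

  # attr_reader has now defined three public instance methods.
  ADDED = instance_methods(false) - before
end

p User::ADDED.sort  # prints [:address, :first_name, :last_name]
```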

Additionally, we need to add signatures for the reader methods to the generated .rbi file. While we are at it, let’s also change the signatures from any to the actual types:

class User
  @first_name: String
  @last_name: String
  @address: Address
  def initialize: (String, String, Address) -> any
  def first_name: () -> String
  def last_name: () -> String
  def address: () -> Address
  def full_name: () -> String
end

Now, everything works as expected and steep check doesn’t return any errors.

On top of what we’ve seen so far, Steep also supports generics (e.g. Hash<Symbol, String>) and union types, which represent an either-or choice between several types. For example, a user’s top_post method could return the highest-ranked post written by the user, or nil if they haven’t contributed anything yet. This is represented through the union type (Post | nil), and the corresponding signature would look like this:

def top_post: () -> (Post | nil)
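
In plain Ruby terms, a caller of such a method has to handle both branches of the union. The sketch below uses hypothetical Post and User definitions (the title and score fields are assumptions made for this example) and the safe navigation operator to deal with the nil case:

```ruby
# Hypothetical models for illustration only.
Post = Struct.new(:title, :score)

class User
  def initialize(posts)
    @posts = posts
  end

  # Returns a Post, or nil when the user has no posts: (Post | nil).
  def top_post
    @posts.max_by(&:score)
  end
end

author = User.new([Post.new("Hello", 3), Post.new("Types", 7)])
lurker = User.new([])

author.top_post&.title  # returns "Types"
lurker.top_post&.title  # returns nil
```

A checker that understands the union can flag a plain lurker.top_post.title call, because title is not defined on the nil branch.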

While Steep certainly has fewer features than Sorbet, it’s still a helpful tool and seems to be more in line with what Matz envisioned type checking in Ruby 3 to look like.

Ruby Type Profiler

Yusuke Endoh (better known as “mame” in Ruby developer circles) from Cookpad is working on a so-called level 1 type checker called Ruby Type Profiler. Unlike the other solutions presented here, it doesn’t need signature files or type annotations but instead tries to infer as much as possible about a Ruby program while parsing it. Although it catches far fewer potential problems than either Steep or Sorbet, it comes at no extra cost to the developer.

Summary

While nobody can predict the future, it seems like type checking in Ruby is something that’s here to stay. Currently, there are efforts underway to standardize on a “Ruby Signature Language” for use in .rbi files (potentially scaffolded by Ruby Type Profiler), so developers can use whichever tool they prefer. Steep already allows library authors to ship type information with their gems, and Sorbet has a similar mechanism in the form of sorbet-typed, which was inspired by the DefinitelyTyped repository for TypeScript definitions. If you’re interested in helping shape the future of type checking in Ruby, now is a great time to get involved!

Guest Author Michael Kohl’s love affair with Ruby started around 2003. He also enjoys writing and speaking about the language and co-organizes Bangkok.rb and RubyConf Thailand.

