Ruby 2.7 NEWS: Commentary by Cookpad’s Full Time Ruby Committers

source link: https://sourcediving.com/ruby-2-7-news-commentary-by-cookpads-full-time-ruby-comitters-bdbaacb36d0c

We are Koichi Sasada (ko1) and Yusuke Endoh (mame) from the Cookpad Inc. tech team. Cookpad sponsors us to work full time on developing the Ruby interpreter (MRI: Matz Ruby Implementation).

Koichi Sasada (ko1) and Yusuke Endoh (mame)

We released a Japanese article “Ruby 2.7 NEWS explained by Ruby Professionals” when Ruby 2.7 was released on 25th Dec. 2019. This is an English translation of the article with help from Miles Woodroffe.

“NEWS” is a text file that lists all new features and changes of the Ruby interpreter. Compared to a few years ago, we are making an effort to make the file easier to read, for example, by adding many examples. Some of the code in the article is quoted from the NEWS file. In this article, in addition to a description of the new features and changes, we will explain the background of “why and how the changes have been introduced” as much as possible to hopefully make it easier to understand.

The Ruby 2.7 release has a lot of changes leading to Ruby 3, which is scheduled for release next year, in 2020. Also, as always, many useful new features and performance improvements have been introduced. We hope you will enjoy them.

Language Changes

Changes to the grammar, semantics, etc. of the Ruby programming language.

1. Pattern Matching

Pattern matching has been experimentally added [Feature #14912]. The feature is to check and deconstruct a data structure nicely.

What is a pattern match?

At first glance, it looks like the case/when statement that you are familiar with, but the new syntax is case/in. The semantics are also similar to case/when: it searches the patterns from top to bottom for one that matches the value. Suppose the value being matched is {a: 0, b: 1, c: 2}.

in {a: 0, x: 1} means “the key a exists and its value is 0” and “the key x exists and its value is 1”.
{a: 0, b: 1, c: 2} satisfies the first condition but not the second, because there is no x. So the pattern does not match the value.

Then, it tries the next pattern in {a: 0, b: var}. This means “the key a exists and its value is 0” and “the key b exists, its value can be anything, and that value is assigned to the variable var”.
The value satisfies both conditions, so the pattern matches the value, assigns 1 to var, and executes the corresponding clause, i.e., p var in this case.

If no pattern matches, a NoMatchingPatternError exception is thrown. Note that this is different from case/when syntax.
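The walk-through above can be sketched as follows. This is a minimal sketch that assumes Ruby 2.7 or later (Ruby 2.7 itself also prints the "experimental" warning); the names config, result, and unmatched are chosen only for illustration.

```ruby
config = {a: 0, b: 1, c: 2}

result =
  case config
  in {a: 0, x: 1}
    :first   # not taken: there is no key x
  in {a: 0, b: var}
    var      # taken: a is 0, and the value of b is bound to var
  end

p result

# Unlike case/when, case/in raises when no pattern matches.
unmatched =
  begin
    case config
    in {z: 1}
      :never
    end
  rescue NoMatchingPatternError
    :no_match
  end

p unmatched
```

Rescuing NoMatchingPatternError also covers the more specific subclasses that later Ruby versions raise.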

As a concrete use case, you can use it to check that the JSON data has the expected structure, and then take out the necessary data at once. You don’t have to use #dig anymore.
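As a rough illustration of the JSON use case, the sketch below checks the shape of a parsed document and extracts two values in one step. The payload and its field names (user, name, roles, error) are hypothetical, and Ruby 2.7 or later is assumed.

```ruby
require "json"

# Hypothetical payload; the field names are made up for illustration.
json = '{"user": {"name": "Alice", "roles": ["admin", "dev"]}}'
data = JSON.parse(json, symbolize_names: true)

summary =
  case data
  in {user: {name: String => name, roles: [first_role, *]}}
    # Structure check and extraction happen together.
    "#{name} (#{first_role})"
  in {error: String => message}
    "error: #{message}"
  end

p summary
```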

It would take a long time to explain the full details of pattern matching. For more on this, see the material by Kazuki Tsujimoto, who designed and implemented Ruby’s pattern matching (it may be a little out of date, unfortunately).

What was difficult in introducing pattern matching?

Here is the background of the change.

Pattern matching is a feature often used in statically typed functional programming languages. It has been anticipated for Ruby for a long time: many people tried to emulate it in Ruby, proposed it for Ruby, or prototyped it with Ruby.

However, it was difficult to propose a suitable language-builtin syntax. This is because Ruby’s syntax is too flexible and has little room for expansion. It is almost impossible to introduce a new keyword because of backwards compatibility. In addition, usually, a pattern syntax for pattern matching is similar to the construction syntax: [x, y, z] for an array and {a: x, b: y} for a hash.

Kazuki Tsujimoto settled this situation. He suggested reusing the keyword in. Ruby has an iteration syntax for ... in (that is rarely used these days), and for this syntax in was already a keyword, so there is no need to introduce a new one. It might not be the best keyword to express pattern matching, but the case/in syntax is reasonably intuitive. Using this idea, the introduction of pattern matching became a reality.

Kazuki made the prototype proposal of the grammar and semantics for pattern matching in 2018, which triggered the discussion. He implemented it in 2019, committed it at the time of RubyKaigi 2019, and repeated experiments and discussions for more than half a year. It was finally released in 2.7.

However, it is still experimental, and if you use it, you will get the following warning.

$ ./miniruby -e 'case 1; in 1; end'
-e:1: warning: Pattern matching is experimental, and the behavior may change in future versions of Ruby!

I think that it will be stabilized through finer improvements as it is used more widely. Putting it into production code may be a risk, but I’d like you to give it a try and give us feedback.

(Credit: mame)

2. Warning for keyword argument separation in Ruby 3

In Ruby 3, an incompatibility called “keyword argument separation” is planned. Ruby 2.7 now warns about code that will not work in Ruby 3. [Feature #14183]

Keyword arguments of Ruby 2

Ruby 2 keyword arguments are passed as just hash arguments. This is a legacy of the Ruby 1 era, and I thought it was a natural extension at the time. However, this design was a trap that created many non-intuitive behaviors.

The problem is that the called method cannot tell whether it received keywords or a hash. Here is a concrete example.

Suppose we define two very similar methods, foo and bar, and call each of them with a hash:

foo({}) #=> [{}, {}]
bar({}) #=> [1, {}]
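The definitions of the two methods do not appear above. The sketch below is one reconstruction that is consistent with the results shown; the exact signatures are an assumption. Here foo takes a required argument and bar an optional one, and each returns what it received.

```ruby
# Hypothetical reconstruction of the two methods (signatures are assumed).
def foo(x, **kw)
  [x, kw]
end

def bar(x = 1, **kw)
  [x, kw]
end

# x is required, so the hash is taken as x on every Ruby version.
p foo({})

# Version-dependent: the article reports [1, {}] for Ruby 2
# and [{}, {}] for Ruby 3.
p bar({})

# With no arguments at all, the default value of x is used.
p bar
```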

Surprised? Since the methods do not know whether the last argument was a hash object or keywords, they interpret the argument with an ad-hoc priority of “required arguments > keyword arguments > optional arguments”. At the time of the 2.0 release, it was “keyword arguments > required arguments > optional arguments”, but this was changed after bugs were reported.

How can we pass {} as an optional argument to bar? You might come up with bar({}, **{}). However, in Ruby 2.6 this gives a disappointing result.

bar({}, **{}) #=> [1, {}]

**{} is considered “the same as nothing”, and then the first {} is interpreted as keywords. To pass {} as an argument to bar, the correct call was bar({}, {}). Hardly anyone would guess that.

In the initial version of 2.0, **{} consistently passed {}, but a bug report stating that “**{} should be the same as nothing” was submitted, and the behavior was changed later. Ruby 2’s keyword arguments kept repeating this cycle, in which fixing one thing created another non-intuitive behavior.

Keyword arguments in Ruby 3

The problem with Ruby 2 stems from the basic design of passing keyword arguments as a hash. Ruby 3 will fix this fundamentally. That is, it separates keyword arguments from positional arguments.

In Ruby 3, foo({}) always passes positional arguments, and foo(**{}) always passes keyword arguments. Sounds perfect.

# in Ruby 3
foo({}) #=> [{}, {}]
bar({}) #=> [{}, {}]

foo(**{}) #=> wrong number of arguments (given 0, expected 1)
bar(**{}) #=> [1, {}]

However, this would break any code that was written as foo(opt) and intended to pass keyword arguments. Such code needs to be rewritten as foo(**opt).

So Ruby 2.7 works in principle the same as Ruby 2.6, but it warns you if you make a call that will not work in Ruby 3.

def foo(**kw)
end

foo({}) #=> test.rb:4: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
# test.rb:1: warning: The called method `foo' is defined here

If this warning appears, it means that the code will not work in Ruby 3 unless you take appropriate measures. Add ** in this case.
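A minimal sketch of the fix: the explicit double splat marks the hash as keywords at the call site. The names foo and opts are illustrative only.

```ruby
def foo(**kw)
  kw
end

opts = {k: 1}

# foo(opts) relies on the implicit hash-to-keywords conversion that Ruby 3 removes.
# The explicit ** passes the hash as keywords on Ruby 2.6, 2.7, and 3 alike.
received = foo(**opts)
p received
```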

More about keyword argument separation

While this is easy, the migration of keyword argument separation is sometimes more difficult, especially in cases involving delegation. Check out the migration guide on Ruby’s official website.

Background of keyword argument separation

I confess. I (Yusuke Endoh) implemented the keyword argument in Ruby 2. As an excuse, I knew that there were some corner cases that were somewhat uncomfortable. However, Ruby 1 already had the omission of hash braces in a method call (foo(:a => 1, :b => 2)). Ruby 2’s keyword arguments were a natural extension. So I thought it was a good compromise if we cared about compatibility.

However, a number of non-intuitive behaviors have been reported since the 2.0 release, and each time they made the semantics increasingly complex. What I learnt is that the language design must not be based on bug reports.

I’ve always regretted this and hoped to fix it with Ruby 3. When I became a full-time committer at Cookpad, I told Matz about this, and Matz thought the same way. He announced the change at RubyWorld Conference 2017 and RubyConf 2017.

Since late 2017, Matz, akr, ko1, and I have been working on this issue; what semantics we should aim for in Ruby 3, and how we can provide a migration path. I repeated the process of creating a design proposal, performing an experiment with a Cookpad internal Rails app, and measuring the impact. Some of the results were also reported in the ticket [Feature #14183].

At first, we aimed for complete separation, but the incompatibilities were too big (too much code passes foo(k: 1) to the method def foo(opt = {}); end). This was pointed out by Jeremy Evans on the ticket, and we agreed there to give up separation in this case. This was around April 2019.

Jeremy Evans became a committer following this discussion. He not only argued for suitable semantics, but also implemented them and ran experiments. Ruby 2.7 couldn’t have been released without him. I really appreciate his efforts.

(mame)

3. Numbered parameters

A feature called Numbered parameters has been introduced that allows you to omit the declaration of block parameters. [Feature #4475]

We can convert the elements of the array ary = [1, 2, 10] into hexadecimal strings with ary.map {|e| e.to_s(16) }. The block parameter name e means “element”, and it is the name I often choose when I don’t want to think of a good one. However, since each element is an integer, |i| is also hard to pass up. No, |n| for number might be nice too. Which should I use? Hmm. Naming is cumbersome.

Naming can be a hassle. It’s a simple program (converting the contents of an array to hexadecimal notation would be a very simple program in Ruby), and I don’t want to spend time to think about it.

Therefore, a new feature called numbered parameters was introduced (Feature #4475: default variable name for parameter). This feature allows you to refer to block arguments as _1 and _2 without naming them. The example of conversion to hexadecimal notation above can be written as ary.map {_1.to_s(16)}.

ary.map {|e| e.to_s(16)}
ary.map {_1.to_s(16)}

Comparing the two, you can see that three characters have been omitted. Well, I think the nice thing is not the number of characters but that you don’t have to think about names when writing.
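A short sketch of numbered parameters with one and with two block arguments, assuming Ruby 2.7 or later. The variable names are illustrative.

```ruby
ary = [1, 2, 10]

# _1 stands for the first (here: only) block argument.
hex = ary.map { _1.to_s(16) }
p hex

# _2 refers to the second block argument.
sums = [[1, 2], [3, 4]].map { _1 + _2 }
p sums
```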

Detailed story of Numbered parameters

If you use a lot of _1 and _2, your program will become chaotic. You should be careful to use them only for really small bits of code. Maybe someone will write a plugin for RuboCop. Also, it is better not to use them when you need to make your code understandable to others. An appropriate variable name is fairly important for easy-to-understand programs.

Numbered parameters are hard to follow when blocks are nested, so only one block in a nest of blocks can use them. For example, if you use them in both the outer block and the innermost block, you will get an error:

3.times do
  _1.times do
    p _1
  end
end

# =>
# t.rb:3: numbered parameter is already used in
# t.rb:2: outer block here
# p _1
# ^ ~~~~~~~

The error message is easy to understand.

When using only the first argument in a block, you can use two similar notations, iter {|e|} and iter {|e,|}. The difference is |e| versus |e,| in the declaration of block arguments, that is, whether or not there is a comma. Well, because it’s a hassle, I will skip the detailed explanation, but if _1 is the only numbered parameter written in the block, it means |e|.

_1 used to be an ordinary identifier that could be used for local variables and method names. However, it is better not to use it that way in the future. If you already have a program that uses names like _1, we recommend you rename them.
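The difference between |e| and |e,| shows up when each element is itself an array. The following sketch assumes the final Ruby 2.7 behavior described above; the variable names are illustrative.

```ruby
pairs = [[1, 2], [3, 4]]

# A lone _1 behaves like |e|: it receives the whole element.
whole = pairs.map { _1 }

# |e,| destructures the element and keeps only its first item.
heads = pairs.map { |e,| e }

# Once _2 is also used, the element is destructured across _1 and _2.
combined = pairs.map { _1 * 10 + _2 }

p whole, heads, combined
```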

If there is a local variable _1 outside the block, it will be used as a variable in the outer scope, not an implicit block argument. However, a warning is issued for the expression _1 = ....

_1 = 0
#=> warning: `_1' is reserved for numbered parameter; consider another name
[1].each {p _1} # prints 0 instead of 1

Discussion on Numbered parameters

This feature has been discussed for quite some time. However, it was hard to decide. The main issues are notation and function.

For example, Groovy has a feature called it, equivalent to _1 here. However, in Ruby, the identifier it is already used in RSpec etc. There was also a debate as to whether supporting only the first argument was sufficient. Personally, I liked <> of Scheme (SRFI 26), but there were some opinions that it looks like !=.

At one point, Matz suddenly decided on @1, @2, … at a Ruby developer meeting (a monthly meeting to review the specifications with Matz). At this time, using only @1 had the same meaning as |e,|.

After that, there were various discussions about notation. For example, the ticket [Misc #15723] “Reconsider numbered parameters” has 129 comments.

In addition, mame-san performed an experiment of mechanically changing the block argument to @1, … and checking the notation (https://twitter.com/mametter/status/1159346003536838656). Looking at this, Matz said that @1 is too similar to an instance variable.

And then, Matz changed his mind and chose _1, _2, … Also, |e| is used far more often than |e,|, so a lone _1 was changed to have the same meaning as |e|. It took a very long time to reach this decision.

This feature is all about how to write blocks easily. In the hexadecimal conversion above, for example, if there were an Integer#to_s16 method, you could write ary.map(&:to_s16). Everyone loves that notation. But we have no to_s16. Numbered parameters were introduced because they meet the need to write such blocks. Another suggestion was to provide a mechanism for generating a Proc with the argument “16”, for example, ary.map(&:to_s(16)). Various notations like this have been requested. We thought that numbered parameters would cover almost all of them, so numbered parameters (_1, …) were introduced.

If someone designs a different mechanism that allows us to use this kind of functional language-like features in Ruby, it may be introduced in future.

(ko1)

4. Calling proc/lambda without blocks is deprecated/prohibited

  • Proc.new and Kernel#proc with no block in a method called with a block gets a warning now.
  • Kernel#lambda with no block in a method called with a block raises an exception.

proc {...} creates a Proc object. Well, do you know what happens if you don’t pass the block?

If you don’t specify a block, proc returns, as a Proc object, the block that was passed to the method calling proc.

The feature was designed in a very old era, when there were no block arguments (&block) in Ruby. This change retires the feature. If you are using it, you will get a warning: warning: Capturing the given block using Kernel#proc is deprecated; use '&block' instead.

lambda without a block had produced a warning for a few years, and now raises an exception (ArgumentError) in 2.7.

How to write without using proc / lambda without block

From now on, please rewrite such code to use a block argument (&block).
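A minimal sketch of such a rewrite: the method declares &block and uses that Proc directly instead of calling proc with no block. The method name capture is hypothetical.

```ruby
# Hypothetical method name, for illustration only.
def capture(&block)
  # block is already the Proc for the block passed by the caller,
  # so there is no need to call proc without a block.
  block
end

doubler = capture { |x| x * 2 }
p doubler.call(21)
```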

proc without a block was handy for a method that accepts its callback either as a Proc argument or as a block.

This cannot be reproduced with a block argument alone.

It is possible to cope by adding one extra line with a conditional. Well, it’s hard to read, so it’s better to reconsider an API that accepts both.
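One way to write the "accept both" case with a block argument plus one conditional line is sketched below. The method name run and the error message are assumptions made for this example.

```ruby
# Hypothetical API that accepts either a callable argument or a block.
def run(callable = nil, &block)
  callable ||= block   # the extra conditional line: fall back to the block
  raise ArgumentError, "no callable or block given" unless callable
  callable.call
end

from_argument = run(proc { :from_argument })
from_block    = run { :from_block }
p from_argument, from_block
```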

Background of prohibiting proc / lambda without a block

The trigger for this change was Matz’s suggestion of def foo(&nil) as a way to define a method that does not accept a block. (Note that this proposal is not included in 2.7.) The proposal aimed to catch the mistake of passing a block to a method that does not accept one. (Have you ever written p {...} by mistake?) However, I was opposed to this feature because it would bring to Ruby the tedious custom of adding &nil to every method definition that accepts no block, i.e., most method definitions.

Instead, I think that the interpreter should display a warning or an error when a block is passed to a method that does not appear to use one. After a few tries, I actually found two bugs this way ([Feature #15554]). However, this proposal has some problems and is not included in 2.7, because in real-world programs I found some reasonable usages that intentionally pass a block to a method that does not use it. Let’s look forward to Ruby 3.

There were some obstacles to determining whether a method uses a block or not. One of them is “proc without a block”. proc uses a block, but it looks just like a normal method call to the interpreter. The interpreter has no way to determine at compile time whether the method call named proc is really Kernel#proc. So the interpreter overlooks this use of a given block.

In addition to this background, in recent years some people have said that proc (and lambda) without blocks should be prohibited. So it was deprecated.

(ko1)

5. Beginless range

  • A beginless range has been experimentally introduced. It might not be as useful as the endless range, but could be good for DSL purpose. [Feature #14799]

You may know that the endless range was introduced in Ruby 2.6. The next addition is its beginless counterpart.

ary = [1, 2, 3, 4, 5]
p ary[..2] #=> [1, 2, 3]

Unlike an endless range, a beginless range cannot use #each, so its use is limited to expressing a range of values.
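The following sketch, which assumes Ruby 2.7 or later, shows a beginless range used for slicing and for membership tests, and shows that iteration is rejected. The variable names are illustrative.

```ruby
ary = [1, 2, 3, 4, 5]
head = ary[..2]               # everything up to and including index 2

covered = (..5).cover?(3)     # membership tests work without a beginning

label =
  case 3
  when ..5 then :small        # usable in case/when through Range#===
  else :large
  end

# There is no first element to start from, so each raises.
each_result =
  begin
    (..5).each { |x| x }
  rescue TypeError
    :cannot_iterate
  end

p head, covered, label, each_result
```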

(mame)

6. Deprecation of special variables $; and $,

  • Setting $; to non-nil value will generate a warning now. Use of it in String#split will also be warned. [Feature #14240]
  • Setting $, to non-nil value will generate a warning now. Use of it in Array#join will be warned too. [Feature #14240]

As part of the move away from the special variables inherited from Perl, the use of $; and $, will display a warning. I don’t know what the variables are, but according to NEWS, they seem to be related to String#split and Array#join.

Few people use them in this way. In addition, they are potentially dangerous since changing them affects some libraries that use String#split. This is why they are deprecated.

(mame)

7. Line breaks are prohibited in quoted here-document identifiers

  • Quoted here-document identifier must end within the same line.

In here-documents, you can enclose identifiers in quotes. (FYI: You can prohibit string interpolation by writing <<'EOS'.) Now, this EOS part was actually allowed to contain line breaks. You probably have no idea what I am talking about.

<<"EOS
" # This had been warned since 2.4; Now it raises a SyntaxError
EOS

We could write this kind of code. In this case, EOS\n was the delimiter. Such a program has produced a warning since Ruby 2.4, and in Ruby 2.7 it is now an error. I wonder if this has been used in any program.

(ko1)

8. Flip-flop is back

Flip-flop was deprecated in Ruby 2.6, but came back (probably) because the voice of “I’m still using it!” was so strong. Congratulations to the fans. Speaking up matters.
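For readers who have never met it, a flip-flop is a range of two conditions used as a condition itself: it turns on when the first condition becomes true and stays on until the second one becomes true. A small sketch, with illustrative values:

```ruby
selected = []

(1..10).each do |i|
  # On from the iteration where i == 3, off after the one where i == 6.
  selected << i if (i == 3)..(i == 6)
end

p selected
```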

(ko1)

9. Some method chains like .bar can be commented out

  • Comment lines can be placed between fluent dot now.

foo
  # .bar
  .baz #=> foo.baz

When the method chain foo.bar.baz is written with a line break before each ., you may want to comment out only the .bar part, for example, during debugging. Previously, commenting out like this resulted in a syntax error, but in Ruby 2.7 it now means foo.baz.

Note that a blank line in the chain still causes a syntax error.
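A runnable sketch of the same idea with real methods, assuming Ruby 2.7 or later. The array contents are illustrative.

```ruby
words = ["b", "a", "c"]
  # .map(&:upcase)
  .sort

# The commented-out step is skipped, so this is ["b", "a", "c"].sort.
p words
```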

foo

.bar
#=> syntax error, unexpected '.', expecting end-of-input

(ko1)

10. Private method can be called even with self.

Private methods could not be called with a receiver. That is, the private method foo could not be called as recv.foo. Also, self.foo failed as well. However, for some reason, we want to attach self., and Ruby 2.7 allows it.

self.p 1
# =>
# Ruby 2.6: t.rb:1:in `<main>': private method `p' called for main:Object (NoMethodError)
# Ruby 2.7: 1

Note that the receiver must be written exactly as self.; private methods cannot be called as s.foo through a variable such as s = self.

There is a tiny incompatibility with this feature. Until now, calling self.private_method invoked method_missing; now the private method itself is called, so method_missing is no longer invoked. I would rather not think that any program depends on the old behavior.
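The rule can be sketched with a small class, assuming Ruby 2.7 or later. The class and method names are hypothetical.

```ruby
class Greeter
  def greet
    self.greeting        # allowed since Ruby 2.7: the receiver is the literal self
  end

  def greet_via_variable
    s = self
    s.greeting           # still NoMethodError: the receiver is a variable
  end

  private

  def greeting
    "hello"
  end
end

g = Greeter.new
p g.greet

via_variable =
  begin
    g.greet_via_variable
  rescue NoMethodError
    :no_method_error
  end
p via_variable
```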

(ko1)

11. Changing the precedence of postfix rescue in multiple assignment

  • Modifier rescue now operates the same for multiple assignment as it does for single assignment. [Bug #8279]
a, b = raise rescue [1, 2]
# Previously parsed as: (a, b = raise) rescue [1, 2]
# Now parsed as: a, b = (raise rescue [1, 2])

As you can see in the comments, the way postfix rescue is parsed has changed when it is used in a multiple assignment expression. Was the position of the parentheses the same as you expected?

Personally, I don’t use postfix rescue because it’s difficult. I can never tell which exceptions it will catch.
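A small sketch of the new parse, assuming Ruby 2.7 or later. As a side note, postfix rescue catches StandardError and its subclasses, which the second line illustrates with the ArgumentError raised by Integer.

```ruby
# Parsed as a, b = (raise rescue [1, 2]) on Ruby 2.7+.
a, b = raise rescue [1, 2]
p [a, b]

# Integer("x") raises ArgumentError, a StandardError, so the fallback is used.
fallback = Integer("x") rescue 0
p fallback
```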

(ko1)

12. yield in singleton class is deprecated

  • yield in singleton class syntax is warned and will be deprecated later [Feature #15575].

I suspect you cannot understand what this is saying. Even if you do, you can probably think of no reason why one would do so. Code that calls yield inside the body of class << obj ... end within a method works on Ruby 2.6 and earlier.

But it doesn’t make sense, so it now produces a warning. I believe no one does this. It will be a syntax error in Ruby 3.

The background of the change is actually the same as the prohibition of proc without a block.

(ko1)

13. Notation for delegating arguments (...)

A notation for delegation that passes received arguments to other methods as-is has been introduced.

Both the recipient and the passer must be (...). Otherwise, it would be a syntax error.

Delegation used to be written by hand, receiving and passing on the positional arguments, options, and block one by one. It was cumbersome, difficult to optimize, and would additionally require passing **opt in Ruby 3. Thus, the feature was introduced.

In addition, many people think that “parentheses can be omitted in Ruby method calls”, but parentheses are required for this delegation notation. The reason is that bar ... is interpreted as an endless range.
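The two styles can be compared side by side. This is a sketch that assumes Ruby 2.7 or later; target, forward_all, and forward_explicit are hypothetical names.

```ruby
# Hypothetical delegation target that reports what it received.
def target(a, b = 0, key: nil, &blk)
  [a, b, key, blk ? blk.call : nil]
end

# New notation: (...) in both the parameter list and the call.
def forward_all(...)
  target(...)
end

# The explicit style that (...) replaces.
def forward_explicit(*args, **opt, &blk)
  target(*args, **opt, &blk)
end

new_style = forward_all(1, 2, key: 3) { 4 }
old_style = forward_explicit(1, 2, key: 3) { 4 }
p new_style, old_style
```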

(mame)

14. Deprecation of $SAFE

  • Access and setting of $SAFE is now always warned. $SAFE will become a normal global variable in Ruby 3.0. [Feature #16131]

$SAFE, the old security mechanism of Ruby, has been removed. Did you even know about $SAFE?

$SAFE is a security feature based on a “taint” flag that marks untrusted data (such as strings) generated from IO input, etc. If the code attempts to use tainted data for dangerous operations like system or open, the interpreter stops it (SecurityError is raised).

However, modern frameworks do not consider $SAFE and do not set appropriate taint flags, making $SAFE virtually unusable. In this situation, trusting $SAFE is more dangerous. So deletion of this feature was proposed.

In 2.7, if you assign something to $SAFE, you will get the following warning.

$SAFE = 1
#=> t.rb:1: warning: $SAFE will become a normal global variable in Ruby 3.0

In Ruby 3.0, $SAFE will become a normal global variable. I was wondering if it would be treated like a “retired number”, though.

  • Object#{taint, untaint, trust, untrust} and related functions in the C-API no longer have an effect (all objects are always considered untainted), and are now warned in verbose mode. This warning will be disabled even in non-verbose mode in Ruby 3.0, and the methods and C functions will be removed in Ruby 3.2. [Feature #16131]

With this change, methods such as Object#taint have no effect. This means that there are no objects with the taint flag. Ruby 3.2 will remove these methods, so let’s deal with them early.

(ko1)

15. Refinements are considered in Object#method etc.

  • Refinements take place at Object#method and Module#instance_method. [Feature #15373]

Until now, the method objects retrieved by Object#method etc. did not care about refinements, but now they do.
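A sketch of the change, assuming Ruby 2.7 or later. The module names Shout and Demo and the method shout are made up for this example; using is called inside a module body so that the refinement is active only there.

```ruby
module Shout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module Demo
  using Shout

  # On Ruby 2.7+, Object#method finds the refined method in this scope.
  # On Ruby 2.6 this lookup raised NameError instead.
  SHOUTED = "hi".method(:shout).call
end

p Demo::SHOUTED
```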

Updates to Built-in Classes

Updates to the core classes of the Ruby language.

1. Introduction of Array#intersection

Array#intersection has been introduced to extract only the common elements of arrays. It is almost the same as the Array#& (it is slightly different as it can take intersections of three or more arrays).

ary1 = [1, 2, 3, 4, 5]
ary2 = [1, 3, 5, 7, 9]
p ary1.intersection(ary2) #=> [1, 3, 5]

Ruby 2.6 introduced Array#union as a counterpart to Array#|, but intersection was not introduced because it had not been requested. This time, a request came, so we implemented it. Ruby development is sometimes strictly demand-driven.
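The difference from Array#& is easiest to see with three arrays. A small sketch, assuming Ruby 2.7 or later; ary3 is an extra array added for illustration.

```ruby
ary1 = [1, 2, 3, 4, 5]
ary2 = [1, 3, 5, 7, 9]
ary3 = [3, 5, 8]

# One call with several arguments...
common = ary1.intersection(ary2, ary3)

# ...instead of chaining the binary operator.
chained = ary1 & ary2 & ary3

p common, chained
```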

(mame)

2. Performance improvement of Array#minmax, Range#minmax

  • Added Array#minmax, with a faster implementation than Enumerable#minmax. [Bug #15929]

When ary.minmax was executed on Ruby 2.6, Enumerable#minmax was called. By providing a separate Array#minmax, it can be executed faster, because it doesn’t have to call #each.

Theoretically, Enumerable and each are sufficient for this. But practically, they are not, so this kind of change is needed. As an interpreter developer, I’d really like to be able to make it faster without these kinds of tricks.

  • Added Range#minmax, with a faster implementation than Enumerable#minmax. It returns a maximum that now corresponds to Range#max. [Bug #15807]

Similarly, Range#minmax has been prepared separately. In addition, since the algorithm for calculating the maximum value uses Range#max instead of Enumerable#max, incompatibilities may appear.

(ko1)

3. Comparable#clamp corresponds to Range argument

-1.clamp(0..2) #=> 0
1.clamp(0..2) #=> 1
3.clamp(0..2) #=> 2

As you can see, we round up and down to fit in the range of 0..2. The same thing could be done with n.clamp(0, 2), but there was a request to write it in a Range, so it has been added. An exception is raised if an exclusive range is given (because the meaning of rounding down when it is exceeded cannot be defined).

0.clamp(0...2) #=> ArgumentError (cannot clamp with an exclusive range)

(mame)

4. Introduction of Complex#<=>

  • Added Complex#<=>. So 0 <=> 0i will not raise NoMethodError. [Bug #15857]

Complex is well known for not being able to define comparisons, but a comparison method has been introduced.

Complex(1, 0) <=> 3 #=> -1

The intent seems to be to allow comparison when the imaginary part is 0. Note that comparing a complex number whose imaginary part is not 0 returns nil.

Complex(0, 1) <=> 1 #=> nil

(mame)

5. Dir.glob and Dir.[] do not support NUL-separated patterns

  • Dir.glob and Dir.[] no longer allow NUL-separated glob pattern. Use Array instead. [Feature #14643]

A feature that nobody knew about has been quietly removed. For example, if there are two files named “foo” and “bar” in the current directory, and if Dir.glob is called with a pattern "f*\0b*":

Dir.glob("f*\0b*") #=> ["foo", "bar"]

So, the NUL character (\0) acted as an “OR” between patterns. This behaviour has been removed. If you specifically want to do this, you can use an array.

Dir.glob(["f*", "b*"]) #=> ["foo", "bar"]

(mame)

6. Added CESU-8 encoding

An encoding called CESU-8 has been added.

I don’t know it well, but it seems to be a deprecated encoding (UTR#26: Compatibility Encoding Scheme for UTF-16: 8-Bit (CESU-8)). You can ignore it unless it is being used in a system that you need to use.

(ko1)

7. Add Enumerable#filter_map

A method to do filter and map simultaneously has been added.

[1, 2, 3].filter_map {|x| x.odd? ? x.to_s : nil } #=> ["1", "3"]

If the result of the block is falsy (false or nil), the element is dropped. Whether to discard both or only nil has pros and cons, and has been discussed many times, but we decided to discard both due to the intuition of Matz.

It is roughly equivalent to filter followed by map.

[1, 2, 3].filter {|x| x.odd? }.map {|x| x.to_s } #=> ["1", "3"]

The following example looks like “map and then filter”.

# Collect first element of array, but discard false
[ary1, ary2, ary3].filter_map {|ary| ary.first }
[ary1, ary2, ary3].map {|ary| ary.first }.filter {|elem| elem }

Since we often want to use filter and map in combination, filter_map has been introduced as a dedicated method that eliminates the need to create intermediate arrays.

(mame)

8. Add Enumerable#tally

A convenience method for counting the number of elements has been introduced.

["A", "B", "C", "B", "A"].tally
#=> {"A" => 2, "B" => 2, "C" => 1}

It returns a hash with elements as keys and numbers as values. Have you ever implemented it yourself?

tally is a word that represents the action of counting by drawing a line for each item (see “Tally marks” on Wikipedia).

(mame)

9. Add Enumerator.produce

  • Added Enumerator.produce to generate Enumerator from any custom data-transformation. [Feature #14781]

A class method has been added that is useful for creating infinite sequences as an Enumerator.

naturals = Enumerator.produce(0) {|n| n + 1 } # [0, 1, 2, 3, ...]
p naturals.take(10) #=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The argument is an initial value, and an infinite sequence is created by applying the given block repeatedly to it.
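The initial value does not have to be a number. As a further sketch (assuming Ruby 2.7 or later), a pair can carry the state of the Fibonacci sequence; the names fib and first_fibs are illustrative.

```ruby
# Each step turns the pair [a, b] into [b, a + b].
fib = Enumerator.produce([0, 1]) { |(a, b)| [b, a + b] }

# Take eight pairs and keep the first number of each.
first_fibs = fib.take(8).map(&:first)
p first_fibs
```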

The request dates back to the Ruby 2.6 era, but the name was not easy to determine. In Haskell, the name is iterate, but in Ruby there is the word iterator, which is confusing; generate is a bit too common; recurrence is not well understood; how about from, … After all suggestions were considered, Matz chose produce.

(mame)

10. Add Enumerator::Lazy#eager

  • Added Enumerator::Lazy#eager that generates a non-lazy enumerator from a lazy enumerator. [Feature #15901]

This is a method to convert from Enumerator::Lazy to Enumerator. To understand it, you need to understand the background a bit more deeply.

Ruby has three types of data that represent a sequence of elements: Array, Enumerator, and Enumerator::Lazy.

Array is a data class in which all elements are arranged in sequential memory, and has the property of consuming memory for each element.
On the other hand, Enumerator and Enumerator::Lazy are classes whose internal representation is “calculation to yield the next element”. Even if they represent the same sequence, Enumerator and Enumerator::Lazy can avoid memory consumption. Note that, however, they have disadvantages such as being slow, and cannot be randomly accessed. This is the trade-off.

The difference between Enumerator and Enumerator::Lazy is quite subtle; Enumerator's methods often enumerate the elements immediately and actually return an Array, whereas Enumerator::Lazy's methods delay the enumeration of elements until the #force method is called.

For example, for a sequence from 0 to 10,000, you want to double each element and get the first five elements.

Enumerator#map immediately enumerates the elements and returns Array, so it creates an array with a length of 10,000 and consumes a lot of memory.
On the other hand, Enumerator::Lazy#map returns Enumerator::Lazy instead of Array because the enumeration of elements is delayed. By doing take(5) and doing force at the end, you can take out the first five elements without creating a 10,000-length array.
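The two behaviours described above can be sketched as follows; the variable names are illustrative.

```ruby
# Enumerator#map builds the full doubled array first, then take(5) trims it.
eager_result = (0..10000).each.map { |x| x * 2 }.take(5)

# Enumerator::Lazy#map defers the work; force evaluates only what take(5) needs.
lazy_result = (0..10000).lazy.map { |x| x * 2 }.take(5).force

p eager_result, lazy_result
```

Both expressions produce the same five elements; the difference is only in how much intermediate data is created.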

Now, consider a last_odd method that takes an Enumerator and returns the last odd number.

What if you want to pass Enumerator::Lazy to this method? If you pass it as is, the result of select is also Enumerator::Lazy, which will cause an error because last is not defined. It works if you pass it to an Array produced by force, but it wastes memory. It would be nice if we could convert from Enumerator::Lazy to Enumerator, but there was no API to convert it easily.

So, to support this case we have introduced Enumerator::Lazy#eager.

You can use this to convert Enumerator::Lazy to Enumerator, and then you can safely pass it to last_odd.
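A minimal sketch of this scenario, assuming a last_odd implemented with select and last as described above (the method body and the sample sequence are illustrative, not the original sample):

```ruby
# Returns the last odd number of the given sequence.
# Relies on select returning an Array, which is true for Enumerator
# but not for Enumerator::Lazy.
def last_odd(enum)
  enum.select {|n| n.odd? }.last
end

lazy = (0..10).lazy.map {|n| n * 3 }

# last_odd(lazy) would raise NoMethodError, because Lazy#select
# returns another Enumerator::Lazy, which has no #last.

# eager converts the lazy enumerator into a plain Enumerator
# without building an intermediate Array first.
p last_odd(lazy.eager) #=> 27
```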

Well, it’s complicated. I feel that only Array and Enumerator::Lazy were enough.

(mame)

11. Add Enumerator::Yielder#to_proc

  • Added Enumerator::Yielder#to_proc so that a Yielder object can be directly passed to another method as a block argument. [Feature #15618]

Enumerator::Yielder can be passed to each with &. I think there are few people who can understand that.

An Enumerator can be created in two ways. One is to call each without a block, like (0..10000).each shown in the last section, and the other is to use Enumerator.new {|y| ... }.

This y is an object of the Enumerator::Yielder class. With the addition of to_proc, y can be passed directly as a block, as in [1, 2, 3].each(&y).
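The two styles can be compared in a short sketch. The arrays and variable names are illustrative only.

```ruby
# Before 2.7: forward each element to the yielder by hand.
old_style = Enumerator.new do |y|
  [1, 2, 3].each {|x| y << x }
end

# With Yielder#to_proc, the yielder itself can be passed as the block.
new_style = Enumerator.new do |y|
  [1, 2, 3].each(&y)
end

p old_style.to_a #=> [1, 2, 3]
p new_style.to_a #=> [1, 2, 3]
```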

I wonder how many people are using Enumerator like this.

(mame)

12. Add Fiber#raise

  • Added Fiber#raise that behaves like Fiber#resume but raises an exception on the resumed fiber. [Feature #10344]

The Fiber#raise method has been added. It resumes the fiber and then raises an exception in the resumed context.

When Fiber is doing something like a worker, it can be used to raise an exception that interrupts processing.
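As a rough sketch of that use case, assuming a worker fiber that loops until it is interrupted (the StopWork class and the symbols are hypothetical names introduced for this illustration):

```ruby
# Hypothetical exception class used only for this sketch.
class StopWork < StandardError; end

worker = Fiber.new do
  begin
    loop { Fiber.yield :working }
  rescue StopWork
    :interrupted # becomes the fiber's final value
  end
end

p worker.resume          #=> :working
# Resumes the fiber and raises StopWork at the point of the last Fiber.yield.
p worker.raise(StopWork) #=> :interrupted
```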

However, it is hard to use correctly (it can create an exception flow that the fiber's creator did not intend), so it is better to avoid it if possible. I think you should request the halt explicitly with resume.

For example, the worker can treat a nil passed to resume as the signal to stop.

When nil comes, the loop terminates. It is clearer.
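A minimal sketch of this explicit-halt style, where nil passed to resume ends the loop (the job values are placeholders):

```ruby
worker = Fiber.new do
  processed = []
  # Fiber.yield returns whatever the next resume passes in;
  # nil ends the loop.
  while (job = Fiber.yield)
    processed << job
  end
  processed
end

worker.resume        # run up to the first Fiber.yield
worker.resume(:a)
worker.resume(:b)
p worker.resume(nil) #=> [:a, :b]
```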

(ko1)

13. Change a corner case of File.extname

  • File.extname now returns a dot string at a name ending with a dot on non-Windows platforms. [Bug #15267]

File.extname, which gets the extension from the file name string, has changed slightly.

File.extname("foo.") #=> "" in Ruby 2.6
#=> "." in Ruby 2.7

The reason is that when the result of basename and the result of extname are combined, they should round-trip.

f = "foo."
b = File.basename(f, ".*") #=> "foo"
e = File.extname(f) #=> "."
p b + e == f #=> true

Note that this still returns "" on Windows for profound reasons. I’m not sure, but on Windows it seems that filenames ending with a dot are invalid.

(mame)

14. FrozenError supports receiver

  • Added FrozenError#receiver to return the frozen object that modification was attempted on. To set this object when raising FrozenError in Ruby code, pass it as the second argument to FrozenError.new. [Feature #15751]

Attempting to update a frozen object will result in a FrozenError exception. FrozenError#receiver indicates which object you attempted to update.

begin
  ''.freeze << 1
rescue FrozenError => e
  p e.receiver
  #=> "", indicating that you tried to change a frozen string
end

You can now create a FrozenError with its receiver specified, as in FrozenError.new(receiver: obj). But, well, there is rarely a chance to make such a thing.
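A small sketch of constructing such an error by hand. The message text and the object are arbitrary examples.

```ruby
obj = "config".freeze

# The message is optional; receiver: records which object was frozen.
error = FrozenError.new("can't modify frozen object", receiver: obj)

p error.message              #=> "can't modify frozen object"
p error.receiver.equal?(obj) #=> true
```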

When I searched Gem code published in rubygems with FrozenError.new, I found 122 lines. It’s surprising.

(ko1)

15. GC.compact is added

  • Added GC.compact method for compacting the heap. This function compacts live objects in the heap so that fewer pages may be used, and the heap may be more CoW friendly. [Feature #15626]

One of the highlights of Ruby 2.7 is heap compaction. Although the method lives in the GC module, compaction is not performed on every GC run; it happens only when the GC.compact method is called explicitly.

See the ticket for details on this major contribution from Aaron Patterson.

GC.compact: What is compaction?

Here is a brief introduction to “heap compaction”. First, Ruby objects are stored on the heap, which is implemented as a collection of pages. A page is a sequence of slots that store objects.

When an object is created, we look for an empty slot and use it for the new object. When a GC occurs, unused objects are collected and their slots become free (empty) slots again. In other words, after GC, free slots and busy slots are sparsely interleaved.

Heap compaction in MRI means moving a live object from one page to another with a free slot. As a result, it is possible to change from “pages that have some free slots and some busy ones” to “pages that have busy slots” and “pages that have free slots”. That means eliminating fragmentation. Freeing “pages that have free slots” can be more memory efficient.

GC.compact: The feature introduced in 2.7

Copying GC is one GC algorithm which automatically performs compaction. I knew of its existence, of course, and I wanted to introduce it to Ruby. However, for various reasons (mainly performance), I thought it would be difficult to introduce a GC that performs compaction every time, such as copying GC. GC.compact has been realized brilliantly thanks to a change of idea: letting the programmer explicitly choose the timing of compaction.

Technically, there are some objects that cannot be “moved” in MRI (due to historical reasons). Therefore, the algorithm is to leave them as is and move only those that can be moved. It is called “mostly compaction algorithm”.

Many changes of MRI were introduced for GC.compact, so it may still have problems. When you use it, please be prepared for this possibility. Maybe, GC.compact is not something you call manually (the framework might call it). Please let me know if you find any problems.
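A hedged sketch of calling it manually. The allocation pattern below is only an assumed way to leave holes in the heap, and the respond_to? guard is a precaution added here in case a build does not provide compaction.

```ruby
# Allocate many objects, then drop half of them to leave free slots.
objects = Array.new(10_000) {|i| "string #{i}" }
objects = objects.each_slice(2).map(&:first)
GC.start

# Compaction runs only when requested explicitly.
GC.compact if GC.respond_to?(:compact)

# Live objects may have moved, but they are still reachable and intact.
p objects.first #=> "string 0"
p objects.size  #=> 5000
```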

(ko1)

16. IO#set_encoding_by_bom is added

  • Added IO#set_encoding_by_bom to check the BOM and set the external encoding. [Bug #15210]

Unicode data may have a BOM at the beginning. IO#set_encoding_by_bom has been added: if the IO starts with a BOM, it sets the external encoding accordingly and discards the BOM.

IO needs to be opened in binmode.

io = open("with_bom", 'rb')
p io.tell #=> 0
p io.external_encoding #=> #<Encoding:ASCII-8BIT>
p io.set_encoding_by_bom #=> #<Encoding:UTF-8>
p io.tell #=> 3 (discarded)
p io.external_encoding #=> #<Encoding:UTF-8>

(ko1)

17. Integer#[] supports Range

There is a method called Integer#[] that extracts the n-th bit as 0 or 1. It now also accepts a Range (or a start position and a length) and returns the corresponding run of bits.

Bitwise operations are rather fiddly, so this should be convenient when you need to do that kind of thing.
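A short sketch of the single-bit and multi-bit forms. The value 0b01001101 is just a sample bit pattern.

```ruby
n = 0b01001101

# Single bit, as before: bit 0 is the least significant bit.
p n[0] #=> 1
p n[1] #=> 0

# New in 2.7: extract several bits at once.
p n[2, 4]  #=> 3 (0b0011: four bits starting at bit 2)
p n[2..5]  #=> 3 (bits 2 through 5)
p n[2...6] #=> 3 (bits 2 up to, but not including, 6)

# The equivalent hand-written bitwise operation.
p (n >> 2) & 0b1111 #=> 3
```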

(mame)

18. Rich result of Method#inspect

The notation of Method#inspect has been enhanced. In particular, these two are added:

  • (1) Parameter information
  • (2) Information on defined location

def foo(a, b=1, *r, p1, k1: 1, rk:); end
p method(:foo)
#=> #<Method: main.foo(a, b=..., *r, p1, rk:, k1: ...) t.rb:2>

In this case, (1) is (a, b=..., *r, p1, rk:, k1: ...) and (2) is t.rb:2.

When you wonder “what is this method?”, you can inspect the method object. I hear you can get various information with $ on pry, though.

As an aside, at first, (1) and (2) were separated by @ in the meaning of “at” which represents a place. However, when copying the file name on the terminal, the space delimiter is easier because we can just double-click. Looks good.

(ko1)

19. Module#const_source_location is added

  • Added Module#const_source_location to retrieve the location where a constant is defined. [Feature #10771]

Module#const_source_location which returns the definition location of the constant has been added. The position is returned as an array of [file_name, line_number].

class C
  class D
  end
end

p Object.const_source_location('C')
# => ["t.rb", 1]
p Object.const_source_location('C::D')
# => ["t.rb", 2]

(ko1)

20. Module#autoload? Supports inherit option

  • Module#autoload? now takes an inherit optional argument, like as Module#const_defined?. [Feature #15777]

Module#autoload? now has an inherit option to indicate whether to see the autoload status of the class it inherits from. The default is true.

Sample from RDoc.
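The behavior can be sketched along the lines of the RDoc example. The class names and the file name const.rb are illustrative, and the file does not need to exist because the autoload is never triggered here.

```ruby
class A
  autoload :CONST, "const.rb"
end

class B < A
end

# inherit defaults to true, so the autoload registered in A is visible from B.
p B.autoload?(:CONST)        #=> "const.rb"

# With inherit = false, only B itself is checked.
p B.autoload?(:CONST, false) #=> nil
```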

(ko1)

21. Various frozen character strings

  • Module#name now always returns a frozen String. The returned String is always the same for a given Module. This change is experimental. [Feature #16150]
  • NilClass#to_s, TrueClass#to_s and FalseClass#to_s now always return a frozen String. The returned String is always the same for each of these values. This change is experimental. [Feature #16150]

Results such as Module#name, which returns the module name, and true.to_s are now unique frozen strings. In the past, each call allocated a new String object so that the caller could modify it freely.
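A quick way to observe the change is sketched below. The use of dup at the end is simply one way to get a modifiable copy.

```ruby
# The same frozen object is returned on every call.
p true.to_s.frozen?               #=> true
p nil.to_s.equal?(nil.to_s)       #=> true
p String.name.frozen?             #=> true
p String.name.equal?(String.name) #=> true

# Code that used to modify the result in place now needs a copy.
label = true.to_s.dup
label << "!"
p label #=> "true!"
```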

By the way, an experiment with Symbol#to_s being frozen was also tried, but it did not work and was reverted.

(ko1)

22. ObjectSpace::WeakMap#[]= can also hold symbols etc.

  • ObjectSpace::WeakMap#[]= now accepts special objects as either key or values. [Feature #16035]

You don’t need to read this section because it is about a class called ObjectSpace::WeakMap, which is not recommended for direct use. Don’t read.

WeakMap is an object like a hash, but when keys or values are collected by the GC, the contents disappear.

o = ObjectSpace::WeakMap.new
o["key"] = "value"
p o.size #=> 1
GC.start
p o.size #=> 0

Because it depends on GC, the result zero is not guaranteed. Note that under # frozen-string-literal: true, the entry will not disappear, because the string literals stay alive.

WeakMap internally sets finalizers on keys and values, so it was not possible to store objects that cannot have a finalizer, such as symbols and numbers.

But this time it was allowed. The reason is that this restriction was troublesome when using WeakMap as a cache, but since it is a class that is not recommended for use, I do not know how it will actually turn out.
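A minimal sketch of what is now accepted. The key names and values are arbitrary.

```ruby
cache = ObjectSpace::WeakMap.new

# Symbols and integers cannot have finalizers, so Ruby 2.6 rejected them.
cache[:answer] = 42
cache[1] = :one

p cache[:answer] #=> 42
p cache[1]       #=> :one
```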

(mame)

23. $LOAD_PATH.resolve_feature_path is added

Ruby 2.6 introduced a method called RubyVM.resolve_feature_path which identifies the file to be read when calling require. The method has been moved to the $LOAD_PATH singleton method.

RubyVM is a place to put things specific to the so-called MRI (Matz Ruby Implementation), but resolve_feature_path is likely to be used in other implementations too, so we decided to move it outside. However, since it is not a method used by many people, it is not good to put it in Kernel. After discussion, it ended up in a rather delicate position as a singleton method of $LOAD_PATH.
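A small sketch of the call. The feature name optparse is just an example of a pure-Ruby standard library, and the resolved path depends on the installation.

```ruby
# Returns the kind of file (:rb or :so) and the path that
# require would load, without loading it.
kind, path = $LOAD_PATH.resolve_feature_path("optparse")

p kind    #=> :rb
puts path # installation-dependent path ending in optparse.rb
```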

(mame)

24. Unicode version is updated

The supported Unicode version has been raised from 11 to 12.1.0.

For example, Unicode 12 contains small “ゐ” (U+1B150). It could not be displayed in my environment, though. This has a character property called insmallkanaextension, so you can match this with a regular expression.

p /\p{insmallkanaextension}/ =~ "\u{1b150}" #=> 0

When the characters are widespread in future, some people may gain the benefit.

(mame)

25. Symbol#start_with? and Symbol#end_with? are added

The title says all. Two methods in String have been added to Symbol.
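For illustration, a sketch with an arbitrary symbol:

```ruby
sym = :find_by_name

p sym.start_with?("find_") #=> true
p sym.end_with?("_name")   #=> true
p sym.start_with?("name")  #=> false

# Before 2.7 this required a detour through String.
p sym.to_s.start_with?("find_") #=> true
```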

String and Symbol, where are the lines drawn?

Where do you draw the line between String and Symbol? This is a historical issue. Symbol fundamentalists claim that they are completely different, while String extremists claim that they should behave the same. I feel like things are drifting toward the String extremists.

(ko1)

26. Time#ceil and Time#floor methods are added

Time objects can actually have a time with a higher precision than seconds (i.e., nanosecond).

p Time.now.nsec #=> 532872900

The Time#round method rounds time to seconds, but similarly, floor (round down) and ceil (round up) have been added.
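A sketch with a fixed timestamp so the results are reproducible. The epoch value is arbitrary, and the optional argument is the number of fractional digits to keep.

```ruby
# 2020-01-01 00:00:00 UTC plus 123,456,789 nanoseconds.
t = Time.at(1_577_836_800, 123_456_789, :nsec).utc

p t.nsec          #=> 123456789
p t.floor.nsec    #=> 0          (truncated to the second)
p t.ceil.to_i     #=> 1577836801 (rounded up to the next second)
p t.floor(3).nsec #=> 123000000  (truncated to milliseconds)
p t.ceil(3).nsec  #=> 124000000  (rounded up to milliseconds)
```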

(ko1)

27. Time#inspect is different from Time#to_s, and inspect outputs up to nanoseconds

  • Time#inspect is separated from Time#to_s and it shows its sub second. [Feature #15958]

Up to Ruby 2.6, both to_s and inspect dropped any information smaller than a second. So, when using p or inspect, different Time objects could be displayed identically.

So we separated Time#to_s and Time#inspect, and Time#inspect now shows nanoseconds (if any). It seems that Time#to_s was not changed to maintain compatibility.
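The difference can be seen in a short sketch. The timestamps are arbitrary fixed values.

```ruby
a = Time.at(0, 123_456_789, :nsec).utc
b = Time.at(0, 987_654_321, :nsec).utc

# to_s still drops the sub-second part, so a and b look identical.
puts a.to_s #=> 1970-01-01 00:00:00 UTC
puts b.to_s #=> 1970-01-01 00:00:00 UTC

# inspect (used by p) now shows it.
p a #=> 1970-01-01 00:00:00.123456789 UTC
p b #=> 1970-01-01 00:00:00.987654321 UTC
```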

(ko1)

28. UnboundMethod#bind_call is added

This is a feature for advanced Ruby users. Do not use it in normal programs.

Overriding a method by inheriting from a class causes the new method to be called.
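As a minimal sketch, with Foo, Bar and obj as illustrative names chosen to match the one-line snippets in this section:

```ruby
class Foo
  def foo
    "foo"
  end
end

class Bar < Foo
  # Overrides Foo#foo.
  def foo
    "bar"
  end
end

obj = Bar.new
p obj.foo #=> "bar"
```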

It is natural. However, in rare cases that border on black magic, people want to call the method as it was before being overridden. For this, some people use an evil idiom that combines UnboundMethod#bind and Method#call.

p Foo.instance_method(:foo).bind(obj).call #=> "foo"

It takes out an instance method object (UnboundMethod) of Foo, binds it to the target object, and calls it, so that the method before being overridden can be called. However, since bind and call are quite heavy operations, bind_call was introduced because it would be a little faster if you put them together.

p Foo.instance_method(:foo).bind_call(obj) #=> "foo"

You need this evil idiom only if you can’t predict what objects will come, like pp or some runtime monitoring libraries. Again, never use it in normal programs.

(mame)

29. Add filters by category of warning (add Warning.[], Warning.[]=)

  • Added Warning.[] and Warning.[]= to manage emit / suppress of some categories of warnings. [Feature #16345]

By setting Warning[category] = true or false, warnings belonging to category can be enabled or disabled.

Many compatibility warnings have been introduced in Ruby 2.7, and we discussed how to eliminate the warnings at once. However, dismissing all warnings also suppresses warnings that may be of interest.

Therefore, Warning.[] and Warning.[]= have been added because we want to control only the warnings for certain categories. Currently there are only two categories: :deprecated and :experimental. I guess it will be organized in the future (But who does it?).

Warning[:deprecated] = false

def foo
  proc # normally emits a deprecation warning; suppressed here
end
foo {}

(ko1)

Update standard attached library

The library has also been updated in various ways. There are some in NEWS, but we will introduce only those that are interesting to us.

1. CGI.escapeHTML is 2-5 times faster in some cases

CGI.escapeHTML seems to be faster.

(ko1)

2. Renewal of IRB

irb has been redesigned to include the following features:

  • Multiple-line edit
  • Auto indentation
  • Method-name completion
  • Document (rdoc) view
  • Syntax highlighting

It is hard to explain in this text, so please try it out yourself. Install Ruby 2.7 now.

If you can’t do it right away, you can run gem install irb because it also works on 2.6, or see the video of the Release Announcement.

The most popular Ruby interactive console is pry, but irb has now overtaken it in some respects. I hear that pry is also considering multi-line editing support. I hope the competition makes both of them more convenient.

IRB Renewal: Amazing and Important Points

A terminal emulator is a super legacy that has been improved since the days of the typewriter. It is surprisingly difficult. Coloring is not so difficult, but when it comes to multi-line editing, auto-indentation, and completion, it’s a kind of a text editor. It works on Windows consoles as well as Linux. Works on JRuby. Libraries such as ncurses are not used due to various restrictions, so it is implemented with its own code, without libraries, and is as complicated as “screen” and “tmux” commands. It is amazing.

However, it has not been widely used yet; this is the first time that the new version is widely used. As mentioned earlier, the terminal is a super legacy, and there are differences in behavior among terminals. Sakura Itoyanagi who achieved this improvement seems to have started development around 2018, so he has not been able to experience various environments and usages, and it can not be said that it is mature. So, please try it out and give us feedback if you find weird behavior. If you are having problems, you may want to start with the option irb --legacy and it will work with the previous version of IRB.

(mame)

3. “Did you mean” is displayed for mistyped options in OptionParser

Ruby has had a built-in did_you_mean gem for some time, and suggests a correction when there’s a typo in a method name or constant name. I incorporated it into OptionParser.

$ ruby test.rb --hepl
Traceback (most recent call last):
test.rb:6:in `<main>': invalid option: --hepl (OptionParser::InvalidOption)
Did you mean? help

--help was mistakenly typed as --hepl, and it gives a suggestion like “Did you mean? help”.

test.rb just uses OptionParser normally.
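The original test.rb is not reproduced here; the following is a minimal sketch of an ordinary OptionParser script, with a hypothetical --verbose option. It parses an explicit argument array and rescues the error so that the message can be inspected, whereas a real script would usually call parse! on ARGV.

```ruby
require 'optparse'

parser = OptionParser.new do |opts|
  # Hypothetical option; --help is provided by OptionParser itself.
  opts.on("-v", "--verbose", "Run verbosely") { }
end

message = nil
begin
  parser.parse(["--hepl"]) # simulate the mistyped command line
rescue OptionParser::InvalidOption => e
  message = e.message
end

# Starts with "invalid option: --hepl"; the suggestion is added
# when did_you_mean is available.
puts message
```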

(mame)

Incompatibilities

1. Some libraries are no longer bundled

The following libraries are no longer bundled gems. Install corresponding gems to use these features.

  • CMath (cmath gem)
  • Scanf (scanf gem)
  • Shell (shell gem)
  • Synchronizer (sync gem)
  • ThreadsWait (thwait gem)
  • E2MM (e2mmap gem)

These libraries are no longer bundled gems (that is, gems that are automatically installed when you install Ruby). If you need them, add them to your Gemfile or install them explicitly.

(ko1)

2. The format of Proc#to_s has changed

Proc#to_s (an alias to Proc#inspect) returns a string containing the file name and line number (where the Proc was generated). In 2.7, the @ in ...@file.rb:123 is replaced with a space, giving ... file.rb:123.

In other words, it looks like this.

p proc {}.to_s
# =>
# Ruby 2.6
# "#<Proc:0x...@t.rb:1>"
# Ruby 2.7
# "#<Proc:0x0000024cc385c3e0 t.rb:1>"

This change brings it in line with Method#to_s.

It’s not a big difference. But I found a test failure in code that uses a regular expression match to extract the file name from the result. So I listed it under incompatibilities just in case.

(ko1)

Library incompatibilities

1. Gem conversion

Promote stdlib to default gems. The following default gems were published at rubygems.org

  • benchmark
  • delegate
  • getoptlong
  • net-pop
  • net-smtp
  • open3
  • pstore
  • singleton

The following default gems were only promoted in ruby-core and are not yet published at rubygems.org. (looks like coming soon)

  • monitor
  • observer
  • timeout
  • tracer

Additionally, the did_you_mean gem has been promoted from a bundled gem to a default gem.

(ko1)

2. Pathname()

  • Kernel#Pathname when called with a Pathname argument now returns the argument instead of creating a new Pathname. This is more similar to other Kernel methods, but can break code that modifies the return value and expects the argument not to be modified.

Pathname(obj) returns obj itself if obj is Pathname, instead of returning a new Pathname.

p1 = Pathname('/foo/bar')
p2 = Pathname(p1)
p p1.equal?(p2)
#=> Ruby 2.6: false
#=> Ruby 2.7: true

(ko1)

3. profile.rb, Profiler__

  • Removed from standard library. No one updated it from Ruby 2.0.0.

Removed from standard library. Nobody is maintaining it. It will be available as a gem, but not yet.

(ko1)

Changes to command line options

1. Added -W:(no-)category option

The feature of Warning[category] = true or false can now be specified on the command line.

  • Enable warnings: -W:category
  • Disable warnings: -W:no-category

Like Warning[category], there are currently two categories, deprecated and experimental.

Usage example:

# deprecation warning
$ ruby -e '$; = ""'
-e:1: warning: `$;' is deprecated

# suppress the deprecation warning
$ ruby -W:no-deprecated -e '$; = //'

# works with RUBYOPT environment variable
$ RUBYOPT=-W:no-deprecated ruby -e '$; = //'

# experimental feature warning
$ ruby -e '0 in a'
-e:1: warning: Pattern matching is experimental, and the behavior may change in future versions of Ruby!

# suppress experimental feature warning
$ ruby -W:no-experimental -e '0 in a'

# suppress both by using RUBYOPT
$ RUBYOPT='-W:no-deprecated -W:no-experimental' ruby -e '($; = "") in a'

(ko1)

C API changes

  • Many *_kw functions have been added for setting whether the final argument being passed should be treated as keywords. You may need to switch to these functions to avoid keyword argument separation warnings, and to ensure correct behavior in Ruby 3.

Functions whose names end in _kw have been added to support keyword argument separation in Ruby 3.

  • The : character in rb_scan_args format string is now treated as keyword arguments. Passing a positional hash instead of keyword arguments will emit a deprecation warning.

The : in the rb_scan_args() format string now means keyword arguments rather than a trailing optional hash.

When receiving a function pointer, you can no longer use the ANYARGS feature to indicate that you don’t know its arguments. Let’s write the type of the function pointer properly.

(ko1)

Performance Improvements

Performance improvements made for Ruby 2.7.

1. Fiber and thread implementation improvements

  • Allow selecting different coroutine implementation by using --with-coroutine=, e.g.
./configure --with-coroutine=ucontext
./configure --with-coroutine=copy

In configure, you can select the Fiber implementation. Well, you don’t need to worry (the default is fine).

  • Replace previous stack cache with fiber pool cache. The fiber pool allocates many stacks in a single memory region. Stack allocation becomes O(log N) and fiber creation is amortized O(1). Around 10x performance improvement was measured in micro-benchmarks. https://github.com/ruby/ruby/pull/2224

Various improvements have been made to the stack allocation strategy for Fiber, making Fiber creation about 10 times faster. Hooray!

It depends on the environment, but the strategy is to reserve a large area with mmap and divide it and use it.

  • VM stack memory allocation is now combined with native thread stack, improving thread allocation performance and reducing allocation related failures. ~ 10x performance improvement was measured in micro-benchmarks.

In a similar way, getting the VM stack from the machine stack with alloca has greatly reduced the VM stack allocation time. This is also about 10 times faster.

(ko1)

2. Use of realpath(3)

  • File.realpath now uses realpath(3) on many platforms, which can significantly improve performance.

It seems that the performance has been improved by using realpath(3) if it can be used (I don’t know well).

(ko1)

3. Improvement of Hash data structure

A small hash (specifically 1 to 8 elements) requires 128 bytes instead of 192 bytes of memory (64-bit environment).

This is achieved by changing the storage of the hash value for each key-value pair to only one byte. I hope it is effective.

(ko1)

4. Speed up by reimplementing Monitor in C

The Monitor class (MonitorMixin module) was written in Ruby. But because of the handle_interrupt method, the overhead was not negligible. In particular, in Ruby 2.6, some code was added to make the implementation correct, which made it much slower.

So, by rewriting it in C, it became faster than before.

(ko1)

5. Improved inline method cache

An inline method cache is a cache that is placed at a method call site and saves the result of the previous method search. In 2.6, if the receiver class was different from the previous call, the cache was not used, even if the call invoked the same method. Now, Ruby 2.7 uses the cache in that case too, by saving the different classes as cache keys.

In an experiment based on a Rails app (the discourse benchmark), the inline method cache hit rate increased from 89% to 94%. This matters for speed because method calls make up a large share of the work in Ruby.

(ko1)

6. JIT improvement

  • JIT-ed code is recompiled to less-optimized code when an optimization assumption is invalidated.

There are some prerequisites for advanced optimization, but when those conditions are not satisfied, it recompiles to a version without the optimization.

  • Method inlining is performed when a method is considered as pure. This optimization is still experimental and many methods are NOT considered as pure yet.

An experimental feature to inline “pure” methods has been implemented. I don’t explain the definition of “pure” because it is troublesome, but in most cases a method is not pure.

  • Default value of --jit-max-cache is changed from 1,000 to 100

The default value of the parameter --jit-max-cache has been reduced from 1,000 to 100. This parameter controls how many JIT-compiled methods are kept.

  • Default value of --jit-min-calls is changed from 5 to 10,000

The default value of the parameter --jit-min-calls has been increased from 5 to 10,000. This parameter is the threshold for how many times it is called before JIT compilation.

(ko1)

7. Reduce the size of the compiled instruction sequence

  • RubyVM::InstructionSequence#to_binary method generate compiled binary. The binary size is reduced. [Feature #16163]

With the method RubyVM::InstructionSequence#to_binary, you can convert the sequence of instructions executed by the VM, so-called bytecode, to binary and output it. These binaries are used in Bootsnap, etc., and are used to speed up the startup of Ruby applications.
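A minimal round-trip sketch using MRI's RubyVM::InstructionSequence API. The compiled source string is an arbitrary example, and tools such as Bootsnap wrap this in their own caching logic.

```ruby
# Compile a snippet to a VM instruction sequence.
iseq = RubyVM::InstructionSequence.compile("1 + 2")

# Serialize it to a binary String (this is the output that got smaller in 2.7).
binary = iseq.to_binary
puts binary.bytesize # size depends on the Ruby version

# Load it back and run it without re-parsing the source.
loaded = RubyVM::InstructionSequence.load_from_binary(binary)
p loaded.eval #=> 3
```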

This output was in a very wasteful format, so we asked Nagayama-san, who came to be an intern at Cookpad to review the specifications, make it slim, and reduce the output size. Please refer to Improve Binary Output of Ruby Intermediate Expression-Cookpad Developer Blog for details (in Japanese).

(ko1)

Other

Other notable changes in Ruby 2.7.

1. IA64 support discontinued

  • Support for IA64 architecture has been removed. Hardware for testing was difficult to find, native fiber code is difficult to implement, and it added non-trivial complexity to the interpreter. [Feature #15894]

It seems that the production of Itanium has ended. So, we decided to stop its support. The code base had quite special processing.

(ko1)

2. Use of C99

MRI's implementation can now be written in C99 instead of C89 (with some restrictions). We can write // comments! But C99 was published 20 years ago.

(ko1)

3. Git

  • Ruby’s upstream repository is changed from Subversion to Git.

The source code is now managed by Git. Instead of managing everything on GitHub, there is a separate Git repository, and the GitHub repository is synced up nicely.

  • RUBY_REVISION class is changed from Integer to String.

With the conversion to Git, RUBY_REVISION, which was a Subversion revision (numerical value), is now a Git commit hash.

p RUBY_REVISION
# => "fbe229906b6e55c2e7bb1e68452d5c225503b9ca"

  • RUBY_DESCRIPTION includes Git revision instead of Subversion’s one.

Similarly, RUBY_DESCRIPTION, which included the Subversion revision, now includes a commit hash.

p RUBY_DESCRIPTION
# => "ruby 2.7.0dev (2019-12-17T04:15:38Z master fbe229906b) [x64-mswin64_140]"
# Since it is a development version, the release version is probably different.

(ko1)

4. Support for writing embedded classes in Ruby

  • Support built-in methods in Ruby with __builtin_ syntax. [Feature #16254] Some methods are defined in *.rb (such as trace_point.rb). For example, it is easy to define a method which accepts keyword arguments.

This is what I (ko1) talked at RubyKaigi 2019. See in detail “RubyKaigi 2019: Write a Ruby interpreter in Ruby for Ruby 3-Cookpad developer blog” (in Japanese) or presentation slides (Write a Ruby interpreter in Ruby for Ruby 3 — RubyKaigi 2019).

To put it simply, built-in classes such as Array had been written only in C, but now we can write them by easily combining Ruby and C.

Currently, it is only used by some classes (for example, the definition of TracePoint class is written in a file called trace_point.rb), but I would like to rewrite other classes (:contribution_chance:).

I will summarize the details of this mechanism in another article.

The premise of incorporating this mechanism was the size reduction of the compiled instruction sequence described earlier.

(ko1)

Conclusion

Ruby 2.7 also has various changes. Please give it a try.

Next year Ruby 3 is finally scheduled to be released. We are looking forward to it (we hope it gets out properly).

Happy hacking with Ruby 2.7!

