Splat the List: A Proposal to support Variadics in PHP’s list language construct

source link: https://markbakeruk.net/2022/06/26/splat-the-list-a-proposal-to-support-variadics-in-phps-list-language-construct/

I was thinking recently about how useful PHP’s splat operator (...), also known as variadics or array packing/unpacking, can be. I’ve written about variadics before, here and here. Variadics are incredibly powerful and useful; but there are still some limitations with them that I find frustrating. Although, to be fair, the limitation that I’ve encountered here is probably more related to list() than it is to variadics. I’ve also written recently about the list() language construct, and some of the ways it can be used.

A common situation that I’ve encountered a few times now is when I receive an array of data values, need to extract the first few elements from that array into individual variables, and retain the rest of the array “as is”. Put simply, if my initial array is as basic as $initialArray = [1, 2, 3, 4, 5], I want to be able to extract $firstValue = 1, $secondValue = 2, and $remainingValues = [3, 4, 5].


$initialArray = [1, 2, 3, 4, 5];
// do something here
// and the result that I'm looking for is:
//    $firstValue = 1;
//    $secondValue = 2;
//    $remainingValues = [3, 4, 5];

So what are my options for doing something like this in PHP?

The simplest (and most efficient) approach would be something like:


$data = [1, 2, 3, 4, 5];

$firstValue = $data[0];
$secondValue = $data[1];
$remainingValues = array_slice($data, 2);
// Or, if I want to re-index the remaining values of this enumerated array
$remainingValues = array_slice(array_values($data), 2);

var_dump($firstValue, $secondValue, $remainingValues);

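Of course, this only works with an enumerated array whose keys run 0, 1, 2 and so on, because $data[0] and $data[1] look elements up by key rather than by position. array_slice() works by position and already re-indexes integer keys by default, so the array_values() call is only a safeguard that guarantees a clean 0-based enumeration for $remainingValues if I knew that I would need to extract further values later. If the initial array was associative, or there were gaps in the enumeration, I’d need to be a little bit cleverer with my approach: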


$firstValue = array_shift($data);
$secondValue = array_shift($data);
$remainingValues = $data;
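
As a quick illustration of that re-indexing behaviour, here is a minimal sketch using made-up sample data (the keys and values are hypothetical, chosen only to show integer keys with gaps alongside a string key):

```php
<?php
// Hypothetical sample data: integer keys with gaps, plus one string key.
$data = [3 => 'a', 7 => 'b', 'x' => 'c', 9 => 'd'];

$firstValue = array_shift($data);
$secondValue = array_shift($data);
$remainingValues = $data;

// array_shift() renumbers the integer keys from 0 on every call,
// while string keys are left untouched.
var_dump($firstValue, $secondValue, $remainingValues);
```

The values come out in positional order whatever the keys were, but the original integer keys of the remaining elements are not retained.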

This approach will work with any array, but array_shift() is inefficient (especially with larger arrays) because of its need to re-index the array. Both these approaches also become longer to code if I need to extract more than 2 elements from that array into individual variables. So if we want (for example) the first four values, then I might take advantage of list() (or short list syntax), and the fact that it works with keys rather than with position in the array, which means that it will work with both associative and enumerated arrays:


$data = ['A' => 1, 'B' => 2, 'C' => 3, 'D' => 4, 'E' => 5, 'F' => 6, 'G' => 7, 'H' => 8];

['A' => $firstValue, 'B' => $secondValue, 'C' => $thirdValue, 'D' => $fourthValue] = array_slice($data, 0, 4);
$remainingValues = array_slice($data, 4);

var_dump($firstValue, $secondValue, $thirdValue, $fourthValue, $remainingValues);

Or if I don’t want to specify the keys in my list() statement, but know that the array is ordered correctly, then I can use:


[$firstValue, $secondValue, $thirdValue, $fourthValue] = array_values(array_slice($data, 0, 4));
$remainingValues = array_slice($data, 4);

Using list() allows me to keep the number of lines of code short, no matter how many values I want to extract from the array. list() is a useful and powerful feature of PHP; but although it could handle variations like extracting only the first and third elements, the second element would then be lost from $remainingValues with this approach, because array_slice() isn’t as selective: it can only cut the array at a position.
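
To make that caveat concrete, here is a small sketch with hypothetical sample data, extracting only the first and third elements:

```php
<?php
// Hypothetical sample data for illustration.
$data = ['A', 'B', 'C', 'D', 'E'];

// Pick the first and third elements by key from the leading slice.
[0 => $firstValue, 2 => $thirdValue] = array_slice($data, 0, 3);

// The rest can only start at a position, so 'B' ends up nowhere.
$remainingValues = array_slice($data, 3);

var_dump($firstValue, $thirdValue, $remainingValues);
```

The skipped element 'B' is neither assigned to a variable nor included in $remainingValues.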

Despite that caveat, and as long as we remember the array_values() call, this approach will work with both associative and enumerated arrays, and I can even take advantage of the preserve_keys argument for the call to array_slice() if necessary; but what happens if our data is an iterable, but not an array?


function generate(int $count) {
    for ($i = 1; $i <= $count; ++$i) {
        yield $i;
    }
}

$data = generate(5);

None of the above approaches will work then, because they all use array-specific functions. Of course, I can always cast the iterable to an array, but I sometimes wish that there was a simple, native-PHP approach that I could use.
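
For reference, the conversion route might look something like this; a minimal sketch that reuses the generate() function from above and passes false to iterator_to_array() so that the generator’s keys are discarded:

```php
<?php
function generate(int $count): Generator {
    for ($i = 1; $i <= $count; ++$i) {
        yield $i;
    }
}

// Convert the Traversable to a plain, 0-indexed array first
// (second argument false = do not preserve the iterator's keys).
$data = iterator_to_array(generate(5), false);

[$firstValue, $secondValue] = $data;
$remainingValues = array_slice($data, 2);

var_dump($firstValue, $secondValue, $remainingValues);
```

It works, but the whole iterable has to be materialised in memory before anything can be extracted from it.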

That’s when I started thinking about using list() with variadics, but unfortunately there are two stumbling blocks in the way:

  • Fatal error: Uncaught Error: Cannot use object of type Generator as array
  • Fatal error: Spread operator is not supported in assignments

We can’t use list() with anything but an array, and list() doesn’t support variadics. list() isn’t a function despite its appearance, it’s a language construct, and short list syntax with keys makes that a lot clearer. But while it’s understandable (albeit somewhat annoying) that PHP’s array_* functions should only work with arrays – the clue is in the function name – I don’t see any reason why list() should be limited to working only with arrays.
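
The first of those errors can be reproduced with a few lines. This is only a sketch: it catches the Error so that the message can be inspected rather than ending the script.

```php
<?php
function generate(int $count): Generator {
    for ($i = 1; $i <= $count; ++$i) {
        yield $i;
    }
}

$message = null;
try {
    // list() reads its source with array access, which a Generator doesn't offer.
    [$firstValue, $secondValue] = generate(5);
} catch (Error $e) {
    $message = $e->getMessage();
}

var_dump($message);
```

The second error is a different kind of failure: as far as I can tell it is raised when the script is compiled rather than when it runs, so it can’t be caught in the same way.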


One approach that works with both arrays and other iterables is slightly more complex, using an arrow function to handle the separation into individual variables, and with this approach we can also take advantage of the splat operator for the “remaining” values:


$splitData = fn($firstValue, $secondValue, ...$remainingValues) => [$firstValue, $secondValue, $remainingValues];

[$firstValue, $secondValue, $remainingValues] = $splitData(...$data);

And we can even take advantage of “named arguments” from PHP 8.0 onwards to handle associative keys from the $data, and to select individual values from any named element in the array. Unfortunately that named arguments option isn’t available for enumerated arrays, because argument names can’t be numeric: the values are simply assigned first value to first argument, second value to second argument, and so on, so we can only extract elements in key order without skipping any.


function generate(int $count) {
    for ($i = 1; $i <= $count; ++$i) {
        yield chr(64+$i) => $i; // Yields A => 1, B => 2, C => 3, etc
    }
}

$data = generate(5);

$splitData = fn($A, $B, ...$remainingValues) => [$A, $B, $remainingValues];

[$firstValue, $secondValue, $remainingValues] = $splitData(...$data);

var_dump($firstValue, $secondValue, $remainingValues);

This will work with enumerated arrays, and with other iterables/Traversables; but sadly it won’t work with associative keys unless we explicitly match the argument names in the arrow function with the keys from the iterable. If the argument names and keys don’t match up, then in PHP 8, the error is particularly obscure, even downright misleading:


Fatal error: Uncaught ArgumentCountError: Too few arguments to function {closure}(), 0 passed and exactly 2 expected

At least PHP 7 provided a more meaningful error message if we try using an associative array with this technique:


Fatal error: Uncaught Error: Cannot unpack array with string keys

So it might overcome some of the basic issues with list() but it still has severe limitations, including some limitations that a basic use of list() can handle without problem.

And while the code itself is simple, and I could even build the arrow function dynamically as an anonymous function to handle a different number of extractions, there’s a high cognitive complexity overhead here (although it is unlikely to be picked up as complex by most static analysis tools). It’s more difficult for somebody seeing this code for the first time to understand what it’s doing, especially with the use of named arguments for associative keys.

But all of these methods have issues if we don’t want to extract the earliest entries, for example if we wanted to extract the first, third and fifth entries, but leave the second and fourth entries as part of $remainingValues.


As I was thinking about this, I decided that the neatest solution would require a few changes to PHP itself, to allow the use of list() for iterables as well as arrays, and to permit the use of the splat operator inside list() itself:


[$firstValue, $secondValue, ...$remainingValues] = $data;

This syntax could achieve everything I needed to address my problem. It should work with enumerated or associative arrays (or with other iterables), and the cognitive complexity is low: what the code is doing is easy to understand (even with keyed list entries). It should also be able to extract any entries from the array, even leaving “gaps” that will remain part of $remainingValues.


$data = range('A', 'H');

[0 => $firstValue, 2 => $thirdValue, 4 => $fifthValue, ...$remainingValues] = $data;
// or
[$firstValue, , $thirdValue, , $fifthValue, ...$remainingValues] = $data;

// and the result that I'm looking for is:
//    $firstValue = 'A';
//    $thirdValue = 'C';
//    $fifthValue = 'E';
//    $remainingValues = [1 => 'B', 3 => 'D', 5 => 'F', 6 => 'G', 7 => 'H'];

This is the first proposed change to core PHP: support for an optional variadic argument in list(). The variadic $remainingValues argument should be an array that includes all entries from the $data array that aren’t explicitly required for the other arguments of the list(), and these values should retain their keys (whether associative or enumerated), and their order in the original $data array. Of course, there may be no additional elements in the $data array, in which case $remainingValues should be an empty array.

The second proposed PHP core change is for list() to work with any iterable, not only with arrays but with Traversables as well. This isn’t a critical requirement, because there is always the option of using the iterator_to_array() function to convert that Traversable to an array before feeding it to list(); but it would certainly be cleaner if it was handled in PHP core.
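
Until something like this exists in core, the proposed behaviour can be approximated in userland. The following is only a sketch: extractWithRest() is a hypothetical helper name of my own, not anything in PHP, and returning null for a missing key is an assumption, since the proposal above doesn’t cover that case.

```php
<?php
/**
 * Hypothetical userland helper (not part of PHP) approximating the proposal:
 * returns [values for the requested keys, all remaining entries].
 */
function extractWithRest(iterable $data, array $keys): array
{
    // Traversables are converted first, preserving their keys.
    $array = is_array($data) ? $data : iterator_to_array($data);

    $extracted = [];
    foreach ($keys as $key) {
        // Assumption: a missing key simply yields null.
        $extracted[$key] = $array[$key] ?? null;
    }

    // Everything not explicitly requested keeps its original key and order.
    $remaining = array_diff_key($array, array_flip($keys));

    return [$extracted, $remaining];
}

$data = range('A', 'H');

[[0 => $firstValue, 2 => $thirdValue, 4 => $fifthValue], $remainingValues]
    = extractWithRest($data, [0, 2, 4]);

var_dump($firstValue, $thirdValue, $fifthValue, $remainingValues);
```

It is far clumsier than the proposed syntax, because the keys have to be listed twice, and a Traversable that yields duplicate keys would lose entries during the conversion to an array.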


I haven’t yet looked closely at the list() code in the PHP source to see exactly how it works internally, but I believe that these proposed changes are feasible. I might need to seek help providing an implementation, though, because it’s a construct of the PHP language rather than a PHP function, and I suspect that it would be beyond my own skill level.

And in practical terms, it’s too late to put forward any new proposals for PHP 8.2: there isn’t time for the necessary discussion and voting on proposals like these. Once 8.2 has reached a GA (General Availability) release, I shall be submitting RFCs (Requests for Comment) to internals targeted for the PHP 8.3 release.

I also have two other proposals for changes to list(), on type-casting and on handling missing keys; but I’ll discuss the purpose and use cases for those in separate blog posts.
