25

PHP: rfc:stable_sorting

 3 years ago
source link: https://wiki.php.net/rfc/stable_sorting
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Introduction

Sorting functions in PHP are currently unstable, which means that the order of “equal” elements is not guaranteed. This RFC proposes to make all sorts in PHP stable.

If multiple elements in the input array compare equal, they will always be sorted adjacently. However, if the sort is unstable, their relative order is not guaranteed, and will appear to be random. A stable sort guarantees that equal elements will retain the order they had in the original array.

Stable sorts are useful primarily when sorting complex data only by some part of that data. Consider this example:

usort($users, function($user1, $user2) {
    return $user1->age <=> $user2->age;
});

This code sorts users by age. Currently, the order of users in one age bracket will be arbitrary. With a stable sort, the original order of the objects will be retained. For example, if $users was already sorted by name, then $users will now be sorted by age first and name second. Of course, in this case it would be possible to explicitly sort by two criteria:

usort($users, function($user1, $user2) {
    return $user1->age <=> $user2->age
        ?: $user1->name <=> $user2->name;
});

However, this is not always possible, because the criterion by which the data was originally sorted is not explicitly stored. A recent case I ran into was a list of git commits with metadata, which were stored in push order (but the push order was not explicitly stored on each commit).

Apart from user-supplied comparison functions, another case where stable sorting is often desirable is the asort() function, which sorts by value, but preserves keys.

$array = [
    'c' => 1,
    'd' => 1,
    'a' => 0,
    'b' => 0,
];
asort($array);
 
// With stable sorting, the result is always:
['a' => 0, 'b' => 0, 'c' => 1, 'd' => 1]
 
// With unstable sorting, the following results are also possible:
['b' => 0, 'a' => 0, 'c' => 1, 'd' => 1]
['a' => 0, 'b' => 0, 'd' => 1, 'c' => 1]
['b' => 0, 'a' => 0, 'd' => 1, 'c' => 1]

It is possible to emulate a stable sort on top of an unstable one by explicitly storing the original order of the elements and using it as a fallback comparison criterion. For example, a stable sort implementation could look like this:

function stable_usort(array &$array, callable $compare) {
    $arrayAndPos = [];
    $pos = 0;
    foreach ($array as $value) {
        $arrayAndPos[] = [$value, $pos++];
    }
    usort($arrayAndPos, function($a, $b) use($compare) {
        return $compare($a[0], $b[0]) ?: $a[1] <=> $b[1];
    });
    $array = [];
    foreach ($arrayAndPos as $elem) {
        $array[] = $elem[0];
    }
}

While this approach works, it is also highly inefficient. The additional indirection makes sorting much slower, and will also skyrocket memory usage during sorting.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK