PHP 8.2: str_split function returns empty arrays for empty strings

Version: 8.2
Type: Change

The str_split function splits a given string into an array, with each element containing a given number of bytes (the last element may be shorter). The Mbstring extension provides a counterpart named mb_str_split, which splits a string into chunks of a set number of characters, correctly accounting for multi-byte characters.
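
The difference between splitting on bytes and splitting on characters can be illustrated with a minimal sketch. It assumes UTF-8 text, and the mb_str_split call is guarded because the Mbstring extension may not be installed; the sample strings are arbitrary.

```php
<?php
// "h\u{00E9}llo" is "héllo"; the escape keeps the example independent of file encoding.
$word = "h\u{00E9}llo";

// str_split counts bytes: "é" occupies two bytes in UTF-8, so it is cut in half.
$bytes = str_split($word);
echo count($bytes), "\n"; // 6

// mb_str_split counts characters (Mbstring extension, PHP 7.4 and later).
if (function_exists('mb_str_split')) {
    $chars = mb_str_split($word, 1, 'UTF-8');
    echo count($chars), "\n"; // 5
}

// A larger $length groups that many bytes into each element.
$chunks = str_split('abcdef', 4); // ['abcd', 'ef']
print_r($chunks);
```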

Prior to PHP 8.2, the str_split function incorrectly returned an array containing a single empty string ([""]) when given an empty string:

str_split('') === [""];

This behavior was undocumented and unexpected, because an empty string contains no bytes to split.

PHP 8.2 fixes this, and str_split now correctly returns an empty array ([]) for an empty string. This can be a breaking change for applications that relied on the erroneous behavior of older PHP versions.

Since PHP 8.2, calling str_split on an empty string returns an empty array:

str_split('') === [];

$length, the second parameter of the str_split function, does not affect the return value for empty strings.
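
As a quick check of this point, the following sketch calls str_split on an empty string with several chunk lengths and compares the results. The specific lengths are arbitrary, and the comparison is written so that it holds on both older and newer PHP versions.

```php
<?php
$results = [];
foreach ([1, 2, 100] as $length) {
    $results[$length] = str_split('', $length);
}

// All three entries are identical: [] on PHP 8.2 and later, [''] on older versions.
$same = $results[1] === $results[2] && $results[2] === $results[100];
var_dump($same); // bool(true)
```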

mb_str_split from the Mbstring extension already behaves the way str_split does in PHP 8.2, returning an empty array for empty string inputs.

Backwards Compatibility Impact

This is a breaking change because str_split returns a different value for empty strings in PHP 8.2 and later.
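
A hedged illustration of where this can surface: code that counts the returned elements, or reads the first element without checking, sees different results depending on the PHP version. The variable names here are illustrative only.

```php
<?php
$chunks = str_split('');

// 1 on PHP versions before 8.2, 0 on PHP 8.2 and later.
echo count($chunks), "\n";

// Reading $chunks[0] directly would emit an "Undefined array key" warning
// on PHP 8.2 and later. A null-coalescing default behaves the same everywhere.
$first = $chunks[0] ?? '';
var_dump($first); // string(0) ""
```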

Note that using str_split to split a string into characters (as opposed to bytes) is incorrect, because str_split only splits on a given number of bytes, even if a character takes multiple bytes to represent. Using mb_str_split is recommended in that case, because it correctly accounts for multi-byte characters.

Replacing a str_split call with mb_str_split (available since PHP 7.4 with the Mbstring extension) is often the best approach, because it handles multi-byte characters and behaves consistently across PHP versions:

- str_split($value);
+ mb_str_split($value);

In the rare case that the application needs an array containing an empty string for empty input (despite this being semantically incorrect), handling the empty string explicitly produces consistent results across PHP versions:

- $bytes = str_split($value);
+ $bytes = $value === '' ? [''] : str_split($value);
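
If the same check is needed in several places, it could be wrapped in a small helper. The sketch below is one possible shape, not part of PHP itself: the function names str_split_legacy and str_split_strict are hypothetical. The second helper shows the opposite case, producing the PHP 8.2 result on older versions as well.

```php
<?php
/**
 * Hypothetical helper: always returns [''] for an empty string,
 * matching the behavior of PHP versions before 8.2.
 */
function str_split_legacy(string $value, int $length = 1): array
{
    return $value === '' ? [''] : str_split($value, $length);
}

/**
 * Hypothetical helper: always returns [] for an empty string,
 * matching the behavior of PHP 8.2 and later.
 */
function str_split_strict(string $value, int $length = 1): array
{
    return $value === '' ? [] : str_split($value, $length);
}

var_dump(str_split_legacy('') === ['']);              // bool(true)
var_dump(str_split_strict('') === []);                // bool(true)
var_dump(str_split_legacy('abc', 2) === ['ab', 'c']); // bool(true)
```

Non-empty input is passed straight through to str_split, so both helpers differ from the built-in function only in the empty-string case.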

Implementation