PHP 8.0: substr and iconv_substr return empty string for out-of-bound offsets

Version: 8.0
Type: Change

The substr, mb_substr, iconv_substr, and grapheme_substr functions in PHP provide a way to retrieve a part of the provided string.

Prior to PHP 8, if the offset parameter was larger than the length of the provided string, these functions (except mb_substr) returned a boolean false. This violated the documented function signature, which mentioned string as the return type.

For example, prior to PHP 8, all of the following calls except mb_substr return false:

substr('FooBar', 42, 3); // false
mb_substr('FooBar', 42, 3); // ""
iconv_substr('FooBar', 42, 3); // false
grapheme_substr('FooBar', 42, 3); // false

In the snippet above, the offset parameter is 42, although the string itself is only 6 characters long. substr and its counterparts from other extensions (except mb_substr) return false.

In PHP 8, the behavior of these functions has changed.

substr

substr returns an empty string if the offset is larger than the length of the string. Prior to PHP 8, it returned false.

substr('FooBar', 42); // ""

mb_substr

mb_substr already returns an empty string in all PHP versions, and is not changed in PHP 8.0.

iconv_substr

Prior to PHP 8, iconv_substr returned false if the offset was larger than the length of the provided string. In PHP 8, it returns an empty string.

iconv_substr('FooBar', 42); // ""

For negative offsets that reach past the start of the string, iconv_substr now clamps the offset to the start of the string.

iconv_substr('FooBar', -42, 4); // "FooB"

grapheme_substr

grapheme_substr returns an empty string if the offset is larger than the length of the string. Prior to PHP 8, it returned false, similar to substr.

grapheme_substr('FooBar', 42); // ""

In PHP 8.0 and later, all four functions return an empty string for an out-of-bounds offset:

substr('FooBar', 42, 3); // ""
mb_substr('FooBar', 42, 3); // ""
iconv_substr('FooBar', 42, 3); // ""
grapheme_substr('FooBar', 42, 3); // ""

The offset parameter can be negative, which makes substr return a portion of the string counted from the end of the provided string. If the magnitude of a negative offset exceeds the length of the string, the offset is clamped to the start of the string. This behavior was not changed for the substr and mb_substr functions.

The iconv_substr and grapheme_substr functions now clamp negative offsets in the same way, following the substr and mb_substr functions.
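The clamping rule can be sketched in userland PHP. This is only an illustration of the behavior described here, not the engine's actual implementation; clamp_offset is a hypothetical helper name, and the sketch measures length in bytes with strlen, as substr does.

```php
<?php
// Hypothetical helper (not part of PHP): maps any integer offset onto the
// valid range 0..$length, mimicking the clamping described in this article.
function clamp_offset(int $offset, int $length): int
{
    if ($offset < 0) {
        // Count from the end, but never move before the start of the string.
        return max(0, $length + $offset);
    }
    // Never move past the end of the string.
    return min($offset, $length);
}

$subject = 'FooBar';
$length  = strlen($subject); // 6 bytes

var_dump(clamp_offset(-2, $length));  // int(4)
var_dump(clamp_offset(-42, $length)); // int(0)
var_dump(clamp_offset(42, $length));  // int(6)
var_dump(substr($subject, clamp_offset(-42, $length), 4)); // string(4) "FooB"
```

An offset clamped to the string length yields an empty string, and a negative offset clamped to 0 yields the leading part of the string, which matches the PHP 8 results listed in this article.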


Call                               PHP < 8.0   PHP >= 8.0
substr('FooBar', 42)               false       ""
substr('FooBar', -42, 4)           "FooB"      "FooB"
mb_substr('FooBar', 42)            ""          ""
mb_substr('FooBar', -42, 4)        "FooB"      "FooB"
iconv_substr('FooBar', 42)         false       ""
iconv_substr('FooBar', -42, 4)     false       "FooB"
grapheme_substr('FooBar', 42)      false       ""
grapheme_substr('FooBar', -42, 4)  false       "FooB"
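To check which column applies to a given installation, a small probe script along these lines may help. It is a rough sketch: probe_substr_functions is a hypothetical helper name, and functions from extensions that are not loaded (mbstring, iconv, intl) are skipped. No expected output is shown because the result depends on the PHP version in use.

```php
<?php
// Hypothetical helper (not part of PHP): calls each *_substr function that is
// available and collects its result for an out-of-bounds positive offset and
// an out-of-bounds negative offset.
function probe_substr_functions(string $subject): array
{
    $results = [];
    foreach (['substr', 'mb_substr', 'iconv_substr', 'grapheme_substr'] as $fn) {
        if (!function_exists($fn)) {
            continue; // the providing extension is not loaded
        }
        $results[$fn] = [
            'offset 42'     => $fn($subject, 42),
            'offset -42, 4' => $fn($subject, -42, 4),
        ];
    }
    return $results;
}

// Print the running version next to the results for comparison.
var_dump(PHP_VERSION, probe_substr_functions('FooBar'));
```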

Backwards Compatibility Impact

The substr function now always returns a string (previously string|false), and iconv_substr no longer returns false for out-of-bounds offsets, which brings them in line with the mb_substr function.

If you relied on these functions returning false, note that they now return an empty string ("") instead, which is arguably the more semantically correct behavior.
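Code that needs to detect an out-of-bounds offset on both sides of this change can compare the offset with the string length directly instead of testing the return value. The following is a minimal sketch under that assumption; tail_or_null is a hypothetical helper name, and it works on bytes, as substr does.

```php
<?php
// Hypothetical helper (not part of PHP): returns the part of $subject after
// the first $prefixLength bytes, or null when $prefixLength is out of bounds.
function tail_or_null(string $subject, int $prefixLength): ?string
{
    // Compare against strlen() instead of testing the result for false,
    // so the outcome is the same before and after PHP 8.0.
    if ($prefixLength < 0 || $prefixLength > strlen($subject)) {
        return null;
    }
    return substr($subject, $prefixLength);
}

var_dump(tail_or_null('FooBar', 3));  // string(3) "Bar"
var_dump(tail_or_null('FooBar', 42)); // NULL
```

A plain `=== false` check on the result of substr silently stops matching on PHP 8.0, so an explicit bounds check like this avoids depending on either return value.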

This change was made after PHP 8.0 beta 4 (the final beta) was released, so it takes effect only in PHP 8.0 RC1 and later, including GA (general availability) versions. In PHP 8.0 versions prior to RC1, the grapheme_substr function was set to throw ValueError exceptions on invalid offsets.


Implementation