PHP 8.2: Locale-independent case conversion
In PHP 8.2, functions that provide case conversion and case-insensitive operations only consider the ASCII character range.
When a locale is set, it changes how the underlying C library handles strings, including case conversion and case-insensitive comparisons. Prior to PHP 8.0, PHP inherited the system locale, which was often unexpected and made PHP applications behave unpredictably under certain locales. PHP 8.0 and later no longer inherit the system locale, but calling setlocale with a custom locale can still trigger these side effects, especially in relation to case folding.
In PHP 8.2 and later, PHP's internal case conversion functions are made locale-independent, which affects the following functions:
strtolower
strtoupper
lcfirst
ucfirst
ucwords
stristr
stripos
strripos
str_ireplace
All of the functions above only perform case conversion and comparisons in the ASCII character range.
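To make the ASCII-only behavior concrete, the following is a minimal sketch that assumes PHP 8.2 or later and UTF-8 encoded strings. The sample words are chosen purely for illustration; bytes outside the ASCII range pass through unchanged and are not matched case-insensitively.

```php
<?php
// Assumption: PHP 8.2+ and UTF-8 encoded source text.
$upper = strtoupper('straße');       // only the ASCII letters are converted
$lower = strtolower('ÄPFEL');        // the bytes of Ä are left untouched
$pos   = stripos('ÄPFEL', 'äpfel');  // Ä and ä are not treated as equal
$ascii = stripos('HELLO', 'hello');  // ASCII letters still match

var_dump($upper); // string(7) "STRAßE"
var_dump($lower); // string(6) "Äpfel"
var_dump($pos);   // bool(false)
var_dump($ascii); // int(0)
```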
Because PHP 8.0 and later no longer inherit the system locale, this change in PHP 8.2 should have no effect unless an application explicitly calls setlocale with a value other than "C".
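One way to find out whether an application is running under a non-default locale is to query the current setting. This is a rough sketch rather than a complete audit: passing "0" to setlocale returns the current locale without changing it, and the variable names are hypothetical.

```php
<?php
// Passing '0' makes setlocale() return the current LC_CTYPE setting
// without modifying it.
$ctype = setlocale(LC_CTYPE, '0');

// $isDefaultLocale is a hypothetical name used for illustration only.
// "C" and "POSIX" both denote the default, locale-neutral setting.
$isDefaultLocale = in_array($ctype, ['C', 'POSIX'], true);

echo $isDefaultLocale
    ? "LC_CTYPE is the default locale\n"
    : "LC_CTYPE was changed to: {$ctype}\n";
```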
For example, when the locale is set to tr_TR, PHP versions older than PHP 8.2 returned a dotted İ (LATIN CAPITAL LETTER I WITH DOT ABOVE) as the capital letter for ASCII i:
setlocale(LC_ALL, 'tr_TR');
echo strtoupper('i'); // İ
In PHP 8.2, this behavior is fixed, and the current locale has no impact on case conversions or comparisons:
setlocale(LC_ALL, 'tr_TR');
echo strtoupper('i'); // I
Backwards Compatibility Impact
PHP applications that do not call setlocale to switch to an alternative locale should not experience any change in functionality due to this change.
PHP 8.0 and later no longer respect the system locale. Overriding it with setlocale is almost always a bad idea and can cause side effects, because the locale is set per process, not per request or thread.
Applications that need to reliably convert character case across various languages should consider using the functionality provided by the intl, mbstring, or iconv extensions.
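As an illustration of the mbstring route, here is a short sketch that assumes the mbstring extension is loaded and that all strings are UTF-8. The sample strings are made up for the example.

```php
<?php
// Assumption: the mbstring extension is available and input is UTF-8.
$upper = mb_strtoupper('äpfel', 'UTF-8');
$lower = mb_strtolower('ÄPFEL', 'UTF-8');

// mb_stripos() returns a character offset, not a byte offset.
$pos = mb_stripos('Grüne ÄPFEL', 'äpfel', 0, 'UTF-8');

// Title-case each word, including non-ASCII initials.
$title = mb_convert_case('élan vital', MB_CASE_TITLE, 'UTF-8');

var_dump($upper); // string(6) "ÄPFEL"
var_dump($lower); // string(6) "äpfel"
var_dump($pos);   // int(6)
var_dump($title); // string(11) "Élan Vital"
```

These functions apply Unicode case mappings regardless of the current locale, so mb_strtoupper('i') returns I even under tr_TR. Language-specific rules such as the Turkish dotted İ are a better fit for the intl extension.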