PHP 8.4: Mbstring: New mb_trim, mb_ltrim, and mb_rtrim functions

Version: 8.4
Type: New Feature

PHP 8.4 adds mb_ function equivalents for the existing trim, ltrim, and rtrim functions.

The trim, ltrim, and rtrim functions strip white-space characters from both ends, the beginning, or the end of a string, respectively. By default, they strip the space ( ), tab (\t), LF (\n), CR (\r), NUL-byte (\0), and vertical tab (\v) characters.
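As a quick illustration of the existing single-byte functions, the following sketch trims a sample string and then shows two limits of trim(): a no-break space is not in its default list, and a multi-byte character passed as the character list is treated as individual bytes. The variable names and sample strings are made up for this example.

```php
<?php
// Default behaviour: ASCII white-space is stripped from the chosen side(s).
$input = " \t\nHello World\r\n\0";

$both  = trim($input);  // both ends stripped
$left  = ltrim($input); // only the beginning stripped
$right = rtrim($input); // only the end stripped

// A no-break space (U+00A0) is not in trim()'s default list, so it survives.
$nbsp = "\u{00A0}Hello\u{00A0}";
$nbspUnchanged = trim($nbsp) === $nbsp; // true

// trim() compares bytes, not characters. In UTF-8, U+00E0 is C3 A0 and
// U+00E1 is C3 A1, so trimming U+00E1 also removes the first byte of U+00E0.
$broken = trim("\u{00E0}bc\u{00E1}", "\u{00E1}");

var_dump($both, $left, $right, $nbspUnchanged, bin2hex($broken));
```

The last value is no longer valid UTF-8, because half of the leading character was removed.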

The new mb_trim, mb_ltrim, and mb_rtrim functions are multi-byte aware: they operate on characters rather than bytes, and can trim any multi-byte character. The default list of trimmed characters is also extended to include the Unicode Z (separator) category and a few other characters that are typically trimmed. The list of characters to trim can be specified as an optional parameter.
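A minimal usage sketch of the new functions, assuming PHP 8.4 or later with the mbstring extension loaded and UTF-8 as the internal encoding. The sample strings and variable names are illustrative only.

```php
<?php
// mb_trim() and its variants need PHP 8.4+ (or a polyfill).
if (!function_exists('mb_trim')) {
    echo "mb_trim() is not available on this PHP version\n";
    return;
}

// Ideographic space (U+3000), no-break space (U+00A0), and em space (U+2003)
// are all part of the default list.
$text = "\u{3000}\u{00A0}Hello \u{2003}";

$both  = mb_trim($text);  // "Hello"
$left  = mb_ltrim($text); // leading characters removed only
$right = mb_rtrim($text); // trailing characters removed only

// A custom list may contain multi-byte characters.
$custom = mb_trim("¡¡Hola!!", "¡!"); // "Hola"

// Characters are compared whole, so U+00E0 is left intact when trimming U+00E1.
$safe = mb_trim("\u{00E0}bc\u{00E1}", "\u{00E1}");

var_dump($both, $left, $right, $custom, $safe);
```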

The trim() function and its variants support defining a range of characters with the .. notation. For example, trim('testABC', 'A..E') is equivalent to trim('testABC', 'ABCDE'). This is not supported in mb_trim() and its variants.
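The difference in how the two-dot range notation is handled can be sketched as follows. The mb_trim() calls are guarded because they need PHP 8.4 or later; the sample strings are made up for illustration.

```php
<?php
// Range notation: 'A..E' expands to the characters A, B, C, D, and E.
$ranged   = trim('testABC', 'A..E');
$explicit = trim('testABC', 'ABCDE');
var_dump($ranged === $explicit); // both are "test"

// mb_trim() takes the list literally: the characters 'A', '.', and 'E'.
if (function_exists('mb_trim')) {
    $unchanged = mb_trim('testABC', 'A..E'); // "C" is not in the list
    $literal   = mb_trim('testA.E', 'A..E'); // "A", ".", and "E" are removed
    var_dump($unchanged, $literal);
}
```

To trim a range of characters with mb_trim(), every character has to be listed explicitly.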

Function Synopses

The mb_trim, mb_ltrim, and mb_rtrim functions follow the signatures of trim, ltrim, and rtrim, with an updated default list of characters to trim and an additional $encoding parameter.

See the default list of characters trimmed for a comparison of the characters trimmed by default in trim() variants and mb_trim() variants.

mb_trim

Multi-byte safely strip white-space (or other characters) from the beginning and end of a string.

/**  
 * Multi-byte safely strip white-spaces (or other characters) from the beginning and end of a string.  
 * 
 * @param string $string The string that will be trimmed.  
 * @param string $characters Optionally, the stripped characters can also be specified using the $characters parameter. Simply list all characters that you want to be stripped.  
 * @param string|null $encoding The character encoding. If omitted or null, the internal character encoding value is used.  
 *
 * @return string The trimmed string.  
 */
function mb_trim(string $string, string $characters = " \f\n\r\t\v\x00\u{00A0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200A}\u{2028}\u{2029}\u{202F}\u{205F}\u{3000}\u{0085}\u{180E}", ?string $encoding = null): string {}

mb_ltrim

Multi-byte safely strip white-spaces (or other characters) from the beginning of a string.

/**  
 * Multi-byte safely strip white-spaces (or other characters) from the beginning of a string.  
 *
 * @param string $string The string that will be trimmed.  
 * @param string $characters Optionally, the stripped characters can also be specified using the $characters parameter. Simply list all characters that you want to be stripped.  
 * @param string|null $encoding The character encoding. If omitted or null, the internal character encoding value is used.  
 *
 * @return string The trimmed string.  
 */
function mb_ltrim(string $string, string $characters = " \f\n\r\t\v\x00\u{00A0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200A}\u{2028}\u{2029}\u{202F}\u{205F}\u{3000}\u{0085}\u{180E}", ?string $encoding = null): string {}

mb_rtrim

Multi-byte safely strip white-spaces (or other characters) from the end of a string.

/**  
 * Multi-byte safely strip white-spaces (or other characters) from the end of a string.  
 *
 * @param string $string The string that will be trimmed.  
 * @param string $characters Optionally, the stripped characters can also be specified using the $characters parameter. Simply list all characters that you want to be stripped.  
 * @param string|null $encoding The character encoding. If omitted or null, the internal character encoding value is used.  
 *
 * @return string The trimmed string.  
 */
function mb_rtrim(string $string, string $characters = " \f\n\r\t\v\x00\u{00A0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200A}\u{2028}\u{2029}\u{202F}\u{205F}\u{3000}\u{0085}\u{180E}", ?string $encoding = null): string {}

Characters trimmed by default

The following table shows the characters trimmed by default by trim() (including ltrim() and rtrim()) and mb_trim() (including mb_ltrim() and mb_rtrim()) functions.

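Name | Character(s)/Regexp | Unicode/Regex representation | Removed by trim() | Removed by mb_trim()
--- | --- | --- | --- | ---
Space | (space) | \u{0020} | Yes | Yes
Tab | \t | \u{0009} | Yes | Yes
End of Line | \n | \u{000A} | Yes | Yes
Line Tabulation | \v | \u{000B} | Yes | Yes
Carriage Return | \r | \u{000D} | Yes | Yes
Form Feed | \f | \u{000C} | No | Yes
Unicode Space Separator | | \p{Z} | No | Yes
Null bytes | \0 | \u{0000} | Yes | Yes
Next Line | | \u{0085} | No | Yes
Mongolian Vowel Separator | | \u{180E} | No | Yes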
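The table can be checked empirically with a small script such as the one below. It is a sketch only: the $candidates list is a hand-picked sample (the no-break space stands in for the whole Unicode separator group), and the mb_trim() column is filled in only when the function exists (PHP 8.4 or later).

```php
<?php
// Sample characters, keyed by a descriptive name.
$candidates = [
    'Space'                     => " ",
    'Tab'                       => "\t",
    'End of Line'               => "\n",
    'Line Tabulation'           => "\v",
    'Carriage Return'           => "\r",
    'Form Feed'                 => "\f",
    'Null byte'                 => "\0",
    'No-Break Space'            => "\u{00A0}",
    'Next Line'                 => "\u{0085}",
    'Mongolian Vowel Separator' => "\u{180E}",
];

$removedByTrim = [];
$removedByMbTrim = [];
foreach ($candidates as $name => $char) {
    // Wrap a marker in the candidate and see whether only the marker is left.
    $wrapped = $char . 'x' . $char;
    $removedByTrim[$name] = trim($wrapped) === 'x';
    if (function_exists('mb_trim')) {
        $removedByMbTrim[$name] = mb_trim($wrapped) === 'x';
    }
}

foreach ($removedByTrim as $name => $removed) {
    $mb = array_key_exists($name, $removedByMbTrim)
        ? ($removedByMbTrim[$name] ? 'Yes' : 'No')
        : 'n/a';
    printf("%-26s trim(): %-3s mb_trim(): %s\n", $name, $removed ? 'Yes' : 'No', $mb);
}
```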

User-land PHP Polyfill

It is possible to mimic the functionality of the new mb_trim, mb_ltrim, and mb_rtrim functions using user-land PHP with a Regular Expression.

See polyfills/mb-trim for a complete polyfill that provides these new functions to older PHP versions. This polyfill can be installed via composer:

composer require polyfills/mb-trim
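To illustrate the regular-expression approach, here is a minimal sketch of how such a fallback might look. It is not the code of the polyfills/mb-trim package: the example_* function names and the constant are hypothetical, the sketch assumes UTF-8 input only, and it omits the $encoding parameter.

```php
<?php
declare(strict_types=1);

// Default trim set written as the body of a PCRE character class.
// \p{Z} covers the Unicode separator characters; the rest are listed explicitly.
const EXAMPLE_DEFAULT_TRIM_CLASS = ' \f\n\r\t\x{0B}\x{00}\p{Z}\x{0085}\x{180E}';

// Hypothetical helper shared by the three wrappers below.
function example_trim_with_regex(string $string, ?string $characters, bool $left, bool $right): string
{
    if ($characters === '') {
        return $string; // nothing to trim
    }

    // Escape a custom list so every character is matched literally.
    $class = $characters === null
        ? EXAMPLE_DEFAULT_TRIM_CLASS
        : preg_quote($characters, '/');

    $alternatives = [];
    if ($left) {
        $alternatives[] = '\A[' . $class . ']+';
    }
    if ($right) {
        $alternatives[] = '[' . $class . ']+\z';
    }

    // The "u" modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/' . implode('|', $alternatives) . '/u', '', $string);

    // preg_replace() returns null on failure (e.g. invalid UTF-8); return the input as-is.
    return $result ?? $string;
}

function example_mb_trim(string $string, ?string $characters = null): string
{
    return example_trim_with_regex($string, $characters, true, true);
}

function example_mb_ltrim(string $string, ?string $characters = null): string
{
    return example_trim_with_regex($string, $characters, true, false);
}

function example_mb_rtrim(string $string, ?string $characters = null): string
{
    return example_trim_with_regex($string, $characters, false, true);
}

var_dump(example_mb_trim("\u{3000}\u{00A0} Hello \u{2003}\n"));
var_dump(example_mb_trim("¡¡Hola!!", "¡!"));
```

The anchors \A and \z are used instead of ^ and $ so that a trailing newline is never skipped when matching the end of the string.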

Backward Compatibility Impact

mb_trim, mb_ltrim, and mb_rtrim are new functions declared in the global namespace. This change should not cause any backward compatibility impact unless there are functions with the same names.

For older PHP versions, these functions can be implemented in user-land PHP code.

