PHP 7.3: PCRE to PCRE2 migration


PHP uses Perl Compatible Regular Expressions (PCRE) as the underlying library for regular expressions. Up to and including PHP 7.2, PHP used the 8.x versions of the legacy PCRE library; from PHP 7.3 onward, PHP uses PCRE2. Note that PCRE2 is considered a new library, although it is based on and largely compatible with PCRE 8.x.

The PCRE2 library is more aggressive in pattern validation, so some of your existing patterns may no longer compile under PCRE2.

For example, the following snippet will fail with PHP 7.3:

preg_match('/[\w-.]+/', '');

PHP now emits a warning, and preg_match() returns false:

Warning: preg_match(): Compilation failed: invalid range in character class at offset 3.

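The problem is with the pattern: PCRE2 reads the hyphen after \w as the start of a range, which is invalid because \w is not a single character. For the hyphen to be treated as a literal, it needs to be escaped or moved to the end of the character class.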

preg_match('/[\w\-.]+/', '');

The above code should compile just fine with PHP 7.3 as well as older versions. Note how this new pattern escapes the hyphen (- to \-). This is perhaps the most common problem you'd run into with existing pattern incompatibilities.
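As a minimal sketch of the two fixes mentioned above (escaping the hyphen, or moving it to the end of the character class), the snippet below runs both variants against a few sample strings. The ^ and $ anchors and the sample inputs are assumptions added for illustration only; they are not part of the original pattern.

```php
<?php
// Two ways to make the hyphen a literal inside the character class.
$escaped  = '/^[\w\-.]+$/'; // hyphen escaped
$trailing = '/^[\w.-]+$/';  // hyphen moved to the end of the class

// Hypothetical sample inputs, chosen only for illustration.
$samples = ['file-name.txt', 'a_b.c', 'no spaces', ''];

foreach ($samples as $sample) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $a = preg_match($escaped, $sample);
    $b = preg_match($trailing, $sample);
    printf("%-16s escaped=%d trailing=%d\n", var_export($sample, true), $a, $b);
}
```

Both variants should behave identically on every input; which one you pick is mostly a matter of readability.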

This is a subtle change, but there is a chance of things going wrong. The error messages are quite useful: they show the exact offset of the offending part of the pattern. Make sure to thoroughly test your regular expression patterns. Software such as RegexBuddy can help you convert patterns to PCRE2 syntax. For more information, see PCRE2 syntax and legacy PCRE syntax. Although this can appear annoying at first glance, PCRE2 is just being less forgiving about regular expressions that were not fully compliant to begin with.
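One way to test patterns is to try compiling each of them and collect the warning text when compilation fails. The following is a rough sketch under that assumption; pattern_compile_error() is a hypothetical helper name, not a PHP built-in, and it relies only on preg_match() returning false and emitting a warning for a pattern that does not compile.

```php
<?php
/**
 * Hypothetical helper: returns null if $pattern compiles, otherwise the
 * warning text emitted by preg_match().
 */
function pattern_compile_error(string $pattern): ?string
{
    $message = null;

    // Temporarily capture warnings instead of letting them be printed.
    set_error_handler(function (int $severity, string $text) use (&$message): bool {
        $message = $text;
        return true; // mark the warning as handled
    }, E_WARNING);

    try {
        // Matching against an empty subject is enough to force compilation.
        $result = preg_match($pattern, '');
    } finally {
        restore_error_handler();
    }

    if ($result !== false) {
        return null;
    }

    return $message ?? 'unknown error';
}

// Example patterns to check; replace these with the ones from your code base.
$patterns = ['/[\w-.]+/', '/[\w\-.]+/'];

foreach ($patterns as $pattern) {
    $error = pattern_compile_error($pattern);
    echo $pattern, ' => ', $error === null ? 'OK' : $error, "\n";
}
```

On PHP 7.3 and later, the first pattern should be reported with the "invalid range in character class" message, while the second should compile. A check like this could also be run from a test suite over every pattern the application uses.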

Backwards compatibility impact

Because PCRE2 is stricter about patterns, some of your preg_match() and similar calls might stop working. The fix can range from a simple update to the pattern (for example, escaping hyphens inside a character class) to a complete rewrite of the pattern. Make sure to run your test suite to detect compilation errors.

RFC discussion Implementation