PHP 8.3: highlight_file and highlight_string output HTML changes

Version: 8.3
Type: Change

PHP's highlight_file and highlight_string functions provide syntax highlighting support for PHP. They accept a file or a string containing PHP code, and return an HTML snippet with PHP keywords, functions, and other tokens highlighted. The colors of the syntax highlighter are configured through PHP INI directives.
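
As a minimal sketch of that configuration, the snippet below overrides one highlighter color at runtime. It assumes the highlight.keyword directive can be changed with ini_set() in your environment; the color value and the sample code string are arbitrary illustrations.

```php
<?php
// Assumption: highlight.keyword is changeable at runtime in this setup.
// '#AA00AA' is an arbitrary example color, not a PHP default.
ini_set('highlight.keyword', '#AA00AA');

// Pass true as the second argument to get the HTML back instead of printing it.
$html = highlight_string('<?php echo "Hi";', true);

// Keywords such as `echo` should now be wrapped in a span using the new color.
echo $html;
```

The same directives can also be set in php.ini, which is usually the better place for a site-wide color scheme.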

PHP 8.3 makes some changes to the syntax highlighter that alter the HTML it produces.

<?php
// The PHP snippet to highlight, stored as a nowdoc string.
$code = <<<'PHP'
<?php
function hello(): void {
    echo "Hello World";
}

hello();
PHP;

echo highlight_string(string: $code, return: true);

PHP 8.3 changes how the highlighter processes whitespace, and now wraps the output in a <pre></pre> HTML tag. Further, it no longer converts new-line characters to HTML <br /> tags, so the highlighted HTML output now spans multiple lines.

The following is a diff of the HTML output for the PHP snippet above in PHP versions prior to PHP 8.3, and on PHP 8.3:

- <code><span style="color: #000000"> <span style="color: #0000BB">&lt;?php<br /></span><span style="color: #007700">function&nbsp;</span><span style="color: #0000BB">hello</span><span style="color: #007700">():&nbsp;</span><span style="color: #0000BB">void&nbsp;</span><span style="color: #007700">{<br />&nbsp;&nbsp;&nbsp;&nbsp;echo&nbsp;</span><span style="color: #DD0000">"Hello&nbsp;World"</span><span style="color: #007700">;<br />}<br /><br /></span><span style="color: #0000BB">hello</span><span style="color: #007700">();</span> </span> </code>
+ <pre><code style="color: #000000"><span style="color: #0000BB">&lt;?php
+ </span><span style="color: #007700">function </span><span style="color: #0000BB">hello</span><span style="color: #007700">(): </span><span style="color: #0000BB">void </span><span style="color: #007700">{
+     echo </span><span style="color: #DD0000">"Hello World"</span><span style="color: #007700">;
+ }
+ 
+ </span><span style="color: #0000BB">hello</span><span style="color: #007700">();</span></code></pre>

Changes in detail

There are three main changes to PHP's built-in syntax highlighter in PHP 8.3:

  1. Output is now wrapped in <pre><code></code></pre>
  2. Line-breaks no longer converted to <br /> tags
  3. White spaces and tabs are no longer converted to HTML entities

1. Output wrapped in <pre><code></code></pre> tags

In PHP 8.3 and later, the output of highlight_file and highlight_string is wrapped in <pre><code></code></pre> tags. The outermost <span> tag emitted by previous PHP versions is removed, and its style="color: ..." attribute moves to the <code> tag.

highlight_string('');
- <code><span style="color: #000000">
- </span>
- </code>
+ <pre><code style="color: #000000"></code></pre>

By default, HTML <pre> elements are block-level elements. Using highlight_string or highlight_file for inline code snippets (within <p> tags, for example) will therefore likely break the layout unless the <pre> element is styled to be inline.
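
For applications that would rather not emit a <pre> element for inline snippets at all, one option is to strip the wrapper from the returned HTML. The following is a rough sketch; highlight_inline is a hypothetical helper name, not a PHP function, and it assumes the wrapper is exactly a leading <pre> and a trailing </pre> as shown above.

```php
<?php
// Hypothetical helper (not part of PHP): returns highlighted HTML without
// the <pre> wrapper that PHP 8.3+ adds, for use inside inline content.
function highlight_inline(string $code): string
{
    $html = highlight_string($code, true);

    // On PHP < 8.3 there is no <pre> wrapper, so the output is left untouched.
    if (str_starts_with($html, '<pre>') && str_ends_with($html, '</pre>')) {
        $html = substr($html, strlen('<pre>'), -strlen('</pre>'));
    }

    return $html;
}

echo highlight_inline('<?php echo "Hi";');
```

Without the <pre> element, browsers collapse the new-line characters that PHP 8.3 emits, so this approach only suits single-line snippets.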

2. Line-breaks no longer converted to <br /> tags

In PHP 8.3, the highlight_file and highlight_string functions no longer convert new-line (\n) characters to HTML <br /> tags. The line-breaks of the original snippet are preserved as-is in the output.

$code = <<<'PHP'
<?php
echo "Hello";
echo "World";
PHP;
echo highlight_string(string: $code, return: true);
- <code><span style="color: #000000"> <span style="color: #0000BB">&lt;?php<br /></span><span style="color: #007700">echo&nbsp;</span><span style="color: #DD0000">"Hello"</span><span style="color: #007700">;<br />echo&nbsp;</span><span style="color: #DD0000">"World"</span><span style="color: #007700">;</span> </span> </code>
+ <pre><code style="color: #000000"><span style="color: #0000BB">&lt;?php
+ </span><span style="color: #007700">echo </span><span style="color: #DD0000">"Hello"</span><span style="color: #007700">;
+ echo </span><span style="color: #DD0000">"World"</span><span style="color: #007700">;</span></code></pre>

3. White spaces and tabs no longer converted to HTML entities

Prior to PHP 8.3, the highlighter converted each space character to an &nbsp; HTML entity, and each tab (\t) character to four such entities (&nbsp;&nbsp;&nbsp;&nbsp;). In PHP 8.3 and later, space characters are left unchanged, and tab characters are converted to four standard space characters.
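
To see which behavior a given PHP installation exhibits, a small check along the following lines may help. It is only a sketch: the input string and the $report array keys are made up for illustration.

```php
<?php
// Illustrative input: a tab-indented line and a string containing a space.
$code = "<?php\n\techo 'a b';";
$html = highlight_string($code, true);

// Record which whitespace encodings appear in the highlighted HTML.
$report = [
    'has_nbsp_entity' => str_contains($html, '&nbsp;'),
    'has_br_tag'      => str_contains($html, '<br />'),
];

var_dump(PHP_VERSION, $report);
```

Based on the changes described above, both flags are expected to be true before PHP 8.3 and false on PHP 8.3 and later.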


Backward Compatibility Impact

The function signatures of the highlight_file and highlight_string functions remain unchanged. However, applications that use these functions to display highlighted PHP snippets might need to update their CSS to account for the changes, most notably the introduction of the <pre><code> tag pair.

Further, these functions no longer convert new lines to HTML <br /> tags. However, browsers should still render the new-line characters as new lines because of the <pre> tag.

Given the complexity of parsing HTML, it might not be possible to easily convert the outputs of PHP < 8.3 and PHP >= 8.3 to be identical in either direction.
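
Where the goal is only to compare what was highlighted (in a test suite, for example) rather than to reproduce identical markup, one workaround is to reduce both formats to plain text. The sketch below is a best-effort approach under stated assumptions: highlighted_plain_text is a hypothetical helper, it relies on the output shapes shown earlier, and it treats every &nbsp; entity in pre-8.3 output as an ordinary space.

```php
<?php
// Hypothetical helper (not part of PHP): reduces highlighter output from
// either format to the plain text of the highlighted code.
function highlighted_plain_text(string $html): string
{
    if (!str_starts_with($html, '<pre>')) {
        // Pre-8.3 format: literal new-lines are layout only, <br /> marks
        // the real line-breaks, and spaces are emitted as &nbsp; entities.
        $html = str_replace(["\r", "\n"], '', $html);
        $html = str_replace('<br />', "\n", $html);
        $html = str_replace('&nbsp;', ' ', $html);
    }

    // Drop the <pre>, <code>, and <span> tags, then decode &lt; and friends.
    return html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

$code = "<?php\necho \"Hello\";\necho \"World\";";
var_dump(highlighted_plain_text(highlight_string($code, true)) === $code);
```

Because tabs are expanded to four spaces in both formats, input containing tab characters will not round-trip exactly.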

However, there should be no visible difference in the rendered output after this change, as long as browsers render new-line characters as new lines within <pre> tags, and applications that highlight inline PHP snippets style <pre> tags as inline with CSS. The following is a suggested CSS snippet to make the <pre><code> tags inline when used within a <p> tag:

p pre {
    display: inline;
}
