PHP 7.3: Heredoc and Nowdoc syntax requirements are more relaxed

Type: New Feature

Heredoc and Nowdoc syntax, which make multi-line strings easier to write, had a rigid requirement: the ending identifier had to be the first string on a new line.

For example:

$foo = <<<IDENTIFIER
the crazy dog jumps over the lazy fox
"foo" bar;
IDENTIFIER;

Here, the closing IDENTIFIER must be the first string on a new line for this to work. In addition, no other characters may follow the closing IDENTIFIER on that line, other than an optional semicolon.
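The cost of this rule shows up once the heredoc sits inside indented code. Here is a minimal sketch, using a hypothetical Mailer class and footer() method that are not part of the original example: the body and the closing marker have to be pushed back to column 0, no matter how deeply the surrounding code is nested.

```php
<?php
// Hypothetical class, used only to show the pre-7.3 layout.
class Mailer
{
    public function footer(): string
    {
        // Pre-7.3 rule: the closing marker must start at column 0,
        // so the heredoc body cannot follow the method's indentation.
        $text = <<<FOOTER
Kind regards,
  The Team
FOOTER;

        return $text;
    }
}

echo (new Mailer())->footer(), "\n";
```

This layout still parses on PHP 7.3 and later. Because the closing marker has no indentation, nothing is stripped from the body.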

The RFC for PHP 7.3 proposed removing this requirement to make code more readable. Before this RFC, you had to break the indentation used in the rest of the code just to use heredoc/nowdoc tokens. The RFC makes these changes to heredoc/nowdoc syntax:

  1. The ending token no longer needs to be the first string on the line.
  2. The ending token can be indented with spaces or tabs.
  3. Spaces and tabs must not be mixed in the indentation. If they are, you get Parse error: Invalid indentation - tabs and spaces cannot be mixed in .. on line ..
  4. The exact amount of white-space used to indent the ending token is stripped from the start of every line within the heredoc/nowdoc expression.
  5. If the ending token is indented further than any line within the expression, you get Parse error: Invalid body indentation level (expecting an indentation level of at least ..) in .. on line ..
  6. You can add more expressions after the ending token on the same line without any errors:
$foo = ['foo', 'bar', <<<EOT
  baz
    -  hello world! --
  ahoy
  EOT, 'qux', 'quux'
];

var_dump($foo);

The output would be:

array(5) {
  [0]=>
  string(3) "foo"
  [1]=>
  string(3) "bar"
  [2]=>
  string(29) "baz
  -  hello world! --
ahoy"
  [3]=>
  string(3) "qux"
  [4]=>
  string(4) "quux"
}

Notice how the white-space used to indent the heredoc body did not make it into the var_dump()'d output, and how we continued to add more elements to the $foo array after the EOT token.
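The indentation rules can also be checked with a short script. This is a sketch that needs PHP 7.3 or later to parse. The renderNotice() and parseErrorFor() helpers are hypothetical names introduced for illustration, and eval() is used only so that the parse errors can be caught instead of aborting the script.

```php
<?php
// Requires PHP 7.3+: older versions cannot parse indented closing markers.

// Hypothetical helper: the closing marker is indented by 4 spaces,
// so 4 spaces are stripped from the start of every body line.
function renderNotice(string $name): string
{
    return <<<TEXT
    Hello, {$name}!
      - two extra spaces are kept here
    TEXT;
}

// Nowdoc follows the same indentation rules but does not interpolate.
$raw = <<<'RAW'
    {$name} stays literal
    RAW;

// Hypothetical helper: returns the ParseError message for a snippet,
// or null when the snippet parses.
function parseErrorFor(string $code): ?string
{
    try {
        eval($code);
        return null;
    } catch (ParseError $e) {
        return $e->getMessage();
    }
}

// Body indented by 2 spaces, closing marker by 4: the body is too shallow.
$tooShallow = "\$s = <<<EOT\n  text\n    EOT;\n";

// Closing marker indented with a tab followed by a space: mixed indentation.
$mixed = "\$s = <<<EOT\n\t text\n\t EOT;\n";

echo renderNotice('Ada'), "\n";
echo $raw, "\n";
var_dump(parseErrorFor($tooShallow) !== null);
var_dump(parseErrorFor($mixed) !== null);
```

The two echo calls print the de-indented strings, and on PHP 7.3+ both var_dump() calls should report bool(true), since each snippet breaks one of the rules listed above.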

Backwards compatibility impact

As long as none of your heredoc/nowdoc string literals contain a line that begins (after any leading white-space) with the same token as the ending identifier, you are golden.

$foo = <<<HELLO
  HELLO_WORLD <-- this will not terminate the string literal
  HELLOWORLD <-- this one will not either.
  HELLO WORLD<-- this one will
HELLO;

If you have any heredoc/nowdoc syntax similar to the above, note that PHP 7.3 treats the HELLO in HELLO WORLD as the ending token, and throws a parse error at the text that follows it. In earlier versions, HELLO WORLD is not considered the ending token of the heredoc. Thanks to /u/ImSuperObjective2 on reddit for pointing this out.
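One way to check for this collision, assuming eval() is acceptable in a throwaway script, is sketched here. The hypothetical parses() helper reports whether a snippet compiles on the running PHP version. The first snippet mirrors the example above; the second renames the identifier to GREETING (an arbitrary choice) so that no body line starts with it.

```php
<?php
// Hypothetical helper: true when the snippet parses on this PHP version.
function parses(string $code): bool
{
    try {
        eval($code);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

// A body line starts with the identifier HELLO followed by a space.
$risky = <<<'CODE'
$foo = <<<HELLO
  HELLO_WORLD
  HELLOWORLD
  HELLO WORLD
HELLO;
return $foo;
CODE;

// Same body, but the identifier never appears at the start of a line.
$safe = <<<'CODE'
$foo = <<<GREETING
  HELLO_WORLD
  HELLOWORLD
  HELLO WORLD
GREETING;
return $foo;
CODE;

var_dump(parses($risky)); // false on PHP 7.3+, true on earlier versions
var_dump(parses($safe));  // true on both
```

Choosing an identifier that cannot occur at the start of a body line keeps the same source valid before and after PHP 7.3.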

RFC | Discussion | Implementation