This not completely new because the first implementation already had this, but this representation had some problems. There was no way to model a hexadecimal escape without making things ambiguous. This problem can be solved by making the order of the internal parts explicit, but we then had a terrible representation of the string:
Note that this string is represented with 1 of those DQContent-thingies, the biggest one has three children. So every string with more then 3 parts has nested DQContents. Let me give you an example, the string "Hello \\\0123" looks like:
So this had to be solved by a post-processing step. We have to walk over the tree bottom-up to rewrite these nasty DQContent's to a nice list. It is not nice to have such a post-process-step, but this was already required because of HereDoc-strings.
The problem with the HereDoc is analogous with the problem of the Dangling-else. If you have multiple HereDoc-strings with the same label you will have to choice where the first HereDoc ends. PHP always takes the shortest HereDoc so this piece of code has two variables:
$foo = <<<BAR
$bar = <<<BAR
As long as HereDoc is ambiguous this is easily solved by choosing the right amb-node.
But after the rewrite to the new internal implementation, HereDoc became unambiguous! This is a bit frustrating because it takes the longest HereDoc, which is wrong. I spend some time in trying to get it right, but I could not make it work (yet). So a new puzzle has entered the project, happy christmas! :)
P.S. As said in the last blog, I can hope, but maybe I should just read.