And the answer is...

Did you enjoy thinking about the problem? I definitely had fun this weekend; pictures will be put on the website soon.

But back to last Friday's question: what will be the result? It would not have surprised me if PHP raised a NOTICE, but this is not the case. It might have surprised me if PHP used the last statement in the definition as the result, but this is also not the case. The variable $result is simply null: the function returns nothing and PHP does not complain. This is probably not what the programmer intended; either the function name is misspelled or the function is not correctly implemented. So this pattern has been added to the correctness category with id C005.
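To make the difference concrete, here is a minimal sketch that puts the function from the exercise next to a corrected version. The names sum_without_return and sum_with_return are made up for this illustration; they are not part of PHP-Sat.

```php
<?php
// Hypothetical names, for illustration only.
function sum_without_return($a, $b) {
    $a + $b;            // the sum is computed and then thrown away
}

function sum_with_return($a, $b) {
    return $a + $b;     // the sum is handed back to the caller
}

$missing = sum_without_return(1, 2);
$present = sum_with_return(1, 2);

var_dump($missing);     // NULL
var_dump($present);     // int(3)
```

Note that echo prints nothing at all for a null value, which is why the original exercise produces no output.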

I have also worked on the constant-propagation again. There is now support for superglobals and the list-statement. The support for superglobals also implies that every variable is also accessible through the $GLOBALS array.
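As a small sketch of the PHP behaviour that the constant-propagation now has to follow (this is plain PHP, not the Stratego implementation, and the names counter and next_counter are made up for the example):

```php
<?php
// Writing through $GLOBALS creates (or updates) the global variable $counter.
$GLOBALS['counter'] = 41;

function next_counter() {
    global $counter;          // the same variable, this time reached by name
    return $counter + 1;
}

// list() assigns several array elements to variables in one statement.
list($first, $second) = array('a', 'b');

echo next_counter(), "\n";    // 42
echo $first, $second, "\n";   // ab
```

So a value stored through $GLOBALS['counter'] and a value stored in $counter have to be treated as one and the same variable during propagation.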

Getting through the weekend

My upcoming weekend will be filled with a lot of fun and excitement: I have another World Jamboree troop weekend. Check out this site to see the other people in my troop.

For people who cannot spend a weekend without wondering about PHP, I have a little exercise. What is the output of the following piece of code?
<?php
function foo($param1, $param2) {
    $param1 + $param2;
}

$result = foo(1, 2);

echo $result;

?>

Will it be an error, the result of the function or nothing at all? Don't spoil the fun by running it in PHP without thinking about it! The answer was a little surprising for me.

I proudly present: the logo

The PHP-Sat logo is made by Robert van Geenhuizen.
I am very grateful that he took the time to develop this logo. I think it is 'compleet hip', which means something like 'very trendy'.

Robert made this logo with a concept in mind. The following quote describes this concept:

The creation of the PHP-Sat logo is based
upon the following:

Develop a conceptually strong logo that
uses modern illustration techniques to
make a simple, yet strong ideograph.
This results in a "Bug" stopped by an
imaginary (debug) filter, the basic
conceptual principle behind PHP-Sat.

The logo consists of many complex items
with gradient mesh and three tints, but
together they still form the basis for
this strong and modern ideograph.


The statement is very hard to translate, so here is the original statement in Dutch. If anyone can provide me with a better translation, please let me know.

Bij de creatie van het PHP-SAT logo is
uitgegaan van het volgende:

Een conceptueel sterk logo neerzetten
dat door middel van moderne
illustratietechnieken een modern maar
toch simpel en sterk beeldmerk is.
Dit resulteert in een Bug die door een
denkbeeldig (debug)-filter vliegt, het
conceptuele basisprincipe achter PHP-SAT.

Het logo bestaat uit veel complexe items
met verloopnetten en 3 kleurtinten, maar
toch vormen zij samen de basis voor dit
sterke en aanwezige, moderne beeldmerk.


Here it comes:

php-sat logo

What is going on?

If you have been wondering about what is going on, here is the answer. I attended a conference of the NLUUG last Thursday. I actually worked there checking badges, but I could also attend the talks that were given. The main reason for being at the conference was the chance to talk to people who would probably be interested in PHP-sat. I gave a demo to some of the partners of madison-gurkha, which went well, I think. They were introduced to me by Armijn Hemel, who has a brother (Tim) who works there. I had already presented the tool to Tim last Monday and he seemed to like it too. He is even learning Stratego to be able to adapt the tool.

Armijn also helped me out with some typos in the documentation and with testing the tool. He found some things that were not supported by the SDF. Some of them were rather easy to fix; others require an update of Stratego or a post-processor for the parsing.

If you have checked the issue tracker recently you might have noticed things are a bit more organized now. I have added some milestones to be able to plan more. So you can check out the roadmap to see what is going on.

Another thing that is going on is the creation of a paper. I have to write a paper for the STC and Martin suggested that it would be nice to write for a real conference as well. I will try to get it published so that the project will get an even more solid foundation. But I will certainly talk about the project at the STC and maybe also at the Stratego User Days.

To conclude this entry with a cliffhanger, the logo for the project should be ready very soon!

Anything new?

Sorry for the late update, college is really getting started so I have to get used to going to lectures and meetings again. But my php-*-projects are still going strong. The evaluation strategy can handle construction of, assigning to and reading entries from arrays now. An array in the interpreter is modeled as a map which maps keys to values. Some useful strategies have been implemented to easily add and retrieve values from the arrays. The documentation about arrays tells us that PHP also sees an array as a map, so the implementation of the semantics was not that hard. The only thing that it does not support (yet) is references. These will be added later.
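For readers who do not think of PHP arrays as maps every day, here is a short sketch in plain PHP of the three operations mentioned above: construction, assignment and reading. It only illustrates the PHP semantics that the evaluation strategy models; the variable name is made up.

```php
<?php
$map = array();               // construction of an empty array
$map['name'] = 'php-sat';     // assignment with a string key
$map[7] = 'seven';            // assignment with an integer key
$map[] = 'next';              // no key given: largest integer key + 1, so 8

echo $map['name'], "\n";      // php-sat
echo $map[8], "\n";           // next
print_r(array_keys($map));    // the keys in insertion order: name, 7, 8
```

The last assignment shows why a plain list is not enough as a model: the key of an appended value depends on the keys that are already in the map.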

The second addition is the fix for the long-standing psat-2 bug entry. It states that HereDoc should be parsed as a sequence of literals and escapes, just like double-quoted strings. This turns out to be very tricky. Double-quoted strings have a very straightforward start and end point; HereDoc does not. The end of a HereDoc is marked by a newline-label-newline sequence. So the newline was added as an escape in a literal, because we have to treat this newline in a special way if it is followed by a label. The problem that arises here is that we cannot express this within SDF. It should be written down as a follow restriction on the newline, but then we would have to know the length of the label. This length is not fixed, so we cannot do this. Without the follow restriction, a literal became ambiguous when there was a variable in front of it. The solution was to write out the definition of 'a list of literals or escapes' and to extend it with 'where there are no two literals in a row'. This solved all the problems for the moment and everybody was happy!
This happiness lasted until I tried to parse a file with a statement after the HereDoc, which failed. A statement after a HereDoc caused the parser to keep parsing until the end of the file, something that we definitely do not want. So the problem was that the end of a HereDoc was not strict enough. I tried several things, but nothing seemed to work. The optional semicolon was causing big problems. It turns out that this optional semicolon was a misinterpretation of the documentation. The documentation mentions this semicolon and I thought that it was part of the end of the HereDoc. But the semicolon is only allowed because the HereDoc is usually part of a bigger expression that needs to be closed by the semicolon. So I simplified the HereDoc end, and this allowed a nice follow restriction. All the problems are over and everybody is happy again :)
So there is now more expressive support for HereDoc, with one restriction. This restriction already existed in the first implementation, but was never really mentioned. The problem is that one cannot express within SDF that the opening and closing labels of a HereDoc should match. So a HereDoc is parsed from the first opening label until the last closing label, even if there are statements in between. This restriction is unfortunate, but it is reality for all parsers that are based on SDF. It is simply not possible to solve this within the syntax definition. It can be solved by adding a post-processing step, so an issue for this has already been created.
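The HereDoc story above is easier to follow with a concrete piece of PHP at hand. The sketch below is only an illustration in plain PHP (the variable names and the EOT label are made up); it shows a HereDoc that contains a literal part, a variable and escapes, and a HereDoc whose closing label is not directly followed by a semicolon.

```php
<?php
$tool = 'PHP-Sat';

// The body is a sequence of literals, a variable and escapes, just like a
// double-quoted string. The semicolon after the closing label ends the
// assignment statement; it is not part of the HereDoc itself.
$text = <<<EOT
Hello from $tool,
a tab:\tand a literal dollar: \$tool
EOT;

// A HereDoc inside a bigger expression: the closing label is followed by a
// newline, and the statement is only closed on the next line.
$length = strlen(<<<EOT
abc
EOT
);

echo $text, "\n";
echo $length, "\n";   // 3
```

The second statement is exactly the kind of input that shows the semicolon belongs to the surrounding statement and not to the end of the HereDoc.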

To finish this entry with some good news: there are Windows builds for both php-front and php-sat! It took us some time and some hacky implementations of Stratego routines to get everything right, but we succeeded. The hacky parts will be moved to the Stratego libraries as soon as possible. I have tested the build on my own machine and it seems to work fine. Please try it out if you have a Windows machine that you can (ab)use. Windows builds for Stratego programs are relatively new, so there could be some hidden problems.

Finding differences

Here is another example that shows that testing is useful. I had implemented the type juggling between Integer and String values and written some tests for it. All the tests succeeded on my machine, so I could commit. I was surprised when I got a mail about a failing build. If you look at the page, you will notice that the check passed for i686-linux but failed for i686-darwin. The failing tests have in common that they handle a non-empty string that does not start with a number. In that case my code ended up applying string-to-int to an empty string. This strategy is implemented in C and calls strtol. It turns out that the result of this function is not the same on both platforms if the input is an empty string. I am not a C expert, but i686-darwin seems to give an error where i686-linux does not. The problem is solved and tests for it have been added to the Stratego libraries. But it might be useful for others to know.
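For reference, this is the PHP behaviour that such tests have to reproduce on every platform, written down as a small sketch in plain PHP. The variable names are made up; the casts are the standard string-to-integer conversion.

```php
<?php
$plain   = (int) '42';      // 42
$prefix  = (int) '12abc';   // 12: only the leading digits are used
$no_num  = (int) 'abc';     // 0: non-empty, but does not start with a number
$empty   = (int) '';        // 0: the empty string

var_dump($plain, $prefix, $no_num, $empty);
```

The last two cases are the interesting ones: PHP gives 0 for both, so an implementation cannot rely on whatever the underlying C function does with an empty string.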

These implementation details might be interesting for some; others prefer an update. So what has been added to PHP-SAT? It is now possible to include files in a simple way. This means that all files are included without looking at the context. They can only be included if the file names are given as string literals, so no concatenation. They should also be on the include path, but this is the same for PHP itself.

It is also possible to print the included files. All files are printed to a file with the same file name and a post-fix '.psat'. The bug patterns within PHP-SAT are also applied to the included files, so these are checked automatically.

I am currently improving the simple evaluation to be able to handle more complex file inclusion. It should also improve file inclusion by giving include_once and require_once their normal semantics.
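The 'normal semantics' mentioned above can be sketched in plain PHP as follows. This is only an illustration of what PHP itself does, not of the PHP-SAT implementation; the temporary file and the counter name loads are made up for the example.

```php
<?php
// Write a tiny include file to a temporary location (hypothetical content):
// every time it is evaluated, it increments a global counter.
$file = tempnam(sys_get_temp_dir(), 'psat');
file_put_contents($file, '<?php $GLOBALS["loads"]++;');

$GLOBALS['loads'] = 0;

include_once $file;   // first inclusion: the file is evaluated, loads is 1
include_once $file;   // already included: skipped, loads stays 1
include $file;        // a plain include always evaluates the file: loads is 2

echo $GLOBALS['loads'], "\n";   // 2
unlink($file);                  // clean up the temporary file
```

An analysis that simply inlines every included file would count three evaluations here, which is why the evaluation has to remember which files were already included.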

The last cool thing is that almost all calls to the stratego-xtc-library are gone. This means that we could make windows-builds soon!

Thanks and progress

I think that there are many people with great ideas for projects. Most people do not get around to actually starting up these projects. It takes a lot of time which is usually not available. Without the Summer of Code I would not have had the time to start the project, let alone work on it for so many hours. Thank you Google!

The project would not be in its current state without my mentor. He helped me in setting up the development environment and automating the build and test process. Our meetings motivated me and helped me stay focused. I would recommend all (future) participants of the Summer of Code to meet with their mentor face-to-face, or at least in some interactive way. It helped me a lot, thank you Martin!

The following section is taken from my evaluation for the Summer of Code. I think it nicely summarizes the current progress of the project. The project has produced the following two libraries, together with tools to interface with them.

PHP-Front is a library with support for parsing and pretty-printing PHP, reflection of parsed sources, some generic traversals and a simple evaluation. This part is available as a separate package which provides a solid basis for transforming or inspecting PHP source code. I think that there will be more projects that are going to use this package for this purpose. One of the projects that I am already aware of is StringBorg.

PHP-SAT is the library that actually performs an analysis on the given source code. It tries to detect 7 bug patterns; more will be implemented later. It also checks pre-conditions for functions and language constructs to detect possible vulnerabilities. This last analysis will be improved over time.