The operator type

My first intention was to name this entry after one of these quotes about assumptions, because in the past week I found out (again) that they are totally correct.

Last Sunday I started to work on typing the operators of PHP. My first attempt was really basic: simply follow the documentation and everything will be alright. So I started with the arithmetic operators '+', '-' and '*'. They all act the same on the type level in the sense that the result is always an integer, unless there is a float involved. The next operator in line was division ('/'), which always returns a float (according to the documentation). I finished off with the modulus operator, whose behavior turns out to be ... not documented.
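The rule for '+', '-' and '*' can be checked with a few lines of PHP. This is only a minimal sketch: the helper name resultType is made up for illustration and is not part of PHP-Sat. Note that gettype reports floats as "double", and that the rule assumes the integer result does not overflow (in that case PHP switches to a float).

```php
<?php
// Hypothetical helper (not from the original script): returns the type name
// of applying the arithmetic operator $op to $a and $b.
function resultType($a, $op, $b) {
    switch ($op) {
        case '+': return gettype($a + $b);
        case '-': return gettype($a - $b);
        case '*': return gettype($a * $b);
    }
    throw new InvalidArgumentException("unsupported operator: $op");
}

echo resultType(5, '+', 2), "\n";    // integer
echo resultType(5, '-', 2.0), "\n";  // double (PHP's name for float)
echo resultType(5, '*', 2), "\n";    // integer
```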

Since I want PHP-Sat to follow the semantics of PHP, I needed to know the exact semantics of this operator in PHP. Naturally this is not hard to figure out, because we can simply run PHP on some test data and use the function gettype. I wrote a little script with some lines like:
...
echo "Type of 5/2 = ", gettype(5/2), "<br />";
...
which is not only tedious, but also a complete violation of the DRY principle.

In order to comply with DRY I turned to the eval function of PHP. This function evaluates a string as PHP code and can actually return the result of its computation! I used this function within a loop to dynamically calculate the result of applying an operator to different types. You can find the complete script here, but I will show the critical part here:
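// Inner loop: $val1 comes from the enclosing loop, $op is the current operator.
foreach ($types2 as $key2 => $val2) {
    // Build a snippet such as "return 5 % 2;" and evaluate it.
    $code = 'return ' . $val1 . ' ' . $op . ' ' . $val2 . ';';
    $result = eval($code);

    // Print the type of the result, followed by its value.
    echo '' . gettype($result) . ' (' . showval($result) . ') ';
}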

The foreach-loop resides within another foreach-loop looping through a copy of the array $types2. This gives us access to two different types and the current operator. These are combined within a string which is executed. We then extract the type and the result of the execution and display it.
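To make the nested loops concrete, here is a self-contained sketch of the same idea. It is not the original script: the function name typeTable and the sample values are assumptions, and the showval helper is left out. The try/catch is there because, as far as I know, newer PHP versions (7 and up) can throw an error for some operand combinations instead of returning a value.

```php
<?php
// Sketch only: typeTable() and the sample values below are assumptions;
// the original script's arrays and its showval() helper are not shown here.
function typeTable(array $values, $op) {
    $table = array();
    foreach ($values as $left) {
        foreach ($values as $right) {
            // Build a snippet such as "return 5 % 2;" and evaluate it.
            $code = 'return ' . $left . ' ' . $op . ' ' . $right . ';';
            try {
                // @ hides warnings so they do not clutter the table.
                $table[$left][$right] = gettype(@eval($code));
            } catch (Throwable $e) {
                // Record the error class when PHP throws instead of returning.
                $table[$left][$right] = get_class($e);
            }
        }
    }
    return $table;
}

// Values are PHP source fragments, since they are pasted into the code string.
$table = typeTable(array('5', '2', '2.0'), '%');
foreach ($table as $left => $row) {
    foreach ($row as $right => $type) {
        echo "$left % $right: $type\n";
    }
}
```

Only fixed, hand-written fragments are passed to eval here; feeding it user input would be a different story.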

Basically, you pass an array with types and an operator to the function and it will output the type and result of all possible combinations. I put the output of this script here; you can find the types and values that I used before the actual result. The first few sections contain most of the unary operators; the binary operators are organized in tables. Within the tables each row corresponds to the first value given to the operator and each column to the second value. I apologize for the ugly formatting, just trying to be a good programmer.

It was now a trivial task to read the type(s) of the modulus operator from the table and be amazed. The result type is almost always an integer, but it can also be a boolean! This comes from the fact that whenever the RHS is a '0'-value the modulus cannot be calculated. Where other languages would simply halt execution, PHP returns a 'false'-value and issues a warning. This is also the case for division and is now also handled correctly in PHP-Sat.
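The typing rule read from the table (integer, or boolean when the RHS is zero) can be mirrored in a small function. This is a sketch under assumptions: checkedMod is a hypothetical name, and the zero check is written out explicitly so the example does not depend on how a particular PHP version reacts to a modulus by zero (newer versions throw a DivisionByZeroError rather than returning false).

```php
<?php
// Hypothetical helper mirroring the behaviour read from the table:
// an integer result, or false when the right-hand side is zero.
function checkedMod($lhs, $rhs) {
    // The modulus operator works on integers, so convert both operands first.
    $lhs = (int) $lhs;
    $rhs = (int) $rhs;
    if ($rhs === 0) {
        return false;
    }
    return $lhs % $rhs;
}

echo gettype(checkedMod(5, 2)), "\n"; // integer
echo gettype(checkedMod(5, 0)), "\n"; // boolean
```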

You might wonder why I have chosen to test all of the operators instead of just the modulus. The first reason is that I simply could not resist the temptation: it was just too easy to add all the operators. The second reason is that the first run of the script also involved testing the division operator. According to the documentation it always returns a float, but the table tells us that it returns either a float, a boolean or an integer! I can understand that the boolean return value is not documented, since dividing by zero is almost certainly a mistake, but the documentation specifically states that division never returns an integer! Luckily (yes, I believe it is a good thing), this problem has already been mentioned by someone else and I hope it will be fixed before the year ends.
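The integer case is easy to reproduce. In this small sketch (the helper name divisionType is made up for illustration) two integers that divide evenly give an integer, while every other combination shown gives a float, which gettype reports as "double".

```php
<?php
// Hypothetical helper: returns the type name of $a / $b.
function divisionType($a, $b) {
    return gettype($a / $b);
}

echo divisionType(5, 2), "\n";   // double: 2.5 cannot be an integer
echo divisionType(6, 3), "\n";   // integer: both operands are integers and 6 divides evenly
echo divisionType(6, 3.0), "\n"; // double: a float operand is involved
```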

Back to PHP-Sat: the todo-list regarding operators has been heavily shortened. The only ones left are the assignment and string operators. The first group involves some dynamic rules which I want to implement when I am in a more awakened state of mind. For the second group I still have to figure out which safety-type is to be assigned when the safety-types of both sides differ. I currently believe it should be the lowest safety-type, but please feel free to provide me with a counter-example that shows that this belief is wrong.
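The "lowest safety-type wins" idea for string operators could be sketched as follows. Everything here is an assumption made for illustration: the level names, their integer encoding (higher means safer) and the function name concatSafety are not PHP-Sat's actual definitions, and the rule itself is only my current belief.

```php
<?php
// Hypothetical safety levels, ordered from least to most safe.
// These names are illustrative and not taken from PHP-Sat.
define('SAFETY_UNSAFE', 0);
define('SAFETY_ESCAPED', 1);
define('SAFETY_LITERAL', 2);

// Safety-type of a concatenation: the weakest operand decides.
function concatSafety($left, $right) {
    return min($left, $right);
}

// A literal string concatenated with unsafe input is treated as unsafe.
var_dump(concatSafety(SAFETY_LITERAL, SAFETY_UNSAFE) === SAFETY_UNSAFE); // bool(true)
```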

Interfacing Feedback

My framework for feedback generation does not need a nice graphical user-interface. The input is simply a set of strings (current term, previous term and maybe some file names), and the output is also a string (the generated message). Since I am perfectly capable of using a command-line interface, why should I spend time on designing a fancy interface?

For one simple reason: Testing.

In order to know whether the feedback that is generated can actually be of use to students, I am going to test the tools on real-life people! This will not only give me a chance to get out of the lab and mingle with normal people, it will also provide useful feedback about the usability of the generated feedback. I intend to test the tools on students and teachers in a normal school setting, something which will probably result in nice anecdotes of system crashes, hanging computations and incomprehensible error messages. Definitely something to look forward to :)

I thought that the easiest way of making sure that the tools can be tested everywhere is by providing a web-interface. No problems with installation or configuration: simply fire up the browser and let's start the fun.

Unfortunately, the choice for a web-application raises some other problems like different interpretations of web-standards and dealing with response time.
The first issue is a matter of using the standards and applying some hacks here and there. The last issue is (hopefully) solved by making use of AJAX (yes, my thesis is buzzword compliant). I haven't had the pleasure of programming with this technique so this is a nice opportunity.

A different issue with the design of a graphical interface is more personal. I am not that great at designing user interfaces. Most of the GUIs I have designed have the same design or are only input-output fields. It is not that this describes my GUIs, but I will probably not bring home any awards either. This is definitely something I want to learn.

Fortunately I love a challenge, so I intend to make a high-quality web-interface for my feedback generation framework. I have read several papers and books and I think that my current design is (theoretically) pretty good. It is rather minimalistic, gives visual indications for different zones and allows the users to adapt it to their personal taste.

Please feel free to comment on the design. After all, feedback is critical for learning ;)

Summer of Code 2007

For those of you who are unaware of the occasion, today is the day that students can start sending in their applications for the Summer of Code 2007!

Another chance for students to work on open-source solutions, get a T-shirt, gain experience in working with larger projects, get a T-shirt, and get paid for doing this!

Oh yeah, did I mention you also get a cool T-shirt?

Unfortunately, I will not be able to join this year because of another world-wide event, so your chances of getting accepted have just increased :)
This year I will just try to promote the Summer of Code because I think it is a wonderful chance to do something useful and interesting.

And what better way to spread the word than with some flyers! We (that is, my girlfriend and I) have translated the flyer to Dutch, but it is also available in other languages.
Please mail/print/fax these to everyone who might or should be interested!

Taking precedence

It has finally been done: PHP-Front has perfect operator precedence!

And why can this be written down as a fact? Because of the work of Martin! He has worked on the tool in grammar-engineering-tools that generates SDF priorities. When I worked on these tools I ran into some problems because of the decisions that were made during the initial development (read: when the tools had to be finished yesterday). This resulted in a common format for PHP precedence rules that was neither YACC-like nor SDF-like. Martin has rewritten the tools to use most of the SDF representation, and this worked very well, judging by all the solved issues. The operator precedence is now encoded into PHP-Front as separate modules for both version 4 and version 5, both of them over 6k LOC.
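For readers wondering why precedence deserves this much attention, here is a small illustration in plain PHP (it is not taken from PHP-Front or its test suite). The operators '&&' and 'and' mean the same thing logically, but they bind differently relative to '=', so a parser that gets precedence wrong would build a different tree for these statements.

```php
<?php
// '&&' binds tighter than '=', so this is $a = (true && false).
$a = true && false;

// 'and' binds looser than '=', so this is ($b = true) and false.
$b = true and false;

var_dump($a); // bool(false)
var_dump($b); // bool(true)

// The classic arithmetic case: '*' binds tighter than '+'.
echo 1 + 2 * 3, "\n"; // 7
```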

The (technical) details about how these files are generated will probably make a long and interesting blog-post. Since Martin says it is hard to find such a subject, not to mention (again) that it is his work, who am I to take this subject away from him? Maybe he can find some time after preparing the LDTA-presentation for Thursday. I won't be able to make it, but if you are close to Delft you might want to sneak into the Research Colloquium. (If you do, please tape it!)

Besides bringing perfect precedence to PHP-Front, this work has also resulted in a new issue. It was not very hard to figure out: I had just misinterpreted the note in the documentation. So I immediately fixed it by removing a special case which did not have to exist, just to put the icing on the cake.

Downs and ups

Some weeks are filled with good things, others with bad things. For me, the last week was filled with both. My parents celebrated their 30th wedding anniversary, my little brother turned 21 and my last remaining grandmother passed away. You can probably figure out yourself which ones are good and which one is bad.

So the schedule of this week was a bit out of balance because of these events. However, I still managed to get some work done. The analysis for constant-propagation and the one for the safety-levels now share the same structure. This will make it easier to generalize the analysis into a more generic framework, something which will reduce code duplication.

I have also finished my thesis proposal, which means that I can now start the real graduation process. The official start date will be March 5th, next Monday. This process is supposed to take 22 weeks, in this case until the 6th of August. Since there are some other events in between, it will take some additional weeks. However, this schedule still allows me to graduate before the first of September 2007, the start of the new academic year and the end of my 5th year.

Unfortunately, people keep telling me that the chances of keeping this schedule are pretty slim. Each year, only 1 or 2 students manage to graduate within 5 years, the minimum amount of time for this study. A nice challenge, I would say :)

Oh, for those who are interested, my thesis proposal can be downloaded from this page. Please let me know if you have any questions/comments.