The operator type

My first intention was to name this entry after one of these quotes about assumptions, because in the past week I found out (again) that they are totally correct.

Last Sunday I started to work on typing the operators of PHP. My first attempt was really basic: simply follow the documentation and everything will be all right. So I started with the arithmetic operators '+', '-' and '*'. They all act the same on the type level in the sense that the result is always an integer, unless there is a float involved. The next operator in line was division ('/'), which always returns a float (according to the documentation). I finished off with the modulus operator, whose behavior turns out to be ... not documented.
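As a quick illustration of that rule, here is a minimal sketch that asks PHP itself for the result types of '+', '-' and '*'. The operand values are arbitrary examples, not the test data used for PHP-Sat. Note that gettype reports floats as "double".

```php
<?php
// Integer operands give an integer; a float on either side gives a float.
echo gettype(3 + 4), "\n";    // integer
echo gettype(3 - 4.0), "\n";  // double
echo gettype(3 * 4), "\n";    // integer
echo gettype(3.0 * 4), "\n";  // double
```

One case the rule as stated does not cover: an integer result that overflows (for example PHP_INT_MAX + 1) is also converted to a float.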

Since I want PHP-Sat to follow the semantics of PHP, I needed to know the exact semantics of this operator in PHP. Naturally this is not hard to figure out, because we can simply run PHP on some test data and use the function gettype. I wrote a little script with some lines like:
...
echo "Type of 5/2 = ", gettype(5/2), "<br />";
...
which is not only tedious, but also a complete violation of the DRY principle.

In order to comply with DRY I turned to the eval function of PHP. This function evaluates a string as PHP code and can actually return the result of its computation! I used this function within a loop to dynamically calculate the result of applying an operator to different types. You can find the complete script here, but I will show the critical part here:
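foreach ($types2 as $key2 => $val2) {
    // Combine both operands and the operator into one PHP statement.
    $code = 'return ' . $val1 . ' ' . $op . ' ' . $val2 . ';';
    $result = eval($code);

    // Show the type and the value of the result.
    echo gettype($result) . ' (' . showval($result) . ') ';
}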

The foreach-loop resides within another foreach-loop looping through a copy of the array $types2. This gives us access to two different types and the current operator. These are combined within a string which is executed. We then extract the type and the result of the execution and display it.
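To make the nesting concrete, here is a self-contained sketch of the same idea. It is not the original script: the helper name operator_types, the operand list and the use of '+' are assumptions chosen for illustration. The operands are PHP source literals, so they can be pasted straight into the code string.

```php
<?php
// Hypothetical helper (not from the original script): returns the result
// type of "$lhs $op $rhs" for every pair of operand literals in $operands.
function operator_types(array $operands, $op)
{
    $table = array();
    foreach ($operands as $lhs) {
        foreach ($operands as $rhs) {
            // Build a statement such as "return 5 + 2.5;" and let PHP evaluate it.
            $code = 'return ' . $lhs . ' ' . $op . ' ' . $rhs . ';';
            $table[$lhs . ' ' . $op . ' ' . $rhs] = gettype(eval($code));
        }
    }
    return $table;
}

// Example operands; none of them is zero, so no division-by-zero case arises.
$operands = array('5', '2.5', '"3"', 'true');
foreach (operator_types($operands, '+') as $expr => $type) {
    echo $expr, ' => ', $type, "\n";
}
```

Returning an array instead of echoing inside the loop keeps the evaluation separate from the formatting, which makes it easier to print the same data as a table.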

Basically you pass an array with types and an operator to the function, and it will output the type and result of all possible combinations. I put the output of this script here; you can find the types and values that I used before the actual result. The first few sections contain most of the unary operators, and the binary operators are organized in tables. Within the tables each row is the first value given to the operator and each column is the second value. I apologize for the ugly formatting, I am just trying to be a good programmer.

It was now a trivial task to read the type(s) of the modulus operator from the table and be amazed. The result type is almost always an integer, but it can also be a boolean! This comes from the fact that whenever the RHS is a '0'-value the modulus cannot be calculated. Where other languages would simply halt execution, PHP returns a 'false'-value and issues a warning. This is also the case for division and is now also handled correctly in PHP-Sat.
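A minimal sketch for checking the zero case on whatever PHP version is at hand. The helper name modulus_type is made up for this example. The boolean result is the behavior described above; newer PHP versions are assumed to throw a DivisionByZeroError instead, so the sketch reports that case separately rather than promising one outcome.

```php
<?php
// Hypothetical helper: report what "$a % $b" produces on the running PHP version.
function modulus_type($a, $b)
{
    try {
        // "@" silences the warning issued by versions that return false.
        $result = @($a % $b);
    } catch (DivisionByZeroError $e) {
        // Versions that throw end up here instead of returning a value.
        return 'DivisionByZeroError';
    }
    return gettype($result);
}

echo modulus_type(5, 2), "\n"; // integer
echo modulus_type(5, 0), "\n"; // depends on the PHP version
```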

You might wonder why I have chosen to test all of the operators instead of just the modulus. The first reason is that I simply could not resist the temptation: it was just too easy to add all the operators. The second reason is that the first run of the script also tested the division operator. According to the documentation it always returns a float, but the table tells us that it returns either a float, a boolean or an integer! I can understand that the boolean return value is not documented, since dividing by zero is almost certainly a mistake, but the documentation specifically states that division never returns an integer! Luckily (yes, I believe it is a good thing), this problem has already been mentioned by someone else and I hope it will be fixed before the year ends.
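The integer case is easy to reproduce. In this small sketch the operand values are arbitrary examples: division yields an integer when both operands are integers and the division is exact, and a float otherwise.

```php
<?php
echo gettype(5 / 2), "\n";   // double
echo gettype(4 / 2), "\n";   // integer
echo gettype(4.0 / 2), "\n"; // double
```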

Back to PHP-Sat: the todo-list regarding operators has been heavily shortened. The only ones left are the assignment and string operators. The first group involves some dynamic rules which I want to implement when I am in a more awakened state of mind. For the second group I still have to figure out which safety-type is to be assigned when the safety-types of both sides differ. I currently believe it should be the lowest safety-type, but please feel free to provide me with a counter-example that shows that this belief is wrong.
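The "lowest safety-type wins" rule can be sketched as follows. This is purely an illustration: the integer encoding of safety levels and the function name concat_safety are assumptions made for this example and say nothing about how PHP-Sat represents safety-types internally.

```php
<?php
// Hypothetical encoding: a higher number means a safer value.
// The result of a concatenation is only as safe as its least safe operand.
function concat_safety($left, $right)
{
    return min($left, $right);
}

echo concat_safety(2, 0), "\n"; // 0
echo concat_safety(1, 2), "\n"; // 1
```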
