Doing it with JavaScript

I have been working in the wondrous world of JavaScript for the last couple of (work-)days, and I have to say that it is kind of fun to work with. It turns out that the functionality that I want for the GUI is mostly captured in libraries, which makes it a lot easier to implement everything.

People who are interested in the current status of the GUI can always take a look here. Currently, the best way to view the page is through Firefox. I still have to do some debugging for other browsers, in particular Internet Explorer. I managed to get the designs the same across browsers by using the IE4linux project, but unfortunately the JavaScript error pop-up does not seem to work on my box. I am afraid I have to use a real Windows installation to debug this.

So what is the JavaScript supposed to provide in the interface? It enables the graphical view and the merged view. Without JavaScript the interface can still be used, but only through the text view. Once the graphical view is enabled, the engine of Scriptaculous makes it possible to drag the keyboard to any position on the screen. I can recommend this library to anyone who wants to add nice animations to their website: it is easy to use and seems to work quite well. Scriptaculous builds upon the Prototype library, which extends the objects of JavaScript with several handy properties. I have used some elements of Prototype in my own code as well; it is just too handy to ignore.
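For those who have never seen Scriptaculous: making an element draggable takes only a single call. A minimal sketch, assuming Prototype and Scriptaculous are loaded and the keyboard lives in an element with id 'keyboard' (the ids here are made up for illustration):

  // let the user drag the virtual keyboard around the screen
  new Draggable('keyboard', {
    revert: false,            // leave the keyboard where it is dropped
    handle: 'keyboard-bar'    // only drag by the title bar
  });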

When the effects for the keyboard were finished, I started to search for a way to display mathematics and logic without too much effort. During my search I stumbled upon a library called jsMath which is capable of turning LaTeX code into nice HTML. Even though the library is rather big (the fonts take up about 5MB), it is pretty easy to use. You can simply pass some LaTeX code to a function which transforms it into HTML that can be shown in a normal website. The library seems to support most of the mathematical notation of LaTeX, so I can at least use it for fractions, equations and logic.
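For the curious, using it looks roughly like this (the script path is just an example): you mark up spans with the class "math" and ask jsMath to typeset them.

  <script src="jsMath/jsMath.js"></script>

  <span class="math">\frac{2}{3 \times 5}</span>

  <script>
    // typeset all elements with class "math" on the page
    jsMath.Process(document);
  </script>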

The usage of jsMath is easy and attractive, but forcing a student to enter LaTeX code is not my idea of a friendly user interface. A normal Dutch student can understand the expression 2 / (3 * 5), but when it is written as \frac{2}{3 \times 5} it becomes harder to understand. So in order to allow a student to enter the expression in the normal notation, and view it in a nice graphical notation, I wanted to transform the first notation into the second one.

A (very naive) first approach involved splitting the string on the operators and putting the parts back together with the right notation. This works for very simple expressions, but as soon as you want to have precedence and parentheses it breaks. So the second approach needed to involve some kind of parsing.

It probably does not come as a surprise that finding a parser generator for JavaScript is quite hard. There are some people who want one, but nobody who has actually written one. The only thing I found was an example of an expression evaluator in JavaScript. This project has the elements that I want, but it has too many operators and fancy bits to be of immediate use in my project. Furthermore, it evaluates the expression, whereas I want it to print LaTeX.

Even though I could not use the expression evaluator in the project, it served as a pretty good reference for the implementation of a more generic tokenizer and parser.
This generic code makes it easier to instantiate a parser for a specific domain: you only have to supply information about the operators and literals. Just like the expression evaluator it tokenizes the input string, after which it uses the shunting-yard algorithm to transform the expression into an AST.
In this case the AST consists of objects holding other objects, which are clones of the main objects. The objects need to be cloned to prevent infinite recursion when the AST is printed (yes, something I found out the hard way).
I haven't had time to turn it into a library with a website, documentation and comments in the code; this is probably something for the future. However, I did have time to develop some tests for the code. Long live jsUnit!
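To give an idea of how the pieces fit together, here is a condensed sketch of the approach (a sketch, not the actual library code; it ignores unary minus and error handling):

  // operator precedences for the shunting-yard algorithm
  var PRECEDENCE = { '+': 1, '-': 1, '*': 2, '/': 2 };

  function tokenize(input) {
    return input.match(/\d+|[+\-*\/()]/g) || [];
  }

  // shunting-yard: turn the token list into an AST of
  // { op, left, right } nodes with { value } leaves
  function parse(tokens) {
    var output = [], ops = [];
    function reduce() {
      var op = ops.pop(), right = output.pop(), left = output.pop();
      output.push({ op: op, left: left, right: right });
    }
    tokens.forEach(function (t) {
      if (/^\d+$/.test(t)) output.push({ value: t });
      else if (t === '(') ops.push(t);
      else if (t === ')') {
        while (ops[ops.length - 1] !== '(') reduce();
        ops.pop();    // discard the '('
      } else {        // an operator: first reduce anything with higher priority
        while (ops.length && PRECEDENCE[ops[ops.length - 1]] >= PRECEDENCE[t]) reduce();
        ops.push(t);
      }
    });
    while (ops.length) reduce();
    return output.pop();
  }

  // print the AST as LaTeX
  function toLatex(node) {
    if (node.value !== undefined) return node.value;
    var l = toLatex(node.left), r = toLatex(node.right);
    if (node.op === '/') return '\\frac{' + l + '}{' + r + '}';
    if (node.op === '*') return l + ' \\times ' + r;
    return l + ' ' + node.op + ' ' + r;
  }

  // toLatex(parse(tokenize('2 / (3 * 5)'))) gives '\frac{2}{3 \times 5}'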

Another feature that is currently available is the set of buttons on the keyboard. They work by adding text snippets to the input field, so in theory you can enter an expression by using only the virtual keyboard. It took some time to figure out how to work with the caret position, but after combining several JavaScript snippets it seems to work well.
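In case anyone needs the same trick, a rough sketch of inserting a snippet at the caret of a textarea, using the Mozilla selection API (IE needs document.selection and a TextRange instead):

  function insertAtCaret(field, snippet) {
    var start = field.selectionStart, end = field.selectionEnd;
    // put the snippet in place of the current selection
    field.value = field.value.substring(0, start)
                + snippet
                + field.value.substring(end);
    // move the caret to just after the inserted snippet
    field.selectionStart = field.selectionEnd = start + snippet.length;
    field.focus();
  }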

After reading all this you might wonder whether the GUI is done. The answer is no, there are still many things that can be done! First of all, the functionality needs to be debugged for IE and Safari. Second, when the text view is hidden the keyboard does not function well. Third, using AJAX should make the updates go more smoothly. Lastly, it would be nice to be able to point to a place in the graphical view and place the caret at that position in the text view. The first three to-dos are definitely needed before the interface can be used for testing; the last one is a nice thing to have.

However, the real thing needed for testing the framework is .... the framework! It has been interesting to work on the interface with JavaScript and all, but it is not the core of my thesis. Therefore, the GUI is set aside. On to coding the framework!

The third phase

At the beginning of this week I wrote down a description of the third phase of feedback generation. This phase has access to the previous term (PT), the current term (CT) and a set of rewrite rules. The rules describe valid actions on the domain, for example adding two fractions:
A/B  + C/B -> (A+C)/B.
In the case that the PT is 1/3 + 1/3, the CT obtained after applying this rule is 2/3.

So if this is the input, what is the output? You guessed (or read) it: a feedback message! This feedback message is supposed to contain more information than 'this is incorrect', but it can only use the CT, the PT and the allowed rules. Furthermore, the algorithm is assumed to only start when a mistake has been made.

A first try

When we know that a mistake has been made, we also know that there is no path of rule applications which leads from the PT to the CT (if there were such a path, the CT would be a correct rewrite of the PT). In order to get a path between the two terms we introduce a magic rule R which exactly encodes the rewrite between the PT and the CT. In the case that the PT is 1/3 + 1/5 and the CT is 3/15, we have the following R, in which A != B != C != D:
A/B + A/D -> B/C
Now let's assume that we have the following (conditional) rewrite rules:
(1) A1/B1 + C1/B1 -> (A1 + C1)/B1  
(2) A2/B2 + C2/D2 -> (A2*D2)/(B2*D2) + (C2*B2)/(D2*B2)
(3) A3/B3 + C3/D3 -> ((A3*N) + C3)/D3 where N = D3/B3
Given this set, our task becomes to identify the rewrite rule which the student wanted to apply, but in which he made a mistake. This can be translated into finding the allowed rule which is most similar to R. We can use some form of tree-edit distance for this problem, since the rules actually represent a structured action.

There exist many different tree-edit distance algorithms; for this problem we can use a rather simple one. When we encounter two internal or two leaf nodes which cannot be matched, we replace the one with the other. This operation costs 2: one node is deleted and one node is added. When we encounter a leaf node which needs to be matched against an internal node, we check whether the leaf node is a free variable. A variable is free if there is no previous match and the variable is not subject to any restrictions. When the variable is free we simply match it against the sub-tree; otherwise we replace the variable with the sub-tree, which costs 1 + size(inserted-tree).
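In JavaScript-flavoured pseudocode the cost function looks something like the sketch below. It reuses the AST shape of my parser ({ op, left, right } for operators, { value } for leaves), treats upper-case leaves as variables, and ignores restrictions such as A != B for brevity; 'bindings' records the earlier matches.

  function size(t) { return t.op ? 1 + size(t.left) + size(t.right) : 1; }
  function isVar(t) { return t.value !== undefined && /^[A-Z]/.test(t.value); }
  function equalTree(a, b) {
    if (a.value !== undefined || b.value !== undefined)
      return a.value === b.value;
    return a.op === b.op && equalTree(a.left, b.left)
                         && equalTree(a.right, b.right);
  }

  function distance(rule, term, bindings) {
    if (isVar(rule)) {
      var bound = bindings[rule.value];
      if (bound === undefined) {             // a free variable: match it for free
        bindings[rule.value] = term;
        return 0;
      }
      if (equalTree(bound, term)) return 0;  // consistent with the earlier match
      return 1 + size(term);                 // replace the variable by the sub-tree
    }
    if (rule.op !== undefined && term.op !== undefined && rule.op === term.op)
      return distance(rule.left, term.left, bindings)
           + distance(rule.right, term.right, bindings);
    return size(rule) + size(term);          // delete one sub-tree, insert the other
  }

Running this over the LHS and RHS of a rule with a shared 'bindings' object reproduces the calculation below.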

To illustrate the distance we calculate the difference between (1) and R. Within the LHS we need to replace B1 with B, which costs us 2. The rest of the variables can be matched against each other (so A = A1, etc.). Within the RHS we need to replace A1+C1 with B, which costs 3 + 1 = 4. This replacement needs to be done because B is already matched against B1 in the LHS. The last step is to replace the B1 in the RHS with C, which costs 2. This has to be done because we already matched B1 to B in the LHS, and we know that B != C. This results in a total distance between R and (1) of 8. If we perform the same algorithm for the other rules we end up with distance(R,(2)) = 16 and distance(R,(3)) = 8.
Even though (1) and (3) have the same score, we end up with rule (3) as our guess. This is because rule (1) cannot be applied to the PT, so it is less likely that this was the intention of the student.

Unfortunately, we are not done yet, because the student could have taken a correct step before making a mistake. To model this we calculate a new set of PTs by applying the rules where possible. This results in two new PTs: PT1' = (1*5)/(3*5) + (1*3)/(5*3) and PT2' = ((1*5)+1)/5. Calculating the rules R1, R2 and all the distances, we get six new distances. None of these distances is smaller than the distance of 8 we already have (the proof of this claim is left as an exercise for the reader), so we stop the recursion and return a feedback message containing rule (3) as a guess.
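For the record, the search loop looks roughly like this; makeRule, applicableRules and apply are assumed helpers (they do not exist yet), and 'distance' is the cost function sketched above:

  function bestGuess(pt, ct, rules, best) {
    var r = makeRule(pt, ct);      // the magic rule encoding PT -> CT
    var improved = false;
    rules.forEach(function (rule) {
      var bindings = {};
      var d = distance(rule.lhs, r.lhs, bindings)
            + distance(rule.rhs, r.rhs, bindings);
      if (d < best.distance) {
        best = { distance: d, rule: rule };
        improved = true;
      }
    });
    if (!improved) return best;    // no smaller distance: stop the recursion
    applicableRules(rules, pt).forEach(function (rule) {
      best = bestGuess(apply(rule, pt), ct, rules, best);
    });
    return best;
  }

  // initial call: bestGuess(pt, ct, rules, { distance: Infinity })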

Although I knew we needed to test this algorithm more thoroughly, it seemed correct to me. However, the assumption that the recursion can be stopped when there is no distance <= the lowest distance found so far cannot be made out of the blue. This has to do with confluence, a property that is not guaranteed for the set of rewrite rules. Sigh, just when I thought that I had a fitting key for the lock, I could not turn it to unlock the door.

Getting the facts

The concept of confluence came up on Thursday and I have been trying to get around it until today. The first thing I did was to sum up some properties about the scope of the problem.
  • The terms are rather small (100 nodes or less)
  • There are few steps between the start and the answer (15 or less)
  • All rules are applied with a reason
The first two assumptions follow from the fact that teachers want students to practice the same routine in many different situations. The last one comes from the fact that students know, for example, that the numerator and the denominator of a fraction may be multiplied by the same number, but they do not do this at random. Such rules always have a context in which they are used. Therefore we can assume that the allowed rules always rewrite a term towards an answer.

With these assumptions in mind I noticed that the domain of fractions has only a few allowed rules. The same holds for the domain of equations and the domain of rewriting logic formulas to CNF; there are only 6 allowed rules for this last domain! Even if we combine rewrite rules, by unifying the RHS of a rule with the LHS of another rule, we only get 9 rules for the domain of logic. These are not the kind of numbers a computer cannot handle within a few milliseconds.

Introducing new rules by unification can be viewed as applying the rules one after another to the same (part of the) term. For example, if we unify the rules (2) and (1) from above we get:
(1.2) A2/B2 + C2/D2 -> (A1+C1)/B1
      where A1 = A2*D2, C1 = C2*B2, B1 = B2*D2 = D2*B2
If we perform the same algorithm as above, this rule surfaces as the result with a distance of only 4. The only right thing to do now is to return both rules (1) and (2) as the result. We cannot deduce which of the two rules went wrong, but we know that the error is somewhere along this path.

If we compare this with the first result of the algorithm we see that it is completely different. However, if we look at the example we see that it is indeed more likely that the student wanted to apply the rules (1) and (2), a guess which matches my intuition better because of the matching denominators in the answer. Furthermore, this adapted version of the algorithm also works on the other examples that I tried, even the ones for the domain of rewriting logic!

The solution?

Now that the improved algorithm seems to work, it would be nice if we could prove the claim 'we can stop when the distance becomes greater'. We can view the path between the PT and the CT as a sequence of allowed rule applications in which there is an error. This error can be anywhere on the path. So if each i is an allowed rule application, we have something like i_n .. i_m . error . i_k .. i_j. The recursion in the algorithm strips off the first set of rules, so we end up with error . i_k .. i_j.
Because we assume that all rules rewrite the term towards an answer, we argue that stripping off allowed rules results in a smaller distance. When we 'jump over' the error the distance becomes bigger, because we get strange R's which do not look like an allowed rewrite rule at all; otherwise we could have stripped off another rule.

When we combine this reasoning with the fact that we only have a small sequence of steps, I believe that this algorithm will at least give better feedback in many practical situations. Whenever it is not possible to make a good guess we can always fall back to a message like: 'We do not know what you did, can you try to do it in smaller steps?'. I don't see a problem here because the tools are focused on practicing a task in small steps. Now let's see what my supervisor thinks of it.

Thinking of Merging

I have been dealing with some challenges in the last couple of days, for example my practical assignment for DBA, to be implemented in MIL, and styling the GUI for my thesis, which is coming along just fine I guess. Both of these tasks are easy for the people who invented them or work with them every day, but I sometimes find it hard to wrap my mind around the problem or to keep track of every detail. Luckily for me there is a topic I can work on without repeatedly searching Google: analyzing PHP!

Speaking of analyzing PHP, Martin has written a blog post about the generation of rules for operator precedence in PHP. He mentions some interesting ideas for work on grammar engineering, so if you are looking for a (thesis) project you should definitely check it out. Otherwise it is just an interesting post to read.

As for my own work on operators in PHP, I have finished the rest of the operators in PHP-Sat. Furthermore, the implementation of the constant propagation for operators has been revised within PHP-Front. I did this because I am thinking about merging the constant propagation and the safety-type analysis into one big analysis. I know, it sounds like premature optimization (a.k.a. the root of all evil), but I can explain why it is necessary.

Consider the following piece of code:
  $foo = array(1,2);        // values at the internal indexes 0 and 1
  $foo[] = $_GET['bar'];    // appended at the internal index 2
  echo $foo[2];
When we consider the constant propagation, we first assign the values 1 and 2 to the first two indexes of the array. The value of $_GET['bar'] is then assigned to the third index of the array, which is the parameter of echo in the last statement. We know that the value is assigned to the third index because PHP-Front keeps track of the internal index-count of arrays.
Now let's look at the safety-type analysis. We first assign the safety-type IntegerType to the first two indexes of the array. The safety-type of $_GET['bar'] is then assigned to the third index of the array, which is the parameter of echo in the last statement. We know that the safety-type is assigned to the third index because PHP-Sat keeps track of the internal index-count of arrays.

You might have noticed that both paragraphs are almost identical, except for the kind of value that is assigned. Thinking about this case, cases for function calls and cases for objects, it turns out that performing the safety analysis involves a lot of bookkeeping. This bookkeeping, for example the internal index-count, is not specific to the propagated values; it encodes the internal semantics of PHP. Therefore, in order to provide a good safety-type analysis, we have to embed this bookkeeping.

In order to avoid code duplication, which might be worse than premature optimization, I believe we must merge the two analyses together. By separating the assignment of values from the bookkeeping we can visit a node, perform the bookkeeping and then perform as many analyses on the node as we want. The only thing needed is a list of strategies, each of which encodes a single analysis on the node.
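The real code will be Stratego, but the shape of the idea is easy to sketch in JavaScript; nextArrayIndex and the eval/store members are made-up names for illustration:

  // visit an assignment node once, do the shared bookkeeping once,
  // and let every analysis propagate its own kind of value
  function analyzeArrayAssignment(node, state, analyses) {
    var index = nextArrayIndex(node.variable, state);       // shared bookkeeping
    analyses.forEach(function (analysis) {                  // e.g. constant
      var value = analysis.eval(node.rhs, state);           // propagation and
      analysis.store(node.variable, index, value, state);   // safety-types
    });
  }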

The idea might sound a bit vague, but while I was working on the operators I already saw some duplication creeping in. Some of the strategies for the two analyses differ only in the strategies that fetch or add some value; a perfect chance to generalize a bit, I would say.