Wednesday, June 28, 2006

Backwards compatibility

One of the biggest lessons I've learned from the ECMA process is how critical backwards compatibility is. When you have millions of users, the enormous weight of existing software holds you back from doing just about anything that breaks compatibility. An email from Brendan on the ES4 public mailing list (you have to subscribe to the mailing list to view the original email) crystallizes the criterion:
...we are not breaking backward compatibility except in selected and few ways that actually improve the bugscape.
In other words, if you expect that a break in compatibility will cause fewer bugs than it fixes, then it's acceptable.

Wednesday, June 21, 2006

A semantics for prophecies

Here's a quick sketch of a semantics for prophecies. Let's model a multi-threaded mini-Scheme where thread interleaving is represented by non-deterministically choosing a thread from a list of active threads and reducing it one step:
(e1, ..., E[e], e2, ...) → (e1, ..., E[e'], e2, ...)
We also need a store to represent creating and updating prophecy queues. With the store threaded through, the rule for cleaning up a terminated thread looks like:
<(e1, ..., v, e2, ...), σ>
    → <(e1, ..., e2, ...), σ>
And beta-reduction occurs in a thread like so:
<(e1, ..., E[((lambda (x) e) v)], e2, ...), σ>
    → <(e1, ..., E[e[v/x]], e2, ...), σ>
Creating a new prophecy causes a new thread to be spawned and allocates a new, empty queue on the store. The prophecy's thread marks its context with the location of the queue so that subsequent calls to yield will know where to store their values.
<(e1, ..., E[(prophecy e)], e2, ...), σ>
    → <(e1, ..., E[<prophecy: i>], e2, ..., e'), σ[i := <queue:>]>
    where e' = (prophecy-context i e)
    and i ∉ dom(σ)
When yielding a value, we look up the current thread's prophecy context to find its associated queue, and add the value to the queue. (Because this is an atomic step, queue updates are thread-safe.)
<(e1, ..., E[(yield v)], e2, ...), σ>
    → <(e1, ..., E[null], e2, ...), σ[i := <queue: v1, ..., vn, v>]>
    where E = (prophecy-context i E0)
    and σ[i] = <queue: v1, ..., vn>
Finally, reading a prophecy off the queue succeeds whenever the queue is non-empty.
<(e1, ..., E[(prophecy-next <prophecy: i>)], e2, ...), σ>
    → <(e1, ..., E[v1], e2, ...), σ[i := <queue: v2, ..., vn>]>
    where E ≠ (prophecy-context i E0)
    and σ[i] = <queue: v1, v2, ..., vn>
Conspicuously missing from this first draft are the stop-iteration exception (because I didn't feel like adding exceptions to the model) and prophecy->stream. The latter can really be considered a library function layered on top: in a language with delay and force, it's just a recursive function that produces null when it catches the stop-iteration exception.
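For concreteness, here's roughly what such a library function could look like in Scheme. This is only a sketch: it assumes a prophecy-next that blocks until a value is available and raises stop-iteration as an ordinary value once the prophecy is fulfilled (as described in the original post below); it is not the actual PLaneT implementation.
(define (prophecy->stream p)
  (delay
    ;; catch the stop-iteration sentinel and end the stream with null
    (with-handlers ([(lambda (exn) (eq? exn stop-iteration))
                     (lambda (exn) null)])
      ;; blocks until a value has been yielded; the tail is another promise
      (let ([v (prophecy-next p)])
        (cons v (prophecy->stream p))))))
The result has exactly the (promiseof (union null (cons a (streamof a)))) shape that the servlet code below consumes with force.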

The prophecy was unclear

My explanation of prophecies wasn't very clear. Ezra Cooper asked in the comments:
You say that the subexprs of the prophecy form are evaluated in a separate thread; is each one evaluated in its own separate thread? Or is one background thread generated for that whole prophecy? What happens after yielding--it queues up a value for other threads, but what does the present thread do?
The idea is that a prophecy form spawns one single background thread, which immediately goes to work evaluating the prophecy's subexpressions sequentially. Within that thread, each call to yield queues up its value and the thread immediately continues evaluating (this is in contrast to generators, which cause a transfer of control upon a yield).

When any other thread requests a value from the prophecy, it dequeues the first yielded value, blocks if no values have yet been queued, or raises a stop-iteration exception if the prophecy is complete (i.e., the prophecy's background thread has terminated and all yielded values have been dequeued).
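To make that concrete, here's a minimal sketch of one way the construct could be implemented in Scheme, assuming Racket-style threads and the racket/async-channel library; the current-queue parameter and the sentinel representation of stop-iteration are illustrative choices of mine, not the API of the actual PLaneT package.
(require racket/async-channel)

;; sentinel marking "the prophecy is fulfilled"
(define stop-iteration (gensym 'stop-iteration))

;; the queue belonging to the enclosing prophecy, if any
(define current-queue (make-parameter #f))

(define (yield v)
  ;; queue up the value; the prophecy's thread keeps right on evaluating
  (async-channel-put (current-queue) v))

(define-syntax-rule (prophecy e ...)
  (let ([q (make-async-channel)])
    ;; one background thread per prophecy, started immediately
    (thread (lambda ()
              (parameterize ([current-queue q])
                e ...
                (async-channel-put q stop-iteration))))
    q))

(define (prophecy-next p)
  ;; blocks until a value (or the sentinel) is available
  (let ([v (async-channel-get p)])
    (cond [(eq? v stop-iteration)
           (async-channel-put p stop-iteration) ; keep the sentinel for later readers
           (raise stop-iteration)]
          [else v])))
In this sketch a prophecy is represented directly by its queue; whatever the real representation, the shape is the same: one background thread per prophecy, feeding a thread-safe queue that consumers block on.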

But in my servlet example, each page in the servlet only wants to read one single yielded value, so there's no need to block the page from loading until all the values have been yielded. That's why the prophecy->stream library function provides a lazy stream view of the prophecy which the servlet can use to pull items off the queue by need. That is:
(define (view stream)
  (if (null? (force stream))
      ;; produce a web page signalling that we're done
      '(html (body "DONE"))
      ;; produce a web page displaying the first item in the stream
      (send/suspend/dispatch
       (lambda (proc->url)
         `(html
           (body
            (p ,(render-item (car (force stream))))
            ;; a link to view the next item in the stream
            (p (a ([href ,(proc->url
                           (lambda (request)
                             (view (cdr (force stream)))))])
                  "Next"))))))))
This procedure generates a web page that shows a view over the first item in the stream, with a link to display the next. When the user goes back to the previous page and reclicks the link, the servlet simply reloads this same stream (which was the cdr of the previous page's stream). In other words, we've effectively masked out any mutation: the streams provide a lazy view of an immutable sequence of values, which are nonetheless produced "eagerly" in a background thread by the prophecy.

Tuesday, June 20, 2006

Prophecies: future-generators

I've discovered a neat idiom. I have a web servlet that spawns a separate process and reads a bunch of terms from the process's output. I want to be able to view those terms one at a time in the web page, but I don't want to leave the process hanging indefinitely based on the user's interaction. At the same time, I want it to be safe for clone/back, as all web programs should be. So the program should be able to process the terms in a lazy fashion, and allow backtracking (without having to re-read terms after backtracking), but the actual reading of the terms should happen concurrently in order to close the separate process as soon as possible.

This is a sort of combination of generators, which allow programmers to separate producers from consumers, and futures, a concurrent variation on laziness: evaluation of a subexpression can happen in parallel, but evaluation of the containing expression must block if it needs the value of the subexpression.

Let's call these future-generators "prophecies." We create a prophecy with a special syntax whose subexpressions are evaluated immediately but in a separate thread from the creator of the prophecy:
(let ([p (prophecy
          (yield 1)
          (yield 2)
          (yield 3))])
  ...)
The difference between this and traditional generators is that yield does not halt evaluation in the current local evaluation context, which is happening in a separate thread. It simply queues up the value to make it accessible to the rest of the world.

We can access the yielded values with a library function prophecy-next, which takes a prophecy and blocks until there is a value available, or raises the sentinel value stop-iteration to indicate that "the prophecy is fulfilled."
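For example, a consumer that just drains the prophecy might look like this; print-all is a hypothetical helper, and the sketch assumes stop-iteration can be caught with with-handlers like any other raised value.
(define (print-all p)
  (with-handlers ([(lambda (exn) (eq? exn stop-iteration)) ; the prophecy is fulfilled
                   (lambda (exn) 'done)])
    (let loop ()
      (printf "~a~n" (prophecy-next p)) ; blocks until the next value is yielded
      (loop))))
Because prophecy-next blocks whenever the background thread hasn't yielded anything yet, the loop never busy-waits.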

Or we can extract the values lazily with prophecy->stream, which produces a stream, where (streamof a) = (promiseof (union null (cons a (streamof a)))).

Now my servlet is easy to write:
(define (start request)
  (let ([p (prophecy
            ... (launch "prog.exe") ...
            (let loop ()
              ... (yield (read-term)) ...))])
    (go (prophecy->stream p))))

(define (go stream)
  (if (null? (force stream))
      `(html (body "DONE"))
      (send/suspend/dispatch
       (lambda (proc->url)
         `(html
           (body
            (p ,(format "~a" (car (force stream))))
            (p (a ([href ,(proc->url
                           (lambda (req)
                             (go (cdr (force stream)))))])
                  "Next"))))))))
The separate program launches immediately, and all its output is read in as soon as possible. But the terms are packaged up via a prophecy into a lazy stream, which the web servlet reads in one page at a time. Users can safely backtrack, clone windows, and/or refresh any of the pages in the interaction, and the values always appear in the proper sequence, without ever having to regenerate them from the separate process.
Update: Added to PLaneT.

Wednesday, June 14, 2006

Formalizing JavaScript

Last week I gave a talk at the UCSC Software Engineering Seminar entitled Formalizing JavaScript.* I talked about the approach I've been taking to specifying the operational semantics of JavaScript, in terms of both the formal approach to the model and the engineering of the executable model.
*You'll probably want a better cursive font than Comic Sans MS, such as Monotype Corsiva, Textile, or one of the Chancery fonts.
Update: I've written up some notes to accompany the talk as well.