Sunday, March 25, 2007

Bisimulation bleg

I have a question for anyone who happens to know: are bisimulations more common in process calculi such as the π-calculus because of τ transitions?

Lately I've been proving some simulation theorems for λ calculi, and none of them is a bisimulation; there are too many "administrative" reductions in the implementation semantics, so it's not possible to guarantee that every one-step reduction in the implementation semantics corresponds to a reduction sequence in the specification semantics.

Now, it's been a while since I looked at the π-calculus, but I seem to recall that you can hide a sequence of reductions inside a "black box" and call them a single transition τ. I suspect this facility allows you to pretend a sequence of transitions is a single transition in order to hide the details of an implementation semantics, effectively producing a completely indistinguishable simulation.

(This is admittedly all a little vague since my knowledge of bisimulations is scant.)
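For reference, here's roughly the standard definition I was groping for (my paraphrase from memory, so caveat lector). Write p ⇒ q for a sequence of zero or more τ transitions from p to q. A relation R is a weak simulation if whenever p R q and p —α→ p′, there is some q′ with q ⇒ —α→ ⇒ q′ and p′ R q′ (when α = τ, it's enough that q ⇒ q′); a weak bisimulation is a weak simulation whose inverse is also a weak simulation. The point is that ⇒ absorbs any number of internal steps, which is exactly the kind of slack my administrative reductions would need.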

Tuesday, March 20, 2007

John Backus (1924 - 2007)

"The beauty of FORTRAN--and the reason it was an improvement over assembly language--is that it relieves the programmer of the obligation to make up names for intermediate results."
--Andrew Appel, Compiling with Continuations

My knowledge of the early history of programming languages is limited, but I believe Backus is considered to have invented the compiler. Slashdot unfairly describes Backus's Turing Award lecture "Can Programming Be Liberated from the von Neumann Style?" as "apologizing" for the creation of FORTRAN, but I think the idea of functional programming was already there in FORTRAN. Once the idea of a FORmula TRANslator had been invented--i.e., the notion of a programming language with expressions--it was a natural step to consider languages without any statements at all.

This is to say nothing of the enormous debt of gratitude we owe him for BNF.

Update: Oh dear, I spoke too soon. Grace Hopper wrote the first compiler (thanks to the anonymous commenter for setting me straight). Apparently the FORTRAN compiler is considered one of the first successful optimizing compilers for a high-level language.

Saturday, March 10, 2007

Inside-out evaluation contexts

I've been a big fan of the "inside-out" (i.e., zipper) formulation of evaluation contexts, but almost everyone I talk to has a real problem with it. I think this is primarily due to lack of familiarity, but I do recognize that it requires you to stand on your head.

I'd thought for a while it was the only real way to specify evaluation contexts for a CEK machine semantics, but in fact it's really only an optimized representation for that approach. The outside-in definitions work just fine.

All you need in a CEK machine is to be able to match the innermost frame of a context by decomposing it into E[F]. With the inside-out formulation, frames are stacked up from inside out, so this is tantamount to looking at the head of a list. This is so efficient that the entire history of stack-based language implementations uses it. However, it's still possible to match E[F] with the outside-in formulation; it just requires recursively traversing the context until you find the innermost frame F.
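To make the contrast concrete, here's a little sketch in OCaml (all the names are mine, purely for illustration) of the two representations and the E[F] decomposition, for a calculus with just application frames:

type exp = Var of string | Lam of string * exp | App of exp * exp

(* A single frame F: either "[ ] e" (operator position) or "v [ ]"
   (operand position, with the operator already a value). *)
type frame =
  | ArgF of exp   (* [ ] e *)
  | FunF of exp   (* v [ ] *)

(* Inside-out ("zipper") contexts: a stack of frames, innermost first.
   Decomposing E as E'[F] is just looking at the head of the list. *)
type io_context = frame list

let io_decompose (k : io_context) = match k with
  | []      -> None            (* the empty context [ ] *)
  | f :: k' -> Some (f, k')

(* Outside-in contexts: the grammar E ::= [] | E e | v E read off as a
   datatype, with the outermost frame at the root. *)
type oi_context =
  | Hole
  | AppL of oi_context * exp   (* E e *)
  | AppR of exp * oi_context   (* v E *)

(* Finding the innermost frame now means walking down to the frame that
   sits immediately around the hole, rebuilding the rest on the way out. *)
let rec oi_decompose (c : oi_context) = match c with
  | Hole           -> None
  | AppL (Hole, e) -> Some (ArgF e, Hole)
  | AppR (v, Hole) -> Some (FunF v, Hole)
  | AppL (c', e)   -> (match oi_decompose c' with
                       | Some (f, rest) -> Some (f, AppL (rest, e))
                       | None           -> assert false (* c' <> Hole *))
  | AppR (v, c')   -> (match oi_decompose c' with
                       | Some (f, rest) -> Some (f, AppR (v, rest))
                       | None           -> assert false (* c' <> Hole *))

With the list representation the machine's continuation really is just a stack, which is why it's the natural fit for a CEK-style machine; the outside-in version pays a traversal per step but says the same thing.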

I think I'll probably stick with outside-in for most of my papers, since the inside-out formulation seems to generate so much angst.

Update: whoops! corrected a typo.

Progress report

It's Saturday morning and I've had a full night's sleep--the first in a while and probably the last for a while. Progress report:

I've submitted the final version of my paper with Aaron Tomb and Cormac Flanagan, Space-Efficient Gradual Typing, for the draft proceedings of Trends in Functional Programming.

Now I'm working on several papers--one or two for ICFP and one for GPCE--and a couple of talks, all with deadlines in April.

Mostly

It occurred to me that my taste in programming languages tends to run in the "mostly" direction, such as mostly functional programming and mostly hygienic macros.

Friday, March 09, 2007

Avoiding coinduction in soundness proofs

Mads Tofte's thesis uses coinduction to prove type soundness for a language with a store, because of the possibility of cycles in the store: if the value in a reference cell can refer back to the very cell that contains it (such as a closure stored in the cell it mentions), then it's not possible to find the type of a value by chasing through the store. That is, type soundness involves determining the types of values, and store addresses are one kind of value. If finding the type of an address means looking up its value in the store and finding the type of that value, but that value might itself contain the same address, then typing values is not inductive.
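Here's a tiny, hypothetical example of the kind of cycle in question, in OCaml:

(* The cell r ends up holding a closure that captures r itself, so finding
   the type of the address by chasing through the store never bottoms out. *)
let r : (int -> int) ref = ref (fun x -> x)
let () = r := (fun x -> !r x)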

But according to Mitch and Matthias, it's not actually necessary to use coinduction, as long as you set up the soundness hypothesis right.

The statement of subject reduction usually looks like this:
For any type environment Γ, store typing Σ, expression e, and store σ, if
  1. Γ;Σ ⊢ e : τ and
  2. Γ;Σ ⊢ σ and
  3. e,σ → e′,σ′
then ∃ Σ′ ⊇ Σ s.t.
  1. Γ;Σ′ ⊢ e′ : τ and
  2. Γ;Σ′ ⊢ σ′.
By quantifying over all stores in the hypothesis, you can get the type of an address by simply looking it up directly in the store typing, without passing through the actual value. So the definition of Γ;Σ ⊢ σ is
dom(σ) = dom(Σ)     ∀a ∈ dom(σ) . Γ;Σ ⊢ σ(a) : Σ(a)
----------------------------------------------------
                    Γ;Σ ⊢ σ
and finding the type of an address value is defined simply by looking up the address in the store typing:
(a : τ) ∈ Σ
---------------
Γ;Σ ⊢ a : τ ref


Update: More accurately, the hypothesis that quantifies over all stores is in the main soundness theorem:
If ∅;Σ ⊢ e : τ and ∅;Σ ⊢ σ then either
  1. e,σ diverges or
  2. e,σ →* v,σ′ and ∃ Σ′ s.t. ∅;Σ′ ⊢ v : τ and ∅;Σ′ ⊢ σ′.
The proof of this theorem is a straightforward induction over the length of the reduction sequence using progress and preservation. The interesting part is that it simply assumes from the outset that we have a store σ and a store typing Σ that agree. We can easily provide an initial store and store typing, which trivially agree, to get the actual proposition we want as a corollary, namely that evaluating the program in the initial store produces a well-typed result.
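For completeness, the progress half is the usual statement (my paraphrase): if ∅;Σ ⊢ e : τ and ∅;Σ ⊢ σ, then either e is a value or e,σ → e′,σ′ for some e′ and σ′.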

Monday, March 05, 2007

Inkscape

Creating figures in LaTeX is a pain, especially if you want to embed LaTeX in your figures. In the past I've used xfig and pstricks, which allow you to embed typeset text in a diagram, but aside from all the hoops you have to jump through (I had to massage the output files before feeding them to TeX), what you see in the GUI is uninterpreted LaTeX source. So I was always nudging things over a pixel at a time, recompiling, and checking out the PDF.

This weekend I tried Inkscape, which has a more modern UI than xfig. From what I've seen so far it's pretty nice. It comes with a plugin that lets you input a string of LaTeX source; it then shells out to compile the LaTeX source and uses pstoedit to convert the output to SVG, which gets imported back into your diagram. It's a hack, and a little slow, but it's true WYSIWYG again.

To get it to work in cygwin I had to tweak things a bit -- one very nice thing about Inkscape is that its extensions are just python source files stuck in a directory with a naming convention, which means I could tweak the file and rerun the command (without even restarting Inkscape!) and it would reload the plugin.

I created a script /usr/local/bin/tex2svg:
#!/bin/sh
# Invoked from the Inkscape extension with the plugin's temp directory as $1.

# Normalize the directory name via cygpath (Windows long path, then back to
# a Unix path) so every tool below sees a consistent file name.
base_dir=`cygpath -wl "$1"`
base_dir=`cygpath -u "${base_dir}"`

latex_file=$base_dir/eq.tex
aux_file=$base_dir/eq.aux
log_file=$base_dir/eq.log
ps_file=$base_dir/eq.ps
dvi_file=$base_dir/eq.dvi
svg_file=$base_dir/eq.svg
out_file=$base_dir/eq.out

# Typeset the formula, then convert the DVI to EPS.
latex -output-directory="${base_dir}" -halt-on-error \
    "${latex_file}" > "${out_file}" 2>&1
dvips -q -f -E -D 600 -y 5000 -o "${ps_file}" "${dvi_file}"

# Convert the PostScript to SVG for Inkscape to import.
export LD_LIBRARY_PATH=/usr/lib/pstoedit:$LD_LIBRARY_PATH
pstoedit -f plot-svg -adt -ssp "${ps_file}" \
    "${svg_file}" > "${out_file}" 2>&1
The file $inkscape/share/extensions/eqtexsvg.py is the plugin; I commented out the lines that call latex, dvips, and pstoedit, and replaced them with a single command:
os.system("bash --login -e tex2svg '" + base_dir + "'")
Python and LaTeX were having all sorts of problems communicating about file names with different conventions (Unix, DOS 8.3, Windows long), so I just basically punted all the work to a shell script, where I could have better control over the file names via cygpath.

To use it, go to Effects | Render | LaTeX Formula, type in a LaTeX formula, and it will magically appear (after a few external windows blip on and off). I don't understand scaling issues, especially when you're dealing with multiple file formats (.tex, .dvi, .ps, .svg), but for whatever reason I've found that sometimes I have to kill the -y 5000 in the dvips command to get the right size.

Also, when I tried saving the diagram directly to PDF, it ended up kind of blurry; I've had better luck saving it to EPS. That means you can't use pdflatex anymore--instead you have to use dvips followed by ps2pdf--but then the figures seem to look okay.

Thursday, March 01, 2007

'a typing

There are a lot of typing paradigms running around these days without clear definitions. One of the difficulties is that any given phrase carries a number of different meanings, including (but not necessarily limited to) the phrase's apparent meaning from its constituent words, the phrase's common usage in any one of a number of different communities, the originator's intended meaning of the phrase, the original intended application domain of the paradigm, or the subsequent historical application domain of the paradigm.

Here's what I tend to mean by the following phrases. Disclaimer: I try to keep a balance between the aforementioned categories of meaning, although my community's common usage generally takes precedence. In any case there's a lot of vagueness and variability in the terminology, not to mention controversy. Read this at your own risk.

static typing: systems for reasoning about and enforcing inherent language invariants algorithmically and without involving program evaluation. In order to guarantee the invariants, the runtime semantics is only allowed to evaluate statically well-typed programs.

dynamic typing: systems for enforcing inherent language invariants incrementally during program evaluation via runtime checks.

soft typing: systems for detecting possible violations of imposed (i.e., not inherent) language invariants algorithmically and without involving program evaluation. The runtime semantics does not depend on programs being well-typed according to the soft typing algorithm.

optional typing: systems for detecting possible violations of imposed language invariants algorithmically and without involving program evaluation. The runtime semantics does not depend on programs being well-typed according to the optional typing algorithm. (Come to think of it, I can't come up with a distinction between soft typing and optional typing off-hand.)

pluggable typing: systems of optional typing that admit any number of different static checking algorithms.

hybrid typing: systems for enforcing inherent language invariants algorithmically both via static reasoning and dynamic checks. This paradigm was originally introduced as a way to combine the static guarantees of statically typed languages with the more expressive annotation languages of dynamic contract systems.

gradual typing: systems for enforcing inherent language invariants algorithmically both via static reasoning and dynamic checks. This paradigm was originally introduced to facilitate program evolution by permitting incremental annotation of programs.

strong/weak typing: your guess is as good as mine.