CPSCM: interfacing Javascript and Scheme

Calling Javascript from Scheme just got easier in CPSCM:

(define v 10)
(define (f x) (+ x v))
(%cpscm:native "alert (" v ")")
;; Can pass Scheme variables and computations
(%cpscm:native "alert (" (f 5) ")")

All string arguments to %cpscm:native are copied verbatim to the output. Inner computations are still compiled as any normal Scheme code, and their results are passed to the native call via temporary variables. Each native call must correspond to a single JS statement.

The old method was to provide a CPS wrapper with the correct mangled name, e.g. to create a function callable as (fun 1 2) from Scheme:

var cpscmjsfun = cpscm__cpswrap (
  function fun (x, y) { return x + y; }

This still works, of course (as demonstrated in the DHTML bubble-sort example), but the new method adds convenience.

The main reason for native is the anticipated Emacs Lisp backend: users will surely want to call a myriad of elisp functions from Scheme, and writing a CPS stub for each of them would be impractical.

Note: the new code is in SVN, but the I haven't updated the online compiler demo webapp yet.

More on R6RS ratification

Given the preliminary results, it looks like R6RS will pass with a 66% margin (I predicted 70% a while ago). Strangely, official results haven't been announced yet, in violation of the announced schedule. More strangely, everyone seems to be quiet on this delay; I suspect that the nays are hoping for some last-minute deliverance handed down from the Steering Committee, but I'm surprised that the ayes aren't becoming impatient at this point.

The comments section provides a glimpse into the disastrous effects of a biased electoral process that requires justification from dissenters, but not from approvers. While most of the nays provide a detailed analysis of the draft (usually acknowledging its virtues where applicable), the "yes" camp, where it bothers at all to comment, seems to employ a pretty lax standard (including a few pearls that I won't quote in order not to offend the authors).

One comment that I found sad, yet funny states that "...Scheme needs a splash of 'worse is better' to move the language standard forward" — coming from a voter from New Jersey, no less.

On a personal note, I almost messed up my ballot. I sent my vote just before the deadline from my registered voter address, but I forgot to change the default email-address field of the ballot. My vote was rejected; I sent a corrected ballot, but only after the deadline, so it wouldn't count. I e-mailed Alan Bawden, who mentioned that "I did fix a lot of people's ballots in trivial ways... [including]... unbalanced parentheses, but I drew the line at actually altering people's claimed identities" (the email-address field apparently taking precedence over the originating address in the email envelope). Luckily, my vote was accepted after a short back-and-forth.

Update: the Steering Committee has ratified R5.97RS.


Emacs Lisp vs. Scheme: scoping and globals

I've been considering an elisp back-end for CPSCM (so that we can program Emacs in R5RS Scheme). I thought the lack of lexical scoping would prove a major stumbling block, but in the end it turns out that Elisp will be somewhat easier to support than Common Lisp. Here are the twists and turns (to evaluate Elisp code, go to the *scratch* buffer, paste the code and type C-x C-e):

  • Elisp has dynamic scope by default:
    (defun f () y)
    (let ((y 10)) (f))  ;; 10
    ;; lambda arguments are also dynamic
    (funcall (lambda (y) (f)) 11)  ;; 11
  • However, with (require 'cl) you get access to the (lexical-let ...) macro, which does exactly what the name says (there is also a lexical-let*)
  • Using lexical-let, one can easily define lexical-lambda — here's a simple version (optimized for minimal line lengths, not Lisp-ness)
    (defmacro lexical-lambda (args &rest body)
      (lexical-let* ((r '&rest) (g (lambda (x) (if (eq x r) x (gensym))))
                     (gvars (mapcar g args))
                     (bnd (mapcar* #'list args gvars)))
        `(lambda ,gvars
           (lexical-let ,(delete-if (lambda (b) (eq (car b) r)) bnd)

OK, so we've played catch up with Common Lisp and managed to work around dynamic scoping; here's the beautiful part:

  • Elisp has sane(r) globals (from a Schemer's POV, at least)

To those who haven't bashed their heads against this problem, Common Lisp's "normal" way of declaring globals (defvar / defparameter) makes variables "pervasively special" (i.e. dynamic) — meaning that

(defvar myvar 10)
(defun (f) myvar)
(defun g () (let ((myvar 1)) (f)))
(g)  ;; => 1, not 10

This is not such a problem for Lisp, but what with Scheme being a Lisp-1, translation of global functions is problematic:

(define (f x) x)  ;; Scheme
;; Lisp translation -- broken
(defvar f (lambda (x) x))

There's another standard-compliant way to simulate globals in Lisp (using symbol macros — search comp.lang.lisp for deflex); however this method requires you to define each global before referencing it, which would preclude mutually-recursive global functions:

(deflex f (lambda (x) (funcall g (- x 1))))  ;; broken: g undefined
(deflex g (lambda (x) (if (> x 0) (funcall f x) 0)))

There are other, more convoluted ways to implement "non-special" globals that have elicited endless (and inconclusive, as far as I could tell) threads on comp.lang.lisp, e.g. using (locally (declare (special myvar))). Finally, in many Lisp's one can simply use (setq myvar ...) and at most get a warning, but this is not standards-compliant.

As luck would have it, setq globals work in Elisp too, and the manual seems to indicate that this is intended semantics, not accident. So this will save me the pain of working around Common Lisp's "special" variable rules (I've never found a satisfactory solution), which is why I'm happy about Elisp.

Of course there are areas that need work, e.g. an easier "FFI" to access Elisp functions from Scheme (currently, one has to define a CPS-style wrapper in the back-end with the proper mangled name to make a function callable from CPSCM). But I find the prospect of programming Emacs in Scheme a pretty good motivation...

R6RS ratification guesstimate results

I have gone through the R6RS electorate and tried to guess each elector's vote based on their statement of interest. It's been a fun exercise. Most of the statements give clear hints (such as sentences like "R6RS will" + positive action, or Felix's all-caps screaming battle cry). Other statements offer partial clues (such as insisting on minimalism, or deploring the lack of a standard library). Yet others are completely opaque (some of them reading like CVs packed with the author's numerous accomplishments), in which case I ignored them.

OK, the results: R6RS will probably pass with a 70% approval rate.

I'd be interested in seeing any other predictions, especially if anybody has closely followed the r6rs-discuss list and has some time to burn. The official results are still more than a month off.

As most of you know, I strongly oppose R6RS. I will limit myself to asking those electors who complained about Scheme's lack of structs and hash tables to go read the SRFI list and pay attention to SRFI-9 and SRFI-69. If your Scheme does not support one of the widespread SRFIs, you should ask your friendly implementor to reconsider. Also, the fact that R6RS brings some good things does not mean we need to compromise on the bad things; it's always possible to restart the standardization effort and pick the gems out of the brown stuff.


BurryFS: a file system written in Scheme

I've released BurryFS, a file system based on Fuse and implemented in Chicken Scheme. BurryFS interacts with Fuse (the userspace filesystem API — merged into the Linux kernel since 2.6.14) to organize Digg content as a file system. Since the Fuse API relies on callbacks to deliver file system requests, and Scheme functions cannot serve as C callbacks, I have written a simple inversion-of-control layer that serializes Fuse requests over an internal socket and waits for replies from Scheme. At the other end, Scheme sits in an event loop, unpacking requests, reading information via the Digg API and sending replies. Since Chicken implements cooperative (lightweight) threads, complete with TCP support, BurryFS performance should be high even with multiple parallel requests.

For more information (and downloads), see the BurryFS homepage.

SWIG, Chicken and TinyCLOS

Note: this is a fairly technical post; if you have no interest in FFI's, you may still find the @ TinyCLOS macro useful.

When dealing with large C libraries, SWIG (the wrapper generator) can be a mixed blessing. On the one hand, it's a pleasure to work with wrapped C libraries from a dynamic language; on the other hand, generating the right wrappers can require significant time and effort, often with nothing to show for the plumbing work until the interface is complete.

In my case, the accessors and modifiers for C structures have been the most painful, initially at least. The library was full of complex, nested records of the following sort:

struct msg {
  int op;
  struct {
    char *name;
    union {
      int start;
      char *dst;
    } args;
  } req;

SWIG treats struct msg and its innards as separate objects; in Chicken, if you want to get to msg.args.start, you have to type a monstrosity like (msg-req-args-start-get (msg-req-args-get (msg-req-get msg))) (with bonus points for longer identifiers or deeper structures, of course).

The verbosity grows quadratically, and after a short while I started investigating the TinyCLOS mapping option. When invoked with the -proxy option, SWIG generates wrapper classes for C structures. This is enormously helpful: the previous incantation becomes (slot-ref (slot-ref (slot-ref msg 'req) 'args) 'start), which in real cases is a lot shorter due to, um, linear verbosity. To modify fields, you use slot-set!.

This was still too much typing, so I introduced the @ macro with which you can simply write (@ msg req args start), or (@ msg req name = "flush"):

 (define-syntax @
  (syntax-rules (=)
    ((_ o) o)
    ((_ o slot = v) (slot-set! o 'slot v))
    ((_ o slot . slots) (@ (slot-ref o 'slot) . slots))))

Finally, relief. In retrospect, I find it hard to believe that nobody solved this problem before; maybe there's some "standard" macro for this purpose, but I haven't found it.

This isn't the end of the saga, though. As soon as I moved back from experimenting to the actual library, I was hailed by a salvo of errors indicating that SWIG/TinyCLOS has probably never been used for large applications. Specifically, SWIG translates a composite structure name such as my_class into either <my-class> or <my_class> depending on the context. Presumably, SWIG/TinyCLOS was only tested for the traditional OOP toy examples (Shape, Pos etc.)

Fortunately this is easily fixed with perl -ne 'if (/<.*>/) { s/_/-/g; print } else { print }'. Older versions of SWIG also add an unnecessary (and harmful) (declare (uses tinyclos)) to the Scheme wrappers, but this is also easily excised.

The great news is that after all these machinations, as well as others not described here (involving callbacks and typemaps), SWIG/TinyCLOS seems to work without a hitch. I have had no problems using a large C library from a long-running Chicken program — writing the code was a lot of fun (compared to the SWIG saga), and, more importantly, there where no crashes. Has anybody else played with SWIG / Chicken / TinyCLOS?

Scheme object systems: POS

I'm no OOP fan (much less a fan of single-dispatch OOP), but sometimes I miss the implicit lexical scope that single-dispatch provides for methods. Take something as simple as

class Rect {
  int top, left, bottom, right;
  int area () const {
    return (top - bottom) * (right - left);

Most Scheme object systems (see for example the Chicken OOP section) turn the area() method body into something tedious along the lines of

(* (- (slot-ref self 'top) (slot-ref self 'bottom))
   (- (slot-ref self 'right) (slot-ref self 'left)))

or, with objects implemented as closures,

(* (- (self 'top) (self 'bottom) ...) ...)

Until recently, I thought that short of codewalker-based macros, nothing could restore the terseness of single-dispatch methods.

Well, I've discovered Blake McBride's POS (portable object system). With POS, you can write

(define-class Rect (ivars top left bottom right)
    (set-top! (self val) (set! top val)) ...
    (get-area (self) (* (- top bottom) (- right left)))))

POS is a set of pure R5RS macros, and correctly interacts with other syntax-rules macros (e.g. macros can appear within method bodies). The trick is not only to represent the objects as closures, but to also expand method bodies inside the closure:

;; not the actual POS expansion -- just an illustration
(define (make-rect)
  (let ((top #f) (left #f) (bottom #f) (right #f))
    (define self
      (lambda (meth-name . args)
        (case meth-name
           (apply (lambda (self val) (set! top val))
                  (cons self args))) ...
           (apply (lambda (self)
                    (* (- top bottom) (- right left))
                  (cons self args)))))))

This way, methods can access instance variables as simple literals. Each object is a dispatch function that closes over those variables.

POS is very useful, and I plan to add default getters and setters, as well as a way to convert between the closure representation and a-lists. This should help with persistence, among other things.

POS has a couple of extra features (inheritance, access to the parent object, class methods) but really is a light-weight system. The major downside is that methods (and instance variables) can no longer be added dynamically, since it's impossible to inject code (or data) into a closure.

Update: see the comments for yet another way of simulating an implicit "this" argument.


Scsh-regexp and SISCweb URLs

A new version of scsh-regexp, the SCSH regular expression API port to Chicken and SISC, is available. All API functions (except regexp manipulation such as uncase) are implemented, and there are a few bugfixes (including one that prevented compilation under SISC); there is still no SRE support.

For the moment, you will have to download this release from Google Code; the Chicken egg should be updated to release 0.2 in a day or two.

If you'd like to see a port of scsh-regexp to your favorite Scheme, feel free to let me know (or contribute some code), as I am unlikely to make any further changes otherwise (save fixing potential bugs, should any surface). A Guile port would be the easiest.

And now, a fun exercise: scsh-regexp can handle pretty URLs in SISCweb applications. More specifically, we are going to sidestep the publish method by passing all requests to a single handler, which will discriminate between various URLs using the match-cond macro; here's what a CMS application's entry point might look like:

(publish "/*" url-dispatcher)
(define (url-dispatcher request)
  (define path (request/get-path-info))
  (define (m url)
    (regexp-search (posix-string->regexp (string-append "^" url "$") path)))
   ((m "/archives/([0-9]+)/([0-9]+)") (_ year month)
    (list-stories year month))
   ((m "/story/([^/]*)") (_ story-slug)
    (send-story story-slug))
   (else (send-404))))

match-cond is described in the SCSH manual, but briefly it goes through a list of clauses (just like cond), and when a match is found, it binds variables to the specified sub-matches (which correspond to paranthesized chunks in the regexp). Note that the URLs now contain regexp-style wildcards, instead of SISCweb's usual servlet-style wildcards.

This approach can be made cleaner with macros, or, as I will describe in a future post, by using an upcoming feature of SISCweb 0.5 — (publish "/some/url/.*" handler 'regexp)

Sharing the same SISC (in SISCweb)

Over the past few days, I've been evaluating a new host (I'm looking at moving from shared hosting to a VPS). We have several applications running on SISCweb (the web framework that marries J2EE and SISC), and this blog runs on Wordpress, so there was the usual fun of configuring Apache, PHP, Tomcat and mod_proxy / mod_jk. At the end of this, unsurprisingly, Tomcat used the largest single amount of memory, though not as much as some doomsayers predicted.

Seeing that memory is the main bottleneck, I set out to optimize the setup a bit. Previously, each web application ran its own private copy of SISC and SISCweb. The most obvious step is to share SISC between all web applications.

It turns out that SISC has the concept of running several Scheme "applications" in the same interpreter, with separate heaps and absolutely no interference. To do this, one must create multiple AppContext instances and juggle them when calling Scheme from Java. This is a rather neat feature and one that I haven't seen in other Schemes. It's even possible to launch a new "application" from within Scheme.

When I tried to share SISC between multiple SISCweb instances, I ran into problems. It was clear that applications were stepping on each others' AppContexts and overwriting global SISCweb book-keeping structures. Looking at the code, I found that SISCweb was not designed with the possibility of a shared SISC interpreter in mind and relied on an implicit default AppContext whenever calling Scheme from a servlet.

To make a long story short, I made a few fixes and this is no longer the case. I can now run several web applications (even for different virtual hosts) in the same SISC instance. The patches will be in the official SISCweb repository as soon as Alessandro reviews and publishes them.

Oh yeah, I almost forgot. Happy New Year!