May 2007

SWIG, Chicken and TinyCLOS

Note: this is a fairly technical post; if you have no interest in FFIs, you may still find the @ TinyCLOS macro useful.

When dealing with large C libraries, SWIG (the wrapper generator) can be a mixed blessing. On the one hand, it's a pleasure to work with wrapped C libraries from a dynamic language; on the other hand, generating the right wrappers can require significant time and effort, often with nothing to show for the plumbing work until the interface is complete.

In my case, the accessors and modifiers for C structures have been the most painful, initially at least. The library was full of complex, nested records of the following sort:

struct msg {
  int op;
  struct {
    char *name;
    union {
      int start;
      char *dst;
    } args;
  } req;
};

SWIG treats struct msg and its innards as separate objects; in Chicken, if you want to get to msg.req.args.start, you have to type a monstrosity like (msg-req-args-start-get (msg-req-args-get (msg-req-get msg))) (with bonus points for longer identifiers or deeper structures, of course).
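For comparison, here is a minimal C sketch of the same access on the C side, where the whole path is one chained member expression. The helper names msg_start and make_start_msg are hypothetical and exist only for illustration; they are not part of the library or of SWIG's output.

```c
/* The nested record from the post. */
struct msg {
  int op;
  struct {
    char *name;
    union {
      int start;
      char *dst;
    } args;
  } req;
};

/* Hypothetical helper: reads the innermost field in one chained expression. */
int msg_start(const struct msg *m) {
  return m->req.args.start;
}

/* Hypothetical helper: builds a message whose request carries a start value. */
struct msg make_start_msg(int op, char *name, int start) {
  struct msg m;
  m.op = op;
  m.req.name = name;
  m.req.args.start = start; /* only the 'start' member of the union is set */
  return m;
}
```

Each level that C reaches with a single dot becomes a separate getter call in the generated Scheme wrappers, which is where the nesting in the expression above comes from.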

The verbosity grows quadratically, and after a short while I started investigating the TinyCLOS mapping option. When invoked with the -proxy option, SWIG generates wrapper classes for C structures. This is enormously helpful: the previous incantation becomes (slot-ref (slot-ref (slot-ref msg 'req) 'args) 'start), which in real cases is a lot shorter due to, um, linear verbosity. To modify fields, you use slot-set!.

This was still too much typing, so I introduced the @ macro with which you can simply write (@ msg req args start), or (@ msg req name = "flush"):

(define-syntax @
  (syntax-rules (=)
    ((_ o) o)
    ((_ o slot = v) (slot-set! o 'slot v))
    ((_ o slot . slots) (@ (slot-ref o 'slot) . slots))))

Finally, relief. In retrospect, I find it hard to believe that nobody solved this problem before; maybe there's some "standard" macro for this purpose, but I haven't found it.

This isn't the end of the saga, though. As soon as I moved back from experimenting to the actual library, I was greeted by a salvo of errors indicating that SWIG/TinyCLOS has probably never been used for large applications. Specifically, SWIG translates a composite structure name such as my_class into either <my-class> or <my_class> depending on the context. Presumably, SWIG/TinyCLOS was only tested on the traditional OOP toy examples (Shape, Pos, etc.).

Fortunately this is easily fixed with perl -ne 'if (/<.*>/) { s/_/-/g; print } else { print }'. Older versions of SWIG also add an unnecessary (and harmful) (declare (uses tinyclos)) to the Scheme wrappers, but this is also easily excised.

The great news is that after all these machinations, as well as others not described here (involving callbacks and typemaps), SWIG/TinyCLOS seems to work without a hitch. I have had no problems using a large C library from a long-running Chicken program: writing the code was a lot of fun (compared to the SWIG saga), and, more importantly, there were no crashes. Has anybody else played with SWIG / Chicken / TinyCLOS?

Why the C preprocessor is a good thing

Yesterday, Christian Kienle argued that the C preprocessor is a bad thing. When a language lacks closures and garbage collection and forces static typing without type inference on its users, you would think that a moderately powerful feature like preprocessor macros would get some respect, at least in these times of programming-language renaissance when there are so many good alternatives.

First of all, I believe that Paul Graham's advice holds true in any language: macros should only be used when nothing else will do. But when that happens, avoiding macros leads to contorted or verbose solutions.

Let's look at Christian's arguments:

Debugging preprocessor macros is hard

It's true that most debuggers can't map compiled code back to the original macros. However, most debugging is (or should be) done outside debuggers, and debugging would be hard without the preprocessor:

  • the preprocessor provides the __FILE__ and __LINE__ macros. Yes, they could be predefined identifiers, just like C99's __func__, but that's actually a less flexible solution: since C concatenates adjacent string literals, you can write "error in " __FILE__, but you can't do that with __func__
  • assert can only be written as a macro, since it needs to stringify the condition being tested
  • without macros, you'd be forced to invoke logging primitives like in Java:
    if (logger.isDebugEnabled()) {
      logger.debug(expensive_function ());
    }
  • using macros and RAII in C++, you can write a tracing system
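The first three points can be sketched in a few lines of C. This is only an illustration under stated assumptions: HERE, STR, CHECK, LOG_DEBUG and debug_enabled are hypothetical names invented for this example, and expensive_function simply stands in for the costly call from the Java snippet.

```c
#include <stdio.h>

/* Two-step stringification so that __LINE__ is expanded before # is applied. */
#define STR2(x) #x
#define STR(x) STR2(x)

/* Adjacent string literals are concatenated at compile time,
   so HERE becomes a single literal such as "file.c:12". */
#define HERE __FILE__ ":" STR(__LINE__)

/* Hypothetical assert-like macro: #cond turns the tested expression into text.
   It reports a failure and evaluates to 0 instead of aborting. */
#define CHECK(cond) \
  ((cond) ? 1 : (fprintf(stderr, "%s: check failed: %s\n", HERE, #cond), 0))

/* Hypothetical logging switch. */
int debug_enabled = 0;

/* The argument is evaluated only when logging is enabled,
   so the caller needs no explicit guard around the call. */
#define LOG_DEBUG(expr) \
  do { if (debug_enabled) fprintf(stderr, "debug: %d\n", (expr)); } while (0)

/* Stand-in for a costly computation; counts how often it really runs. */
int expensive_calls = 0;
int expensive_function(void) {
  expensive_calls++;
  return 42;
}
```

With debug_enabled left at 0, LOG_DEBUG(expensive_function()) never calls the function, which is the guard the Java version has to spell out by hand at every call site.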

Preprocessor macros are not type-safe

True, and it's the closest thing that C/C++ have to type inference. Christian doesn't actually show how this supposed type-unsafety can bite you, but instead points to the next reason and suggests that you use templates in C++ (or NSNumbers in Objective-C). I don't know about Objective-C, but

template <class T> T min (T x, T y)
{ return x < y ? x : y; }

looks pretty verbose to me.
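For contrast, a minimal C sketch of the macro version: one textual definition that works for whatever operand types the call site supplies. The wrapper functions min_of_ints and min_of_doubles are hypothetical and only there to show two instantiations.

```c
/* One definition; the operand types come from the call site. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical wrappers showing the same macro used at two different types. */
int min_of_ints(int x, int y) { return MIN(x, y); }
double min_of_doubles(double x, double y) { return MIN(x, y); }
```

Mixed operands such as MIN(1, 2.5) follow C's usual arithmetic conversions, so the result has type double. The lack of type checking is what makes the macro this short.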

Preprocessor macros often lead to side effects

What this means is that macro arguments can appear multiple times in the macro-expansion:

#define MAX( a, b ) ((a) > (b) ? (a) : (b))

If one of the arguments is an expression with side effects (such as x++ or a function call that modifies some state), then we have a bug. This is true, but

  • programming with side effects is not a good practice. Even if you don't have the luxury of programming in a functional language, you should still strive to minimize reliance on side effects
  • macros are usually given capitalized names, like MAX, just so they scream at you when you are about to type MAX (x++, f (y))
  • if one of the arguments is a function f(), but f has no side effects, the compiler may be able to optimize away redundant multiple invocations
  • you get what you pay for — this is not Lisp, after all.
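To make the double evaluation concrete, here is a small C sketch comparing a conventional MAX macro with an ordinary function. The names max_int, next_value, calls, max_macro_postinc and max_func_postinc are hypothetical, chosen for this illustration only.

```c
/* Conventional macro: each argument may appear twice in the expansion. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Function version: each argument is evaluated exactly once. */
int max_int(int a, int b) { return a > b ? a : b; }

/* Hypothetical function with a side effect: counts its own invocations. */
int calls = 0;
int next_value(void) { return ++calls; }

/* With the macro, (*i)++ runs once in the comparison and once more
   in whichever branch selects it. */
int max_macro_postinc(int *i, int j) { return MAX((*i)++, j); }

/* With the function, (*i)++ runs once, before the call. */
int max_func_postinc(int *i, int j) { return max_int((*i)++, j); }
```

Starting from i = 5 and j = 3, the macro version returns 6 and leaves i at 7, while the function version returns 5 and leaves i at 6. The macro result is still well defined, because the conditional operator has a sequence point after its first operand.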

Of the three arguments against macros, only the last one is actually a serious objection; and just because the C preprocessor is too weak doesn't mean you shouldn't use it when necessary.

Finally, for fun, I'd like to point you to some macro magic:

If you have other cool examples, feel free to add them in the comments.
