Regular expressions in Scheme
In my last post, I mentioned generating the R5RS identifier list by scraping the HTML version of the R5RS standard. I decided to use Scheme for the job, and quickly learned that Chicken and SISC lack adequate regexp support (SISC has no support at all, apart from letting you interface with the underlying JVM). Eventually, I settled upon SCSH, as it has a powerful regexp API, as well as good shell integration.
The resulting SCSH script took forever to run (to be fair, I added code to separate procedure names from macro names, and didn’t bother optimizing beyond the naive O(n²) algorithm). I started to miss Chicken’s speed. The SCSH regexp API looked reasonably easy to port. I ended up writing both a Chicken and a SISC emulation layer (the latter based on java.util.regex). I am planning to add a pregexp backend as well, which would extend regexp support to any R5RS system.
Have a look at the scsh-regexp project for details, examples and news.
