Omnigia

February 8, 2007

VMWare: when two OSs access the same partition

Filed under: linux, vmware — Dan Muresan @ 7:42 am

Probably the most convenient way to run Windows under Linux is to start with a dual-boot setup, then create (in Linux) a VMWare Server virtual machine based on the physical Windows partition. This ensures that you don’t have to re-install Windows and your favorite applications.

But with great convenience comes great danger. When you power on the virtual machine, it will boot into GRUB (or LILO) which will ask which OS you want to run. No problem you’ll say, select Windows, it’s just a small inconvenience. Until the day your fingers err. Or, if GRUB has a timeout, the day you run to get a cup of water and come back to witness Linux booting. That means that the virtual machine and the host OS are now accessing the same partitions simultaneously.

The various VMWare tutorials strongly caution you to avoid this situation, which will likely result in data loss. But maybe you are wondering just how bad things can get (at least I always have). Well, about a month ago, facing a complete Linux re-install, I found the perfect opportunity to experiment. I had two Linux partitions (a JFS root and an EXT3 volume). So I powered up the virtual machine into Linux, and let it run its course, after which I rebooted.

The results? Surprisingly, the root JFS partition came out of fsck unscathed. That’s right, there were no errors, and nothing in /lost+found. The EXT3 partition, by contrast, was destroyed beyond repair (it started with a bad superblock, and went downhill from there as I tried to recover). Unfazed, I decided to try again (after reformatting my EXT3 partition). The same thing happened. I have no idea why, and I wouldn’t necessarily conclude that JFS is safer, but if you ever have the chance (or misfortune) to experiment, let me know how it goes…

And now, on to something more useful: how do you prevent such disasters? The answer is to force the VMWare partition to boot from a virtual floppy disk that makes the correct OS choice automatically (it could be GRUB with a single-item boot menu, or an NTLDR-based solution). Scott Bronson’s VMWare tutorial shows how to do this. Unfortunately, his method is rather inconvenient, requiring several reboots. So what follows is a simpler solution that replaces steps 3-10 from his Set up the Boot Disk section:

dd if=/dev/zero of=bootdisk.img bs=1k count=512
mke2fs -F bootdisk.img
mount -oloop bootdisk.img /mnt
mkdir -p /mnt/boot/grub
cp /boot/grub/stage[12] /mnt/boot/grub/

cat >/mnt/boot/grub/grub.conf <<EOF
timeout=3
title=Windows
root            (hd0,0)
chainloader     +1
makeactive
EOF

umount /mnt

grub --device-map=/dev/null <<EOF
device (fd0) bootdisk.img
root (fd0)
setup (fd0)
quit
EOF
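
Before attaching the image to the virtual machine, it may be worth confirming that the setup step really wrote a boot sector. What follows is a minimal sketch, not part of Scott’s tutorial: it relies on the fact that a PC boot sector ends with the bytes 55 aa at offsets 510 and 511, and the helper name has_boot_signature is made up for illustration.

```shell
# Hypothetical helper: succeeds if the first 512-byte sector of the image
# ends with the 55 aa boot signature.
has_boot_signature() {
    # Read the two bytes at offsets 510-511 and print them as plain hex.
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

For example, has_boot_signature bootdisk.img && echo "boot sector present" should print the message once the grub commands above have run. A passing check only shows that a boot sector exists, not that the menu inside the image is correct.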

The rest of Scott’s tutorial still applies — in particular, setting up different hardware profiles is important. How important? I’ll let you know next time I’m stuck with a complete Windows reinstall…

January 30, 2007

Compressed filesystem using SquashFS and AutoFS

Filed under: linux — Dan Muresan @ 9:42 am

When installing a modern Linux distribution on older computers, one problem you may face is the lack of disk space. I ran into this last week, while helping a friend install Ubuntu on an antique laptop with a 2 GB hard drive. The obvious starting point is a minimalist installation: Ubuntu Alternate CD (my choice), Arch Linux, or a few others. The good news is that your system doesn’t have to stay minimalistic if you know how to tailor the distribution.

One way to save space is to use data compression. It’s possible to keep parts of the filesystem compressed on disk and have Linux decompress them on the fly when they’re needed. This idea is as old as Stacker / DoubleSpace, but for Linux we need to do more work, as there’s no stable read-write compressed filesystem as of this writing (though you may want to watch Johan Parent’s compFUSEd as it matures).

First, install the tools: squashfs, a compressed file system that yields better performance than the traditional cramfs, and autofs, to mount and unmount compressed directories automatically. Next, if you’ve never used a compressed filesystem, it helps to play with squashfs a bit:

# log in as root or type "sudo bash"
mksquashfs /tmp dummy.squashfs
mount -o loop dummy.squashfs /mnt
ls /mnt           # should be identical to /tmp
touch /mnt/x  # won't work, squashfs is read-only

This example creates a squash file system (in the file dummy.squashfs) that mirrors the contents of /tmp and mounts it (using loop, since it’s an ordinary file and not a block device) on /mnt. As the last command demonstrates, you can’t write in a squashfs, so you’ll want to compress directories that are normally not modified (so /tmp would actually be a bad choice, and so would be any user home directory, /var etc.)
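
To get a feel for which directories are worth compressing, it helps to see where the disk space actually goes. The helper below is only a sketch (the name largest_dirs is hypothetical, and it assumes GNU find and du): it lists the biggest immediate subdirectories of a given directory, with sizes in kilobytes.

```shell
# Hypothetical helper: list the N largest immediate subdirectories of DIR
# (default N = 10), sizes in kilobytes, biggest first.
largest_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -exec du -sk {} + 2>/dev/null |
        sort -rn | head -n "${2:-10}"
}
```

Running largest_dirs /usr/lib 5 as root shows five candidates; whether a given directory is really read-only in day-to-day use is still a judgment call.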

Now, to work: let’s set autofs up (this only needs to be done once):

cd /etc
echo '/var/autofs/squash /etc/auto.z --timeout=300' >>auto.master
echo '* -fstype=squashfs,loop :/opt/squashfs/&.squashfs' >>auto.z
/etc/init.d/autofs restart

The first line tells autofs to read /etc/auto.z (and to unmount auto-mounted directories after they have been unused for 300 seconds); the second one says that whenever someone accesses /var/autofs/squash/DIR (where DIR is an arbitrary name), autofs should try to mount /opt/squashfs/DIR.squashfs automatically.
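
The wildcard entry is easy to get wrong by a file extension, so here is a small sketch that mimics the substitution autofs performs: the key replaces the & in the map entry. The helper name squash_image_for and the SQUASH_DIR variable are assumptions for illustration; the default path matches the one used in auto.z.

```shell
# Hypothetical helper mirroring the auto.z wildcard entry: for a key DIR,
# "&" becomes DIR, so autofs looks for DIR.squashfs in the image directory.
squash_image_for() {
    image="${SQUASH_DIR:-/opt/squashfs}/$1.squashfs"
    printf '%s\n' "$image"
    [ -f "$image" ]     # non-zero exit status if the image is missing
}
```

If squash_image_for some-name fails, accessing /var/autofs/squash/some-name will most likely fail as well, since there is no image to mount.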

Next, set your sights on a large, read-only directory — say /usr/lib/mozilla-thunderbird. Here’s the plan:

  1. convert relative symlinks to absolute ones: for f in `find /usr/lib/mozilla-thunderbird -type l`; do t=`readlink -f "$f"`; rm "$f"; ln -s "$t" "$f"; done
  2. create a compressed filesystem (the .squashfs extension must match the auto.z entry): mksquashfs /usr/lib/mozilla-thunderbird /opt/squashfs/mozilla-thunderbird-lib.squashfs
  3. remove the original directory: rm -rf /usr/lib/mozilla-thunderbird
  4. replace the directory with a symbolic link: ln -s /var/autofs/squash/mozilla-thunderbird-lib /usr/lib/mozilla-thunderbird

You may wonder why the first step is necessary. The answer is that /usr/lib/mozilla-thunderbird contains some relative links (things like ../share/icons) that would break when the directory is relocated to /var/autofs/squash. So we use find to locate symlinks, readlink to read their target, and then rewrite these links.
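
The one-liner in step 1 rewrites every symlink and trips over file names containing spaces. A slightly more careful variant might look like the sketch below; the function name absolutize_symlinks is made up, and it assumes GNU readlink (for -f) and file names without embedded newlines.

```shell
# Hypothetical helper: rewrite relative symlinks under DIR as absolute ones,
# leaving links that are already absolute untouched.
absolutize_symlinks() {
    find "$1" -type l | while IFS= read -r link; do
        case $(readlink "$link") in
            /*) continue ;;                       # already absolute
        esac
        target=$(readlink -f "$link") || continue # skip unresolvable links
        rm "$link"
        ln -s "$target" "$link"
    done
}
```

With this in place, step 1 becomes absolutize_symlinks /usr/lib/mozilla-thunderbird. Note that readlink -f resolves the whole chain, so a link to another link ends up pointing at the final target.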

That’s it. Whenever you access the compressed directory, it will be automounted:

ls /usr/lib/mozilla-thunderbird
mount

This method does have one disadvantage: if you ever upgrade Thunderbird, dpkg will follow the compressed directory symlink and try to write inside it (which will fail). You should remove the /usr/lib/mozilla-thunderbird symlink prior to an upgrade (and, presumably, re-compress once the upgrade completes).

January 14, 2007

Running a home IMAP server on Ubuntu

Filed under: imap, linux — Dan Muresan @ 2:56 pm

I tend to work from several locations, and I like having access to my mail folders from everywhere. Some of my mail comes from POP accounts, but I also have access to an IMAP server. For a while, I used to store “active” conversations on the IMAP server, and periodically archive them in the mbox folders at home to stay within my allotted mail quota. This meant that if I wanted to look at an older message, I basically had to ssh to my home box and dig it out.

I finally got tired of moving e-mails back and forth and decided to set up an IMAP server on my home box. Which server to use? I’ve been using mbox folders for a long time and wasn’t about to convert to Maildir, which ruled out Courier. After a bit of searching, I found UW IMAP. I’ve always been a fan of pine, so I tend to trust mail software from the University of Washington.

An apt-get install uw-imapd later, I was faced with a server that installs no configuration files in /etc and no service script in /etc/init.d. The latter puzzle was easy to solve: UW-IMAP expects to be called from inetd or xinetd and conveniently appends to /etc/inetd.conf, so apt-get install inetd is enough to enable this service. I fired up Thunderbird, and defined a new IMAP account; Thunderbird was able to connect, but authentication failed.

I next turned to the IMAP FAQ, and found out that UW-IMAP prides itself on needing no configuration; meaning that, if you need to change things like the user’s mail directory (which annoyingly defaults to the user’s home and exports the entire directory tree), you have to recompile the package.

I also learned that, in fact, there is one configuration file: /etc/cram-md5.pwd, to be filled with usernames and passwords (one pair per row, tab-separated). Since I did not install the IMAP SSL package, cram-md5 is the only way to retain some security; otherwise passwords are sent in the clear over the network. Thunderbird has an option to force CRAM MD5 authentication in the account settings dialog.
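
Since a stray space instead of a tab is an easy mistake to make in an editor, one option is to generate the entries with printf. This is just a sketch; the helper name cram_md5_entry and the sample credentials are hypothetical.

```shell
# Hypothetical helper: print one cram-md5.pwd entry (user, TAB, password).
cram_md5_entry() {
    printf '%s\t%s\n' "$1" "$2"
}
```

A possible use is cram_md5_entry joe 'secret' | sudo tee -a /etc/cram-md5.pwd >/dev/null. Because the file holds plain-text passwords, it seems prudent to make it readable by root only (for instance with chmod 600).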

I was finally able to connect to the IMAP server, but now the folder list included every file in my home directory, since, as I mentioned, UW-IMAP’s idea of the mail store is the user’s entire home. To solve this, you can use “mail” as an IMAP server path (Server settings > Advanced in Thunderbird).

But some programs don’t have this option or ignore the setting (Opera’s M2 seems to do that). Another solution involves a hack; create a dummy user (say joe-mail if your login is joe) with the same UID and GID as the real user, and with a home directory in the desired location:

id joe  # your login name; shows the uid and gid reused below
# -o permits a second account with an already-used uid
sudo useradd -o -u "$(id -u joe)" -g "$(id -g joe)" -d /home/joe/mail joe-mail

After updating /etc/cram-md5.pwd and Thunderbird, I was finally able to read mail from my home IMAP server. UW-IMAP has some counter-intuitive defaults, but, once you know what to do, is quite easy to tweak.

Update: apparently, dovecot-imapd (recommended by two readers) and cyrus-imapd might be easier to set up, but I’ve already spent enough time tending to UW-IMAP, so I’m not touching it. Feel free to comment if you have used other servers.

January 6, 2007

Sharing the same SISC (in SISCweb)

Filed under: scheme, siscweb — Dan Muresan @ 5:40 am

Over the past few days, I’ve been evaluating a new host (I’m looking at moving from shared hosting to a VPS). We have several applications running on SISCweb (the web framework that marries J2EE and SISC), and this blog runs on WordPress, so there was the usual fun of configuring Apache, PHP, Tomcat and mod_proxy / mod_jk. At the end of this, unsurprisingly, Tomcat used the largest single amount of memory, though not as much as some doomsayers predicted.

Seeing that memory is the main bottleneck, I set out to optimize the setup a bit. Previously, each web application ran its own private copy of SISC and SISCweb. The most obvious step is to share SISC between all web applications.

It turns out that SISC has the concept of running several Scheme “applications” in the same interpreter, with separate heaps and absolutely no interference. To do this, one must create multiple AppContext instances and juggle them when calling Scheme from Java. This is a rather neat feature and one that I haven’t seen in other Schemes. It’s even possible to launch a new “application” from within Scheme.

When I tried to share SISC between multiple SISCweb instances, I ran into problems. It was clear that applications were stepping on each other’s AppContexts and overwriting global SISCweb book-keeping structures. Looking at the code, I found that SISCweb was not designed with the possibility of a shared SISC interpreter in mind and relied on an implicit default AppContext whenever calling Scheme from a servlet.

To make a long story short, I made a few fixes and this is no longer the case. I can now run several web applications (even for different virtual hosts) in the same SISC instance. The patches will be in the official SISCweb repository as soon as Alessandro reviews and publishes them.

Oh yeah, I almost forgot. Happy New Year!

December 23, 2006

CPSCM passes R5RS pitfalls

Filed under: scheme, cpscm — Dan Muresan @ 2:53 am

After several fixes and tweaks, the CPSCM Javascript backend passes the R5RS Pitfalls test with a full score. The Lisp backend almost does, but since I decided to stick with Lisp’s convention of representing both false booleans and empty lists with NIL, it fails cases 5.1 - 5.3.

Additionally, I have tested the bubble sort example and the JS backend in general under IE 6, Firefox and Opera (the browsers I can access easily). I haven’t encountered any issues. If you are using IE 7 or Safari and run into problems, please report. The online compiler now lets you execute compiled code directly in your browser, so it’s much easier to test than before.

On the negative side, some of you have noticed that the CPS code is more bloated than before. This is because I had to back out η-reduction for some cases, in the interest of correctness: (lambda (x) (f x)) does not reduce to f automatically if f is mutable (and, of course, Scheme is not Erlang). The analysis module will catch up eventually…

December 14, 2006

CPSCM translates Scheme to Javascript

Filed under: scheme, cpscm — Dan Muresan @ 7:21 am

I have added a Javascript backend for CPSCM. The compiled code runs either inside a browser or in Rhino. The Bubble sort example demonstrates compiled Scheme code running “in a web page” and interfacing with native Javascript code (the latter provides DHTML functionality). You can call Scheme functions from Javascript and Javascript functions from Scheme with no restrictions (even continuations will work correctly). The explanations on the bubblesort page should get you started, in case you want to roll your own Scheme programs.

I have made updates and fixes to all backends. There are still missing pieces, but at least now they are summarized on the conformance page, so you know what to expect. Among the improvements, there is an error function which interacts correctly with dynamic-wind. I have borrowed the concept of failure continuations from SISC; you can access them via with-failure-continuation.

Finally, since Javascript console I/O is not standardized, I have implemented SRFI-6 output strings. By default, (display) and its family assume you are using Rhino and try to print to standard output. You can switch to a string using (current-output-port (open-output-string)), and at any point retrieve the accumulated output using (get-output-string (current-output-port)).

December 3, 2006

CPSCM improvements

Filed under: scheme, cpscm — Dan Muresan @ 10:44 am

Alessandro Colomba (of SISCweb fame) played with CPSCM and noted that he had trouble compiling the SRFI-1 reference implementation (you need a self-contained SRFI-1 to check). After investigating the problem, I found out that the culprit was the η-reduction code in simplify-sexp, which wasn’t designed carefully and exhibited exponential behavior on certain inputs (in practice, I’ve only seen that happen on CPS-ed code). After refactoring simplify-sexp, SRFI-1 compiles in just a few seconds.

The fixes are up in SVN (I’ve tagged the current version as rel-0.9.2). Other improvements:

  • Most of the code can be compiled under Chicken, with remarkable speed gains. Just type make in the scm directory. csi (which is still needed for the REPL) uses the compiled libraries automatically.
  • By popular demand I have added a file->lisp procedure for compiling a Scheme source file.
  • Programs are no longer wrapped in a giant letrec, but generate a sequence of top-level definitions and evaluation calls. This means you can compile libraries (such as SRFI-1 above) to separate files, and then load those compiled files independently in the back-end.

November 25, 2006

Announcing CPSCM, a new Scheme

Filed under: scheme, cpscm — Dan Muresan @ 7:25 am

I am releasing CPSCM, a new Scheme compiler based on classic CPS conversion and trampolines. It will eventually support multiple backends (Javascript and Java are in the works), but currently it supports Scheme to Common Lisp translation. You can see it work right from your browser on the online demo page (no large jobs, please), or you can download and run it by following the instructions on the CPSCM homepage.

Macro-expansion is delegated to Al Petrofsky’s alexpander, which means that CPSCM has full syntax-rules support from the start. I’ll probably add define-macro support at some point. I don’t feel up to integrating syntax-case, but if anyone wants to contribute, it would be greatly appreciated.

Other than this, CPSCM supports full continuations, including correct call/cc + dynamic-wind interaction, and SRFI-0. It still lacks eval, error protection in dynamic-wind, streams, load, and multiple-file source facilities. An interesting point is that as soon as CPSCM is able to compile itself, eval can be added in (though the environment functions other than interaction-environment will be problematic.)

As with scsh-regexp, I will use Google Code Project Hosting. Some people have questioned this choice (and Google Code has earned mixed reviews); compared to Sourceforge, Google Code has a big advantage: they don’t make you fill out multi-page forms (and wait for approval!) for the “privilege” of uploading an open-source project.

November 15, 2006

Regular expressions in Scheme

Filed under: scheme, scsh-regexp — Dan Muresan @ 5:16 am

In my last post, I mentioned generating the R5RS identifier list by scraping the HTML version of the R5RS standard. I decided to use Scheme for the job, and quickly learned that Chicken and SISC lack adequate regexp support (SISC has no support at all, apart from letting you interface with the underlying JVM). Eventually, I settled upon SCSH, as it has a powerful regexp API, as well as good shell integration.

The resulting SCSH script took forever to run (to be fair, I added code to separate procedure names from macro names, and didn’t bother optimizing beyond the naive O(n²) algorithm). I started to miss Chicken’s speed. The SCSH regexp API looked reasonably easy to port. I ended up writing both a Chicken and a SISC emulation layer (the latter based on java.util.regex). I am planning to add a pregexp backend as well, which would extend regexp support to any R5RS system.

Have a look at the scsh-regexp project for details, examples and news.

November 7, 2006

Scheme readline completion

Filed under: scheme — Dan Muresan @ 12:31 pm

When interacting with a REPL, readline history and tab completion support are major productivity boosters. This is true in general, but especially so given Scheme’s long names (e.g. call-with-input-file). Some Schemes have integrated readline support, but the one I use most, namely SISC, does not.

The next best thing is to use something like rlwrap (as SISC actually does). «rlwrap COMMAND» adds history support out of the box by intercepting COMMAND’s standard input and output. Furthermore, I’ve just learned that, when configured properly, rlwrap can also autocomplete a predefined set of identifiers (and optionally learn new identifiers from standard input/output). SISC does not enable completion by default, but we can easily fix it. The relevant command-line arguments are

-b DELIMITER-CHARS
A list of word-separating delimiters. Whitespace is included by default; for Scheme, use "\"()[]'`".
-H HISTORY-FILE
The history file.
-f IDENTIFIER-FILE
A file listing the identifiers to be completed by default, one per line (this option can occur multiple times). I wrote a script to parse the R5RS index HTML page; I’m putting the output online, so you can simply download the resulting R5RS identifier list. It makes sense to also add a -f HISTORY-FILE argument.

These are the basics; the man page documents a few more interesting options (in particular, see -r). Using rlwrap, you can enable readline history and completion for any interaction-challenged Scheme system.
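
Putting the three options together, a wrapper might look like the sketch below. The function name rl_scheme and both default file locations are assumptions for illustration; point SCHEME_IDS at wherever you saved the identifier list.

```shell
# Hypothetical wrapper: run any Scheme REPL under rlwrap with history and
# identifier completion. Both file locations are assumed defaults.
rl_scheme() {
    history_file=${SCHEME_HISTORY:-$HOME/.scheme_history}
    ids_file=${SCHEME_IDS:-$HOME/.r5rs-identifiers}
    touch "$history_file"    # make sure the -f argument below can be read
    # -b: word-breaking characters; -H: history file; -f: completion lists
    rlwrap -b "\"()[]'\`" -H "$history_file" \
        -f "$ids_file" -f "$history_file" "$@"
}
```

Invoke it as rl_scheme followed by the command that starts your interpreter. Passing the history file as a second -f argument means that anything typed in earlier sessions is offered as a completion too.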
