Omnigia

January 30, 2007

Compressed filesystem using SquashFS and AutoFS

Filed under: linux — Dan Muresan @ 9:42 am

When installing a modern Linux distribution on older computers, one problem you may face is the lack of disk space. I ran into this last week, while helping a friend install Ubuntu on an antique laptop with a 2G hard drive. The obvious starting point is to begin with a minimalist installation — Ubuntu Alternate CD (my choice), Arch Linux, or a few others. The good news is that your system doesn’t have to stay minimalistic if you know how to tailor the distribution.

One way to save space is to use data compression. It’s possible to keep parts of the filesystem compressed on disk and have Linux decompress them on the fly when they’re needed. This idea is as old as Stacker / DoubleSpace, but for Linux we need to do more work, as there’s no stable read-write compressed filesystem as of this writing (though you may want to watch Johan Parent’s compFUSEd as it matures).

First, install the tools: squashfs, a compressed file system that yields better performance than the traditional cramfs, and autofs, to mount and unmount compressed directories automatically. Next, if you’ve never used a compressed filesystem, it helps to play with squashfs a bit:

# log in as root or type "sudo bash"
mksquashfs /tmp dummy.squashfs
mount -o loop dummy.squashfs /mnt
ls /mnt           # should be identical to /tmp
touch /mnt/x  # won't work, squashfs is read-only

This example creates a squash file system (in the file dummy.squashfs) that mirrors the contents of /tmp and mounts it (using loop, since it’s an ordinary file and not a block device) on /mnt. As the last command demonstrates, you can’t write to a squashfs, so you’ll want to compress directories that are not normally modified. That makes /tmp a bad choice in practice, along with user home directories, /var, and the like.
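One way to spot candidates is to rank subdirectories by size. The following is a minimal sketch, assuming a du and sort that accept the usual -k and -n options; largest_dirs is a made-up helper name, not part of squashfs-tools, and whether a directory is really read-only is still a judgment call.

```shell
# largest_dirs DIR [COUNT]
# Hypothetical helper: list the COUNT largest subdirectories of DIR,
# biggest first, as "size-in-KB <TAB> path" lines.
largest_dirs() {
    dir=$1
    count=${2:-10}
    # The trailing slash in the glob restricts matches to directories.
    du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, largest_dirs /usr/lib 5 prints the five biggest directories under /usr/lib.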

Now, to work. Let’s set up autofs (this only needs to be done once):

cd /etc
echo '/var/autofs/squash /etc/auto.z --timeout=300' >>auto.master
echo '* -fstype=squashfs,loop :/opt/squashfs/&.squashfs' >>auto.z
/etc/init.d/autofs restart

The first line tells autofs to read /etc/auto.z (and to unmount auto-mounted directories once they have been unused for 300 seconds); the second one says that whenever someone accesses /var/autofs/squash/DIR (where DIR is an arbitrary name), autofs should try to mount /opt/squashfs/DIR.squashfs automatically.
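Because the echo commands append unconditionally, running the setup a second time would leave duplicate entries in both files. A small guard avoids that; this is only a sketch, and append_once is a hypothetical helper name rather than anything autofs provides.

```shell
# append_once FILE LINE
# Hypothetical helper: append LINE to FILE unless an identical
# line is already present (-x whole line, -F literal match).
append_once() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Intended use, as root, with the same entries as in the setup above:
# append_once /etc/auto.master '/var/autofs/squash /etc/auto.z --timeout=300'
# append_once /etc/auto.z '* -fstype=squashfs,loop :/opt/squashfs/&.squashfs'
```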

Next, set your sights on a large, read-only directory — say /usr/lib/mozilla-thunderbird. Here’s the plan:

  1. convert relative symlinks to absolute ones: for f in `find /usr/lib/mozilla-thunderbird -type l`; do t=`readlink -f "$f"`; rm "$f"; ln -s "$t" "$f"; done
  2. create a compressed filesystem (the .squashfs suffix must match the auto.z entry): mkdir -p /opt/squashfs; mksquashfs /usr/lib/mozilla-thunderbird /opt/squashfs/mozilla-thunderbird-lib.squashfs
  3. remove the original directory: rm -rf /usr/lib/mozilla-thunderbird
  4. replace the directory with a symbolic link: ln -s /var/autofs/squash/mozilla-thunderbird-lib /usr/lib/mozilla-thunderbird

You may wonder why the first step is necessary. The answer is that /usr/lib/mozilla-thunderbird contains some relative links (things like ../share/icons) that would break when the directory is relocated to /var/autofs/squash. So we use find to locate symlinks, readlink to read their target, and then rewrite these links.
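The one-line loop in step 1 splits on whitespace, so it would trip over file names containing spaces, and it also rewrites links that are already absolute. A more defensive variant might look like the sketch below. It assumes a readlink that supports -f (as the original command does); absolutize_links is a hypothetical name.

```shell
# absolutize_links DIR
# Hypothetical helper: rewrite every relative symlink under DIR so that
# it points at the absolute, fully resolved path of its target.
absolutize_links() {
    find "$1" -type l -exec sh -c '
        for link do
            # Leave links that are already absolute untouched.
            case $(readlink "$link") in
                /*) continue ;;
            esac
            # Skip links whose target cannot be resolved.
            target=$(readlink -f "$link") || continue
            rm "$link" && ln -s "$target" "$link"
        done
    ' sh {} +
}
```

Usage would be absolutize_links /usr/lib/mozilla-thunderbird, run as root before creating the image.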

That’s it. Whenever you access the compressed directory, it will be automounted:

ls /usr/lib/mozilla-thunderbird
mount

This method does have one disadvantage: if you ever upgrade thunderbird, dpkg will follow the compressed directory symlink and try to write inside it (which will fail). You should remove the /usr/lib/mozilla-thunderbird symlink prior to an upgrade (and, presumably, re-compress once the upgrade completes).
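The upgrade dance could be sketched roughly as follows. This is an untested outline, not a recipe from the setup above: it assumes the image name matches the auto.z entry (ending in .squashfs), that copying out of the automounted directory is an acceptable way to get a writable tree back, and that the directory is not in use while it is swapped. By default the script only prints the commands; run, prepare_upgrade and recompress are hypothetical names.

```shell
#!/bin/sh
# Sketch only: prints the commands unless run as root with DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
NAME=mozilla-thunderbird-lib
LINK=/usr/lib/mozilla-thunderbird
IMAGE=/opt/squashfs/$NAME.squashfs
MOUNTED=/var/autofs/squash/$NAME

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_upgrade() {
    run rm "$LINK"                  # removes the symlink, not the image
    run cp -a "$MOUNTED" "$LINK"    # writable copy; the access automounts the image
}

recompress() {
    # Build into a new file: mksquashfs appends to an existing image by default.
    run mksquashfs "$LINK" "$IMAGE.new"
    run umount "$MOUNTED" || true   # fails harmlessly if autofs already unmounted it
    run mv "$IMAGE.new" "$IMAGE"
    run rm -rf "$LINK"
    run ln -s "$MOUNTED" "$LINK"
}

prepare_upgrade
# ... upgrade the package here with your usual tool ...
recompress
```

If the upgrade installs new relative symlinks, they would need converting again before the recompress step.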

January 14, 2007

Running a home IMAP server on Ubuntu

Filed under: imap, linux — Dan Muresan @ 2:56 pm

I tend to work from several locations, and I like having access to my mail folders from everywhere. Some of my mail comes from POP accounts, but I also have access to an IMAP server. For a while, I used to store “active” conversations on the IMAP server, and periodically archive them in the mbox folders at home to stay within my allotted mail quota. This meant that if I wanted to look at an older message, I basically had to ssh to my home box and dig it out.

I finally got tired of moving e-mails back and forth and decided to set up an IMAP server on my home box. Which server to use? I’ve been using mbox folders for a long time and wasn’t about to convert to Maildir, which ruled out Courier. After a bit of searching, I found UW IMAP. I’ve always been a fan of pine, so I tend to trust mail software from the University of Washington.

An apt-get install uw-imapd later, I was faced with a server that installs no configuration files in /etc and no service script in /etc/init.d. The latter puzzle was easy to solve: UW-IMAP expects to be called from inetd or xinetd and conveniently appends to /etc/inetd.conf, so apt-get install inetd is enough to enable this service. I fired up Thunderbird, and defined a new IMAP account; Thunderbird was able to connect, but authentication failed.

I next turned to the IMAP FAQ, and found out that UW-IMAP prides itself on needing no configuration, meaning that if you need to change things like the user’s mail directory (which annoyingly defaults to the user’s home and exports the entire directory tree), you have to recompile the package.

I also learned that, in fact, there is one configuration file: /etc/cram-md5.pwd, to be filled with usernames and passwords (one pair per line, tab-separated). Since I did not install the IMAP SSL package, CRAM-MD5 is the only way to retain some security; otherwise passwords are sent in the clear over the network. Thunderbird has an option to force CRAM-MD5 authentication in the account settings dialog.
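Adding an entry to that file might look like the sketch below. The tab-separated layout comes from the description above; restricting the file to its owner is an extra assumption on the grounds that it holds plaintext passwords, and add_cram_user is a hypothetical helper name.

```shell
# add_cram_user FILE USER PASSWORD
# Hypothetical helper: append one "user<TAB>password" line to FILE and
# make sure only the owner can read or write it.
add_cram_user() {
    file=$1
    user=$2
    pass=$3
    # The subshell keeps the umask change local to this one append.
    ( umask 077; printf '%s\t%s\n' "$user" "$pass" >> "$file" )
    chmod 600 "$file"
}

# Intended use, as root (user name and password are placeholders):
# add_cram_user /etc/cram-md5.pwd joe 'some-password'
```

Note that a password typed on the command line may end up in the shell history.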

I was finally able to connect to the IMAP server, but now the folder list included every file in my home directory, since, as I mentioned, UW-IMAP’s idea of the mail store is the user’s entire home. To solve this, you can use “mail” as an IMAP server path (Server settings > Advanced in Thunderbird).

But some programs don’t have this option or ignore the setting (Opera’s M2 seems to do that). Another solution involves a hack: create a dummy user (say joe-mail if your login is joe) with the same UID and GID as the real user, and with a home directory in the desired location:

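id joe  # your login name; shows the uid and gid reused below
# -o allows the duplicate (non-unique) UID, which useradd otherwise rejects
sudo useradd -o -u "$(id -u joe)" -g "$(id -g joe)" -d /home/joe/mail joe-mail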

After updating /etc/cram-md5.pwd and Thunderbird, I was finally able to read mail from my home IMAP server. UW-IMAP has some counter-intuitive defaults, but, once you know what to do, it is quite easy to tweak.

Update: apparently, dovecot-imapd (recommended by two readers) and cyrus-imapd might be easier to set up, but I’ve already spent enough time tending to UW-IMAP, so I’m not touching it. Feel free to comment if you have used other servers.

January 6, 2007

Sharing the same SISC (in SISCweb)

Filed under: scheme, siscweb — Dan Muresan @ 5:40 am

Over the past few days, I’ve been evaluating a new host (I’m looking at moving from shared hosting to a VPS). We have several applications running on SISCweb (the web framework that marries J2EE and SISC), and this blog runs on Wordpress, so there was the usual fun of configuring Apache, PHP, Tomcat and mod_proxy / mod_jk. At the end of this, unsurprisingly, Tomcat used the largest single amount of memory, though not as much as some doomsayers predicted.

Seeing that memory is the main bottleneck, I set out to optimize the setup a bit. Previously, each web application ran its own private copy of SISC and SISCweb. The most obvious step is to share SISC between all web applications.

It turns out that SISC has the concept of running several Scheme “applications” in the same interpreter, with separate heaps and absolutely no interference. To do this, one must create multiple AppContext instances and juggle them when calling Scheme from Java. This is a rather neat feature and one that I haven’t seen in other Schemes. It’s even possible to launch a new “application” from within Scheme.

When I tried to share SISC between multiple SISCweb instances, I ran into problems. It was clear that applications were stepping on each other’s AppContexts and overwriting global SISCweb book-keeping structures. Looking at the code, I found that SISCweb was not designed with the possibility of a shared SISC interpreter in mind and relied on an implicit default AppContext whenever calling Scheme from a servlet.

To make a long story short, I made a few fixes and this is no longer the case. I can now run several web applications (even for different virtual hosts) in the same SISC instance. The patches will be in the official SISCweb repository as soon as Alessandro reviews and publishes them.

Oh yeah, I almost forgot. Happy New Year!
