Haskell is awesome

18 Mar 2012

I have started teaching myself Haskell using the book “Real World Haskell”. So far I have only reached chapter 4, but I am already in love with some of the features:

  • Its strong static type system, which makes it easy to understand what a function does. Moreover, it forces you to think through what your code is going to do and to decide up front how to handle special cases. The following is the type signature of a function that compares the lengths of two lists and returns their ordering (LT, EQ or GT). The signature clearly states that it operates on two lists with the same, arbitrary element type and returns a value of type Ordering. Crystal clear!

      listCmp :: [a] -> [a] -> Ordering
    
  • Partially due to the above point, one can avoid the unpleasant bugs that show up later when you postpone the decision on what to do with your input.

  • Pattern matching. I came across this in the Oz programming language when I was at university, but I didn’t really understand how powerful and readable everything becomes until using it in Haskell. The following function takes a separator and a list of lists as arguments, and joins the lists with the separator between them:

      intersperse :: a -> [ [a] ] -> [a]
      intersperse sep [] = []
      intersperse sep (x:[]) = x
      intersperse sep (x:xs) = x ++ [sep] ++ (intersperse sep xs)
    

    I love how you can just look at the patterns to see which cases are covered by the function, rather than digging through some complex nested if statement.

  • Readability when using ‘where’ syntax. This is the implementation of the listCmp function:

      listCmp lhs rhs
          | lengthLhs < lengthRhs = LT
          | lengthLhs > lengthRhs = GT
          | otherwise             = EQ
        where lengthLhs = (length lhs)
              lengthRhs = (length rhs)
    

    What I like about it is that you can separate the logic performed on values from the function calls, so that when you read the code, you see the actual computation done by the function in the different cases. You can also do this with the let syntax, but I think the above reads really well.
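
For comparison, here is a small sketch of what the alternatives could look like: the same separator function written with nested if expressions instead of patterns, and listCmp written with let instead of where. The names intersperseIf and listCmpLet are made up for this illustration, and the pattern-matching version is repeated only so that the example is self-contained.

```haskell
module Main where

-- Pattern-matching version, as in the post (unused arguments written as _).
intersperse :: a -> [[a]] -> [a]
intersperse _   []     = []
intersperse _   [x]    = x
intersperse sep (x:xs) = x ++ [sep] ++ intersperse sep xs

-- Hypothetical version using nested if expressions, for contrast.
-- The cases are the same, but they are hidden inside the conditions.
intersperseIf :: a -> [[a]] -> [a]
intersperseIf sep xss =
  if null xss
    then []
    else if null (tail xss)
      then head xss
      else head xss ++ [sep] ++ intersperseIf sep (tail xss)

-- listCmp written with let instead of where (hypothetical name).
listCmpLet :: [a] -> [a] -> Ordering
listCmpLet lhs rhs =
  let lengthLhs = length lhs
      lengthRhs = length rhs
  in  if lengthLhs < lengthRhs then LT
      else if lengthLhs > lengthRhs then GT
      else EQ

main :: IO ()
main = do
  putStrLn (intersperse ',' ["foo", "bar", "baz"])    -- foo,bar,baz
  putStrLn (intersperseIf ',' ["foo", "bar", "baz"])  -- foo,bar,baz
  print (listCmpLet [1, 2, 3 :: Int] [4, 5])          -- GT
```

Both versions compute the same thing, but in the if version you have to read the conditions to find out which cases exist, while the patterns list them directly.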





Back to compiling software

01 Oct 2011

For a while now, I have been using Ubuntu Linux on my desktop, and it has worked really well. In fact, I even installed Windows 7 on my media center (replacing Linux) just to stop bothering with configuring my system all the time. Since I started working at Yahoo!, I have not really felt like doing extra work at home in order for my computer to function properly. Moreover, I did not have much time left to work on FreeBSD, so I simply reinstalled my desktop with Linux, and that has been working well for almost a year now.

But recently I have sort of missed working on FreeBSD, so I decided to give it a try again from a user perspective. Many of the things I felt were lacking are still lacking. However, the things that were good are still good. So far, I have been able to install all the software that I wanted, but I still feel that we need something better on top of ports in order to make it easier for users. Hopefully, some of the initiatives that I have seen on the mailing list will not die any time soon. Apart from ports, many of the common tasks are pretty manual too. Configuring the system should be more straightforward than having to guess what should be in /etc/rc.conf and edit it by hand. Though many of the issues I encounter come from the fact that FreeBSD has a very small userbase and is simply not prioritized by many companies, there are a lot of things that can be improved regardless of that. If I start doing any more FreeBSD work, it is most likely to be in the “make-it-less-painful-to-use” department.





Using 4k sector drives

15 Aug 2010

I bought two Western Digital 2 TB disks the other day in order to increase my storage capacity, planning to put a ZFS mirror on them. I then discovered that the disks use a new drive format called “Advanced Format”. This format basically extends the physical sector size from 512 to 4096 bytes.

The problem is that the disks report their sector size to be 512 rather than 4096 in order for them to work well with existing operating systems. The issues with these disks are discussed here and here.

To summarize, this results in two main problems:

  1. Partitioning tools operate on 512-byte “logical” sectors, which may result in a partition starting at a physical sector boundary that is not aligned to 4096 bytes. With partitioning tools that have not been updated to align partitions to 4k, a single request may therefore touch more than one physical sector.

  2. File systems and other disk consumers think the underlying device has a 512-byte sector size, and issue requests that are smaller than 4096 bytes. For a write request this is catastrophic, because in order to write only part of a physical sector, the disk has to read the whole sector and modify the part that changed before writing it back (read-modify-write).
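
To make the two problems a bit more concrete, here is a rough sketch of the arithmetic in Haskell. It only models the sector layout described above, not what the drive firmware actually does, and the function names are made up for this illustration.

```haskell
module Main where

-- Physical sector size of the drive, in bytes.
physicalSize :: Integer
physicalSize = 4096

-- Number of physical sectors touched by a request of `size` bytes
-- starting at byte `offset` (assumes size > 0).
sectorsTouched :: Integer -> Integer -> Integer
sectorsTouched offset size =
  (offset + size - 1) `div` physicalSize - offset `div` physicalSize + 1

-- In this simplified model, a write needs read-modify-write unless it
-- both starts and ends on a physical sector boundary.
needsRmw :: Integer -> Integer -> Bool
needsRmw offset size =
  offset `mod` physicalSize /= 0 || (offset + size) `mod` physicalSize /= 0

main :: IO ()
main = do
  print (sectorsTouched 0 4096, needsRmw 0 4096)      -- (1,False)
  print (sectorsTouched 512 4096, needsRmw 512 4096)  -- (2,True)
  print (sectorsTouched 0 1024, needsRmw 0 1024)      -- (1,True)
```

An aligned 4k write touches exactly one physical sector, the same write shifted by 512 bytes touches two, and a 1k write always rewrites only part of a sector no matter where it starts.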

Dag-Erling Smørgrav made a tool to benchmark disk performance using aligned and misaligned writes, mentioned in his post above (svn co svn://svn.freebsd.org/base/user/des/phybs). Here are the results:

nobby# ./phybs -w /dev/gpt/storage0
count    size  offset    step        msec     tps    kBps

131072    1024       0    4096      131771      16     994
131072    1024     512    4096      136005      16     963

 65536    2048       0    8192       74762      14    1753
 65536    2048     512    8192       71407      15    1835
 65536    2048    1024    8192       73432      15    1784

 32768    4096       0   16384       20710     130    6328
 32768    4096     512   16384       61987      43    2114
 32768    4096    1024   16384       62719      43    2089
 32768    4096    2048   16384       61089      44    2145

 16384    8192       0   32768       14238     245    9205
 16384    8192     512   32768       53348      65    2456
 16384    8192    1024   32768       52868      66    2479
 16384    8192    2048   32768       50914      68    2574

Clearly, using blocks smaller than 4k results in bad performance regardless of alignment. With blocks of 4k or larger, aligned writes (offset 0) are roughly three to four times faster than misaligned ones.

The way I solved this in FreeBSD was to partition the disk manually with gpart and set the partition start to a multiple of 8 (8 * 512 = 4096). All partitions on the disk should start at a sector number that is a multiple of 8.
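
The multiple-of-8 rule is easy to check with a few lines of Haskell. This is just a sketch of the arithmetic, not something gpart provides; the function names are made up, and sector 34 is used only as an example of a start sector that is not aligned.

```haskell
module Main where

logicalSize, physicalSize, sectorsPerPhysical :: Integer
logicalSize        = 512   -- sector size reported by the disk
physicalSize       = 4096  -- sector size actually used by the disk
sectorsPerPhysical = physicalSize `div` logicalSize  -- 8

-- A partition is aligned if its start (in logical sectors) is a multiple of 8.
isAligned :: Integer -> Bool
isAligned start = start `mod` sectorsPerPhysical == 0

-- Round a start sector up to the next aligned logical sector.
alignUp :: Integer -> Integer
alignUp start =
  ((start + sectorsPerPhysical - 1) `div` sectorsPerPhysical) * sectorsPerPhysical

main :: IO ()
main = do
  print (isAligned 34)  -- False
  print (alignUp 34)    -- 40
  print (isAligned 40)  -- True
```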

ZFS uses variable block sizes for its requests, which can pose a problem when the underlying provider reports a sector size of 512 bytes. In order to override this, I used gnop(8), which can create a provider on top of another provider with different characteristics: gnop create -o 4096 -S 4096

The -o parameter makes sure that the new provider does not conflict with the original provider when ZFS tries to detect any filesystems on the disk. The -S parameter sets the sector size of the new provider to 4096, which makes sure that all requests going to the disk from ZFS will be in multiples of 4k.

For UFS, the default fixed block size is 16k, so there should be no worries about it using lower block sizes. Moreover, newfs provides a -S parameter, which overrides the sector size of the underlying provider. I have not tried using UFS on these disks, but I don’t see any reason for it not working.





Locale fix

12 Dec 2009

After looking for a long time for the reason why my default locale in GNOME changed after a recent upgrade, I finally found out where to change the locale setting. The problem was that GNOME did not seem to pick up my system locale settings, and the Norwegian characters in my terminal came up as question marks.

Since the GNOME login manager (GDM) was rewritten, there is no way to change the locale at the login screen unless GDM has picked it up itself. But, as always, reading the documentation helps. After reading

http://library.gnome.org/admin/gdm/2.28/configuration.html.en

I discovered that I could just edit

~/.dmrc

and write this:

[Desktop]
Language=en_US.UTF-8
Layout=no

to set the correct locale!





Sometimes ports make me cry

08 Sep 2009

I guess I’m not the typical FreeBSD user, because I do not enjoy using ports much. Mainly this is because I also use it as a desktop. On a powerful server or workstation, ports is fine. It’s super flexible and everything works quite well. And kudos to all people working on updating and making improvements to it.

However, using ports on my laptop really makes me cry. Why? If I want to install a port, I have to keep a ports tree on my laptop and actually compile everything. Since I have a pretty weak laptop in terms of processing power, this takes ages. But of course, I can install packages! The thing with packages, however, is that they work really well for a release, but when upgrading later on, I always end up in trouble if I try to use the official FreeBSD packages.

First of all, the package sets that follow each release get outdated quickly. Second, if I want to update my packages without using ports, I get into trouble. There is no real package upgrade tool that I know of, but I can install portupgrade, because it has a fancy -PP option telling it to use packages only. But there are issues with this: portupgrade seems to require a ports tree to work. In addition, when you have the ports tree, portupgrade will look for packages matching the exact version that is in ports, and if the package server was not built from the same ports tree as yours (a single commit updating a port can break this), it fails.

So what is the solution for me, besides writing a pkg_upgrade? Having a ports tinderbox on a different host to build packages for my laptop (I could use official 8-stable packages for instance, but there always seem to be some packages missing, and some not built). And the upgrade procedure? Move /usr/local and /var/db/pkg away, and reinstall packages. It works ok, but looking at how well this can be handled on other systems, it’s a bit silly :/ So, maybe I’ll just have to look closer at the pkg_upgrade idea :)

So, on to the constructive part of this rant^Wpost. There is no need to change everything for this to work better. A pkg_upgrade tool can probably reuse a lot from the other pkgtools, such as version checking and dependency checking. However, the hard part is knowing what version to get from the servers. Luckily, the Latest/ directory contains unversioned tarballs of packages that can be examined to get their version. But again, this requires one to download the packages first in order to examine them. Not very bandwidth-friendly. I think a simple approach would be to keep a version list together with the packages, which could be used by pkg_upgrade to check if any new version of a package exists (much like INDEX in /usr/ports I guess). I haven’t thought about the hardest question yet: how to handle dependencies and package renaming, but I would think one could allow specifying this in the same file.
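
A minimal sketch of the version list check in Haskell could look like the following. It assumes a lot: the package names are made up, versions are taken to be plain dotted numbers (real port versions also carry revision and epoch suffixes, which this ignores), and dependencies and renames are not handled at all.

```haskell
module Main where

import qualified Data.Map as Map

-- Simplified version: a list of numeric components, e.g. "2.10" -> [2,10].
type Version = [Int]

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- Assumes purely numeric, dot-separated versions; anything else is an error.
parseVersion :: String -> Version
parseVersion = map read . splitOn '.'

-- Packages that are installed and have a newer version in the list
-- published next to the packages: (name, installed, available).
outdated :: Map.Map String String -> Map.Map String String
         -> [(String, String, String)]
outdated installed available =
  [ (name, have, new)
  | (name, have) <- Map.toList installed
  , Just new <- [Map.lookup name available]
  , parseVersion new > parseVersion have
  ]

main :: IO ()
main = do
  let installed = Map.fromList [("foo", "1.2.3"), ("bar", "2.9"), ("baz", "0.4")]
      available = Map.fromList [("foo", "1.2.3"), ("bar", "2.10"), ("qux", "1.0")]
  print (outdated installed available)  -- [("bar","2.9","2.10")]
```

Comparing the versions as lists of numbers rather than as strings matters here: as strings, "2.10" would sort before "2.9".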

Update: As I was working against my local package repository, I did not notice that the official package repositories actually contain the INDEX file from the ports tree that the packages were built from.

I also think the package building procedures could be changed, because somehow there are always packages missing (at least several GNOME packages the last time I tried). I do not know much about this, but I would advocate a system where a package is rebuilt on all architectures and supported releases as soon as a commit is made to the affected port.

There, I feel better now :)