Debian 7 on the Samsung Series 9 Ultrabook

I recently purchased an upgrade to my aging laptop: a Samsung Series 9 NP900X3C-A01US 13″ Ultrabook. I won’t go too much into aesthetics except to say that this laptop is everything the reviews say it is. It’s light, sturdy, stylish, fast, and sips power. It is, almost down to the PCB, Samsung’s answer to the 13″ MacBook Air. I am happier with it so far than I have been with any laptop I’ve owned… and I’ve owned quite a few.

At any rate, throwing Debian 7.0 (wheezy) on this laptop was trivial and almost everything “just works”. There are a few things I had to tweak (power saving, function keys, etc.) and I wanted to outline them here. If you own one, implement the items below to get the most out of it.

Use the latest kernel
I am running 3.7.4 from kernel.org on this ultrabook. Always use the latest available stable kernel on laptops. This is doubly true on very new ones like the Series 9 if you want all the hardware to be well supported. Some hardware won’t work under the default wheezy kernel on this model; detecting when the lid was closed, for example, didn’t work properly. There are also continual improvements in power management happening in the kernel.

Use tmpfs
Debian doesn’t yet default to putting everything on tmpfs that should be there. In /etc/default/tmpfs set RAMTMP=yes to mount /tmp on tmpfs. I also like to add an entry to /etc/fstab to mount /home/someuser/.cache/google-chrome on tmpfs. Both changes speed up access to temporary and cache data and help to save power.

tmpfs /home/someuser/.cache/google-chrome tmpfs mode=1777,noatime 0 0
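After a reboot it is worth confirming that both mounts actually took effect. The following is a minimal sketch rather than anything Debian ships: the function name is_tmpfs and its optional second argument (an alternate mounts file, handy for testing) are my own inventions.

```shell
# is_tmpfs MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT is listed as a tmpfs mount.
# MOUNTS_FILE defaults to /proc/mounts; the override is a hypothetical
# convenience for testing, not something the system needs.
is_tmpfs() {
    awk -v mp="$1" '$2 == mp && $3 == "tmpfs" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Report on the two mount points discussed above.
for dir in /tmp /home/someuser/.cache/google-chrome; do
    if is_tmpfs "$dir"; then
        echo "$dir is on tmpfs"
    else
        echo "$dir is NOT on tmpfs"
    fi
done
```

Replace someuser with your own account name, as in the fstab entry.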

Enable discard support
This laptop comes with a 128GB SanDisk SSD U100. If your SSD supports TRIM (and this one does) and you are using ext4 (and you should be!) you can enable TRIM support in the file system by adding ‘discard’ to the mount options of each ext4 file system in /etc/fstab.

/dev/mapper/lvm-root / ext4 discard,errors=remount-ro 0 1

If, as in the example above, you are also using LVM then you should configure it to issue discards to the underlying physical volume. To do so, set “issue_discards=1” in /etc/lvm/lvm.conf.
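If you are not sure whether a drive supports TRIM at all, the kernel exposes the answer in sysfs. This is a rough sketch: the device name sda is an assumption about your system, and the function name supports_discard and its optional sysfs root argument are hypothetical helpers for illustration.

```shell
# supports_discard DEVICE [SYSFS_ROOT]
# Succeeds if the kernel reports a non-zero discard_max_bytes for DEVICE.
# SYSFS_ROOT defaults to /sys; the override exists only to make testing easy.
supports_discard() {
    f="${2:-/sys}/block/$1/queue/discard_max_bytes"
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

# "sda" is an assumed device name; substitute your SSD.
if supports_discard sda; then
    echo "sda accepts discard requests"
else
    echo "sda does not report discard support"
fi
```

A value of 0 in that file means the device (or a layer above it) will not accept discard requests, in which case the fstab option does nothing useful.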

Use NOOP scheduler
Schedulers are getting smarter these days so this might not be necessary any more. I am still in the habit of setting noop as the scheduler for non-rotational storage devices though. I like to add a udev rule that will set noop if the device advertises itself as non-rotational. You could just set the default elevator to noop but this would affect, say, a USB SATA disk that you may plug in some day.

cat > /etc/udev/rules.d/60-schedulers.rules << EOF
# set noop scheduler for non-rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="noop"
EOF
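To see which scheduler a disk is really using, read its queue/scheduler file; the kernel marks the active one with square brackets. Here is a small sketch of that check. The function name active_scheduler and its optional sysfs root argument are made up for this example.

```shell
# active_scheduler DEVICE [SYSFS_ROOT]
# Prints the scheduler the kernel marks as active (the name in brackets).
# SYSFS_ROOT defaults to /sys; the override is only there for testing.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "${2:-/sys}/block/$1/queue/scheduler"
}

# Walk over any sdX devices present and report their active scheduler.
for dev in /sys/block/sd?; do
    [ -e "$dev/queue/scheduler" ] || continue
    echo "${dev##*/}: $(active_scheduler "${dev##*/}")"
done
```

The rule is applied on add and change events, so it should take effect after a reboot or after re-triggering udev (udevadm control --reload-rules followed by udevadm trigger).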

i915 power saving
The i915 kernel module for the Intel HD 4000 graphics chip set supports some extra power saving options that you can take advantage of. To enable them, add the following to /etc/default/grub, in the same spot where the "quiet" option for grub currently exists.

GRUB_CMDLINE_LINUX_DEFAULT="quiet i915.i915_enable_rc6=1 i915.i915_enable_fbc=1 i915.lvds_downclock=1"

To see more information about what those options do, execute "modinfo i915".
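Keep in mind that edits to /etc/default/grub only take effect once you run update-grub and reboot. Afterwards you can check what the module ended up with by looking under /sys/module/i915/parameters. The sketch below assumes the three parameter names used above are exposed there by your kernel (they may be readable by root only); the function name show_i915_params and its optional directory argument are hypothetical.

```shell
# show_i915_params [PARAM_DIR]
# Prints "name = value" for each of the three options set on the kernel
# command line above. PARAM_DIR defaults to the live sysfs location; the
# override exists purely for testing.
show_i915_params() {
    dir="${1:-/sys/module/i915/parameters}"
    for p in i915_enable_rc6 i915_enable_fbc lvds_downclock; do
        if [ -r "$dir/$p" ]; then
            echo "$p = $(cat "$dir/$p")"
        else
            echo "$p = (not exposed by this kernel)"
        fi
    done
}

show_i915_params
```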

Disable onboard LAN
This is a truly portable notebook. You generally shouldn't be using the onboard LAN much. You can save some power by disabling it in the BIOS.

Extend battery life
This laptop has such good battery life that you should be able to live with it quite comfortably in "battery extender" mode. This mode only lets the battery charge up to 80% and greatly extends the useful life of the battery. Enable it in the BIOS.

If you are in a situation where you know you're going to need maximum battery life (say, while waiting to board a very long flight) you can disable battery extender mode via a file in /sys. Letting the battery charge to 100% should give you about another hour of run time.

echo 0 > /sys/devices/platform/samsung/battery_life_extender
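If you flip this often, a tiny wrapper saves some typing. This is only a sketch: the function name battery_extender and the BLE_FILE variable are my own, and I am assuming that writing 1 to the same file turns the mode back on (only the "echo 0" form is shown above). It needs to run as root.

```shell
# Path to the sysfs control file; override BLE_FILE to point elsewhere.
BLE_FILE="${BLE_FILE:-/sys/devices/platform/samsung/battery_life_extender}"

# battery_extender status|on|off
# status: print the current value
# on:     write 1 (assumed to re-enable the 80% charge limit)
# off:    write 0 (lets the battery charge to 100%)
battery_extender() {
    case "$1" in
        status) cat "$BLE_FILE" ;;
        on)     echo 1 > "$BLE_FILE" ;;
        off)    echo 0 > "$BLE_FILE" ;;
        *)      echo "usage: battery_extender status|on|off" >&2; return 2 ;;
    esac
}

# Example (as root):
#   battery_extender off      # before the long flight
#   battery_extender on       # back to normal afterwards
```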

Enable touchpad tapping
Xorg uses the wrong driver for the touchpad by default. If you want to enable tap / doubletap / etc. then you'll need to touch a config file for Xorg.

mkdir -p /etc/X11/xorg.conf.d
cat > /etc/X11/xorg.conf.d/50-synaptics.conf << EOF
Section "InputClass"
    Identifier "touchpad"
    Driver "synaptics"
    MatchIsTouchpad "on"
    Option "TapButton1" "1"
    Option "TapButton2" "2"
    Option "TapButton3" "3"
    #Option "VertEdgeScroll" "on"
    #Option "VertTwoFingerScroll" "on"
    #Option "HorizEdgeScroll" "on"
    #Option "HorizTwoFingerScroll" "on"
    #Option "CircularScrolling" "on"
    #Option "CircScrollTrigger" "2"
    #Option "EmulateTwoFingerMinZ" "40"
    #Option "EmulateTwoFingerMinW" "8"
    #Option "CoastingSpeed" "0"
EndSection
EOF

Coming Soon...

Enable silent mode binding
Something here about binding Fn-F11 to enable/disable silent mode.

Enable keyboard backlight bindings
Something here about enabling backlight keys Fn-F9 and Fn-F10

Enable wifi binding
Something here about Fn-F12

Turn off bluetooth radio by default
Related to the above, but only turn off bluetooth radio during boot up

Use Powertop
Something here about enabling powertop's tunables on boot up

Linux KVM: Openvswitch on Debian Wheezy

Among a great many other things, openvswitch is an alternative to managing your virtual networking stacks for KVM with bridge-utils. It supports VLANs, LACP, QoS, sFlow, and so forth.  Listed below are the steps required to get openvswitch running on Debian 7.0 (wheezy).

This article is written with the presumption that you are running a source-installed kernel (3.6.6 with the openvswitch module in this case), and want to use the latest openvswitch from git.

Install prerequisites

Apply any available updates, get all the build dependencies for openvswitch, and install module-assistant.

apt-get update && apt-get dist-upgrade
apt-get install build-essential
apt-get build-dep openvswitch
apt-get install module-assistant

Prep your environment

bridge-utils relies on a kernel module that conflicts with the brcompat module in openvswitch. Let's remove it and at the same time stop libvirt and KVM for a bit.

apt-get remove --purge bridge-utils
/etc/init.d/libvirt-bin stop
/etc/init.d/qemu-kvm stop

Build openvswitch

Clone the openvswitch git repo and build debian packages from it.

git clone git://openvswitch.org/openvswitch
cd openvswitch
dpkg-buildpackage -b

Install the packages you just built.

cd ../
dpkg -i openvswitch-switch_1.9.90-1_amd64.deb openvswitch-common_1.9.90-1_amd64.deb \
openvswitch-brcompat_1.9.90-1_amd64.deb openvswitch-datapath-source_1.9.90-1_all.deb \
openvswitch-controller_1.9.90-1_amd64.deb openvswitch-pki_1.9.90-1_all.deb

Build openvswitch-datapath for your running kernel.

module-assistant auto-install openvswitch-datapath

Configure brcompat to load on startup.

sed -i 's/# BRCOMPAT=no/BRCOMPAT=yes/' /etc/default/openvswitch-switch

Verify your configuration

At this point you should reboot and verify that the proper modules are loaded, the service starts normally, and the status output is correct.

$ lsmod | grep brcompat
brcompat               12982  0 
openvswitch            73431  1 brcompat

$ /etc/init.d/openvswitch-switch restart
[ ok ] Killing ovs-brcompatd (5439).
[ ok ] Killing ovs-vswitchd (5414).
[ ok ] Killing ovsdb-server (5363).
[ ok ] Starting ovsdb-server.
[ ok ] Configuring Open vSwitch system IDs.
[ ok ] Starting ovs-vswitchd.
[ ok ] Starting ovs-brcompatd.

$ /etc/init.d/openvswitch-switch status
ovsdb-server is running with pid 6281
ovs-vswitchd is running with pid 6332
ovs-brcompatd is running with pid 6357

And that’s it! You now have a working openvswitch installation upon which you can do all the usual things you did with bridge-utils, and so much more.
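As a starting point, creating a bridge and attaching a physical uplink with the native tooling looks roughly like this. Treat it as a sketch: the bridge name br0 and interface eth0 are placeholders, and the run and make_bridge helpers are hypothetical wrappers I added so the commands can be previewed with DRY_RUN=1 before touching the network.

```shell
# run CMD...: execute CMD, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_bridge BRIDGE UPLINK
# Creates BRIDGE if it does not exist, attaches UPLINK to it, then prints
# the resulting switch configuration.
make_bridge() {
    run ovs-vsctl --may-exist add-br "$1" &&
    run ovs-vsctl --may-exist add-port "$1" "$2" &&
    run ovs-vsctl show
}

# Example (placeholder names, run as root):
#   DRY_RUN=1 make_bridge br0 eth0   # preview only
#   make_bridge br0 eth0             # apply
```

Be careful doing this over SSH: once an interface becomes a bridge port, any IP configuration generally needs to move from the interface to the bridge.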

leap seconds and Linux

On June 30, 2012 a leap second was inserted into UTC, which caused a fair amount of difficulty for companies across the Internet. What follows is some explanation of leap seconds, the problems with their handling in the Linux kernel, and the available solutions.

What are leap seconds?

A leap second is a one second adjustment that is applied to UTC in order to prevent it from deviating more than 0.9 seconds from UT1 (mean solar time). It can be positive or negative and is implemented by adding 23:59:60 or skipping 23:59:59 on the last day of a given month (usually June 30 or December 31). Since the UTC standard was established in 1972, however, 25 leap seconds have been scheduled and all of them have been positive.

Since they are dependent on climatic and geologic events that affect the Earth’s moment of inertia (mostly tidal friction), leap seconds are irregularly spaced and unpredictable. The International Earth Rotation and Reference Systems Service (IERS) is responsible for deciding when leap seconds will occur, and announces them about six months in advance. The most recent leap second was inserted on June 30, 2012 at 23:59:60 UTC. It has been announced that there will not be a leap second on December 31, 2012.

What problems do leap seconds cause?

Leap seconds are problematic in computing for a number of reasons. As an example, to compute the elapsed seconds between two UTC dates in the past requires a table of leap seconds which must be updated whenever one is announced. It is also impossible to calculate accurate time intervals for UTC dates farther in the future than the interval of leap second announcements. There are more practical problems dealing with distributed systems that depend on accurate time stamping of series data.
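To make the first point concrete, here is a rough sketch of an elapsed-time calculation that consults a leap second table. It assumes GNU date, and the table is deliberately partial: it lists only the three most recent insertions (end of 2005, end of 2008, and June 30, 2012). The names LEAP_TABLE and elapsed_utc_seconds are mine.

```shell
# Instants (UTC midnight) immediately following each inserted leap second.
# Partial, illustrative table; a real one must list every leap second and be
# updated whenever IERS announces a new one.
LEAP_TABLE="2006-01-01 2009-01-01 2012-07-01"

# elapsed_utc_seconds START END
# Prints the true number of elapsed seconds between two UTC timestamps:
# the POSIX difference plus one for every leap second inserted in between.
elapsed_utc_seconds() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    leaps=0
    for day in $LEAP_TABLE; do
        t=$(date -u -d "$day 00:00:00" +%s)
        # The leap second sits just before t, so count it when the
        # interval starts before t and ends at or after t.
        if [ "$start" -lt "$t" ] && [ "$t" -le "$end" ]; then
            leaps=$((leaps + 1))
        fi
    done
    echo $((end - start + leaps))
}

elapsed_utc_seconds "2012-06-30 23:59:00" "2012-07-01 00:01:00"
```

The naive POSIX difference for that interval is 120 seconds; with the table the function reports 121, because 23:59:60 really happened.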

In particular, there have been problems with the implementation of leap second handling in the Linux kernel itself. When the last leap second occurred on June 30, 2012 this caused outages at reddit (Apache Cassandra), Mozilla (Hadoop), Qantas Airlines, and other sites. Generally speaking, leap second problems on Linux hosts are characterized by high CPU usage of certain processes immediately after application of a leap second to the local clock.

In one particular case, tgtd (scsi-target-utils) on CentOS 6 hosts began generating an average of 14,000 log messages per second:

Jun 30 23:59:59 host kernel: Clock: inserting leap second 23:59:60 UTC
Jun 30 23:59:59 host tgtd: work_timer_evt_handler(89) failed to read from timerfd, Resource temporarily unavailable
Jun 30 23:59:59 host tgtd: work_timer_evt_handler(89) failed to read from timerfd, Resource temporarily unavailable
Jun 30 23:59:59 host tgtd: work_timer_evt_handler(89) failed to read from timerfd, Resource temporarily unavailable

This caused the root file system of approximately 600 hosts to become full before the issue was mitigated.

Why do these problems occur?

The last leap second exposed a kernel bug that can affect any threaded application. It is most apparent with applications that use sub-second CLOCK_REALTIME timeouts in a loop, usually connected with futexes.

On July 3, 2007 commit 746976a301ac9c9aa10d7d42454f8d6cdad8ff2b (2.6.22) removed the clock_was_set() call from second_overflow() to prevent a deadlock. Due to this patch the following occurs when a leap second is added to UTC:

  • The leap second occurs and CLOCK_REALTIME is set back by one second
  • clock_was_set() is not called by second_overflow() so the hrtimer base.offset value for CLOCK_REALTIME is not updated
  • CLOCK_REALTIME’s sense of wall time is now one second ahead of the timekeeping core’s
  • At interrupt time, hrtimer code expires all CLOCK_REALTIME timers that are set for ($interrupt_time + 1 second) and before

At this point all TIMER_ABSTIME CLOCK_REALTIME timers now expire one second early. Even worse, all sub-second TIMER_ABSTIME CLOCK_REALTIME timers will return immediately. Any applications that use such timer calls in a loop will experience load spikes. This situation persists until clock_was_set() is called, for example, via settimeofday().
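The arithmetic behind that is simple enough to model. The following is a toy illustration only, not kernel code: it assumes the hrtimer base runs a fixed amount ahead of the timekeeping core (1000 ms after a leap second) and computes how long a timer armed some time in the future will actually wait. The function name effective_delay_ms is made up.

```shell
# effective_delay_ms TIMEOUT_MS SKEW_MS
# Toy model: a CLOCK_REALTIME timer armed TIMEOUT_MS in the future fires
# after max(0, TIMEOUT_MS - SKEW_MS) when the hrtimer base's idea of wall
# time is SKEW_MS ahead of the timekeeping core.
effective_delay_ms() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=0
    fi
    echo "$d"
}

# After a leap second the skew is one second (1000 ms).
for timeout in 100 500 1000 3000; do
    echo "timeout ${timeout}ms -> fires after $(effective_delay_ms "$timeout" 1000)ms"
done
```

Every timeout of a second or less collapses to zero, which is why a loop that sleeps for a few hundred milliseconds between iterations suddenly spins at full speed.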

On July 13, 2012, Linus merged several commits in d55e5bd0201a2af0182687882a92c5f95dbccc12 (3.5-rc7) which, beyond simply providing clock_was_set_delayed() in hrtimer to resolve the problem, included other rework of hrtimer and timekeeping.

Affected Kernels

This problem has existed since kernel 2.6.22. All kernels from 2.6.22 to 3.5-rc7 are presumably affected. All RHEL 5.x kernels already include a patch to avoid this bug. Unfortunately, Red Hat either neglected to patch, or mispatched, RHEL 6 for the same issue. All RHEL 6 kernels are vulnerable to this problem, with patches available in the following updates:

  • RHEL 6.3: kernel-2.6.32-279.5.2
  • RHEL 6.2 Extended Updates: kernel-2.6.32-220.25.1.el6
  • RHEL 6.1 Extended Updates: kernel-2.6.32-131.30.2

In Debian and its derivatives this issue is patched in the following kernel updates:

  • Debian 6.x (squeeze): linux-image-2.6.32-46
  • Debian 7.x (wheezy): linux-image-3.2.29-1
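For a quick first pass across a fleet, you can test whether a kernel version falls inside the upstream range given above. This sketch is deliberately crude: it assumes GNU sort (for -V), it strips everything after the first dash, so it treats 3.5 release candidates as fixed, and it knows nothing about distribution backports. A RHEL or Debian kernel that lands in the range still has to be compared against the package versions listed above. The function names are hypothetical.

```shell
# version_lt A B: succeeds if version A sorts strictly before version B.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# in_affected_upstream_range VERSION
# Succeeds if 2.6.22 <= VERSION < 3.5, ignoring any "-suffix".
in_affected_upstream_range() {
    v=${1%%-*}
    ! version_lt "$v" 2.6.22 && version_lt "$v" 3.5
}

if in_affected_upstream_range "$(uname -r)"; then
    echo "$(uname -r): in the affected upstream range, check for a patched package"
else
    echo "$(uname -r): outside the affected upstream range"
fi
```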

Resolution

Quite obviously, the most prudent fix is to apply a patched kernel package to the affected host, or upgrade to an upstream kernel newer than 3.5-rc7. If a given host cannot be patched, it is possible to manually call settimeofday() after a leap second is applied by issuing either of the following:

date -s "`LC_ALL=C date`"
date `date +'%m%d%H%M%C%y.%S'`

Doing so will resolve any present issues on the host in question.

Another interesting approach to solving this problem was devised by Google, which they call “Leap Smear”. Since Google run their own stratum 2 NTP servers, they patched NTP to not issue LI (leap indicator) and instead “smear” a leap second by modulating ‘lie’ over a time window w before midnight:

lie(t) = (1.0 - cos(pi * t / w)) / 2.0

You can read more about the leap smear technique at their blog.
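To get a feel for the shape of that curve, you can evaluate the formula with awk. This is just the published formula plugged into a helper; I am assuming t counts seconds from the start of the window, and the 1000 second window in the example is an arbitrary value I picked for illustration, not Google's.

```shell
# lie T W
# Prints the fraction of the leap second that has been applied T seconds
# into a smear window of W seconds, per lie(t) = (1 - cos(pi * t / w)) / 2.
lie() {
    awk -v t="$1" -v w="$2" 'BEGIN {
        pi = atan2(0, -1)
        printf "%.6f\n", (1.0 - cos(pi * t / w)) / 2.0
    }'
}

# Sample the curve at the start, quarter, middle, three-quarter and end
# points of a hypothetical 1000 second window.
for t in 0 250 500 750 1000; do
    echo "t=$t lie=$(lie "$t" 1000)"
done
```

The cosine makes the adjustment start and end gently, with the steepest change in the middle of the window, so clients never see a step.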

install debian directly onto an AoE root filesystem

Something that just about no one out there seems to be doing (yet) is trying to install Debian directly onto network block devices. The Debian installer doesn't support it (yet), grub doesn't support it (usually), and it's just generally not an easy thing to do.

Now, there are quite a few ways around this problem. You can install to a ‘real’ computer and migrate the installation to a network block device. You can use debootstrap in place of the actual Debian installation system. You can use a combination of these two methods, NFS root filesystems, TFTP hacks, etc. All of these solutions are lacking in my opinion. I want to run the ‘real’ Debian installer against a network block device and boot my physical hardware using only the built-in PXE booting capability of the BIOS.

Taking all these issues as a personal challenge, I’ve outlined below how to go about using the regular old Debian Lenny installer directly against an AoE block device.

openais: an alternative to clvm with cman

I’ve been battling lately with a lot of problems with cman, part of Red Hat Cluster Suite. Specifically, the fencing tool (fenced) is pretty much junk when you try to start using it with Xen dom0’s. After much searching and gnashing of teeth I happened upon this mailing list post. The promise there is that you could take clvm and compile it against openais and get a cluster aware LVM which doesn’t require the rest of Red Hat Cluster Suite (and its crappy documentation, crappy fencing, and general all around crappiness). A little more searching turned up this web site from Olivier Le Cam which pretty much did 90% of the work for me.

After some testing I’m happy to say it appears to work smashingly. What follows is a somewhat more complete version of how to achieve the same results on Debian Lenny. Enjoy :)


Set up a bluetooth keyboard in Debian Etch

I recently purchased a new Apple Wireless Bluetooth Keyboard for use with MythTV. The choice of input device for MythTV is a very subjective thing to be sure, but I love this device because it's as small as it can be without feeling cramped, and it's thin, lightweight, and stylish.

Setting the device up to work with Debian Etch is fairly straightforward once you know what to do.


VLAN Bridging in Xen

Recently I came upon the need to do all my network routing and firewalling inside a Xen domU. I am not the first to do this but I thought I’d do a little write up on it to help others trying to accomplish the same thing in Debian.

The idea here is to end up with (at least) two VLANs on the network with the dom0 and domU’s being able to choose one or both networks on which to exist. In the case of both, you can set up a handy domU firewall/gateway :)

As you can see from the diagram above, we will end up with three bridges in the dom0 with all the appropriate glue to tie everything together. Best of all, this is all assembled on the fly during bootup.


Coraid Odyssey: Part 5 (AoE vs iSCSI)

The next phase of this project is choosing AoE or iSCSI. The debate on the relative merits of each protocol continues to rage on the Internet but in my particular case the criterion is pretty simple: which one performs better without causing excessive system load? Just from reading about the two protocols I am already leaning toward iSCSI for the simple fact that I can use all my TCP/IP management tools (routing, NAT, firewalling, etc.) on every iSCSI device. The only (potential) drawback is CPU load on the involved systems, since they have to calculate TCP checksums for all those packets. Yes, there are many, many other advantages of one protocol over the other. No, they don’t matter to me in this scenario :-) So here we go!
