• Linux kernel code coverage with rapido and gcov

In quality engineering, we want to know how “good” or comprehensive our testing is. We want to know whether our test coverage in a given area is sufficient. This helps us decide whether to move on to the next area or to invest more time and effort before doing so.

    Challenges with testing of big software projects

When you write your own small application or service, it is rather easy to see whether test coverage is good enough. Often, we can test every important part manually and be reasonably confident. As projects grow more complex, you can no longer guess whether your testing is “good” or whether you need to do more. If you want to test a single file of the kernel tree, you have to take all the other parts of the kernel into account, and judging coverage by intuition simply no longer works.

    GCOV and the Linux kernel

There are a lot of commercial tools out there, as well as frameworks for many languages. The Linux kernel supports GCOV, and that is what we will use to analyze code coverage.

Please note: even if we achieve good test coverage, it won’t serve as an argument for safety certification, because our test cases are not derived from requirements (so-called black-box tests). While we would prefer that, it is not feasible here, so we use code coverage information simply to gain confidence in our code.

    Installing GCOV, the Linux kernel, LTP and rapido

We start by installing gcov[1] and gcovr[2], a report generation tool for gcov, plus some dependencies for rapido. The following example is for openSUSE, but it should not be much different on other distributions.

    zypper in gcov gcovr qemu
    

We also need the Linux kernel sources[3]. You’ll need a couple of build dependencies again, but we’ll skip those here. We have to work in the same location on the host as on the guest later, so for simplicity we place the kernel in /host.

    sudo mkdir /host
    sudo chown <user> /host
cd /host
    git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
    cd linux
    make x86_64_defconfig
    make kvm_guest.config
    
    scripts/config --enable DEBUG_FS
    scripts/config --enable GCOV_KERNEL
    scripts/config --enable GCOV_PROFILE_ALL
    
    make -j8
    INSTALL_MOD_PATH=./mods make modules_install
    
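Before booting anything, it does not hurt to verify that the coverage options actually ended up in the final kernel configuration. A quick sanity check, run from the kernel source directory, could look like this (the CONFIG_ symbols listed are the ones corresponding to the options enabled above):

grep -E 'DEBUG_FS|GCOV' .config
# expect, among others:
# CONFIG_DEBUG_FS=y
# CONFIG_GCOV_KERNEL=y
# CONFIG_GCOV_PROFILE_ALL=y
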

Next, we need LTP[4], assuming you have a toolchain and autotools installed:

    git clone https://github.com/linux-test-project/ltp.git
    cd ltp
    make autotools
    ./configure
    make
    sudo make install
    

    Next we can proceed and install rapido[5]:

    git clone https://github.com/rapido-linux/rapido.git
    cd rapido
    cp rapido.conf.example rapido.conf
    

    Find the lines below and uncomment/set them accordingly in rapido.conf:

    KERNEL_SRC="/host"
    VIRTFS_SHARE_PATH="/host"
    LTP_DIR="/opt/ltp"
    

Plus, we need to change the following line in rapido’s cut/ltp.sh from

     cat lsmod ip ping tc \
    to
     cat lsmod ip ping tc gcov gzip \
    

    This could be added upstream, but for now you have to add the binaries to the list manually.

    Now we can boot the system and verify it is working:

    ./rapido cut ltp
    

This will start your kernel and allow you to run LTP tests. Stop the VM with:

shutdown

Prepare coverage scripts

Check the kernel documentation on gcov[6], then create a file "gather_on_test.sh" in /host with the following content:

    #!/bin/bash -e
    
    DEST=$1
    GCDA=/sys/kernel/debug/gcov
    
    if [ -z "$DEST" ] ; then
      echo "Usage: $0 <output.tar.gz>" >&2
      exit 1
    fi
    
    TEMPDIR=$(mktemp -d)
    echo Collecting data..
    find $GCDA -type d -exec mkdir -p $TEMPDIR/\{\} \;
    find $GCDA -name '*.gcda' -exec sh -c 'cat < $0 > '$TEMPDIR'/$0' {} \;
    find $GCDA -name '*.gcno' -exec sh -c 'cp -d $0 '$TEMPDIR'/$0' {} \;
    tar czf $DEST -C $TEMPDIR sys
    rm -rf $TEMPDIR
    
    echo "$DEST successfully created, copy to build system and unpack with:"
    echo "  tar xfz $DEST"
    
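One small detail: the script needs to be executable so it can be called from inside the VM later (remember that /host is shared with the guest):

chmod +x /host/gather_on_test.sh
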

Running the tests and collecting coverage data

Run rapido again, exactly as you did before when verifying that it works. Then, once the VM has finished booting, run LTP with the syscalls group:

    cd /opt/ltp
    ./runltp syscalls
    

    And then grab a cup of tea and put on a record. This will take a while. Once it is finished, you can collect coverage data:

    cd /host
    ./gather_on_test.sh coverage.tar.gz
    shutdown
    

Generating the coverage report

    Now we are able to generate a coverage report:

    cd /host
    tar xzf coverage.tar.gz
    gcovr -j 8 -s -o cov.txt -r .
    

    This will take quite a while again. After it is complete, you can look at the report in the file cov.txt.
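The plain-text report gives a per-file summary. If you prefer something easier to browse, gcovr can also write an HTML report; a possible invocation (the output file name is chosen arbitrarily) would be:

cd /host
gcovr -j 8 -s -r . --html --html-details -o cov.html
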

    Conclusion

As you can see, getting coverage data for a kernel and a set of tests is a rather lengthy process. The resulting coverage numbers are also rather low, and they only cover a minimal kernel configuration. Still, having coverage information can be helpful, so I’ll continue to look into this topic. In the next post, I will cover how to read the report and how to obtain coverage information for a small area of interest. In any case, it is always fun to explore available tools and see what they can do for you.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
    [2] https://gcovr.com/en/stable/index.html
    [3] https://kernel.org
    [4] https://github.com/linux-test-project/ltp
    [5] https://github.com/rapido-linux/rapido
    [6] https://elixir.bootlin.com/linux/latest/source/Documentation/dev-tools/gcov.rst
  • Redesigning the home server (Part 1)

I am writing this as part of my design process for re-implementing my home server in 2020, after many, many years of (production) usage. This article is the first in a planned series of posts describing what I do, why I do it and what the benefits are.

This first post is intended to describe the status quo: the hardware I am using and the services I am running. On top of that, I will try to summarize some of the weaknesses I discovered. After all, there is a reason for me to re-design this monster.

    Hardware overview

The hardware changed over time. It changed a lot. I started with some cheap Intel Celeron processor and 4 GB of RAM. Some years later, I upgraded to a third-generation Core i5 and bumped the RAM to 8 GB. The latest upgrade was a (used) Supermicro X10SLL-F with an Intel Xeon E3-1270 v3, 32 GB of ECC memory and an LSI SAS2008 HBA that just passes the individual disks through to the OS. I have five hard drives of 3 TB each in a RAID 5 configuration.

    Software overview

First, the operating system. After a lot of trial and error, I ended up with the distribution I had used as my very first one, back in 1998: openSUSE. I migrated the installation from Tumbleweed to Leap 42.1 when it was released and kept it up to date over the years. While there is always something you don’t like, it never broke completely, and in the meantime I have become really familiar with it. More than two years ago I started my job at SUSE, but that did not even influence this decision. This distribution is rock-solid and has never let me down.

    I have my disks in one big RAID 5 array, using XFS as the file system. While again, this never let me down, I miss the advanced features of modern filesystems, like snapshots. But more on that later.

    Services

The services I am running fall into two groups: essential production services, especially for my wife, on the one hand, and my personal playground on the other. The most important services are actually simple file serving with Samba and NFS, and automatic encrypted backups of important data. The documents my wife has to access for her business are stored on Samba shares, and every night an incremental, encrypted off-site backup is created by a cron job. As a second service, I’m running a Nextcloud instance for her (and myself, of course) on our 100 MBit/s dial-up connection - good enough for our requirements. Then there are a lot of private files stored on the array, mainly media files and photos. That’s too much data for periodic backups at a price I am willing to pay, so I only back up my photos and personal documents.

Next, I am running a Plex media server. If you have never used proper media center software, take a look [1]. We really use this a lot, even in the days of Netflix & co. It is running inside a Docker container. The second service I run containerized is Gitea, a lightweight personal git server where I keep repositories I don’t want on GitHub, like my smart home configuration or my password store. Talking about smart home: I’m running Home Assistant as my personal smart home hub, simply inside a Python virtual environment. The same goes for OctoPrint, the amazing interface to my 3D printer.

Finally, I am currently playing with and evaluating the Pi-hole ad blocker in a virtual machine.

I used to run a lot of websites and web services for me and my family, but basically only a few, like a photo gallery, survived. As you can see, there is still quite a lot going on, and most of the services are used on a regular basis.

    Problems with the current setup

I run most of the services for a reason. However, right now one thing is running in a container, another in a VM, and the next is just a plain process running as some user, residing somewhere on the filesystem. Sometimes it is even difficult to quickly dive into problems, because I tend to forget details once I have not touched something for a few weeks.

After gathering some experience, I want to ditch Docker. Containers just don’t serve me well here, and in my case the added complexity is simply not worth it. On top of that, if you just docker pull an image, you give up control over what is actually running. I intend to use Docker solely for evaluating services before setting them up properly. I gained a lot of experience with administrative tasks, so I am partly writing this series as my personal "lessons learned" - and trying to get things right this time. I hope it will be useful for someone else, too.

My root filesystem resides on a 256 GB SSD and is formatted with btrfs. Together with snapper, this saved my life many, many times. I really love snapshots, and I will never again use a filesystem that does not offer them. This leads me to the next problem: XFS does not support snapshots. You could achieve similar functionality with LVM [2], but as XFS is not a copy-on-write filesystem, this will never be fun. So basically, I want to replace my software RAID with XFS with something more modern and feature-rich, like btrfs. More on that in one of the next posts.

And I would prefer moving my services into one (or a few) virtual machines, so I could maybe even run them in an HA setup, or at least have a cold standby machine sitting somewhere in case my hardware dies.
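Just to illustrate why copy-on-write matters to me here: with btrfs, a consistent, read-only point-in-time snapshot of a subvolume is a single cheap command. The paths below are placeholders for whatever layout the new setup ends up with:

# create a read-only snapshot of the data subvolume, named after the current date
btrfs subvolume snapshot -r /srv/data /srv/.snapshots/data-$(date +%F)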

    Conclusion

Even if you build a server for your personal/home use, you should have a plan in advance. What started as a toy during my university days ended up being a production-grade system that has to be up and running 24/7. That was never foreseeable, but now I need to get things right with only minimal service disruptions. I hope the following posts will be useful if you want to set up your own home server - and do it in a way that you can use for a long time and keep maintainable.

    [1] https://plex.tv
    [2] https://www.tecmint.com/take-snapshot-of-logical-volume-and-restore-in-lvm/
  • Password managers

This post was originally written for the WordPress blog I ran in addition to this one, but I want to get rid of that blog. This is an updated English version of the original post.

Password managers are a tricky thing. On the one hand, you want to use them to store an individual password for each service. On the other hand, you have to entrust all your data to some service. Over time, I have tried quite a few of them, and I want to share my experience, concerns and opinions on the tools and services I used. Using generated passwords and not reusing them is one thing that really improves the overall security of all your accounts. You really want to do this. And here are a few options for you.

    LastPass

LastPass [1] was the first service I tried, and I even revisited it when writing the original post. Initially I was using the premium version, but after the trial I continued with the free one. It actually worked quite well across many devices, including the checks for leaked passwords. iPhone and iPad were no surprise, nor was Windows - but even the browser integration on Linux worked reasonably well. The password audits for weak and reused passwords came in handy, and the service helps you change passwords quickly. In general, the company seems to have been quite responsive when it had security issues in the past, but this is of course no audit of LastPass.

But now to the downside, and what keeps me from using LastPass as my personal password manager: you cannot add an arbitrary number of URLs to a password entry, for example to use the same entry for different services on an intranet. You can only define equivalent domains, and those must not include subdomains. This is a serious limitation for me, and maybe for you.

    1Password

1Password [2] is another commercial provider, but without a free tier. You get a trial period, after which you have to buy a subscription. I used this service for a while after ditching LastPass, because the integration on my iOS devices is rather nice, and they also offer proper browser integration across all platforms. They even provide a command line interface, but it was rather complicated to use. It’s actually not a bad service, with good features, but I found it too expensive.

    Bitwarden

On my journey, I encountered Bitwarden [3]. This software is open source, and you can either get an account hosted by them, or you can host it yourself. In both cases, there is a paid option for extra features like sharing passwords. I would have been willing to spend that money to support open source, especially as I can host my own server. Feature-wise, it is almost on par with LastPass and 1Password, with the big plus of running on your own infrastructure - which is something I want to have.

Update: I discovered a second implementation of the server component, which uses far fewer resources and seems to be quite usable [8].
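For reference, spinning up that lightweight server implementation was essentially a one-liner with Docker at the time; treat the image name, port and data path below as assumptions taken from the project’s documentation [8]:

# run the Bitwarden-compatible bitwarden_rs server, persisting its data on the host
docker run -d --name bitwarden \
  -v /srv/bitwarden/data:/data \
  -p 8080:80 \
  bitwardenrs/server:latest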

    Pass

Pass [4] is completely different. Every password is just a plain text file containing meta-information like username, URL, etc., and the password itself. Each of these files is encrypted with gpg [5], a technology widely used for securing e-mail communication (and a lot more, of course). At least for me, using it was quite easy, as I already use gpg a lot. But that’s only half of it - you want the store synchronized across your devices. Pass uses git [6] to synchronize your passwords; in my case with a self-hosted service using Gitea [7] and a private repository. The initial setup is a bit more challenging - you need to create keys for gpg and for ssh, distribute them, and so on. But after that, it works like a charm. On Linux and iOS I never had any problems. The only problematic platform for me was Windows, but that is only used for occasional gaming anyway, so I don’t care too much.
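To give an idea of the workflow, a minimal setup could look like the following; the gpg key ID and the git remote are placeholders, and the private repository on the Gitea side is assumed to exist already:

# initialize the store for a gpg key and put it under git control
pass init "ABCD1234DEADBEEF"
pass git init
pass git remote add origin git@git.example.com:me/password-store.git

# add entries (typed in or generated), then push them to the private repository
pass insert web/example.com
pass generate web/another-site.com 20
pass git push -u origin master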

    Conclusion

For me, there are a few considerations. The most important one when it comes to IT security is, in my opinion, trust, and I’m not sure how much you should trust companies with your passwords. Current developments in Germany depict a scenario in which providers could be forced to hand your passwords to law enforcement in plain text. I promise you: if they are required to store plain text passwords, those will also leak to criminals. For me, personally, a self-hosted open source solution is the only way to go. And for me, as a Linux power user, Pass is the perfect solution. If you are interested in a proper introduction or in how to set all this up, send a mail to blog at frankenmichl dot de. In case you are interested: I started a small project to convert my 1Password data into pass; you can find it on my git server.

Even though I would still consider Pass the perfect solution, I’m using Bitwarden in the meantime. It has proven itself rock solid and reliable, it is easy to use on other operating systems (like Windows) and - that’s the main reason - my wife is finally using it now, too. I don’t dare try to teach her to use Git.

    [1] https://lastpass.com/
    [2] https://1password.com/
    [3] https://bitwarden.com/
    [4] https://www.passwordstore.org/
    [5] https://gnupg.org/
    [6] https://git-scm.com/
    [7] https://gitea.io/
    [8] https://github.com/dani-garcia/bitwarden_rs
  • NVMe device in QEMU-VM

    Have you ever wanted to play with fancy new NVMe, but lack the hardware? Well, luckily, QEMU has all you need to play with NVMe (and even NVMe over Fabrics).

Starting QEMU manually

    Just create a backing image file (using dd if=/dev/zero of=/path/to/nvme.img bs=1M count=4096 for example) and start QEMU like this:

$ qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 -cpu host -hda ~/path/to/qemu_disk.qcow2 -boot c \
        -drive file=/path/to/nvme.img,if=none,id=D22 \
        -device nvme,drive=D22,serial=1234
    

    NVMe and virt-manager

Unfortunately, if you want to use virt-manager, things are (as of the time of this writing) not that easy - you need to configure the additional command line arguments manually. Find the UUID of your QEMU VM in virt-manager, for example b11b447d-00a9-4764-80d9-3d68e88ef686.

    Just use virsh:

    $ virsh
    virsh # connect qemu:///system
    
    virsh # edit b11b447d-00a9-4764-80d9-3d68e88ef686
    

This opens the XML file describing your VM in your $EDITOR. Change the first line from:

<domain type='kvm'>
    

    to

    <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
    

and add the following at the very end, just before the closing </domain> tag:

    <qemu:commandline>
    	<qemu:arg value='-drive'/>
    	<qemu:arg value='file=/path/to/nvme.img,if=none,id=D22'/>
    	<qemu:arg value='-device'/>
    	<qemu:arg value='nvme,drive=D22,serial=1234'/>
    </qemu:commandline>
    

Then save and exit the editor, and start your VM again.

    Have a look in /dev:

    $ dmesg | grep nvme
    [    4.906319] nvme nvme0: pci function 0000:00:0a.0
    
    $ ls /dev/nvme*
    /dev/nvme0
    
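If you install nvme-cli inside the guest, you can poke at the emulated controller a bit more; for example (assuming the device shows up as /dev/nvme0 as above):

# list all NVMe controllers and namespaces visible to the guest
nvme list
# show the controller identify data, including the serial number set on the QEMU command line
nvme id-ctrl /dev/nvme0
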

Have fun playing with NVMe!

  • Get these includes right

    Have you ever thought about your #include directives if everything is compiling fine? Sometimes you include header files you don’t need. Often you miss some, and your code works because other header files included the ones you need.

Wouldn’t it be nice if there were an easy way to tell whether you got your #includes right? In this post I will show you how I explored Include What You Use on a new test case for the Linux Test Project (LTP).

    Include What You Use

include-what-you-use builds on clang to analyze C and C++ files for their #include directives.

    At least for my distribution (currently openSUSE Leap 42.3) no packages are available, so I followed the installation instructions for my clang version, which is 3.8. Be sure to read and follow the instructions in README.md. I got it up and running in no time, but instead of copying the clang header files over to the install location, I just symlinked them.

    Using include-what-you-use on a LTP test case

Currently, I am creating a bunch of regression tests for already-fixed CVEs in the Linux kernel, and since I want to submit my patches for review, I want them to be as clean as possible.

    If you want to change your compiler for building LTP, you would normally do the following (see the documentation):

    make -k CC=include-what-you-use
    

However, I strongly suggest building the entire LTP with a normal C compiler first. The code base is huge, and you only want to check one file at a time, don’t you? Oh, and you can use clang to build LTP as well:

    $ make
    or 
    $ make CC=clang 
    

    Now I saved my new test case as testcases/cve/cve-2017-16939.c and finally analyzed it:

    $ make -k CC=include-what-you-use
    include-what-you-use -g -O2 -g -O2 -fno-strict-aliasing -pipe -Wall -W -Wold-style-definition -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -I../../include -I../../include -I../../include/old/   -L../../lib  cve-2017-16939.c   -lltp -o cve-2017-16939
    warning: -lltp: 'linker' input unused
    warning: argument unused during compilation: '-L../../lib'
    
    cve-2017-16939.c should add these lines:
    #include <unistd.h>           // for usleep, pid_t
    
    cve-2017-16939.c should remove these lines:
    - #include <netinet/in.h>  // lines 31-31
    - #include <sys/wait.h>  // lines 30-30
    
    The full include-list for cve-2017-16939.c:
    #include <linux/netlink.h>    // for nlmsghdr, sockaddr_nl, NETLINK_XFRM
    #include <linux/xfrm.h>       // for XFRMNLGRP_NONE, XFRM_MSG_GETPOLICY
    #include <sched.h>            // for unshare, CLONE_NEWNET, CLONE_NEWUSER
    #include <stdlib.h>           // for exit, WIFEXITED
    #include <string.h>           // for memset
    #include <sys/socket.h>       // for socket, AF_NETLINK, PF_NETLINK, SOCK_RAW
    #include "tst_res_flags.h"    // for TCONF, TFAIL, TPASS
    #include "tst_safe_macros.h"  // for SAFE_MALLOC, SAFE_WAITPID
    #include "tst_safe_net.h"     // for SAFE_SENDTO, SAFE_SETSOCKOPT
    #include "tst_test.h"         // for tst_brk, tst_res, SAFE_FORK, tst_test
    ---
    

    Looking at the results

So, include-what-you-use tells me to add an include for unistd.h. That is absolutely right; the build only succeeded without it by accident, so this include should be added. The header sys/wait.h, on the other hand, was included by accident: I originally had a call to waitpid() in the code, but LTP has a SAFE_WAITPID() macro, and when I replaced the call I forgot to remove the include. If I removed netinet/in.h, however, the build would fail - this suggestion is a mistake of include-what-you-use. If you use the fix_includes.py script, you would want to add // IWYU pragma: keep at the end of that line, as shown below. In my case, I just ignored this one.
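For illustration, the annotated line in the test case would then look like this, which tells include-what-you-use (and fix_includes.py) to leave the include alone:

#include <netinet/in.h>  // IWYU pragma: keep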

    The other includes were found to be correct, and the output even shows the symbols that a header file is needed for.

    Conclusion

Why did I do this? Well, primarily because I stumbled upon include-what-you-use, was curious, and played with it. However, it is at least good practice not to include header files you don’t need, as they could, at least theoretically, interfere with your software in ways you don’t want. On the other hand, relying on implicitly included headers may be a portability issue.

    I think I will use this tool more regularly now.