• This article is about how the kernel prevents a process from gaining privileges across execve. It was written to understand why the Android init source code calls prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0).

  • The original text follows, with an explanatory rendering after each passage:

    The execve system call can grant a newly-started program privileges that
    its parent did not have.

    In other words, execve can give a newly started program privileges that its parent process did not have.

    The most obvious examples are setuid/setgid
    programs and file capabilities. To prevent the parent program from
    gaining these privileges as well, the kernel and user code must be
    careful to prevent the parent from doing anything that could subvert the
    child.

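    The typical cases are executables with the setuid or setgid bit and executables that carry file capabilities. Because the child ends up more privileged than the parent, both the kernel and user-space code must make sure the parent cannot manipulate the child and thereby obtain those privileges for itself.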

    For example:

    • The dynamic loader handles LD_* environment variables differently if
      a program is setuid.

    • chroot is disallowed to unprivileged processes, since it would allow
      /etc/passwd to be replaced from the point of view of a process that
      inherited chroot.

    • The exec code has special handling for ptrace.

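That is:

  • When a program is setuid, the dynamic loader treats LD_* environment variables differently, so the caller cannot use them to influence what the privileged program loads;

  • Unprivileged processes are not allowed to call chroot, because a process that inherited the new root would see a different, possibly attacker-supplied, /etc/passwd;

  • The exec code handles ptrace specially.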
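The first item can be observed from inside a program. On Linux the kernel reports a privilege-gaining exec to user space through the AT_SECURE entry of the auxiliary vector, and glibc's loader uses it to enter secure-execution mode. The sketch below assumes glibc (or another libc that provides getauxval); the helper names and the MYAPP_PLUGIN_DIR variable are hypothetical and only illustrate how an application might apply the same caution to its own environment variables.

```c
#include <stdlib.h>
#include <sys/auxv.h>

/* Nonzero when the kernel marked this exec as security-sensitive,
 * e.g. a setuid/setgid binary run by a different user. */
int running_in_secure_mode(void)
{
    return getauxval(AT_SECURE) != 0;
}

/* Hypothetical helper: honour an environment override only when the
 * program did not gain privileges at exec time. */
const char *plugin_dir_from_env(void)
{
    if (running_in_secure_mode())
        return NULL;                      /* ignore untrusted environment */
    return getenv("MYAPP_PLUGIN_DIR");    /* hypothetical variable name */
}
```

glibc also offers secure_getenv(3), which performs this check internally.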

These are all ad-hoc fixes. The no_new_privs bit (since Linux 3.5) is a
new, generic mechanism to make it safe for a process to modify its
execution environment in a manner that persists across execve. Any task
can set no_new_privs. Once the bit is set, it is inherited across fork,
clone, and execve and cannot be unset. With no_new_privs set, execve
promises not to grant the privilege to do anything that could not have
been done without the execve call. For example, the setuid and setgid
bits will no longer change the uid or gid; file capabilities will not
add to the permitted set, and LSMs will not relax constraints after
execve.

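All of the above are ad-hoc fixes. Since Linux 3.5 the no_new_privs bit provides a new, generic mechanism that makes it safe for a process to change its execution environment in a way that persists across execve. Any task may set the bit. Once set, it is inherited across fork, clone and execve, and it can never be cleared. With the bit set, execve promises not to grant any privilege that could not have been obtained without the execve call. For example:
the setuid and setgid bits no longer change the uid or gid;
file capabilities no longer add to the permitted set, and Linux Security Modules (LSMs) do not relax constraints after execve.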
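The "inherited and cannot be unset" behaviour can be checked with a small experiment. The following is a sketch, assuming Linux 3.5 or later; the function names are made up for illustration. It does its work in a forked child so that the caller's own no_new_privs state is left alone.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for pid and return its exit code, or -1 if it did not exit normally. */
static int wait_exit_code(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Returns 0 when the bit behaves as documented, a small positive code
 * naming the failed step otherwise, and -1 on fork/wait errors. */
int check_no_new_privs_is_sticky(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
            _exit(1);
        /* Any value other than 1 is rejected, so the bit cannot be cleared. */
        errno = 0;
        if (prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0) != -1 || errno != EINVAL)
            _exit(2);
        pid_t grandchild = fork();
        if (grandchild < 0)
            _exit(3);
        if (grandchild == 0)   /* the bit must have been inherited across fork */
            _exit(prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1 ? 0 : 4);
        _exit(wait_exit_code(grandchild) == 0 ? 0 : 4);
    }
    return wait_exit_code(child);
}
```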

To set no_new_privs, use

prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0).

To set the no_new_privs bit, make this call:

prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)
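In a real program the call is usually wrapped with error handling, and the current state can be read back with PR_GET_NO_NEW_PRIVS. A minimal sketch follows; the two function names are illustrative and not taken from Android init.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Set no_new_privs on the calling thread. Returns 0 on success, -1 on error. */
int enable_no_new_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) {
        fprintf(stderr, "PR_SET_NO_NEW_PRIVS: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

/* Returns 1 if the bit is set, 0 if it is clear, -1 on error
 * (for example on a kernel older than 3.5). */
int no_new_privs_is_set(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

Since the bit cannot be cleared afterwards, a launcher would normally set it as late as possible, just before it execs the target program.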

Be careful, though: LSMs might also not tighten constraints on exec
in no_new_privs mode. (This means that setting up a general-purpose
service launcher to set no_new_privs before execing daemons may
interfere with LSM-based sandboxing.)

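A caveat: with no_new_privs set, an LSM might also refrain from tightening constraints at exec time.
(So a general-purpose service launcher that sets no_new_privs before exec'ing its daemons may interfere with LSM-based sandboxing.)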

Note that no_new_privs does not prevent privilege changes that do not
involve execve. An appropriately privileged task can still call
setuid(2) and receive SCM_RIGHTS datagrams.

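Note that no_new_privs only covers execve; it does not prevent privilege changes that happen by other means. A task that is appropriately privileged can still call setuid(2), and a task can still receive file descriptors in SCM_RIGHTS datagrams.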
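To illustrate the SCM_RIGHTS case: a child that has set no_new_privs can still be handed a file descriptor it did not have before. This is a sketch rather than a reference implementation. All function names are invented, and it uses a SOCK_STREAM socket pair instead of datagrams so that the child sees end-of-file if the parent fails to send.

```c
#include <string.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Fill msg so that it carries one payload byte and one control buffer. */
static void init_msg(struct msghdr *msg, struct iovec *iov, union fd_cmsg *u)
{
    memset(msg, 0, sizeof(*msg));
    memset(u, 0, sizeof(*u));
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = u->buf;
    msg->msg_controllen = sizeof(u->buf);
}

static int send_fd(int sock, int fd)
{
    char byte = 'x';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;

    init_msg(&msg, &iov, &u);
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    int fd;

    init_msg(&msg, &iov, &u);
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Returns 0 if a child with no_new_privs set received a pipe descriptor
 * over the socket and wrote to it, -1 otherwise. */
int nnp_child_can_receive_fd(void)
{
    int sv[2], pipefd[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0 || pipe(pipefd) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Drop every inherited pipe descriptor before locking down. */
        close(sv[0]);
        close(pipefd[0]);
        close(pipefd[1]);
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
            _exit(1);
        int fd = recv_fd(sv[1]);          /* still works with the bit set */
        if (fd < 0)
            _exit(2);
        _exit(write(fd, "k", 1) == 1 ? 0 : 3);
    }

    close(sv[1]);
    int sent = send_fd(sv[0], pipefd[1]);
    close(sv[0]);
    close(pipefd[1]);

    int status = 0;
    int waited = waitpid(pid, &status, 0) == pid;
    char c = 0;
    ssize_t n = read(pipefd[0], &c, 1);
    close(pipefd[0]);

    if (sent != 0 || !waited || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (n == 1 && c == 'k') ? 0 : -1;
}
```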

There are two main use cases for no_new_privs so far:

  • Filters installed for the seccomp mode 2 sandbox persist across
    execve and can change the behavior of newly-executed programs.
    Unprivileged users are therefore only allowed to install such filters
    if no_new_privs is set.

  • By itself, no_new_privs can be used to reduce the attack surface
    available to an unprivileged user. If everything running with a
    given uid has no_new_privs set, then that uid will be unable to
    escalate its privileges by directly attacking setuid, setgid, and
    fcap-using binaries; it will need to compromise something without the
    no_new_privs bit set first.

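So far no_new_privs has two main uses:

  • Filters installed for the seccomp mode 2 sandbox survive execve and can alter the behaviour of newly executed programs. Unprivileged users are therefore allowed to install such filters only when no_new_privs is set.
  • On its own, no_new_privs shrinks the attack surface available to an unprivileged user. If everything running under a given uid has no_new_privs set, that uid cannot escalate its privileges by directly attacking setuid, setgid, or file-capability (fcap) binaries; an attacker would first have to compromise something that does not have the bit set.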
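The seccomp rule can be tried out directly. The sketch below assumes a kernel built with seccomp filter support and uses a trivial allow-everything filter so that nothing is actually blocked; the function names are illustrative. Without no_new_privs an unprivileged task is expected to be refused with EACCES, whereas a task holding CAP_SYS_ADMIN (root, for instance) is allowed either way, so the first attempt accepts both outcomes.

```c
#include <errno.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Install a one-instruction seccomp filter that allows every syscall. */
static int install_allow_all_filter(void)
{
    struct sock_filter insns[] = {
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog, 0, 0);
}

/* Runs in a forked child so the caller stays unfiltered.
 * Returns 0 if the behaviour matches the description, a small positive
 * code naming the failed step otherwise, and -1 on fork/wait errors. */
int seccomp_filter_requires_nnp(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* First attempt, before setting the bit: either it succeeds
         * (privileged caller) or it must fail with EACCES. */
        if (install_allow_all_filter() != 0 && errno != EACCES)
            _exit(1);
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
            _exit(2);
        /* With the bit set, any task may install a filter. */
        _exit(install_allow_all_filter() == 0 ? 0 : 3);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```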

In the future, other potentially dangerous kernel features could become
available to unprivileged tasks if no_new_privs is set. In principle,
several options to unshare(2) and clone(2) would be safe when
no_new_privs is set, and no_new_privs + chroot is considerably less
dangerous than chroot by itself.

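Looking ahead, other potentially dangerous kernel features could be opened up to unprivileged tasks that have no_new_privs set. In principle, several unshare(2) and clone(2) options would be safe with the bit set, and chroot combined with no_new_privs is considerably less dangerous than chroot alone.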