nix-super/src/libstore/linux/fchmodat2-compat.hh

35 lines
1.3 KiB
C++
Raw Normal View History

libstore/local-derivation-goal: prohibit creating setuid/setgid binaries With Linux kernel >=6.6 & glibc 2.39 a `fchmodat2(2)` is available that isn't filtered away by the libseccomp sandbox. Being able to use this to bypass that restriction has surprising results for some builds such as lxc[1]: > With kernel ≥6.6 and glibc 2.39, lxc's install phase uses fchmodat2, > which slips through https://github.com/NixOS/nix/blob/9b88e5284608116b7db0dbd3d5dd7a33b90d52d7/src/libstore/build/local-derivation-goal.cc#L1650-L1663. > The fixupPhase then uses fchmodat, which fails. > With older kernel or glibc, setting the suid bit fails in the > install phase, which is not treated as fatal, and then the > fixup phase does not try to set it again. Please note that there are still ways to bypass this sandbox[2] and this is mostly a fix for the breaking builds. This change works by creating a syscall filter for the `fchmodat2` syscall (number 452 on most systems). The problem is that glibc 2.39 and seccomp 2.5.5 are needed to have the correct syscall number available via `__NR_fchmodat2` / `__SNR_fchmodat2`, but this flake is still on nixpkgs 23.11. To have this change everywhere and not dependent on the glibc this package is built against, I added a header "fchmodat2-compat.hh" that sets the syscall number based on the architecture. On most platforms its 452 according to glibc with a few exceptions: $ rg --pcre2 'define __NR_fchmodat2 (?!452)' sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h 58:#define __NR_fchmodat2 1073742276 sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h 67:#define __NR_fchmodat2 6452 sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h 62:#define __NR_fchmodat2 5452 sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h 70:#define __NR_fchmodat2 4452 sysdeps/unix/sysv/linux/alpha/arch-syscall.h 59:#define __NR_fchmodat2 562 I tested the change by adding the diff below as patch to `pkgs/tools/package-management/nix/common.nix` & then built a VM from the following config using my dirty nixpkgs master: { vm = { pkgs, ... }: { virtualisation.writableStore = true; virtualisation.memorySize = 8192; virtualisation.diskSize = 12 * 1024; nix.package = pkgs.nixVersions.nix_2_21; }; } The original issue can be triggered via nix build -L github:nixos/nixpkgs/d6dc19adbda4fd92fe9a332327a8113eaa843894#lxc \ --extra-experimental-features 'nix-command flakes' however the problem disappears with this patch applied. Closes #10424 [1] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2031073804 [2] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2030844251
2024-04-14 15:10:23 +03:00
/*
* Determine the syscall number for `fchmodat2`.
*
* On most platforms this is 452. Exceptions can be found on
* a glibc git checkout via `rg --pcre2 'define __NR_fchmodat2 (?!452)'`.
*
* The problem is that glibc 2.39 and libseccomp 2.5.5 are needed to
* get the syscall number. However, a Nix built against nixpkgs 23.11
* (glibc 2.38) should still have the issue fixed without depending
* on the build environment.
*
* To achieve that, the macros below try to determine the platform and
* set the syscall number which is platform-specific, but
* in most cases 452.
*
* TODO: remove this when 23.11 is EOL and the entire (supported) ecosystem
* is on glibc 2.39.
*/
#if HAVE_SECCOMP
# if defined(__alpha__)
# define NIX_SYSCALL_FCHMODAT2 562
# elif defined(__x86_64__) && SIZE_MAX == 0xFFFFFFFF // x32
# define NIX_SYSCALL_FCHMODAT2 1073742276
# elif defined(__mips__) && defined(__mips64) && defined(_ABIN64) // mips64/n64
# define NIX_SYSCALL_FCHMODAT2 5452
# elif defined(__mips__) && defined(__mips64) && defined(_ABIN32) // mips64/n32
# define NIX_SYSCALL_FCHMODAT2 6452
# elif defined(__mips__) && defined(_ABIO32) // mips32
# define NIX_SYSCALL_FCHMODAT2 4452
# else
# define NIX_SYSCALL_FCHMODAT2 452
# endif
#endif // HAVE_SECCOMP