The workaround for "Some distros patch Linux" mentioned in
local-derivation-goal.cc will not help in the `--option
sandbox-fallback false` case. To provide the user more helpful
guidance on how to get the sandbox working, let's check to see if the
`/proc` node created by the aforementioned patch is present and
configured in a way that will cause us problems. If so, give the user
a suggestion for how to troubleshoot the problem.
local-derivation-goal.cc contains a comment stating that "Some distros
patch Linux to not allow unprivileged user namespaces." Let's give a
pointer to a common version of this patch for those who want more
details about this failure mode.
This commit causes nix to `warn()` if sandbox setup has failed and
`/proc/self/ns/user` does not exist. This is usually a sign that the
kernel was compiled without `CONFIG_USER_NS=y`, which is required for
sandboxing.
This commit uses `warn()` to notify the user if sandbox setup fails
with errno==EPERM and /proc/sys/user/max_user_namespaces is missing or
zero, since that is at least part of the reason why sandbox setup
failed.
Note that `echo -n 0 > /proc/sys/user/max_user_namespaces` or
equivalent at boot time has been the recommended mitigation for
several Linux LPE vulnerabilities over the past few years. Many users
have applied this mitigation and then forgotten that they have done
so.
The failure modes for nix's sandboxing setup are pretty complicated.
When nix is unable to set up the sandbox, let's provide more detail
about what went wrong. Specifically:
* Make sure the error message includes the word "sandbox" so the user
knows that the failure was related to sandboxing.
* If `--option sandbox-fallback false` was provided, and removing it
would have allowed further attempts to make progress, let the user
know.
Specifically, if we're not root and the daemon socket does not exist,
then we use ~/.local/share/nix/root as a chroot store. This enables
non-root users to download nix-static and have it work out of the box,
e.g.
ubuntu@ip-10-13-1-146:~$ ~/nix run nixpkgs#hello
warning: '/nix' does not exists, so Nix will use '/home/ubuntu/.local/share/nix/root' as a chroot store
Hello, world!
With this, Nix will write a copy of the sandbox shell to /bin/sh in
the sandbox rather than bind-mounting it from the host filesystem.
This makes /bin/sh work out of the box with nix-static, i.e. you no
longer get
/nix/store/qa36xhc5gpf42l3z1a8m1lysi40l9p7s-bootstrap-stage4-stdenv-linux/setup: ./configure: /bin/sh: bad interpreter: No such file or directory
This allows changes to nix-cache-info to be picked up by existing
clients. Previously, the only way for this to happen would be for
clients to delete binary-cache-v6.sqlite, which is quite awkward for
users.
On the other hand, updates to nix-cache-info should be pretty rare,
hence the choice of a fairly long TTL. Configurability is probably not
useful enough to warrant implementing it.
The manpage for `getgrouplist` says:
> If the number of groups of which user is a member is less than or
> equal to *ngroups, then the value *ngroups is returned.
>
> If the user is a member of more than *ngroups groups, then
> getgrouplist() returns -1. In this case, the value returned in
> *ngroups can be used to resize the buffer passed to a further
> call getgrouplist().
In our original code, however, we allocated a list of size `10` and, if
`getgrouplist` returned `-1` threw an exception. In practice, this
caused the code to fail for any user belonging to more than 10 groups.
While unusual for single-user systems, large companies commonly have a
huge number of POSIX groups users belong to, causing this issue to crop
up and make multi-user Nix unusable in such settings.
The fix is relatively simple, when `getgrouplist` fails, it stores the
real number of GIDs in `ngroups`, so we must resize our list and retry.
Only then, if it errors once more, we can raise an exception.
This should be backported to, at least, 2.9.x.
Without the change any CA deletion triggers linear scan on large
RealisationsRefs table:
sqlite>.eqp full
sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 );
QUERY PLAN
|--SCAN RealisationsRefs
`--LIST SUBQUERY 1
`--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?)
With the change it gets turned into a lookup:
sqlite> CREATE INDEX IndexRealisationsRefsRealisationReference on RealisationsRefs(realisationReference);
sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 );
QUERY PLAN
|--SEARCH RealisationsRefs USING INDEX IndexRealisationsRefsRealisationReference (realisationReference=?)
`--LIST SUBQUERY 1
`--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?)
If the derivation `foo` depends on `bar`, and they both have the same
output path (because they are CA derivations), then this output path
will depend both on the realisation of `foo` and of `bar`, which
themselves depend on each other.
This confuses SQLite which isn’t able to automatically solve this
diamond dependency scheme.
Help it by adding a trigger to delete all the references between the
relevant realisations.
Fix#5320
Otherwise the clang builds fail because the constructor of `SQLiteBusy`
inherits it, `SQLiteError::_throw` tries to call it, which fails.
Strangely, gcc works fine with it. Not sure what the correct behavior is
and who is buggy here, but either way, making it public is at the worst
a reasonable workaround