Previously, to build a derivation remotely, we had to copy the entire
closure of the .drv file to the remote machine, even though we only
need the top-level derivation. This is very wasteful: the closure can
contain thousands of store paths, and in some Hydra use cases, include
source paths that are very large (e.g. Git/Mercurial checkouts).
So now there is a new operation, StoreAPI::buildDerivation(), that
performs a build from an in-memory representation of a derivation
(BasicDerivation) rather than from a on-disk .drv file. The only files
that need to be in the Nix store are the sources of the derivation
(drv.inputSrcs), and the needed output paths of the dependencies (as
described by drv.inputDrvs). "nix-store --serve" exposes this
interface.
Note that this is a privileged operation, because you can construct a
derivation that builds any store path whatsoever. Fixing this will
require changing the hashing scheme (i.e., the output paths should be
computed from the other fields in BasicDerivation, allowing them to be
verified without access to other derivations). However, this would be
quite nice because it would allow .drv-free building (e.g. "nix-env
-i" wouldn't have to write any .drv files to disk).
Fixes#173.
The following patch is an attempt to address this bug (see
<http://bugs.gnu.org/18994>) by preserving the supplementary groups of
build users in the build environment.
In practice, I would expect that supplementary groups would contain only
one or two groups: the build users group, and possibly the “kvm” group.
[Changed &at(0) to data() and removed tabs - Eelco]
Not substituting builds with "preferLocalBuild = true" was a bad idea,
because it didn't take the cost of dependencies into account. For
instance, if we can't substitute a fetchgit call, then we have to
download/build git and all its dependencies.
Partially reverts 5558652709 and adds a
new derivation attribute "allowSubstitutes" to specify whether a
derivation may be substituted.
Nixpkgs' writeTextAsFile does this:
mv "$textPath" "$n"
Since $textPath was owned by root, if $textPath is on the same
filesystem as $n, $n will be owned as root. As a result, the build
result was rejected as having suspicious ownership.
http://hydra.nixos.org/build/22836807
This hook can be used to set system-specific per-derivation build
settings that don't fit into the derivation model and are too complex or
volatile to be hard-coded into nix. Currently, the pre-build hook can
only add chroot dirs/files through the interface, but it also has full
access to the chroot root.
The specific use case for this is systems where the operating system ABI
is more complex than just the kernel-support system calls. For example,
on OS X there is a set of system-provided frameworks that can reliably
be accessed by any program linked to them, no matter the version the
program is running on. Unfortunately, those frameworks do not
necessarily live in the same locations on each version of OS X, nor do
their dependencies, and thus nix needs to know the specific version of
OS X currently running in order to make those frameworks available. The
pre-build hook is a perfect mechanism for doing just that.
This hook can be used to set system specific per-derivation build
settings that don't fit into the derivation model and are too complex or
volatile to be hard-coded into nix. Currently, the pre-build hook can
only add chroot dirs/files.
The specific use case for this is systems where the operating system ABI
is more complex than just the kernel-supported system calls. For
example, on OS X there is a set of system-provided frameworks that can
reliably be accessed by any program linked to them, no matter the
version the program is running on. Unfortunately, those frameworks do
not necessarily live in the same locations on each version of OS X, nor
do their dependencies, and thus nix needs to know the specific version
of OS X currently running in order to make those frameworks available.
The pre-build hook is a perfect mechanism for doing just that.
This was causing NixOS VM tests to fail mysteriously since
5ce50cd99e. Nscd could (sometimes) no
longer read /etc/hosts:
open("/etc/hosts", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
Probably there was some wacky interaction between the guest kernel and
the 9pfs implementation in QEMU.
Thus, for example, to get /bin/sh in a chroot, you only need to
specify /bin/sh=${pkgs.bash}/bin/sh in build-chroot-dirs. The
dependencies of sh will be added automatically.
I'm seeing hangs in Glibc's setxid_mark_thread() again. This is
probably because the use of an intermediate process to make clone()
safe from a multi-threaded program (see
524f89f139) is defeated by the use of
vfork(), since the intermediate process will have a copy of Glibc's
threading data structures due to the vfork(). So use a regular fork()
again.
If ‘build-use-chroot’ is set to ‘true’, fixed-output derivations are
now also chrooted. However, unlike normal derivations, they don't get
a private network namespace, so they can still access the
network. Also, the use of the ‘__noChroot’ derivation attribute is
no longer allowed.
Setting ‘build-use-chroot’ to ‘relaxed’ gives the old behaviour.
chroot only changes the process root directory, not the mount namespace root
directory, and it is well-known that any process with chroot capability can
break out of a chroot "jail". By using pivot_root as well, and unmounting the
original mount namespace root directory, breaking out becomes impossible.
Non-root processes typically have no ability to use chroot() anyway, but they
can gain that capability through the use of clone() or unshare(). For security
reasons, these syscalls are limited in functionality when used inside a normal
chroot environment. Using pivot_root() this way does allow those syscalls to be
put to their full use.
I.e., not readable to the nixbld group. This improves purity a bit for
non-chroot builds, because it prevents a builder from enumerating
store paths (i.e. it can only access paths it knows about).
For the "stdenv accidentally referring to bootstrap-tools", it seems
easier to specify the path that we don't want to depend on, e.g.
disallowedRequisites = [ bootstrapTools ];
It turns out that using clone() to start a child process is unsafe in
a multithreaded program. It can cause the initialisation of a build
child process to hang in setgroups(), as seen several times in the
build farm:
The reason is that Glibc thinks that the other threads of the parent
exist in the child, so in setxid_mark_thread() it tries to get a futex
that has been acquired by another thread just before the clone(). With
fork(), Glibc runs pthread_atfork() handlers that take care of this
(in particular, __reclaim_stacks()). But clone() doesn't do that.
Fortunately, we can use fork()+unshare() instead of clone() to set up
private namespaces.
See also https://www.mail-archive.com/lxc-devel@lists.linuxcontainers.org/msg03434.html.
The Nixpkgs stdenv prints some custom escape sequences to denote
nesting and stuff like that. Most terminals (e.g. xterm, konsole)
ignore them, but some do not (e.g. xfce4-terminal). So for the benefit
of the latter, filter them out.
While running Python 3’s test suite, we noticed that on some systems
/dev/pts/ptmx is created with permissions 0 (that’s the case with my
Nixpkgs-originating 3.0.43 kernel, but someone with a Debian-originating
3.10-3 reported not having this problem.)
There’s still the problem that people without
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y are screwed (as noted in build.cc),
but I don’t see how we could work around it.
Since the addition of build-max-log-size, a call to
handleChildOutput() can result in cancellation of a goal. This
invalidated the "j" iterator in the waitForInput() loop, even though
it was still used afterwards. Likewise for the maxSilentTime
handling.
Probably fixes#231. At least it gets rid of the valgrind warnings.
The daemon now creates /dev deterministically (thanks!). However, it
expects /dev/kvm to be present.
The patch below restricts that requirement (1) to Linux-based systems,
and (2) to systems where /dev/kvm already exists.
I’m not sure about the way to handle (2). We could special-case
/dev/kvm and create it (instead of bind-mounting it) in the chroot, so
it’s always available; however, it wouldn’t help much since most likely,
if /dev/kvm missing, then KVM support is missing.
We were relying on SubstitutionGoal's destructor releasing the lock,
but if a goal is a top-level goal, the destructor won't run in a
timely manner since its reference count won't drop to zero. So
release it explicitly.
Fixes#178.
The flag ‘--check’ to ‘nix-store -r’ or ‘nix-build’ will cause Nix to
redo the build of a derivation whose output paths are already valid.
If the new output differs from the original output, an error is
printed. This makes it easier to test if a build is deterministic.
(Obviously this cannot catch all sources of non-determinism, but it
catches the most common one, namely the current time.)
For example:
$ nix-build '<nixpkgs>' -A patchelf
...
$ nix-build '<nixpkgs>' -A patchelf --check
error: derivation `/nix/store/1ipvxsdnbhl1rw6siz6x92s7sc8nwkkb-patchelf-0.6' may not be deterministic: hash mismatch in output `/nix/store/4pc1dmw5xkwmc6q3gdc9i5nbjl4dkjpp-patchelf-0.6.drv'
The --check build fails if not all outputs are valid. Thus the first
call to nix-build is necessary to ensure that all outputs are valid.
The current outputs are left untouched: the new outputs are either put
in a chroot or diverted to a different location in the store using
hash rewriting.
*headdesk*
*headdesk*
*headdesk*
So since commit 22144afa8d, Nix hasn't
actually checked whether the content of a downloaded NAR matches the
hash specified in the manifest / NAR info file. Urghhh...
On Linux, Nix can build i686 packages even on x86_64 systems. It's not
enough to recognize this situation by settings.thisSystem, we also have
to consult uname(). E.g. we can be running on a i686 Debian with an
amd64 kernel. In that situation settings.thisSystem is i686-linux, but
we still need to change personality to i686 to make builds consistent.
On a system with multiple CPUs, running Nix operations through the
daemon is significantly slower than "direct" mode:
$ NIX_REMOTE= nix-instantiate '<nixos>' -A system
real 0m0.974s
user 0m0.875s
sys 0m0.088s
$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real 0m2.118s
user 0m1.463s
sys 0m0.218s
The main reason seems to be that the client and the worker get moved
to a different CPU after every call to the worker. This patch adds a
hack to lock them to the same CPU. With this, the overhead of going
through the daemon is very small:
$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real 0m1.074s
user 0m0.809s
sys 0m0.098s
This reverts commit 69b8f9980f.
The timeout should be enforced remotely. Otherwise, if the garbage
collector is running either locally or remotely, if will block the
build or closure copying for some time. If the garbage collector
takes too long, the build may time out, which is not what we want.
Also, on heavily loaded systems, copying large paths to and from the
remote machine can take a long time, also potentially resulting in a
timeout.
mount(2) with MS_BIND allows mounting a regular file on top of a regular
file, so there's no reason to only bind directories. This allows finer
control over just which files are and aren't included in the chroot
without having to build symlink trees or the like.
Signed-off-by: Shea Levy <shea@shealevy.com>
With C++ std::map, doing a comparison like ‘map["foo"] == ...’ has the
side-effect of adding a mapping from "foo" to the empty string if
"foo" doesn't exist in the map. So we ended up setting some
environment variables by accident.
In particular this means that "trivial" derivations such as writeText
are not substituted, reducing the number of GET requests to the binary
cache by about 200 on a typical NixOS configuration.
Before calling dumpPath(), we have to make sure the files are owned by
the build user. Otherwise, the build could contain a hard link to
(say) /etc/shadow, which would then be read by the daemon and
rewritten as a world-readable file.
This only affects systems that don't have hard link restrictions
enabled.
The assertion in canonicalisePathMetaData() failed because the
ownership of the path already changed due to the hash rewriting. The
solution is not to check the ownership of rewritten paths.
Issue #122.
Otherwise subsequent invocations of "--repair" will keep rebuilding
the path. This only happens if the path content differs between
builds (e.g. due to timestamps).
Don't pass --timeout / --max-silent-time to the remote builder.
Instead, let the local Nix process terminate the build if it exceeds a
timeout. The remote builder will be killed as a side-effect. This
gives better error reporting (since the timeout message from the
remote side wasn't properly propagated) and handles non-Nix problems
like SSH hangs.
I'm not sure if it has ever worked correctly. The line "lastWait =
after;" seems to mean that the timer was reset every time a build
produced log output.
Note that the timeout is now per build, as documented ("the maximum
number of seconds that a builder can run").
It turns out that in multi-user Nix, a builder may be able to do
ln /etc/shadow $out/foo
Afterwards, canonicalisePathMetaData() will be applied to $out/foo,
causing /etc/shadow's mode to be set to 444 (readable by everybody but
writable by nobody). That's obviously Very Bad.
Fortunately, this fails in NixOS's default configuration because
/nix/store is a bind mount, so "ln" will fail with "Invalid
cross-device link". It also fails if hard-link restrictions are
enabled, so a workaround is:
echo 1 > /proc/sys/fs/protected_hardlinks
The solution is to check that all files in $out are owned by the build
user. This means that innocuous operations like "ln
${pkgs.foo}/some-file $out/" are now rejected, but that already failed
in chroot builds anyway.
...where <XX> is the first two characters of the derivation.
Otherwise /nix/var/log/nix/drvs may become so large that we run into
all sorts of weird filesystem limits/inefficiences. For instance,
ext3/ext4 filesystems will barf with "ext4_dx_add_entry:1551:
Directory index full!" once you hit a few million files.
If a derivation has multiple outputs, then we only want to download
those outputs that are actuallty needed. So if we do "nix-build -A
openssl.man", then only the "man" output should be downloaded.
Likewise if another package depends on ${openssl.man}.
The tricky part is that different derivations can depend on different
outputs of a given derivation, so we may need to restart the
corresponding derivation goal if that happens.
For example, given a derivation with outputs "out", "man" and "bin":
$ nix-build -A pkg
produces ./result pointing to the "out" output;
$ nix-build -A pkg.man
produces ./result-man pointing to the "man" output;
$ nix-build -A pkg.all
produces ./result, ./result-man and ./result-bin;
$ nix-build -A pkg.all -A pkg2
produces ./result, ./result-man, ./result-bin and ./result-2.
vfork() is just too weird. For instance, in this build:
http://hydra.nixos.org/build/3330487
the value fromHook.writeSide becomes corrupted in the parent, even
though the child only reads from it. At -O0 the problem goes away.
Probably the child is overriding some spilled temporary variable.
If I get bored I may implement using posix_spawn() instead.
With this flag, if any valid derivation output is missing or corrupt,
it will be recreated by using a substitute if available, or by
rebuilding the derivation. The latter may use hash rewriting if
chroots are not available.
This operation allows fixing corrupted or accidentally deleted store
paths by redownloading them using substituters, if available.
Since the corrupted path cannot be replaced atomically, there is a
very small time window (one system call) during which neither the old
(corrupted) nor the new (repaired) contents are available. So
repairing should be used with some care on critical packages like
Glibc.
Using the immutable bit is problematic, especially in conjunction with
store optimisation. For instance, if the garbage collector deletes a
file, it has to clear its immutable bit, but if the file has
additional hard links, we can't set the bit afterwards because we
don't know the remaining paths.
So now that we support having the entire Nix store as a read-only
mount, we may as well drop the immutable bit. Unfortunately, we have
to keep the code to clear the immutable bit for backwards
compatibility.
Note that this will only work if the client has a very recent Nix
version (post 15e1b2c223), otherwise the
--option flag will just be ignored.
Fixes#50.
This handles the chroot and build hook cases, which are easy.
Supporting the non-chroot-build case will require more work (hash
rewriting!).
Issue #21.