Doing a chdir() is a bad idea in multi-threaded programs, leading to
failures such as
error: cannot connect to daemon at ‘/nix/var/nix/daemon-socket/socket’: No such file or directory
Since Linux doesn't have a connectat() syscall like FreeBSD, there is
no way we can support this in a race-free way.
This enables an optimisation in hydra-queue-runner, preventing a
download of a NAR it just uploaded to the cache when reading files
like hydra-build-products.
This allows a RemoteStore object to be used safely from multiple
threads concurrently. It will make multiple daemon connections if
necessary.
Note: pool.hh and sync.hh have been copied from the Hydra source tree.
Previously, to build a derivation remotely, we had to copy the entire
closure of the .drv file to the remote machine, even though we only
need the top-level derivation. This is very wasteful: the closure can
contain thousands of store paths, and in some Hydra use cases, include
source paths that are very large (e.g. Git/Mercurial checkouts).
So now there is a new operation, StoreAPI::buildDerivation(), that
performs a build from an in-memory representation of a derivation
(BasicDerivation) rather than from a on-disk .drv file. The only files
that need to be in the Nix store are the sources of the derivation
(drv.inputSrcs), and the needed output paths of the dependencies (as
described by drv.inputDrvs). "nix-store --serve" exposes this
interface.
Note that this is a privileged operation, because you can construct a
derivation that builds any store path whatsoever. Fixing this will
require changing the hashing scheme (i.e., the output paths should be
computed from the other fields in BasicDerivation, allowing them to be
verified without access to other derivations). However, this would be
quite nice because it would allow .drv-free building (e.g. "nix-env
-i" wouldn't have to write any .drv files to disk).
Fixes#173.
Hello!
The patch below adds a ‘verifyStore’ RPC with the same signature as the
current LocalStore::verifyStore method.
Thanks,
Ludo’.
>From aef46c03ca77eb6344f4892672eb6d9d06432041 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>
Date: Mon, 1 Jun 2015 23:17:10 +0200
Subject: [PATCH] Add a 'verifyStore' remote procedure call.
This makes things more efficient (we don't need to use an SSH master
connection, and we only start a single remote process) and gets rid of
locking issues (the remote nix-store process will keep inputs and
outputs locked as long as they're needed).
It also makes it more or less secure to connect directly to the root
account on the build machine, using a forced command
(e.g. ‘command="nix-store --serve --write"’). This bypasses the Nix
daemon and is therefore more efficient.
Also, don't call nix-store to import the output paths.
When copying a large path causes the daemon to run out of memory, you
now get:
error: Nix daemon out of memory
instead of:
error: writing to file: Broken pipe
The flag ‘--check’ to ‘nix-store -r’ or ‘nix-build’ will cause Nix to
redo the build of a derivation whose output paths are already valid.
If the new output differs from the original output, an error is
printed. This makes it easier to test if a build is deterministic.
(Obviously this cannot catch all sources of non-determinism, but it
catches the most common one, namely the current time.)
For example:
$ nix-build '<nixpkgs>' -A patchelf
...
$ nix-build '<nixpkgs>' -A patchelf --check
error: derivation `/nix/store/1ipvxsdnbhl1rw6siz6x92s7sc8nwkkb-patchelf-0.6' may not be deterministic: hash mismatch in output `/nix/store/4pc1dmw5xkwmc6q3gdc9i5nbjl4dkjpp-patchelf-0.6.drv'
The --check build fails if not all outputs are valid. Thus the first
call to nix-build is necessary to ensure that all outputs are valid.
The current outputs are left untouched: the new outputs are either put
in a chroot or diverted to a different location in the store using
hash rewriting.
As discovered by Todd Veldhuizen, the shell started by nix-shell has
its affinity set to a single CPU. This is because nix-shell connects
to the Nix daemon, which causes the affinity hack to be applied. So
we turn this off for Perl programs.
On a system with multiple CPUs, running Nix operations through the
daemon is significantly slower than "direct" mode:
$ NIX_REMOTE= nix-instantiate '<nixos>' -A system
real 0m0.974s
user 0m0.875s
sys 0m0.088s
$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real 0m2.118s
user 0m1.463s
sys 0m0.218s
The main reason seems to be that the client and the worker get moved
to a different CPU after every call to the worker. This patch adds a
hack to lock them to the same CPU. With this, the overhead of going
through the daemon is very small:
$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real 0m1.074s
user 0m0.809s
sys 0m0.098s
So if a path is not garbage solely because it's reachable from a root
due to the gc-keep-outputs or gc-keep-derivations settings, ‘nix-store
-q --roots’ now shows that root.
With this flag, if any valid derivation output is missing or corrupt,
it will be recreated by using a substitute if available, or by
rebuilding the derivation. The latter may use hash rewriting if
chroots are not available.
To implement binary caches efficiently, Hydra needs to be able to map
the hash part of a store path (e.g. "gbg...zr7") to the full store
path (e.g. "/nix/store/gbg...kzr7-subversion-1.7.5"). (The binary
cache mechanism uses hash parts as a key for looking up store paths to
ensure privacy.) However, doing a search in the Nix store for
/nix/store/<hash>* is expensive since it requires reading the entire
directory. queryPathFromHashPart() prevents this by doing a cheap
database lookup.
queryValidPaths() combines multiple calls to isValidPath() in one.
This matters when using the Nix daemon because it reduces latency.
For instance, on "nix-env -qas \*" it reduces execution time from 5.7s
to 4.7s (which is indistinguishable from the non-daemon case).
Getting substitute information using the binary cache substituter has
non-trivial latency overhead. A package or NixOS system configuration
can have hundreds of dependencies, and in the worst case (when the
local info cache is empty) we have to do a separate HTTP request for
each of these. If the ping time to the server is t, getting N info
files will take tN seconds; e.g., with a ping time of 0.1s to
nixos.org, sequentially downloading 1000 info files (a typical NixOS
config) will take at least 100 seconds.
To fix this problem, the binary cache substituter can now perform
requests in parallel. This required changing the substituter
interface to support a function querySubstitutablePathInfos() that
queries multiple paths at the same time, and rewriting queryMissing()
to take advantage of parallelism. (Due to local caching,
parallelising queryMissing() is sufficient for most use cases, since
it's almost always called before building a derivation and thus fills
the local info cache.)
For example, parallelism speeds up querying all 1056 paths in a
particular NixOS system configuration from 116s to 2.6s. It works so
well because the eccentricity of the top-level derivation in the
dependency graph is only 9. So we only need 10 round-trips (when
using an unlimited number of parallel connections) to get everything.
Currently we do a maximum of 150 parallel connections to the server.
Thus it's important that the binary cache server (e.g. nixos.org) has
a high connection limit. Alternatively we could use HTTP pipelining,
but WWW::Curl doesn't support it and libcurl has a hard-coded limit of
5 requests per pipeline.
We can't open a SQLite database if the disk is full. Since this
prevents the garbage collector from running when it's most needed, we
reserve some dummy space that we can free just before doing a garbage
collection. This actually revives some old code from the Berkeley DB
days.
Fixes#27.
* Buffer the HashSink. This speeds up hashing a bit because it
prevents lots of calls to the hash update functions (e.g. nix-hash
went from 9.3s to 8.7s of user time on the closure of my
/var/run/current-system).
daemon (which is an error), print a nicer error message than
"Connection reset by peer" or "broken pipe".
* In the daemon, log errors that occur during request parameter
processing.
hook script proper, and the stdout/stderr of the builder. Only the
latter should be saved in /nix/var/log/nix/drvs.
* Allow the verbosity to be set through an option.
* Added a flag --quiet to lower the verbosity level.
An "using namespace std" was added locally in those functions that refer to
names from <cstring>. That is not pretty, but it's a very portable solution,
because strcpy() and friends will be found in both the 'std' and in the global
namespace.
(Linux) machines no longer maintain the atime because it's too
expensive, and on the machines where --use-atime is useful (like the
buildfarm), reading the atimes on the entire Nix store takes way too
much time to make it practical.
SHA-256 outputs of fixed-output derivations. I.e. they now produce
the same store path:
$ nix-store --add x
/nix/store/j2fq9qxvvxgqymvpszhs773ncci45xsj-x
$ nix-store --add-fixed --recursive sha256 x
/nix/store/j2fq9qxvvxgqymvpszhs773ncci45xsj-x
the latter being the same as the path that a derivation
derivation {
name = "x";
outputHashAlgo = "sha256";
outputHashMode = "recursive";
outputHash = "...";
...
};
produces.
This does change the output path for such fixed-output derivations.
Fortunately they are quite rare. The most common use is fetchsvn
calls with SHA-256 hashes. (There are a handful of those is
Nixpkgs, mostly unstable development packages.)
* Documented the computation of store paths (in store-api.cc).
again. (After the previous substituter mechanism refactoring I
didn't update the code that obtains the references of substitutable
paths.) This required some refactoring: the substituter programs
are now kept running and receive/respond to info requests via
stdin/stdout.
This isn't usually a problem, except that it causes tests to fail
when performed in a directory with a very long path name. So chdir
to the socket directory and use a relative path name.
need any info on substitutable paths, we just call the substituters
(such as download-using-manifests.pl) directly. This means that
it's no longer necessary for nix-pull to register substitutes or for
nix-channel to clear them, which makes those operations much faster
(NIX-95). Also, we don't have to worry about keeping nix-pull
manifests (in /nix/var/nix/manifests) and the database in sync with
each other.
The downside is that there is some overhead in calling an external
program to get the substitutes info. For instance, "nix-env -qas"
takes a bit longer.
Abolishing the substitutes table also makes the logic in
local-store.cc simpler, as we don't need to store info for invalid
paths. On the downside, you cannot do things like "nix-store -qR"
on a substitutable but invalid path (but nobody did that anyway).
* Never catch interrupts (the Interrupted exception).
--export' into the Nix store, and optionally check the cryptographic
signatures against /nix/etc/nix/signing-key.pub. (TODO: verify
against a set of public keys.)
path. This is like `nix-store --dump', only it also dumps the
meta-information of the store path (references, deriver). Will add
a `--sign' flag later to add a cryptographic signature, which we
will use for exchanging store paths between build farm machines in a
secure manner.
from a source directory. All files for which a predicate function
returns true are copied to the store. Typical example is to leave
out the .svn directory:
stdenv.mkDerivation {
...
src = builtins.filterSource
(path: baseNameOf (toString path) != ".svn")
./source-dir;
# as opposed to
# src = ./source-dir;
}
This is important because the .svn directory influences the hash in
a rather unpredictable and variable way.
`nix-store --delete'. But unprivileged users are not allowed to
ignore liveness.
* `nix-store --delete --ignore-liveness': ignore the runtime roots as
well.
process, so forward the operation.
* Spam the user about GC misconfigurations (NIX-71).
* findRoots: skip all roots that are unreadable - the warnings with
which we spam the user should be enough.
processes can register indirect roots. Of course, there is still
the problem that the garbage collector can only read the targets of
the indirect roots when it's running as root...
* SIGIO -> SIGPOLL (POSIX calls it that).
* Use sigaction instead of signal to register the SIGPOLL handler.
Sigaction is better defined, and a handler registered with signal
appears not to interrupt fcntl(..., F_SETLKW, ...), which is bad.
via the Unix domain socket in /nix/var/nix/daemon.socket. The
server forks a worker process per connection.
* readString(): use the heap, not the stack.
* Some protocol fixes.
mode. Presumably nix-worker would be setuid to the Nix store user.
The worker performs all operations on the Nix store and database, so
the caller can be completely unprivileged.
This is already much more secure than the old setuid scheme, since
the worker doesn't need to do Nix expression evaluation and so on.
Most importantly, this means that it doesn't need to access any user
files, with all resulting security risks; it only performs pure
store operations.
Once this works, it is easy to move to a daemon model that forks off
a worker for connections established through a Unix domain socket.
That would be even more secure.