* Document manual migration for use-xdg-base-directories
As there's currently no automatic migration for use-xdg-base-directories
option, add instructions for manual migration to the option's
description.
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
When encountering a build error, Nix moves the output paths out of the
chroot into their final location (for “easier debugging of build
failures”). However this was broken for chroot stores as it was moving
it to the _logical_ location, not the _physical_ one.
Fix it by moving to the physical (_real_) location.
Fix https://github.com/NixOS/nix/issues/8395
The `hashed-mirrors` option did use to have this default value,
but it was removed and re-added with an empty default value.
As the autogenerated docs show the (actual) default values from code,
remove this incorrect reference from the docs.
I was updating my nix.conf settings after a few years and noticed this.
This adds a new configuration option to Nix, `always-allow-substitutes`,
whose effect is simple: it causes the `allowSubstitutes` attribute in
derivations to be ignored, and for substituters to always be used.
This is extremely valuable for users of Nix in CI, where usually
`nix-build-uncached` is used. There, derivations which disallow
substitutes cause headaches as the inputs for building already-cached
derivations need to be fetched to spuriously rebuild some simple text
file.
This option should be a good middle-ground, since it doesn't imply
rebuilding the world, such as the approach I took in
https://github.com/NixOS/nixpkgs/pull/221048
Using abstract types like can help cut down on compilation time, both
from scratch, and especially incremental builds during development. The
idea is that `worker-protocol.hh` can declare all the (de)serializers, but
only again abstract types; when code needs to use some (de)serializers, it can
include headers just for the data types it needs to (de)serialize.
`store-api.hh` in particular is a bit of a sledgehammer, and the data
types we want to serialize have their own headers.
This is the more typically way to do [Argument-dependent
lookup](https://en.cppreference.com/w/cpp/language/adl)-leveraging
generic serializers in C++. It makes the relationship between the `read`
and `write` methods more clear and rigorous, and also looks more
familiar to users coming from other languages that do not have C++'s
libertine ad-hoc overloading.
I am returning to this because during the review in
https://github.com/NixOS/nix/pull/6223, it came up as something that
would make the code easier to read --- easier today hopefully already,
but definitely easier if we were have multiple codified protocols with
code sharing between them as that PR seeks to accomplish.
If I recall correctly, the main criticism of this the first time around
(in 2020) was that having to specify the type when writing, e.g.
`WorkerProto<MyType>::write`, was too verbose and cumbersome. This is
now addressed with the `workerProtoWrite` wrapper function.
This method is also the way `nlohmann::json`, which we have used for a
number of years now, does its serializers, for what its worth.
This reverts commit 45a0ed82f0. That
commit in turn reverted 9ab07e99f5.
This is good in general, but in particular ensures when we heavily
refactor it in the next commit there is less likelihood for an
unintentional change in behavior to sneak in.
These items are not templates, and they declared in
`worker-protocol.hh`; therefore they should live in a
`worker-protocol.cc`.
Anything else needlessly diverges from convention. After all, it is not
like this code is only used in `remote-store.cc`; it is also used in
`daemon.cc`. There is no good reason to place it with the client
implementation or the server implementation when it used equally by
both.
This requires switching on SQLITE_OPEN_URI because there is no open flag to
make the database immutable. Without immutable, sqlite will still attempt to
create journal and wal files, even when the database is opened read-only.
https://www.sqlite.org/c3ref/open.html
The immutable parameter is a boolean query parameter that indicates that the
database file is stored on read-only media. When immutable is set, SQLite
assumes that the database file cannot be changed, even by a process with higher
privilege, and so the database is opened read-only and all locking and change
detection is disabled.
Nix does not manage the overlayfs mount point itself, but the correct
functioning of the overlay store does depend on this mount point being set up
correctly. Rather than just assume this is the case, check that the lowerdir
and upperdir options are what we expect them to be. This check is on by
default, but can be disabled if needed.
They were improperly added in 8a93b5a551.
They were not `.gitignore`d because they were stale in that commit --
build artifacts no longer used that name by then and so `.gitignore` was
updated accordingly.
Previously, we relied on the `shutdown()` function to terminate `accept()`
calls on a listening socket. However, this approach did not work on macOS as
the waiting `accept()` call is not considered a connected socket, resulting in
an `ENOTCONN` error. Instead, we now close the listening socket to terminate
the `accept()` call.
Additionally, we fixed a resource management issue where we set the
`daemonSocket` variable to -1, triggering resource cleanup and causing the
`stopDaemon` function to be called twice. This resulted in errors as the socket
was already closed by the time the second `stopDaemon` call was made. Instead of
setting `daemonSocket` to -1, we now release the socket using the `release()`
method on a unique pointer. This properly transfers ownership and allows for
correct resource cleanup.
These changes ensure proper behavior and resource management for the
recursive-nix feature on macOS.
In other words, use a plain `ContentAddress` not
`ContentAddressWithReferences` for `DerivationOutput::CAFixed`.
Supporting fixed output derivations with (fixed) references would be a
cool feature, but it is out of scope at this moment.
Recently, I encountered the "NAR info file 'xxxx' is corrupt" error
with my binary cache. The message is not helpful in determining, which
kind of corruption happened. The file, fetched with curl, looked
reasonably.
This commit adds more information to the error message, which should
allow debugging and hopefully fixing the problem.
We finally test the status quo of remote build trust in a number of
ways. We create a new experimental feature on `nix-daemon` to do so.
PR #3921, which improves the situation with trustless remote building,
will build upon these changes. This code / tests was pull out of there
to make this, so everything is easier to review, and in particular we
test before and after so the new behavior in that PR is readily apparent
from the testsuite diff alone.
Issues:
1. Features gated on disabled experimental settings should warn and be
ignored, not silently succeed.
2. Experimental settings in the same config "batch" (file or env var)
as the enabling of the experimental feature should work.
3. For (2), the order should not matter.
These are analogous to the issues @roberth caught with my changes for
arg handling, but they are instead for config handling.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
In many cases we are dealing with a collection of realisations, they are
all outputs of the same derivation. In that case, we don't need
"derivation hashes modulos" to be part of our map key, because the
output names alone will be unique. Those hashes are still part of the
realisation proper, so we aren't loosing any information, we're just
"normalizing our schema" by narrowing the "primary key".
Besides making our data model a bit "tighter" this allows us to avoid a
double `for` loop in `DerivationGoal::waiteeDone`. The inner `for` loop
was previously just to select the output we cared about without knowing
its hash. Now we can just select the output by name directly.
Note that neither protocol is changed as part of this: we are still
transferring `DrvOutputs` over the wire for `BuildResult`s. I would only
consider revising this once #6223 is merged, and we can mention protocol
versions inside factored-out serialization logic. Until then it is
better not change anything because it would come a the cost of code
reuse.
If my memory is correct, @edolstra objected to modifying `wantedOutputs`
upon falling back to doing a build (as we did before), because we should
only modify it in response to new requests --- *actual* wants --- and
not because we are "incidentally" building all the outptus beyond what
may have been requested.
That's a fair point, and the alternative is to replace the boolean soup
with proper enums: Instead of modifying `wantedOuputs` som more, we'll
modify `needsRestart` to indicate we are passed the need.
In https://github.com/NixOS/nix/pull/6311#discussion_r834863823, I
realized since derivation goals' wanted outputs can "grow" due to
overlapping dependencies (See `DerivationGoal::addWantedOutputs`, called
by `Worker::makeDerivationGoalCommon`), the previous bug fix had an
unfortunate side effect of causing more pointless rebuilds.
In paticular, we have this situation:
1. Goal made from `DerivedPath::Built { foo, {a} }`.
2. Goal gives on on substituting, starts building.
3. Goal made from `DerivedPath::Built { foo, {b} }`, in fact is just
modified original goal.
4. Though the goal had gotten as far as building, so all outputs were
going to be produced, `addWantedOutputs` no longer knows that and so
the goal is flagged to be restarted.
This might sound far-fetched with input-addressed drvs, where we usually
basically have all our goals "planned out" before we start doing
anything, but with CA derivation goals and especially RFC 92, where *drv
resolution* means goals are created after some building is completed, it
is more likely to happen.
So the first thing to do was restore the clearing of `wantedOutputs` we
used to do, and then filter the outputs in `buildPathsWithResults` to
only get the ones we care about.
But fix also has its own side effect in that the `DerivedPath` in the
`BuildResult` in `DerivationGoal` cannot be trusted; it is merely the
*first* `DerivedPath` for which this goal was originally created.
To remedy this, I made `BuildResult` be like it was before, and instead
made `KeyedBuildResult` be a subclass wit the path. Only
`buildPathsWithResults` returns `KeyedBuildResult`s, everything else
just becomes like it was before, where the "key" is unambiguous from
context.
I think separating the "primary key" field(s) from the other fields is
good practical in general anyways. (I would like to do the same thing
for `ValidPathInfo`.) Among other things, it allows constructions like
`std::map<Key, ThingWithKey>` where doesn't contain duplicate keys and
just precludes the possibility of those duplicate keys being out of
sync.
We might leverage the above someday to overload `buildPathsWithResults`
to take a *set* of return a *map* per the above.
-----
Unfortunately, we need to avoid C++20 strictness on designated
initializers.
(BTW
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2287r1.html
this offers some new syntax for this use-case. Hopefully this will be
adopted and we can eventually use it.)
No having that yet, maybe it would be better to not make
`KeyedBuildResult` a subclass to just avoid this.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
As requested by @roberth, it is good to call out the specific instances
we care about, which is `!` for the RPC protocols, and `^` for humans.
This doesn't take advantage of parametricity as much, but since the
human and computer interfaces are good to decouple anyways (we don't
care if they drift further apart over time in the slightest) some
separation and slight duplication is fine.
Also, unit test both round trips.
More progress on issue #5729
The method trivially generalizes to be store-implementation-agnostic, in
fact.
However, we force it to continue to be unimplemented with `RemoteStore`
and `LegacySSHStore` because the implementation we'd get via the
generalization is probably not the one users expect. This keeps our
hands untied to do it right going forward.
For more about the tension between the scheduler logic being
store-type-agnostic and remote stores doing their own scheduling, see
issues #5025 and #5056.
The warning message should produce an installable name that can be
passed to `nix build`, `nix path-info`, etc. again. Since the CLI
expects that the .drv path and the output names are separated by
a caret, the warning message must also separate the .drv path and output
names with a caret.
However, `DerivedPath::Built.to_string()` uses an exclamation point as
the separator instead. This commit adds a `separator` argument to the
to_string method.
This changes the warning message from:
If this command is now failing try again with '/nix/store/foo.drv!*'
to:
If this command is now failing try again with '/nix/store/foo.drv^*'
More progress on issue #5729.
Instead of having it by the default method in `Store` itself, have it be
the implementation in `DummyStore` and `LegacySSHStore`. Then just the
implementations which fail to provide the method pay the "penalty" of
dealing with the icky `unimplemented` function for non-compliance.
Combined with my other recent PRs, this finally makes `Store` have no
`unsupported` calls!
Getting the occasional SQLITE_BUSY is expected when the database is being
accessed concurrently. The retry will likely succeed so it is pointless to warn
immediately. Instead we track how long each retrySQLite block has been running,
and only begin warning after a second has elapsed (and then every 10 seconds
subsequently).
libutil is a dependency of libstore, so it should always be
initialized as such.
libutil is also a dependency of libmain. Being explicit about this
dependency might be good, but not worth the slight code complexity
until the library structure gets more advanced.
Part of an effort to make it easier to initialize the right things,
by moving code into the appropriate libraries.
This code is bad. We shouldn't unset variables in programs whose
children may need them. Fixing one issue at a time, so postponing.
See https://github.com/NixOS/nix/issues/7731
Part of an effort to make it easier to initialize the right things,
by moving code into the appropriate libraries.
It is required for the sandbox, which is a libstore responsibility;
not just libmain.
Part of an effort to make it easier to initialize the right things,
by moving code into the appropriate libraries.
Part of an effort to make it easier to initialize the right things,
by moving code into the appropriate libraries.
Using libstore without loading the config file is risky, as sqlite
may then be misconfigured. See https://github.com/cachix/cachix/issues/475
* Finish converting existing comments for internal API docs
99% of this was just reformatting existing comments. Only two exceptions:
- Expanded upon `BuildResult::status` compat note
- Split up file-level `symbol-table.hh` doc comments to get
per-definition docs
Also fixed a few whitespace goofs, turning leading tabs to spaces and
removing trailing spaces.
Picking up from #8133
* Fix two things from comments
* Use triple-backtick not indent for `dumpPath`
* Convert GNU-style `\`..'` quotes to markdown style in API docs
This will render correctly.
This is non-breaking change in the to-JSON direction. This *is* a
breaking change in the from-JSON direction, but we don't care, as that
is brand new in this PR.
`nix show-derivation --help` currently has the sole public documentation
of this format, it is updated accordingly.
This function returns true or false depending on whether the Nix client
is trusted or not. Mostly relevant when speaking to a remote store with
a daemon.
We include this information in `nix ping store` and `nix doctor`
Co-Authored-By: John Ericson <John.Ericson@Obsidian.Systems>
The code is not local-store-specific, so we should share it with all
stores. More uniform behavior is better, and a less store-specific
functionality is more maintainable.
This fixes a FIXME added in f73d911628 by @edolstra himself.
This fixes the issue that `nix-build`, without experimental feature
'nix-command' enabled, recommends the experimental CLI `nix log` to view
build logs. Now it'll recommend the stable `nix-store -l` CLI instead.
Fixes https://github.com/NixOS/nix/issues/8118
Pause logger before starting SSH connections, and resume it after the
connection is established, so that SSH password prompts are not erased
by the logger's updates.
Otherwise, when running as root and user namespaces are enabled,
opening the slave fails with EPERM.
Fixes "opening pseudoterminal slave: Permission denied" followed by a
hang (https://hydra.nixos.org/build/213104244), and "error: getting
sandbox mount namespace: No such file or directory" (#8072), which
happens when the child fails very quickly and consequently reading
/proc/<child>/ns fails.
This provides a platform-independent way to configure the SSL
certificates file in the Nix daemon. Previously we provided
instructions for overriding the environment variable in launchd, but
that obviously doesn't work with systemd. Now we can just tell users
to add
ssl-cert-file = /etc/ssl/my-certificate-bundle.crt
to their nix.conf.
Hopefully this fixes "unexpected EOF" failures on macOS
(#3137, #3605, #7242, #7702).
The problem appears to be that under some circumstances, macOS
discards the output written to the slave side of the
pseudoterminal. Hence the parent never sees the "sandbox initialized"
message from the child, even though it succeeded. The conditions are:
* The child finishes very quickly. That's why this bug is likely to
trigger in nix-env tests, since that uses a builtin builder. Adding
a short sleep before the child exits makes the problem go away.
* The parent has closed its duplicate of the slave file
descriptor. This shouldn't matter, since the child has a duplicate
as well, but it does. E.g. moving the close to the bottom of
startBuilder() makes the problem go away. However, that's not a
solution because it would make Nix hang if the child dies before
sending the "sandbox initialized" message.
* The system is under high load. E.g. "make installcheck -j16" makes
the issue pretty reproducible, while it's very rare under "make
installcheck -j1".
As a fix/workaround, we now open the pseudoterminal slave in the
child, rather than the parent. This removes the second condition
(i.e. the parent no longer needs to close the slave fd) and I haven't
been able to reproduce the "unexpected EOF" with this.
I saw this random failure in https://hydra.nixos.org/build/211811692:
error: opening /proc/15307/fd: No such process
while running nix-collect-garbage in a readfile-context.sh. This is
because we're not handling ESRCH errors reading /proc/<pid>/fd. So
just move the read inside the try/catch where we do handle it.
`nix copy` operations did not show progress. This is quite confusing.
Add a `progressSink` which displays the progress during `copyPaths`,
pretty much copied from `copyStorePath`.
Fixes https://github.com/NixOS/nix/issues/8000
The curl download can outlive DrvOutputSubstitutionGoal (if some other
error occurs), so at shutdown setting the promise to an exception will
fail because 'this' is no longer valid in the callback. This can
manifest itself as a segfault, "corrupted double-linked list" or hang.
We make sure the env var paths are actually set (ie. not "") before
sending them to the canonicalization function. If we forget to do so,
the user will end up facing a puzzled failed assertion internal error.
We issue a non-failing warning as a stop-gap measure. We could want to
revisit this to issue a detailed failing error message in the future.
Currently the valid key is only present when the path is invalid, which
makes checking path validity more complex than it should be. With this
change, the valid key can always be used to check if a path is valid
At the moment an Error is thrown that only holds an error message
regarding `nix-env` and `nix profile`. These tools make use of
builtins.buildEnv, but buildEnv is also used in other places. These
places are unrelated to Nix profiles, so the error shouldn't mention
these tools.
This generic error is now BuildEnvFileConflictError, which holds more
contextual information about the files that were conflicting while
building the environment.
Nixpkgs on aarch64-linux is currently stuck on GCC 9
(https://github.com/NixOS/nixpkgs/issues/208412) and using gcc11Stdenv
doesn't work either.
So use c++2a instead of c++20 for now. Unfortunately this means we
can't use some C++20 features for now (like std::span).
XDG Base Directory is a standard for locations for storing various
files. Nix has a few files which seem to fit in the standard, but
currently use a custom location directly in the user's ~, polluting
it:
- ~/.nix-profile
- ~/.nix-defexpr
- ~/.nix-channels
This commit adds a config option (use-xdg-base-directories) to follow
the XDG spec and instead use the following locations:
- $XDG_STATE_HOME/nix/profile
- $XDG_STATE_HOME/nix/defexpr
- $XDG_STATE_HOME/nix/channels
If $XDG_STATE_HOME is not set, it is assumed to be ~/.local/state.
Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
Co-authored-by: Tim Fenney <kodekata@gmail.com>
Co-authored-by: pasqui23 <pasqui23@users.noreply.github.com>
Co-authored-by: Artturin <Artturin@artturin.com>
Co-authored-by: John Ericson <Ericson2314@Yahoo.com>
Fixes#3898
The entire `BinaryCaches` row used to get replaced after it became
stale according to the `timestamp` column. In a concurrent scenario,
this leads to foreign key conflicts as different instances of the
in-process `state.caches` cache now differ, with the consequence that
the older process still tries to use the `id` number of the old record.
Furthermore, this phenomenon appears to have caused the cache for
actual narinfos to be erased about every week, while the default
ttl for narinfos was supposed to be 30 days.
This is slightly more accurate considering that an outdated record
may exist in the persistent cache. Possibly-outdated records are
quite relevant as they may be foreign keys to more recent information
that we want to keep, but we will not return them here.
In unprivileged podman containers, /proc is not fully visible (there
are other filesystems mounted on subdirectories of /proc). Therefore
we can't mount a new /proc in the sandbox that matches the PID
namespace of the sandbox. So this commit automatically disables
sandboxing if /proc is not fully visible.
This didn't work because sandboxing doesn't work in Docker. However,
the sandboxing check is done lazily - after clone(CLONE_NEWNS) fails,
we retry with sandboxing disabled. But at that point, we've already
done UID allocation under the assumption that user namespaces are
enabled.
So let's get rid of the "goto fallback" logic and just detect early
whether user / mount namespaces are enabled.
This commit also gets rid of a compatibility hack for some ancient
Linux kernels (<2.13).
tl;dr: With this 1 line change I was able to get a speedup of 1.5x on 1Gbit/s
wan connections by enabling zstd compression in nginx.
Also nix already supported all common compression format for http
transfer, webservers usually only enable them if they are advertised
through the Accept-Encoding header.
This pull requests makes nix advertises content compression support for
zstd, br, gzip and deflate.
It's particular useful to add transparent compression for binary caches
that serve packages from the host nix store in particular nix-serve,
nix-serve-ng and harmonia.
I tried so far gzip, brotli and zstd, whereas only zstd was able to bring
me performance improvements for 1Gbit/s WAN connections.
The following nginx configuration was used in combination with the
[zstd module](https://github.com/tokers/zstd-nginx-module) and
[harmonia](https://github.com/nix-community/harmonia/)
```nix
{
services.nginx.virtualHosts."cache.yourhost.com" = {
locations."/".extraConfig = ''
proxy_pass http://127.0.0.1:5000;
proxy_set_header Host $host;
proxy_redirect http:// https://;
proxy_http_version 1.1;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
zstd on;
zstd_types application/x-nix-archive;
'';
};
}
```
For testing I unpacked a linux kernel tarball to the nix store using
this command `nix-prefetch-url --unpack https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.1.8.tar.gz`.
Before:
```console
$ nix build && rm -rf /tmp/hello && time ./result/bin/nix copy --no-check-sigs --from https://cache.thalheim.io --to 'file:///tmp/hello?compression=none' '/nix/store/j42mahch5f0jvfmayhzwbb88sw36fvah-linux-6.1.8.tar.gz'
warning: Git tree '/scratch/joerg/nix' is dirty
real 0m18,375s
user 0m2,889s
sys 0m1,558s
```
After:
```console
$ nix build && rm -rf /tmp/hello && time ./result/bin/nix copy --no-check-sigs --from https://cache.thalheim.io --to 'file:///tmp/hello?compression=none' '/nix/store/j42mahch5f0jvfmayhzwb
b88sw36fvah-linux-6.1.8.tar.gz'
real 0m11,884s
user 0m4,130s
sys 0m1,439s
```
Signed-off-by: Jörg Thalheim <joerg@thalheim.io>
Update src/libstore/filetransfer.cc
Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
These settings are not needed for libstore at all, they are just used by
the nix daemon *command* for authorization on unix domain sockets. My
moving them to a new configuration struct just in that file, we avoid
them leaking anywhere else.
Also, it is good to break up the mammoth `Settings` struct in general.
Issue #5638 tracks this.
The message is not changed because I do not want to regress in
convenience to the user. Just saying "this connection is not trusted"
doesn't tell them out to fix the issue. The ideal thing to do would be
to somehow parameterize `processCommand` on how the error should be
displayed, so different sorts of connections can display different
information to the user based on how authentication is performed for the
connection in question. This, however, is a good bit more work, so it is
left for the future.
This came up with me thinking about the tcp:// store (#5265). The larger
project is not TCP *per se*, but the idea that it should be possible for
something else to manage access control to services like the Nix Daemon,
and those services simply trust or trust the incoming connection as they
are told. This is a more capability-oriented way of thinking about trust
than "every server implements its own auth separately" as we are used to today.
Its very great that libstore itself already implements just this model,
and so via this refactor I basically want to "enshrine" that so it
continues to be the case.
With the switch to C++20, the rules became more strict, and we can no
longer initialize base classes. Make them comments instead.
(BTW
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2287r1.html
this offers some new syntax for this use-case. Hopefully this will be
adopted and we can eventually use it.)
I don't think the `narHash` is in need of documentation more than the
other undocumented fields, but regardless this change has nothing to do
with that field and so we should leave the comment as is.
`&` without space before is far more common on this codebase than I
thought, so it is not worth changing just this one file. Maybe we will
adopt a formatter someday but until then this is fine.
The references set seems to have been unused since `LegacySSHStore`
references were first created in
caa5793b4a.
The method decls never were upstream, and accidentally added by me in
062533f7cd (probably due to `git rerere`).
Sorry!
This reduces the diff from #3746.
Avoid needless work and throwing away invariants.
These conversions date back to when `StorePath` was in Rust and there
were issues with it missing utility methods.
It's used as the “system” profile in a bunch of places, so better not
touch it. Besides, it doesn't hurt to keep it since it's owned by root
any way, so it doesn't have the `chown` problem that the user profiles
had and that led to wanting to move them on the client-side.
Rather than using `/nix/var/nix/{profiles,gcroots}/per-user/`, put the user
profiles and gcroots under `$XDG_DATA_DIR/nix/{profiles,gcroots}`.
This means that the daemon no longer needs to manage these paths itself
(they are fully handled client-side). In particular, it doesn’t have to
`chown` them anymore (removing one need for root).
This does change the layout of the gc-roots created by nix-env, and is
likely to break some stuff, so I’m not sure how to properly handle that.
Originally there was no `path-info.*`, then there was `path-info.hh`,
then there was `path-info.cc`, but only for new things. Moving this
stuff over makes everything consistent.
This should be a non-empty set, and so we don't want people doing this
by accident. We remove the zero-0 constructor with a little inheritance
trickery.
`DerivedPath::Built` and `DerivationGoal` were previously using a
regular set with the convention that the empty set means all outputs.
But it is easy to forget about this rule when processing those sets.
Using `OutputSpec` forces us to get it right.
This way the links are clearly within the manual (ie not absolute paths),
while allowing snippets to reference the documentation root reliably,
regardless of at which base url they're included.
macOS doesn't have user namespacing, so the gid of the builder needs
to be nixbld. The logic got "has sandboxing enabled" confused with
"has user namespaces".
Fixes#7529.
This basically reverts 6e5165b773.
It fixes errors like
sandbox-exec: <internal init prelude>:292:47: unable to open sandbox-minimal.sb: not found
when trying to run a development Nix installed in a user's home
directory.
Also, we're trying to minimize the number of installed files
to make it possible to deploy Nix as a single statically-linked
binary.
Adds a new boolean structured attribute
`outputChecks.<output>.unsafeDiscardReferences` which disables scanning
an output for runtime references.
__structuredAttrs = true;
outputChecks.out.unsafeDiscardReferences = true;
This is useful when creating filesystem images containing their own embedded Nix
store: they are self-contained blobs of data with no runtime dependencies.
Setting this attribute requires the experimental feature
`discard-references` to be enabled.
Previously addTempRoot() acquired the LocalStore state lock and waited
for the garbage collector to reply. If the garbage collector is in the
same process (as it the case with auto-GC), this would deadlock as
soon as the garbage collector thread needs the LocalStore state lock.
So now addTempRoot() uses separate Syncs for the state that it
needs. As long at the auto-GC thread doesn't call addTempRoot() (which
it shouldn't), it shouldn't deadlock.
Fixes#3224.
This also moves the file handle into its own Sync object so we're not
holding the _state while acquiring the file lock. There was no real
deadlock risk here since locking a newly created file cannot block,
but it's still a bit nicer.
This has the same goal as b13fd4c58e81b2b2b0d72caa5ce80de861622610,but
achieves it in a different way in order to not break
`nix why-depends --derivation`.
In principle, this should avoid deadlocks where two instances of Nix are
holding a shared lock on big-lock and are both waiting to get an
exclusive lock.
However, it seems like `flock(2)` is supposed to do this automatically,
so it's not clear whether this is actually where the problem comes from.
This makes 'nix develop' set the Linux personality in the same way
that the actual build does, allowing a command like 'nix develop
nix#devShells.i686-linux.default' on x86_64-linux to work correctly.
Without this, the error is lost, and it makes for a hard to debug
situation. Also remove some of the busyness inside the sqlite_open_v2
args.
The errcode returned is not the extended one. The only way to make open
return an extended code, would be to add SQLITE_OPEN_EXRESCODE to the
flags. In the future it might be worth making this change,
which would also simplify the existing SQLiteError code.
They did not include the detailed error message, losing essential
information for troubleshooting.
Example message:
warning: creating statement 'insert or rplace into NARs(cache, hashPart, namePart, url, compression, fileHash, fileSize, narHash, narSize, refs, deriver, sigs, ca, timestamp, present) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 1)': at offset 10: SQL logic error, near "rplace": syntax error (in '/tmp/nix-shell.grQ6f7/nix-test/tests/binary-cache/test-home/.cache/nix/binary-cache-v6.sqlite')
It's not the best example; more important information will be in
the message for e.g. a constraint violation.
I don't see why this specific error is printed as a warning, but
that's for another commit.
Unsetting `build-users-group` (without `auto-allocate-uids` enabled)
gives the following error:
```
src/libstore/lock.cc:25: static std::unique_ptr<nix::UserLock> nix::SimpleUserLock::acquire(): Assertion `settings.buildUsersGroup != ""' failed.
```
Fix the logic in `useBuildUsers` and document the default value
for `build-users-group`.
Fix#6209
When trying to run `nix log <installable>`, try first to resolve the derivation pointed to
by `<installable>` as it is the resolved one that holds the build log.
This has a couple of shortcomings:
1. It’s expensive as it requires re-reading the derivation
2. It’s brittle because if the derivation doesn’t exist anymore or can’t
be resolved (which is the case if any one of its build inputs is missing),
then we can’t access the log anymore
However, I don’t think we can do better (at least not right now).
The alternatives I see are:
1. Copy the build log for the un-resolved derivation. But that means a
lot of duplication
2. Store the results of the resolving in the db. Which might be the best
long-term solution, but leads to a whole new class of potential
issues.
These only functioned if a very narrow combination of conditions held:
- The result path does not yet exist (--check did not result in
repeated builds), AND
- The result path is not available from any configured substituters, AND
- No remote builders that can build the path are available.
If any of these do not hold, a derivation would be built 0 or 1 times
regardless of the repeat option. Thus, remove it to avoid confusion.
We need to close the GC server socket before shutting down the active
GC client connections, otherwise a client may (re)connect and get
ECONNRESET. But also handle ECONNRESET for resilience.
Fixes random failures like
GC socket disconnected
connecting to '/tmp/nix-shell.y07M0H/nix-test/default/var/nix/gc-socket/socket'
sending GC root '/tmp/nix-shell.y07M0H/nix-test/default/store/kb5yzija0f1x5xkqkgclrdzldxj6nnc6-non-blocking'
reading GC root from client: error: unexpected EOF reading a line
1 store paths deleted, 0.00 MiB freed
error: reading from file: Connection reset by peer
in gc-non-blocking.sh.
We shouldn't skip this if the supplementary group list is empty,
because then the sandbox won't drop the supplementary groups of the
parent (like "root").
The new experimental feature 'cgroups' enables the use of cgroups for
all builds. This allows better containment and enables setting
resource limits and getting some build stats.
It occurred when a output of the dependency was already available,
so it didn't need rebuilding and didn't get added to the
inputDrvOutputs.
This process-related info wasn't suitable for the purpose of finding
the actual input paths for the builder. It is better to do this in
absolute terms by querying the store.
This change is needed to support aws-sdk-cpp 1.10 and newer.
I opted not to make this dependent on the sdk version because
the crt dependency has been in the interface of the older
sdk as well, and it was only coincidence that libstore didn't
make use of any privately defined symbols directly.
When calling `builtins.readFile` on a store path, the references of that
path are currently added to the resulting string's context.
This change makes those references the *possible* context of the string,
but filters them to keep only the references whose hash actually appears
in the string, similarly to what is done for determining the runtime
references of a path.
Cgroups are now only used for derivations that require the uid-range
range feature. This allows auto UID allocation even on systems that
don't have cgroups (like macOS).
Also, make things work on modern systems that use cgroups v2 (where
there is a single hierarchy and no "systemd" controller).
Call it as `['nix', '__build-remote', ... ]` rather than the previous
`["__build-remote", "nix __build-remote", ... ]` which seemed to have
been most likely unintended
Currently, Nix passes `-a` when it runs commands on a remote machine via
SSH, which disables agent forwarding. This causes issues when the
`ForwardAgent` option is set in SSH config files, as the command line
operation always overrides those.
In particular, this causes issues if the command being run is `sudo`
and the remote machine is configured with the equivalent of NixOS's
`security.pam.enableSSHAgentAuth` option. Not allowing SSH agent
forwarding can cause authentication to fail unexpectedly.
This can currently be worked around by setting `NIX_SSHOPTS="-A"`, but
we should defer to the options in the SSH config files to be least
surprising for users.
After we've send "\2\n" to the parent, we can't send a serialized
exception anymore. It will show up garbled like
$ nix-build --store /tmp/nix --expr 'derivation { name = "foo"; system = "x86_64-linux"; builder = "/foo/bar"; }'
this derivation will be built:
/nix/store/xmdip0z5x1zqpp6gnxld3vqng7zbpapp-foo.drv
building '/nix/store/xmdip0z5x1zqpp6gnxld3vqng7zbpapp-foo.drv'...
ErrorErrorEexecuting '/foo/bar': No such file or directory
error: builder for '/nix/store/xmdip0z5x1zqpp6gnxld3vqng7zbpapp-foo.drv' failed with exit code 1
While trying to use an alternate directory for my Nix installation, I
noticed that nix's output didn't reflect the updated state
directory. This patch corrects that and now prints the warning before
attempting to create the directory (if the directory creation fails,
it wouldn't have been obvious why nix was attempting to create the
directory in the first place).
With this patch, I now get the following warning:
warning: '/home/deck/.var/app/org.nixos.nix/var/nix' does not
exist, so Nix will use '/home/deck/.local/share/nix/root' as a
chroot store
These settings seem harmless, they control the same polling
functionality that timeout does, but with different behavior. Should
be safe for untrusted users to pass in.
I just had a colleague get confused by the previous phrase for good
reason. "valid" sounds like an *objective* criterion, e.g. and *invalid
signature* would be one that would be trusted by no one, e.g. because it
misformatted or something.
What is actually going is that there might be a signature which is
perfectly valid to *someone else*, but not to the user, because they
don't trust the corresponding public key. This is a *subjective*
criterion, because it depends on the arbitrary and personal choice of
which public keys to trust.
I therefore think "trustworthy" is a better adjective to use. Whether
something is worthy of trust is clearly subjective, and then "trust"
within that word nicely evokes `trusted-public-keys` and friends.
- call close explicitly in writeFile to prevent the close exception
from being ignored
- fsync after writing schema file to flush data to disk
- fsync schema file parent to flush metadata to disk
https://github.com/NixOS/nix/issues/7064
Remove the `verify TLS: Nix CA file = 'blah'` message that Nix used to print when fetching anything as it's both useless (`libcurl` prints the same info in its logs) and misleading (gives the impression that a new TLS connection is being established which might not be the case because of multiplexing. See #7011 )
Implements the approach suggested by feedback on PR #6994, where
tempdir paths are created in the store (now with an exclusive lock).
As part of this work, the currently-broken and unused
`createTempDirInStore` function is updated to create an exclusive lock
on the temp directory in the store.
The GC now makes a non-blocking attempt to lock any store directories
that "look like" the temp directories created by this function, and if
it can't acquire one, ignores the directory.
readDerivation is pretty slow, and while it may not be significant for
some use cases, on things like ghc-nix where we have thousands of
derivations is really slows things down.
So, this just doesn’t do the impure derivation check if the impure
derivation experimental feature is disabled. Perhaps we could cache
the result of isPure() and keep the check, but this is a quick fix to
for the slowdown introduced with impure derivations features in 2.8.0.
This hang for some reason didn't trigger in the Nix build, but did
running 'make installcheck' interactively. What happened:
* Store::addMultipleToStore() calls a SinkToSource object to copy a
path, which in turn calls LegacySSHStore::narFromPath(), which
acquires a connection.
* The SinkToSource object is not destroyed after the last bytes has
been read, so the coroutine's stack is still alive and its
destructors are not run. So the connection is not released.
* Then when the next path is copied, because max-connections = 1,
LegacySSHStore::narFromPath() hangs forever waiting for a connection
to be released.
The fix is to make sure that the source object is destroyed when we're
done with it.
RewritingSink can handle being fed input where a reference crosses a
chunk boundary. we don't need to load the whole source into memory, and
in fact *not* loading the whole source lets nix build FODs that do not
fit into memory (eg fetchurl'ing data files larger than system memory).
Once a derivation goal has been completed, we check whether or not
this goal was meant to be repeated to check its output.
An early return branch was preventing the worker to reach that repeat
code branch, hence breaking the --check command (#2619).
It seems like this early return branch is an artifact of a passed
refactoring. As far as I can tell, buildDone's main branch also
cleanup the tmp directory before returning.
By default, Nix sets the "cores" setting to the number of CPUs which are
physically present on the machine. If cgroups are used to limit the CPU
and memory consumption of a large Nix build, the OOM killer may be
invoked.
For example, consider a GitLab CI pipeline which builds a large software
package. The GitLab runner spawns a container whose CPU is limited to 4
cores and whose memory is limited to 16 GiB. If the underlying machine
has 64 cores, Nix will invoke the build with -j64. In many cases, that
level of parallelism will invoke the OOM killer and the build will
completely fail.
This change sets the default value of "cores" to be
ceil(cpu_quota / cpu_period), with a fallback to
std:🧵:hardware_concurrency() if cgroups v2 is not detected.
The workaround for "Some distros patch Linux" mentioned in
local-derivation-goal.cc will not help in the `--option
sandbox-fallback false` case. To provide the user more helpful
guidance on how to get the sandbox working, let's check to see if the
`/proc` node created by the aforementioned patch is present and
configured in a way that will cause us problems. If so, give the user
a suggestion for how to troubleshoot the problem.
local-derivation-goal.cc contains a comment stating that "Some distros
patch Linux to not allow unprivileged user namespaces." Let's give a
pointer to a common version of this patch for those who want more
details about this failure mode.
This commit causes nix to `warn()` if sandbox setup has failed and
`/proc/self/ns/user` does not exist. This is usually a sign that the
kernel was compiled without `CONFIG_USER_NS=y`, which is required for
sandboxing.
This commit uses `warn()` to notify the user if sandbox setup fails
with errno==EPERM and /proc/sys/user/max_user_namespaces is missing or
zero, since that is at least part of the reason why sandbox setup
failed.
Note that `echo -n 0 > /proc/sys/user/max_user_namespaces` or
equivalent at boot time has been the recommended mitigation for
several Linux LPE vulnerabilities over the past few years. Many users
have applied this mitigation and then forgotten that they have done
so.
The failure modes for nix's sandboxing setup are pretty complicated.
When nix is unable to set up the sandbox, let's provide more detail
about what went wrong. Specifically:
* Make sure the error message includes the word "sandbox" so the user
knows that the failure was related to sandboxing.
* If `--option sandbox-fallback false` was provided, and removing it
would have allowed further attempts to make progress, let the user
know.
Specifically, if we're not root and the daemon socket does not exist,
then we use ~/.local/share/nix/root as a chroot store. This enables
non-root users to download nix-static and have it work out of the box,
e.g.
ubuntu@ip-10-13-1-146:~$ ~/nix run nixpkgs#hello
warning: '/nix' does not exists, so Nix will use '/home/ubuntu/.local/share/nix/root' as a chroot store
Hello, world!
With this, Nix will write a copy of the sandbox shell to /bin/sh in
the sandbox rather than bind-mounting it from the host filesystem.
This makes /bin/sh work out of the box with nix-static, i.e. you no
longer get
/nix/store/qa36xhc5gpf42l3z1a8m1lysi40l9p7s-bootstrap-stage4-stdenv-linux/setup: ./configure: /bin/sh: bad interpreter: No such file or directory
This allows changes to nix-cache-info to be picked up by existing
clients. Previously, the only way for this to happen would be for
clients to delete binary-cache-v6.sqlite, which is quite awkward for
users.
On the other hand, updates to nix-cache-info should be pretty rare,
hence the choice of a fairly long TTL. Configurability is probably not
useful enough to warrant implementing it.
The manpage for `getgrouplist` says:
> If the number of groups of which user is a member is less than or
> equal to *ngroups, then the value *ngroups is returned.
>
> If the user is a member of more than *ngroups groups, then
> getgrouplist() returns -1. In this case, the value returned in
> *ngroups can be used to resize the buffer passed to a further
> call getgrouplist().
In our original code, however, we allocated a list of size `10` and, if
`getgrouplist` returned `-1` threw an exception. In practice, this
caused the code to fail for any user belonging to more than 10 groups.
While unusual for single-user systems, large companies commonly have a
huge number of POSIX groups users belong to, causing this issue to crop
up and make multi-user Nix unusable in such settings.
The fix is relatively simple, when `getgrouplist` fails, it stores the
real number of GIDs in `ngroups`, so we must resize our list and retry.
Only then, if it errors once more, we can raise an exception.
This should be backported to, at least, 2.9.x.
Bring back the possibility to copy CA paths with no reference (like the
outputs of FO derivations or stuff imported at eval time) between stores
that have a different prefix.
A mips64el Linux MIPS kernel can execute userspace code using any of
three ABIs:
mips64el-linux-*abin64
mips64el-linux-*abin32
mipsel-linux-*
The first of these is the native 64-bit ABI, and the only ABI with
64-bit pointers; this is sometimes called "n64". The last of these is
the old legacy 32-bit ABI, whose binaries can execute natively on
32-bit MIPS hardware; this is sometimes called "o32".
The second ABI, "n32" is essentially the 64-bit ABI with 32-bit
pointers and address space. Hardware 64-bit integer/floating
arithmetic is still allowed, as well as the much larger mips64
register set and more-efficient calling convention.
Let's enable seccomp filters for all of these. Likewise for big
endian (mips64-linux-*).
Without the change any CA deletion triggers linear scan on large
RealisationsRefs table:
sqlite>.eqp full
sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 );
QUERY PLAN
|--SCAN RealisationsRefs
`--LIST SUBQUERY 1
`--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?)
With the change it gets turned into a lookup:
sqlite> CREATE INDEX IndexRealisationsRefsRealisationReference on RealisationsRefs(realisationReference);
sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 );
QUERY PLAN
|--SEARCH RealisationsRefs USING INDEX IndexRealisationsRefsRealisationReference (realisationReference=?)
`--LIST SUBQUERY 1
`--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?)
If the derivation `foo` depends on `bar`, and they both have the same
output path (because they are CA derivations), then this output path
will depend both on the realisation of `foo` and of `bar`, which
themselves depend on each other.
This confuses SQLite which isn’t able to automatically solve this
diamond dependency scheme.
Help it by adding a trigger to delete all the references between the
relevant realisations.
Fix#5320
Otherwise the clang builds fail because the constructor of `SQLiteBusy`
inherits it, `SQLiteError::_throw` tries to call it, which fails.
Strangely, gcc works fine with it. Not sure what the correct behavior is
and who is buggy here, but either way, making it public is at the worst
a reasonable workaround
This ensures that use-sites properly trigger new monomorphisations on
one hand, and on the other hand keeps the main `sqlite.hh` clean and
interface-only. I think that is good practice in general, but in this
situation in particular we do indeed have `sqlite.hh` users that don't
need the `throw_` function.
Previously it only logged the builder's path, this changes it to log the
arguments at the same log level, and the environment variables at the
vomit level.
This helped me debug https://github.com/svanderburg/node2nix/issues/75
This was caused by SubstitutionGoal not setting the errorMsg field in
its BuildResult. We now get a more descriptive message than in 2.7.0, e.g.
error: path '/nix/store/13mh...' is required, but there is no substituter that can build it
instead of the misleading (since there was no build)
error: build of '/nix/store/13mh...' failed
Fixes#6295.
Impure derivations are derivations that can produce a different result
every time they're built. Example:
stdenv.mkDerivation {
name = "impure";
__impure = true; # marks this derivation as impure
outputHashAlgo = "sha256";
outputHashMode = "recursive";
buildCommand = "date > $out";
};
Some important characteristics:
* This requires the 'impure-derivations' experimental feature.
* Impure derivations are not "cached". Thus, running "nix-build" on
the example above multiple times will cause a rebuild every time.
* They are implemented similar to CA derivations, i.e. the output is
moved to a content-addressed path in the store. The difference is
that we don't register a realisation in the Nix database.
* Pure derivations are not allowed to depend on impure derivations. In
the future fixed-output derivations will be allowed to depend on
impure derivations, thus forming an "impurity barrier" in the
dependency graph.
* When sandboxing is enabled, impure derivations can access the
network in the same way as fixed-output derivations. In relaxed
sandboxing mode, they can access the local filesystem.
Rather than having four different but very similar types of hashes, make
only one, with a tag indicating whether it corresponds to a regular of
deferred derivation.
This implies a slight logical change: The original Nix+multiple-outputs
model assumed only one hash-modulo per derivation. Adding
multiple-outputs CA derivations changed this as these have one
hash-modulo per output. This change is now treating each derivation as
having one hash modulo per output.
This obviously means that we internally loose the guaranty that
all the outputs of input-addressed derivations have the same hash
modulo. But it turns out that it doesn’t matter because there’s nothing
in the code taking advantage of that fact (and it probably shouldn’t
anyways).
The upside is that it is now much easier to work with these hashes, and
we can get rid of a lot of useless `std::visit{ overloaded`.
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
This avoids an infinite loop in the final test in
tests/binary-cache.sh. I think this was only not triggered previously
by accident (because we were clearing wantedOutputs in between).
LocalStore::addToStore() since
79ae9e4558 expects a regular NAR hash,
rather than a NAR hash modulo self-references. Fixes#6300.
Also, makeContentAddressed() now rewrites the entire closure (so 'nix
store make-content-addressable' no longer needs '-r'). See #6301.
1. `DerivationOutput` now as the `std::variant` as a base class. And the
variants are given hierarchical names under `DerivationOutput`.
In 8e0d0689be @matthewbauer and I
didn't know a better idiom, and so we made it a field. But this sort
of "newtype" is anoying for literals downstream.
Since then we leaned the base class, inherit the constructors trick,
e.g. used in `DerivedPath`. Switching to use that makes this more
ergonomic, and consistent.
2. `store-api.hh` and `derivations.hh` are now independent.
In bcde5456cc I swapped the dependency,
but I now know it is better to just keep on using incomplete types as
much as possible for faster compilation and good separation of
concerns.
Before the change garbage collector was not considering
`.drv` and outputs as alive even if configuration says otherwise.
As a result `nix store gc --dry-run` could visit (and parse)
`.drv` files multiple times (worst case it's quadratic).
It happens because `alive` set was populating only runtime closure
without regard for actual configuration. The change fixes it.
Benchmark: my system has about 139MB, 40K `.drv` files.
Performance before the change:
$ time nix store gc --dry-run
real 4m22,148s
Performance after the change:
$ time nix store gc --dry-run
real 0m14,178s
Don’t try and assume that we know the output paths when we’ve just built
with `--dry-run`. Instead make `--dry-run` follow a different code path
that won’t assume the knowledge of the output paths at all.
Fix#6275
Before the change on a system with `auto-optimise-store = true`:
$ nix store gc --verbose --max 1
deleted all the paths instead of one path (we requested 1 byte limit).
It happens because every file in `auto-optimise-store = true` has at
least 2 links: file itself and a link in /nix/store/.links/ directory.
The change conservatively assumes that any file that has one (as before)
or two links (assume auto-potimise mode) will free space.
Co-authored-by: Sandro <sandro.jaeckel@gmail.com>
This changes was taken from dynamic derivation (#4628). It` somewhat
undoes the refactors I first did for floating CA derivations, as the
benefit of hindsight + requirements of dynamic derivations made me
reconsider some things.
They aren't to consequential, but I figured they might be good to land
first, before the more profound changes @thufschmitt has in the works.
Continue progress on #5729.
Just as I hoped, this uncovered an issue: the daemon protocol is missing
a way to query build logs. This doesn't effect `unix://`, but does
effect `ssh://`. A FIXME is left for this, so we come back to it later.
This function is like buildPaths(), except that it returns a vector of
BuildResults containing the exact statuses and output paths of each
derivation / substitution. This is convenient for functions like
Installable::build(), because they then don't need to do another
series of calls to get the outputs of CA derivations. It's also a
precondition to impure derivations, where we *can't* query the output
of those derivations since they're not stored in the Nix database.
Note that PathSubstitutionGoal can now also return a BuildStatus.
Starts progress on #5729.
The idea is that we should not have these default methods throwing
"unimplemented". This is a small step in that direction.
I kept `addTempRoot` because it is a no-op, rather than failure. Also,
as a practical matter, it is called all over the place, while doing
other tasks, so the downcasting would be annoying.
Maybe in the future I could move the "real" `addTempRoot` to `GcStore`,
and the existing usecases use a `tryAddTempRoot` wrapper to downcast or
do nothing, but I wasn't sure whether that was a good idea so with a
bias to less churn I didn't do it yet.
Setting the `_NIX_FORCE_HTTP` environment variable is supposed to force `file://` store urls to use the `HttpBinaryCacheStore` implementation rather than the `LocalBinaryCacheStore` one (very useful for testing).
However because of a name mismatch, the `LocalBinaryCacheStore` was still registering the `file` scheme when this variable was set, meaning that the actual store implementation picked up on `file://` uris was dependent on the registration order of the stores (itself dependent on the link order of the object files).
Fix this by making the `LocalBinaryCacheStore` gracefully not register the `file` uri scheme when the variable is set.
Starting work on #5638
The exact boundary between `FetchSettings` and `EvalSettings` is not
clear to me, but that's fine. First lets clean out `libstore`, and then
worry about what, if anything, should be the separation between those
two.
To avoid that JSON messages are parsed twice in case of
remote builds with `ssh-ng://`, I split up the original
`handleJSONLogMessage` into three parts:
* `parseJSONMessage(const std::string&)` checks if it's a message in the
form of `@nix {...}` and tries to parse it (and prints an error if the
parsing fails).
* `handleJSONLogMessage(nlohmann::json&, ...)` reads the fields from the
message and passes them to the logger.
* `handleJSONLogMessage(const std::string&, ...)` behaves as before, but
uses the two functions mentioned above as implementation.
In case of `ssh-ng://`-logs the first two methods are invoked manually.
Right now when building a derivation remotely via
$ nix build -j0 -f . hello -L --builders 'ssh://builder'
it's possible later to read through the entire build-log by running
`nix log -f . hello`. This isn't possible however when using `ssh-ng`
rather than `ssh`.
The reason for that is that there are two different ways to transfer
logs in Nix through e.g. an SSH tunnel (that are used by `ssh`/`ssh-ng`
respectively):
* `ssh://` receives its logs from the fd pointing to `builderOut`. This
is directly passed to the "log-sink" (and to the logger on each `\n`),
hence `nix log` works here.
* `ssh-ng://` however expects JSON-like messages (i.e. `@nix {log data
in here}`) and passes it directly to the logger without doing anything
with the `logSink`. However it's certainly possible to extract
log-lines from this format as these have their own message-type in the
JSON payload (i.e. `resBuildLogLine`).
This is basically what I changed in this patch: if the code-path for
`builderOut` is not reached and a `logSink` is initialized, the
message was successfully processed by the JSON logger (i.e. it's in
the expected format) and the line is of the expected type (i.e.
`resBuildLogLine`), the line will be written to the log-sink as well.
Closes#5079
If we want to be careful about hitting the stack protector page, we should use `-fstack-check` instead.
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
This was removed in 2e199673a5 when
`copyPath` transitioned to use `RealisedPath`. But then in
e9848beca7 we added it back just for
`realisedPath`.
I think it is a good utility function --- one can easily imagine it
becoming optimized in the future, and copying paths *violating* the
closure is a very niche feature.
So if we have `copyPaths` for both sorts of paths, I think we should
have `copyClosure` for both sorts too.
This removes a dynamic stack allocation, making the derivation
unparsing logic robust against overflows when large strings are
added to a derivation.
Overflow behavior depends on the platform and stack configuration.
For instance, x86_64-linux/glibc behaves as (somewhat) expected:
$ (ulimit -s 20000; nix-instantiate tests/lang/eval-okay-big-derivation-attr.nix)
error: stack overflow (possible infinite recursion)
$ (ulimit -s 40000; nix-instantiate tests/lang/eval-okay-big-derivation-attr.nix)
error: expression does not evaluate to a derivation (or a set or list of those)
However, on aarch64-darwin:
$ nix-instantiate big-attr.nix ~
zsh: segmentation fault nix-instantiate big-attr.nix
This indicates a slight flaw in the single stack protection page
approach that is not encountered with normal stack frames.
There already existed a smoke test for the link content length,
but it appears that there exists some corruptions pernicious enough
to replace the file content with zeros, and keeping the same length.
--repair-path now goes as far as checking the content of the link,
making it true to its name and actually repairing the path for such
coruption cases.
This was already accidentally disabled in ba87b08. It also no longer
appears to be beneficial, and in fact slow things down, e.g. when
evaluating a NixOS system configuration:
elapsed time: median = 3.8170 mean = 3.8202 stddev = 0.0195 min = 3.7894 max = 3.8600 [rejected, p=0.00000, Δ=0.36929±0.02513]
Add a `_NIX_TRACE_BUILT_OUTPUTS` environment variable that can be set to
a filename in which the result of each build will be logged.
This is intentionally crude and undocumented as it’s only meant to be a
temporary thing to assess the usefulness of CA derivations.
Any other use would need a cleaner re-implementation first.
Make the build of unresolved derivations return the same status as the
resolved one, except in the case of an `AlreadyValid` in which case it
will return `ResolvesToAlreadyValid` to mean that the outputs of the unresolved
derivation weren’t known, but the resolved one is.
I downloaded Nix tonight, and immediately broke it by accidentally removing the default binary caching.
After figuring this out, I also failed to fix it properly, due to using the wrong key for Nix's default binary cache
If the diagnostic message would have been clearer about what/where a "signature" for a "substituter" is + comes from, it probably would have saved me a few hours.
Maybe we can save other noobs the same pain?
Because the manual is generated from default values which are themselves
generated from various sources (cpuid, bios settings (kvm), number of
cores). This commit hides non-reproducible settings from the manual
output.
No matter what, we need to resize the buffer to not have any scratch
space after we do the `read`. In the end of file case, `got` will be 0
from it's initial value.
Before, we forgot to resize in the EOF case with the break. Yes, we know
we didn't recieve any data in that case, but we still have the scatch
space to undo.
Co-Authored-By: Will Fancher <Will.Fancher@Obsidian.Systems>
This doesn't fix the bug, but makes the code less difficult to read.
Also improve the comments, now that it is clear what part is needed in
each code path.
For a typical desktop system (~2K packages) we can easily get 100K
entries in RealisationsRefs. Without indices query for RealisationsRefs
requires linear scan.
RealisationsRefs(referrer)
--------------------------
Inefficiency is seen as a 100% CPU load of nix-daemon for the following
scenario:
$ nix edit -f . bash # add unused environment variable, like FOO="1"
# populate RealisationsRefs, build fresh system
$ nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }'
$ nix edit -f . bash # add unused environment variable, like FOO="2"
$ time nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }'
In this case `bash `will be rebuilt a few times and then rest of CPU
time is spent on scanning RealisationsRefs table (about 5 CPU-minutes
on my machine).
Before the change:
$ time nix build -f nixos system ... # step 4 above
real 34m3,613s
user 0m5,232s
sys 0m0,758s
Of all this time about 29.5 minutes are taken by nix-daemon's CPU time.
After the change:
$ time nix build -f nixos system ... # step 4 above
real 4m50,061s
user 0m5,038s
sys 0m0,677s
Of all this time about 1 minute is taken by nix-daemon's CPU time.
Most of the time is spent polling for non-existent realisations on
cache-nixos.org.
Realisations(outputPath)
------------------------
After running CA system for two weeks I got ~1M entries in Realisations
table. `nix-collect-garbage` became very slow (seemingly 100 path deletions
per second). It happens due to a slow cascading delete from Realisations
triggered by deletion from ValidPaths.
The fix is to add an index on primary key from ValidPaths(id) that
triggers cascading deletions.
Before the change:
$ time nix-collect-garbage -d --max-freed 100G
<interrupted before finish, took too long>
real 23m32.411s
user 17m49.679s
sys 4m50.609s
Most of time was spent in re-scanning Realisations table on each path deletion.
After the change:
$ time nix-collect-garbage -d --max-freed 100G
real 8m43.226s
user 6m16.317s
sys 1m40.188s
Time is spent scanning sqlite indices and in kernel when unlinking directories.
Doing it as a side-effect of calling LocalStore::makeStoreWritable()
is very ugly.
Also, make sure that stopping the progress bar joins the update
thread, otherwise that thread should be unshared as well.
Since 4806f2f6b0, we can't have paths with
references passed to builtins.{path,filterSource}. This prevents many cases
of those functions called on IFD outputs from working. Resolve this by
passing the references found in the original path to the added path.
When setting flake-local options (with the `nixConfig` field), forward
these options to the daemon in case we’re using one.
This is necessary in particular for options like `binary-caches` or
`post-build-hook` to make sense.
Fix <343239fc8a (r44356843)>
Rather than having them plain strings scattered through the whole
codebase, create an enum containing all the known experimental features.
This means that
- Nix can now `warn` when an unkwown experimental feature is passed
(making it much nicer to spot typos and spot deprecated features)
- It’s now easy to remove a feature altogether (once the feature isn’t
experimental anymore or is dropped) by just removing the field for the
enum and letting the compiler point us to all the now invalid usages
of it.
Currently machine specification (`/etc/nix/machine`) parser fails
with a vague exception if the file had incorrect format.
This commit adds verbose exceptions and unit-tests for the parser.
This ensures any started processes can't write to /nix/store (except
during builds). This partially reverts 01d07b1e, which happened because
of #2646.
The problem was only happening after nix downloads anything, causing
me to suspect the download thread. The problem turns out to be:
"A process can't join a new mount namespace if it is sharing
filesystem-related attributes with another process", in this case this
process is the curl thread.
Ideally, we might kill it before spawning the shell process, but it's
inside a static variable in the getFileTransfer() function. So
instead, stop it from sharing FS state using unshare(). A strategy
such as the one from #5057 (single-threaded chroot helper binary) is
also very much on the table.
Fixes#4337.
This fixes a bug in the garbage collector where if a path
/nix/store/abcd-foo is valid, but we do a
isValidPath("/nix/store/abcd-foo.lock") first, then a negative entry
for /nix/store/abcd is added to pathInfoCache, so /nix/store/abcd-foo
is subsequently considered invalid and deleted.
(where "referrers" includes the reverse of derivation outputs and
derivers). Now we do a full traversal to look if we can reach any
root. If not, all paths reached can be deleted.