Previously, `state.mkList()` would set the type of the value to tList
and allocate the list vector, but it would not initialize the values
in the list. This has two problems:
* If an exception occurs, the list is left in an undefined state.
* More importantly, for multithreaded evaluation, if a value
transitions from thunk to non-thunk, it should be final (i.e. other
threads should be able to access the value safely).
To address this, there now is a `ListBuilder` class (analogous to
`BindingsBuilder`) to build the list vector prior to the call to
`Value::mkList()`. Typical usage:
auto list = state.buildList(size);
for (auto & v : list)
v = ... set value ...;
vRes.mkList(list);
When I started contributing to Nix, I found the mix of definitions and
names in `fmt.hh` to be rather confusing, especially the small
difference between `hintfmt` and `hintformat`. I've renamed many classes
and added documentation to most definitions.
- `formatHelper` is no longer exported.
- `fmt`'s documentation is now with `fmt` rather than (misleadingly)
above `formatHelper`.
- `yellowtxt` is renamed to `Magenta`.
`yellowtxt` wraps its value with `ANSI_WARNING`, but `ANSI_WARNING`
has been equal to `ANSI_MAGENTA` for a long time. Now the name is
updated.
- `normaltxt` is renamed to `Uncolored`.
- `hintfmt` has been merged into `hintformat` as extra constructor
functions.
- `hintformat` has been renamed to `hintfmt`.
- The single-argument `hintformat(std::string)` constructor has been
renamed to a static member `hintformat::interpolate` to avoid pitfalls
with using user-generated strings as format strings.
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.
Progress towards #9205
While preparing PRs like #9753, I've had to change error messages in
dozens of code paths. It would be nice if instead of
EvalError("expected 'boolean' but found '%1%'", showType(v))
we could write
TypeError(v, "boolean")
or similar. Then, changing the error message could be a mechanical
refactor with the compiler pointing out places the constructor needs to
be changed, rather than the error-prone process of grepping through the
codebase. Structured errors would also help prevent the "same" error
from having multiple slightly different messages, and could be a first
step towards error codes / an error index.
This PR reworks the exception infrastructure in `libexpr` to
support exception types with different constructor signatures than
`BaseError`. Actually refactoring the exceptions to use structured data
will come in a future PR (this one is big enough already, as it has to
touch every exception in `libexpr`).
The core design is in `eval-error.hh`. Generally, errors like this:
state.error("'%s' is not a string", getAttrPathStr())
.debugThrow<TypeError>()
are transformed like this:
state.error<TypeError>("'%s' is not a string", getAttrPathStr())
.debugThrow()
The type annotation has moved from `ErrorBuilder::debugThrow` to
`EvalState::error`.
these symbols are used a *lot*, so it makes sense to cache them. this
mostly increases clarity of the code (however clear one may wish to call
the parser desugaring here), but it also provides a small performance
benefit.
Previously, there were two mostly-identical value printers -- one in
`libexpr/eval.cc` (which didn't force values) and one in
`libcmd/repl.cc` (which did force values and also printed ANSI color
codes).
This PR unifies both of these printers into `print.cc` and provides a
`PrintOptions` struct for controlling the output, which allows for
toggling whether values are forced, whether repeated values are tracked,
and whether ANSI color codes are displayed.
Additionally, `PrintOptions` allows tuning the maximum number of
attributes, list items, and bytes in a string that will be displayed;
this makes it ideal for contexts where printing too much output (e.g.
all of Nixpkgs) is distracting. (As requested by @roberth in
https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735)
Please read the tests for example output.
Future work:
- It would be nice to provide this function as a builtin, perhaps
`builtins.toStringDebug` -- a printing function that never fails would
be useful when debugging Nix code.
- It would be nice to support customizing `PrintOptions` members on the
command line, e.g. `--option to-string-max-attrs 1000`.
Also move `SourcePath` into `libutil`.
These changes allow `error.hh` and `error.cc` to access source path and
position information, which we can use to produce better error messages
(for example, we could consider omitting filenames when two or more
consecutive stack frames originate from the same file).
This avoids a Value allocation for empty list constants. During a `nix
search nixpkgs`, about 82% of all thunked lists are empty, so this
removes about 3 million Value allocations.
Performance comparison on `nix search github:NixOS/nixpkgs/e1fa12d4f6c6fe19ccb59cac54b5b3f25e160870 --no-eval-cache`:
maximum RSS: median = 3845432.0000 mean = 3845432.0000 stddev = 0.0000 min = 3845432.0000 max = 3845432.0000 [rejected?, p=0.00000, Δ=-70084.00000±0.00000]
soft page faults: median = 965395.0000 mean = 965394.6667 stddev = 1.1181 min = 965392.0000 max = 965396.0000 [rejected?, p=0.00000, Δ=-17929.77778±38.59610]
system CPU time: median = 1.8029 mean = 1.7702 stddev = 0.0621 min = 1.6749 max = 1.8417 [rejected, p=0.00064, Δ=-0.12873±0.09905]
user CPU time: median = 14.1022 mean = 14.0633 stddev = 0.1869 min = 13.8118 max = 14.3190 [not rejected, p=0.03006, Δ=-0.18248±0.24928]
elapsed time: median = 15.8205 mean = 15.8618 stddev = 0.2312 min = 15.5033 max = 16.1670 [not rejected, p=0.00558, Δ=-0.28963±0.29434]
since `up` and `values` are both pointer-aligned the type field will
also be pointer-aligned, wasting 48 bits of space on most machines. we
can get away with removing the type field altogether by encoding some
information into the `with` expr that created the env to begin with,
reducing the GC load for the absolutely massive amount of single-entry
envs we create for lambdas. this reduces memory usage of system eval by
quite a bit (reducing heap size of our system eval from 8.4GB to 8.23GB)
and gives similar savings in eval time.
running `nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'`
before:
Time (mean ± σ): 5.576 s ± 0.003 s [User: 5.197 s, System: 0.378 s]
Range (min … max): 5.572 s … 5.581 s 10 runs
after:
Time (mean ± σ): 5.408 s ± 0.002 s [User: 5.019 s, System: 0.388 s]
Range (min … max): 5.405 s … 5.411 s 10 runs
This fixes a segfault on infinite function call recursion (rather than
infinite thunk recursion) by tracking the function call depth in
`EvalState`.
Additionally, to avoid printing extremely long stack traces, stack
frames are now deduplicated, with a `(19997 duplicate traces omitted)`
message. This should only really be triggered in infinite recursion
scenarios.
Before:
$ nix-instantiate --eval --expr '(x: x x) (x: x x)'
Segmentation fault: 11
After:
$ nix-instantiate --eval --expr '(x: x x) (x: x x)'
error: stack overflow
at «string»:1:14:
1| (x: x x) (x: x x)
| ^
$ nix-instantiate --eval --expr '(x: x x) (x: x x)' --show-trace
error:
… from call site
at «string»:1:1:
1| (x: x x) (x: x x)
| ^
… while calling anonymous lambda
at «string»:1:2:
1| (x: x x) (x: x x)
| ^
… from call site
at «string»:1:5:
1| (x: x x) (x: x x)
| ^
… while calling anonymous lambda
at «string»:1:11:
1| (x: x x) (x: x x)
| ^
… from call site
at «string»:1:14:
1| (x: x x) (x: x x)
| ^
(19997 duplicate traces omitted)
error: stack overflow
at «string»:1:14:
1| (x: x x) (x: x x)
| ^
this also reduces forceValue code size and removes the need for
hideInDiagnostics. coopting thunk forcing like this has the additional
benefit of clarifying how these errors can happen in the first place.
almost all uses of this are interactive, except for deepSeq. deepSeq is
going to be expensive and rare enough to not care much about, and
Value::determinePos should usually be cheap enough to not be too much of
a burden in any case.
checking for isBlackhole in the forceValue hot path is rather more
expensive than necessary, and with a little bit of trickery we can move
such handling into the isApp case. small performance benefit, but under
some circumstances we've seen 2% improvement as well.
〉 nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
before:
Time (mean ± σ): 4.429 s ± 0.002 s [User: 3.929 s, System: 0.500 s]
Range (min … max): 4.427 s … 4.433 s 10 runs
after:
Time (mean ± σ): 4.396 s ± 0.002 s [User: 3.894 s, System: 0.501 s]
Range (min … max): 4.393 s … 4.399 s 10 runs
This makes stack usage significantly more compact, allowing larger
amounts of data to be processed on the same stack.
PrimOp functions with more than 8 positional (curried) arguments
should use an attrset instead.
MemoryInputAccessor is an in-memory virtual filesystem that returns
files like <nix/fetchurl.nix>. This removes the need for special hacks
to handle those files.
This function is now trivial enough that it doesn't need to exist.
`EvalState` can still be initialized with a custom search path, but we
don't have a need to mutate the search path after it has been
constructed, and I don't see why we would need to in the future.
Fixes#8229
`EvalState::mkSingleDerivedPathString` previously contained its own
inverse (printing, rather than parsing) in order to validate what was
parsed. Now that is pulled out into its own separate function:
`EvalState::coerceToSingleDerivedPath`.
In additional that pulled out logic is deduplicated with
`EvalState::mkOutputString` via `EvalState::mkOutputStringRaw`, which is
itself deduplicated (and generalized) with
`DownstreamPlaceholder::mkOutputStringRaw`.
All these changes make the unit tests simpler.
(We would ideally write more unit tests for `mkSingleDerivedPathString`
`coerceToSingleDerivedPath` directly, but we cannot yet do that because
the IO in reading the store path won't work when the dummy store cannot
hold anything. Someday we'll have a proper in-memory store which will
work for this.)
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
We want to be able to write down `foo.drv^bar.drv^baz`:
`foo.drv^bar.drv` is the dynamic derivation (since it is itself a
derivation output, `bar.drv` from `foo.drv`).
To that end, we create `Single{Derivation,BuiltPath}` types, that are
very similar except instead of having multiple outputs (in a set or
map), they have a single one. This is for everything to the left of the
rightmost `^`.
`NixStringContextElem` has an analogous change, and now can reuse
`SingleDerivedPath` at the top level. In fact, if we ever get rid of
`DrvDeep`, `NixStringContextElem` could be replaced with
`SingleDerivedPath` entirely!
Important note: some JSON formats have changed.
We already can *produce* dynamic derivations, but we can't refer to them
directly. Today, we can merely express building or example at the top
imperatively over time by building `foo.drv^bar.drv`, and then with a
second nix invocation doing `<result-from-first>^baz`, but this is not
declarative. The ethos of Nix of being able to write down the full plan
everything you want to do, and then execute than plan with a single
command, and for that we need the new inductive form of these types.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
the original change broke many pre-existing anchor links.
also change formatting of the constants listing slightly:
- the type should not be part of the anchor
- add highlight to the "impure only" note
- Better types
- Own header / C++ file pair
- Test factored out methods
- Pass parsed thing around more than strings
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>