Prior to this change, we had a bunch of ad-hoc string manipulation code
scattered around. This made it hard to figure out what data model for
string contexts is.
Now, we still store string contexts most of the time as encoded strings
--- I was wary of the performance implications of changing that --- but
whenever we parse them we do so only through the
`NixStringContextElem::parse` method, which handles all cases. This
creates a data type that is very similar to `DerivedPath` but:
- Represents the funky `=<drvpath>` case as properly distinct from the
others.
- Only encodes a single output, no wildcards and no set, for the
"built" case.
(I would like to deprecate `=<path>`, after which we are in spitting
distance of `DerivedPath` and could maybe get away with fewer types, but
that is another topic for another day.)
this slightly increases the amount of memory used for any given symbol, but this
increase is more than made up for if the symbol is referenced more than once in
the EvalState that holds it. on average every symbol should be referenced at
least twice (once to introduce a binding, once to use it), so we expect no
increase in memory on average.
symbol tables are limited to 2³² entries like position tables, and similar
arguments apply to why overflow is not likely: 2³² symbols would require as many
string instances (at 24 bytes each) and map entries (at 24 bytes or more each,
assuming that the map holds on average at most one item per bucket as the docs
say). a full symbol table would require at least 192GB of memory just for
symbols, which is well out of reach. (an ofborg eval of nixpks today creates
less than a million symbols!)
Pos objects are somewhat wasteful as they duplicate the origin file name and
input type for each object. on files that produce more than one Pos when parsed
this a sizeable waste of memory (one pointer per Pos). the same goes for
ptr<Pos> on 64 bit machines: parsing enough source to require 8 bytes to locate
a position would need at least 8GB of input and 64GB of expression memory. it's
not likely that we'll hit that any time soon, so we can use a uint32_t index to
locate positions instead.
1. `DerivationOutput` now as the `std::variant` as a base class. And the
variants are given hierarchical names under `DerivationOutput`.
In 8e0d0689be @matthewbauer and I
didn't know a better idiom, and so we made it a field. But this sort
of "newtype" is anoying for literals downstream.
Since then we leaned the base class, inherit the constructors trick,
e.g. used in `DerivedPath`. Switching to use that makes this more
ergonomic, and consistent.
2. `store-api.hh` and `derivations.hh` are now independent.
In bcde5456cc I swapped the dependency,
but I now know it is better to just keep on using incomplete types as
much as possible for faster compilation and good separation of
concerns.
we'll retain the old coerceToString interface that returns a string, but callers
that don't need the returned value to outlive the Value it came from can save
copies by using the new interface instead. for values that weren't stringy we'll
pass a new buffer argument that'll be used for storage and shouldn't be
inspected.
Previously you had to remember to call value->attrs->sort() after
populating value->attrs. Now there is a BindingsBuilder helper that
wraps Bindings and ensures that sort() is called before you can use
it.
Changes:
* The divider lines are gone. These were in practice a bit confusing,
in particular with --show-trace or --keep-going, since then there
were multiple lines, suggesting a start/end which wasn't the case.
* Instead, multi-line error messages are now indented to align with
the prefix (e.g. "error: ").
* The 'description' field is gone since we weren't really using it.
* 'hint' is renamed to 'msg' since it really wasn't a hint.
* The error is now printed *before* the location info.
* The 'name' field is no longer printed since most of the time it
wasn't very useful since it was just the name of the exception (like
EvalError). Ideally in the future this would be a unique, easily
googleable error ID (like rustc).
* "trace:" is now just "…". This assumes error contexts start with
something like "while doing X".
Example before:
error: --- AssertionError ---------------------------------------------------------------------------------------- nix
at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix
6|
7| x = assert false; 1;
| ^
8|
assertion 'false' failed
----------------------------------------------------- show-trace -----------------------------------------------------
trace: while evaluating the attribute 'x' of the derivation 'hello-2.10'
at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix
191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) {
192| name = "${attrs.pname}-${attrs.version}";
| ^
193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) {
Example after:
error: assertion 'false' failed
at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix
6|
7| x = assert false; 1;
| ^
8|
… while evaluating the attribute 'x' of the derivation 'hello-2.10'
at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix
191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) {
192| name = "${attrs.pname}-${attrs.version}";
| ^
193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) {
Most functions now take a StorePath argument rather than a Path (which
is just an alias for std::string). The StorePath constructor ensures
that the path is syntactically correct (i.e. it looks like
<store-dir>/<base32-hash>-<name>). Similarly, functions like
buildPaths() now take a StorePathWithOutputs, rather than abusing Path
by adding a '!<outputs>' suffix.
Note that the StorePath type is implemented in Rust. This involves
some hackery to allow Rust values to be used directly in C++, via a
helper type whose destructor calls the Rust type's drop()
function. The main issue is the dynamic nature of C++ move semantics:
after we have moved a Rust value, we should not call the drop function
on the original value. So when we move a value, we set the original
value to bitwise zero, and the destructor only calls drop() if the
value is not bitwise zero. This should be sufficient for most types.
Also lots of minor cleanups to the C++ API to make it more modern
(e.g. using std::optional and std::string_view in some places).