* doc: primops: add more info for foldl
From the existing doc it is not obvious whether the first or the
second argument is the accumulator. This is however relevant to
know, as for certain scenarios, this might change the behavior.
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
Try to stay away from stack overflows.
These small vectors use stack space. Most instances will not need
to allocate because in general most things are small, and large
things are worth heap allocating.
16 * 3 * word = 384 bytes is still quite a bit, but these functions
tend not to be part of deep recursions.
VLAs are a dangerous feature, and their usage triggers an undefined
behavior since theire size can be zero in some cases.
So replace them with `boost::small_vector`s which fit the same goal but
are safer.
It's also incidentally consistently 1% faster on the benchmarks.
* Fix boost::bad_format_string exception in builtins.addErrorContext
The message passed to addTrace was incorrectly being used as a format
string and this this would cause an exception when the string contained
a '%', which can be hit in places where arbitrary file paths are
interpolated.
* add test
All OS and IO operations should be moved out, leaving only some misc
portable pure functions.
This is useful to avoid copious CPP when doing things like Windows and
Emscripten ports.
Newly exposed functions to break cycles:
- `restoreSignals`
- `updateWindowSize`
MemoryInputAccessor is an in-memory virtual filesystem that returns
files like <nix/fetchurl.nix>. This removes the need for special hacks
to handle those files.
We use the same nested map representation we used for goals, again in
order to save space. We might someday want to combine with `inputDrvs`,
by doing `V = bool` instead of `V = std::set<OutputName>`, but we are
not doing that yet for sake of a smaller diff.
The ATerm format for Derivations also needs to be extended, in addition
to the in-memory format. To accomodate this, we added a new basic
versioning scheme, so old versions of Nix will get nice errors. (And
going forward, if the ATerm format changes again the errors will be even
better.)
`parsedStrings`, an internal function used as part of parsing
derivations in A-Term format, used to consume the final `]` but expect
the initial `[` to already be consumed. This made for what looked like
unbalanced brackets at callsites, which was confusing. Now it consumes
both which is hopefully less confusing.
As part of testing, we also created a unit test for the A-Term format for
regular non-experimental derivations too.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
Apply suggestions from code review
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Types converted:
- `NixStringContextElem`
- `OutputsSpec`
- `ExtendedOutputsSpec`
- `DerivationOutput`
- `DerivationType`
Existing ones mostly conforming the pattern cleaned up:
- `ContentAddressMethod`
- `ContentAddressWithReferences`
The `DerivationGoal::derivationType` field had a bogus initialization,
now caught, so I made it `std::optional`. I think #8829 can make it
non-optional again because it will ensure we always have the derivation
when we construct a `DerivationGoal`.
See that issue (#7479) for details on the general goal.
`git grep 'Raw::Raw'` indicates the two types I didn't yet convert
`DerivedPath` and `BuiltPath` (and their `Single` variants) . This is
because @roberth and I (can't find issue right now...) plan on reworking
them somewhat, so I didn't want to churn them more just yet.
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
In the Nix language, given a drv path, we should be able to construct
another string referencing to one of its output. We can do this today
with `(import drvPath).output`, but this only works for derivations we
already have.
With dynamic derivations, however, that doesn't work well because the
`drvPath` isn't yet built: importing it like would need to trigger IFD,
when the whole point of this feature is to do "dynamic build graph"
without IFD!
Instead, what we want to do is create a placeholder value with the right
string context to refer to the output of the as-yet unbuilt derivation.
A new primop in the language, analogous to `builtins.placeholder` can be
used to create one. This will achieve all the right properties. The
placeholder machinery also will match out the `outPath` attribute for CA
derivations works.
In 60b7121d2c we added that type of
placeholder, and the derived path and string holder changes necessary to
support it. Then in the previous commit we cleaned up the code
(inspiration finally hit me!) to deduplicate the code and expose exactly
what we need. Now, we can wire up the primop trivally!
Part of RFC 92: dynamic derivations (tracking issue #6316)
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
`EvalState::mkSingleDerivedPathString` previously contained its own
inverse (printing, rather than parsing) in order to validate what was
parsed. Now that is pulled out into its own separate function:
`EvalState::coerceToSingleDerivedPath`.
In additional that pulled out logic is deduplicated with
`EvalState::mkOutputString` via `EvalState::mkOutputStringRaw`, which is
itself deduplicated (and generalized) with
`DownstreamPlaceholder::mkOutputStringRaw`.
All these changes make the unit tests simpler.
(We would ideally write more unit tests for `mkSingleDerivedPathString`
`coerceToSingleDerivedPath` directly, but we cannot yet do that because
the IO in reading the store path won't work when the dummy store cannot
hold anything. Someday we'll have a proper in-memory store which will
work for this.)
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
We want to be able to write down `foo.drv^bar.drv^baz`:
`foo.drv^bar.drv` is the dynamic derivation (since it is itself a
derivation output, `bar.drv` from `foo.drv`).
To that end, we create `Single{Derivation,BuiltPath}` types, that are
very similar except instead of having multiple outputs (in a set or
map), they have a single one. This is for everything to the left of the
rightmost `^`.
`NixStringContextElem` has an analogous change, and now can reuse
`SingleDerivedPath` at the top level. In fact, if we ever get rid of
`DrvDeep`, `NixStringContextElem` could be replaced with
`SingleDerivedPath` entirely!
Important note: some JSON formats have changed.
We already can *produce* dynamic derivations, but we can't refer to them
directly. Today, we can merely express building or example at the top
imperatively over time by building `foo.drv^bar.drv`, and then with a
second nix invocation doing `<result-from-first>^baz`, but this is not
declarative. The ethos of Nix of being able to write down the full plan
everything you want to do, and then execute than plan with a single
command, and for that we need the new inductive form of these types.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
- Better types
- Own header / C++ file pair
- Test factored out methods
- Pass parsed thing around more than strings
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Whereas `ContentAddressWithReferences` is a sum type complex because different
varieties support different notions of reference, and
`ContentAddressMethod` is a nested enum to support that,
`ContentAddress` can be a simple pair of a method and hash.
`ContentAddress` does not need to be a sum type on the outside because
the choice of method doesn't effect what type of hashes we can use.
Co-Authored-By: Cale Gibbard <cgibbard@gmail.com>
This is done in roughly the same way builtin functions are documented.
Also auto-link experimental features for primops, subsuming PR #8371.
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
I got very confused trying to keep all the `first` and `second` straight
reading the code, *especially* as there is also another `(boolean,
string)` pair type also being used.
Named fields is much better.
There are other cleanups that we can do (for example, the existing
TODO), but we can do them later. Doing them now would just make this
harder to review.
The remaining constructor RegisterPrimOp::RegisterPrimOp(Info && info)
allows specifying the documentation in .args and .doc members of the
Info structure.
Commit 8ec1ba0210 removed all uses of the removed constructor in the
nix binary. Here, we remove the constructor completely as well as its
use in a plugin test. According to #8515, we didn't promis to maintain
compatibility with external plugins.
Fixes#8515
The primop `builtins.replaceStrings` currently always strictly evaluates the
replacement strings, however time and space are wasted for their computation
if the corresponding pattern do not occur in the input string. This commit
makes the evaluation of the replacement strings lazy by deferring their
evaluation to when the corresponding pattern are matched and memoize the result
for efficient retrieval on subsequent matches.
The testcases for replaceStrings was updated to check for lazy evaluation
of the replacements. A note was also added in the release notes to
document the behavior change.
Motivation
`PathSet` is not correct because string contexts have other forms
(`Built` and `DrvDeep`) that are not rendered as plain store paths.
Instead of wrongly using `PathSet`, or "stringly typed" using
`StringSet`, use `std::std<StringContextElem>`.
-----
In support of this change, `NixStringContext` is now defined as
`std::std<StringContextElem>` not `std:vector<StringContextElem>`. The
old definition was just used by a `getContext` method which was only
used by the eval cache. It can be deleted altogether since the types are
now unified and the preexisting `copyContext` function already suffices.
Summarizing the previous paragraph:
Old:
- `value/context.hh`: `NixStringContext = std::vector<StringContextElem>`
- `value.hh`: `NixStringContext Value::getContext(...)`
- `value.hh`: `copyContext(...)`
New:
- `value/context.hh`: `NixStringContext = std::set<StringContextElem>`
- `value.hh`: `copyContext(...)`
----
The string representation of string context elements no longer contains
the store dir. The diff of `src/libexpr/tests/value/context.cc` should
make clear what the new representation is, so we recommend reviewing
that file first. This was done for two reasons:
Less API churn:
`Value::mkString` and friends did not take a `Store` before. But if
`NixStringContextElem::{parse, to_string}` *do* take a store (as they
did before), then we cannot have the `Value` functions use them (in
order to work with the fully-structured `NixStringContext`) without
adding that argument.
That would have been a lot of churn of threading the store, and this
diff is already large enough, so the easier and less invasive thing to
do was simply make the element `parse` and `to_string` functions not
take the `Store` reference, and the easiest way to do that was to simply
drop the store dir.
Space usage:
Dropping the `/nix/store/` (or similar) from the internal representation
will safe space in the heap of the Nix programming being interpreted. If
the heap contains many strings with non-trivial contexts, the saving
could add up to something significant.
----
The eval cache version is bumped.
The eval cache serialization uses `NixStringContextElem::{parse,
to_string}`, and since those functions are changed per the above, that
means the on-disk representation is also changed.
This is simply done by changing the name of the used for the eval cache
from `eval-cache-v4` to eval-cache-v5`.
----
To avoid some duplication `EvalCache::mkPathString` is added to abstract
over the simple case of turning a store path to a string with just that
string in the context.
Context
This PR picks up where #7543 left off. That one introduced the fully
structured `NixStringContextElem` data type, but kept `PathSet context`
as an awkward middle ground between internal `char[][]` interpreter heap
string contexts and `NixStringContext` fully parsed string contexts.
The infelicity of `PathSet context` was specifically called out during
Nix team group review, but it was agreeing that fixing it could be left
as future work. This is that future work.
A possible follow-up step would be to get rid of the `char[][]`
evaluator heap representation, too, but it is not yet clear how to do
that. To use `NixStringContextElem` there we would need to get the STL
containers to GC pointers in the GC build, and I am not sure how to do
that.
----
PR #7543 effectively is writing the inverse of a `mkPathString`,
`mkOutputString`, and one more such function for the `DrvDeep` case. I
would like that PR to have property tests ensuring it is actually the
inverse as expected.
This PR sets things up nicely so that reworking that PR to be in that
more elegant and better tested way is possible.
Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
In other words, use a plain `ContentAddress` not
`ContentAddressWithReferences` for `DerivationOutput::CAFixed`.
Supporting fixed output derivations with (fixed) references would be a
cool feature, but it is out of scope at this moment.
This introduces the SourcePath type from lazy-trees as an abstraction
for accessing files from inputs that may not be materialized in the
real filesystem (e.g. Git repositories). Currently, however, it's just
a wrapper around CanonPath, so it shouldn't change any behaviour. (On
lazy-trees, SourcePath is a <InputAccessor, CanonPath> tuple.)
With the switch to C++20, the rules became more strict, and we can no
longer initialize base classes. Make them comments instead.
(BTW
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2287r1.html
this offers some new syntax for this use-case. Hopefully this will be
adopted and we can eventually use it.)
Allows checking directory entry type of a single file/directory.
This was added to optimize the use of `builtins.readDir` on some
filesystems and operating systems which cannot detect this information
using POSIX's `readdir`.
Previously `builtins.readDir` would eagerly use system calls to lookup
these filetypes using other interfaces; this change makes these
operations lazy in the attribute values for each file with application
of `builtins.readFileType`.
`DerivedPath::Built` and `DerivationGoal` were previously using a
regular set with the convention that the empty set means all outputs.
But it is easy to forget about this rule when processing those sets.
Using `OutputSpec` forces us to get it right.
This way the links are clearly within the manual (ie not absolute paths),
while allowing snippets to reference the documentation root reliably,
regardless of at which base url they're included.
Prior to this change, we had a bunch of ad-hoc string manipulation code
scattered around. This made it hard to figure out what data model for
string contexts is.
Now, we still store string contexts most of the time as encoded strings
--- I was wary of the performance implications of changing that --- but
whenever we parse them we do so only through the
`NixStringContextElem::parse` method, which handles all cases. This
creates a data type that is very similar to `DerivedPath` but:
- Represents the funky `=<drvpath>` case as properly distinct from the
others.
- Only encodes a single output, no wildcards and no set, for the
"built" case.
(I would like to deprecate `=<path>`, after which we are in spitting
distance of `DerivedPath` and could maybe get away with fewer types, but
that is another topic for another day.)
This makes the position object used in exceptions abstract, with a
method getSource() to get the source code of the file in which the
error originated. This is needed for lazy trees because source files
don't necessarily exist in the filesystem, and we don't want to make
libutil depend on the InputAccessor type in libfetcher.
When calling `builtins.readFile` on a store path, the references of that
path are currently added to the resulting string's context.
This change makes those references the *possible* context of the string,
but filters them to keep only the references whose hash actually appears
in the string, similarly to what is done for determining the runtime
references of a path.
* Clarify the documentation of foldl': That the arguments are forced
before application (?) of `op` is necessarily true. What is important
to stress is that we force every application of `op`, even when the
value turns out to be unused.
* Move the example before the comment about strictness to make it less
confusing: It is a general example and doesn't really showcase anything
about foldl' strictness.
* Add test cases which nail down aspects of foldl' strictness:
* The initial accumulator value is not forced unconditionally.
* Applications of op are forced.
* The list elements are not forced unconditionally.
The documentation for `parseDrvName` does not agree with the implementation when
the derivation name contains a dash which is followed by something that is
neither a letter nor a digit. This commit corrects the documentation to agree
with the implementation.
The current definition of `intersectAttrs` is incorrect:
> Return a set consisting of the attributes in the set e2 that also exist in the
> set e1.
Recall that (Nix manual, section 5.1):
> An attribute set is a collection of name-value-pairs (called attributes)
According to the existing description of `intersectAttrs`, the following should
evaluate to the empty set, since no key-value *pair* (i.e. attribute) exists in
both sets:
```
builtins.intersectAttrs { x=3; } {x="foo";}
```
And yet:
```
nix-repl> builtins.intersectAttrs { x=3; } {x="foo";}
{ x = "foo"; }
```
Clearly the intent here was for the *names* of the resulting attribute set to be
the intersection of the *names* of the two arguments, and for the values of the
resulting attribute set to be the values from the second argument.
This commit corrects the definition, making it match the implementation and intent.