Part of RFC 133
Extracted from our old IPFS branches.
Co-Authored-By: Matthew Bauer <mjbauer95@gmail.com>
Co-Authored-By: Carlo Nucera <carlo.nucera@protonmail.com>
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Co-authored-by: Florian Klink <flokli@flokli.de>
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.
Progress towards #9205
While preparing PRs like #9753, I've had to change error messages in
dozens of code paths. It would be nice if instead of
EvalError("expected 'boolean' but found '%1%'", showType(v))
we could write
TypeError(v, "boolean")
or similar. Then, changing the error message could be a mechanical
refactor with the compiler pointing out places the constructor needs to
be changed, rather than the error-prone process of grepping through the
codebase. Structured errors would also help prevent the "same" error
from having multiple slightly different messages, and could be a first
step towards error codes / an error index.
This PR reworks the exception infrastructure in `libexpr` to
support exception types with different constructor signatures than
`BaseError`. Actually refactoring the exceptions to use structured data
will come in a future PR (this one is big enough already, as it has to
touch every exception in `libexpr`).
The core design is in `eval-error.hh`. Generally, errors like this:
state.error("'%s' is not a string", getAttrPathStr())
.debugThrow<TypeError>()
are transformed like this:
state.error<TypeError>("'%s' is not a string", getAttrPathStr())
.debugThrow()
The type annotation has moved from `ErrorBuilder::debugThrow` to
`EvalState::error`.
In the process, partially undo e89b5bd0bf
in that the ancient < 2.4 version is now supported again by the
serializer again. `LegacySSHStore`, instead of also asserting that the
version is at least 4, just checks that `narHash` is set.
This allows us to better test the serializer in isolation for both
versions (< 4 and >= 4).
It is not inherently tied to `LocalStore`, it could probably even go in
`libnixutil`. Functions not attached to `LocalStore` should not be
declared in `local-store.hh`.
I am moving it to facilitate experimenting for #9344. If
canonicalisation should be done client-side in client-side builds, there
wouldn't be a `LocalStore` at all so having to include that header to
get this freestanding function is cumbersome and wrong.
Perhaps canonicalisation should still be done server-side for security
reasons --- I don't mean to make that judgement call now --- but even if
so, this freestanding function still isn't connected to `LocalStore` so
while less urgent it is still better to move out of this header.
All OS and IO operations should be moved out, leaving only some misc
portable pure functions.
This is useful to avoid copious CPP when doing things like Windows and
Emscripten ports.
Newly exposed functions to break cycles:
- `restoreSignals`
- `updateWindowSize`
Worker Protocol:
Note that the worker protocol already had a serialization for
`BuildResult`; this was added in
a4604f1928. It didn't have any versioning
support because at that time reusable seralizers were not away for the protocol
version. It could thus only be used for new messages also introduced in
that commit.
Now that we do support versioning in reusable serializers, we can expand
it to support all known versions and use it in many more places.
The exist test data becomes the version 1.29 tests: note that those
files' contents are unchanged. 1.28 and 1.27 tests are added to cover
the older code-paths.
The keyered build result test only has 1.29 because the keying was also
added in a4604f19284254ac98f19a13ff7c2216de7fe176; the older
serializations are always used unkeyed.
Serve Protocol:
Conversely, no attempt was made to factor out such a serializer for the
serve protocol, so our work there in this commit for that protocol
proceeds from scratch.
This will allow us to factor out logic, which is currently scattered
inline, into several reusable instances
The tests are also updated to support versioning. Currently all Worker
and Serve protocol tests are using the minimum version, since no
version-specific serialisers have been created yet. But in subsequent
commits when that changes, we will test individual versions to ensure
complete coverage.
To start, it is just a clone of the common protocol. But now that we
have the separate protocol implementations, we can add versioning
information without the versions of one protocol leaking into another.
Using the infrastructure from the previous commit, we don't have to
duplicate code for shared behavior.
Motivation: No more perverse incentives. [0] did some awkward things
because the serialisers did not store the version. I don't want anyone
making changes to be pushed towards keeping the serialization logic with
the core data types just because it's easier or the alternative is
tedious.
The actual versioning of the Worker and Serve protocol serialisers
(Common remains unversioned as the underlying mini-protocols are not
versioned) will happen in subsequent commits / PRs.
[0]: fe1f34fa60
This introduces some shared infrastructure for our notion of protocols.
We can then define multiple protocols in terms of that notion.
We an also express how particular protocols depend on each other.
For example, we can define a common protocol and a worker protocol,
where the second depends on the first in terms of the data types it can
read and write.
The "serve" protocol can just use the common one for now, but will
eventually need its own machinary just like the worker protocol for
version-aware serialisers
Whereas `ContentAddressWithReferences` is a sum type complex because different
varieties support different notions of reference, and
`ContentAddressMethod` is a nested enum to support that,
`ContentAddress` can be a simple pair of a method and hash.
`ContentAddress` does not need to be a sum type on the outside because
the choice of method doesn't effect what type of hashes we can use.
Co-Authored-By: Cale Gibbard <cgibbard@gmail.com>
Pass this around instead of `Source &` and `Sink &` directly. This will
give us something to put the protocol version on once the time comes.
To do this ergonomically, we need to expose `RemoteStore::Connection`,
so do that too. Give it some more API docs while we are at it.
The motivation is exactly the same as for the last commit. In addition,
this anticipates us formally defining separate serialisers for the serve
protocol.
See API docs on that struct for why. The pasing as as template argument
doesn't yet happen in that commit, but will instead happen in later
commit.
Also make `WorkerOp` (now `Op`) and enum struct. This led us to catch
that two operations were not handled!
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
This is generally a fine practice: Putting implementations in headers
makes them harder to read and slows compilation. Unfortunately it is
necessary for templates, but we can ameliorate that by putting them in a
separate header. Only files which need to instantiate those templates
will need to include the header with the implementation; the rest can
just include the declaration.
This is now documenting in the contributing guide.
Also, it just happens that these polymorphic serializers are the
protocol agnostic ones. (Worker and serve protocol have the same logic
for these container types.) This means by doing this general template
cleanup, we are also getting a head start on better indicating which
code is protocol-specific and which code is shared between protocols.
This is the more typically way to do [Argument-dependent
lookup](https://en.cppreference.com/w/cpp/language/adl)-leveraging
generic serializers in C++. It makes the relationship between the `read`
and `write` methods more clear and rigorous, and also looks more
familiar to users coming from other languages that do not have C++'s
libertine ad-hoc overloading.
I am returning to this because during the review in
https://github.com/NixOS/nix/pull/6223, it came up as something that
would make the code easier to read --- easier today hopefully already,
but definitely easier if we were have multiple codified protocols with
code sharing between them as that PR seeks to accomplish.
If I recall correctly, the main criticism of this the first time around
(in 2020) was that having to specify the type when writing, e.g.
`WorkerProto<MyType>::write`, was too verbose and cumbersome. This is
now addressed with the `workerProtoWrite` wrapper function.
This method is also the way `nlohmann::json`, which we have used for a
number of years now, does its serializers, for what its worth.
This reverts commit 45a0ed82f0. That
commit in turn reverted 9ab07e99f5.
In many cases we are dealing with a collection of realisations, they are
all outputs of the same derivation. In that case, we don't need
"derivation hashes modulos" to be part of our map key, because the
output names alone will be unique. Those hashes are still part of the
realisation proper, so we aren't loosing any information, we're just
"normalizing our schema" by narrowing the "primary key".
Besides making our data model a bit "tighter" this allows us to avoid a
double `for` loop in `DerivationGoal::waiteeDone`. The inner `for` loop
was previously just to select the output we cared about without knowing
its hash. Now we can just select the output by name directly.
Note that neither protocol is changed as part of this: we are still
transferring `DrvOutputs` over the wire for `BuildResult`s. I would only
consider revising this once #6223 is merged, and we can mention protocol
versions inside factored-out serialization logic. Until then it is
better not change anything because it would come a the cost of code
reuse.
Documentation on "classic" commands with many sub-commands are
notoriously hard to discover due to lack of overview and anchor links.
Additionally the information on common options and environment variables
is not accessible offline in man pages, and therefore often overlooked
by readers.
With this change, each sub-command of nix-store and nix-env gets its
own page in the manual (listed in the table of contents), and each own
man page.
Also, man pages for each subcommand now (again) list common options
and environment variables. While this makes each page quite long and
some common parameters don't apply, this should still make it easier
to navigate as that additional information was not accessible on the
command line at all.
It is now possible to run 'nix-store --<subcommand> --help` to display
help pages for the given subcommand.
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
With the switch to C++20, the rules became more strict, and we can no
longer initialize base classes. Make them comments instead.
(BTW
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2287r1.html
this offers some new syntax for this use-case. Hopefully this will be
adopted and we can eventually use it.)