Add a new `--log-format` cli argument to change the format of the logs.
The possible values are
- raw (the default one for old-style commands)
- bar (the default one for new-style commands)
- bar-with-logs (equivalent to `--print-build-logs`)
- internal-json (the internal machine-readable json format)
When used with `readFile`, we have a pretty good heuristic of the file
size, so `reserve` this in the `string`. This will save some allocation
/ copy when the string is growing.
This closes#3026 by allowing `builtins.readFile` to read a file with a
wrongly reported file size, for example, files in `/proc` may report a
file size of 0. Reading file in `/proc` is not a good enough motivation,
however I do think it just makes nix more robust by allowing more file
to be read. Especially, I do considerer the previous behavior to be
dangerous because nix was previously reading truncated files. Examples
of file system which incorrectly report file size may be network file
system or dynamic file system (for performance reason, a dynamic file
system such as FUSE may generate the content of the file on demand).
```
nix-repl> builtins.readFile "/proc/version"
""
```
With this commit:
```
nix-repl> builtins.readFile "/proc/version"
"Linux version 5.6.7 (nixbld@localhost) (gcc version 9.3.0 (GCC)) #1-NixOS SMP Thu Apr 23 08:38:27 UTC 2020\n"
```
Here is a summary of the behavior changes:
- If the reported size is smaller, previous implementation
was silently returning a truncated file content. The new implementation
is returning the correct file content.
- If a file had a bigger reported file size, previous implementation was
failing with an exception, but the new implementation is returning the
correct file content. This change of behavior is coherent with this pull
request.
Open questions
- The behavior is unchanged for correctly reported file size, however
performances may vary because it uses the more complex sink interface.
Considering that sink is used a lot, I don't think this impacts the
performance a lot.
- `builtins.readFile` on an infinite file, such as `/dev/random` may
fill the memory.
- it does not support adding file to store, such as `${/proc/version}`.
Suppose I have a path /nix/store/[hash]-[name]/a/a/a/a/a/[...]/a,
long enough that everything after "/nix/store/" is longer than 4096
(MAX_PATH) bytes.
Nix will happily allow such a path to be inserted into the store,
because it doesn't look at all the nested structure. It just cares
about the /nix/store/[hash]-[name] part. But, when the path is deleted,
we encounter a problem. Nix will move the path to /nix/store/trash, but
then when it's trying to recursively delete the trash directory, it will
at some point try to unlink
/nix/store/trash/[hash]-[name]/a/a/a/a/a/[...]/a. This will fail,
because the path is too long. After this has failed, any store deletion
operation will never work again, because Nix needs to delete the trash
directory before recreating it to move new things to it. (I assume this
is because otherwise a path being deleted could already exist in the
trash, and then moving it would fail.)
This means that if I can trick somebody into just fetching a tarball
containing a path of the right length, they won't be able to delete
store paths or garbage collect ever again, until the offending path is
manually removed from /nix/store/trash. (And even fixing this manually
is quite difficult if you don't understand the issue, because the
absolute path that Nix says it failed to remove is also too long for
rm(1).)
This patch fixes the issue by making Nix's recursive delete operation
use unlinkat(2). This function takes a relative path and a directory
file descriptor. We ensure that the relative path is always just the
name of the directory entry, and therefore its length will never exceed
255 bytes. This means that it will never even come close to AX_PATH,
and Nix will therefore be able to handle removing arbitrarily deep
directory hierachies.
Since the directory file descriptor is used for recursion after being
used in readDirectory, I made a variant of readDirectory that takes an
already open directory stream, to avoid the directory being opened
multiple times. As we have seen from this issue, the less we have to
interact with paths, the better, and so it's good to reuse file
descriptors where possible.
I left _deletePath as succeeding even if the parent directory doesn't
exist, even though that feels wrong to me, because without that early
return, the linux-sandbox test failed.
Reported-by: Alyssa Ross <hi@alyssa.is>
Thanks-to: Puck Meerburg <puck@puckipedia.com>
Tested-by: Puck Meerburg <puck@puckipedia.com>
Reviewed-by: Puck Meerburg <puck@puckipedia.com>
Most functions now take a StorePath argument rather than a Path (which
is just an alias for std::string). The StorePath constructor ensures
that the path is syntactically correct (i.e. it looks like
<store-dir>/<base32-hash>-<name>). Similarly, functions like
buildPaths() now take a StorePathWithOutputs, rather than abusing Path
by adding a '!<outputs>' suffix.
Note that the StorePath type is implemented in Rust. This involves
some hackery to allow Rust values to be used directly in C++, via a
helper type whose destructor calls the Rust type's drop()
function. The main issue is the dynamic nature of C++ move semantics:
after we have moved a Rust value, we should not call the drop function
on the original value. So when we move a value, we set the original
value to bitwise zero, and the destructor only calls drop() if the
value is not bitwise zero. This should be sufficient for most types.
Also lots of minor cleanups to the C++ API to make it more modern
(e.g. using std::optional and std::string_view in some places).
The intent of the code was that if the window size cannot be determined,
it would be treated as having the maximum possible size. Because of a
missing assignment, it was actually treated as having a width of 0.
The reason the width could not be determined was because it was obtained
from stdout, not stderr, even though the printing was done to stderr.
This commit addresses both issues.
This adds a command 'nix make-content-addressable' that rewrites the
specified store paths into content-addressable paths. The advantage of
such paths is that 1) they can be imported without signatures; 2) they
can enable deduplication in cases where derivation changes do not
cause output changes (apart from store path hashes).
For example,
$ nix make-content-addressable -r nixpkgs.cowsay
rewrote '/nix/store/g1g31ah55xdia1jdqabv1imf6mcw0nb1-glibc-2.25-49' to '/nix/store/48jfj7bg78a8n4f2nhg269rgw1936vj4-glibc-2.25-49'
...
rewrote '/nix/store/qbi6rzpk0bxjw8lw6azn2mc7ynnn455q-cowsay-3.03+dfsg1-16' to '/nix/store/iq6g2x4q62xp7y7493bibx0qn5w7xz67-cowsay-3.03+dfsg1-16'
We can then copy the resulting closure to another store without
signatures:
$ nix copy --trusted-public-keys '' ---to ~/my-nix /nix/store/iq6g2x4q62xp7y7493bibx0qn5w7xz67-cowsay-3.03+dfsg1-16
In order to support self-references in content-addressable paths,
these paths are hashed "modulo" self-references, meaning that
self-references are zeroed out during hashing. Somewhat annoyingly,
this means that the NAR hash stored in the Nix database is no longer
necessarily equal to the output of "nix hash-path"; for
content-addressable paths, you need to pass the --modulo flag:
$ nix path-info --json /nix/store/iq6g2x4q62xp7y7493bibx0qn5w7xz67-cowsay-3.03+dfsg1-16 | jq -r .[].narHash
sha256:0ri611gdilz2c9rsibqhsipbfs9vwcqvs811a52i2bnkhv7w9mgw
$ nix hash-path --type sha256 --base32 /nix/store/iq6g2x4q62xp7y7493bibx0qn5w7xz67-cowsay-3.03+dfsg1-16
1ggznh07khq0hz6id09pqws3a8q9pn03ya3c03nwck1kwq8rclzs
$ nix hash-path --type sha256 --base32 /nix/store/iq6g2x4q62xp7y7493bibx0qn5w7xz67-cowsay-3.03+dfsg1-16 --modulo iq6g2x4q62xp7y7493bibx0qn5w7xz67
0ri611gdilz2c9rsibqhsipbfs9vwcqvs811a52i2bnkhv7w9mgw