From 043135a84851b3c33fd8723686a44437eb82e66a Mon Sep 17 00:00:00 2001 From: John Ericson Date: Tue, 9 Apr 2024 17:07:39 -0400 Subject: [PATCH] Document file system object content addressing In addition: - Take the opportunity to add a bunch more missing hyperlinks, too. - Remove some glossary entries that are now subsumed by dedicated pages. We used to not be able to do this without breaking link fragments, but now we can, so pick up where we left off. Co-authored-by: Robert Hensing --- doc/manual/src/SUMMARY.md.in | 1 + .../src/command-ref/nix-collect-garbage.md | 2 +- .../command-ref/nix-env/delete-generations.md | 2 +- doc/manual/src/command-ref/nix-env/install.md | 2 +- doc/manual/src/command-ref/nix-hash.md | 11 ++- doc/manual/src/command-ref/nix-store/dump.md | 6 +- .../src/command-ref/nix-store/export.md | 6 +- .../src/command-ref/nix-store/import.md | 4 +- .../src/command-ref/nix-store/optimise.md | 3 +- .../src/command-ref/nix-store/realise.md | 4 +- .../src/command-ref/nix-store/restore.md | 4 +- doc/manual/src/contributing/documentation.md | 2 +- doc/manual/src/glossary.md | 23 +++++- .../src/language/advanced-attributes.md | 26 +++--- doc/manual/src/language/derivations.md | 4 +- .../src/language/import-from-derivation.md | 4 +- doc/manual/src/language/operators.md | 2 +- .../src/language/string-interpolation.md | 4 +- doc/manual/src/language/values.md | 2 +- .../src/protocols/json/store-object-info.md | 4 +- doc/manual/src/protocols/nix-archive.md | 3 +- doc/manual/src/protocols/store-path.md | 6 +- doc/manual/src/protocols/tarball-fetcher.md | 4 +- .../file-system-object/content-address.md | 80 +++++++++++++++++++ doc/manual/src/store/store-path.md | 10 +++ src/libcmd/misc-store-flags.cc | 24 ++++-- src/libexpr/primops.cc | 2 +- src/libexpr/primops/fetchTree.cc | 4 +- src/libstore/globals.hh | 2 +- src/libstore/path.hh | 2 +- src/libutil/file-content-address.hh | 39 +++++---- src/nix/derivation-show.md | 2 +- src/nix/unix/daemon.cc | 2 +- 33 files 
changed, 228 insertions(+), 68 deletions(-) create mode 100644 doc/manual/src/store/file-system-object/content-address.md diff --git a/doc/manual/src/SUMMARY.md.in b/doc/manual/src/SUMMARY.md.in index fdfd0a927..7ddafef70 100644 --- a/doc/manual/src/SUMMARY.md.in +++ b/doc/manual/src/SUMMARY.md.in @@ -18,6 +18,7 @@ - [Uninstalling Nix](installation/uninstall.md) - [Nix Store](store/index.md) - [File System Object](store/file-system-object.md) + - [Content-Addressing File System Objects](store/file-system-object/content-address.md) - [Store Object](store/store-object.md) - [Store Path](store/store-path.md) - [Store Types](store/types/index.md) diff --git a/doc/manual/src/command-ref/nix-collect-garbage.md b/doc/manual/src/command-ref/nix-collect-garbage.md index 1bc88d858..8e1307c48 100644 --- a/doc/manual/src/command-ref/nix-collect-garbage.md +++ b/doc/manual/src/command-ref/nix-collect-garbage.md @@ -74,4 +74,4 @@ $ nix-collect-garbage -d ``` [profiles]: @docroot@/command-ref/files/profiles.md -[store objects]: @docroot@/glossary.md#gloss-store-object +[store objects]: @docroot@/store/store-object.md diff --git a/doc/manual/src/command-ref/nix-env/delete-generations.md b/doc/manual/src/command-ref/nix-env/delete-generations.md index 6b6ea798e..ae618b2c6 100644 --- a/doc/manual/src/command-ref/nix-env/delete-generations.md +++ b/doc/manual/src/command-ref/nix-env/delete-generations.md @@ -49,7 +49,7 @@ Periodically deleting old generations is important to make garbage collection effective. The is because profiles are also garbage collection roots — any [store object] reachable from a profile is "alive" and ineligible for deletion. 
-[store object]: @docroot@/glossary.md#gloss-store-object +[store object]: @docroot@/store/store-object.md {{#include ./opt-common.md}} diff --git a/doc/manual/src/command-ref/nix-env/install.md b/doc/manual/src/command-ref/nix-env/install.md index d80bcb668..738902041 100644 --- a/doc/manual/src/command-ref/nix-env/install.md +++ b/doc/manual/src/command-ref/nix-env/install.md @@ -17,7 +17,7 @@ The install operation creates a new user environment. It is based on the current generation of the active [profile](@docroot@/command-ref/files/profiles.md), to which a set of [store paths] described by *args* is added. -[store paths]: @docroot@/glossary.md#gloss-store-path +[store paths]: @docroot@/store/store-path.md The arguments *args* map to store paths in a number of possible ways: diff --git a/doc/manual/src/command-ref/nix-hash.md b/doc/manual/src/command-ref/nix-hash.md index 37c8facec..24e91df12 100644 --- a/doc/manual/src/command-ref/nix-hash.md +++ b/doc/manual/src/command-ref/nix-hash.md @@ -20,16 +20,21 @@ an example. The hash is computed over a *serialisation* of each path: a dump of the file system tree rooted at the path. This allows directories and symlinks to be hashed as well as regular files. The dump is in the -*NAR format* produced by [`nix-store +*[Nix Archive (NAR)][Nix Archive] format* produced by [`nix-store --dump`](@docroot@/command-ref/nix-store/dump.md). Thus, `nix-hash path` yields the same cryptographic hash as `nix-store --dump path | md5sum`. +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive + # Options - `--flat`\ - Print the cryptographic hash of the contents of each regular file - *path*. That is, do not compute the hash over the dump of *path*. + Print the cryptographic hash of the contents of each regular file *path*. 
+ That is, instead of computing + the hash of the [Nix Archive (NAR)](@docroot@/store/file-system-object/content-address.md#serial-nix-archive) of *path*, + just [directly hash](@docroot@/store/file-system-object/content-address.md#serial-flat) *path* as is. + This requires *path* to resolve to a regular file rather than a directory. The result is identical to that produced by the GNU commands `md5sum` and `sha1sum`. diff --git a/doc/manual/src/command-ref/nix-store/dump.md b/doc/manual/src/command-ref/nix-store/dump.md index c2f3c42ef..b1066fd4c 100644 --- a/doc/manual/src/command-ref/nix-store/dump.md +++ b/doc/manual/src/command-ref/nix-store/dump.md @@ -1,6 +1,6 @@ # Name -`nix-store --dump` - write a single path to a Nix Archive +`nix-store --dump` - write a single path to a [Nix Archive] ## Synopsis @@ -8,7 +8,7 @@ ## Description -The operation `--dump` produces a NAR (Nix ARchive) file containing the +The operation `--dump` produces a [NAR (Nix ARchive)][Nix Archive] file containing the contents of the file system tree rooted at *path*. The archive is written to standard output. @@ -33,6 +33,8 @@ but not other types of files (such as device nodes). A Nix archive can be unpacked using `nix-store --restore`. +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive + {{#include ./opt-common.md}} {{#include ../opt-common.md}} diff --git a/doc/manual/src/command-ref/nix-store/export.md b/doc/manual/src/command-ref/nix-store/export.md index 1bc46f53b..09f876865 100644 --- a/doc/manual/src/command-ref/nix-store/export.md +++ b/doc/manual/src/command-ref/nix-store/export.md @@ -1,6 +1,6 @@ # Name -`nix-store --export` - export store paths to a Nix Archive +`nix-store --export` - export store paths to a [Nix Archive] ## Synopsis @@ -11,7 +11,7 @@ The operation `--export` writes a serialisation of the specified store paths to standard output in a format that can be imported into another Nix store with `nix-store --import`. 
This is like `nix-store ---dump`, except that the NAR archive produced by that command doesn’t +--dump`, except that the [Nix Archive (NAR)][Nix Archive] produced by that command doesn’t contain the necessary meta-information to allow it to be imported into another Nix store (namely, the set of references of the path). @@ -19,6 +19,8 @@ This command does not produce a *closure* of the specified paths, so if a store path references other store paths that are missing in the target Nix store, the import will fail. +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive + {{#include ./opt-common.md}} {{#include ../opt-common.md}} diff --git a/doc/manual/src/command-ref/nix-store/import.md b/doc/manual/src/command-ref/nix-store/import.md index 2711316a7..42fae0b22 100644 --- a/doc/manual/src/command-ref/nix-store/import.md +++ b/doc/manual/src/command-ref/nix-store/import.md @@ -1,6 +1,8 @@ # Name -`nix-store --import` - import Nix Archive into the store +`nix-store --import` - import [Nix Archive] into the store + +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive # Synopsis diff --git a/doc/manual/src/command-ref/nix-store/optimise.md b/doc/manual/src/command-ref/nix-store/optimise.md index dc392aeb8..b257466b2 100644 --- a/doc/manual/src/command-ref/nix-store/optimise.md +++ b/doc/manual/src/command-ref/nix-store/optimise.md @@ -12,7 +12,7 @@ The operation `--optimise` reduces Nix store disk space usage by finding identical files in the store and hard-linking them to each other. It typically reduces the size of the store by something like 25-35%. Only regular files and symlinks are hard-linked in this manner. 
Files are -considered identical when they have the same NAR archive serialisation: +considered identical when they have the same [Nix Archive (NAR)][Nix Archive] serialisation: that is, regular files must have the same contents and permission (executable or non-executable), and symlinks must have the same contents. @@ -38,3 +38,4 @@ hashing files in `/nix/store/qhqx7l2f1kmwihc9bnxs7rc159hsxnf3-gcc-4.1.1' there are 114486 files with equal contents out of 215894 files in total ``` +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive diff --git a/doc/manual/src/command-ref/nix-store/realise.md b/doc/manual/src/command-ref/nix-store/realise.md index 5428d57fa..6e56387eb 100644 --- a/doc/manual/src/command-ref/nix-store/realise.md +++ b/doc/manual/src/command-ref/nix-store/realise.md @@ -25,11 +25,11 @@ Each of *paths* is processed as follows: If no substitutes are available and no store derivation is given, realisation fails. -[store paths]: @docroot@/glossary.md#gloss-store-path +[store paths]: @docroot@/store/store-path.md [valid]: @docroot@/glossary.md#gloss-validity [store derivation]: @docroot@/glossary.md#gloss-store-derivation [output paths]: @docroot@/glossary.md#gloss-output-path -[store objects]: @docroot@/glossary.md#gloss-store-object +[store objects]: @docroot@/store/store-object.md [closure]: @docroot@/glossary.md#gloss-closure [substituters]: @docroot@/command-ref/conf-file.md#conf-substituters [content-addressed derivations]: @docroot@/contributing/experimental-features.md#xp-feature-ca-derivations diff --git a/doc/manual/src/command-ref/nix-store/restore.md b/doc/manual/src/command-ref/nix-store/restore.md index fcba43df4..2d0aa3127 100644 --- a/doc/manual/src/command-ref/nix-store/restore.md +++ b/doc/manual/src/command-ref/nix-store/restore.md @@ -8,9 +8,11 @@ ## Description -The operation `--restore` unpacks a NAR archive to *path*, which must +The operation `--restore` unpacks a [Nix Archive (NAR)][Nix Archive] 
to *path*, which must not already exist. The archive is read from standard input. +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive + {{#include ./opt-common.md}} {{#include ../opt-common.md}} diff --git a/doc/manual/src/contributing/documentation.md b/doc/manual/src/contributing/documentation.md index 359fdb556..6e7c0a967 100644 --- a/doc/manual/src/contributing/documentation.md +++ b/doc/manual/src/contributing/documentation.md @@ -147,7 +147,7 @@ Please observe these guidelines to ease reviews: ``` A [store object] contains a [file system object] and [references] to other store objects. - [store object]: @docroot@/glossary.md#gloss-store-object + [store object]: @docroot@/store/store-object.md [file system object]: @docroot@/architecture/file-system-object.md [references]: @docroot@/glossary.md#gloss-reference ``` diff --git a/doc/manual/src/glossary.md b/doc/manual/src/glossary.md index cbffda187..080b25d30 100644 --- a/doc/manual/src/glossary.md +++ b/doc/manual/src/glossary.md @@ -1,5 +1,24 @@ # Glossary +- [content address]{#gloss-content-address} + + A + [*content address*](https://en.wikipedia.org/wiki/Content-addressable_storage) + is a secure way to reference immutable data. + The reference is calculated directly from the content of the data being referenced, which means the reference is + [*tamper proof*](https://en.wikipedia.org/wiki/Tamperproofing) + --- variations of the data should always calculate to distinct content addresses. 
+ + For how Nix uses content addresses, see: + + - [Content-Addressing File System Objects](@docroot@/store/file-system-object/content-address.md) + - [content-addressed store object](#gloss-content-addressed-store-object) + - [content-addressed derivation](#gloss-content-addressed-derivation) + + Software Heritage's writing on [*Intrinsic and Extrinsic identifiers*](https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers) is also a good introduction to the value of content-addressing over other referencing schemes. + + Besides content addressing, the Nix store also uses [input addressing](#gloss-input-addressed-store-object). + - [derivation]{#gloss-derivation} A description of a build task. The result of a derivation is a @@ -266,13 +285,15 @@ See [installables](./command-ref/new-cli/nix.md#installables) for [`nix` commands](./command-ref/new-cli/nix.md) (experimental) for details. -- [NAR]{#gloss-nar} +- [Nix Archive (NAR)]{#gloss-nar} A *N*ix *AR*chive. This is a serialisation of a path in the Nix store. It can contain regular files, directories and symbolic links. NARs are generated and unpacked using `nix-store --dump` and `nix-store --restore`. + See [Nix Archive](store/file-system-object/content-address.html#serial-nix-archive) for details. + - [`∅`]{#gloss-emtpy-set} The empty set symbol. In the context of profile history, this denotes a package is not present in a particular version of the profile. diff --git a/doc/manual/src/language/advanced-attributes.md b/doc/manual/src/language/advanced-attributes.md index 1fcc5a95b..3b8e48554 100644 --- a/doc/manual/src/language/advanced-attributes.md +++ b/doc/manual/src/language/advanced-attributes.md @@ -199,19 +199,23 @@ Derivations can declare some infrequently used optional attributes. The `outputHashMode` attribute determines how the hash is computed. It must be one of the following two values: - - `"flat"`\ - The output must be a non-executable regular file. 
If it isn’t, - the build fails. The hash is simply computed over the contents - of that file (so it’s equal to what Unix commands like - `sha256sum` or `sha1sum` produce). + + + - `"flat"` + + The output must be a non-executable regular file; if it isn’t, the build fails. + The hash is + [simply computed over the contents of that file](@docroot@/store/file-system-object/content-address.md#serial-flat) + (so it’s equal to what Unix commands like `sha256sum` or `sha1sum` produce). This is the default. - - `"recursive"` or `"nar"`\ - The hash is computed over the [NAR archive](@docroot@/glossary.md#gloss-nar) dump of the output - (i.e., the result of [`nix-store --dump`](@docroot@/command-ref/nix-store/dump.md)). In - this case, the output can be anything, including a directory - tree. + - `"recursive"` or `"nar"` + + The hash is computed over the + [Nix Archive (NAR)](@docroot@/store/file-system-object/content-address.md#serial-nix-archive) + dump of the output (i.e., the result of [`nix-store --dump`](@docroot@/command-ref/nix-store/dump.md)). + In this case, the output is allowed to be any [file system object], including directories and more. `"recursive"` is the traditional way of indicating this, and is supported since 2005 (virtually the entire history of Nix). @@ -303,7 +307,7 @@ Derivations can declare some infrequently used optional attributes. [`disallowedReferences`](#adv-attr-disallowedReferences) and [`disallowedRequisites`](#adv-attr-disallowedRequisites), the following attributes are available: - - `maxSize` defines the maximum size of the resulting [store object](@docroot@/glossary.md#gloss-store-object). + - `maxSize` defines the maximum size of the resulting [store object](@docroot@/store/store-object.md). - `maxClosureSize` defines the maximum size of the output's closure. - `ignoreSelfRefs` controls whether self-references should be considered when checking for allowed references/requisites. 
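Reviewer note on the `outputHashMode` text above: the difference between `"flat"` and `"nar"` can be sketched in a few lines of Python. This is a minimal illustration, not Nix's implementation. It assumes the NAR layout for a single regular file as given in the manual's Nix Archive specification (length-prefixed, zero-padded strings), and the helper names `nar_str`, `nar_of_regular_file`, `flat_hash`, and `nar_hash` are hypothetical.

```python
import hashlib
import struct


def nar_str(data: bytes) -> bytes:
    """Encode one NAR string: 64-bit little-endian length, the bytes,
    then zero padding up to the next multiple of 8 bytes."""
    padding = (8 - len(data) % 8) % 8
    return struct.pack("<Q", len(data)) + data + b"\0" * padding


def nar_of_regular_file(contents: bytes, executable: bool = False) -> bytes:
    """Serialise a single regular file as a NAR (sketch; directories and
    symlinks are deliberately left out)."""
    parts = [b"nix-archive-1", b"(", b"type", b"regular"]
    if executable:
        # The executable bit is part of the serialisation.
        parts += [b"executable", b""]
    parts += [b"contents", contents, b")"]
    return b"".join(nar_str(p) for p in parts)


def flat_hash(contents: bytes) -> str:
    """"flat" mode: hash the file contents directly, like `sha256sum`."""
    return hashlib.sha256(contents).hexdigest()


def nar_hash(contents: bytes, executable: bool = False) -> str:
    """"nar" mode: hash the NAR serialisation instead of the raw bytes."""
    return hashlib.sha256(nar_of_regular_file(contents, executable)).hexdigest()


if __name__ == "__main__":
    data = b"hello\n"
    print("flat:", flat_hash(data))
    print("nar: ", nar_hash(data))
    print("nar (executable):", nar_hash(data, executable=True))
```

The point of the sketch is that the flat hash sees only the bytes of the file, while the NAR hash also covers the file type and the executable bit, which is why only the latter can address arbitrary file system objects.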
diff --git a/doc/manual/src/language/derivations.md b/doc/manual/src/language/derivations.md index 75f824a34..b95900cdd 100644 --- a/doc/manual/src/language/derivations.md +++ b/doc/manual/src/language/derivations.md @@ -17,7 +17,7 @@ It outputs an attribute set, and produces a [store derivation] as a side effect A symbolic name for the derivation. It is added to the [store path] of the corresponding [store derivation] as well as to its [output paths](@docroot@/glossary.md#gloss-output-path). - [store path]: @docroot@/glossary.md#gloss-store-path + [store path]: @docroot@/store/store-path.md > **Example** > @@ -141,7 +141,7 @@ It outputs an attribute set, and produces a [store derivation] as a side effect By default, a derivation produces a single output called `out`. However, derivations can produce multiple outputs. - This allows the associated [store objects](@docroot@/glossary.md#gloss-store-object) and their [closures](@docroot@/glossary.md#gloss-closure) to be copied or garbage-collected separately. + This allows the associated [store objects](@docroot@/store/store-object.md) and their [closures](@docroot@/glossary.md#gloss-closure) to be copied or garbage-collected separately. > **Example** > diff --git a/doc/manual/src/language/import-from-derivation.md b/doc/manual/src/language/import-from-derivation.md index fb12ba51a..e901f5bcf 100644 --- a/doc/manual/src/language/import-from-derivation.md +++ b/doc/manual/src/language/import-from-derivation.md @@ -2,9 +2,9 @@ The value of a Nix expression can depend on the contents of a [store object]. 
-[store object]: @docroot@/glossary.md#gloss-store-object +[store object]: @docroot@/store/store-object.md -Passing an expression `expr` that evaluates to a [store path](@docroot@/glossary.md#gloss-store-path) to any built-in function which reads from the filesystem constitutes Import From Derivation (IFD): +Passing an expression `expr` that evaluates to a [store path](@docroot@/store/store-path.md) to any built-in function which reads from the filesystem constitutes Import From Derivation (IFD): - [`import`](./builtins.md#builtins-import)` expr` - [`builtins.readFile`](./builtins.md#builtins-readFile)` expr` diff --git a/doc/manual/src/language/operators.md b/doc/manual/src/language/operators.md index 698fed47e..311887e96 100644 --- a/doc/manual/src/language/operators.md +++ b/doc/manual/src/language/operators.md @@ -128,7 +128,7 @@ The result is a string. > The file or directory at *path* must exist and is copied to the [store]. > The path appears in the result as the corresponding [store path]. -[store path]: @docroot@/glossary.md#gloss-store-path +[store path]: @docroot@/store/store-path.md [store]: @docroot@/glossary.md#gloss-store [String and path concatenation]: #string-and-path-concatenation diff --git a/doc/manual/src/language/string-interpolation.md b/doc/manual/src/language/string-interpolation.md index 1f8fecca8..1e2c4ad95 100644 --- a/doc/manual/src/language/string-interpolation.md +++ b/doc/manual/src/language/string-interpolation.md @@ -107,9 +107,9 @@ An expression that is interpolated must evaluate to one of the following: A string interpolates to itself. -A path in an interpolated expression is first copied into the Nix store, and the resulting string is the [store path] of the newly created [store object](@docroot@/glossary.md#gloss-store-object). +A path in an interpolated expression is first copied into the Nix store, and the resulting string is the [store path] of the newly created [store object](@docroot@/store/store-object.md). 
-[store path]: @docroot@/glossary.md#gloss-store-path +[store path]: @docroot@/store/store-path.md > **Example** > diff --git a/doc/manual/src/language/values.md b/doc/manual/src/language/values.md index 2dd52b379..4eb1887fa 100644 --- a/doc/manual/src/language/values.md +++ b/doc/manual/src/language/values.md @@ -124,7 +124,7 @@ For example, assume you used a file path in an interpolated string during a `nix repl` session. Later in the same session, after having changed the file contents, evaluating the interpolated string with the file path again might not return a new [store path], since Nix might not re-read the file contents. Use `:r` to reset the repl as needed. - [store path]: @docroot@/glossary.md#gloss-store-path + [store path]: @docroot@/store/store-path.md Path literals can also include [string interpolation], besides being [interpolated into other expressions]. diff --git a/doc/manual/src/protocols/json/store-object-info.md b/doc/manual/src/protocols/json/store-object-info.md index 179cafbb4..22a14715f 100644 --- a/doc/manual/src/protocols/json/store-object-info.md +++ b/doc/manual/src/protocols/json/store-object-info.md @@ -28,9 +28,9 @@ Info about a [store object]. Content address of this store object's file system object, used to compute its store path. -[store path]: @docroot@/glossary.md#gloss-store-path +[store path]: @docroot@/store/store-path.md [file system object]: @docroot@/store/file-system-object.md -[Nix Archive]: @docroot@/glossary.md#gloss-nar +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive ## Impure fields diff --git a/doc/manual/src/protocols/nix-archive.md b/doc/manual/src/protocols/nix-archive.md index 4fb6282ee..bfc523b3d 100644 --- a/doc/manual/src/protocols/nix-archive.md +++ b/doc/manual/src/protocols/nix-archive.md @@ -1,9 +1,10 @@ # Nix Archive (NAR) format -This is the complete specification of the Nix Archive format. +This is the complete specification of the [Nix Archive] format. 
The Nix Archive format closely follows the abstract specification of a [file system object] tree, because it is designed to serialize exactly that data structure. +[Nix Archive]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive [file system object]: @docroot@/store/file-system-object.md The format of this specification is close to [Extended Backus–Naur form](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form), with the exception of the `str(..)` function / parameterized rule, which length-prefixes and pads strings. diff --git a/doc/manual/src/protocols/store-path.md b/doc/manual/src/protocols/store-path.md index 565c4fa75..657774238 100644 --- a/doc/manual/src/protocols/store-path.md +++ b/doc/manual/src/protocols/store-path.md @@ -1,12 +1,14 @@ # Complete Store Path Calculation -This is the complete specification for how store paths are calculated. +This is the complete specification for how [store path]s are calculated. The format of this specification is close to [Extended Backus–Naur form](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form), but must deviate for a few things such as hash functions which we treat as bidirectional for specification purposes. Regular users do *not* need to know this information --- store paths can be treated as black boxes computed from the properties of the store objects they refer to. But for those interested in exactly how Nix works, e.g. if they are reimplementing it, this information can be useful. +[store path]: @docroot@/store/store-path.md + ## Store path proper ```ebnf @@ -113,7 +115,7 @@ where Note that `id` = `"out"`, regardless of the name part of the store path. Also note that NAR + SHA-256 must not use this case, and instead must use the `type` = `"source:" ...` case. 
-[Nix Archive (NAR)]: @docroot@/glossary.md#gloss-NAR +[Nix Archive (NAR)]: @docroot@/store/file-system-object/content-address.md#serial-nix-archive [sha-256]: https://en.m.wikipedia.org/wiki/SHA-256 ### Historical Note diff --git a/doc/manual/src/protocols/tarball-fetcher.md b/doc/manual/src/protocols/tarball-fetcher.md index 274fa6d63..24ec7ae14 100644 --- a/doc/manual/src/protocols/tarball-fetcher.md +++ b/doc/manual/src/protocols/tarball-fetcher.md @@ -22,7 +22,7 @@ Link: ; rel="immutable" *flakeref* must be a tarball flakeref. It can contain the tarball flake attributes `narHash`, `rev`, `revCount` and `lastModified`. If `narHash` is included, its -value must be the NAR hash of the unpacked tarball (as computed via +value must be the [NAR hash][Nix Archive] of the unpacked tarball (as computed via `nix hash path`). Nix checks the contents of the returned tarball against the `narHash` attribute. The `rev` and `revCount` attributes are useful when the tarball flake is a mirror of a fetcher type that @@ -40,3 +40,5 @@ Link: **Warning** +> +> This method is part of the [`git-hashing`][xp-feature-git-hashing] experimental feature. + +Git's file system model is very close to Nix's, and so Git's content addressing method is a pretty good fit. +Just as with regular Git, files and symlinks are hashed as git "blobs", and directories are hashed as git "trees". + +However, one difference between Nix's and Git's file system model needs special treatment. +Plain files, executable files, and symlinks are not differentiated as distinctly addressable objects, but by their context: by the directory entry that refers to them. +That means so long as the root object is a directory, there is no problem: +every non-directory object is owned by a parent directory, and the entry that refers to it provides the missing information. 
+However, if the root object is not a directory, then we have no way of knowing which one of an executable file, non-executable file, or symlink it is supposed to be. + +In response to this, we have decided to treat a bare file as a non-executable file. +This is similar to what we do with [flat serialisation](#serial-flat), which also lacks this information. +To avoid an address collision, attempts to hash a bare executable file or symlink will result in an error (just as would happen for flat serialisation). +Thus, Git can encode some, but not all of Nix's "File System Objects", and this sort of content-addressing is likewise partial. + +In the future, we may support a Git-like hash for such file system objects, or we may adopt another Merkle DAG format which is capable of representing all Nix file system objects. + +[file system object]: ../file-system-object.md +[store object]: ../store-object.md +[xp-feature-git-hashing]: @docroot@/contributing/experimental-features.md#xp-feature-git-hashing diff --git a/doc/manual/src/store/store-path.md b/doc/manual/src/store/store-path.md index 085aead51..beec2389b 100644 --- a/doc/manual/src/store/store-path.md +++ b/doc/manual/src/store/store-path.md @@ -1,5 +1,11 @@ # Store Path +> **Example** +> +> `/nix/store/a040m110amc4h71lds2jmr8qrkj2jhxd-git-2.38.1` +> +> A rendered store path + Nix implements references to [store objects](./index.md#store-object) as *store paths*. Think of a store path as an [opaque], [unique identifier]: @@ -37,6 +43,10 @@ A store path is rendered to a file system path as the concatenation of > store directory digest name > ``` +Exactly how the digest is calculated depends on the type of store path. +Store path digests are *supposed* to be opaque, and so for most operations, it is not necessary to know the details. +That said, the manual has a full [specification of store path digests](@docroot@/protocols/store-path.md). + ## Store Directory Every [Nix store](./index.md) has a store directory. 
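Reviewer note on the Git section of `content-address.md` above: the bare-file rule can be illustrated with a short Python sketch. It assumes SHA-1 and the standard Git blob framing (`blob <size>\0<contents>`), as in regular Git. The function names `git_blob_hash` and `git_hash_root` and the `kind` strings are hypothetical, chosen only for this illustration; this is not Nix's actual code.

```python
import hashlib


def git_blob_hash(contents: bytes) -> str:
    """Hash bytes the way Git hashes a blob: a "blob <size>\\0" header
    followed by the raw contents, digested with SHA-1."""
    header = b"blob " + str(len(contents)).encode("ascii") + b"\0"
    return hashlib.sha1(header + contents).hexdigest()


def git_hash_root(kind: str, contents: bytes) -> str:
    """Hash a root object that is not a directory.

    A blob does not record whether it is a plain file, an executable file,
    or a symlink, so only the non-executable regular file is accepted here;
    the other kinds are rejected to avoid address collisions.
    """
    if kind != "regular":
        raise ValueError(f"cannot hash a bare {kind} with the Git method")
    return git_blob_hash(contents)


if __name__ == "__main__":
    # The empty blob has a well-known Git object id.
    print(git_hash_root("regular", b""))
    try:
        git_hash_root("symlink", b"target")
    except ValueError as err:
        print("error:", err)
```

Rejecting the ambiguous cases, rather than silently hashing them as blobs, is what keeps two different file system objects from ending up with the same address.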
diff --git a/src/libcmd/misc-store-flags.cc b/src/libcmd/misc-store-flags.cc index e66d3f63b..063a9dd9e 100644 --- a/src/libcmd/misc-store-flags.cc +++ b/src/libcmd/misc-store-flags.cc @@ -81,9 +81,15 @@ Args::Flag fileIngestionMethod(FileIngestionMethod * method) How to compute the hash of the input. One of: - - `nar` (the default): Serialises the input as an archive (following the [_Nix Archive Format_](https://edolstra.github.io/pubs/phd-thesis.pdf#page=101)) and passes that to the hash function. + - `nar` (the default): + Serialises the input as a + [Nix Archive](@docroot@/store/file-system-object/content-address.md#serial-nix-archive) + and passes that to the hash function. - - `flat`: Assumes that the input is a single file and directly passes it to the hash function; + - `flat`: + Assumes that the input is a single file and + [directly passes](@docroot@/store/file-system-object/content-address.md#serial-flat) + it to the hash function. )", .labels = {"file-ingestion-method"}, .handler = {[method](std::string s) { @@ -97,16 +103,24 @@ Args::Flag contentAddressMethod(ContentAddressMethod * method) return Args::Flag { .longName = "mode", // FIXME indentation carefully made for context, this is messed up. + /* FIXME link to store object content-addressing not file system + object content addressing once we have that page. */ .description = R"( How to compute the content-address of the store object. One of: - - `nar` (the default): Serialises the input as an archive (following the [_Nix Archive Format_](https://edolstra.github.io/pubs/phd-thesis.pdf#page=101)) and passes that to the hash function. + - `nar` (the default): + Serialises the input as a + [Nix Archive](@docroot@/store/file-system-object/content-address.md#serial-nix-archive) + and passes that to the hash function. 
- - `flat`: Assumes that the input is a single file and directly passes it to the hash function; + - `flat`: + Assumes that the input is a single file and + [directly passes](@docroot@/store/file-system-object/content-address.md#serial-flat) + it to the hash function. - `text`: Like `flat`, but used for - [derivations](@docroot@/glossary.md#store-derivation) serialized in store object and + [derivations](@docroot@/glossary.md#store-derivation) serialized in store object and [`builtins.toFile`](@docroot@/language/builtins.html#builtins-toFile). For advanced use-cases only; for regular usage prefer `nar` and `flat. diff --git a/src/libexpr/primops.cc b/src/libexpr/primops.cc index 109127d1d..1ad1e3fc0 100644 --- a/src/libexpr/primops.cc +++ b/src/libexpr/primops.cc @@ -4515,7 +4515,7 @@ void EvalState::createBaseEnv() 1683705525 ``` - The [store path](@docroot@/glossary.md#gloss-store-path) of a derivation depending on `currentTime` will differ for each evaluation, unless both evaluate `builtins.currentTime` in the same second. + The [store path](@docroot@/store/store-path.md) of a derivation depending on `currentTime` will differ for each evaluation, unless both evaluate `builtins.currentTime` in the same second. 
)", .impureOnly = true, }); diff --git a/src/libexpr/primops/fetchTree.cc b/src/libexpr/primops/fetchTree.cc index e27f30512..fa462dc33 100644 --- a/src/libexpr/primops/fetchTree.cc +++ b/src/libexpr/primops/fetchTree.cc @@ -200,8 +200,8 @@ static RegisterPrimOp primop_fetchTree({ .doc = R"( Fetch a file system tree or a plain file using one of the supported backends and return an attribute set with: - - the resulting fixed-output [store path](@docroot@/glossary.md#gloss-store-path) - - the corresponding [NAR](@docroot@/glossary.md#gloss-nar) hash + - the resulting fixed-output [store path](@docroot@/store/store-path.md) + - the corresponding [NAR](@docroot@/store/file-system-object/content-address.md#serial-nix-archive) hash - backend-specific metadata (currently not documented). *input* must be an attribute set with the following attributes: diff --git a/src/libstore/globals.hh b/src/libstore/globals.hh index 108933422..dc18a11fc 100644 --- a/src/libstore/globals.hh +++ b/src/libstore/globals.hh @@ -910,7 +910,7 @@ public: "substituters", R"( A list of [URLs of Nix stores](@docroot@/store/types/index.md#store-url-format) to be used as substituters, separated by whitespace. - A substituter is an additional [store](@docroot@/glossary.md#gloss-store) from which Nix can obtain [store objects](@docroot@/glossary.md#gloss-store-object) instead of building them. + A substituter is an additional [store](@docroot@/glossary.md#gloss-store) from which Nix can obtain [store objects](@docroot@/store/store-object.md) instead of building them. Substituters are tried based on their priority value, which each substituter can set independently. Lower value means higher priority. diff --git a/src/libstore/path.hh b/src/libstore/path.hh index 4ca6747b3..3c26fc515 100644 --- a/src/libstore/path.hh +++ b/src/libstore/path.hh @@ -13,7 +13,7 @@ struct Hash; * \ref StorePath "Store path" is the fundamental reference type of Nix. * A store paths refers to a Store object. 
* - * See glossary.html#gloss-store-path for more information on a + * See store/store-path.html for more information on a * conceptual level. */ class StorePath diff --git a/src/libutil/file-content-address.hh b/src/libutil/file-content-address.hh index 145a8fb1f..c19de27ed 100644 --- a/src/libutil/file-content-address.hh +++ b/src/libutil/file-content-address.hh @@ -12,16 +12,28 @@ struct SourcePath; /** * An enumeration of the ways we can serialize file system * objects. + * + * See `file-system-object/content-address.md#serial` in the manual for + * a user-facing description of this concept, but note that this type is also + * used for storing or sending copies, not just for addressing. + * Note also that there are other content addressing methods that don't + * correspond to a serialisation method. */ enum struct FileSerialisationMethod : uint8_t { /** * Flat-file. The contents of a single file exactly. + * + * See `file-system-object/content-address.md#serial-flat` in the + * manual. */ Flat, /** * Nix Archive. Serializes the file-system object in * Nix Archive format. + * + * See `file-system-object/content-address.md#serial-nix-archive` in + * the manual. */ Recursive, }; @@ -81,33 +93,32 @@ HashResult hashPath( /** * An enumeration of the ways we can ingest file system * objects, producing a hash or digest. + * + * See `file-system-object/content-address.md` in the manual for a + * user-facing description of this concept. */ enum struct FileIngestionMethod : uint8_t { /** * Hash `FileSerialisationMethod::Flat` serialisation. + * + * See `file-system-object/content-address.md#serial-flat` in the + * manual. */ Flat, /** - * Hash `FileSerialisationMethod::Git` serialisation. + * Hash `FileSerialisationMethod::Recursive` serialisation. + * + * See `file-system-object/content-address.md#serial-nix-archive` in the + * manual. */ Recursive, /** - * Git hashing. In particular files are hashed as git "blobs", and - * directories are hashed as git "trees". 
+ * Git hashing. * - * Unlike `Flat` and `Recursive`, this is not a hash of a single - * serialisation but a [Merkle - * DAG](https://en.wikipedia.org/wiki/Merkle_tree) of multiple - * rounds of serialisation and hashing. - * - * @note Git's data model is slightly different, in that a plain - * file doesn't have an executable bit, directory entries do - * instead. We decide treat a bare file as non-executable by fiat, - * as we do with `FileIngestionMethod::Flat` which also lacks this - * information. Thus, Git can encode some but all of Nix's "File - * System Objects", and this sort of hashing is likewise partial. + * See `file-system-object/content-address.md#serial-git` in the + * manual. */ Git, }; diff --git a/src/nix/derivation-show.md b/src/nix/derivation-show.md index 2437ea08f..9fff58ef9 100644 --- a/src/nix/derivation-show.md +++ b/src/nix/derivation-show.md @@ -50,7 +50,7 @@ By default, this command only shows top-level derivations, but with `nix derivation show` outputs a JSON map of [store path]s to derivations in the following format: -[store path]: @docroot@/glossary.md#gloss-store-path +[store path]: @docroot@/store/store-path.md {{#include ../../protocols/json/derivation.md}} diff --git a/src/nix/unix/daemon.cc b/src/nix/unix/daemon.cc index 8afcbe982..de77a7b6b 100644 --- a/src/nix/unix/daemon.cc +++ b/src/nix/unix/daemon.cc @@ -58,7 +58,7 @@ struct AuthorizationSettings : Config { this, {"root"}, "trusted-users", R"( A list of user names, separated by whitespace. - These users will have additional rights when connecting to the Nix daemon, such as the ability to specify additional [substituters](#conf-substituters), or to import unsigned [NARs](@docroot@/glossary.md#gloss-nar). + These users will have additional rights when connecting to the Nix daemon, such as the ability to specify additional [substituters](#conf-substituters), or to import unsigned realisations or unsigned input-addressed store objects. 
You can also specify groups by prefixing names with `@`. For instance, `@wheel` means all users in the `wheel` group.