From 3e5797e97fe73f0468e2b3faa0ce6b1860617137 Mon Sep 17 00:00:00 2001 From: John Ericson Date: Wed, 10 Apr 2024 15:21:22 -0400 Subject: [PATCH] Document the Nix Archive format This is adopted from Eelco's PhD thesis. --- doc/manual/src/SUMMARY.md.in | 1 + doc/manual/src/protocols/nix-archive.md | 42 +++++++++++++++++++++++++ 2 files changed, 43 insertions(+) create mode 100644 doc/manual/src/protocols/nix-archive.md diff --git a/doc/manual/src/SUMMARY.md.in b/doc/manual/src/SUMMARY.md.in index 43b9e925f..d9044fbda 100644 --- a/doc/manual/src/SUMMARY.md.in +++ b/doc/manual/src/SUMMARY.md.in @@ -110,6 +110,7 @@ - [Derivation](protocols/json/derivation.md) - [Serving Tarball Flakes](protocols/tarball-fetcher.md) - [Store Path Specification](protocols/store-path.md) + - [Nix Archive (NAR) Format](protocols/nix-archive.md) - [Derivation "ATerm" file format](protocols/derivation-aterm.md) - [Glossary](glossary.md) - [Contributing](contributing/index.md) diff --git a/doc/manual/src/protocols/nix-archive.md b/doc/manual/src/protocols/nix-archive.md new file mode 100644 index 000000000..4fb6282ee --- /dev/null +++ b/doc/manual/src/protocols/nix-archive.md @@ -0,0 +1,42 @@ +# Nix Archive (NAR) format + +This is the complete specification of the Nix Archive format. +The Nix Archive format closely follows the abstract specification of a [file system object] tree, +because it is designed to serialize exactly that data structure. + +[file system object]: @docroot@/store/file-system-object.md + +The format of this specification is close to [Extended Backus–Naur form](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form), with the exception of the `str(..)` function / parameterized rule, which length-prefixes and pads strings. +This makes the resulting binary format easier to parse. + +Regular users do *not* need to know this information. +But for those interested in exactly how Nix works, e.g. if they are reimplementing it, this information can be useful. + +```ebnf +nar = str("nix-archive-1"), nar-obj; + +nar-obj = str("("), nar-obj-inner, str(")"); + +nar-obj-inner + = str("type"), str("regular") regular + | str("type"), str("symlink") symlink + | str("type"), str("directory") directory + ; + +regular = [ str("executable"), str("") ], str("contents"), str(contents); + +symlink = str("target"), str(target); + +(* side condition: directory entries must be ordered by their names *) +directory = str("type"), str("directory") { directory-entry }; + +directory-entry = str("entry"), str("("), str("name"), str(name), str("node"), nar-obj, str(")"); +``` + +The `str` function / parameterized rule is defined as follows: + +- `str(s)` = `int(|s|), pad(s);` + +- `int(n)` = the 64-bit little endian representation of the number `n` + +- `pad(s)` = the byte sequence `s`, padded with 0s to a multiple of 8 byte