reduces peak hep memory use on eval of our test system from 264.4MB to 242.3MB,
possibly also a slight performance boost.
theoretically memory use could be cut down by another eight bytes per Pos on
average by turning it into a tuple containing an index into a global base
position table with row and column offsets, but that doesn't seem worth the
effort at this point.
This changes the representation of the interrupt callback list to
be safe to use during interrupt handling.
Holding a lock while executing arbitrary functions is something to
avoid in general, because of the risk of deadlock.
Such a deadlock occurs in https://github.com/NixOS/nix/issues/3294
where ~CurlDownloader tries to deregister its interrupt callback.
This happens during what seems to be a triggerInterrupt() by the
daemon connection's MonitorFdHup thread. This bit I can not confirm
based on the stack trace though; it's based on reading the code,
so no absolute certainty, but a smoking gun nonetheless.
we'll retain the old coerceToString interface that returns a string, but callers
that don't need the returned value to outlive the Value it came from can save
copies by using the new interface instead. for values that weren't stringy we'll
pass a new buffer argument that'll be used for storage and shouldn't be
inspected.
when given a string yacc will copy the entire input to a newly allocated
location so that it can add a second terminating NUL byte. since the
parser is a very internal thing to EvalState we can ensure that having
two terminating NUL bytes is always possible without copying, and have
the parser itself merely check that the expected NULs are present.
# before
Benchmark 1: nix search --offline nixpkgs hello
Time (mean ± σ): 572.4 ms ± 2.3 ms [User: 563.4 ms, System: 8.6 ms]
Range (min … max): 566.9 ms … 579.1 ms 50 runs
Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
Time (mean ± σ): 381.7 ms ± 1.0 ms [User: 348.3 ms, System: 33.1 ms]
Range (min … max): 380.2 ms … 387.7 ms 50 runs
Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
Time (mean ± σ): 2.936 s ± 0.005 s [User: 2.715 s, System: 0.221 s]
Range (min … max): 2.923 s … 2.946 s 50 runs
# after
Benchmark 1: nix search --offline nixpkgs hello
Time (mean ± σ): 571.7 ms ± 2.4 ms [User: 563.3 ms, System: 8.0 ms]
Range (min … max): 566.7 ms … 579.7 ms 50 runs
Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
Time (mean ± σ): 376.6 ms ± 1.0 ms [User: 345.8 ms, System: 30.5 ms]
Range (min … max): 374.5 ms … 379.1 ms 50 runs
Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
Time (mean ± σ): 2.922 s ± 0.006 s [User: 2.707 s, System: 0.215 s]
Range (min … max): 2.906 s … 2.934 s 50 runs
there's a couple places that can be easily converted from using strings to using
string_views instead. gives a slight (~1%) boost to system eval.
# before
nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
Time (mean ± σ): 2.946 s ± 0.026 s [User: 2.655 s, System: 0.209 s]
Range (min … max): 2.905 s … 2.995 s 20 runs
# after
Time (mean ± σ): 2.928 s ± 0.024 s [User: 2.638 s, System: 0.211 s]
Range (min … max): 2.893 s … 2.970 s 20 runs
this avoids one copy from `s` into `str`, and possibly another copy needed to
construct `s` at the call site. lexical_cast is also more efficient in general.
There already existed a smoke test for the link content length,
but it appears that there exists some corruptions pernicious enough
to replace the file content with zeros, and keeping the same length.
--repair-path now goes as far as checking the content of the link,
making it true to its name and actually repairing the path for such
coruption cases.
we don't have to create an ostream sentry object for every character of a JSON
string we write. format a bunch of characters and flush them to the stream all
at once instead.
this doesn't affect small numbers of string characters, but larger numbers of
total JSON string characters written gain a lot. at 1MB of total string written
we gain almost 30%, at 16MB it's almost a factor of 3x. large numbers of JSON
string characters do occur naturally in a nixos system evaluation to generate
documentation (though this is now somewhat mitigated by caching the largest part
of nixos option docs).
benchmarked with
hyperfine 'nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) {e})"' --warmup 1 -L e 1,4,256,4096,65536
before:
Benchmark 1: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 1)"
Time (mean ± σ): 12.5 ms ± 0.2 ms [User: 9.2 ms, System: 4.0 ms]
Range (min … max): 11.9 ms … 13.1 ms 223 runs
Benchmark 2: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 4)"
Time (mean ± σ): 12.5 ms ± 0.2 ms [User: 9.3 ms, System: 3.8 ms]
Range (min … max): 11.9 ms … 13.2 ms 220 runs
Benchmark 3: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 256)"
Time (mean ± σ): 13.2 ms ± 0.3 ms [User: 9.8 ms, System: 4.0 ms]
Range (min … max): 12.6 ms … 14.3 ms 205 runs
Benchmark 4: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 4096)"
Time (mean ± σ): 24.0 ms ± 0.4 ms [User: 19.4 ms, System: 5.2 ms]
Range (min … max): 22.7 ms … 25.8 ms 119 runs
Benchmark 5: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 65536)"
Time (mean ± σ): 196.0 ms ± 3.7 ms [User: 171.2 ms, System: 25.8 ms]
Range (min … max): 190.6 ms … 201.5 ms 14 runs
after:
Benchmark 1: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 1)"
Time (mean ± σ): 12.4 ms ± 0.3 ms [User: 9.1 ms, System: 4.0 ms]
Range (min … max): 11.7 ms … 13.3 ms 204 runs
Benchmark 2: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 4)"
Time (mean ± σ): 12.4 ms ± 0.2 ms [User: 9.2 ms, System: 3.9 ms]
Range (min … max): 11.8 ms … 13.0 ms 214 runs
Benchmark 3: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 256)"
Time (mean ± σ): 12.6 ms ± 0.2 ms [User: 9.5 ms, System: 3.8 ms]
Range (min … max): 12.1 ms … 13.3 ms 209 runs
Benchmark 4: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 4096)"
Time (mean ± σ): 15.9 ms ± 0.2 ms [User: 11.4 ms, System: 5.1 ms]
Range (min … max): 15.2 ms … 16.4 ms 171 runs
Benchmark 5: nix eval --raw --expr "let s = __concatStringsSep \"\" (__genList (_: \"c\") 256); in __toJSON (__genList (_: s) 65536)"
Time (mean ± σ): 69.0 ms ± 0.9 ms [User: 44.3 ms, System: 25.3 ms]
Range (min … max): 67.2 ms … 70.9 ms 42 runs
This was already accidentally disabled in ba87b08. It also no longer
appears to be beneficial, and in fact slow things down, e.g. when
evaluating a NixOS system configuration:
elapsed time: median = 3.8170 mean = 3.8202 stddev = 0.0195 min = 3.7894 max = 3.8600 [rejected, p=0.00000, Δ=0.36929±0.02513]
Due to missing <atomic> declaration the build fails as:
src/libutil/util.hh:350:24: error: no match for 'operator||' (operand types are 'std::atomic<bool>' and 'bool')
350 | if (_isInterrupted || (interruptCheck && interruptCheck()))
| ~~~~~~~~~~~~~~ ^~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| std::atomic<bool> bool
Because the manual is generated from default values which are themselves
generated from various sources (cpuid, bios settings (kvm), number of
cores). This commit hides non-reproducible settings from the manual
output.