Commit graph

869 commits

Author SHA1 Message Date
Robert Hensing
a865049c4f tryEval: Allocate true and false once 2024-03-20 23:25:28 +01:00
Robert Hensing
d71e74838a readDir: Allocate type strings only once 2024-03-20 23:25:28 +01:00
Eelco Dolstra
3e6730ee62 Mark Value pointers in Value::elems as const
This catches modification of finalized values (e.g. in prim_sort).
2024-03-15 18:26:37 +01:00
Eelco Dolstra
fecff520d7 Add a ListBuilder helper for constructing list values
Previously, `state.mkList()` would set the type of the value to tList
and allocate the list vector, but it would not initialize the values
in the list. This has two problems:

* If an exception occurs, the list is left in an undefined state.

* More importantly, for multithreaded evaluation, if a value
  transitions from thunk to non-thunk, it should be final (i.e. other
  threads should be able to access the value safely).

To address this, there now is a `ListBuilder` class (analogous to
`BindingsBuilder`) to build the list vector prior to the call to
`Value::mkList()`. Typical usage:

   auto list = state.buildList(size);
   for (auto & v : list)
       v = ... set value ...;
   vRes.mkList(list);
2024-03-15 18:26:37 +01:00
pennae
5d9fdab3de use byte indexed locations for PosIdx
we now keep not a table of all positions, but a table of all origins and
their sizes. position indices are now direct pointers into the virtual
concatenation of all parsed contents. this slightly reduces memory usage
and time spent in the parser, at the cost of not being able to report
positions if the total input size exceeds 4GiB. this limit is not unique
to nix though, rustc and clang also limit their input to 4GiB (although
at least clang refuses to process inputs that are larger, we will not).

this new 4GiB limit probably will not cause any problems for quite a
while, all of nixpkgs together is less than 100MiB in size and already
needs over 700MiB of memory and multiple seconds just to parse. 4GiB
worth of input will easily take multiple minutes and over 30GiB of
memory without even evaluating anything. if problems *do* arise we can
probably recover the old table-based system by adding some tracking to
Pos::Origin (or increasing the size of PosIdx outright), but for time
being this looks like more complexity than it's worth.

since we now need to read the entire input again to determine the
line/column of a position we'll make unsafeGetAttrPos slightly lazy:
mostly the set it returns is only used to determine the file of origin
of an attribute, not its exact location. the thunks do not add
measurable runtime overhead.

notably this change is necessary to allow changing the parser since
apparently nothing supports nix's very idiosyncratic line ending choice
of "anything goes", making it very hard to calculate line/column
positions in the parser (while byte offsets are very easy).
2024-03-06 23:48:42 +01:00
pennae
d384ecd553 keep copies of parser inputs that are in-memory only
the parser modifies its inputs, which means that sharing them between
the error context reporting system and the parser itself can confuse the
reporting system. usually this led to early truncation of error context
reports which, while not dangerous, can be quite confusing.
2024-03-06 23:11:12 +01:00
Rebecca Turner
14b0356dc5
Forbid nested debuggers 2024-03-04 09:24:57 -08:00
tomberek
ffe67c86a8
Merge pull request #9915 from 9999years/evaluating-attribute-position
Add position information to `while evaluating the attribute` errors in the debugger
2024-02-28 18:11:07 -05:00
Robert Hensing
4c7f0ef6ca
Merge pull request #9847 from pennae/inherit-from-dedup
deduplicate inherit-from source expr work
2024-02-26 20:25:58 +01:00
pennae
1cd87b7042 remove ExprAttrs::AttrDef::inherited
it's no longer widely used and has a rather confusing meaning now that
inherit-from is handled very differently.
2024-02-26 19:07:08 +01:00
pennae
cefd0302b5 evaluate inherit (from) exprs only once per directive
desugaring inherit-from to syntactic duplication of the source expr also
duplicates side effects of the source expr (such as trace calls) and
expensive computations (such as derivationStrict).
2024-02-26 19:07:08 +01:00
Rebecca Turner
91e89628fd
Make addErrorTrace variadic 2024-02-22 17:18:27 -08:00
Rebecca Turner
f05c13ecc2
Remove the concept of "skipped frames" 2024-02-22 17:14:55 -08:00
John Ericson
2080d89b87
Merge pull request #10038 from edolstra/tarball-git-cache
Use the Git cache for tarball flakes
2024-02-21 15:47:02 -05:00
Rebecca Turner
2a8fe9a938
:quit in the debugger should quit the whole program 2024-02-20 10:01:13 -08:00
Théophane Hufschmitt
d2c6a93bd5
Merge pull request #10044 from edolstra/empty-git-repos
Handle empty Git repositories / workdirs
2024-02-20 14:01:23 +01:00
Eelco Dolstra
cabee98152 Tarball fetcher: Use the content-addressed Git cache
Backported from the lazy-trees branch.
2024-02-20 12:57:36 +01:00
Eelco Dolstra
7cb4d0c5b7 fetchToStore(): Don't always respect settings.readOnlyMode
It's now up to the caller whether readOnlyMode should be applied. In
some contexts (like InputScheme::fetch()), we always need to fetch.
2024-02-20 11:46:49 +01:00
Eelco Dolstra
d52d91fe7a AllowListInputAccessor: Clarify that the "allowed paths" are actually allowed prefixes
E.g. adding "/" will allow access to the root and *everything below it*.
2024-02-20 11:23:26 +01:00
Eelco Dolstra
ec6ca6e42c
Merge pull request #9948 from obsidiansystems/no-canon-path-from-cwd
Get rid of `CanonPath::fromCwd`
2024-02-12 14:04:01 +01:00
pennae
1f542adb3e add ExprAttrs::AttrDef::chooseByKind
in place of inherited() — not quite useful yet since we don't
distinguish plain and inheritFrom attr kinds so far.
2024-02-12 13:34:59 +01:00
pennae
c66ee57edc preserve information about whether/how an attribute was inherited 2024-02-12 13:32:33 +01:00
Rebecca Turner
c0e7f50c1a
Rename hintfmt to HintFmt 2024-02-08 11:58:25 -08:00
Théophane Hufschmitt
1ba9780cf5
Merge pull request #9834 from 9999years/structured-errors
Towards structured error classes
2024-02-08 20:00:25 +01:00
John Ericson
4687beecef Get rid of CanonPath::fromCwd
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.

Progress towards #9205
2024-02-08 11:01:41 -05:00
Théophane Hufschmitt
9b8b486091
Merge pull request #9933 from pennae/debugger-fix
fix debugger crashing while printing envs
2024-02-08 10:48:02 +01:00
Théophane Hufschmitt
c4ed92fa6f
Merge pull request #9917 from 9999years/enter-debugger-more-reliably
Enter debugger more reliably in `let` expressions and function calls
2024-02-08 10:09:54 +01:00
Théophane Hufschmitt
f388a6148d
Merge pull request #9919 from 9999years/reduce-debugger-clutter
Reduce visual clutter in the debugger
2024-02-08 09:42:38 +01:00
Eelco Dolstra
a6737b7e17 CanonPath, SourcePath: Change operator + to /
This is less confusing and makes it more similar to std::filesystem::path.
2024-02-05 15:17:39 +01:00
pennae
5ccb06ee1b fix debugger crashing while printing envs
fixes #9932
2024-02-04 17:12:04 +01:00
Rebecca Turner
6414cd259e
Reduce visual clutter in the debugger 2024-02-02 19:58:35 -08:00
Rebecca Turner
0127d54d5e
Enter debugger more reliably in let expressions and calls 2024-02-02 19:14:22 -08:00
Rebecca Turner
016db2d10f
Add position information to while evaluating the attribute 2024-02-02 17:49:54 -08:00
Rebecca Turner
c6a89c1a16
libexpr: Support structured error classes
While preparing PRs like #9753, I've had to change error messages in
dozens of code paths. It would be nice if instead of

    EvalError("expected 'boolean' but found '%1%'", showType(v))

we could write

    TypeError(v, "boolean")

or similar. Then, changing the error message could be a mechanical
refactor with the compiler pointing out places the constructor needs to
be changed, rather than the error-prone process of grepping through the
codebase. Structured errors would also help prevent the "same" error
from having multiple slightly different messages, and could be a first
step towards error codes / an error index.

This PR reworks the exception infrastructure in `libexpr` to
support exception types with different constructor signatures than
`BaseError`. Actually refactoring the exceptions to use structured data
will come in a future PR (this one is big enough already, as it has to
touch every exception in `libexpr`).

The core design is in `eval-error.hh`. Generally, errors like this:

    state.error("'%s' is not a string", getAttrPathStr())
      .debugThrow<TypeError>()

are transformed like this:

    state.error<TypeError>("'%s' is not a string", getAttrPathStr())
      .debugThrow()

The type annotation has moved from `ErrorBuilder::debugThrow` to
`EvalState::error`.
2024-02-01 16:39:38 -08:00
Eelco Dolstra
b36ff47e7c Resolve symlinks in a few more places
Fixes #9882.
2024-01-30 15:35:31 +01:00
John Ericson
b83a2fb6dd
Merge pull request #9776 from pennae/parser-refactor
Refactor the parser somewhat
2024-01-26 23:56:48 -05:00
Robert Hensing
5b7bfd2d6b
Merge pull request #9754 from 9999years/print-value-when-coercion-fails
Print the value in `error: cannot coerce` messages
2024-01-24 12:48:39 +01:00
Rebecca Turner
83bb494a30
Print the value in error: cannot coerce messages
This extends the `error: cannot coerce a TYPE to a string` message
to print the value that could not be coerced. This helps with debugging
by making it easier to track down where the value is being produced
from, especially in errors with deep or unhelpful stack traces.
2024-01-23 15:15:41 -08:00
Maximilian Bosch
81499a0b93
libexpr: print value of what is attempted to be called as function
Low-hanging fruit in the spirit of #9753 and #9754 (means 9999years did
all the hard work already).

This basically prints out what was attempted to be called as function,
i.e.

  map (import <nixpkgs> {}) [ 1 2 3 ]

now gives the following error message:

    error:
           … while calling the 'map' builtin
             at «string»:1:1:
                1| map (import <nixpkgs> {}) [ 1 2 3 ]
                 | ^

           … while evaluating the first argument passed to builtins.map

           error: expected a function but found a set: { _type = "pkgs"; AAAAAASomeThingsFailToEvaluate = «thunk»; AMB-plugins = «thunk»; ArchiSteamFarm = «thunk»; BeatSaberModManager = «thunk»; CHOWTapeModel = «thunk»; ChowCentaur = «thunk»; ChowKick = «thunk»; ChowPhaser = «thunk»; CoinMP = «thunk»;  «18783 attributes elided»}
2024-01-22 22:41:42 +01:00
Rebecca Turner
cb7fbd4d83
Print value on type error
Adds the failing value to `value is <TYPE> while a <TYPE> is expected`
error messages.
2024-01-22 08:56:02 -08:00
pennae
80b84710b8
Update src/libexpr/eval.cc
Co-authored-by: John Ericson <git@JohnEricson.me>
2024-01-22 15:15:53 +01:00
pennae
09a1128d9e don't repeatedly look up ast internal symbols
these symbols are used a *lot*, so it makes sense to cache them. this
mostly increases clarity of the code (however clear one may wish to call
the parser desugaring here), but it also provides a small performance
benefit.
2024-01-15 16:52:18 +01:00
pennae
b596cc9e79 decouple parser and EvalState
there's no reason the parser itself should be doing semantic analysis
like bindVars. split this bit apart (retaining the previous name in
EvalState) and have the parser really do *only* parsing, decoupled from
EvalState.
2024-01-15 16:52:18 +01:00
pennae
e1aa585964 slim down parser.y
most EvalState and Expr members defined here could be elsewhere, where
they'd be easier to maintain (not being embedded in a file with arcane
syntax) and *somewhat* more faithfully placed according to the path of
the file they're defined in.
2024-01-15 16:52:18 +01:00
Rebecca Turner
0fa08b4516
Unify and refactor value printing
Previously, there were two mostly-identical value printers -- one in
`libexpr/eval.cc` (which didn't force values) and one in
`libcmd/repl.cc` (which did force values and also printed ANSI color
codes).

This PR unifies both of these printers into `print.cc` and provides a
`PrintOptions` struct for controlling the output, which allows for
toggling whether values are forced, whether repeated values are tracked,
and whether ANSI color codes are displayed.

Additionally, `PrintOptions` allows tuning the maximum number of
attributes, list items, and bytes in a string that will be displayed;
this makes it ideal for contexts where printing too much output (e.g.
all of Nixpkgs) is distracting. (As requested by @roberth in
https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735)

Please read the tests for example output.

Future work:
- It would be nice to provide this function as a builtin, perhaps
  `builtins.toStringDebug` -- a printing function that never fails would
  be useful when debugging Nix code.
- It would be nice to support customizing `PrintOptions` members on the
  command line, e.g. `--option to-string-max-attrs 1000`.
2024-01-11 16:34:36 -08:00
Rebecca Turner
4feb7d9f71
Combine AbstractPos, PosAdapter, and Pos
Also move `SourcePath` into `libutil`.

These changes allow `error.hh` and `error.cc` to access source path and
position information, which we can use to produce better error messages
(for example, we could consider omitting filenames when two or more
consecutive stack frames originate from the same file).
2024-01-08 10:59:41 -08:00
Eelco Dolstra
315aade89d
Merge pull request #9681 from edolstra/eval-optimisations
Optimize empty list constants
2024-01-03 10:43:01 +01:00
Eelco Dolstra
3f796514b3 Optimize empty list constants
This avoids a Value allocation for empty list constants. During a `nix
search nixpkgs`, about 82% of all thunked lists are empty, so this
removes about 3 million Value allocations.

Performance comparison on `nix search github:NixOS/nixpkgs/e1fa12d4f6c6fe19ccb59cac54b5b3f25e160870 --no-eval-cache`:

maximum RSS:        median = 3845432.0000  mean = 3845432.0000  stddev =      0.0000  min = 3845432.0000  max = 3845432.0000  [rejected?, p=0.00000, Δ=-70084.00000±0.00000]
soft page faults:   median = 965395.0000  mean = 965394.6667  stddev =      1.1181  min = 965392.0000  max = 965396.0000  [rejected?, p=0.00000, Δ=-17929.77778±38.59610]
system CPU time:    median =      1.8029  mean =      1.7702  stddev =      0.0621  min =      1.6749  max =      1.8417  [rejected, p=0.00064, Δ=-0.12873±0.09905]
user CPU time:      median =     14.1022  mean =     14.0633  stddev =      0.1869  min =     13.8118  max =     14.3190  [not rejected, p=0.03006, Δ=-0.18248±0.24928]
elapsed time:       median =     15.8205  mean =     15.8618  stddev =      0.2312  min =     15.5033  max =     16.1670  [not rejected, p=0.00558, Δ=-0.28963±0.29434]
2024-01-02 12:49:11 +01:00
Robert Hensing
83f5622545
Merge pull request #9658 from pennae/env-diet
reduce the size of Env by one pointer
2023-12-31 13:57:16 +01:00
pennae
1fe66852ff reduce the size of Env by one pointer
since `up` and `values` are both pointer-aligned the type field will
also be pointer-aligned, wasting 48 bits of space on most machines. we
can get away with removing the type field altogether by encoding some
information into the `with` expr that created the env to begin with,
reducing the GC load for the absolutely massive amount of single-entry
envs we create for lambdas. this reduces memory usage of system eval by
quite a bit (reducing heap size of our system eval from 8.4GB to 8.23GB)
and gives similar savings in eval time.

running `nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'`

before:

  Time (mean ± σ):      5.576 s ±  0.003 s    [User: 5.197 s, System: 0.378 s]
  Range (min … max):    5.572 s …  5.581 s    10 runs

after:

  Time (mean ± σ):      5.408 s ±  0.002 s    [User: 5.019 s, System: 0.388 s]
  Range (min … max):    5.405 s …  5.411 s    10 runs
2023-12-30 18:55:13 +01:00