Commit graph

39 commits

Author SHA1 Message Date
Eelco Dolstra
61080554ab SymbolStr: Remove std::string conversion
This refactoring allows the symbol table to be stored as something
other than std::strings.
2024-07-11 17:43:10 +02:00
Eelco Dolstra
8c0590fa32 Never update values after setting the type
Thunks are now overwritten by a helper function
`Value::finishValue(newType, payload)` (where `payload` is the
original anonymous union inside `Value`). This helps to ensure we
never update a value elsewhere, since that would be incompatible with
parallel evaluation (i.e. after a value has transitioned from being a
thunk to being a non-thunk, it should be immutable).

There were two places where this happened: `Value::mkString()` and
`ExprAttrs::eval()`.

This PR also adds a bunch of accessor functions for value contents,
like `Value::integer()` to access the integer field in the union.
2024-03-25 19:21:25 +01:00
John Ericson
ac89bb064a Split up util.{hh,cc}
All OS and IO operations should be moved out, leaving only some misc
portable pure functions.

This is useful to avoid copious CPP when doing things like Windows and
Emscripten ports.

Newly exposed functions to break cycles:

 - `restoreSignals`
 - `updateWindowSize`
2023-11-05 12:20:02 -05:00
Tom Bereknyei
399ef84420 refactor: use string accessors
Create context, string_view, and c_str, accessors throughout in order to
better support improvements to the underlying string representation.
2023-09-27 00:33:01 -04:00
Eelco Dolstra
01232358ff Merge remote-tracking branch 'origin/master' into source-path 2023-04-24 13:20:36 +02:00
John Ericson
85f0cdc370 Use std::set<StringContextElem> not PathSet for string contexts
Motivation

`PathSet` is not correct because string contexts have other forms
(`Built` and `DrvDeep`) that are not rendered as plain store paths.
Instead of wrongly using `PathSet`, or "stringly typed" using
`StringSet`, use `std::std<StringContextElem>`.

-----

In support of this change, `NixStringContext` is now defined as
`std::std<StringContextElem>` not `std:vector<StringContextElem>`. The
old definition was just used by a `getContext` method which was only
used by the eval cache. It can be deleted altogether since the types are
now unified and the preexisting `copyContext` function already suffices.

Summarizing the previous paragraph:

Old:

  - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>`
  - `value.hh`: `NixStringContext Value::getContext(...)`
  - `value.hh`: `copyContext(...)`

New:

  - `value/context.hh`: `NixStringContext = std::set<StringContextElem>`
  - `value.hh`: `copyContext(...)`
----

The string representation of string context elements no longer contains
the store dir. The diff of `src/libexpr/tests/value/context.cc` should
make clear what the new representation is, so we recommend reviewing
that file first. This was done for two reasons:

Less API churn:

`Value::mkString` and friends did not take a `Store` before. But if
`NixStringContextElem::{parse, to_string}` *do* take a store (as they
did before), then we cannot have the `Value` functions use them (in
order to work with the fully-structured `NixStringContext`) without
adding that argument.

That would have been a lot of churn of threading the store, and this
diff is already large enough, so the easier and less invasive thing to
do was simply make the element `parse` and `to_string` functions not
take the `Store` reference, and the easiest way to do that was to simply
drop the store dir.

Space usage:

Dropping the `/nix/store/` (or similar) from the internal representation
will safe space in the heap of the Nix programming being interpreted. If
the heap contains many strings with non-trivial contexts, the saving
could add up to something significant.

----

The eval cache version is bumped.

The eval cache serialization uses `NixStringContextElem::{parse,
to_string}`, and since those functions are changed per the above, that
means the on-disk representation is also changed.

This is simply done by changing the name of the used for the eval cache
from `eval-cache-v4` to eval-cache-v5`.

----

To avoid some duplication `EvalCache::mkPathString` is added to abstract
over the simple case of turning a store path to a string with just that
string in the context.

Context

This PR picks up where #7543 left off. That one introduced the fully
structured `NixStringContextElem` data type, but kept `PathSet context`
as an awkward middle ground between internal `char[][]` interpreter heap
string contexts and `NixStringContext` fully parsed string contexts.

The infelicity of `PathSet context` was specifically called out during
Nix team group review, but it was agreeing that fixing it could be left
as future work. This is that future work.

A possible follow-up step would be to get rid of the `char[][]`
evaluator heap representation, too, but it is not yet clear how to do
that. To use `NixStringContextElem` there we would need to get the STL
containers to GC pointers in the GC build, and I am not sure how to do
that.

----

PR #7543 effectively is writing the inverse of a `mkPathString`,
`mkOutputString`, and one more such function for the `DrvDeep` case. I
would like that PR to have property tests ensuring it is actually the
inverse as expected.

This PR sets things up nicely so that reworking that PR to be in that
more elegant and better tested way is possible.

Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-04-21 01:05:49 -04:00
Eelco Dolstra
a9759407e5 Origin: Use SourcePath 2023-04-06 15:25:06 +02:00
Eelco Dolstra
94812cca98 Backport SourcePath from the lazy-trees branch
This introduces the SourcePath type from lazy-trees as an abstraction
for accessing files from inputs that may not be materialized in the
real filesystem (e.g. Git repositories). Currently, however, it's just
a wrapper around CanonPath, so it shouldn't change any behaviour. (On
lazy-trees, SourcePath is a <InputAccessor, CanonPath> tuple.)
2023-04-06 13:15:50 +02:00
Eelco Dolstra
29abc8e764 Remove FormatOrString and remaining uses of format() 2023-03-02 15:57:54 +01:00
Eelco Dolstra
b3fdab28a2 Introduce AbstractPos
This makes the position object used in exceptions abstract, with a
method getSource() to get the source code of the file in which the
error originated. This is needed for lazy trees because source files
don't necessarily exist in the filesystem, and we don't want to make
libutil depend on the InputAccessor type in libfetcher.
2022-12-13 00:50:43 +01:00
pennae
8775be3393 store Symbols in a table as well, like positions
this slightly increases the amount of memory used for any given symbol, but this
increase is more than made up for if the symbol is referenced more than once in
the EvalState that holds it. on average every symbol should be referenced at
least twice (once to introduce a binding, once to use it), so we expect no
increase in memory on average.

symbol tables are limited to 2³² entries like position tables, and similar
arguments apply to why overflow is not likely: 2³² symbols would require as many
string instances (at 24 bytes each) and map entries (at 24 bytes or more each,
assuming that the map holds on average at most one item per bucket as the docs
say). a full symbol table would require at least 192GB of memory just for
symbols, which is well out of reach. (an ofborg eval of nixpks today creates
less than a million symbols!)
2022-04-21 21:56:31 +02:00
pennae
6526d1676b replace most Pos objects/ptrs with indexes into a position table
Pos objects are somewhat wasteful as they duplicate the origin file name and
input type for each object. on files that produce more than one Pos when parsed
this a sizeable waste of memory (one pointer per Pos). the same goes for
ptr<Pos> on 64 bit machines: parsing enough source to require 8 bytes to locate
a position would need at least 8GB of input and 64GB of expression memory. it's
not likely that we'll hit that any time soon, so we can use a uint32_t index to
locate positions instead.
2022-04-21 21:46:06 +02:00
pennae
ff0fd91ed2 remove Symbol::empty
the only use of this function is to determine whether a lambda has a non-set
formal, but this use is arguably better served by Symbol::set and using a
non-Symbol instead of an empty symbol in the parser when no such formal is present.
2022-04-21 21:25:18 +02:00
Eelco Dolstra
df552ff53e Remove std::string alias (for real this time)
Also use std::string_view in a few more places.
2022-02-25 16:13:02 +01:00
pennae
7d4cc5515c defer formals duplicate check for incresed efficiency all round
if we defer the duplicate argument check for lambda formals we can use more
efficient data structures for the formals set, and we can get rid of the
duplication of formals names to boot. instead of a list of formals we've seen
and a set of names we'll keep a vector instead and run a sort+dupcheck step
before moving the parsed formals into a newly created lambda. this improves
performance on search and rebuild by ~1%, pure parsing gains more (about 4%).

this does reorder lambda arguments in the xml output, but the output is still
stable. this shouldn't be a problem since argument order is not semantically
important anyway.

 before

  nix search --no-eval-cache --offline ../nixpkgs hello
    Time (mean ± σ):      8.550 s ±  0.060 s    [User: 6.470 s, System: 1.664 s]
    Range (min … max):    8.435 s …  8.666 s    20 runs

  nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
    Time (mean ± σ):     346.7 ms ±   2.1 ms    [User: 312.4 ms, System: 34.2 ms]
    Range (min … max):   343.8 ms … 353.4 ms    20 runs

  nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
    Time (mean ± σ):      2.720 s ±  0.031 s    [User: 2.415 s, System: 0.231 s]
    Range (min … max):    2.662 s …  2.780 s    20 runs

 after

  nix search --no-eval-cache --offline ../nixpkgs hello
    Time (mean ± σ):      8.462 s ±  0.063 s    [User: 6.398 s, System: 1.661 s]
    Range (min … max):    8.339 s …  8.542 s    20 runs

  nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
    Time (mean ± σ):     329.1 ms ±   1.4 ms    [User: 296.8 ms, System: 32.3 ms]
    Range (min … max):   326.1 ms … 330.8 ms    20 runs

  nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
    Time (mean ± σ):      2.687 s ±  0.035 s    [User: 2.392 s, System: 0.228 s]
    Range (min … max):    2.626 s …  2.754 s    20 runs
2022-01-19 17:07:29 +01:00
Eelco Dolstra
b6c8e57056 Support range-based for loop over list values 2021-11-25 16:31:39 +01:00
Kevin Amado
d0e9e18489
toXML: display errors position
- This change applies to builtins.toXML and inner workings
- Proof of concept:
  ```nix
  let e = builtins.toXML e; in e
  ```
- Before:
  ```
  $ nix-instantiate --eval poc.nix
  error: infinite recursion encountered
  ```
- After:
  ```
  $ nix-instantiate --eval poc.nix
  error: infinite recursion encountered

       at /data/github/kamadorueda/nix/poc.nix:1:9:

            1| let e = builtins.toXML e; in e
             |
  ```
2021-11-13 20:33:34 -05:00
Andreas Rammhold
cae41eebff libexpr: remove matchAttrs boolean from ExprLambda
The boolean is only used to determine if the formals are set to a
non-null pointer in all our cases. We can get rid of that allocation and
instead just compare the pointer value with NULL. Saving up to
sizeof(bool) + platform specific alignment per ExprLambda instace.
Probably not a lot of memory but perhaps a few kilobyte with nixpkgs?

This also gets rid of a potential issue with dereferencing formals based on
the value of the boolean that didn't have to be aligned with the formals
pointer but was in all our cases.
2021-10-06 17:24:06 +02:00
Robert Hensing
f10465774f Force all Pos* to be non-null
This fixes a class of crashes and introduces ptr<T> to make the
code robust against this failure mode going forward.

Thanks regnat for the idea of a ref<T> without overhead!

Closes #4895
Closes #4893
Closes #5127
Closes #5113
2021-08-29 18:11:58 +02:00
Silvan Mosberger
12e65078ef
Rename Value::normalType() -> Value::type() 2020-12-17 14:45:45 +01:00
Silvan Mosberger
22ead43a0b
Use Value::normalType on all forced values instead of Value::type 2020-12-12 03:31:48 +01:00
Eelco Dolstra
99b73fb507
OCD performance fix: {find,count}+insert => insert 2019-10-09 16:06:29 +02:00
Christian Theune
14ebde5289 First hit at providing support for floats in the language. 2016-01-05 00:40:40 +01:00
Eelco Dolstra
b83801f8b3 Optimize small lists
The value pointers of lists with 1 or 2 elements are now stored in the
list value itself. In particular, this makes the "concatMap (x: if
cond then [(f x)] else [])" idiom cheaper.
2015-07-23 22:05:09 +02:00
Eelco Dolstra
6bd2c7bb38 OCD: foreach -> C++11 ranged for 2015-07-17 20:13:56 +02:00
Shea Levy
608110804c Make all ExternalValueBase functions const 2014-12-02 10:27:10 -05:00
Shea Levy
320659b0cd Allow external code using libnixexpr to add types
Code that links to libnixexpr (e.g. plugins loaded with importNative, or
nix-exec) may want to provide custom value types and operations on
values of those types. For example, nix-exec is currently using sets
where a custom IO value type would be more appropriate. This commit
provides a generic hook for such types in the form of tExternal and the
ExternalBase virtual class, which contains all functions necessary for
libnixexpr's type-polymorphic functions (e.g. `showType`) to be
implemented.
2014-12-02 10:27:04 -05:00
Eelco Dolstra
605b16cd7b Fix compilation on FreeBSD
http://hydra.nixos.org/build/2213576

Not sure why compilation doesn't fail on other platforms...
2012-03-05 22:04:40 +01:00
Eelco Dolstra
e0b7fb8f27 * Keep attribute sets in sorted order to speed up attribute lookups.
* Simplify the representation of attributes in the AST.
* Change the behaviour of listToAttrs() in case of duplicate names.
2010-10-24 19:52:33 +00:00
Eelco Dolstra
0b305c534f * Store attribute sets as a vector instead of a map (i.e. a red-black
tree).  This saves a lot of memory.  The vector should be sorted so
  that names can be looked up using binary search, but this is not the
  case yet.  (Surprisingly, looking up attributes using linear search
  doesn't have a big impact on performance.)

  Memory consumption for

    $ nix-instantiate /etc/nixos/nixos/tests -A bittorrent.test --readonly-mode

  on x86_64-linux with GC enabled is now 185 MiB (compared to 946
  MiB on the trunk).
2010-10-24 00:41:29 +00:00
Eelco Dolstra
41c45a9b31 * Store Value nodes outside of attribute sets. I.e., Attr now stores
a pointer to a Value, rather than the Value directly.  This improves
  the effectiveness of garbage collection a lot: if the Value is
  stored inside the set directly, then any live pointer to the Value
  causes all other attributes in the set to be live as well.
2010-10-22 14:47:42 +00:00
Eelco Dolstra
f16fe2af8d * builtins.toXML: propagate the string context. This is a regression
from the old ATerm-based evaluator.
2010-06-10 10:29:50 +00:00
Eelco Dolstra
1a8eb6e3ec 2010-05-07 15:26:33 +00:00
Eelco Dolstra
83dfa89870 * Sync with the trunk. 2010-05-07 14:46:47 +00:00
Eelco Dolstra
e2d5e40f4f * Keep track of the source positions of attributes. 2010-05-07 12:11:05 +00:00
Eelco Dolstra
04c4bd3624 * Store lists as lists of pointers to values rather than as lists of
values.  This improves sharing and gives another speed up.
  Evaluation of the NixOS system attribute is now almost 7 times
  faster than the old evaluator.
2010-04-15 00:37:36 +00:00
Eelco Dolstra
ac1e8f40d4 * Use a symbol table to represent identifiers and attribute names
efficiently.  The symbol table ensures that there is only one copy
  of each symbol, thus allowing symbols to be compared efficiently
  using a pointer equality test.
2010-04-13 12:25:42 +00:00
Eelco Dolstra
4d6ad5be17 * Don't use ATerms for the abstract syntax trees anymore. Not
finished yet.
2010-04-12 18:30:11 +00:00
Eelco Dolstra
9a64454faa * expr-to-xml -> value-to-xml. 2010-04-07 13:59:45 +00:00
Renamed from src/libexpr/expr-to-xml.cc (Browse further)