nix-super/src/libexpr/value.hh

518 lines
13 KiB
C++
Raw Normal View History

#pragma once
///@file
#include <cassert>
#include <span>
#include "symbol-table.hh"
#include "value/context.hh"
#include "source-path.hh"
#include "print-options.hh"
language: cleanly ban integer overflows This also bans various sneaking of negative numbers from the language into unsuspecting builtins as was exposed while auditing the consequences of changing the Nix language integer type to a newtype. It's unlikely that this change comprehensively ensures correctness when passing integers out of the Nix language and we should probably add a checked-narrowing function or something similar, but that's out of scope for the immediate change. During the development of this I found a few fun facts about the language: - You could overflow integers by converting from unsigned JSON values. - You could overflow unsigned integers by converting negative numbers into them when going into Nix config, into fetchTree, and into flake inputs. The flake inputs and Nix config cannot actually be tested properly since they both ban thunks, however, we put in checks anyway because it's possible these could somehow be used to do such shenanigans some other way. Note that Lix has banned Nix language integer overflows since the very first public beta, but threw a SIGILL about them because we run with -fsanitize=signed-overflow -fsanitize-undefined-trap-on-error in production builds. Since the Nix language uses signed integers, overflow was simply undefined behaviour, and since we defined that to trap, it did. Trapping on it was a bad UX, but we didn't even entirely notice that we had done this at all until it was reported as a bug a couple of months later (which is, to be fair, that flag working as intended), and it's got enough production time that, aside from code that is IMHO buggy (and which is, in any case, not in nixpkgs) such as https://git.lix.systems/lix-project/lix/issues/445, we don't think anyone doing anything reasonable actually depends on wrapping overflow. Even for weird use cases such as doing funny bit crimes, it doesn't make sense IMO to have wrapping behaviour, since two's complement arithmetic overflow behaviour is so *aggressively* not what you want for *any* kind of mathematics/algorithms. The Nix language exists for package management, a domain where bit crimes are already only dubiously in scope to begin with, and it makes a lot more sense for that domain for the integers to never lose precision, either by throwing errors if they would, or by being arbitrary-precision. Fixes: https://github.com/NixOS/nix/issues/10968 Original-CL: https://gerrit.lix.systems/c/lix/+/1596 Change-Id: I51f253840c4af2ea5422b8a420aa5fafbf8fae75
2024-07-12 17:22:34 +03:00
#include "checked-arithmetic.hh"
2016-08-30 14:12:12 +03:00
#if HAVE_BOEHMGC
#include <gc/gc_allocator.h>
#else
template<typename T>
using traceable_allocator = std::allocator<T>;
2016-08-30 14:12:12 +03:00
#endif
#include <nlohmann/json_fwd.hpp>
2016-08-30 14:12:12 +03:00
namespace nix {
struct Value;
class BindingsBuilder;
typedef enum {
2024-04-20 15:15:05 +03:00
tUninitialized = 0,
tInt = 1,
tBool,
tString,
tPath,
tNull,
tAttrs,
tList1,
tList2,
tListN,
tThunk,
tApp,
tLambda,
tPrimOp,
tPrimOpApp,
tExternal,
tFloat
} InternalType;
/**
* This type abstracts over all actual value types in the language,
* grouping together implementation details like tList*, different function
* types, and types in non-normal form (so thunks and co.)
*/
typedef enum {
nThunk,
nInt,
nFloat,
nBool,
nString,
nPath,
nNull,
nAttrs,
nList,
nFunction,
nExternal
} ValueType;
2014-01-21 19:29:55 +02:00
class Bindings;
struct Env;
struct Expr;
struct ExprLambda;
struct ExprBlackHole;
struct PrimOp;
2014-01-21 19:29:55 +02:00
class Symbol;
class PosIdx;
struct Pos;
class StorePath;
class EvalState;
class XMLWriter;
class Printer;
language: cleanly ban integer overflows This also bans various sneaking of negative numbers from the language into unsuspecting builtins as was exposed while auditing the consequences of changing the Nix language integer type to a newtype. It's unlikely that this change comprehensively ensures correctness when passing integers out of the Nix language and we should probably add a checked-narrowing function or something similar, but that's out of scope for the immediate change. During the development of this I found a few fun facts about the language: - You could overflow integers by converting from unsigned JSON values. - You could overflow unsigned integers by converting negative numbers into them when going into Nix config, into fetchTree, and into flake inputs. The flake inputs and Nix config cannot actually be tested properly since they both ban thunks, however, we put in checks anyway because it's possible these could somehow be used to do such shenanigans some other way. Note that Lix has banned Nix language integer overflows since the very first public beta, but threw a SIGILL about them because we run with -fsanitize=signed-overflow -fsanitize-undefined-trap-on-error in production builds. Since the Nix language uses signed integers, overflow was simply undefined behaviour, and since we defined that to trap, it did. Trapping on it was a bad UX, but we didn't even entirely notice that we had done this at all until it was reported as a bug a couple of months later (which is, to be fair, that flag working as intended), and it's got enough production time that, aside from code that is IMHO buggy (and which is, in any case, not in nixpkgs) such as https://git.lix.systems/lix-project/lix/issues/445, we don't think anyone doing anything reasonable actually depends on wrapping overflow. Even for weird use cases such as doing funny bit crimes, it doesn't make sense IMO to have wrapping behaviour, since two's complement arithmetic overflow behaviour is so *aggressively* not what you want for *any* kind of mathematics/algorithms. The Nix language exists for package management, a domain where bit crimes are already only dubiously in scope to begin with, and it makes a lot more sense for that domain for the integers to never lose precision, either by throwing errors if they would, or by being arbitrary-precision. Fixes: https://github.com/NixOS/nix/issues/10968 Original-CL: https://gerrit.lix.systems/c/lix/+/1596 Change-Id: I51f253840c4af2ea5422b8a420aa5fafbf8fae75
2024-07-12 17:22:34 +03:00
using NixInt = checked::Checked<int64_t>;
using NixFloat = double;
/**
* External values must descend from ExternalValueBase, so that
* type-agnostic nix functions (e.g. showType) can be implemented
*/
class ExternalValueBase
{
friend std::ostream & operator << (std::ostream & str, const ExternalValueBase & v);
friend class Printer;
protected:
/**
* Print out the value
*/
virtual std::ostream & print(std::ostream & str) const = 0;
public:
/**
* Return a simple string describing the type
*/
2022-02-21 17:37:25 +02:00
virtual std::string showType() const = 0;
/**
* Return a string to be used in builtins.typeOf
*/
2022-02-21 17:37:25 +02:00
virtual std::string typeOf() const = 0;
/**
* Coerce the value to a string. Defaults to uncoercable, i.e. throws an
2022-02-21 17:37:25 +02:00
* error.
*/
virtual std::string coerceToString(EvalState & state, const PosIdx & pos, NixStringContext & context, bool copyMore, bool copyToStore) const;
/**
* Compare to another value of the same type. Defaults to uncomparable,
* i.e. always false.
*/
virtual bool operator ==(const ExternalValueBase & b) const noexcept;
/**
* Print the value as JSON. Defaults to unconvertable, i.e. throws an error
*/
virtual nlohmann::json printValueAsJSON(EvalState & state, bool strict,
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 03:31:10 +02:00
NixStringContext & context, bool copyToStore = true) const;
/**
* Print the value as XML. Defaults to unevaluated
*/
virtual void printValueAsXML(EvalState & state, bool strict, bool location,
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 03:31:10 +02:00
XMLWriter & doc, NixStringContext & context, PathSet & drvsSeen,
const PosIdx pos) const;
virtual ~ExternalValueBase()
{
};
};
std::ostream & operator << (std::ostream & str, const ExternalValueBase & v);
class ListBuilder
{
const size_t size;
Value * inlineElems[2] = {nullptr, nullptr};
public:
Value * * elems;
ListBuilder(EvalState & state, size_t size);
ListBuilder(ListBuilder && x)
: size(x.size)
, inlineElems{x.inlineElems[0], x.inlineElems[1]}
, elems(size <= 2 ? inlineElems : x.elems)
{ }
Value * & operator [](size_t n)
{
return elems[n];
}
typedef Value * * iterator;
iterator begin() { return &elems[0]; }
iterator end() { return &elems[size]; }
friend struct Value;
};
struct Value
{
private:
2024-04-20 15:15:05 +03:00
InternalType internalType = tUninitialized;
friend std::string showType(const Value & v);
public:
void print(EvalState &state, std::ostream &str, PrintOptions options = PrintOptions {});
// Functions needed to distinguish the type
// These should be removed eventually, by putting the functionality that's
// needed by callers into methods of this type
// type() == nThunk
inline bool isThunk() const { return internalType == tThunk; };
inline bool isApp() const { return internalType == tApp; };
inline bool isBlackhole() const;
// type() == nFunction
inline bool isLambda() const { return internalType == tLambda; };
inline bool isPrimOp() const { return internalType == tPrimOp; };
inline bool isPrimOpApp() const { return internalType == tPrimOpApp; };
/**
* Strings in the evaluator carry a so-called `context` which
* is a list of strings representing store paths. This is to
* allow users to write things like
*
* "--with-freetype2-library=" + freetype + "/lib"
*
* where `freetype` is a derivation (or a source to be copied
* to the store). If we just concatenated the strings without
* keeping track of the referenced store paths, then if the
* string is used as a derivation attribute, the derivation
* will not have the correct dependencies in its inputDrvs and
* inputSrcs.
* The semantics of the context is as follows: when a string
* with context C is used as a derivation attribute, then the
* derivations in C will be added to the inputDrvs of the
* derivation, and the other store paths in C will be added to
* the inputSrcs of the derivations.
* For canonicity, the store paths should be in sorted order.
*/
struct StringWithContext {
const char * c_str;
const char * * context; // must be in sorted order
};
2023-11-12 04:03:37 +02:00
struct Path {
SourceAccessor * accessor;
2023-11-12 04:03:37 +02:00
const char * path;
};
2023-11-12 04:06:04 +02:00
struct ClosureThunk {
Env * env;
Expr * expr;
};
struct FunctionApplicationThunk {
Value * left, * right;
};
2023-11-12 04:07:32 +02:00
struct Lambda {
Env * env;
ExprLambda * fun;
};
using Payload = union
{
NixInt integer;
bool boolean;
StringWithContext string;
Path path;
Bindings * attrs;
struct {
2018-05-02 14:56:34 +03:00
size_t size;
Value * const * elems;
} bigList;
Value * smallList[2];
2023-11-12 04:06:04 +02:00
ClosureThunk thunk;
FunctionApplicationThunk app;
2023-11-12 04:07:32 +02:00
Lambda lambda;
PrimOp * primOp;
FunctionApplicationThunk primOpApp;
ExternalValueBase * external;
NixFloat fpoint;
};
Payload payload;
/**
* Returns the normal type of a Value. This only returns nThunk if
* the Value hasn't been forceValue'd
*
* @param invalidIsThunk Instead of aborting an an invalid (probably
* 0, so uninitialized) internal type, return `nThunk`.
*/
inline ValueType type(bool invalidIsThunk = false) const
{
switch (internalType) {
2024-04-20 15:15:05 +03:00
case tUninitialized: break;
case tInt: return nInt;
case tBool: return nBool;
case tString: return nString;
case tPath: return nPath;
case tNull: return nNull;
case tAttrs: return nAttrs;
case tList1: case tList2: case tListN: return nList;
case tLambda: case tPrimOp: case tPrimOpApp: return nFunction;
case tExternal: return nExternal;
case tFloat: return nFloat;
case tThunk: case tApp: return nThunk;
}
if (invalidIsThunk)
return nThunk;
else
2024-07-20 23:46:09 +03:00
unreachable();
}
inline void finishValue(InternalType newType, Payload newPayload)
{
payload = newPayload;
internalType = newType;
}
2024-04-20 15:15:05 +03:00
/**
* A value becomes valid when it is initialized. We don't use this
* in the evaluator; only in the bindings, where the slight extra
* cost is warranted because of inexperienced callers.
*/
inline bool isValid() const
{
2024-04-20 15:15:05 +03:00
return internalType != tUninitialized;
}
language: cleanly ban integer overflows This also bans various sneaking of negative numbers from the language into unsuspecting builtins as was exposed while auditing the consequences of changing the Nix language integer type to a newtype. It's unlikely that this change comprehensively ensures correctness when passing integers out of the Nix language and we should probably add a checked-narrowing function or something similar, but that's out of scope for the immediate change. During the development of this I found a few fun facts about the language: - You could overflow integers by converting from unsigned JSON values. - You could overflow unsigned integers by converting negative numbers into them when going into Nix config, into fetchTree, and into flake inputs. The flake inputs and Nix config cannot actually be tested properly since they both ban thunks, however, we put in checks anyway because it's possible these could somehow be used to do such shenanigans some other way. Note that Lix has banned Nix language integer overflows since the very first public beta, but threw a SIGILL about them because we run with -fsanitize=signed-overflow -fsanitize-undefined-trap-on-error in production builds. Since the Nix language uses signed integers, overflow was simply undefined behaviour, and since we defined that to trap, it did. Trapping on it was a bad UX, but we didn't even entirely notice that we had done this at all until it was reported as a bug a couple of months later (which is, to be fair, that flag working as intended), and it's got enough production time that, aside from code that is IMHO buggy (and which is, in any case, not in nixpkgs) such as https://git.lix.systems/lix-project/lix/issues/445, we don't think anyone doing anything reasonable actually depends on wrapping overflow. Even for weird use cases such as doing funny bit crimes, it doesn't make sense IMO to have wrapping behaviour, since two's complement arithmetic overflow behaviour is so *aggressively* not what you want for *any* kind of mathematics/algorithms. The Nix language exists for package management, a domain where bit crimes are already only dubiously in scope to begin with, and it makes a lot more sense for that domain for the integers to never lose precision, either by throwing errors if they would, or by being arbitrary-precision. Fixes: https://github.com/NixOS/nix/issues/10968 Original-CL: https://gerrit.lix.systems/c/lix/+/1596 Change-Id: I51f253840c4af2ea5422b8a420aa5fafbf8fae75
2024-07-12 17:22:34 +03:00
inline void mkInt(NixInt::Inner n)
{
mkInt(NixInt{n});
}
inline void mkInt(NixInt n)
{
finishValue(tInt, { .integer = n });
}
inline void mkBool(bool b)
{
finishValue(tBool, { .boolean = b });
}
inline void mkString(const char * s, const char * * context = 0)
{
finishValue(tString, { .string = { .c_str = s, .context = context } });
}
void mkString(std::string_view s);
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 03:31:10 +02:00
void mkString(std::string_view s, const NixStringContext & context);
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 03:31:10 +02:00
void mkStringMove(const char * s, const NixStringContext & context);
2024-08-13 05:18:14 +03:00
inline void mkString(const SymbolStr & s)
{
2024-08-13 05:18:14 +03:00
mkString(s.c_str());
}
void mkPath(const SourcePath & path);
2023-07-14 16:53:30 +03:00
void mkPath(std::string_view path);
inline void mkPath(SourceAccessor * accessor, const char * path)
{
finishValue(tPath, { .path = { .accessor = accessor, .path = path } });
}
inline void mkNull()
{
finishValue(tNull, {});
}
inline void mkAttrs(Bindings * a)
{
finishValue(tAttrs, { .attrs = a });
}
Value & mkAttrs(BindingsBuilder & bindings);
void mkList(const ListBuilder & builder)
{
if (builder.size == 1)
finishValue(tList1, { .smallList = { builder.inlineElems[0] } });
else if (builder.size == 2)
finishValue(tList2, { .smallList = { builder.inlineElems[0], builder.inlineElems[1] } });
else
finishValue(tListN, { .bigList = { .size = builder.size, .elems = builder.elems } });
}
inline void mkThunk(Env * e, Expr * ex)
{
finishValue(tThunk, { .thunk = { .env = e, .expr = ex } });
}
inline void mkApp(Value * l, Value * r)
{
finishValue(tApp, { .app = { .left = l, .right = r } });
}
inline void mkLambda(Env * e, ExprLambda * f)
{
finishValue(tLambda, { .lambda = { .env = e, .fun = f } });
}
inline void mkBlackhole();
2023-11-16 12:10:25 +02:00
void mkPrimOp(PrimOp * p);
inline void mkPrimOpApp(Value * l, Value * r)
{
finishValue(tPrimOpApp, { .primOpApp = { .left = l, .right = r } });
}
/**
* For a `tPrimOpApp` value, get the original `PrimOp` value.
*/
const PrimOp * primOpAppPrimOp() const;
inline void mkExternal(ExternalValueBase * e)
{
finishValue(tExternal, { .external = e });
}
inline void mkFloat(NixFloat n)
{
finishValue(tFloat, { .fpoint = n });
}
bool isList() const
{
return internalType == tList1 || internalType == tList2 || internalType == tListN;
}
Value * const * listElems()
{
return internalType == tList1 || internalType == tList2 ? payload.smallList : payload.bigList.elems;
}
std::span<Value * const> listItems() const
{
assert(isList());
return std::span<Value * const>(listElems(), listSize());
}
Value * const * listElems() const
{
return internalType == tList1 || internalType == tList2 ? payload.smallList : payload.bigList.elems;
}
2018-05-02 14:56:34 +03:00
size_t listSize() const
{
return internalType == tList1 ? 1 : internalType == tList2 ? 2 : payload.bigList.size;
}
PosIdx determinePos(const PosIdx pos) const;
/**
* Check whether forcing this value requires a trivial amount of
* computation. In particular, function applications are
* non-trivial.
*/
bool isTrivial() const;
2020-06-29 20:08:37 +03:00
SourcePath path() const
{
assert(internalType == tPath);
2023-11-30 17:44:54 +02:00
return SourcePath(
ref(payload.path.accessor->shared_from_this()),
CanonPath(CanonPath::unchecked_t(), payload.path.path));
}
std::string_view string_view() const
{
assert(internalType == tString);
return std::string_view(payload.string.c_str);
}
const char * c_str() const
{
assert(internalType == tString);
return payload.string.c_str;
}
const char * * context() const
{
return payload.string.context;
}
ExternalValueBase * external() const
{ return payload.external; }
const Bindings * attrs() const
{ return payload.attrs; }
const PrimOp * primOp() const
{ return payload.primOp; }
bool boolean() const
{ return payload.boolean; }
NixInt integer() const
{ return payload.integer; }
NixFloat fpoint() const
{ return payload.fpoint; }
};
extern ExprBlackHole eBlackHole;
bool Value::isBlackhole() const
{
return internalType == tThunk && payload.thunk.expr == (Expr*) &eBlackHole;
}
void Value::mkBlackhole()
{
mkThunk(nullptr, (Expr *) &eBlackHole);
}
2022-05-25 16:49:41 +03:00
typedef std::vector<Value *, traceable_allocator<Value *>> ValueVector;
2024-06-06 20:32:46 +03:00
typedef std::unordered_map<Symbol, Value *, std::hash<Symbol>, std::equal_to<Symbol>, traceable_allocator<std::pair<const Symbol, Value *>>> ValueMap;
2022-05-25 16:49:41 +03:00
typedef std::map<Symbol, ValueVector, std::less<Symbol>, traceable_allocator<std::pair<const Symbol, ValueVector>>> ValueVectorMap;
/**
* A value allocated in traceable memory.
*/
typedef std::shared_ptr<Value *> RootValue;
RootValue allocRootValue(Value * v);
}