Commit graph

91 commits

Author SHA1 Message Date
Jade Lovelace
473d2d56fc Remove 100s of CPU time (10%) from build times (1465s -> 1302s)
Result's from Mic92's framework 13th Gen Intel Core i7-1360P:

Before: 3595.92s user 183.01s system 1360% cpu 4:37.74 total
After: 3486.07s user 168.93s system 1354% cpu 4:29.79 total

I saw that boost/lexical_cast was costing about 100s in CPU time on our
compiles. We can fix this trivially by doing explicit template
instantiation in exactly one place and eliminating all other includes of
it, which is a code improvement anyway by hiding the boost.

Before:
```
lix/lix2 » ClangBuildAnalyzer --analyze buildtimeold.bin
Analyzing build trace from 'buildtimeold.bin'...
**** Time summary:
Compilation (551 times):
  Parsing (frontend):         1465.3 s
  Codegen & opts (backend):   1110.9 s

<snip>

**** Expensive headers:
178153 ms: ../src/libcmd/installable-value.hh (included 52 times, avg 3426 ms), included via:
  40x: command.hh
  5x: command-installable-value.hh
  3x: installable-flake.hh
  2x: <direct include>
  2x: installable-attr-path.hh

176217 ms: ../src/libutil/error.hh (included 246 times, avg 716 ms), included via:
  36x: command.hh installable-value.hh installables.hh derived-path.hh config.hh experimental-features.hh
  12x: globals.hh config.hh experimental-features.hh
  11x: file-system.hh file-descriptor.hh
  6x: serialise.hh strings.hh
  6x: <direct include>
  6x: archive.hh serialise.hh strings.hh
  ...

173243 ms: ../src/libstore/store-api.hh (included 152 times, avg 1139 ms), included via:
  55x: <direct include>
  39x: command.hh installable-value.hh installables.hh
  7x: libexpr.hh
  4x: local-store.hh
  4x: command-installable-value.hh installable-value.hh installables.hh
  3x: binary-cache-store.hh
  ...

170482 ms: ../src/libutil/serialise.hh (included 201 times, avg 848 ms), included via:
  37x: command.hh installable-value.hh installables.hh built-path.hh realisation.hh hash.hh
  14x: store-api.hh nar-info.hh hash.hh
  11x: <direct include>
  7x: primops.hh eval.hh attr-set.hh nixexpr.hh value.hh source-path.hh archive.hh
  7x: libexpr.hh value.hh source-path.hh archive.hh
  6x: fetchers.hh hash.hh
  ...

169397 ms: ../src/libcmd/installables.hh (included 53 times, avg 3196 ms), included via:
  40x: command.hh installable-value.hh
  5x: command-installable-value.hh installable-value.hh
  3x: installable-flake.hh installable-value.hh
  2x: <direct include>
  1x: installable-derived-path.hh
  1x: installable-value.hh
  ...

159740 ms: ../src/libutil/strings.hh (included 221 times, avg 722 ms), included via:
  37x: command.hh installable-value.hh installables.hh built-path.hh realisation.hh hash.hh serialise.hh
  19x: <direct include>
  14x: store-api.hh nar-info.hh hash.hh serialise.hh
  11x: serialise.hh
  7x: primops.hh eval.hh attr-set.hh nixexpr.hh value.hh source-path.hh archive.hh serialise.hh
  7x: libexpr.hh value.hh source-path.hh archive.hh serialise.hh
  ...

156796 ms: ../src/libcmd/command.hh (included 51 times, avg 3074 ms), included via:
  42x: <direct include>
  7x: command-installable-value.hh
  2x: installable-attr-path.hh

150392 ms: ../src/libutil/types.hh (included 251 times, avg 599 ms), included via:
  36x: command.hh installable-value.hh installables.hh path.hh
  11x: file-system.hh
  10x: globals.hh
  6x: fetchers.hh
  6x: serialise.hh strings.hh error.hh
  5x: archive.hh
  ...

133101 ms: /nix/store/644b90j1vms44nr18yw3520pzkrg4dd1-boost-1.81.0-dev/include/boost/lexical_cast.hpp (included 226 times, avg 588 ms), included via
:
  37x: command.hh installable-value.hh installables.hh built-path.hh realisation.hh hash.hh serialise.hh strings.hh
  19x: file-system.hh
  11x: store-api.hh nar-info.hh hash.hh serialise.hh strings.hh
  7x: primops.hh eval.hh attr-set.hh nixexpr.hh value.hh source-path.hh archive.hh serialise.hh strings.hh
  7x: libexpr.hh value.hh source-path.hh archive.hh serialise.hh strings.hh
  6x: eval.hh attr-set.hh nixexpr.hh value.hh source-path.hh archive.hh serialise.hh strings.hh
  ...

132887 ms: /nix/store/h2abv2l8irqj942i5rq9wbrj42kbsh5y-gcc-12.3.0/include/c++/12.3.0/memory (included 262 times, avg 507 ms), included via:
  36x: command.hh installable-value.hh installables.hh path.hh types.hh ref.hh
  16x: gtest.h
  11x: file-system.hh types.hh ref.hh
  10x: globals.hh types.hh ref.hh
  10x: json.hpp
  6x: serialise.hh
  ...

  done in 0.6s.
```

After:
```
lix/lix2 » maintainers/buildtime_report.sh build
Processing all files and saving to '/home/jade/lix/lix2/maintainers/../buildtime.bin'...
  done in 0.6s. Run 'ClangBuildAnalyzer --analyze /home/jade/lix/lix2/maintainers/../buildtime.bin' to analyze it.
Analyzing build trace from '/home/jade/lix/lix2/maintainers/../buildtime.bin'...
**** Time summary:
Compilation (551 times):
  Parsing (frontend):         1302.1 s
  Codegen & opts (backend):    956.3 s

<snip>

**** Expensive headers:
178145 ms: ../src/libutil/error.hh (included 246 times, avg 724 ms), included via:
  36x: command.hh installable-value.hh installables.hh derived-path.hh config.hh experimental-features.hh
  12x: globals.hh config.hh experimental-features.hh
  11x: file-system.hh file-descriptor.hh
  6x: <direct include>
  6x: serialise.hh strings.hh
  6x: fetchers.hh hash.hh serialise.hh strings.hh
  ...

154043 ms: ../src/libcmd/installable-value.hh (included 52 times, avg 2962 ms), included via:
  40x: command.hh
  5x: command-installable-value.hh
  3x: installable-flake.hh
  2x: <direct include>
  2x: installable-attr-path.hh

153593 ms: ../src/libstore/store-api.hh (included 152 times, avg 1010 ms), included via:
  55x: <direct include>
  39x: command.hh installable-value.hh installables.hh
  7x: libexpr.hh
  4x: local-store.hh
  4x: command-installable-value.hh installable-value.hh installables.hh
  3x: binary-cache-store.hh
  ...

149948 ms: ../src/libutil/types.hh (included 251 times, avg 597 ms), included via:
  36x: command.hh installable-value.hh installables.hh path.hh
  11x: file-system.hh
  10x: globals.hh
  6x: fetchers.hh
  6x: serialise.hh strings.hh error.hh
  5x: archive.hh
  ...

144560 ms: ../src/libcmd/installables.hh (included 53 times, avg 2727 ms), included via:
  40x: command.hh installable-value.hh
  5x: command-installable-value.hh installable-value.hh
  3x: installable-flake.hh installable-value.hh
  2x: <direct include>
  1x: installable-value.hh
  1x: installable-derived-path.hh
  ...

136585 ms: ../src/libcmd/command.hh (included 51 times, avg 2678 ms), included via:
  42x: <direct include>
  7x: command-installable-value.hh
  2x: installable-attr-path.hh

133394 ms: /nix/store/h2abv2l8irqj942i5rq9wbrj42kbsh5y-gcc-12.3.0/include/c++/12.3.0/memory (included 262 times, avg 509 ms), included via:
  36x: command.hh installable-value.hh installables.hh path.hh types.hh ref.hh
  16x: gtest.h
  11x: file-system.hh types.hh ref.hh
  10x: globals.hh types.hh ref.hh
  10x: json.hpp
  6x: serialise.hh
  ...

89315 ms: ../src/libstore/derived-path.hh (included 178 times, avg 501 ms), included via:
  37x: command.hh installable-value.hh installables.hh
  25x: store-api.hh realisation.hh
  7x: primops.hh eval.hh attr-set.hh nixexpr.hh value.hh context.hh
  6x: eval.hh attr-set.hh nixexpr.hh value.hh context.hh
  6x: libexpr.hh value.hh context.hh
  6x: shared.hh
  ...

87347 ms: /nix/store/h2abv2l8irqj942i5rq9wbrj42kbsh5y-gcc-12.3.0/include/c++/12.3.0/ostream (included 273 times, avg 319 ms), included via:
  35x: command.hh installable-value.hh installables.hh path.hh types.hh ref.hh memory unique_ptr.h
  12x: regex sstream istream
  10x: file-system.hh types.hh ref.hh memory unique_ptr.h
  10x: gtest.h memory unique_ptr.h
  10x: globals.hh types.hh ref.hh memory unique_ptr.h
  6x: fetchers.hh types.hh ref.hh memory unique_ptr.h
  ...

85249 ms: ../src/libutil/config.hh (included 213 times, avg 400 ms), included via:
  37x: command.hh installable-value.hh installables.hh derived-path.hh
  20x: globals.hh
  20x: logging.hh
  16x: store-api.hh logging.hh
  6x: <direct include>
  6x: eval.hh attr-set.hh nixexpr.hh value.hh context.hh derived-path.hh
  ...

  done in 0.5s.
```

Adapated from 18aa3e1d57
2024-05-31 13:00:09 +02:00
pennae
5d9fdab3de use byte indexed locations for PosIdx
we now keep not a table of all positions, but a table of all origins and
their sizes. position indices are now direct pointers into the virtual
concatenation of all parsed contents. this slightly reduces memory usage
and time spent in the parser, at the cost of not being able to report
positions if the total input size exceeds 4GiB. this limit is not unique
to nix though, rustc and clang also limit their input to 4GiB (although
at least clang refuses to process inputs that are larger, we will not).

this new 4GiB limit probably will not cause any problems for quite a
while, all of nixpkgs together is less than 100MiB in size and already
needs over 700MiB of memory and multiple seconds just to parse. 4GiB
worth of input will easily take multiple minutes and over 30GiB of
memory without even evaluating anything. if problems *do* arise we can
probably recover the old table-based system by adding some tracking to
Pos::Origin (or increasing the size of PosIdx outright), but for time
being this looks like more complexity than it's worth.

since we now need to read the entire input again to determine the
line/column of a position we'll make unsafeGetAttrPos slightly lazy:
mostly the set it returns is only used to determine the file of origin
of an attribute, not its exact location. the thunks do not add
measurable runtime overhead.

notably this change is necessary to allow changing the parser since
apparently nothing supports nix's very idiosyncratic line ending choice
of "anything goes", making it very hard to calculate line/column
positions in the parser (while byte offsets are very easy).
2024-03-06 23:48:42 +01:00
Jade Lovelace
a82aeedb5b Warn on implicit switch case fallthrough
This seems to have found one actual bug in fs-sink.cc: the symlink case
was falling into the regular file case, which can't possibly be
intentional, right?
2024-02-24 15:52:16 -08:00
Rebecca Turner
c0e7f50c1a
Rename hintfmt to HintFmt 2024-02-08 11:58:25 -08:00
Rebecca Turner
c6a89c1a16
libexpr: Support structured error classes
While preparing PRs like #9753, I've had to change error messages in
dozens of code paths. It would be nice if instead of

    EvalError("expected 'boolean' but found '%1%'", showType(v))

we could write

    TypeError(v, "boolean")

or similar. Then, changing the error message could be a mechanical
refactor with the compiler pointing out places the constructor needs to
be changed, rather than the error-prone process of grepping through the
codebase. Structured errors would also help prevent the "same" error
from having multiple slightly different messages, and could be a first
step towards error codes / an error index.

This PR reworks the exception infrastructure in `libexpr` to
support exception types with different constructor signatures than
`BaseError`. Actually refactoring the exceptions to use structured data
will come in a future PR (this one is big enough already, as it has to
touch every exception in `libexpr`).

The core design is in `eval-error.hh`. Generally, errors like this:

    state.error("'%s' is not a string", getAttrPathStr())
      .debugThrow<TypeError>()

are transformed like this:

    state.error<TypeError>("'%s' is not a string", getAttrPathStr())
      .debugThrow()

The type annotation has moved from `ErrorBuilder::debugThrow` to
`EvalState::error`.
2024-02-01 16:39:38 -08:00
pennae
b596cc9e79 decouple parser and EvalState
there's no reason the parser itself should be doing semantic analysis
like bindVars. split this bit apart (retaining the previous name in
EvalState) and have the parser really do *only* parsing, decoupled from
EvalState.
2024-01-15 16:52:18 +01:00
pennae
835a6c7bcf rename ParserState::{makeCurPos -> at}
most instances of this being used do not refer to the "current"
position, sometimes not even to one reasonably close by. it could also
be called `makePos` instead, but `at` seems clear in context.
2024-01-15 16:52:18 +01:00
pennae
0076056164 move ParseData to own header, rename to ParserState
ParserState better describes what this struct really is. the parser
really does modify its state (most notably position and symbol tables),
so calling it that rather than obliquely "data" (which implies being
input only) makes sense.
2024-01-15 16:52:18 +01:00
John Ericson
e739a5002d Avoid Windows macros in the parser and lexer
`FLOAT`, `INT`, and `IN` are identifers taken by macros.

The name `IN_KW` is chosen to match `OR_KW`, which is presumably named
that way for the same reason of dodging macros.
2024-01-12 19:51:36 -05:00
pennae
b78e77b34c use custom location type in the parser
~1% parser speedup from not using TLS indirections, less on system eval.
this could have also gone in flex yyextra data, but that's significantly
slower for some reason (albeit still faster than thread locals).

before:

  Time (mean ± σ):      4.231 s ±  0.004 s    [User: 3.725 s, System: 0.504 s]
  Range (min … max):    4.226 s …  4.240 s    10 runs

after:

  Time (mean ± σ):      4.224 s ±  0.005 s    [User: 3.711 s, System: 0.512 s]
  Range (min … max):    4.218 s …  4.234 s    10 runs
2023-12-19 19:32:16 +01:00
pennae
2e0321912a use aligned flex tables
~2% speedup on parsing without eval, less (but still significant) on
system eval. having flex generate faster parsers leads to very strange
misparses. maybe re2c is worth investigating.

before:

  Time (mean ± σ):      4.260 s ±  0.003 s    [User: 3.754 s, System: 0.505 s]
  Range (min … max):    4.257 s …  4.266 s    10 runs

after:

  Time (mean ± σ):      4.231 s ±  0.004 s    [User: 3.725 s, System: 0.504 s]
  Range (min … max):    4.226 s …  4.240 s    10 runs
2023-12-19 19:32:16 +01:00
Yingchi Long
3c90340fe6 libexpr: use thread_local to make the parser thread-safe
If we call `adjustLoc`, the global variable `prev_yylloc` is shared
between threads and racy.

Currently, nix itself does not concurrently parsing files, but this is
helpful for libexpr users. (The parser is thread-safe except this.)
2023-07-03 16:05:43 +08:00
Eelco Dolstra
27ebb97d0a
Handle EOFs in string literals correctly
We can't return a STR token without setting a valid StringToken,
otherwise the parser will crash.

Fixes #6562.
2022-05-25 17:58:13 +02:00
pennae
6526d1676b replace most Pos objects/ptrs with indexes into a position table
Pos objects are somewhat wasteful as they duplicate the origin file name and
input type for each object. on files that produce more than one Pos when parsed
this a sizeable waste of memory (one pointer per Pos). the same goes for
ptr<Pos> on 64 bit machines: parsing enough source to require 8 bytes to locate
a position would need at least 8GB of input and 64GB of expression memory. it's
not likely that we'll hit that any time soon, so we can use a uint32_t index to
locate positions instead.
2022-04-21 21:46:06 +02:00
Sergei Trofimovich
9174d884d7 lexer: add error location to lexer errors
Before the change lexter errors did not report the location:

    $ nix build -f. mc
    error: path has a trailing slash
    (use '--show-trace' to show detailed location information)

Note that it's not clear what file generates the error.

After the change location is reported:

    $ src/nix/nix --extra-experimental-features nix-command build -f ~/nm mc
    error: path has a trailing slash

           at .../pkgs/development/libraries/glib/default.nix:54:18:

               53|   };
               54|   src = /tmp/foo/;
                 |                  ^
               55|
    (use '--show-trace' to show detailed location information)

Here we see both problematic file and the string itself.
2022-03-24 08:16:14 +00:00
pennae
0a7746603e remove ExprIndStr
it can be replaced with StringToken if we add another bit if information to
StringToken, namely whether this string should take part in indentation scanning
or not. since all escaping terminates indentation scanning we need to set this
bit only for the non-escaped IND_STRING rule.

this improves performance by about 1%.

 before

  nix search --no-eval-cache --offline ../nixpkgs hello
    Time (mean ± σ):      8.880 s ±  0.048 s    [User: 6.809 s, System: 1.643 s]
    Range (min … max):    8.781 s …  8.993 s    20 runs

  nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
    Time (mean ± σ):     375.0 ms ±   2.2 ms    [User: 339.8 ms, System: 35.2 ms]
    Range (min … max):   371.5 ms … 379.3 ms    20 runs

  nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
    Time (mean ± σ):      2.831 s ±  0.040 s    [User: 2.536 s, System: 0.225 s]
    Range (min … max):    2.769 s …  2.912 s    20 runs

 after

  nix search --no-eval-cache --offline ../nixpkgs hello
    Time (mean ± σ):      8.832 s ±  0.048 s    [User: 6.757 s, System: 1.657 s]
    Range (min … max):    8.743 s …  8.921 s    20 runs

  nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
    Time (mean ± σ):     367.4 ms ±   3.2 ms    [User: 332.7 ms, System: 34.7 ms]
    Range (min … max):   364.6 ms … 374.6 ms    20 runs

  nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
    Time (mean ± σ):      2.810 s ±  0.030 s    [User: 2.517 s, System: 0.225 s]
    Range (min … max):    2.742 s …  2.854 s    20 runs
2022-01-19 13:39:42 +01:00
pennae
72f42093e7 optimize unescapeStr
mainly to avoid an allocation and a copy of a string that can be
modified in place (ever since EvalState holds on to the buffer, not the
generated parser itself).

 # before

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     571.7 ms ±   2.4 ms    [User: 563.3 ms, System: 8.0 ms]
  Range (min … max):   566.7 ms … 579.7 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     376.6 ms ±   1.0 ms    [User: 345.8 ms, System: 30.5 ms]
  Range (min … max):   374.5 ms … 379.1 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.922 s ±  0.006 s    [User: 2.707 s, System: 0.215 s]
  Range (min … max):    2.906 s …  2.934 s    50 runs

 # after

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     570.4 ms ±   2.8 ms    [User: 561.3 ms, System: 8.6 ms]
  Range (min … max):   564.6 ms … 578.1 ms    50 runs

Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
  Time (mean ± σ):     375.4 ms ±   1.3 ms    [User: 343.2 ms, System: 31.7 ms]
  Range (min … max):   373.4 ms … 378.2 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.925 s ±  0.006 s    [User: 2.704 s, System: 0.219 s]
  Range (min … max):    2.910 s …  2.942 s    50 runs
2022-01-13 18:06:15 +01:00
pennae
61a9d16d5c don't strdup tokens in the lexer
every stringy token the lexer returns is turned into a Symbol and not
used further, so we don't have to strdup. using a string_view is
sufficient, but due to limitations of the current parser we have to use
a POD type that holds the same information.

gives ~2% on system build, 6% on search, 8% on parsing alone

 # before

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     610.6 ms ±   2.4 ms    [User: 602.5 ms, System: 7.8 ms]
  Range (min … max):   606.6 ms … 617.3 ms    50 runs

Benchmark 2: nix eval -f hackage-packages.nix
  Time (mean ± σ):     430.1 ms ±   1.4 ms    [User: 393.1 ms, System: 36.7 ms]
  Range (min … max):   428.2 ms … 434.2 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      3.032 s ±  0.005 s    [User: 2.808 s, System: 0.223 s]
  Range (min … max):    3.023 s …  3.041 s    50 runs

 # after

Benchmark 1: nix search --offline nixpkgs hello
  Time (mean ± σ):     574.7 ms ±   2.8 ms    [User: 566.3 ms, System: 8.0 ms]
  Range (min … max):   569.2 ms … 580.7 ms    50 runs

Benchmark 2: nix eval -f hackage-packages.nix
  Time (mean ± σ):     394.4 ms ±   0.8 ms    [User: 361.8 ms, System: 32.3 ms]
  Range (min … max):   392.7 ms … 395.7 ms    50 runs

Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
  Time (mean ± σ):      2.976 s ±  0.005 s    [User: 2.757 s, System: 0.218 s]
  Range (min … max):    2.966 s …  2.990 s    50 runs
2022-01-13 18:06:14 +01:00
Eelco Dolstra
81e7c40264 Optimize primop calls
We now parse function applications as a vector of arguments rather
than as a chain of binary applications, e.g. 'substring 1 2 "foo"' is
parsed as

  ExprCall { .fun = <substring>, .args = [ <1>, <2>, <"foo"> ] }

rather than

  ExprApp (ExprApp (ExprApp <substring> <1>) <2>) <"foo">

This allows primops to be called immediately (if enough arguments are
supplied) without having to allocate intermediate tPrimOpApp values.

On

  $ nix-instantiate --dry-run '<nixpkgs/nixos/release-combined.nix>' -A nixos.tests.simple.x86_64-linux

this gives a substantial performance improvement:

  user CPU time:      median =      0.9209  mean =      0.9218  stddev =      0.0073  min =      0.9086  max =      0.9340  [rejected, p=0.00000, Δ=-0.21433±0.00677]
  elapsed time:       median =      1.0585  mean =      1.0584  stddev =      0.0024  min =      1.0523  max =      1.0623  [rejected, p=0.00000, Δ=-0.20594±0.00236]

because it reduces the number of tPrimOpApp allocations from 551990 to
42534 (i.e. only small minority of primop calls are partially
applied) which in turn reduces time spent in the garbage collector.
2021-11-04 15:03:40 +01:00
Taeer Bar-Yam
f14660d5e2 reset yylloc when yyless(0) is called 2021-09-29 19:47:01 -04:00
Taeer Bar-Yam
8f9429dcab add antiquotations to paths 2021-08-06 06:46:05 -04:00
Pamplemousse
99f8fc995b libexpr: Fix read out-of-bound on the heap
Signed-off-by: Pamplemousse <xav.maso@gmail.com>
2021-07-14 09:09:42 -07:00
regnat
0d9e1af695 Remove an unknown pragma gcc warning 2020-12-02 14:33:20 +01:00
regnat
438977731c shut up clang warnings
- Fix some class/struct discrepancies
- Explicit the overloading of `run` in the `Cmd*` classes
- Ignore a warning in the generated lexer
2020-12-01 15:04:03 +01:00
Eelco Dolstra
e14e62fddd Remove trailing whitespace 2020-06-15 14:12:39 +02:00
Ben Burdette
3bc9155dfc a few more 'format's rremoved 2020-04-22 15:00:11 -06:00
Guillaume Maudoux
6a5bf9b143 simplify handling of extra '}' 2018-10-27 00:14:51 +02:00
aszlig
0ad643ed5c
libexpr: Use int64_t for NixInt
Using a 64bit integer on 32bit systems will come with a bit of a
performance overhead, but given that Nix doesn't use a lot of integers
compared to other types, I think the overhead is negligible also
considering that 32bit systems are in decline.

The biggest advantage however is that when we use a consistent integer
size across all platforms it's less likely that we miss things that we
break due to that. One example would be:

https://github.com/NixOS/nixpkgs/pull/44233

On Hydra it will evaluate, because the evaluator runs on a 64bit
machine, but when evaluating the same on a 32bit machine it will fail,
so using 64bit integers should make that consistent.

While the change of the type in value.hh is rather easy to do, we have a
few more options available for doing the conversion in the lexer:

  * Via an #ifdef on the architecture and using strtol() or strtoll()
    accordingly depending on which architecture we are. For the #ifdef
    we would need another AX_COMPILE_CHECK_SIZEOF in configure.ac.
  * Using istringstream, which would involve copying the value.
  * As we're already using boost, lexical_cast might be a good idea.

Spoiler: I went for the latter, first of all because lexical_cast does
have an overload for const char* and second of all, because it doesn't
involve copying around the input string. Also, because istringstream
seems to come with a bigger overhead than boost::lexical_cast:

https://www.boost.org/doc/libs/release/doc/html/boost_lexical_cast/performance.html

The first method (still using strtol/strtoll) also wasn't something I
pursued further, because it is also locale-aware which I doubt is what
we want, given that the regex for int is [0-9]+.

Signed-off-by: aszlig <aszlig@nix.build>
Fixes: #2339
2018-08-29 01:05:52 +02:00
Eelco Dolstra
1ad19232c4
Don't return negative numbers from the flex tokenizer
Fixes #1374.
Closes #2129.
2018-05-11 12:05:12 +02:00
Eelco Dolstra
f3c85f9eb3
Revert "Throw a specific error for incomplete parse errors."
This reverts commit 6498adb002. We don't
actually use IncompleteParseError in 'nix repl'.
2018-05-11 11:40:50 +02:00
Tuomas Tynkkynen
a0e38c16bc libexpr: Recognize newline in more places in lexer
Flex's regexes have an annoying feature: the dot matches everything
except a newline. This causes problems for expressions like:

"${0}\
"

where the backslash-newline combination matches this rule instead of the
intended one mentioned in the comment:

    <STRING>\$|\\|\$\\ {
                    /* This can only occur when we reach EOF, otherwise the above
                    (...|\$[^\{\"\\]|\\.|\$\\.)+ would have triggered.
                    This is technically invalid, but we leave the problem to the
                    parser who fails with exact location. */
                    return STR;
                }
However, the parser actually accepts the resulting token sequence
('"' DOLLAR_CURLY 0 '}' STR '"'), which is a problem because the lexer
rule didn't assign anything to yylval. Ultimately this leads to a crash
when dereferencing a NULL pointer in ExprConcatStrings::bindVars().

The fix does change the syntax of the language in some corner cases
but I think it's only turning previously invalid (or crashing) syntax
to valid syntax. E.g.

"a\
b"

and

''a''\
b''

were previously syntax errors but now both result in "a\nb".

Found by afl-fuzz.
2018-03-02 17:30:48 +02:00
Tuomas Tynkkynen
f67a7007a2 libexpr: Pre-reserve space in string in unescapeStr()
Avoids some malloc() traffic.
2018-02-16 04:39:43 +02:00
Eelco Dolstra
2c39e4eca0
Revert "Don't parse "x:x" as a URI"
This reverts commit f90f660b24.

This broke Hydra's release.nix, which contained

  preCheck = ''export LOGNAME=${LOGNAME:-foo}'';
2017-11-14 15:10:52 +01:00
Eelco Dolstra
f90f660b24
Don't parse "x:x" as a URI
URIs now have to contain "://" or start with "channel:".
2017-10-30 17:58:01 +01:00
Jörg Thalheim
2fd8f8bb99 Replace Unicode quotes in user-facing strings by ASCII
Relevant RFC: NixOS/rfcs#4

$ ag -l | xargs sed -i -e "/\"/s/’/'/g;/\"/s/‘/'/g"
2017-07-30 12:32:45 +01:00
Guillaume Maudoux
a143014d73 lexer: remove catch-all rules hiding real errors
With catch-all rules, we hide potential errors.
It turns out that a4744254 made one cath-all useless. Flex detected that
is was impossible to reach.
The other is more subtle, as it can only trigger on unfinished escapes
in unfinished strings, which only occurs at EOF.
2017-05-01 01:18:06 +02:00
Guillaume Maudoux
a474425425 Fix lexer to support $' in multiline strings. 2017-05-01 01:15:40 +02:00
Eelco Dolstra
603f08506e
Tweak error message 2016-12-06 17:18:40 +01:00
Guillaume Maudoux
e4b82af387 Improve error message on trailing path slashes 2016-11-27 17:48:46 +01:00
Guillaume Maudoux
a5e761dddb Fix comments parsing
Fixed the parsing of multiline strings ending with an even number of
stars, like /** this **/.
Added test cases for comments.
2016-11-13 17:20:34 +01:00
Scott Olson
6498adb002 Throw a specific error for incomplete parse errors.
`nix-repl` will use this for deciding whether to keep waiting for input or
error out right away.
2016-02-24 04:32:21 -06:00
Eelco Dolstra
b3e8d72770 Merge pull request #762 from ctheune/ctheune-floats
Implement floats
2016-02-12 12:49:59 +01:00
Eelco Dolstra
5d8b7eb3e1 Revert "Revert "next try for "don't abort when given unmatched '}' with 'start-condition stack underflow'. This fixes #751"""
This reverts commit b669d3d2e8.
2016-01-20 16:34:42 +01:00
Eelco Dolstra
b669d3d2e8 Revert "next try for "don't abort when given unmatched '}' with 'start-condition stack underflow'. This fixes #751""
This reverts commit ed23c8568e. Let's
merge this *after* the 1.11.1 release.
2016-01-20 00:05:28 +01:00
Fabian Schmitthenner
ed23c8568e next try for "don't abort when given unmatched '}' with 'start-condition stack underflow'. This fixes #751"
This reverts commit 8120b6fb8a and fixes the regression introduced in
8d22b26448.
2016-01-19 20:35:35 +00:00
Eelco Dolstra
8120b6fb8a Revert "don't abort when given unmatched '}' with 'start-condition stack underflow'. This fixes #751"
This reverts commit 8d22b26448. It
breaks Nixpkgs:

$ nix-env -qa
error: syntax error, unexpected IND_STR, expecting '}', at /home/eelco/Dev/nixpkgs-stable/pkgs/top-level/python-packages.nix:7605:8
2016-01-19 20:33:32 +01:00
Fabian Schmitthenner
8d22b26448 don't abort when given unmatched '}' with 'start-condition stack underflow'. This fixes #751 2016-01-12 20:40:41 +00:00
Christian Theune
a12a43046b Edge condition: parser did not pick up floats starting exactly with 0. 2016-01-05 09:54:49 +01:00
Christian Theune
f872262e08 Fix up float parsing. 2016-01-05 09:46:37 +01:00
Christian Theune
494fc5acbb Try a simplified version of float lexing that didn't work.
The last one I tried was botchered anyway ...
2016-01-05 00:53:22 +01:00