nix-super

mirror of https://github.com/privatevoid-net/nix-super.git synced 2024-11-14 02:06:16 +02:00

Author	SHA1	Message	Date
Eelco Dolstra	a737f51fd9	Retry all SQLite operations To deal with SQLITE_PROTOCOL, we also need to retry read-only operations.	2013-10-16 15:58:20 +02:00
Eelco Dolstra	ff02f5336c	Fix a race in registerFailedPath() Registering the path as failed can fail if another process does the same thing after the call to hasPathFailed(). This is extremely unlikely though.	2013-10-16 14:55:53 +02:00
Eelco Dolstra	4bd5282573	Convenience macros for retrying a SQLite transaction	2013-10-16 14:46:35 +02:00
Eelco Dolstra	bce14d0f61	Don't wrap read-only queries in a transaction There is no risk of getting an inconsistent result here: if the ID returned by queryValidPathId() is deleted from the database concurrently, subsequent queries involving that ID will simply fail (since IDs are never reused).	2013-10-16 14:36:53 +02:00
Eelco Dolstra	7cdefdbe73	Print a distinct warning for SQLITE_PROTOCOL	2013-10-16 14:27:36 +02:00
Eelco Dolstra	d05bf04444	Treat SQLITE_PROTOCOL as SQLITE_BUSY In the Hydra build farm we fairly regularly get SQLITE_PROTOCOL errors (e.g., "querying path in database: locking protocol"). The docs for this error code say that it "is returned if some other process is messing with file locks and has violated the file locking protocol that SQLite uses on its rollback journal files." However, the SQLite source code reveals that this error can also occur under high load: if( cnt>5 ){ int nDelay = 1; /* Pause time in microseconds */ if( cnt>100 ){ VVA_ONLY( pWal->lockError = 1; ) return SQLITE_PROTOCOL; } if( cnt>=10 ) nDelay = (cnt-9)*238; /* Max delay 21ms. Total delay 996ms */ sqlite3OsSleep(pWal->pVfs, nDelay); } i.e. if certain locks cannot be acquired, SQLite will retry a number of times before giving up and returning SQLITE_PROTOCOL. The comments say: Circumstances that cause a RETRY should only last for the briefest instances of time. No I/O or other system calls are done while the locks are held, so the locks should not be held for very long. But if we are unlucky, another process that is holding a lock might get paged out or take a page-fault that is time-consuming to resolve, during the few nanoseconds that it is holding the lock. In that case, it might take longer than normal for the lock to free. ... The total delay time before giving up is less than 1 second. On a heavily loaded machine like lucifer (the main Hydra server), which often has dozens of processes waiting for I/O, it seems to me that a page fault could easily take more than a second to resolve. So, let's treat SQLITE_PROTOCOL as SQLITE_BUSY and retry the transaction. Issue NixOS/hydra#14.	2013-10-16 14:19:59 +02:00
Eelco Dolstra	936f9d45ba	Don't apply the CPU affinity hack to nix-shell (and other Perl programs) As discovered by Todd Veldhuizen, the shell started by nix-shell has its affinity set to a single CPU. This is because nix-shell connects to the Nix daemon, which causes the affinity hack to be applied. So we turn this off for Perl programs.	2013-09-06 16:36:56 +02:00
Eelco Dolstra	b29d3f4aee	Only show trace messages when tracing is enabled	2013-09-02 12:01:04 +02:00
Eelco Dolstra	efe4289464	Add an option to limit the log output of builders This is mostly useful for Hydra to deal with builders that get stuck in an infinite loop writing data to stdout/stderr.	2013-09-02 11:58:18 +02:00
Ivan Kozik	34bb806f74	Fix typos, especially those that end up in the Nix manual	2013-08-26 11:15:22 +02:00
Gergely Risko	c6c024ca6f	Fix personality switching from x86_64 to i686 On Linux, Nix can build i686 packages even on x86_64 systems. It's not enough to recognize this situation by settings.thisSystem, we also have to consult uname(). E.g. we can be running on an i686 Debian with an amd64 kernel. In that situation settings.thisSystem is i686-linux, but we still need to change personality to i686 to make builds consistent.	2013-08-26 11:12:35 +02:00
Eelco Dolstra	a583a2bc59	Run the daemon worker on the same CPU as the client On a system with multiple CPUs, running Nix operations through the daemon is significantly slower than "direct" mode: $ NIX_REMOTE= nix-instantiate '<nixos>' -A system real 0m0.974s user 0m0.875s sys 0m0.088s $ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system real 0m2.118s user 0m1.463s sys 0m0.218s The main reason seems to be that the client and the worker get moved to a different CPU after every call to the worker. This patch adds a hack to lock them to the same CPU. With this, the overhead of going through the daemon is very small: $ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system real 0m1.074s user 0m0.809s sys 0m0.098s	2013-08-07 14:02:04 +02:00
Eelco Dolstra	a4921b8ceb	Revert "build-remote.pl: Enforce timeouts locally" This reverts commit `69b8f9980f`. The timeout should be enforced remotely. Otherwise, if the garbage collector is running either locally or remotely, it will block the build or closure copying for some time. If the garbage collector takes too long, the build may time out, which is not what we want. Also, on heavily loaded systems, copying large paths to and from the remote machine can take a long time, also potentially resulting in a timeout.	2013-07-18 12:52:29 +02:00
Shea Levy	16591eb3cc	Allow bind-mounting regular files into the chroot mount(2) with MS_BIND allows mounting a regular file on top of a regular file, so there's no reason to only bind directories. This allows finer control over just which files are and aren't included in the chroot without having to build symlink trees or the like. Signed-off-by: Shea Levy <shea@shealevy.com>	2013-07-15 16:01:33 +02:00
Eelco Dolstra	aeb810b01e	Garbage collector: Don't follow symlinks arbitrarily Only indirect roots (symlinks to symlinks to the Nix store) are now supported.	2013-07-12 14:03:36 +02:00
Eelco Dolstra	7ccd946407	Don't set $preferLocalBuild and $requiredSystemFeatures in builders With C++ std::map, doing a comparison like ‘map["foo"] == ...’ has the side-effect of adding a mapping from "foo" to the empty string if "foo" doesn't exist in the map. So we ended up setting some environment variables by accident.	2013-06-20 18:07:27 +00:00
Eelco Dolstra	5558652709	Don't substitute derivations that have preferLocalBuild set In particular this means that "trivial" derivations such as writeText are not substituted, reducing the number of GET requests to the binary cache by about 200 on a typical NixOS configuration.	2013-06-20 19:26:31 +02:00
Eelco Dolstra	1906cce6fc	Increase SQLite's auto-checkpoint interval Common operations like instantiating a NixOS system config no longer fitted in 8192 pages, leading to more fsyncs. So increase this limit.	2013-06-20 14:01:33 +00:00
Eelco Dolstra	9b11165aec	Disable the copy-from-other-stores substituter This substituter basically cannot work reliably since we switched to SQLite, since SQLite databases may need write access to open them even just for reading (and in WAL mode they always do).	2013-06-20 12:01:33 +02:00
Eelco Dolstra	22144afa8d	Don't keep "disabled" substituters running For instance, it's pointless to keep copy-from-other-stores running if there are no other stores, or download-using-manifests if there are no manifests. This also speeds things up because we don't send queries to those substituters.	2013-06-20 11:55:15 +02:00
Eelco Dolstra	1b6ee8f4c7	Allow hard links between the outputs of a derivation	2013-06-13 17:29:56 +02:00
Eelco Dolstra	cd49ee0897	Fix a security bug in hash rewriting Before calling dumpPath(), we have to make sure the files are owned by the build user. Otherwise, the build could contain a hard link to (say) /etc/shadow, which would then be read by the daemon and rewritten as a world-readable file. This only affects systems that don't have hard link restrictions enabled.	2013-06-13 17:12:24 +02:00
Eelco Dolstra	1e2c7c04b1	Fix assertion failure in canonicalisePathMetaData() after hash rewriting The assertion in canonicalisePathMetaData() failed because the ownership of the path already changed due to the hash rewriting. The solution is not to check the ownership of rewritten paths. Issue #122.	2013-06-13 17:12:06 +02:00
Eelco Dolstra	6cc2a8f8ed	computeFSClosure: Only process the missing/corrupt paths Issue #122.	2013-06-13 16:43:20 +02:00
Eelco Dolstra	f9ff67e948	In repair mode, update the hash of rebuilt paths Otherwise subsequent invocations of "--repair" will keep rebuilding the path. This only happens if the path content differs between builds (e.g. due to timestamps).	2013-06-13 14:46:07 +02:00
Eelco Dolstra	ca70fba0bf	Remove obsolete EOF checks	2013-06-07 15:10:23 +02:00
Eelco Dolstra	5959c591a0	Process stderr from substituters while doing have/info queries	2013-06-07 15:02:14 +02:00
Eelco Dolstra	c5f9d0d080	Buffer reads from the substituter This greatly reduces the number of system calls.	2013-06-07 14:00:23 +02:00
Eelco Dolstra	b09b87321c	nix-store --export: Export paths in topologically sorted order Fixes #118.	2013-05-23 14:55:36 -04:00
Eelco Dolstra	2ee9da9e22	In trace messages, don't print the output path This doesn't work if there is no output named "out". Hydra didn't use it anyway.	2013-05-10 00:24:33 +02:00
Eelco Dolstra	6eba05613a	Communicate build timeouts to Hydra	2013-05-09 18:39:04 +02:00
Eelco Dolstra	69b8f9980f	build-remote.pl: Enforce timeouts locally Don't pass --timeout / --max-silent-time to the remote builder. Instead, let the local Nix process terminate the build if it exceeds a timeout. The remote builder will be killed as a side-effect. This gives better error reporting (since the timeout message from the remote side wasn't properly propagated) and handles non-Nix problems like SSH hangs.	2013-05-09 17:17:17 +02:00
Eelco Dolstra	470553bd05	Don't let stderr writes in substituters cause a deadlock	2013-05-01 13:21:39 +02:00
Eelco Dolstra	0374d94437	addAdditionalRoots(): Check each path only once	2013-04-26 12:07:25 +02:00
Eelco Dolstra	772b70952f	Fix --timeout I'm not sure if it has ever worked correctly. The line "lastWait = after;" seems to mean that the timer was reset every time a build produced log output. Note that the timeout is now per build, as documented ("the maximum number of seconds that a builder can run").	2013-04-23 18:04:59 +02:00
Eelco Dolstra	934cf2d1f4	Nix daemon: respect build timeout from the client	2013-04-23 16:59:06 +02:00
Eelco Dolstra	258897c265	Complain if /homeless-shelter exists	2013-04-04 11:16:26 +02:00
Shea Levy	cc63db1dd5	makeStoreWritable: Ask forgiveness, not permission It is surprisingly impossible to check if a mountpoint is a bind mount on Linux, and in my previous commit I forgot to check if /nix/store was even a mountpoint at all. statvfs.f_flag is not populated with MS_BIND (and even if it were, my check was wrong in the previous commit). Luckily, the semantics of mount with MS_REMOUNT | MS_BIND make both checks unnecessary: if /nix/store is not a mountpoint, then mount will fail with EINVAL, and if /nix/store is not a bind-mount, then it will not be made writable. Thus, if /nix/store is not a mountpoint, we fail immediately (since we don't know how to make it writable), and if /nix/store IS a mountpoint but not a bind-mount, we fail at first write (see below for why we can't check and fail immediately). Note that, due to what is IMO buggy behavior in Linux, calling mount with MS_REMOUNT | MS_BIND on a non-bind readonly mount makes the mountpoint appear writable in two places: In the sixth (but not the 10th!) column of mountinfo, and in the f_flags member of struct statfs. All other syscalls behave as if the mount point were still readonly (at least for Linux 3.9-rc1, but I don't think this has changed recently or is expected to soon). My preferred semantics would be for MS_REMOUNT | MS_BIND to fail on a non-bind mount, as it doesn't make sense to remount a non bind-mount as a bind mount.	2013-03-25 19:00:16 +01:00
Shea Levy	2c9cf50746	makeStoreWritable: Use statvfs instead of /proc/self/mountinfo to find out if /nix/store is a read-only bind mount /nix/store could be a read-only bind mount even if it is / in its own filesystem, so checking the 4th field in mountinfo is insufficient. Signed-off-by: Shea Levy <shea@shealevy.com>	2013-03-25 19:00:16 +01:00
Eelco Dolstra	bdd4646338	Revert "Prevent config.h from being clobbered" This reverts commit `28bba8c44f`.	2013-03-08 01:24:59 +01:00
Eelco Dolstra	28bba8c44f	Prevent config.h from being clobbered	2013-03-07 23:55:55 +01:00
Eelco Dolstra	8057a192e3	Handle systems without lutimes() or lchown()	2013-02-28 19:55:09 +01:00
Eelco Dolstra	f45c731cd7	Handle symlinks properly Now it's really brown paper bag time...	2013-02-28 14:51:08 +01:00
Eelco Dolstra	0111ba98ea	Handle hard links to other files in the output	2013-02-27 17:18:41 +01:00
Eelco Dolstra	b008674e46	Refactoring: Split off the non-recursive canonicalisePathMetaData() Also, change the file mode before changing the owner. This prevents a slight time window in which a setuid binary would be setuid root.	2013-02-27 16:42:19 +01:00
Eelco Dolstra	5526a282b5	Security: Don't allow builders to change permissions on files they don't own It turns out that in multi-user Nix, a builder may be able to do ln /etc/shadow $out/foo Afterwards, canonicalisePathMetaData() will be applied to $out/foo, causing /etc/shadow's mode to be set to 444 (readable by everybody but writable by nobody). That's obviously Very Bad. Fortunately, this fails in NixOS's default configuration because /nix/store is a bind mount, so "ln" will fail with "Invalid cross-device link". It also fails if hard-link restrictions are enabled, so a workaround is: echo 1 > /proc/sys/fs/protected_hardlinks The solution is to check that all files in $out are owned by the build user. This means that innocuous operations like "ln ${pkgs.foo}/some-file $out/" are now rejected, but that already failed in chroot builds anyway.	2013-02-26 02:30:19 +01:00
Ludovic Courtès	3e067ac11c	Add `Settings::nixDaemonSocketFile'.	2013-02-19 10:19:18 +01:00
Ludovic Courtès	5ea138dc4b	Enable chroot support on old glibc versions.	2013-02-19 10:19:11 +01:00
Eelco Dolstra	5e9c3da412	Only warn about SQLite being busy once No need to get annoying.	2013-01-23 16:45:10 +01:00
Eelco Dolstra	536c85ea49	Store build logs in /nix/var/log/nix/drvs/<XX> ...where <XX> is the first two characters of the derivation. Otherwise /nix/var/log/nix/drvs may become so large that we run into all sorts of weird filesystem limits/inefficiencies. For instance, ext3/ext4 filesystems will barf with "ext4_dx_add_entry:1551: Directory index full!" once you hit a few million files.	2013-01-17 15:47:26 +01:00
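
The log layout described in commit 536c85ea49 can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name `shardedLogPath` is hypothetical, and the choice to drop the two-character prefix from the file name inside the subdirectory is an assumption, since the commit message only says that `<XX>` is the first two characters of the derivation.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: map a derivation base name to a log path under
// <logDir>/drvs/<XX>/, where <XX> is the first two characters of the name.
// Assumption: the remainder of the name is used as the file name.
std::string shardedLogPath(const std::string & logDir, const std::string & drvName)
{
    // A name shorter than three characters cannot be split into a
    // two-character prefix plus a non-empty file name.
    assert(drvName.size() > 2);
    return logDir + "/drvs/" + drvName.substr(0, 2) + "/" + drvName.substr(2);
}
```

With this split, each subdirectory holds only the logs sharing one two-character prefix, which keeps any single directory far below the size at which the quoted ext4 error appears.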

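The reasoning in commit d05bf04444 can be checked with a small sketch. The first function adds up the sleeps in the quoted SQLite loop, assuming `cnt` counts attempts upward from 1; the second is a hypothetical bounded retry helper (`retryOnBusy` is not a Nix or SQLite function) that treats the protocol error like a busy error. The constants copy SQLite's numeric result codes so that the example compiles without `sqlite3.h`.

```cpp
#include <cassert>
#include <functional>

// Numeric values of SQLite's result codes, redefined locally for the sketch.
constexpr int kOk = 0;        // SQLITE_OK
constexpr int kBusy = 5;      // SQLITE_BUSY
constexpr int kProtocol = 15; // SQLITE_PROTOCOL

// Total time (in microseconds) the quoted loop sleeps before it gives up.
// Assumption: cnt runs 1, 2, ..., 100 and the loop bails out once cnt > 100.
long walTotalDelayMicros()
{
    long total = 0;
    for (int cnt = 1; cnt <= 100; ++cnt) {
        if (cnt > 5) {
            int nDelay = 1;                           // 1 us for attempts 6..9
            if (cnt >= 10) nDelay = (cnt - 9) * 238;  // grows up to ~21 ms
            total += nDelay;
        }
    }
    return total;
}

// Hypothetical helper: call `attempt` until it returns something other than
// busy/protocol, or until maxAttempts calls have been made. Returns the last
// result code. The attempt limit is an assumption of this sketch.
int retryOnBusy(const std::function<int()> & attempt, int maxAttempts)
{
    assert(maxAttempts > 0);
    int rc = kBusy;
    for (int i = 0; i < maxAttempts; ++i) {
        rc = attempt();
        if (rc != kBusy && rc != kProtocol) break;
    }
    return rc;
}
```

The sum comes to 996272 microseconds, which agrees with the "Total delay 996ms" comment quoted in the commit message and with its statement that SQLite gives up in under one second.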

780 commits
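
The `std::map` pitfall described in commit 7ccd946407 is easy to reproduce in isolation. The sketch below is not the code from that commit; the names `Env`, `isSetBuggy` and `isSetSafe` are made up for illustration. It contrasts a lookup through `operator[]`, which inserts a missing key with an empty value, with a lookup through `find()`, which leaves the map untouched.

```cpp
#include <cassert>
#include <map>
#include <string>

using Env = std::map<std::string, std::string>;

// Buggy pattern: operator[] default-constructs an entry for a missing key,
// so merely comparing the value adds key -> "" to the map.
bool isSetBuggy(Env & env, const std::string & key)
{
    return env[key] == "1";
}

// Safe pattern: find() never modifies the map, so it also works on a
// const reference.
bool isSetSafe(const Env & env, const std::string & key)
{
    auto i = env.find(key);
    return i != env.end() && i->second == "1";
}
```

If such a map is later exported as the builder's environment, the accidental empty entry shows up as an environment variable, which is the symptom the commit message describes.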