nix-super/src/libutil/url-parts.hh

45 lines
2.2 KiB
C++
Raw Normal View History

#pragma once
///@file
#include <string>
#include <regex>
namespace nix {
// URI stuff.
const static std::string pctEncoded = "(?:%[0-9a-fA-F][0-9a-fA-F])";
const static std::string schemeRegex = "(?:[a-z][a-z0-9+.-]*)";
const static std::string ipv6AddressSegmentRegex = "[0-9a-fA-F:]+(?:%\\w+)?";
libstore/openStore: fix stores with IPv6 addresses In `nixStable` (2.3.7 to be precise) it's possible to connect to stores using an IPv6 address: nix ping-store --store ssh://root@2001:db8::1 This is also useful for `nixops(1)` where you could specify an IPv6 address in `deployment.targetHost`. However, this behavior is broken on `nixUnstable` and fails with the following error: $ nix store ping --store ssh://root@2001:db8::1 don't know how to open Nix store 'ssh://root@2001:db8::1' This happened because `openStore` from `libstore` uses the `parseURL` function from `libfetchers` which expects a valid URL as defined in RFC2732. However, this is unsupported by `ssh(1)`: $ nix store ping --store 'ssh://root@[2001:db8::1]' cannot connect to 'root@[2001:db8::1]' This patch now allows both ways of specifying a store (`root@2001:db8::1`) and also `root@[2001:db8::1]` since the latter one is useful to pass query parameters to the remote store. In order to achieve this, the following changes were made: * The URL regex from `url-parts.hh` now allows an IPv6 address in the form `2001:db8::1` and also `[2001:db8::1]`. * In `libstore`, a new function named `extractConnStr` ensures that a proper URL is passed to e.g. `ssh(1)`: * If a URL looks like either `[2001:db8::1]` or `root@[2001:db8::1]`, the brackets will be removed using a regex. No additional validation is done here as only strings parsed by `parseURL` are expected. * In any other case, the string will be left untouched. * The rules above only apply for `LegacySSHStore` and `SSHStore` (a.k.a `ssh://` and `ssh-ng://`). Unresolved questions: * I'm not really sure whether we want to allow both variants of IPv6 addresses in the URL parser. However it should be noted that both seem to be possible according to RFC2732: > This document incudes an update to the generic syntax for Uniform > Resource Identifiers defined in RFC 2396 [URL]. It defines a syntax > for IPv6 addresses and allows the use of "[" and "]" within a URI > explicitly for this reserved purpose. * Currently, it's not supported to specify a port number behind the hostname, however it seems as this is not really supported by the URL parser. Hence, this is probably out of scope here.
2020-12-05 16:33:16 +02:00
const static std::string ipv6AddressRegex = "(?:\\[" + ipv6AddressSegmentRegex + "\\]|" + ipv6AddressSegmentRegex + ")";
const static std::string unreservedRegex = "(?:[a-zA-Z0-9-._~])";
const static std::string subdelimsRegex = "(?:[!$&'\"()*+,;=])";
const static std::string hostnameRegex = "(?:(?:" + unreservedRegex + "|" + pctEncoded + "|" + subdelimsRegex + ")*)";
const static std::string hostRegex = "(?:" + ipv6AddressRegex + "|" + hostnameRegex + ")";
const static std::string userRegex = "(?:(?:" + unreservedRegex + "|" + pctEncoded + "|" + subdelimsRegex + "|:)*)";
const static std::string authorityRegex = "(?:" + userRegex + "@)?" + hostRegex + "(?::[0-9]+)?";
const static std::string pcharRegex = "(?:" + unreservedRegex + "|" + pctEncoded + "|" + subdelimsRegex + "|[:@])";
const static std::string queryRegex = "(?:" + pcharRegex + "|[/? \"])*";
const static std::string segmentRegex = "(?:" + pcharRegex + "*)";
const static std::string absPathRegex = "(?:(?:/" + segmentRegex + ")*/?)";
const static std::string pathRegex = "(?:" + segmentRegex + "(?:/" + segmentRegex + ")*/?)";
/// A Git ref (i.e. branch or tag name).
/// \todo check that this is correct.
const static std::string refRegexS = "[a-zA-Z0-9@][a-zA-Z0-9_.\\/@-]*";
extern std::regex refRegex;
/// Instead of defining what a good Git Ref is, we define what a bad Git Ref is
/// This is because of the definition of a ref in refs.c in https://github.com/git/git
/// See tests/fetchGitRefs.sh for the full definition
const static std::string badGitRefRegexS = "//|^[./]|/\\.|\\.\\.|[[:cntrl:][:space:]:?^~\[]|\\\\|\\*|\\.lock$|\\.lock/|@\\{|[/.]$|^@$|^$";
extern std::regex badGitRefRegex;
/// A Git revision (a SHA-1 commit hash).
const static std::string revRegexS = "[0-9a-fA-F]{40}";
extern std::regex revRegex;
/// A ref or revision, or a ref followed by a revision.
const static std::string refAndOrRevRegex = "(?:(" + revRegexS + ")|(?:(" + refRegexS + ")(?:/(" + revRegexS + "))?))";
}