Perfect Bootstrap #38

Open
opened 2023-08-28 03:52:47 +03:00 by max · 0 comments
max commented 2023-08-28 03:52:47 +03:00 (Migrated from git.privatevoid.net)

The cluster should be able to bootstrap itself completely from scratch. It should be possible to run an entire virtual cluster in a giant NixOS test to verify that the bootstrap process works.

Implementation idea:

  • Every cluster service depends on cluster-ready.target
  • cluster-ready.target waits for some service that ensures the node is joined to the cluster (cluster-join.service)
  • If the node is not already joined, the service waits for the join procedure to occur
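
A minimal sketch of the gating logic `cluster-join.service` could run, assuming "joined" is recorded as a marker file in the state directory. The marker name `joined` and the function names are hypothetical, not part of the proposal. The idea is that the process only exits successfully once the node is joined, so anything ordered after it (via `cluster-ready.target`) stays blocked until then.

```python
import time
from pathlib import Path
from typing import Optional


def is_joined(state_dir: Path) -> bool:
    """Treat the node as joined once the (hypothetical) marker file exists."""
    return (state_dir / "joined").exists()


def wait_until_joined(
    state_dir: Path,
    poll_interval: float = 1.0,
    timeout: Optional[float] = None,
) -> bool:
    """Block until the node is joined.

    Returns True once the marker exists, or False if `timeout` seconds
    elapse first. With timeout=None this waits indefinitely.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not is_joined(state_dir):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

An already-joined node returns immediately, so reboots would not repeat the join procedure. In a real unit the state directory would presumably come from `$STATE_DIRECTORY`.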

Cluster Join Service:

  • A simple libp2p application
  • Persists its key and ID (e.g. $STATE_DIRECTORY/key); this key can also be installed via side channels
  • Joins the DHT, sets up relays, etc.
  • Listens on protocol: /cluster/bootstrap/1.0.0
  • Each connecting node needs to be authenticated via side channel
  • After n (probably 3) nodes are joined together, keys for further communication (WireGuard) are exchanged, the service marks this node as "joined", and exits
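
A rough sketch of two pieces of the join service: persisting the key and counting authenticated peers up to `n`. Everything here is an assumption made for illustration. Random bytes stand in for a real libp2p private key, `node_id` is a placeholder hash rather than a libp2p peer ID, and `n` is assumed to include the local node.

```python
import hashlib
import os
import secrets
from pathlib import Path


def load_or_create_key(state_dir: Path) -> bytes:
    """Return the key stored at <state_dir>/key, creating it on first run.

    If a side channel already installed a key, it is reused as-is.
    """
    key_path = state_dir / "key"
    if key_path.exists():
        return key_path.read_bytes()
    key = secrets.token_bytes(32)  # stand-in for a real libp2p private key
    # O_EXCL avoids clobbering a key that appeared concurrently; 0o600 keeps it private.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key)
    return key


def node_id(key: bytes) -> str:
    """Placeholder ID derived from the key (not a real libp2p peer ID)."""
    return hashlib.sha256(key).hexdigest()[:16]


class JoinTracker:
    """Tracks authenticated members until `quorum` nodes are joined together."""

    def __init__(self, own_id: str, quorum: int = 3) -> None:
        self.quorum = quorum
        self.members = {own_id}  # assumption: n counts the local node

    def add_authenticated_peer(self, peer_id: str) -> bool:
        """Record a peer that passed side-channel authentication.

        Returns True once the quorum is reached.
        """
        self.members.add(peer_id)
        return self.is_complete()

    def is_complete(self) -> bool:
        return len(self.members) >= self.quorum
```

Once `is_complete()` is true, the service would exchange the WireGuard keys, mark the node as joined, and exit. Those steps are left out of the sketch.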

Side channels:

  • Any "management interface" that enables securely inserting keys into a node
  • Some of these may require continuous polling
  • For VMs: hijack cloud-init metadata support
  • Physical hosts: pre-existing keys from customized system installation
  • Manual key placement (e.g. previously confirmed trustworthy SSH connection, physical access)
  • For VM tests: dummy key placement
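
One way to model the side channels is as a list of key sources tried in order and polled until one yields a key. This is only a sketch: the names `KeySource`, `file_source`, `dummy_source` and `poll_for_key` are hypothetical, and a cloud-init metadata source is not shown because its exact shape is not specified here.

```python
import time
from pathlib import Path
from typing import Callable, Optional, Sequence

# A side channel is modelled as a callable that returns a key, or None if
# nothing has been delivered yet.
KeySource = Callable[[], Optional[bytes]]


def file_source(path: Path) -> KeySource:
    """Key placed on disk, e.g. manually or by a customized installer."""
    def read() -> Optional[bytes]:
        try:
            data = path.read_bytes()
        except FileNotFoundError:
            return None
        return data or None  # treat an empty file as "not delivered yet"
    return read


def dummy_source(key: bytes) -> KeySource:
    """Fixed key for VM tests."""
    return lambda: key


def poll_for_key(
    sources: Sequence[KeySource],
    poll_interval: float = 1.0,
    max_attempts: Optional[int] = None,
) -> Optional[bytes]:
    """Try every source in order, repeating until one yields a key.

    Returns None only if `max_attempts` rounds pass without a key.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        for source in sources:
            key = source()
            if key is not None:
                return key
        attempt += 1
        time.sleep(poll_interval)
    return None
```

Ordering the list puts the more trusted channels first, with the dummy source only present in test configurations.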

Protocol details:

  • Probably just a simple JSON RPC thing
  • Flow: 1 request, 1 response, close stream
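
To make the "1 request, 1 response, close stream" flow concrete, here is a sketch that uses a local socket pair in place of a libp2p stream. The message fields (`method`, `peer_id`, `proof`, `ok`, `error`), the HMAC proof over a key shared via side channel, and framing by half-closing the stream are all assumptions for illustration. The proof as written is replayable, so a real design would want a nonce or similar.

```python
import hashlib
import hmac
import json
import socket

PROTOCOL_ID = "/cluster/bootstrap/1.0.0"  # protocol name from this issue


def _proof(peer_id: str, shared_key: bytes) -> str:
    return hmac.new(shared_key, peer_id.encode(), hashlib.sha256).hexdigest()


def make_request(peer_id: str, shared_key: bytes) -> bytes:
    """Build the single request a joining node sends."""
    return json.dumps(
        {"method": "join", "peer_id": peer_id, "proof": _proof(peer_id, shared_key)}
    ).encode()


def handle_request(raw: bytes, shared_key: bytes) -> bytes:
    """Produce the single response for one request."""
    try:
        req = json.loads(raw)
        peer_id, proof = req["peer_id"], req["proof"]
        if not isinstance(peer_id, str) or not isinstance(proof, str):
            raise TypeError("fields must be strings")
    except (ValueError, KeyError, TypeError):
        return json.dumps({"ok": False, "error": "malformed request"}).encode()
    # Constant-time comparison of the expected and presented proof.
    if not hmac.compare_digest(_proof(peer_id, shared_key).encode(), proof.encode()):
        return json.dumps({"ok": False, "error": "unauthorized"}).encode()
    return json.dumps({"ok": True, "peer_id": peer_id}).encode()


def read_all(sock: socket.socket) -> bytes:
    """Read until the other side closes its write half."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def exchange(request: bytes, shared_key: bytes) -> dict:
    """Run one request/response over a socket pair, then close the stream."""
    client, server = socket.socketpair()
    with client, server:
        client.sendall(request)
        client.shutdown(socket.SHUT_WR)  # marks the end of the request
        server.sendall(handle_request(read_all(server), shared_key))
        server.shutdown(socket.SHUT_WR)  # marks the end of the response
        return json.loads(read_all(client))
```

Because each stream carries exactly one exchange, no length prefix or message ID is needed; end-of-stream delimits both messages.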
max self-assigned this 2024-11-16 23:52:51 +02:00
Reference: privatevoid.net/depot#38